For any cyber-defender continuing to deny the impact of AI on attacker efficiency, welcome to Exhibit A.
Over the past few months, a small group of hacktivists compromised the computers and networks of at least nine Mexican government agencies, stealing more than 195 million identities and tax records, along with vehicle registrations and more than 2.2 million property records, startup Gambit Security said in a blog post this week detailing the attack.
Using a playbook coded as a long prompt of about a thousand lines, the attackers relied on Anthropic’s Claude and OpenAI’s ChatGPT AI platforms to infiltrate Mexico’s tax authority and at least eight other Mexican government organizations, the company said.
The simple technique of masquerading as legitimate penetration testers defeated the AI models' guardrails within 40 minutes. Soon the attackers had a pair of powerful allies: AI systems in full attack mode, finding and exploiting vulnerabilities, building tools for the attacks, and bypassing defenses.
The attackers were mainly focused on collecting sensitive information and appear to be hacktivists, says Alon Gromakov, CEO and co-founder of Gambit Security. They had also compromised enough systems to make it difficult to shake loose their foothold, he adds.
“These attackers got into multiple systems, and they were there for a very extended period of time … more than a month, and they left backdoors,” he says. “How do you clean all of that and make sure that nothing is left?”
Mexican authorities have not yet publicly confirmed the attack. Anthropic has “disrupted the activity and banned the accounts,” according to Bloomberg. The incident may be related to similar reports from earlier this year.
Latin American organizations have increasingly been targeted by cyberattacks, facing an average of 3,100 cyberthreats per week, compared with fewer than 1,500 in the United States. Many nations in the region lack national initiatives to harden their systems, but the surge in incidents is also due partly to attackers' adoption of AI. For the most part, fraudsters have relied on AI to make their social engineering more effective: LLMs are very good at producing native-sounding business memos, and phishing click-through rates are now five times higher as a result, according to Microsoft.
Yet, it’s the technical innovations that are the most worrisome, says Victor Ruiz, freelance cybersecurity professional and founder of Silikn, a Mexican cybersecurity startup.
“Threat actors are no longer using generative AI solely to improve communication; they are leveraging it to develop increasingly sophisticated malware,” he says. “This marks a strategic inflection point for the region, as traditional defenses built around static signatures and behavioral patterns become less effective against code capable of evolving in real-time.”
Augmented Cyberattackers Who Forgot Defense
Gambit Security’s threat researchers gained significant insight into the cyberattack by recovering the chats between the threat group and its AI helpers. The researchers scanned the Internet for specific types of activities and, in the end, struck the mother lode: full conversations between the small team of cyberattackers and the two large language models (LLMs) they were using, Anthropic’s Claude and OpenAI’s ChatGPT, says Curtis Simpson, Gambit’s chief strategy officer.
“We identified attack infrastructure being used by a threat actor, and we identified that it was available to us [and] was not secured,” he says. “We dove into it, and what we discovered were full transcripts of the conversations between the threat actors and the two different AI platforms.”
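Gambit has not published the specifics of how the unsecured infrastructure was spotted, but misconfigured web servers with directory listing enabled are a common way files like chat transcripts end up readable to anyone who finds them. A minimal, hypothetical sketch of the kind of heuristic a scanning pipeline might apply to a fetched page (the marker patterns and filename keywords here are illustrative assumptions, not Gambit's actual tooling):

```python
import re

# Hypothetical heuristic: does an HTTP response body look like an
# auto-generated "open directory" listing (e.g., Apache/nginx autoindex)?
# Real reconnaissance pipelines layer many such signals; this is one.
LISTING_MARKERS = (
    re.compile(r"<title>\s*Index of /", re.IGNORECASE),
    re.compile(r"\[To Parent Directory\]", re.IGNORECASE),  # IIS-style listing
)

# Linked filenames whose names suggest sensitive content worth a closer look.
INTERESTING_NAMES = re.compile(
    r'href="[^"]*(transcript|chat|log|conversation)[^"]*"', re.IGNORECASE
)

def looks_like_open_listing(html: str) -> bool:
    """Return True if the page resembles a server-generated file index."""
    return any(p.search(html) for p in LISTING_MARKERS)

def exposed_files_of_interest(html: str) -> list[str]:
    """Return the href attributes of links with sensitive-sounding names."""
    if not looks_like_open_listing(html):
        return []
    return [m.group(0) for m in INTERESTING_NAMES.finditer(html)]

sample = (
    "<html><title>Index of /ops</title><body>"
    '<a href="chat_transcript_01.json">chat_transcript_01.json</a>'
    "</body></html>"
)
print(looks_like_open_listing(sample))    # True
print(exposed_files_of_interest(sample))  # ['href="chat_transcript_01.json"']
```

The same check is useful defensively: pointing it at your own infrastructure flags accidental exposures before someone else's scanner does.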
Gambit identified the threat group as small — likely fewer than five members — with no signs of a nation-state affiliation and seemingly no financial motive. The group had access to the government systems since at least December and, while not very sophisticated, knew how to co-opt the AI systems to do its work, Simpson says.
“These are folks that knew what an attack should look like and how the playbook [should be designed],” he says, adding that the playbook “to fundamentally jailbreak the AI was quite detailed — these are folks that know what they’re doing.”
Once the guardrails were bypassed, the AI systems became very effective in helping the attackers identify critical assets, such as digital certificates, full architectural diagrams, and vulnerabilities that could be exploited, Simpson says.
“They were fundamentally using Claude as a flashlight and would have been otherwise poking around in the dark until they gave up, like many attacks have in the past,” he says.
Commercial AI for the Cyber Win
Like other instances of AI systems going above and beyond what users have requested, the technology found ways around blocks that would otherwise have kept the attackers out. In one case, the threat actors wanted the AI systems to test some of the stolen credentials to see if they worked, Simpson says.
“The AI said, ‘None of those credentials work, but let me try some other things. You haven’t asked for any of this, but I’m going to go ahead and do this,'” he says. The system then enumerated all the identities in Active Directory, applied different techniques to compromise those identities, and finally gained access.
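The sequence Simpson describes — enumerate every identity in the directory, then work through compromise techniques against each — is, in its simplest form, a password-spray loop. A toy, self-contained sketch of that logic, using mock data rather than any real directory protocol (all names and passwords here are invented for illustration):

```python
# Toy illustration (mock data, no real protocols) of the pattern described
# above: enumerate directory identities, then "spray" a short list of
# candidate passwords across all of them.
MOCK_DIRECTORY = {          # stand-in for an enumerated Active Directory user list
    "a.garcia": "Winter2025!",
    "j.lopez": "Passw0rd",
    "m.torres": "correct-horse",
}

COMMON_GUESSES = ["Winter2025!", "Passw0rd", "Welcome1"]

def authenticate(user: str, password: str) -> bool:
    """Stand-in for a real login attempt against the directory."""
    return MOCK_DIRECTORY.get(user) == password

def password_spray(users, guesses):
    """Try each guess across every user; return the successful pairs."""
    hits = []
    for guess in guesses:        # one password per pass, across all users --
        for user in users:       # this ordering sidesteps per-account lockouts
            if authenticate(user, guess):
                hits.append((user, guess))
    return hits

compromised = password_spray(MOCK_DIRECTORY, COMMON_GUESSES)
print(compromised)  # [('a.garcia', 'Winter2025!'), ('j.lopez', 'Passw0rd')]
```

The outer-loop-over-passwords ordering is the detail defenders should note: it keeps any single account below lockout thresholds, which is why bursts of single failed logins across many accounts are a stronger spray signal than repeated failures on one account.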
“The level of work that it was doing on the attacker’s behalf without even being asked” is impressive, Simpson says. “They’re fundamentally enabling experienced threat actors to be nation-state grade today and inexperienced threat actors to be able to do damage today.”
Using AI to accelerate offensive tasks and adapt proof-of-concept code into working exploits has become a significant aid to attackers, says Silikn's Ruiz. Tasks that once required considerable time and skill to perform manually can now be completed in minutes or hours.
In addition, commercial LLMs seemingly remain favored among attackers.
“Researchers still lack concrete proof that so-called ‘dark LLMs’ are being broadly used,” he says. “Tracking their deployment is inherently difficult, as investigators lack reliable tools to definitively attribute malicious artifacts to AI assistance — except in rare cases where attackers themselves disclose it.”