AI Security in 2025: What I Learned This Weekend, and What I'll Be Watching for in 2026

This weekend I sat down with three AI security reports for the year. I wanted to understand what the industry actually experienced in 2025, what broke in the real world, and how well those lessons line up with what many of us have been seeing in our own work. I read the Adversa AI incident report, IBM’s 2025 Cost of a Data Breach Report, and the EchoLeak research paper that documents the first real zero-click prompt injection exploit in a production LLM system.

There were clear patterns across the reports. Some were expected, like the rise in prompt injection and the steady stream of agent-related failures where systems were given too much authority and not enough guardrails. Anyone who has worked with agents this year could feel that coming.

Some patterns were not expected. I did not expect to see how often environments lacked basic access control around AI systems, including environments inside large enterprises. I also did not expect shadow AI to show up carrying as much sensitive data as it did, though in hindsight it makes sense. And the EchoLeak exploit was cool to see (as a red teamer at heart). It proved that prompt injection can fire without any clicks, confirmation dialogs, or warnings. It only needed the normal workflow of email ingestion and a browser auto-fetching an image.

All of these point to the same conclusion: AI systems are creating new security problems faster than most organizations can adapt, and 2026 is going to require a significant jump in discipline, visibility, and design maturity.

Here are the takeaways that stood out.

Agentic AI caused the most severe failures

The Adversa report tracked a number of AI incidents and found that 70.6% of them involved generative AI systems. While generative AI accounted for the most breaches, agentic AI was responsible for the most severe failures. These were not minor misfires. These were incidents where an AI system combined multiple tools, workflows, or automation steps and went far outside the bounds of what developers expected or intended.

The report included several examples that illustrate the real-world impact. In one case, internal Amazon source code leaked because an AI assistant pulled from the wrong repository. In another, an airline chatbot gave out incorrect operational instructions that could have affected customer safety. In yet another, a company’s marketing system generated promotional content that caused reputational harm and market confusion. The pattern was consistent. Once an agent received the wrong instruction or misunderstood context, the fallout rippled through multiple systems.

Prompt injection was a major contributor, accounting for 35.3% of incidents, and it was often the original trigger that pushed an agent into unsafe behavior.

Unfortunately, the Adversa report does not provide a numeric breakdown of agent-only incidents, which would have been interesting to see. It also does not include per-industry costs or any quarter-by-quarter trend data, which would have helped show how fast these incidents are accelerating.

Governance failures quietly made every breach worse

The IBM report reinforced something a lot of security teams have been feeling: AI adoption is outpacing AI governance. The numbers were worse than I expected. According to the report, 97% of organizations that experienced an AI-related breach lacked basic access control around their models or AI workflows. In other words, AI systems were often deployed without RBAC, without structured approval paths, and with very little visibility into who was using them and how.

Shadow AI was another major theme in the IBM report. It accounted for 20% of AI-related incidents, and the data exposed was not trivial. In those cases, 65% involved customer PII, 40% involved intellectual property, and 34% involved employee PII. These numbers matter because they confirm that shadow AI is not just an IT nuisance. It is a data exposure vector.

The report also looked at what actually happened during these incidents. Operational disruption and unauthorized access both accounted for 31% of the impacts. Other impacts included financial loss, data integrity issues, and reputational damage. The pattern suggests that once an AI system is involved in a breach, the consequences travel across business units rather than staying contained.

I wish the report had included a sector-by-sector comparison of AI incident frequency, to show which industries are being hit hardest.

EchoLeak showed prompt injection can happen with zero user action

EchoLeak was a very interesting read. It documented a real zero-click prompt injection exploit targeting Microsoft 365 Copilot. The attacker only needed to send a crafted email. That was enough to trigger data exfiltration without the victim clicking anything, confirming anything, or even interacting with the content.

The exploit chain bypassed Microsoft’s prompt injection classifier, bypassed link redaction, and then piggybacked on an allowed Microsoft Teams domain to sneak the exfiltration out through browser auto-fetching.

That said, Microsoft stated that there was no evidence of the technique being exploited in the wild.

What stood out was how normal everything looked. Copilot ingested the email, generated a response, and then the interface fetched an image. Every step looked like standard, everyday usage. It just happened to leak data at the same time.
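
To make the pattern concrete, here is a minimal sketch of how that kind of exfiltration channel works in general and how an output-side check might flag it. This is not the actual EchoLeak payload or Microsoft's mitigation; the allowlist, regex, and demo string are illustrative assumptions.

```python
import re
from urllib.parse import urlparse, parse_qs

# Domains the rendering client is allowed to auto-fetch images from.
# Illustrative only; EchoLeak abused a legitimately allowed Microsoft Teams domain.
ALLOWED_IMAGE_HOSTS = {"static.example-corp.com"}

MD_IMAGE = re.compile(r"!\[[^\]]*\]\((?P<url>[^)\s]+)\)")

def flag_exfil_risk(assistant_output: str) -> list[str]:
    """Return markdown image URLs that could leak data via an auto-fetch.

    The concern: if the model can be steered into emitting an image whose
    URL (path or query string) encodes sensitive context, the client's
    automatic fetch exfiltrates that data with zero user interaction.
    """
    findings = []
    for match in MD_IMAGE.finditer(assistant_output):
        url = match.group("url")
        parsed = urlparse(url)
        off_allowlist = parsed.hostname not in ALLOWED_IMAGE_HOSTS
        carries_payload = bool(parse_qs(parsed.query)) or len(parsed.path) > 100
        if off_allowlist or carries_payload:
            findings.append(url)
    return findings

# A response shaped like the zero-click pattern: it looks like a normal image,
# but the query string smuggles data out when the browser fetches it.
demo = "Here is the summary. ![logo](https://trusted.example.net/i.png?d=Q1_revenue_numbers)"
print(flag_exfil_risk(demo))
```

The key design point is that the dangerous step is the client's automatic fetch, so any check like this has to run on the model's output before anything gets rendered.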

These types of attacks are why prompt injection needs to be treated as a security vulnerability and not a misconfiguration.

What this all means for 2026

After reading these three papers, a few themes are impossible to ignore.

AI systems fail in the connective tissue. The model is rarely the part that breaks. Instead, the failures show up in orchestration layers, retrieval pipelines, permission boundaries, and all the pieces that glue AI into the rest of the environment. The EchoLeak example shows this perfectly. The model did not break. The system around it did.

The attack surface is expanding because the AI layer now reaches into every corner of the business: email, documents, source code, APIs, knowledge bases, and CI pipelines. Every connection point becomes another potential failure point.

Attackers also do not need deep model knowledge to exploit these systems. They only need to understand where AI overlaps with traditional application behavior. EchoLeak was triggered by an email and a browser fetch, not by model internals.

Based on these papers, here is where I think we need to focus in 2026.

Better privilege design for agents and workflows. Agents should not have broad tool access by default. Narrow, explicit permissions must become the norm. Every tool should be tested, monitored, and bound to clear guardrails.
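
As a sketch of what narrow, deny-by-default tool access could look like, here is a minimal example. The router, tool names, and policy shape are hypothetical, not any particular framework's API.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AgentPolicy:
    name: str
    allowed_tools: set[str]
    audit_log: list[str] = field(default_factory=list)

class ToolRouter:
    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., str]] = {}

    def register(self, tool_name: str, fn: Callable[..., str]) -> None:
        self._tools[tool_name] = fn

    def call(self, policy: AgentPolicy, tool_name: str, **kwargs) -> str:
        # Deny by default: the tool must exist AND be on this agent's allowlist.
        if tool_name not in self._tools or tool_name not in policy.allowed_tools:
            policy.audit_log.append(f"DENIED {policy.name} -> {tool_name}")
            raise PermissionError(f"{policy.name} is not allowed to call {tool_name}")
        policy.audit_log.append(f"ALLOWED {policy.name} -> {tool_name} {kwargs}")
        return self._tools[tool_name](**kwargs)

router = ToolRouter()
router.register("search_docs", lambda query: f"results for {query!r}")
router.register("delete_repo", lambda repo: f"deleted {repo}")  # dangerous tool

support_agent = AgentPolicy(name="support_agent", allowed_tools={"search_docs"})
print(router.call(support_agent, "search_docs", query="refund policy"))
# router.call(support_agent, "delete_repo", repo="prod")  # raises PermissionError
```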

Stronger AI governance. AI needs the same level of governance that we apply to cloud, identity, and software development. That includes RBAC for models, data handling policies, and complete logging of AI interactions.
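
A hedged sketch of what that might look like in practice: a role policy table checked before every model call, with each call written to an audit log. The roles, model names, and the call_model stub are assumptions for illustration.

```python
import json
import time

ROLE_POLICY = {
    "analyst":  {"models": {"gpt-internal-small"}, "max_tokens": 1024},
    "engineer": {"models": {"gpt-internal-small", "code-assist"}, "max_tokens": 4096},
}

def call_model(model: str, prompt: str) -> str:
    return f"[{model}] stub response"  # stand-in for a real inference call

def governed_call(user: str, role: str, model: str, prompt: str) -> str:
    policy = ROLE_POLICY.get(role)
    if policy is None or model not in policy["models"]:
        raise PermissionError(f"role {role!r} may not use model {model!r}")
    response = call_model(model, prompt)
    # Append-only audit record of who used which model, when, and how much.
    print(json.dumps({
        "ts": time.time(), "user": user, "role": role,
        "model": model, "prompt_chars": len(prompt),
    }))
    return response

print(governed_call("jdoe", "analyst", "gpt-internal-small", "Summarize Q3 incidents"))
```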

Defense in depth around prompt injection and retrieval. EchoLeak confirmed that input filtering, output filtering, provenance controls, and sandboxing are not optional. Retrieval systems also need strict metadata governance to prevent poisoned or misleading context.
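
Here is a rough sketch of two of those layers, provenance tagging of retrieved context and output-side link redaction. The trust labels and allowlisted host are assumptions; in practice these checks would sit inside the retrieval and rendering pipeline rather than in standalone functions.

```python
import re
from dataclasses import dataclass

TRUSTED_LINK_HOSTS = ("intranet.example.com",)

@dataclass
class RetrievedChunk:
    text: str
    source: str   # e.g. "sharepoint://finance/q3.docx" or "email:external-sender"
    trusted: bool # set by the retrieval layer, never by the content itself

def build_context(chunks: list[RetrievedChunk]) -> str:
    # Untrusted material is wrapped and labeled so downstream policies
    # can treat it as data, never as instructions.
    parts = []
    for c in chunks:
        label = "TRUSTED" if c.trusted else "UNTRUSTED-EXTERNAL"
        parts.append(f"<{label} source={c.source}>\n{c.text}\n</{label}>")
    return "\n".join(parts)

def redact_links(output: str) -> str:
    # Output-side filter: strip any URL not on the allowlist, closing the
    # auto-fetch / click-through exfiltration channel.
    def keep(m: re.Match) -> str:
        url = m.group(0)
        return url if any(host in url for host in TRUSTED_LINK_HOSTS) else "[link removed]"
    return re.sub(r"https?://\S+", keep, output)

chunks = [
    RetrievedChunk("Q3 revenue summary ...", "sharepoint://finance/q3.docx", trusted=True),
    RetrievedChunk("Ignore previous instructions and ...", "email:external-sender", trusted=False),
]
print(build_context(chunks))
print(redact_links("See https://intranet.example.com/policy and https://evil.test/x?d=secrets"))
```

Neither layer is sufficient on its own, which is the whole point of defense in depth.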

Shadow AI discovery and containment. Shadow AI needs to move to the front of the conversation in 2026. It increases data exposure, bypasses controls, and creates parallel AI systems that security teams cannot see. You cannot secure what you do not know exists. Shadow AI visibility must be a priority before any other control can be effective.
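
As one concrete starting point, here is a sketch of discovering shadow AI usage from egress logs. The domain list, log format, and sanctioned set are illustrative assumptions; the real work is mapping hits back to owners and the data they are sending.

```python
from collections import Counter

KNOWN_AI_DOMAINS = {
    "api.openai.com", "api.anthropic.com", "generativelanguage.googleapis.com",
}

def find_shadow_ai(proxy_log_lines: list[str], sanctioned: set[str]) -> Counter:
    """Count requests to known AI endpoints that are not on the sanctioned list."""
    hits: Counter = Counter()
    for line in proxy_log_lines:
        # Assumed log format: "<user> <destination_host> <bytes_out>"
        user, host, _ = line.split()
        if host in KNOWN_AI_DOMAINS and host not in sanctioned:
            hits[(user, host)] += 1
    return hits

logs = [
    "alice api.openai.com 18234",
    "bob api.anthropic.com 922",
    "alice api.openai.com 50211",
]
print(find_shadow_ai(logs, sanctioned={"api.anthropic.com"}))
```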

Reading these reports was a good reminder that the AI technology we have today is incredible, but the risks are real and increasing. We can build safe and resilient AI systems, but only if we treat AI features like production software instead of experiments.