EchoLeak (CVE-2025-32711) Part-1: When an email hijacks Copilot

How a zero-click bug in Microsoft 365 Copilot exposed the hidden risks of retrieval-augmented generation.

Oct 23, 2025

In June 2025 researchers disclosed a zero-click data-exposure bug in Microsoft 365 Copilot called EchoLeak, tracked as CVE-2025-32711. Microsoft assigned a critical severity and according to public reporting the issue was fixed server-side in May with no customer action required and no evidence of exploitation. That timeline matters because it shows how fast LLM-adjacent attack paths move and how much of the risk lives outside the model itself.

The anatomy of EchoLeak

At the core, EchoLeak wasn’t a jailbreak or weight manipulation. It was a retrieval-boundary failure inside Microsoft’s enterprise assistant stack.

Copilot grounds answers on enterprise data sources such as Outlook mail, SharePoint files and Teams chat. An attacker realized that if a crafted email contained specific HTML and metadata patterns, Copilot would ingest that content into its retrieval pipeline. When a user later asked a contextually similar question, Copilot faithfully retrieved the malicious email, summarized it and embedded portions of its content including encoded data fragments into a rendered answer.

On the client side, Outlook and Teams preview panes auto-fetched embedded links and images for convenience. That auto-fetch completed the exfiltration loop: sensitive information encoded inside the malicious message left the tenant boundary without user interaction.

No buffer overflow. No API key leak. Just an over-helpful retrieval engine.

Why it mattered

Microsoft’s advisory stated that only limited configurations were affected, yet the pattern matters. It exposed that Retrieval Augmented Generation (RAG) is itself a trust boundary.

In most Copilot architectures, retrieval and generation run in separate services. Security reviews often focus on the model sandbox, while retrieval is treated as a content pipeline. EchoLeak flipped that assumption: content pipelines can be exploit paths.

OWASP’s LLM Top 10 for 2025 calls this blend of issues LLM01 – Prompt Injection and LLM05 – Improper Output Handling: One stage injects and the next stage exfiltrates.

What the fix tells us

Microsoft patched Copilot server-side by tightening retrieval scope, sanitizing HTML payloads during indexing and applying deterministic outbound filters on generated answers. These changes didn’t touch the model weights at all; the vulnerability was pure plumbing.

That’s a key lesson: most LLM security bugs live in orchestration, not inference.

The broader pattern

EchoLeak joined a growing family of context poisoning and indirect prompt injection flaws across vendors.

2024 saw several RAG-based chat products returning sensitive summaries from public web pages that contained crafted “ignore previous instruction” text.
2023 had the “Sydney” (Bing Chat) prompt-leak incident, proving that model instructions can be extracted through conversational steering.
EchoLeak was the first known case where the injected content lived inside ordinary enterprise email.

Each case reinforces the same architectural truth: RAG adds a second untrusted input channel of your own data.

Lessons for builders

Treat retrieval as a security boundary. Validate and sanitize at ingestion.
Do not trust rich-text or HTML inputs from user-controlled sources.
Post-process model outputs before rendering them in clients.
Make exfiltration deterministically impossible through allowlists and scrubbers.
Instrument for side effects network calls, first-seen domains, link previews.

EchoLeak was patched fast and left no known damage, but it marks a shift in LLM security focus from model alignment to data alignment. If your retrieval tier is porous, the model’s guardrails don’t matter - RAG is now part of your attack surface.

In Part-2, we turn this incident into a design checklist - engineering patterns to make retrieval-based assistants fail predictably instead of dangerously. We’ll cover scope boundaries, ingestion hygiene, output filters and the simple detections that could have stopped EchoLeak before it started.

We’re FortifyRoot - the LLM Cost, Safety & Audit Control Layer for Production GenAI.

If you’re facing unpredictable LLM spend, safety risks or need auditability across GenAI workloads - we’d be glad to help.

🔗 Contact Us | FortifyRoot

Discussion about this post

Ready for more?