
Cloudflare reference architecture for enterprise deployments of MCP

Cloudflare · Apr 16, 2026

When Cloudflare rolled out Model Context Protocol company-wide — from engineering to finance to sales — it ran headfirst into a hard truth: giving employees agentic AI superpowers is easy. Making it safe is not.

MCP, or Model Context Protocol, is the open standard that lets AI assistants connect directly to tools, databases, and services. Think of it as the USB-C port for AI agents — a universal connector that lets a model talk to your Jira, your Google Drive, your internal wiki, all in one agentic workflow. Anthropic, OpenAI, and virtually every major AI company have rallied behind it.
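On the wire, MCP messages are JSON-RPC 2.0. As a rough sketch (the tool name and arguments here are made up for illustration), a client invoking a tool sends something like this:

```python
import json

def make_tools_call(request_id: int, tool_name: str, arguments: dict) -> str:
    # MCP uses JSON-RPC 2.0; "tools/call" asks the server to run a named tool.
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical tool exposed by an internal wiki MCP server:
msg = make_tools_call(1, "search_wiki", {"query": "quarterly report"})
```

The same envelope, with methods like "initialize" and "tools/list", carries the rest of the protocol — which, as we'll see, is also what makes MCP traffic detectable on the network.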

Cloudflare has been one of MCP's most aggressive internal adopters. But broad adoption brought serious risks: authorization sprawl, prompt injection, supply chain vulnerabilities. In a detailed post published this week, Cloudflare's team shared the full architecture they built to govern MCP at scale — and announced two meaningful new tools in the process.

The problem with local MCP servers

Early MCP deployments often run locally — a small server on an employee's laptop that bridges their AI client to, say, Notion or GitHub. It's convenient, but from a security standpoint, it's a mess. No centralized visibility, no update controls, no audit logs. Each employee essentially manages their own infrastructure.

"Locally-hosted MCP servers were a security liability... leaving it up to individual employees and developers to choose which MCP servers they want to run and how they want to keep them up to date. This is a losing game."

Cloudflare's answer was centralization. A dedicated team built a shared MCP platform inside their monorepo. Any employee who wants to expose an internal resource via MCP first gets approval from the AI governance team, then copies a template, writes their tool definitions, and deploys — inheriting audit logging, CI/CD pipelines, and secrets management automatically. Standing up a governed MCP server became a matter of minutes, not weeks.

These servers deploy as remote MCP servers on Cloudflare's own developer platform, hosted on custom domains and distributed across their global network for low-latency access regardless of where an employee is located.

Identity and access: the Cloudflare Access layer

Not all MCP servers are created equal. Some, like Cloudflare's public documentation server, are open to the world. But internal MCP servers — sitting in front of code repositories, project management tools, financial systems — need tight access controls. Cloudflare integrates Cloudflare Access as the OAuth provider for these servers, enforcing SSO, MFA, device certificates, and location-based policies before any agent can even see what tools are available.

MCP server portals: a single pane of glass

As the number of remote MCP servers grew, a new problem emerged: how does an employee — especially one new to MCP — discover what's available to them? The answer is MCP server portals, which act as a centralized directory. An employee connects their AI client to one portal URL, and instantly sees every internal and third-party MCP server they're authorized to use.

Portals also centralize logging and policy enforcement. Administrators can create data loss prevention rules — preventing, say, personally identifiable information from being passed to certain servers — and can define which tools from each server are exposed based on team membership, device type, or other contextual attributes.
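A minimal sketch of that per-tool exposure logic, with invented server names, teams, and tools (the real policy engine evaluates device posture and other attributes too):

```python
# Hypothetical portal policy: each server lists which teams may see it
# and which of its tools the portal exposes.
POLICY = {
    "finance-mcp": {"teams": {"finance"}, "tools": {"get_invoice", "list_budgets"}},
    "docs-mcp": {"teams": {"finance", "engineering"}, "tools": {"search_docs"}},
}

def visible_tools(user_teams: set[str]) -> dict[str, set[str]]:
    # Expose only the servers whose allowed teams intersect the user's teams.
    return {server: rule["tools"]
            for server, rule in POLICY.items()
            if rule["teams"] & user_teams}
```

An engineer would see only docs-mcp here; a finance employee would see both.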

Code Mode: the biggest quality-of-life upgrade

This is where things get genuinely clever. Standard MCP requires defining a separate tool schema for every API operation you want to expose. If you have a large platform with hundreds of endpoints, all of those schemas get loaded into the model's context window — burning through tokens and crowding out the actual task.

Cloudflare previously solved this for its own API (which has thousands of endpoints) with "Code Mode": instead of enumerating every tool upfront, the model gets just two tools — a search tool that lets it discover what's available on demand by writing JavaScript, and an execute tool that lets it call what it finds. The model only loads what it needs.
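The shape of the pattern can be sketched in a few lines. In real Code Mode the model writes JavaScript against discovered APIs; this Python stand-in, with an invented tool registry, just shows the two-tool search-then-execute structure:

```python
# Hypothetical registry standing in for dozens of MCP tool schemas
# that would otherwise all be loaded into the model's context.
TOOL_REGISTRY = {
    "jira_create_issue": {"description": "Create a Jira issue"},
    "drive_search": {"description": "Search Google Drive"},
    "wiki_lookup": {"description": "Look up an internal wiki page"},
}

def search(keyword: str) -> dict:
    # Tool 1: discover matching tools on demand instead of preloading everything.
    return {name: meta for name, meta in TOOL_REGISTRY.items()
            if keyword in name or keyword in meta["description"].lower()}

def execute(name: str, **kwargs) -> dict:
    # Tool 2: invoke a discovered tool by name (stubbed here).
    if name not in TOOL_REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    return {"tool": name, "args": kwargs}
```

Only the two entry points cost context up front; everything else is paid for lazily, when the model actually needs it.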

This week, Cloudflare made Code Mode available for all MCP server portals. The numbers are striking: connecting just four internal MCP servers normally exposes 52 tools consuming ~9,400 tokens of context. With Code Mode, those collapse to 2 portal tools consuming ~600 tokens — a 94% reduction. And crucially, the cost stays flat as more servers are added. Enabling it is as simple as appending ?codemode=search_and_execute to your portal URL.

AI Gateway for cost control, and catching shadow MCP

Cloudflare also positions AI Gateway between MCP clients and the underlying LLMs, enabling token budgets per employee and quick switching between model providers to avoid vendor lock-in.
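The article doesn't detail how budgets are implemented; as a hedged illustration only, a gateway-side per-employee budget amounts to a check like this before forwarding a request to the model provider:

```python
from collections import defaultdict

class TokenBudget:
    # Illustrative sketch of per-employee token budgeting at a gateway;
    # the class and its interface are invented for this example.
    def __init__(self, limit: int):
        self.limit = limit
        self.used = defaultdict(int)

    def allow(self, employee: str, tokens: int) -> bool:
        # Admit the request only if it fits within the employee's budget.
        if self.used[employee] + tokens > self.limit:
            return False
        self.used[employee] += tokens
        return True
```

Requests over budget are rejected at the gateway, before any provider is billed.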

On the detection side, Cloudflare Gateway — their secure web gateway product — can identify employees accessing unauthorized "shadow" MCP servers outside sanctioned portals. The technique combines hostname scanning, URL path detection, and DLP-based body inspection using regex patterns that detect the JSON-RPC method fields unique to MCP traffic ("tools/call", "initialize", etc.). Flagged traffic can be blocked, redirected, or simply logged for review.
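Because those method names are distinctive, the body-inspection piece can be sketched with a single regex (this pattern is illustrative, not Cloudflare's actual DLP rule):

```python
import re

# Match JSON-RPC "method" fields that are specific to MCP traffic.
MCP_METHOD_RE = re.compile(
    r'"method"\s*:\s*"(initialize|tools/(?:list|call)'
    r'|resources/(?:list|read)|prompts/(?:list|get))"'
)

def looks_like_mcp(body: str) -> bool:
    # Flag an HTTP body that appears to carry an MCP JSON-RPC message.
    return bool(MCP_METHOD_RE.search(body))
```

A generic JSON-RPC API wouldn't trip this, but a request to an unsanctioned MCP server would, at which point the gateway can block, redirect, or log it.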

Protecting public-facing MCP servers

The architecture also covers the other direction: Cloudflare's own customer-facing MCP servers, which let users agentically manage Cloudflare products. These sit behind the Cloudflare WAF with the AI Security for Apps feature enabled, automatically inspecting inbound MCP traffic for prompt injection attempts and sensitive data leakage.

The full stack, summarized

- Remote MCP servers: built from a monorepo template after governance approval, inheriting audit logging, CI/CD, and secrets management
- Cloudflare Access: OAuth with SSO, MFA, device certificates, and location-based policies in front of internal servers
- MCP server portals: one discovery URL per employee, plus centralized logging, DLP rules, and per-tool exposure policies
- Code Mode: two portal tools in place of dozens of schemas, keeping context costs flat as servers are added
- AI Gateway: per-employee token budgets and easy switching between model providers
- Cloudflare Gateway: detection and blocking of shadow MCP traffic
- WAF with AI Security for Apps: prompt-injection and data-leakage inspection for public-facing MCP servers

Why this matters beyond Cloudflare

The broader lesson here isn't really about Cloudflare's specific product stack — it's about the governance model. MCP adoption in enterprises is accelerating fast, and most organizations are still in the "let employees run whatever they want locally" phase. That works until it doesn't.

The pattern Cloudflare describes — centralize servers, enforce identity at the gateway, surface everything through a discovery portal, instrument the LLM connection, and scan for shadow usage — is a blueprint that applies regardless of which vendor's tools you use. The risks (supply chain attacks, prompt injection, data leakage, runaway token costs) are universal.

Cloudflare's recommendation for organizations earlier in this journey is to start by putting existing remote and third-party MCP servers behind a portal with Code Mode enabled. That single step captures most of the governance and cost benefits without requiring a full platform rethink.

MCP is moving fast. The organizations that get the governance layer right early will be the ones that can move fastest later.
