The Agent Protocol: How to Build AI Systems That Work Together

On April 9, Anthropic publicly launched Managed Agents — production infrastructure for deploying multiple Claude instances that coordinate on complex work. One agent researches. Another writes code. A third reviews. They share context, hand off tasks, and escalate disagreements to a supervisor.

This is not a research preview. It is a production-ready system built into the Claude API. Google followed days later with a similar multi-agent framework in Vertex AI. OpenAI is reportedly building the same for its Assistants API.

The shift matters because single-agent architectures hit a wall. Context windows overflow, instructions conflict, and errors compound without review. Multi-agent systems solve this by dividing cognitive labor the same way organizations divide human labor — specialized roles, clear interfaces, escalation paths.

But multi-agent systems fail in predictable ways that most teams discover only in production. These three prompts force you to design the protocol before you write the agents.

The Coordination Problem

Multi-agent systems are easy to start and hard to keep working. The same failure modes recur across designs:

| Failure Mode | What Happens | Frequency |
|---|---|---|
| Conflicting actions | Agent A edits a file while Agent B reads stale state | Very common |
| Infinite delegation | Agent A asks B which asks C which asks A | Common |
| Context loss | Critical information from Agent A never reaches Agent C | Very common |
| Silent failure | One agent fails quietly; others continue on bad assumptions | Common |
| Priority drift | Agents optimize locally while the global objective suffers | Universal |

The principle: A multi-agent system is not a collection of agents. It is a protocol — a set of rules governing how agents discover work, claim tasks, share state, and handle failure. Without the protocol, you have chaos with more compute.
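To make "protocol" concrete, here is a minimal Python sketch of one rule that targets the infinite delegation failure mode. Everything in it is an assumption for illustration: the `Task.chain` field, the `delegate` helper, and the `max_hops` limit are hypothetical names, not part of any vendor's API. It also treats any repeat visit as a loop, so a deliberate review-and-rework cycle would need its own retry counter instead.

```python
from dataclasses import dataclass, field


class DelegationCycleError(Exception):
    """Raised when a handoff would loop back or the chain grows too long."""


@dataclass
class Task:
    description: str
    # Ordered list of agents that have held this task so far (hypothetical field).
    chain: list[str] = field(default_factory=list)


def delegate(task: Task, to_agent: str, max_hops: int = 5) -> Task:
    """Record a handoff, refusing loops and overly long delegation chains."""
    if to_agent in task.chain:
        loop = " -> ".join(task.chain + [to_agent])
        raise DelegationCycleError(f"delegation loop: {loop}")
    if len(task.chain) >= max_hops:
        raise DelegationCycleError(f"{max_hops} hops reached; escalate to the supervisor")
    task.chain.append(to_agent)
    return task


task = Task("summarize the incident report")
for agent in ["researcher", "writer", "reviewer"]:
    delegate(task, agent)

try:
    # The reviewer tries to hand the task back to the start of the chain.
    delegate(task, "researcher")
    loop_detected = False
except DelegationCycleError:
    loop_detected = True
```

The point is less the ten lines of logic than where they live: the check belongs to the shared handoff path, not to any individual agent.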

Prompt 1 — The Agent Blueprint

Before writing any agent code, you need to define what each agent does, what it cannot do, and how agents find each other. This prompt forces you to design the org chart before hiring.

You are a systems architect designing a multi-agent AI
system. Define the agent topology before writing any code.

OBJECTIVE:
[Describe the overall goal of the system. What does it
produce? Who consumes the output? What is the quality bar?]

TASK DECOMPOSITION:
Break the objective into discrete subtasks. For each:
- What is the input?
- What is the output?
- What skills/tools are required?
- How long should it take?
- What can go wrong?

Now design the agents:

## 1. AGENT ROSTER
For each agent, define:
- NAME: A clear, role-based name (not "Agent 1")
- RESPONSIBILITY: One sentence. What does it own?
- TOOLS: Exact tools/APIs this agent can access
- CANNOT DO: Explicit list of what this agent must NOT do
- INPUTS: What it receives and from whom
- OUTPUTS: What it produces and for whom
- SUCCESS CRITERIA: How do you know it worked?

## 2. AUTHORITY HIERARCHY
- Who is the SUPERVISOR? (receives escalations, breaks ties)
- Who can CREATE new tasks? Who can only EXECUTE?
- Who can OVERRIDE another agent's output?
- What requires HUMAN approval?

## 3. BOUNDARIES
For each pair of agents that interact:
- What is the INTERFACE? (shared file, message queue, API)
- What is the CONTRACT? (schema, format, SLA)
- Who is responsible if the handoff fails?

## OUTPUT: AGENT TOPOLOGY DOCUMENT
Produce an agent roster table, authority diagram, and
interface contract for each agent-to-agent connection.

What happens when you run this: You will discover that most multi-agent designs have overlapping responsibilities and undefined interfaces. The “CANNOT DO” field is the most important — it prevents agents from stepping on each other. If you cannot clearly define what an agent must NOT do, you do not yet understand what it SHOULD do.
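One way to keep the roster honest after the prompt has produced it is to store it as data and lint it. The sketch below is a hypothetical illustration, not a prescribed format: the `AgentSpec` fields, the `writes` set, and the `roster_problems` helper are assumed names. It flags two of the problems described above, an empty CANNOT DO list and two agents allowed to write the same resource.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class AgentSpec:
    name: str
    responsibility: str
    cannot_do: frozenset[str]  # explicit prohibitions for this agent
    writes: frozenset[str]     # resources this agent may modify


def roster_problems(roster: list[AgentSpec]) -> list[str]:
    """Return human-readable findings; an empty list means no issue was detected."""
    problems = []
    for agent in roster:
        if not agent.cannot_do:
            problems.append(f"{agent.name}: CANNOT DO is empty")
    # Any resource with two writers needs an owner or a conflict rule.
    for a, b in combinations(roster, 2):
        for resource in sorted(a.writes & b.writes):
            problems.append(f"{a.name} and {b.name} both write {resource}")
    return problems


roster = [
    AgentSpec("researcher", "Gathers sources", frozenset({"edit source code"}), frozenset({"notes.md"})),
    AgentSpec("coder", "Implements changes", frozenset({"merge to main"}), frozenset({"src/"})),
    AgentSpec("reviewer", "Reviews changes", frozenset(), frozenset({"src/"})),
]
problems = roster_problems(roster)
```

Here the reviewer is flagged twice: it has no prohibitions, and it shares write access to `src/` with the coder.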

Pro tip: The authority hierarchy question catches the #1 architectural mistake: building multi-agent systems where every agent is a peer. Peer-to-peer agent systems produce brilliant, incompatible work. Someone has to be the manager.

Prompt 2 — The Coordination Protocol

The agent blueprint defines who does what. The coordination protocol defines how they work together without colliding. This is where most multi-agent systems break down.

You are designing the coordination rules for a multi-agent
system. These rules prevent conflicts, ensure consistency,
and keep agents aligned on the global objective.

AGENT ROSTER:
[Paste your agent topology from Prompt 1.]

Design the coordination protocol:

## 1. STATE MANAGEMENT
Define a single source of truth for shared state:
- What data is SHARED vs. PRIVATE to each agent?
- How do agents READ shared state? (pull vs. push)
- How do agents WRITE shared state? (locking, versioning)
- What happens on CONFLICT? (last-write-wins, merge, escalate)

Rule: if two agents can write to the same resource,
you MUST define a conflict resolution strategy.

## 2. TASK LIFECYCLE
Define the state machine for every task:
  CREATED -> CLAIMED -> IN_PROGRESS -> REVIEW -> DONE
- Who can transition each state?
- What happens if a task stays IN_PROGRESS for >N minutes?
- What happens if REVIEW rejects the work?
- Maximum retry count before escalation?

## 3. COMMUNICATION RULES
- BROADCAST: Messages all agents receive (new priorities,
  system alerts, configuration changes)
- DIRECTED: Messages between specific agents (task handoffs,
  review requests, status queries)
- ESCALATION: When and how an agent asks the supervisor
  for help (stuck >N minutes, conflicting instructions,
  uncertain confidence)

## 4. ORDERING GUARANTEES
- Which operations MUST be sequential?
- Which CAN be parallel?
- What is the critical path?
- Where are the bottlenecks?

## OUTPUT: COORDINATION SPEC
Produce a state machine diagram, communication matrix
(who talks to whom and when), and conflict resolution
rules for every shared resource.

What happens when you run this: The state management section will expose every race condition in your design. If Agent A reads a file, processes it for 30 seconds, and writes the result while Agent B modifies the same file during those 30 seconds, one of the two changes is silently lost unless the protocol detects the conflict. This prompt forces you to find and fix those collisions before they happen in production.
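The read, process, write collision can be caught with version checks (often called optimistic concurrency). The following is a minimal in-memory sketch under stated assumptions: `VersionedStore` and `StaleWriteError` are hypothetical names, and a real system would put the same check in a database or file service rather than a Python dict.

```python
import threading


class StaleWriteError(Exception):
    """The resource changed after the caller read it."""


class VersionedStore:
    """Shared state where every write must name the version it was based on."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: dict[str, tuple[int, str]] = {}

    def read(self, key: str) -> tuple[int, str]:
        with self._lock:
            return self._data.get(key, (0, ""))

    def write(self, key: str, value: str, expected_version: int) -> int:
        with self._lock:
            current_version, _ = self._data.get(key, (0, ""))
            if current_version != expected_version:
                raise StaleWriteError(
                    f"{key}: expected v{expected_version}, found v{current_version}"
                )
            self._data[key] = (current_version + 1, value)
            return current_version + 1


store = VersionedStore()
store.write("report.md", "draft", expected_version=0)

version_a, text_a = store.read("report.md")  # Agent A reads v1 and starts processing
store.write("report.md", "draft + B's edit", expected_version=1)  # Agent B writes meanwhile

try:
    store.write("report.md", text_a.upper(), expected_version=version_a)
    conflict = False
except StaleWriteError:
    conflict = True  # Agent A must re-read, merge, or escalate
```

What Agent A does after the rejection (retry, merge, escalate) is exactly the conflict resolution rule the prompt asks you to define.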

The Managed Agents Insight

Anthropic’s implementation reveals a key architectural choice that every builder should internalize:

Without a supervisor, agents optimize locally.

Local optimization + global ignorance = emergent dysfunction.

Each agent does its job perfectly while the system as a whole produces garbage. The supervisor exists to maintain global coherence — the same way an engineering manager ensures that the frontend team and backend team are building toward the same product, not two different ones.

Anthropic’s Managed Agents framework makes the supervisor mandatory. Every multi-agent workflow has a top-level orchestrator that delegates, monitors, and intervenes. This is not optional infrastructure — it is the core design pattern.

The three properties a supervisor must have:

- Visibility: Can see the state and progress of every sub-agent
- Authority: Can redirect, pause, or terminate any sub-agent
- Context: Understands the global objective, not just individual task status
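As a rough illustration of those three properties, here is a toy supervisor in Python. It is not Anthropic's implementation or API. `Supervisor`, `SubAgent`, and the `required_deliverables` set are hypothetical, and "context" is reduced to a simple membership check that a real orchestrator would replace with something much richer.

```python
from dataclasses import dataclass, field


@dataclass
class SubAgent:
    name: str
    deliverable: str       # what this agent is currently producing
    progress: float = 0.0  # 0.0 to 1.0, self-reported
    paused: bool = False


@dataclass
class Supervisor:
    objective: str
    required_deliverables: set[str]
    agents: dict[str, SubAgent] = field(default_factory=dict)

    def status(self) -> dict[str, float]:
        """Visibility: progress of every sub-agent in one place."""
        return {name: agent.progress for name, agent in self.agents.items()}

    def pause(self, name: str) -> None:
        """Authority: stop a sub-agent from the outside."""
        self.agents[name].paused = True

    def off_objective(self) -> list[str]:
        """Context: agents working on something the objective does not call for."""
        return sorted(
            name for name, agent in self.agents.items()
            if agent.deliverable not in self.required_deliverables
        )

    def intervene(self) -> list[str]:
        """Pause every drifting agent and report which ones were paused."""
        drifting = self.off_objective()
        for name in drifting:
            self.pause(name)
        return drifting


boss = Supervisor(
    objective="ship checkout API v2",
    required_deliverables={"api handlers", "integration tests"},
)
boss.agents["backend"] = SubAgent("backend", "api handlers", progress=0.6)
boss.agents["tester"] = SubAgent("tester", "integration tests", progress=0.2)
boss.agents["optimizer"] = SubAgent("optimizer", "rewrite logging layer", progress=0.9)
paused = boss.intervene()
```

The optimizer is 90% done and doing fine by its own measure, which is the local optimization problem in miniature.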

Prompt 3 — The Graceful Degradation Plan

In production, agents will fail. Networks drop, APIs throttle, models hallucinate, context windows overflow. The question is not if but how the system responds when one agent goes down.

You are a reliability engineer designing failure handling
for a multi-agent AI system. Every agent WILL fail. Your
job is to ensure the system degrades gracefully instead
of catastrophically.

AGENT ROSTER:
[Paste your agent topology from Prompt 1.]

COORDINATION PROTOCOL:
[Paste key rules from Prompt 2.]

Design the degradation plan:

## 1. FAILURE TAXONOMY
For each agent, enumerate failure modes:
- CRASH: Agent process dies unexpectedly
- TIMEOUT: Agent takes longer than SLA allows
- BAD OUTPUT: Agent produces output that fails validation
- HALLUCINATION: Agent produces confident but wrong output
- STALL: Agent stops making progress but doesn't crash

For each failure mode, define:
- How is it DETECTED? (heartbeat, output validation, timeout)
- Detection latency: how long before we KNOW it failed?

## 2. DEGRADATION LEVELS
Define what the system can still do with N-1 agents:
- FULL: All agents operational. Normal behavior.
- DEGRADED: One agent down. What functionality is lost?
  What continues? Is the output still valuable?
- MINIMAL: Multiple agents down. What is the absolute
  minimum viable output?
- OFFLINE: System cannot produce useful output. What
  does the user see? How are they notified?

## 3. RECOVERY STRATEGIES
For each failure mode:
- RETRY: Same agent, same input. When is this safe?
- FAILOVER: Different agent takes over. Who is the backup?
- SKIP: Skip the failed step. When is partial output
  better than no output?
- ESCALATE: Alert a human. What information do they need
  to diagnose and fix?
- ROLLBACK: Undo the failed agent's partial work. How?

## 4. THE DEAD AGENT PROTOCOL
When an agent fails mid-task:
- What happens to its in-progress work?
- What happens to tasks in its queue?
- How are downstream agents notified?
- How is shared state cleaned up?
- What is logged for post-mortem?

## OUTPUT: FAILURE HANDLING MATRIX
| Agent | Failure Mode | Detection | Response | Recovery |
Plus degradation level definitions and dead agent protocol.

What happens when you run this: The degradation levels section is where most teams have their breakthrough moment. When you are forced to define what “DEGRADED” looks like, you discover which agents are truly critical and which are nice-to-have. This directly informs your architecture: critical agents get redundancy, nice-to-have agents get graceful skip logic.
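Once the levels are defined, the mapping from "which agents are healthy" to "which level are we in" can be a single function. This sketch assumes a simplified rule set that is not from any framework: losing any agent marked critical means OFFLINE (that is, no redundancy has been added yet), one non-critical agent down means DEGRADED, and several down means MINIMAL. The agent names are invented for the example.

```python
from enum import Enum


class Level(Enum):
    FULL = "full"
    DEGRADED = "degraded"
    MINIMAL = "minimal"
    OFFLINE = "offline"


def degradation_level(all_agents: set[str], critical: set[str], healthy: set[str]) -> Level:
    """Classify system health from the set of agents currently responding."""
    down = all_agents - healthy
    if not down:
        return Level.FULL
    if down & critical:
        # Assumption: a critical agent has no backup, so losing it stops useful output.
        return Level.OFFLINE
    if len(down) == 1:
        return Level.DEGRADED
    return Level.MINIMAL


agents = {"researcher", "coder", "reviewer", "formatter"}
critical = {"coder"}

all_up = degradation_level(agents, critical, healthy=agents)
one_down = degradation_level(agents, critical, healthy=agents - {"formatter"})
two_down = degradation_level(agents, critical, healthy={"coder", "reviewer"})
coder_down = degradation_level(agents, critical, healthy=agents - {"coder"})
```

If adding a backup for the coder changes `coder_down` from OFFLINE to DEGRADED, that is the redundancy decision made visible in code.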

Pro tip: Test your degradation plan by killing agents on purpose. Stop one agent mid-task and observe what happens. If the system hangs, produces garbage, or loses data, your degradation plan is not a plan — it is a wish. Chaos engineering for AI agents is just as important as it is for microservices.
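A kill test can start very small. The sketch below simulates one under stated assumptions: `Coordinator`, its heartbeat timeout, and the explicit `now` arguments (used in place of a real clock so the run is deterministic) are all hypothetical. It covers only part of a dead agent protocol: detecting the missing heartbeat, logging what the agent held, and returning its task to the queue. Rolling back partial work is left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    CREATED = "CREATED"
    CLAIMED = "CLAIMED"
    IN_PROGRESS = "IN_PROGRESS"
    REVIEW = "REVIEW"
    DONE = "DONE"


@dataclass
class Task:
    task_id: str
    state: State = State.CREATED
    owner: Optional[str] = None


class Coordinator:
    def __init__(self, heartbeat_timeout: float) -> None:
        self.heartbeat_timeout = heartbeat_timeout
        self.last_seen: dict[str, float] = {}
        self.tasks: dict[str, Task] = {}
        self.log: list[str] = []

    def heartbeat(self, agent: str, now: float) -> None:
        self.last_seen[agent] = now

    def claim(self, agent: str, task_id: str) -> None:
        """Claim and start a task (CLAIMED and IN_PROGRESS are collapsed for brevity)."""
        task = self.tasks[task_id]
        if task.state is not State.CREATED:
            raise ValueError(f"{task_id} is not available")
        task.state, task.owner = State.IN_PROGRESS, agent

    def reap(self, now: float) -> list[str]:
        """Find agents with expired heartbeats and requeue their unfinished tasks."""
        dead = [a for a, seen in self.last_seen.items() if now - seen > self.heartbeat_timeout]
        for agent in dead:
            del self.last_seen[agent]
            for task in self.tasks.values():
                if task.owner == agent and task.state in (State.CLAIMED, State.IN_PROGRESS):
                    self.log.append(f"{agent} died holding {task.task_id} in {task.state.value}")
                    task.state, task.owner = State.CREATED, None
        return dead


coord = Coordinator(heartbeat_timeout=30.0)
coord.tasks["t1"] = Task("t1")
coord.heartbeat("writer", now=0.0)
coord.heartbeat("reviewer", now=0.0)
coord.claim("writer", "t1")

# Chaos step: the writer is "killed", so only the reviewer keeps sending heartbeats.
coord.heartbeat("reviewer", now=40.0)
dead = coord.reap(now=45.0)
```

After the reap, `t1` is claimable again and the log records what the writer was holding, which is the minimum a post-mortem needs.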

The Bigger Picture

Single-agent AI is a solo developer. Multi-agent AI is a software team. The productivity gap is enormous, but so is the coordination cost.

Anthropic’s bet with Managed Agents is that the coordination cost can be reduced to near zero through infrastructure. Google and OpenAI are making the same bet with their own frameworks. The industry consensus is forming: the future of production AI is multi-agent.

Your bet, as a builder, is that the protocol matters more than the agents themselves:

1. Define the topology first: who does what, who cannot do what, who supervises (Prompt 1)
2. Design the coordination rules: state management, task lifecycle, conflict resolution (Prompt 2)
3. Plan for failure from day one: degradation levels, recovery strategies, dead agent protocol (Prompt 3)

Issue #29 gave you safety gates for individual AI systems. This issue gives you the protocol to keep multiple agents working together safely at scale.

Next Issue

The Evaluation Lab: How to Know If Your AI Actually Works

You shipped an AI feature. It works. But how do you know it works well? Next issue: three prompts to build evaluation frameworks that catch regressions, measure real quality, and replace “it seems fine” with data.
