Claude Code Mastery, Part 7 of 12
Multi-Agent Pipelines
Chaining sub-agents, running them in parallel, and the patterns for 'review-while-coding' without losing your mind. Where Claude Code starts to feel like a small engineering org.
Multi-agent is the buzzword everyone slaps on a slide. It also happens to be where Claude Code gets genuinely interesting — when used surgically.
The shape that works: a small pipeline of bounded sub-agents, each doing one thing, with an explicit handoff. The shape that fails: "swarm of agents debating the architecture."
Let's get tactical.
The three patterns that actually ship
1. Linear pipeline (the bread and butter)
test-writer → test-fixer → code-reviewer → release-bot
Each step has one input and one output. Failures stop the pipeline. This is 80% of what teams use.
2. Fan-out / fan-in
When a task is naturally parallel — translating 5 files, generating tests for 12 modules, scanning logs from 8 services — fan it out.
            ┌─ translator(es) ─┐
            ├─ translator(fr) ─┤
spawner ──> ├─ translator(ar) ─┤ ──> merger
            ├─ translator(pt) ─┤
            └─ translator(de) ─┘
You spawn N specialised sub-agents in parallel, then a merger sub-agent reconciles the outputs (deduplicates, picks the highest-confidence variant, writes a single PR).
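A minimal sketch of the shape, using only the standard library. `translate` is a hypothetical stand-in for spawning one translator sub-agent; the merge step is deterministic on purpose.

```python
# Fan-out / fan-in: independent tasks run in parallel, then a
# deterministic merger reconciles the outputs.
from concurrent.futures import ThreadPoolExecutor

def translate(lang: str) -> tuple[str, str]:
    """Hypothetical placeholder for a translator sub-agent on one language."""
    return (lang, f"README.{lang}.md")

langs = ["es", "fr", "ar", "pt", "de"]

# Fan out: one worker per independent task.
with ThreadPoolExecutor(max_workers=len(langs)) as pool:
    outputs = list(pool.map(translate, langs))

# Fan in: deterministic merge (sort, then deduplicate by language).
merged = {lang: path for lang, path in sorted(outputs)}
```

Note that the merger has no judgment calls in it. The moment the merge needs taste, it stops being a merger and becomes a human decision.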
3. Critic loop
writer ↔ critic
Writer produces. Critic scores against a rubric. Writer revises. Stop when the critic's score reaches a threshold, or after N rounds.
This pattern shines for:
- Documentation rewrites.
- Migration plans.
- Refactor proposals where "is this clean?" is the gate.
The critic must be a different sub-agent from the writer. Same agent self-criticising is theatre.
Where multi-agent stops paying off
After a year, here is my honest take on the diminishing returns:
- 2 agents: Big leap. Writer + reviewer is a real win.
- 3-4 agents: Useful for clear pipelines (test-writer → fixer → reviewer).
- 5+ agents: Marginal at best. Coordination cost > delegation gain.
- "Swarm of 10 agents debating": A demo, not a workflow.
If your pipeline has more than 4 sub-agents, ask whether half of them could be regular shell commands or Makefile targets.
Concrete: a "PR factory" pipeline
Goal: take a Linear ticket, ship a PR.
1. ticket-reader → parses the Linear ticket, outputs a /feature prompt
2. implementer → writes the code
3. test-writer → writes / updates tests
4. test-fixer → if any test fails, fix the code (not the test)
5. code-reviewer → reviews the diff, verdict: SHIP / FIX-FIRST / REWRITE
6. release-bot → drafts PR description + changelog entry
7. (human) → reviews the diff and pushes
Each step is a sub-agent in .claude/agents/. The human step at the end is non-negotiable — that is where git push happens.
Run-time on a typical feature: 4-12 minutes wall clock. Human review at the end: 5-15 minutes. End-to-end: roughly a feature per hour, sustainably.
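The gate between steps 5 and 6 is the part worth getting right. A sketch, with `review_diff` as a hypothetical stand-in for the code-reviewer sub-agent's verdict:

```python
# Review gate: the pipeline only proceeds to drafting a PR when the
# reviewer's verdict is SHIP; anything else is surfaced to the human.

def review_diff(diff: str) -> str:
    """Hypothetical code-reviewer verdict: SHIP / FIX-FIRST / REWRITE."""
    return "SHIP" if "tests green" in diff else "FIX-FIRST"

def gate(diff: str) -> str:
    verdict = review_diff(diff)
    if verdict != "SHIP":
        return f"stopped: reviewer said {verdict}, human input needed"
    # Only now does release-bot run. The push itself never happens here.
    return "drafting PR description (push stays with the human)"

outcome = gate("feature diff, tests green")
```

The design choice: the pipeline fails closed. Anything short of SHIP stops the machine and hands the wheel back to you.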
How to actually invoke a pipeline
Two flavours.
Manual stepping (recommended at first)
> /agents implementer
> Goal: ... Constraints: ... DoD: ... Files: ...
# wait, review
> /agents test-writer
> Write tests for the new code.
# wait, review
> /agents code-reviewer
> Review the diff.
You stay in control. Slow but safe.
Orchestrated via a slash command
.claude/commands/pr-factory.md:
1. Spawn `implementer` with the user-provided goal.
2. Wait for completion. If implementer fails, abort.
3. Spawn `test-writer` on the diff.
4. Spawn `test-fixer` until tests pass or 3 retries are reached.
5. Spawn `code-reviewer` on the final diff.
6. If verdict != SHIP, surface to user and stop.
7. Otherwise spawn `release-bot` for PR description.
8. Print a one-line summary and stop. Never push.
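Step 4's retry logic is the only loop in that command, and it is worth being explicit about. A sketch, where `run_tests` and `spawn_fixer` are hypothetical stand-ins for running the suite and invoking the `test-fixer` sub-agent:

```python
# Bounded retry: keep spawning test-fixer until the suite is green or
# the retry budget is exhausted, then report honestly either way.

attempts_used = 0

def run_tests() -> bool:
    """Hypothetical placeholder: run the suite, True when green.
    (Toy behaviour: passes after two fix attempts.)"""
    return attempts_used >= 2

def spawn_fixer() -> None:
    """Hypothetical placeholder: one test-fixer run (fix code, not tests)."""
    global attempts_used
    attempts_used += 1

def fix_until_green(max_retries: int = 3) -> bool:
    for _ in range(max_retries):
        if run_tests():
            return True
        spawn_fixer()
    return run_tests()  # final check after the last attempt

green = fix_until_green()
```

The hard cap is the safety property: a fixer that cannot converge in three attempts is a signal for a human, not a reason to keep burning tokens.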
Then:
> /pr-factory
> Goal: <fill> Constraints: <fill> DoD: <fill> Files: <fill>
You hit one command. The pipeline runs. The push is still a human keystroke.
Parallel patterns — when to use them
Run agents in parallel when:
- The work is independent (translating 5 files, summarising 8 PRs).
- You can write a deterministic merger (concat, dedup, pick-highest-score).
- You can budget the cost (parallel = more API calls, faster wall clock).
Avoid parallel when:
- Tasks have dependencies (test-fixer needs the implementer's diff).
- The merge step is fuzzy ("which architecture do we like more?"). That is not a merge — that is a human decision.
The single most useful trick: explicit handoffs
Each sub-agent ends its turn by emitting a structured handoff:
status: ok | needs-human | failed
artifacts:
- path: src/cache.ts
- path: tests/cache.test.ts
notes: "Implemented LRU + TTL. All tests green."
next: test-writer | code-reviewer | done
The next agent reads this handoff and knows exactly where it is. No re-deriving context, no "wait, what was the goal again?"
Standardising the handoff is the single highest-leverage thing you can do once you have 3+ sub-agents. It is the multi-agent equivalent of a well-typed function signature.
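One way to make the handoff a typed contract rather than free text is a small validated structure. The field names follow the handoff example above; the class itself and its validation are an illustrative assumption, not something Claude Code provides.

```python
# A typed handoff: every sub-agent emits one of these, and the next
# sub-agent parses it instead of re-deriving context from scratch.
from dataclasses import dataclass, field

VALID_STATUS = {"ok", "needs-human", "failed"}

@dataclass
class Handoff:
    status: str
    artifacts: list[str] = field(default_factory=list)
    notes: str = ""
    next: str = "done"

    def __post_init__(self) -> None:
        if self.status not in VALID_STATUS:
            raise ValueError(f"unknown status: {self.status!r}")

h = Handoff(
    status="ok",
    artifacts=["src/cache.ts", "tests/cache.test.ts"],
    notes="Implemented LRU + TTL. All tests green.",
    next="test-writer",
)
```

Rejecting malformed handoffs at the boundary is the whole point: a sub-agent that emits `status: maybe?` fails loudly at the handoff instead of silently three stages later.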
Next article: Building Complete Features — taking everything from this and Articles 3-7 and walking through a real ticket-to-PR session, command by command.
Series — Claude Code Mastery
- Part 01: Claude Code vs ChatGPT vs Copilot vs Agents. Most developers are using the wrong AI tool for the wrong job. Here is why — and what to do instead.
- Part 02: Installation + The Antigravity Workflow. Installing Claude Code is a 30-second job. Setting up the workflow that makes the agent feel like it's doing the heavy lifting — that's the part nobody writes about.
- Part 03: Writing Prompts That Work. "Make it better" is not a prompt. "Refactor this for performance" is not a prompt. Here is the four-part structure that makes Claude Code actually finish what you asked.
- Part 04: Slash Commands — Building a Project from A to Z. /init, /agents, /compact and your own custom commands. The toolkit that lets you go from empty folder to running app without leaving the Claude prompt.
- Part 05: Sub-Agents — The 11 Specialized Experts Inside Claude Code. Slash commands reuse prompts. Sub-agents reuse whole personas — code-reviewer, test-writer, migration-runner. Here is the team you should have on day one.
- Part 06: Production Codebase Safety. Permissions, guardrails, and what not to automate. The unsexy article that decides whether Claude Code becomes infrastructure or becomes the reason you got paged at 2 AM.
- Part 07: Multi-Agent Pipelines (you are here). Chaining sub-agents, running them in parallel, and the patterns for 'review-while-coding' without losing your mind. Where Claude Code starts to feel like a small engineering org.
- Part 08: Building Complete Features. From Linear ticket to merged PR with Claude Code. A real, honest walk-through — what the prompt looked like, what the agent got right, what I caught in review.
- Part 09: Testing and Debugging. Letting Claude Code own the entire test loop. Including the parts that make engineers nervous: regressions, flakies, integration tests, and the stack-trace whisperer.
- Part 10: Team Workflows. How engineering teams are actually integrating Claude Code today. The shared .claude/ folder, the review rituals, and the anti-patterns I keep seeing in the wild.
- Part 11: Advanced Patterns — Hooks, MCP Servers, Custom Tools, System Prompts. Once you've outgrown the defaults: hooks for deterministic side effects, MCP servers for org-specific data, custom tools, and system-prompt surgery.
- Part 12: The Future of Agentic Development. Where this is going in 2026 and beyond. What I'd bet on, what I would not, and the line where I get sceptical of the hype.