
Specialist Agents and Slash-Command Skills: Dividing Labor for AI Coding

Why N specialists beat one generalist LLM, how role / skill / agent differ, what ships in the free Orchestrator (45 agents, 52 skills, AGPL-3.0), and what the upcoming Multi-Agent Orchestrator adds on top.

One generalist, or many specialists?

The default mental model for using a coding LLM is still "one chat, one model, one long conversation." It works until a task involves planning, implementation, testing, review, and documentation. Then a single conversation drags context from every phase into every other phase—the model confuses requirements with artifacts, review comments with code, and what it just finished with what it should do next.

Published work on agentic systems points the same way. Anthropic's multi-agent research system spawns specialist agents in parallel, each with its own context window, tools, and trajectory. On Anthropic's internal research evaluation, this setup outperformed single-agent Claude Opus 4 by over 90%, at the cost of roughly 15× more tokens. The takeaway is not "always go multi-agent," but that for tasks with clear sub-problems, parallel specialists outperform a monolithic conversation.

Role, skill, agent: three units of specialization

Three distinct concepts are in play:

A role prompt is a one-off instruction: "act as a security reviewer for this diff." Ephemeral, unversioned, no trace.

A skill is a saved posture—a markdown file with a system prompt and optional tool allowlist, called via /skill-name. It doesn't persist across turns; it shifts the model's stance for one focused pass. No process overhead, cheap.

An agent owns its own context window, tools, and subprocess. It runs in parallel with other agents. CrewAI models agents as team members with roles; LangGraph uses explicit state machines; AutoGen orchestrates agents as conversation participants. Different approaches, same principle: narrow the role, sharpen the output.

For each task, ask: do you need a long-lived role (agent), a posture shift for one pass (skill), or a single nudge (role prompt)?
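That question can be read as a small decision rule. The sketch below is only an illustration: the two boolean flags and the `choose_unit` name are assumptions introduced here, not anything the Orchestrator ships.

```python
from enum import Enum


class Unit(Enum):
    ROLE_PROMPT = "role prompt"
    SKILL = "skill"
    AGENT = "agent"


def choose_unit(needs_own_context: bool, reused_across_tasks: bool) -> Unit:
    """Pick the lightest unit of specialization that fits the task."""
    if needs_own_context:
        # Long-lived or parallel work: own context window, tools, subprocess.
        return Unit.AGENT
    if reused_across_tasks:
        # A saved posture, invoked via /skill-name for one focused pass.
        return Unit.SKILL
    # A one-off nudge typed straight into the conversation.
    return Unit.ROLE_PROMPT
```

The ordering is the design choice: reach for an agent only when the work really needs its own context window, and fall through to the cheaper options otherwise.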

What ships in the free Orchestrator

The free, AGPL-licensed VibeCoded Orchestrator ships 45 specialist agents and 52 skills out of the box. They cover the day-to-day loop of writing, reviewing, and maintaining a codebase on your own machine.

The 45 bundled agents cover the core development loop (coder, planner, tester, code-explorer, code-migrator, expert-coder, frontend-specialist, backend-specialist, gui-expert), knowledge curation (kg-navigator, knowledge-curator, code-graph-updater, doc-organizer, doc-extractor, doc-maintainer, graph-health-checker), workflow design (ai-agentic-architect, ai-llm-expert, helper-scripter, prompt-engineer, deep-researcher), project orchestration (orchestrator-installer, project-architect, project-bootstrapper, project-coordinator, project-migrator, project-organizer, gui-tester, web-explorer), and role-specialist packs (Consulting CTO, Senior Designer / UX, Vendor / Sales + Marketing, Senior Scientist, Automation / AI Engineer, Solo SaaS Founder, Senior DevOps / SRE — each pack adds ~2 agents). Every agent is a markdown file with a system prompt, a tool allowlist, and optional model and effort caps — you can read it, fork it, rewrite it.
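To make "an agent is a markdown file" concrete, here is a minimal sketch of reading one. It assumes a `---` delimited header followed by the system prompt; the field names (`name`, `tools`, `model`), the `AgentSpec` class, and the example file are illustrative assumptions, not the Orchestrator's confirmed schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AgentSpec:
    name: str
    system_prompt: str
    tools: List[str] = field(default_factory=list)
    model: Optional[str] = None


def parse_agent_file(text: str) -> AgentSpec:
    """Split a '---' delimited header from the system prompt body."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing front matter")
    # Find the closing '---' of the header block.
    end = next(
        (i for i, ln in enumerate(lines[1:], start=1) if ln.strip() == "---"),
        None,
    )
    if end is None:
        raise ValueError("unterminated front matter")
    meta = {}
    for ln in lines[1:end]:
        if ":" in ln:
            key, value = ln.split(":", 1)
            meta[key.strip()] = value.strip()
    if "name" not in meta:
        raise ValueError("agent file needs a name")
    # Tool allowlist is assumed to be a comma-separated list.
    tools = [t.strip() for t in meta.get("tools", "").split(",") if t.strip()]
    body = "\n".join(lines[end + 1:]).strip()
    return AgentSpec(meta["name"], body, tools, meta.get("model"))


# Hypothetical agent file, written for this example only.
EXAMPLE = """---
name: tester
tools: Read, Grep, Bash
model: sonnet
---
You write and run tests. Report failures with file and line.
"""

spec = parse_agent_file(EXAMPLE)
```

Because the whole definition is text, forking an agent amounts to copying the file and editing the prompt or the allowlist.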

The 52 skills cover posture-shifting moves: design (/architect, /api-designer, /architecture-consultant), implementation (/tdd, /debug-expert, /fix-issue), review (/security-reviewer, /code-review-expert, /performance-optimizer, /accessibility-checker), exploration (/explore-codebase, /kg-research, /extract-docs), plus a long tail of consulting + AI-domain skills. A skill is one command and a system prompt swap; an agent is a process.

Coordination in the free tier rides on Claude Code's native subagent model: each agent is a markdown file with its own system prompt, tool allowlist, and context window, spawned in parallel via the Agent tool. Research on this kind of pull-style specialist parallelism reports it performing 13–57% better than hierarchical delegation, both on token cost and on the degree of parallelism achieved. In practice, three parallel subagents is about the limit; beyond that, merging their context back becomes lossy and wastes tokens.
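The three-at-a-time ceiling is easy to picture as a concurrency cap. The sketch below uses plain `asyncio` and a semaphore; `run_agent` is a hypothetical stand-in for spawning a subagent, and nothing here is Claude Code's actual Agent tool API.

```python
import asyncio

MAX_PARALLEL = 3  # the practical limit mentioned above


async def run_agent(name: str, task: str) -> str:
    """Hypothetical stand-in for one subagent run."""
    await asyncio.sleep(0.01)
    return f"{name}: done ({task})"


async def dispatch(jobs, limit=MAX_PARALLEL):
    """Run all jobs, never more than `limit` at once.

    Returns the results in job order plus the peak concurrency observed.
    """
    gate = asyncio.Semaphore(limit)
    stats = {"running": 0, "peak": 0}

    async def guarded(name, task):
        async with gate:
            stats["running"] += 1
            stats["peak"] = max(stats["peak"], stats["running"])
            try:
                return await run_agent(name, task)
            finally:
                stats["running"] -= 1

    results = await asyncio.gather(*(guarded(n, t) for n, t in jobs))
    return list(results), stats["peak"]
```

Calling `asyncio.run(dispatch(jobs))` with five jobs still finishes all five, but only three are ever in flight together.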

Two concrete shapes

"Design and implement a new auth module." In the free Orchestrator you spawn the planner agent first; it breaks the work into tasks. You then dispatch coder, tester, and a /security-reviewer skill in parallel via the Agent tool. A reviewer agent gates the merge. The doc-organizer updates the reference. Five agents and one skill, one outcome, and no single context window had to hold all of it. This is plain Claude Code subagent orchestration on the free tier.

"Coordinate a refactor that touches frontend, backend, and database, and produce a tradeoff doc first." This is where the upcoming Multi-Agent Orchestrator (MAO) comes in. You describe the goal once; the maestro routes the request, the HTN planner with an LLM Oracle fallback decomposes it into a task graph with dependencies and validation criteria, then dispatches ai-agentic-architect to write the tradeoff doc, backend-specialist and frontend-specialist to implement in parallel, and deep-researcher to cross-reference prior decisions in the knowledge graph. A Tauri desktop UI lets you watch the team operate in real time, approve risky actions, and redirect the plan when it drifts. This is beyond what Claude Code subagents alone do — it needs a runtime to drive it.

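A task graph with dependencies can be sketched with the standard library alone. The example below groups tasks into batches that could run in parallel; the task names and the dependency edges are assumptions made up for this refactor scenario, and `graphlib` is simply a convenient stand-in, not MAO's planner.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (hypothetical edges).
GRAPH = {
    "tradeoff-doc": set(),
    "prior-decisions": set(),
    "backend": {"tradeoff-doc"},
    "frontend": {"tradeoff-doc"},
    "integration-check": {"backend", "frontend", "prior-decisions"},
}


def parallel_batches(graph):
    """Return lists of tasks; every task in a batch has its dependencies met."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock dependents
    return batches
```

With these edges the tradeoff doc and the prior-decision lookup come first, backend and frontend run side by side, and the final check waits for all of them.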
What MAO adds (coming soon)

The free Orchestrator gives you the parts: 45 agents, 52 skills, and Claude Code's native subagent dispatch. MAO (Multi-Agent Orchestrator) — currently pre-launch — is the runtime that wires them together.

MAO is built around the maestro: a conversational orchestration layer you talk to like a single agent, which delegates to the underlying specialists. Behind the maestro sits a hierarchical planner (HTN engine + LLM Oracle fallback that lifts learned decompositions back into the domain), a hybrid agent harness that routes individual roles across Claude / Ollama / OpenAI / Gemini depending on cost and capability, and a task graph with WAL-mode SQLite persistence so the team can pick up work in parallel without stepping on each other. A Tauri desktop UI lets you watch agents stream, monitor the task board, and intervene mid-plan.
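To show why WAL-mode SQLite suits a shared task board, here is a minimal sketch of agents claiming work without stepping on each other. The `tasks` schema and the `open_board`, `claim_next`, and `demo` helpers are assumptions for illustration; MAO's real tables and API are not public in this post.

```python
import os
import sqlite3
import tempfile


def open_board(path):
    """Open (or create) a task board with write-ahead logging enabled."""
    conn = sqlite3.connect(path, isolation_level=None)  # manual transactions
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending', owner TEXT)"
    )
    return conn


def claim_next(conn, agent):
    """Atomically claim the oldest pending task, or return None."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, title FROM tasks WHERE status = 'pending' "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE tasks SET status = 'claimed', owner = ? WHERE id = ?",
                (agent, row[0]),
            )
        conn.execute("COMMIT")
        return None if row is None else row[1]
    except Exception:
        conn.execute("ROLLBACK")
        raise


def demo():
    """Two tasks, three claimants: the third finds nothing left."""
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_board(os.path.join(tmp, "board.db"))
        conn.executemany(
            "INSERT INTO tasks (title) VALUES (?)",
            [("write tradeoff doc",), ("implement backend",)],
        )
        claims = (
            claim_next(conn, "ai-agentic-architect"),
            claim_next(conn, "backend-specialist"),
            claim_next(conn, "frontend-specialist"),
        )
        conn.close()
        return claims
```

WAL lets readers keep reading while one writer commits, and `BEGIN IMMEDIATE` makes the select-then-update pair a single step from every other agent's point of view.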

MAO also introduces Tier 2 (workflow) agents — multi-step pipelines that mix deterministic code steps, LLM steps, checkpoints, and MCP elicitation prompts in a single durable workflow. These are architecturally distinct from a Claude Code subagent (one prompt, one context window) and from a Tier 1 tool agent: a Tier 2 agent is a code-driven state machine with DB-checkpointed steps, so it survives crashes and can ask the human a structured question mid-run via the MCP elicitation channel.
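The crash-survival property comes from checkpointing each step before moving on. The sketch below is a toy version of that idea: the `checkpoints` table, `run_workflow`, and the step functions are hypothetical, and the MCP elicitation channel is not modeled at all.

```python
import json
import sqlite3


def run_workflow(conn, run_id, steps):
    """Run named steps in order, skipping any already checkpointed.

    `steps` is a list of (name, fn) pairs; each fn receives the outputs of
    earlier steps and returns a JSON-serializable result.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints ("
        "run_id TEXT, step TEXT, output TEXT, PRIMARY KEY (run_id, step))"
    )
    state, executed = {}, []
    for name, fn in steps:
        row = conn.execute(
            "SELECT output FROM checkpoints WHERE run_id = ? AND step = ?",
            (run_id, name),
        ).fetchone()
        if row is not None:
            # Finished in an earlier attempt: restore, do not re-run.
            state[name] = json.loads(row[0])
            continue
        state[name] = fn(state)
        with conn:  # commit the checkpoint before starting the next step
            conn.execute(
                "INSERT INTO checkpoints VALUES (?, ?, ?)",
                (run_id, name, json.dumps(state[name])),
            )
        executed.append(name)
    return state, executed
```

If a step raises halfway through, calling `run_workflow` again with the same `run_id` resumes at the failed step; deterministic code steps and LLM steps look identical to this loop, which is what lets them mix in one pipeline.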

These MAO components are not part of the free base or of Orchestrator Pro. Pro (€19/mo, €149/yr, €199 lifetime) adds RL reranking, curated agent packs, auto-updates, and the VCO-side Coordination MCP client to the same free base. MAO is a separate product, currently pre-launch, with pricing to be announced.

Where this sits in the landscape

| Tool | Built-in specialists | Parallel exec | User-extensible | Open source |
| --- | --- | --- | --- | --- |
| VibeCoded Orchestrator (free) | 45 agents + 52 skills | Yes (Claude Code subagents) | Yes (markdown) | AGPL |
| VibeCoded MAO (coming soon) | maestro + HTN planner + Tier 2 workflow agents | Yes (task graph + hybrid agent harness) | Yes | Commercial |
| Cursor | Composer + Agent modes | Limited | Partial | No |
| GitHub Copilot | Agent mode (single) | No | Limited | No |
| Aider | Role selection | No | Yes (CLI) | Yes |
| Devin (Cognition) | Single autonomous agent | Internal only | No | No |
| CrewAI | DIY (role-based) | Yes | Yes (framework) | Yes |
| LangGraph | DIY (state machines) | Yes | Yes (framework) | Yes |
| AutoGen | DIY (conversational) | Yes | Yes (framework) | Yes |

Most tools offer either a single agent (Cursor, Copilot, Devin) or a framework to build your own (CrewAI, AutoGen, LangGraph). The free Orchestrator is in the middle: 45 curated specialists and 52 skills—opinionated enough to work out of the box, open enough to rewrite as markdown.

When you outgrow plain Claude Code subagent dispatch and need a maestro, hierarchical planner, and Tier 2 workflow agents that survive crashes and ask structured questions mid-run, MAO is where that runtime lives.

