Falcon: An Agent Governance Control Plane
The interface an organization uses to supervise, bound, and verify a fleet of autonomous agents acting on its behalf. A self-initiated concept, taken from first principles to a production-fidelity prototype.
By Manish Todkari · Senior Product Designer
The Fleet view — exceptions first, routine compressed
Overview
Falcon is an agent governance control plane: the interface an organization uses to supervise, bound, and verify a fleet of autonomous agents operating on its behalf. I took the premise from first principles to a production-fidelity, interactive prototype, in the highest-stakes domain I could pick: money, access, and irreversible action.
Eight agents run across finance, payments, legal, support, operations, sales, and IT. Each has working approval queues, real audit trails, and live state that mutates when you click.
Role and scope
Role. Product Designer, self-initiated. Research, problem framing, UX architecture, interaction design, pattern language, and the prototype.
Scope. Fleet oversight model, a five-pattern governance design language, an eight-agent seeded scenario, working screens, and a fully interactive single-file prototype documented for engineering handoff.
Users. The accountable principal, the human whose name is on the line when an autonomous agent makes a mistake. Operations lead, financial controller, risk manager, IT owner. The patterns hold across any domain where an agent's actions have real-world consequences.
The challenge
Forty years of interface design optimized the efficiency of human action. When an agent does the work, the human's role shifts entirely. The labor moves to the machine, but the accountability stays with the person.
The reflexive industry answer is a chat window, which fails as a governance instrument. You cannot audit a conversation, you cannot filter one, and you cannot hand "everything looks fine" to a regulator. I studied how aviation, fund management, and executive delegation already solved supervised autonomy, then translated those precedents into a design language for agent oversight. The core principle: make the agent's authority legible, its decisions auditable, and its leash retractable, at any moment, in any state.
The five patterns
1. Autonomy as altitude
Agent permission is a three-position dial: Ask every action, Ask before irreversible, Autonomous. Set per agent, visible at a glance in the fleet table, retractable in one tap. The metaphor is deliberate. Altitude is a clearance, not a capability. The agent is cleared to fly at that level. The human holds the dial.
2. Plain-language scope contracts
What each agent may touch and may not touch is written as a readable list, not buried in a permissions system. Every item is affirmative or negative, in plain English, in the agent drawer. Authority as a first-class document, not a checkbox grid.
3. Blast-radius diffs
Before a human approves or denies an irreversible action (freeze an account, post a journal entry, cancel a purchase order), they see a before/after diff of exactly what changes, a reversibility tag, and the context that caused the agent to surface it. The decision is made with full information, not on a summary.
4. Exception-first attention
The approval queue foregrounds only what requires a human. The activity feed compresses routine success to one quiet row and expands failures into the full multi-step chain where the actual fault lives. On a good day, the queue is empty and the feed is green. The interface earns that quiet.
5. Eval-to-guardrail
Guardrails toggle per policy between Monitor, which records a breach and lets it through, and Block, which stops the action before execution. The right workflow is to tune in Monitor, verify accuracy, then switch to Block. The UI makes that transition two clicks, not a config-file edit.
Edge cases tested against the prototype
A pattern language is only as good as its behavior under pressure. Four scenarios stress the design:
- An agent acting outside scope. A staged action that exceeds the agent's authority routes to the approval queue with the breached clause shown, rather than failing silently or proceeding.
- The confident mistake. An agent surfaces an irreversible action it is sure about. The blast-radius diff lets a human catch what the agent's confidence missed.
- The deadline collision. Routine work and an urgent approval compete for attention. Exception-first hierarchy keeps the urgent item from drowning in the routine.
- The guardrail that fires too broadly. A new policy in Monitor mode reveals it would have blocked legitimate actions, so it gets tuned before it is switched to Block.
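As an added example (not code from the Falcon prototype), the sketch below shows one way patterns 1, 2 and 5 could compose into a single pre-execution gate, including the out-of-scope case that routes to the approval queue with the breached clause shown. Every name, clause, and threshold in it is hypothetical and constructed for illustration.

```javascript
// Added illustration; all identifiers and sample values are hypothetical.
const ALTITUDE = { ASK_EVERY: 0, ASK_IRREVERSIBLE: 1, AUTONOMOUS: 2 };

// Scope contract: plain-language clauses, each affirmative or negative,
// paired with a matcher so the gate can cite the exact clause it applied.
const payablesAgent = {
  name: "payables-agent", // constructed sample agent
  altitude: ALTITUDE.ASK_IRREVERSIBLE,
  scope: [
    { allow: true, text: "May pay approved invoices under $10,000",
      matches: (a) => a.type === "pay_invoice" && a.amount < 10000 },
    { allow: false, text: "May not pay vendors added in the last 30 days",
      matches: (a) => a.type === "pay_invoice" && a.vendorAgeDays < 30 },
  ],
};

// Guardrails: Monitor records a breach and lets it through; Block stops it.
const guardrails = [
  { id: "dup-invoice", mode: "monitor", breached: (a) => a.duplicateOf != null },
  { id: "sanctioned-payee", mode: "block", breached: (a) => a.payeeSanctioned === true },
];

function decide(agent, action, rails, auditLog) {
  // Every guardrail breach is audited, whichever mode the policy is in.
  const hits = rails.filter((g) => g.breached(action));
  hits.forEach((g) => auditLog.push({ agent: agent.name, guardrail: g.id, mode: g.mode }));
  const blocking = hits.find((g) => g.mode === "block");
  if (blocking) return { outcome: "blocked", reason: `guardrail ${blocking.id}` };

  // A matching negative clause always wins; no matching clause means no authority.
  const deny = agent.scope.find((c) => !c.allow && c.matches(action));
  const allow = agent.scope.find((c) => c.allow && c.matches(action));
  if (deny || !allow) {
    return { outcome: "queue", reason: "outside scope",
             breachedClause: deny ? deny.text : "No clause covers this action" };
  }

  // Altitude is a clearance: it decides whether an in-scope action needs a human.
  const needsHuman = agent.altitude === ALTITUDE.ASK_EVERY ||
    (agent.altitude === ALTITUDE.ASK_IRREVERSIBLE && action.irreversible === true);
  return needsHuman
    ? { outcome: "queue", reason: "altitude requires approval" }
    : { outcome: "execute", reason: allow.text };
}
```

The gate fails closed: an action that no affirmative clause covers is queued for a human rather than executed, and a Monitor-mode breach never changes the outcome, only the audit log.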
Why it matters
Execution is commoditizing. Capable models will perform the task in nearly every domain. What will not commoditize is adoption by accountable humans. A controller, a clinician, an operations lead cannot responsibly delegate to an autonomous system they cannot verify, no matter how capable it is. The verification layer is what converts raw capability into something a professional can put their name under.
The craft is not shrinking in the agentic era. It is relocating, from choreographing human actions to architecting machine trust. The control plane, the scope contract, the blast-radius diff, the audit log that is actually a feature. These are the new interface primitives. Most of the industry is still building the agent. This is the work on the layer that makes the agent usable.
Honest scope
A front-end prototype with seeded state. No real agents, no backend. Refreshing resets the scenario. The hard part of shipping it for real is integration with live systems of record and identifying areas where agents can act for you. I build these to think rigorously about where interface craft is heading as software learns to act for us.