Prompt, Agent, Harness: Where Authority Lives
There are three ways we have learned to put a language model to work, and they came to us in this order: prompt, agent, harness. None of them was a grand plan. Each was the thing we reached for when the one before it ran into a wall. But look at them from here, and they are not three tricks but one story, and the story is about authority: who gets to decide what happens next, and how much of that deciding we are willing to leave in the hands of the model. The whole distance from the first word to the last is two motions, and between them they hold most of the practical history of the field. First the loop leaves our hands and passes into the program. Then authority leaves the model and comes to rest in the small, knowable part of the system we can actually reason about.
Definitions would not help us here; they tend to hide as much as they show. Better to hold a whole world still and let one single thing change. So we will build a world small enough to keep in the mind all at once, and find our way out of it three times: once as a prompt, once as an agent, once as a harness. A world of our own making is a kind of laboratory, honest in a particular way: nothing lives in it that we did not place, and nothing moves in it that we cannot account for. Whatever differs across the three attempts is precisely the thing we came to understand, the place where authority comes to rest.
A world small enough to see
Our world is a maze: a small grid of corridors and walls, one square its entrance, one its way out. Call it the world model, the ground truth of the whole game, and notice well where it sits, outside the program we are about to write rather than inside it. A single rule gives the maze all its meaning: the model cannot see it. Everything that follows, every clean step and every collision, falls out of that one fact and its consequence. Between a model that cannot see and a world it must cross, something has to stand that can, and read the world on the model's behalf. And the whole of this post is one question: who is that something, and how much do we let it decide?
It is easier to show than to say. So before we build a single piece of it, here is the whole world at once, ours to take in from outside and hidden from the very thing we will one day set loose inside it.
The model is never handed the maze. It is given, at most, where it stands, and even then only the thin sliver we choose to show. The walls live in the world model and nowhere else, outside the program entirely, and this is exactly what makes the maze worth escaping rather than merely reading. What moves across the three acts is not the walls. It is who reads them. It is who stands in the gap between the blind model and the world, and how much they settle before the model is allowed to move.
The prompt: a path drawn in the dark
The oldest way to use a model is the simplest of all, and in it you do everything yourself. You are the operator. You can see the maze and the model cannot, so you read the world model with your own eyes and press what you saw into a sentence. A sentence comes back. The model is a function from text to text, and in that single gesture it is asked to be everything at once: to understand a maze it was only told about, to plan a route through it, and to commit to that route, all before it has taken one single step.
The failure is already visible in the very asking. A whole path, drawn in the dark, committed before the first step, into a wall the model has no way at all to see.
So we ask it for a route, and a route comes back, fluent and confident and fiction. The model has never seen the walls, so the path it draws runs through a maze it imagined, not the one we built. It holds complete authority over the answer, and not one single fact to stand it on.
When the route walks into a wall, as it must, there is no next move to make. The function has already returned, and whatever happens next happens in you. You read the failure, you frown, and you type the question again, a little differently. That is the only loop the prompt has, and it is worth seeing plainly, because the next idea begins by moving it. For now the loop lives outside the program, and you are the loop: the model is a single call with no memory of the last and no notion of a next, and every turn of the wheel is your own hand reaching in from outside to try once more. You stand between the world model and the program, in touch with both, while inside the program there is only the model, in touch with nothing at all. All authority, no sight, and the loop in your hands rather than its own. Both motions of our story are still ahead of us.
Permission to change the world
The prompt failed for a reason no better sentence could ever mend. We asked the model to author the whole crossing in advance, when what it truly needed was to meet the world one step at a time and let each step answer back. So we change the relationship. The model will no longer hand us a finished route; it will act on the world, and the world will change beneath it as it goes. But a model cannot reach into the world and alter it as it pleases, no more than you can reach your hand through a screen. The world model changes only through actions we define and place within its reach.
The unit of that reach is what Scott Wlaschin calls a capability. A capability is a move with the world already baked inside it: not a description of something the model might try, but the action itself, sealed and ready, that the model can only take or leave. move_east is a capability. To exercise it is not to ask to go east; it is to go east, changing where the explorer stands. The model cannot invent a move the maze does not define, cannot fabricate a square out of nothing and step onto it. It can only ever choose among the capabilities it has been handed.
To see the world and to change it are not the same act, and that difference is everything. We can show the model where it stands, or what lies beside it, at no risk, for reading leaves the world as it was. A capability is the other thing entirely. Every single one we hand over is a grant of authority, a standing permission to change the world in one particular way, and to give the model a capability is to give it exactly that much power over the state of the world model.
So the question that orders everything from here is not how clever the model is. It is who decides which capabilities the model holds when its turn comes. Hand it every capability the world defines and authority over the world rests entirely with the model. Hand it a chosen few and authority has quietly moved to whoever did the choosing. That one dial, the size and the source of the capability set, is where the second of our two motions will play out, and the agent and the harness are nothing more than two settings of it. The agent leaves it open wide. The harness begins, gently, to close it.
The agent: learning the walls by walking into them
Here is the first motion: the loop leaves your hands and is set down inside the program. Instead of the operator reading each answer and asking again, the program asks itself, through a loop we will call the agent. It calls the model, it reads the world model to see what the last move did, it feeds that back, and it calls again, turn after turn, with no hand reaching in from outside. The reading of the world has moved as well. You no longer lean over the maze between steps; the agent does it now, on the model's behalf. The loop that was yours becomes the program's own, and that is the whole of what we mean when we say agent.
And what does the agent give the model to act with? Every capability there is. All four moves, move_north, move_south, move_east, move_west, placed in the model's hands at once, the wall-bound and the open alike, before anything has checked which is which. The dial is open wide, and authority over the world stays with the model, which may attempt any change the maze defines, and learns which moves are walls the only way it now can: by walking straight into them.
A blocked move is not an error in the program. It is the world model answering back.
The loop is plain. The model is offered all four directions on every turn, it names one, and it lives with what comes back. Sometimes it is a step toward the exit; sometimes it is a wall, and a turn spent learning where that one wall stands. A budget of turns bounds the search, so even a run that never finds the way out still ends.
And it works. Given enough turns, the explorer maps the maze inside its head, one collision at a time, and reaches the exit. A single run reads like someone feeling their way along a wall in the dark, which is exactly what it is.
This is a real advance on the prompt, and we should say so plainly. The model is grounded now, correcting against a world instead of guessing at one. But look closely at where authority still sits. The model may name any of the four directions, the three that are wall among them, and the program can do nothing but let it try and report back. Safety here is a reaction, never a prevention. Half the turns in that run went to moves that could never land, and the only thing between the model and a wall was the wall itself. We gave the model eyes, yes, and left it free to keep walking into things. The loop has changed hands; authority has not. It sits whole in the model still, exactly where the prompt left it. And to move it is what the harness was built to do.
The harness: handing over only the doors that exist
Here, at last, is the second motion, the one that moves authority itself, and it is smaller than it sounds. It does nothing but turn that dial. The agent put every capability in the model's hands and left it to find the walls by trying them, one by one. The harness keeps the very same capabilities and changes one single thing: which of them the model is given. Before the model is asked anything at all, we work out the exact set of moves that are legal from where it stands, and hand it only those, and nothing else.
At the very same junction the agent stumbled through, the change is plain to see. The harness draws only the doors that truly exist and holds them out, and the walls are not refused but simply never offered at all.
The legal moves come from a plain little tool, and it pays to be clear about how little it actually does. Call it read_neighbors. From where the explorer stands, it keeps the directions whose next square is open, and names no others at all.
At the entrance, two of the four directions are wall, so read_neighbors returns only move_east and move_south, and the other two are never even named. The tool is deliberately, almost proudly dumb. It does not plan, it does not search, it does not know the way out; it reads the board and lists what is legal, the way a tic-tac-toe harness lists the empty cells. It is exactly the knowledge the model lacked and could buy in the agent loop only with collisions, and in a maze this small, one little line of arithmetic gives it away for free.
But a tool is not yet a harness. The harness is what wields it, and it is less a function than an orchestrator, itself a model. It reads the world model through tools of its very own, read_neighbors for the walls and read_user_position for where the explorer stands, and out of what it reads it provisions a second actor: the player. Orchestrator and player are the same shape, each a prompt and a set of tools, and the orchestrator holds the tools that build the player. With one of them, provide_guidance, it writes the player's prompt, the standing instruction for the turn. With another, provide_moves, it fills the player's tools, sealing each legal move into a capability the player can take. The player is the model we have been following all along, now narrowed to a choice among what it was handed. Legality belongs to read_neighbors, deterministic and dumb; the judgment of which moves to surface, and how to instruct the one who must take them, belongs to the orchestrator, and the reading we were once tempted to write by hand stays where the intelligence is.
And the narrowing buys us something the agent could never have. The model can still only choose among the capabilities it was handed; the harness has simply seen to it that what it was handed contains no walls at all. Each one was sealed from a move read_neighbors had already proven legal, so taking it cannot fail, and there is no branch in it for a wall.
The illegal move is not rejected, the way it once was in the agent loop. It is unrepresentable. The door into the wall was never placed in the player's hand at all, so the player cannot walk through it, cannot even name it. This is the principle of least authority: you make a system safe not by trusting the caller to behave, but by handing it only what it is allowed to do. And Wlaschin's counsel to design for a malicious caller turns out to describe a language model rather beautifully: not malicious, no, but unbounded, and best met the same way.
And watch what becomes of the model. Its work was to author a whole path; then to name a move and recover from its own mistakes; now it is only to choose. Its very type narrows as the harness quietly takes the work on.
The model began as the author of everything and arrives as a chooser among doors the harness has already opened for it. The loop around it is the simplest of the three: the orchestrator reads the world model and offers the doors that exist, the player walks through the one it picks, and the orchestrator reads the changed world and offers, patiently, again.
There are no wasted turns now, because there are no illegal moves to waste them on. There is no defensive validation, because there is nothing to defend against. Even stopping is itself a capability: when the explorer reaches the exit, the harness offers a different set of moves, or none at all, and the loop ends because the way onward has run out. The intelligence did not vanish between the agent and the harness. It moved. The deciding of what may be done next left the model and settled into the orchestrator, which stands on a deterministic tool. In a world this small, that tool is the single little line of arithmetic we wrote, and the tool, small as it is, is precisely the part we can actually reason about.
That smallness was a gift of the maze, and it lasts only as long as the world stays a tidy grid. In a real world the legal moves are no longer a line of arithmetic at all. Which slice of the problem bears on the next move, and which of the legal moves are worth surfacing, is a reading of the whole situation, and the orchestrator does that reading the only way anything reads a situation now: by being, itself, a model. The maze let its judgment stay trivial. Enlarge the world and that judgment becomes the entire game. But the shape holds either way: a model wielding tools to set the board for another model, and the one we watch playing is rarely the only intelligence in the room, only the louder of them.
And this is the sentence the rest of the post was built to earn: the safest agent is not the one we trust not to do the wrong thing. It is the one we never hand the wrong thing to.
The same loop, a different maze
None of this was ever really about mazes, of course. Strip the corridors away and look at what is left there on the page: a tool that reads the state of some world and returns the moves that are legal within it, and an orchestrator that offers those moves, takes the one the player picks, and then asks again. Change the world and the shape does not move at all.
We named it at the very outset, and the maze has now shown it to us: the three words are one single story about where authority lives. The prompt keeps all of it in the model and gives the model nothing at all to stand on. The agent grounds the model but still lets it reach for anything, and spends its safety swatting down the reaches that fail. The harness moves authority out of the model and into an orchestrator that hands over only what its tools prove real. The model did not grow more trustworthy across the three. We simply stopped needing it to be.
Now put a repository where the maze once was. The legal moves are no longer compass directions but the things you may do to a codebase, and which of them exist depends, exactly as before, on the state. Offer commit only when something is staged, merge only on a clean branch, deploy only when the tests are green, and the model can no more deploy a red build than the explorer could ever step through a wall. Swap the four directions for a repository's actions, swap the grid for your own project, and not one single line of the loop changes. This, exactly this, is what a coding agent is. The maze was never a metaphor for it. The maze was the very same program with a smaller world.
The runtime is the room
There is a more principled version of this very room, and it has a name in the stack I tend to reach for. We have drawn the room's shape by hand, but it is one that an effect system draws for you. A capability is a service: something the program may use only because the runtime has chosen to provide it. The agent loop becomes an effect that declares, right there in its very type, the services it depends on, and refuses to run at all until they are supplied.
The runtime is what provides them. It reads what an effect requires and hands over exactly that, and nothing more. This is the second motion turned into a type: authority the model cannot ever reclaim, because the runtime never placed it there to begin with. Least authority stops being a discipline you must remember to keep and becomes a property the types themselves enforce, a capability the model cannot use for the simple reason that it was never injected. That is a post of its own, and the very next one. For now, it is enough to notice that the room we built by hand already has a proper architecture waiting quietly beneath it.
Conclusion
We set out to bring a model out of a maze it could not see, and we did it three times. The prompt drew its confident path in the dark and walked straight into the wall it had no way to know was there. The agent felt its way out, learning each wall by colliding with it, free the whole time to collide again. The harness stopped the colliding altogether, not by making the model any wiser but by handing it, each turn, only the doors that were really there.
That is the whole arc the three words name, the two motions run to their end. The loop has left our hands for the program, and we are no longer the ones who must ask again. Authority has left the model for the harness, whose reach is bounded by the small, knowable tools we can actually reason about, and we are no longer the ones who have to be trusted. The model, you see, was always the player. The work of the last few years, the part we call progress, was mostly the building of the room it plays in, and the slow recognition that the room is exactly where the intelligence belongs.
A model is already the cleverest thing in the room, after all; what it lacks is never cleverness, only a room built well enough to hold it. So build the smallest world your model needs, hand it only the moves that exist, and let it choose.