First Principles / Part IV · Building with AI / Chapter 22
A model on its own can only write text. Put it in a loop with tools — search, a calculator, code, an API — and it can act: take a step, read the result, decide the next step, and repeat until the task is done. That loop is an agent.
01 The answer, then the intuition
A bare model can't check today's price, run a calculation exactly, or query your database — its knowledge is frozen and its only output is text. An agent gets around this by letting that text be an action. The model writes a structured tool call; the surrounding program runs it and feeds the result back; the model reads the result and decides what to do next. Loop until it has the answer.
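The round trip described above can be sketched in a few lines. This is a minimal illustration, not any particular vendor's format: the JSON shape (`tool` and `args` keys), the `lookup_price` stub, and its hard-coded value are all assumptions made for the example.

```python
import json

# Hypothetical tool: a stub standing in for a live price lookup.
def lookup_price(symbol: str) -> str:
    prices = {"BTC": "67000"}  # hard-coded placeholder, not real data
    return prices.get(symbol, "unknown")

TOOLS = {"lookup_price": lookup_price}

def run_tool_call(model_text: str) -> str:
    """Parse a tool call the model wrote as JSON text, run it, return the observation."""
    call = json.loads(model_text)       # the model's output is only text
    fn = TOOLS.get(call.get("tool"))
    if fn is None:
        return f"error: unknown tool {call.get('tool')!r}"
    return fn(**call.get("args", {}))   # the program, not the model, executes it

# Text the model might emit (assumed shape).
model_text = '{"tool": "lookup_price", "args": {"symbol": "BTC"}}'
observation = run_tool_call(model_text)
print(observation)  # 67000
```

The returned string is what gets appended to the model's context, so the next thing the model writes can depend on it.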
This "think → act → observe → repeat" cycle turns a predictor into a doer. Step through one agent solving a task that needs two tools it doesn't have built in — a live lookup and exact arithmetic:
The model can’t know a live price or do exact math alone — so it calls tools and reads the results.
Task: "What's a 15% tip on the current price of Bitcoin?"
02 Mechanics
So an agent isn't a smarter model — it's the same model wrapped in a control loop that lets it interact with the world and correct course. The intelligence is in the model; the agency is in the loop.
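One way to picture that control loop is as a function that keeps a history, asks a policy for the next move, and turns tool failures into observations the policy can react to. The sketch below is illustrative only: `run_agent`, `toy_policy`, the `price` stub, and the `(kind, name, arg)` tuple are hypothetical names chosen for this example, and the policy is hand-written where a real agent would call a model.

```python
from typing import Callable

History = list[tuple[str, str]]  # (kind, text) pairs: what was done and seen

def run_agent(policy: Callable[[History], tuple[str, str, str]],
              tools: dict[str, Callable[[str], str]],
              task: str, max_steps: int = 8) -> str:
    history: History = [("task", task)]
    for _ in range(max_steps):               # bound the number of steps
        kind, name, arg = policy(history)    # the policy decides the next move
        if kind == "answer":
            return arg                       # an answer halts the loop
        try:
            obs = tools[name](arg)
        except Exception as exc:             # failures become observations
            obs = f"error: {exc}"
        history.append(("act", f"{name}({arg})"))
        history.append(("obs", str(obs)))
    return "stopped: step limit reached"

def price(symbol: str) -> str:
    return {"BTC": "67000"}[symbol]          # raises KeyError for unknown symbols

def toy_policy(history: History) -> tuple[str, str, str]:
    last_kind, last_text = history[-1]
    if last_kind == "task":
        return ("act", "price", "bitcoin")   # first guess uses the wrong symbol
    if last_text.startswith("error"):
        return ("act", "price", "BTC")       # correct course after the error
    return ("answer", "", f"BTC is at ${last_text}")

result = run_agent(toy_policy, {"price": price}, "What is the BTC price?")
print(result)  # BTC is at $67000
```

The first call fails, the error text lands in the history, and the next decision uses it. Nothing about the policy changed between steps; only what it could see did.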
04 The math
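At each step $t$, the model acts as a policy, sampling an action $a_t$ from the full history of what it has thought, done, and seen:

$$a_t \sim \pi_\theta(\,\cdot \mid h_t), \qquad h_t = (x, a_1, o_1, \dots, a_{t-1}, o_{t-1})$$

where $x$ is the task and $\pi_\theta$ is the model. If $a_t$ is a tool call, the environment returns an observation $o_t = \text{tool}(a_t)$, which is appended to the history; if $a_t$ is an answer, the loop halts. Each step is a full forward pass over an ever-growing context, so a $k$-step task costs on the order of:

$$\sum_{t=1}^{k} |h_t| \ \text{tokens processed}$$

where $|h_t|$ is the length of the history in tokens at step $t$. If every step adds a roughly constant number of tokens, that sum grows as $O(k^2)$.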
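To make the growth concrete, here is a small sketch under a simplifying assumption that is not from the text: the context starts at `n0` tokens and every step appends a fixed `c` tokens. It also ignores any caching of earlier tokens, which real deployments may use to lower the bill. The numbers are illustrative, not measurements.

```python
def tokens_processed(k: int, n0: int, c: int) -> int:
    """Total tokens read across k steps when step t sees n0 + c*(t-1) tokens."""
    return sum(n0 + c * (t - 1) for t in range(1, k + 1))

single = tokens_processed(1, n0=1000, c=500)   # one call over the initial prompt
ten = tokens_processed(10, n0=1000, c=500)     # 10*1000 + 500*45
print(single, ten, ten / single)  # 1000 32500 32.5
```

Under these assumptions a ten-step run reads 32.5 times as many tokens as a single call, because every later step rereads everything that came before it.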
And reliability compounds the wrong way: if each step succeeds with probability $p$, a $k$-step chain succeeds with only $p^{k}$. At $p=0.95$ and $k=10$, that's $0.95^{10} \approx 0.60$ — which is why long agent runs are fragile and why bounding $k$ matters as much as raising $p$.
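The compounding is easy to check numerically. The sketch below assumes, as the formula does, that steps succeed independently with the same probability; `chain_success` and `max_steps_for` are names made up for this example.

```python
def chain_success(p: float, k: int) -> float:
    """Probability that all k independent steps succeed."""
    return p ** k

def max_steps_for(p: float, target: float) -> int:
    """Largest k such that p**k is still at least target."""
    if not (0 < p < 1 and 0 < target <= 1):
        raise ValueError("need 0 < p < 1 and 0 < target <= 1")
    k = 0
    while p ** (k + 1) >= target:
        k += 1
    return k

print(round(chain_success(0.95, 10), 3))  # 0.599
print(max_steps_for(0.95, 0.6))           # 9
print(max_steps_for(0.99, 0.6))           # 50
```

Raising per-step reliability from 0.95 to 0.99 stretches the usable chain from 9 steps to 50 at the same 60% target, which shows why both levers matter.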
05 The code
A minimal agent: tools, a policy (scripted here in place of the model), and the loop that ties them together.
agent.py
# stub tools: a canned price lookup and an arithmetic evaluator
def search(q):
    return "$67,000" if "BTC" in q else "unknown"

def calc(expr):
    # eval with empty builtins is enough for this demo, but it is not a
    # real sandbox; never pass untrusted text to it
    return eval(expr, {"__builtins__": {}}, {})

TOOLS = {"search": search, "calc": calc}

# a scripted policy standing in for the model's decisions
script = [
    ("think", "I need the current BTC price."),
    ("act", ("search", "BTC price")),
    ("think", "Now compute a 15% tip on 67000."),
    ("act", ("calc", "67000 * 0.15")),
    ("answer", "15% of $67,000 is ${}."),
]

obs = None  # most recent tool result
for kind, payload in script:  # the agent loop
    if kind == "act":
        tool, arg = payload
        obs = TOOLS[tool](arg)  # run the tool, observe result
        print(f"ACT {tool}({arg!r}) -> {obs}")
    elif kind == "think":
        print(f"THINK {payload}")
    else:
        print("ANSWER", payload.format(obs))  # fill in the last observation

# THINK I need the current BTC price.
# ACT search('BTC price') -> $67,000
# THINK Now compute a 15% tip on 67000.
# ACT calc('67000 * 0.15') -> 10050.0
# ANSWER 15% of $67,000 is $10050.0.
06 The economics
Agency → money
Agents are where AI stops answering questions and starts doing work — and that's the entire economic thesis of the build-out. A chatbot sells tokens; an agent that can complete a multi-step task competes with labor, a far larger market. This is the demand that the hundreds of billions in compute are betting will arrive.
But the cost structure is unforgiving. Every step is another model call over a longer context, so a single agent task can cost 10–100× a single chat reply — and the reliability math ($p^k$) means longer tasks fail more often, forcing retries that cost even more. The value has to clear a bill that grows with both the length and the fragility of the task.
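A rough way to see how retries inflate the bill: if an attempt succeeds with probability $p^k$ and failed attempts are simply rerun, the expected number of attempts is $1/p^k$. The sketch below rests on assumptions that are not from the text: attempts are independent, every attempt costs the same whether it succeeds or fails, failures are always detected, and the $0.50 per-attempt figure is a made-up placeholder.

```python
def expected_cost(cost_per_attempt: float, p: float, k: int) -> float:
    """Expected spend for one success when each attempt succeeds with p**k."""
    return cost_per_attempt / (p ** k)

# Hypothetical figures: $0.50 per 10-step attempt, 95% per-step reliability.
print(round(expected_cost(0.50, 0.95, 10), 2))  # 0.84
print(round(expected_cost(0.50, 0.95, 30), 2))  # 2.33
```

With these numbers, retries add about two thirds to the bill at 10 steps and more than quadruple it at 30, before counting the extra tokens a 30-step attempt reads in the first place.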
That tension is the crux of the Circuit's central question. If agents become reliable enough to automate real knowledge work, the demand easily justifies the clusters being built. If they stay just unreliable enough to need a human watching, the economics stay stubbornly hard. The whole payoff of the build-out rides on which way that goes — which is exactly what an honest research desk should be measuring, not assuming.
07 Going deeper
Yao et al. (2022) — ReAct · interleaving reasoning and tool actions.
Schick et al. (2023) — Toolformer · teaching models to call tools.
Model Context Protocol (MCP) · an open standard for connecting tools to models.
Anthropic — Building Effective Agents · patterns and the reliability problem.
Cite this chapter: Divergent Compute, "Agents & tool use", First Principles, 2026. divergentcompute.com/first-principles-agents · v1.0 · CC-BY.