What an AI agent actually is
Strip away the marketing and an agent is a loop: a model reads a goal, decides on an action, calls a tool, reads the result, and repeats until the goal is met or a stop condition fires. The difference from a chatbot is autonomy over multiple steps and access to real tools — your database, your APIs, a browser, a code runner.
That definition matters because it tells you where the engineering effort goes. It is not in the prompt. It is in the tools you expose, the guardrails around them, and the evaluation harness that tells you whether the loop actually works on your data.
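To make the loop concrete, here is a minimal sketch in Python. Everything in it is illustrative: `Action`, `AgentRun`, `run_agent`, and the `lookup_order` tool are hypothetical names, and `scripted_decide` is a hard-coded stand-in for the model call so the example runs without any API. Treat it as a picture of the control flow under those assumptions, not as a production design.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Action:
    tool: str       # name of the tool to call, or "finish" to stop
    argument: str   # input for the tool, or the final answer when finishing


@dataclass
class AgentRun:
    answer: Optional[str]   # None means the stop condition fired first
    steps: int
    trace: List[Tuple[str, str, str]] = field(default_factory=list)


def run_agent(
    goal: str,
    decide: Callable[[str, list], Action],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 8,
) -> AgentRun:
    """Decide, act, observe, repeat, until the goal is met or the cap is hit."""
    history: List[Tuple[str, str, str]] = []
    for step in range(1, max_steps + 1):
        action = decide(goal, history)           # the model picks the next action
        if action.tool == "finish":              # goal met
            return AgentRun(action.argument, step, history)
        if action.tool not in tools:             # guardrail: only exposed tools run
            observation = f"error: unknown tool {action.tool!r}"
        else:
            observation = tools[action.tool](action.argument)
        history.append((action.tool, action.argument, observation))
    return AgentRun(None, max_steps, history)    # stop condition: loop cap reached


# Hypothetical stand-in for the model: look the order up once, then answer.
def scripted_decide(goal: str, history: list) -> Action:
    if not history:
        return Action("lookup_order", "A-1001")
    return Action("finish", f"Order status: {history[-1][2]}")


# Hypothetical tool backed by an in-memory dict instead of a real database.
tools = {"lookup_order": lambda order_id: {"A-1001": "shipped"}.get(order_id, "not found")}

result = run_agent("Where is order A-1001?", scripted_decide, tools)
print(result.answer, result.steps)
```

Note how little of the sketch is about the prompt: most of the lines deal with which tools exist, what happens when the model asks for one that does not, and when the loop is forced to stop.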
Where agents deliver real ROI
The agents we ship that survive contact with production tend to fall into a few patterns:
- Research & synthesis — pulling from multiple sources, reconciling them, and producing a structured answer.
- Workflow automation — multi-step internal processes that today bounce between three tools and a human copy-pasting between them.
- Data extraction & routing — reading unstructured input (emails, PDFs, tickets), classifying it, and taking the next action.
What these share: a clear definition of "done," tolerance for a human-in-the-loop checkpoint, and a measurable baseline you can beat.
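As an illustration of the extraction-and-routing pattern with a human checkpoint, the sketch below classifies an incoming message and only acts on it when confidence clears a threshold. The `classify` function is a keyword stand-in for a model call, and the labels, confidence values, and `REVIEW_THRESHOLD` are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    label: str
    confidence: float


# Hypothetical cut-off: anything below it goes to a person instead of a queue.
REVIEW_THRESHOLD = 0.8


def classify(text: str) -> Classification:
    """Keyword stand-in for a model call; the confidences are invented."""
    lowered = text.lower()
    if "invoice" in lowered:
        return Classification("billing", 0.95)
    if "refund" in lowered:
        return Classification("billing", 0.6)
    return Classification("unknown", 0.3)


def route(text: str) -> str:
    """Return the destination for a message: a named queue or human review."""
    result = classify(text)
    if result.confidence < REVIEW_THRESHOLD:
        return "human_review"          # human-in-the-loop checkpoint
    return f"queue:{result.label}"     # confident enough to take the next action


print(route("Invoice 4471 attached"))  # queue:billing
print(route("I want a refund"))        # human_review
```

"Done" here is unambiguous (every message ends up in exactly one destination), and the share of messages sent to `human_review` gives a number to compare against the manual baseline.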
What agent development costs
For a focused, single-domain agent with 3–6 tools, a proper evaluation set, and a human review step, expect an £18k–£45k build over 4–8 weeks. Multi-agent systems, a large integration surface, or strict compliance requirements push that higher. The build cost is rarely the surprise; the running cost is. Token spend scales with how much context each loop carries and how many loops a task takes. We design for this with prompt caching, retrieval instead of stuffing everything into the prompt, and hard caps on loop count.
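A rough way to see why running cost surprises people is to model it. The sketch below assumes each loop re-sends the full context, and that the context grows by a fixed number of tokens per loop as tool results accumulate. The function name, the token counts, and the per-million-token prices are all placeholder assumptions, not real vendor pricing, and the model ignores prompt caching.

```python
def estimate_task_cost(
    loops: int,
    base_context_tokens: int,
    tokens_added_per_loop: int,
    output_tokens_per_loop: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Estimate the spend for one task, in whatever currency the prices use."""
    total_input = 0
    for i in range(loops):
        # Loop i re-sends the base context plus everything added by earlier loops.
        total_input += base_context_tokens + i * tokens_added_per_loop
    total_output = loops * output_tokens_per_loop
    return (
        total_input * input_price_per_mtok
        + total_output * output_price_per_mtok
    ) / 1_000_000


# Placeholder numbers: 4,000-token base context, 1,000 tokens added per loop,
# 300 output tokens per loop, prices of 3.0 and 15.0 per million tokens.
five_loops = estimate_task_cost(5, 4000, 1000, 300, 3.0, 15.0)
ten_loops = estimate_task_cost(10, 4000, 1000, 300, 3.0, 15.0)
print(f"{five_loops:.4f} {ten_loops:.4f}")  # 0.1125 0.3000
```

Under these assumptions, doubling the loop count from 5 to 10 raises the cost per task by roughly 2.7x rather than 2x, because every extra loop carries the context built up by the ones before it. That is the arithmetic behind capping loop count and retrieving only what a step needs.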
How we build them
- Define the eval first. Before a line of agent code, we write the test cases that define success.
- Start with the smallest tool set. Every tool is attack surface and a chance for the loop to go sideways.
- Put a human in the loop where the cost of being wrong is real — until the data proves the agent can be trusted unattended.
- Instrument every loop — token cost, latency, tool-call traces and success rate, from day one.
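A minimal sketch of what "eval first" can look like in practice: the test cases are written as data before any agent exists, and the harness records a pass/fail result and latency for each one. `EvalCase`, `run_evals`, and `stub_agent` are hypothetical names for this example; a real harness would also capture token cost and tool-call traces from the agent, which the stub here does not produce.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    name: str
    task: str
    check: Callable[[str], bool]   # returns True when the output counts as success


def run_evals(agent: Callable[[str], str], cases: List[EvalCase]) -> Dict:
    """Run every case through the agent and summarise pass rate and latency."""
    results = []
    for case in cases:
        start = time.perf_counter()
        try:
            passed = bool(case.check(agent(case.task)))
        except Exception:
            passed = False         # a crash counts as a failed case, not a skipped one
        results.append({
            "name": case.name,
            "passed": passed,
            "latency_s": time.perf_counter() - start,
        })
    pass_rate = sum(r["passed"] for r in results) / len(results) if results else 0.0
    return {"pass_rate": pass_rate, "results": results}


# The cases come first; they define success before any agent code is written.
cases = [
    EvalCase("known order", "Where is order A-1001?", lambda out: "shipped" in out),
    EvalCase("missing order", "Where is order Z-9999?", lambda out: "not found" in out),
]


# Hypothetical stub standing in for the agent under test.
def stub_agent(task: str) -> str:
    return "shipped" if "A-1001" in task else "unknown"


report = run_evals(stub_agent, cases)
print(report["pass_rate"])  # 0.5
```

The stub fails the second case on purpose: the value of the harness is that the failure shows up as a number on day one, before anyone has argued about prompts.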
When NOT to build an agent
If your process has no clear success criteria, if the cost of a wrong action is catastrophic and unrecoverable, or if a deterministic script would do the job — don't reach for an agent. The most expensive agent is the one built to solve a problem a well-placed if statement already handles.
The teams winning with agents in 2026 aren't the ones with the cleverest prompts. They're the ones who scoped the problem tightly and measured relentlessly.
If you're weighing an agent for a real workflow and want a straight answer on whether it's worth building, that's exactly the conversation we have on a discovery call.