Giving agents real hands
A fleet of thirteen small servers that give agents real hands. What they wrap, the two house styles I build them in, and why they all start read-only.
28 Jun 2026 · 3 min
An agent without tools is just an opinion. It can reason all day, but until it can touch something real, it is a very expensive autocomplete. So I went and built it hands.
The hands are a fleet of small servers, thirteen of them working right now. Each one wraps exactly one system I already use. DNS lives in two of them, one for cloudflare and one for godaddy. Hosting is three: dokku for the apps I run myself, netlify for the static sites, s3 for the object stores. CI and secrets go through github. Databases are mongo-tools and parse. Marketing and SEO are mailchimp and google search console. There is one for x, one for my personal finance tracker, which I call life-tracker, and one that wraps image generation through codex. One server, one job. None of them are clever. That is the point.
One server, one job
The temptation is to build the mega-tool: a single server that can do everything, with forty parameters and a switch statement in the middle. Do not do this. The model gets lost, you get lost, and when something breaks you cannot tell which of the forty paths did it.
Instead, each server exposes a handful of tools with sharp schemas. The schema is the prompt. If the description is vague, the model improvises. If it is exact, the model behaves. Most of the work in writing one of these is not the code that calls the API. It is naming the tool well and writing a description tight enough that the model cannot misread it.
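To make the difference between a vague and an exact definition concrete, here is a minimal sketch. The `ToolSpec` shape, the `list_dns_records` tool, and the thresholds in `lintTool` are all hypothetical, chosen only for illustration; they are not taken from any of the thirteen servers.

```typescript
// Hypothetical shapes for illustration; not the actual server definitions.
interface ParamSpec {
  type: "string" | "number" | "boolean";
  description: string;
  enum?: readonly string[]; // closed set of allowed values, when one exists
}

interface ToolSpec {
  name: string;
  description: string;
  params: Record<string, ParamSpec>;
}

// Vague: the model has to guess what "manage" covers and what "target" is.
const vague: ToolSpec = {
  name: "dns",
  description: "Manage DNS.",
  params: { target: { type: "string", description: "" } },
};

// Sharp: one verb, one noun, every parameter pinned down.
const sharp: ToolSpec = {
  name: "list_dns_records",
  description:
    "List the DNS records in one zone. Read-only. Returns name, type, content and TTL for each record.",
  params: {
    zone: { type: "string", description: "Zone apex, e.g. example.com (no trailing dot)." },
    recordType: {
      type: "string",
      description: "Only return records of this type.",
      enum: ["A", "AAAA", "CNAME", "MX", "TXT"],
    },
  },
};

// Crude lint with arbitrary thresholds: flags the gaps that invite improvisation.
function lintTool(tool: ToolSpec): string[] {
  const problems: string[] = [];
  // Require at least two snake_case words, e.g. verb_noun.
  if (!/^[a-z]+(_[a-z]+)+$/.test(tool.name)) {
    problems.push(`name "${tool.name}" should be at least two snake_case words`);
  }
  if (tool.description.trim().split(/\s+/).length < 8) {
    problems.push("description is too short to say what the tool returns");
  }
  for (const [key, spec] of Object.entries(tool.params)) {
    if (spec.description.trim() === "") {
      problems.push(`parameter "${key}" has no description`);
    }
  }
  return problems;
}

console.log(lintTool(vague)); // three problems: name, description, parameter
console.log(lintTool(sharp)); // []
```

A lint like this cannot tell you a description is good, only that it is obviously thin. The real check is still reading the definition the way the model will.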
Two house styles
Not every server deserves the same weight, so I build them in two shapes.
The common one is a light CLI-MCP scaffold. It is a single index.ts that wires up the server over stdio with @modelcontextprotocol/sdk, a shared.ts that holds the client setup and the small helpers every tool needs, and a tools/ folder where each file is one tool. Schemas are zod, so the validation and the model-facing description come from the same definition. Cloudflare, godaddy, github, the s3 wrapper: they all look like this. You can read one end to end in a few minutes, which matters when you are deciding whether to trust it with a write.
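The idea that validation and the model-facing description come from one definition can be sketched without any dependencies. In the sketch below, `Field`, `str`, `describeShape`, `validate`, and the `list_records` tool are hypothetical stand-ins: a real server in this style would use a zod schema and the SDK's own registration instead of the hand-rolled `registry`.

```typescript
// Stand-in for a zod schema: each field carries its own check and its own
// model-facing description, so the two cannot drift apart.
interface Field {
  description: string;
  required: boolean;
  check: (value: unknown) => boolean;
}

type Shape = Record<string, Field>;

const str = (description: string, required = true): Field => ({
  description,
  required,
  check: (v) => typeof v === "string" && v.length > 0,
});

// What the model sees: derived from the shape, never written by hand.
function describeShape(shape: Shape): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, field] of Object.entries(shape)) {
    out[key] = field.required ? field.description : `${field.description} (optional)`;
  }
  return out;
}

// What the handler relies on: the same shape, used as a validator.
function validate(shape: Shape, args: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const [key, field] of Object.entries(shape)) {
    const value = args[key];
    if (value === undefined) {
      if (field.required) errors.push(`${key}: missing`);
    } else if (!field.check(value)) {
      errors.push(`${key}: invalid`);
    }
  }
  return errors;
}

// A file under tools/ would export one object like this (hypothetical tool).
interface Tool {
  name: string;
  shape: Shape;
  run: (args: Record<string, unknown>) => string;
}

const listRecords: Tool = {
  name: "list_records",
  shape: {
    zone: str("Zone apex, e.g. example.com."),
    type: str("Record type filter, e.g. TXT.", false),
  },
  run: (args) => `would list ${String(args.type ?? "all")} records in ${String(args.zone)}`,
};

// index.ts would collect the tools and dispatch calls by name.
const registry = new Map<string, Tool>([[listRecords.name, listRecords]]);

function callTool(name: string, args: Record<string, unknown>): string {
  const tool = registry.get(name);
  if (!tool) return `error: unknown tool ${name}`;
  const errors = validate(tool.shape, args);
  return errors.length > 0 ? `error: ${errors.join("; ")}` : tool.run(args);
}

console.log(describeShape(listRecords.shape));
console.log(callTool("list_records", { zone: "example.com" }));
```

Because the description and the check live on the same field, tightening one forces you to look at the other.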
The heavier shape is a proper workspace monorepo, and I only reach for it on the product-grade servers, the ones where the wrapped system has real surface area and the server itself will grow features over time. There the extra structure pays for itself: separate packages, a build step, shared types that actually get reused. The light scaffold would buckle under that; the monorepo would be silly overhead on a six-tool DNS wrapper. Match the structure to the surface area of the thing it wraps.
Read-only until it earns write
Every server starts read-only. The agent can list, describe, and report. It cannot create, set, or destroy until I have watched it drive for a while and trust the shape of what it does.
This sounds cautious. It is. The first time an agent confidently runs a destructive command against the wrong target, you stop thinking of "autonomy" as a feature and start thinking of it as a blast radius. Read-only is how you map the box before you let anything touch the walls. Widening happens deliberately, one verb at a time, never as a default.
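One way to picture widening one verb at a time is a gate keyed on the verb. This is a sketch under assumptions, not how any of these servers necessarily do it: the `VerbGate` class, the `verb_noun` tool-name convention, and the example tool names are hypothetical. The verbs themselves are the ones named above.

```typescript
// Verbs named in the text. The verb_noun tool-name convention is an assumption.
type Verb = "list" | "describe" | "report" | "create" | "set" | "destroy";

const READ_VERBS: readonly Verb[] = ["list", "describe", "report"];

class VerbGate {
  // Starts with the read verbs only; nothing else is allowed by default.
  private allowed = new Set<string>(READ_VERBS);

  // Every widening is recorded with its reason.
  readonly grants: { verb: Verb; reason: string }[] = [];

  // Widen by exactly one verb; a grant without a reason is rejected.
  grant(verb: Verb, reason: string): void {
    if (reason.trim() === "") throw new Error("a grant needs a reason");
    this.allowed.add(verb);
    this.grants.push({ verb, reason });
  }

  // The verb is the part of the tool name before the first underscore.
  permits(toolName: string): boolean {
    const verb = toolName.split("_")[0] ?? "";
    return this.allowed.has(verb);
  }
}

const gate = new VerbGate();
console.log(gate.permits("list_apps"));   // true
console.log(gate.permits("destroy_app")); // false

gate.grant("create", "watched two weeks of clean list/describe runs");
console.log(gate.permits("create_app"));  // true
console.log(gate.permits("destroy_app")); // false
```

Granting `create` says nothing about `destroy`. Each verb has to be argued for separately.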
The unglamorous part
The demo shows the agent doing something impressive in four seconds. The demo does not show:
- the auth setup, which is most of the work
- the rate limiting, so a loop cannot bankrupt you overnight
- the dry-run mode, so you can see what it would do
- the logging, so when it does something surprising you can reconstruct why
That plumbing is the actual product. The flashy part is a thin layer on top. If you only build the flashy part you get a great video and a system you cannot run on anything you care about.
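Three of those four guards fit in one wrapper around a tool handler. The sketch below is a rough illustration: `guard`, `GuardOptions`, `LogEntry`, and the `destroy_app` handler are hypothetical names, the sliding-window limiter is just one possible choice, and auth is left out because it is specific to each wrapped system.

```typescript
// All names here are hypothetical; auth is omitted on purpose.
type Args = Record<string, unknown>;
type Handler = (args: Args) => Promise<string>;

interface LogEntry {
  at: number;
  tool: string;
  args: Args;
  outcome: "ok" | "dry-run" | "rate-limited" | "error";
  detail: string;
}

interface GuardOptions {
  tool: string;
  maxCalls: number;   // calls allowed per window
  windowMs: number;   // window length in milliseconds
  dryRun: boolean;    // when true, report the call instead of making it
  now?: () => number; // injectable clock, so the limiter can be tested
}

function guard(handler: Handler, opts: GuardOptions, log: LogEntry[]): Handler {
  const now = opts.now ?? (() => Date.now());
  let recent: number[] = []; // timestamps of calls inside the current window

  return async (args) => {
    const at = now();
    // Every path through the wrapper leaves exactly one log entry.
    const record = (outcome: LogEntry["outcome"], detail: string): string => {
      log.push({ at, tool: opts.tool, args, outcome, detail });
      return detail;
    };

    // Sliding window: drop timestamps older than windowMs, then count.
    recent = recent.filter((t) => at - t < opts.windowMs);
    if (recent.length >= opts.maxCalls) {
      return record("rate-limited", `refused: more than ${opts.maxCalls} calls in ${opts.windowMs} ms`);
    }
    recent.push(at);

    if (opts.dryRun) {
      return record("dry-run", `would call ${opts.tool} with ${JSON.stringify(args)}`);
    }
    try {
      return record("ok", await handler(args));
    } catch (err) {
      return record("error", err instanceof Error ? err.message : String(err));
    }
  };
}

// Usage: the real handler is never invoked while dryRun is true.
const auditLog: LogEntry[] = [];
const destroyApp: Handler = async (args) => `destroyed ${String(args.app)}`;
const guarded = guard(
  destroyApp,
  { tool: "destroy_app", maxCalls: 3, windowMs: 60_000, dryRun: true },
  auditLog,
);

guarded({ app: "blog" }).then((result) => console.log(result, auditLog.length));
```

Refusals and failures come back as text rather than thrown errors, so the agent can read what happened, and the log keeps the arguments needed to reconstruct why.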
What they are not for
These servers are not a general intelligence. They are a way to hand a specific, bounded capability to something that can decide when to use it. The judgment is still yours: which systems get hands at all, which operations stay behind a confirmation, what the agent is never allowed to touch.
Give an agent hands and it gets dramatically more useful. It also gets a way to be wrong in production. Build the boring guards first. The hands are the easy part.