A note on building agents for CRUD-based apps without rewriting your product.

Everyone's racing to add AI to their product. Most teams end up rewriting half of it. New backend, new service, new data layer that slowly diverges from the old one until nobody knows which is the source of truth.

There's a calmer way. And it starts with noticing something you already built.

Every mature product has a layer of database functions sitting underneath the UI. Fetch this customer's portfolio. Pull the last 30 days of transactions. Compute the rolling average. People call it the CRUD layer: create, read, update, delete. It's boring, it's tested, it's been running in production for years.

Your dashboards are just a pretty face on top of it. The actual product is that layer.

And if you already have it, you're most of the way to having an agent. You just need something thin sitting above it that knows which function to call and what to do with the result.

What an agent actually is

Not a chatbot. A chatbot answers. An agent decides: it picks a tool, calls it, looks at the result, then decides whether to stop or go again.
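
That loop is small enough to sketch. This is a minimal version, not a definitive implementation: `get_balance`, `fake_decide`, and `run_agent` are hypothetical names, and `fake_decide` is a hard-coded stand-in for the place where a real model call would go.

```python
from typing import Any, Callable

# Hypothetical tool: a stand-in for one of your existing CRUD functions.
def get_balance(customer_id: str) -> dict[str, Any]:
    return {"customer_id": customer_id, "balance": 1250.0}

TOOLS: dict[str, Callable[..., Any]] = {"get_balance": get_balance}

def fake_decide(question: str, history: list[dict[str, Any]]) -> dict[str, Any]:
    """Stand-in for the model call; a real one would prompt a model here."""
    if not history:
        # Nothing fetched yet: ask for the one tool we have.
        return {"action": "call", "tool": "get_balance", "args": {"customer_id": "c-1"}}
    # We have a result: stop and answer from it.
    balance = history[-1]["result"]["balance"]
    return {"action": "stop", "answer": f"Balance is {balance}"}

def run_agent(question: str, decide: Callable = fake_decide, max_steps: int = 5) -> str:
    history: list[dict[str, Any]] = []
    for _ in range(max_steps):
        step = decide(question, history)                 # decide what to do next
        if step["action"] == "stop":
            return step["answer"]
        result = TOOLS[step["tool"]](**step["args"])     # call the chosen tool
        history.append({"tool": step["tool"], "result": result})  # keep the result
    return "Stopped: step limit reached."

print(run_agent("What is my balance?"))
```

The step limit matters more than it looks. Without it, a confused model can keep calling tools forever.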

The "tool" in that sentence is just a function the AI is allowed to call. Your CRUD functions already fit that shape. So you don't need new code. You need a registry: a plain list of which functions exist, what they do, and what parameters they take.

The obvious move is to tag the CRUD functions directly. It works. But if that layer is shared across teams, you don't want to touch it every time you wire up a new capability. So you leave the CRUD code alone and keep the registry in a separate file. Five lines per function, pointing at the real code without touching it.
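
Here is roughly what a sidecar registry could look like. Treat it as a sketch under assumptions: `ToolSpec`, `REGISTRY`, `resolve`, and `call_tool` are names I made up for illustration, and the entry points at `statistics.mean` from the standard library only so the example runs on its own. In a real app the target would be a path into your CRUD module.

```python
import importlib
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class ToolSpec:
    name: str                 # what the agent calls it
    description: str          # what the model reads when choosing
    params: dict[str, str]    # parameter name -> short description
    target: str               # "module:function", resolved lazily

# The sidecar: roughly five lines per function, no edits to the CRUD code.
REGISTRY = [
    ToolSpec(
        name="average",
        description="Average of a list of numbers.",
        params={"data": "list of numbers"},
        target="statistics:mean",  # stand-in for a real CRUD function path
    ),
]

def resolve(spec: ToolSpec) -> Callable[..., Any]:
    """Import the module and return the function the spec points at."""
    module_name, func_name = spec.target.split(":")
    return getattr(importlib.import_module(module_name), func_name)

def call_tool(name: str, **kwargs: Any) -> Any:
    spec = next(s for s in REGISTRY if s.name == name)
    unknown = set(kwargs) - set(spec.params)
    if unknown:
        # Reject arguments the registry never declared.
        raise ValueError(f"Unexpected parameters for {name}: {sorted(unknown)}")
    return resolve(spec)(**kwargs)

print(call_tool("average", data=[10, 20, 30]))
```

The CRUD module never imports the registry. The dependency only points one way, which is why another team can ship without knowing the agent exists.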

Boring. That's the point. Boring things don't break when another team ships.

Where most agent projects fall apart

You give the model a list of 50 functions. Someone asks "how's the business doing this quarter?" and it picks a weird combination, gets confused halfway through, and returns a confident-sounding paragraph based on incomplete data.

The fix isn't a smarter model. It's fewer choices.

You add a layer called skills on top of tools. A skill is a named recipe that hard-restricts what the agent can do for a specific type of question. Ask for a "business overview" and the agent doesn't see 50 tools. It sees one: the one pre-built for exactly that question.

Tools are atoms. Skills are the molecules you actually want to serve.
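
A rough sketch of the restriction, with assumptions flagged: the tool functions, the `Skill` class, and the trigger phrases are all hypothetical, and matching by substring is just the simplest thing that runs. You could match skills with a classifier or a model call instead.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical tools standing in for registered CRUD functions.
def business_overview() -> dict:
    return {"revenue": 0, "active_customers": 0}

def list_transactions() -> list:
    return []

def get_portfolio() -> dict:
    return {}

ALL_TOOLS: dict[str, Callable] = {
    "business_overview": business_overview,
    "list_transactions": list_transactions,
    "get_portfolio": get_portfolio,
}

@dataclass(frozen=True)
class Skill:
    name: str
    triggers: tuple[str, ...]        # phrases that select this skill
    allowed_tools: tuple[str, ...]   # the only tools the agent may see

SKILLS = [
    Skill(
        name="business_overview",
        triggers=("business overview", "how's the business"),
        allowed_tools=("business_overview",),
    ),
]

def match_skill(question: str) -> Optional[Skill]:
    """Return the first skill whose trigger phrase appears in the question."""
    q = question.lower()
    for skill in SKILLS:
        if any(trigger in q for trigger in skill.triggers):
            return skill
    return None

def visible_tools(question: str) -> dict[str, Callable]:
    """Tools the agent is allowed to see for this question; empty if no skill matches."""
    skill = match_skill(question)
    if skill is None:
        return {}
    return {name: ALL_TOOLS[name] for name in skill.allowed_tools}

print(list(visible_tools("How's the business doing this quarter?")))
```

The restriction happens before the model sees anything. It can't pick a weird combination of tools it was never shown.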

There's a real tension here. AI's whole pitch is that the model figures things out on its own. A good product's whole pitch is that the same question gets the same answer every time. Those go in opposite directions. Skills are how you decide which one wins.

My opinion: give up on "the AI decides everything" early. A constrained agent is less impressive in a demo. It's the only version users will actually trust.

What you do when a question falls outside your skills

Someone asks "what's in the preferences table?" or "do we have refund data somewhere?" You haven't built a skill for that.

Instead of adding a new tool every time, give the agent a small set of read-only introspection tools. List the tables. Describe the columns. Preview a few rows. That's it: no arbitrary queries, no writes, hard limits on how much comes back.
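
As a sketch, here are those three tools against SQLite, using Python's built-in `sqlite3` module. The function names, the row cap of 5, and the demo `preferences` table are my assumptions for illustration. Your database and limits will differ.

```python
import sqlite3

MAX_PREVIEW_ROWS = 5  # hard cap, whatever the caller asks for (assumed value)

def list_tables(conn: sqlite3.Connection) -> list[str]:
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [r[0] for r in rows]

def _quoted(conn: sqlite3.Connection, table: str) -> str:
    """Allow only table names that actually exist, then quote the identifier."""
    if table not in list_tables(conn):
        raise ValueError(f"Unknown table: {table!r}")
    return '"' + table.replace('"', '""') + '"'

def describe_columns(conn: sqlite3.Connection, table: str) -> list[dict[str, str]]:
    rows = conn.execute(f"PRAGMA table_info({_quoted(conn, table)})").fetchall()
    # table_info rows are (cid, name, type, notnull, dflt_value, pk)
    return [{"name": r[1], "type": r[2]} for r in rows]

def preview_rows(conn: sqlite3.Connection, table: str, limit: int = MAX_PREVIEW_ROWS) -> list[tuple]:
    limit = max(0, min(limit, MAX_PREVIEW_ROWS))  # clamp to the hard cap
    query = f"SELECT * FROM {_quoted(conn, table)} LIMIT ?"
    return conn.execute(query, (limit,)).fetchall()

# Demo on an in-memory database with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE preferences (user_id TEXT, theme TEXT)")
conn.executemany(
    "INSERT INTO preferences VALUES (?, ?)",
    [(f"u{i}", "dark") for i in range(20)],
)
conn.commit()
conn.execute("PRAGMA query_only = ON")  # refuse writes on this connection from here on

print(list_tables(conn))
print(describe_columns(conn, "preferences"))
print(len(preview_rows(conn, "preferences", limit=1000)))
```

Note what the agent never gets: a function that takes SQL. Every query string is fixed in code, and the only thing the model controls is a table name checked against the real list.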

It covers maybe 10% of the edge cases. But it means you're not in an endless loop of "someone asked something new, go build a tool."

The part nobody wants to do but everyone eventually needs

Once you ship, every change becomes a potential silent regression.

You edit a tool description. The agent stops picking it. You change a word in a skill. It stops matching. The data team renames a column. Nothing breaks, the tool just returns empty results, and you find out from a support ticket three weeks later.

You need evals. A list of real questions paired with what a good answer looks like. Run it after every change.

Three checks, in order of cost:

Did the agent pick the right tool? Cheap set check, no model needed.

Did the tool return actual data? Empty results are almost always a bug.

Does the answer make sense? Hand this to another model: give it the question, the output, and what a good answer should contain, then ask for a pass/fail with a reason.

The third one is slow and costs money, so you run it less. But it's the only one that catches quality drift before it compounds.
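
The three checks fit in one small harness. A sketch, with assumptions: `EvalCase`, `AgentRun`, and `evaluate` are hypothetical names, the tool check treats "right tool" as "every expected tool was called", and `keyword_judge` is a placeholder for the model-graded check, not a real judge.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

@dataclass
class EvalCase:
    question: str
    expected_tools: set[str]   # tools a correct run should call
    must_contain: list[str]    # what a good answer should contain

@dataclass
class AgentRun:
    tools_called: list[str]
    tool_results: list[Any]
    answer: str

def check_tool_choice(case: EvalCase, run: AgentRun) -> bool:
    """Check 1: cheap set check, no model needed."""
    return case.expected_tools <= set(run.tools_called)

def _is_empty(result: Any) -> bool:
    return result is None or (hasattr(result, "__len__") and len(result) == 0)

def check_non_empty(run: AgentRun) -> bool:
    """Check 2: every tool call returned actual data."""
    return bool(run.tool_results) and not any(_is_empty(r) for r in run.tool_results)

# Check 3 is pluggable: (question, answer, must_contain) -> (passed, reason).
Judge = Callable[[str, str, list[str]], tuple[bool, str]]

def keyword_judge(question: str, answer: str, must_contain: list[str]) -> tuple[bool, str]:
    """Placeholder for a model-graded check; a real judge would call a model."""
    missing = [k for k in must_contain if k.lower() not in answer.lower()]
    return (not missing, "ok" if not missing else f"missing: {missing}")

def evaluate(case: EvalCase, run: AgentRun, judge: Optional[Judge] = None) -> dict[str, Any]:
    report: dict[str, Any] = {
        "tool_choice": check_tool_choice(case, run),
        "non_empty": check_non_empty(run),
    }
    # Only pay for the expensive check when the cheap ones pass.
    if judge is not None and all(report.values()):
        report["judge"], report["judge_reason"] = judge(
            case.question, run.answer, case.must_contain
        )
    return report

case = EvalCase("How's the business doing?", {"business_overview"}, ["revenue"])
good = AgentRun(["business_overview"], [{"revenue": 100}], "Revenue is 100.")
empty = AgentRun(["business_overview"], [[]], "No data found.")
print(evaluate(case, good, judge=keyword_judge))
print(evaluate(case, empty, judge=keyword_judge))
```

Passing `judge=None` gives you the fast version to run on every change. Pass a judge for the slower, less frequent run.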

Skip all three and you'll ship regressions confidently, weekly. Not occasionally. Weekly.

The whole thing in one line

Leave the product alone. Wrap it.

Your CRUD layer is the knowledge. The agent is a thin wrapper with a registry, a set of skills, a peek-at-the-schema tool, and a test suite that runs on every change.

Adding a new capability is five lines in a sidecar file and one new test case.

Most teams skip the boring layer because it doesn't feel like AI work. Then they wonder why the interesting layer doesn't hold up.

One thing worth saying: everything here is drawn from what I've built and shipped, specifically on a CRUD-based fintech platform where the data layer is well-defined and stable. That context shapes a lot of the thinking. If your product doesn't have a clean CRUD layer (it's event-driven, or heavily real-time, or the data model is genuinely messy), this pattern won't map cleanly. It's a starting point for a specific kind of app, not a universal blueprint.