I Write Far Less Code Than I Used To. The Job Got Harder.

A year ago I spent most of my day inside an editor. These days I spend it reading plans, reviewing diffs, and deciding whether to trust what an agent just produced — often across three or four sessions running at once. I write far less code by hand than I did in 2024. And I want to be honest about something the breathless takes skip: the job didn’t get easier. It got harder, just in a different place.

There’s a real shift happening under the noise, so let me try to say what actually changed, and what I think it means for those of us who’ve been doing this a while.

The metaphor moved from autocomplete to agent

The first wave of AI coding tools, Copilot and its kind, was autocomplete. They guessed the next line while you stayed in control of the editor. Useful, but you were still the one writing. What changed in the last year is the metaphor. The tools became agents: you hand them a task and they go write, run, and revise code in a loop with little intervention from you. Terminal agents like Claude Code overtook the autocomplete tools among professional developers in a matter of months, and a large share of new code is now machine-generated.

Andrej Karpathy calls this Software 3.0: you specify intent in plain language and the model implements it. The framing that stuck with me is that your role shifts from author of code to author of intent. That sounds clean. In practice it means the work moves from typing to specifying, reviewing, and deciding what to trust — and those are not the same muscle.

The productivity is not automatic

Here’s the part the marketing leaves out. More AI does not straightforwardly mean faster.

The most honest data I’ve seen comes from METR, which followed the same experienced developers over a year. Early on, the veterans were actually about 19% slower with AI tools — they lost more time reviewing and correcting output than they saved generating it. A year later, the same people were roughly 18% faster. Same engineers, same kind of work. What changed was that they learned how to work with the tools. That swing is the whole story: the tool didn’t make them faster, the workflow did.

And the gains are uneven. The studies keep showing big savings on routine work — boilerplate, tests, translation, docs — and small or negative savings on the hard stuff: cross-system debugging, novel architecture, security-critical code. One analysis of millions of pull requests found AI-generated code sits waiting for review far longer than human-written code. The generation got cheap. The reviewing, integrating, and trusting did not.

So if you drop these tools on a team and expect a speedup, you’ll often get the opposite. The speedup is real — it just lives in the parts nobody likes to talk about.

What produces the gains is reliability engineering

When I look at where the wins actually come from, it’s the boring infrastructure. Verification loops, so the agent can check its own work before it reaches me — a test it can run, a type-checker, a preview. A context file that tells the agent how this codebase really works, the conventions and the traps, so it stops repeating the same mistakes. Guardrails around what an agent is allowed to touch and how its changes get reviewed.
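To make the verification loop and the guardrail concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a description of any particular tool: the names run_checks, outside_allowlist, and ready_for_review, the glob-style allowlist, and the choice to treat a hung or missing check as a failure are all invented for this example.

```python
import subprocess
import sys
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_checks(checks: dict[str, list[str]], timeout: float = 300.0) -> list[CheckResult]:
    """Run each named command; a check passes only if it exits with code 0."""
    results = []
    for name, cmd in checks.items():
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
            results.append(CheckResult(name, proc.returncode == 0, proc.stdout + proc.stderr))
        except (subprocess.TimeoutExpired, OSError) as exc:
            # Fail closed: a check that hangs or cannot start counts as a failure.
            results.append(CheckResult(name, False, str(exc)))
    return results


def outside_allowlist(changed_paths: list[str], allowed_globs: list[str]) -> list[str]:
    """Return the changed paths that match none of the allowed patterns."""
    return [p for p in changed_paths if not any(fnmatch(p, g) for g in allowed_globs)]


def ready_for_review(
    changed_paths: list[str],
    allowed_globs: list[str],
    checks: dict[str, list[str]],
) -> tuple[bool, list[str]]:
    """Gate an agent's change: guardrail first, then the verification checks."""
    reasons = [f"touched disallowed path: {p}" for p in outside_allowlist(changed_paths, allowed_globs)]
    reasons += [f"check failed: {r.name}" for r in run_checks(checks) if not r.passed]
    return (not reasons, reasons)
```

In a real repository the commands passed in as checks would be whatever test runner and type-checker the project already uses. The point of the shape is that the list of reasons can go back to the agent for another attempt, so a human only sees changes that have already cleared the cheap checks.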

If that list sounds familiar, it’s because it’s the same discipline you’d apply to any unreliable component in a system you care about. I’ve spent fifteen years making business-critical systems reliable — making sure a thing that can fail does so safely, that every change is traceable, that nothing consequential happens without a check. An AI agent is just a new kind of unreliable component. The work of making it trustworthy is work I already knew how to do.

That’s why I keep telling people the valuable skill here isn’t prompting. Prompt tricks fade with every model release. Designing the verification, the context, and the guardrails around a non-deterministic worker is the skill that survives, because no matter how good the model gets, you still have to decide whether to trust what it did.

The job became orchestration

The other real change is shape. On a good day I’m not writing a feature — I’m directing a few agents, each in its own checkout, and reviewing their output the way I’d review pull requests from junior engineers. The new system-design question is how those agents share context, which ones run in parallel versus in sequence, and what happens when one of them is confidently wrong. That’s starting to feel as foundational as microservices once did.
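The parallel-versus-sequence question can be sketched in a few lines of Python with asyncio. This is a toy model under stated assumptions, not how any specific agent framework works: Agent and Verifier are hypothetical callables standing in for a real agent session and an independent check, and the policy of halting dependent work after any unverified result is one possible choice among several.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

# Hypothetical interfaces: an agent turns a task description into a proposed
# change, and a verifier independently decides whether that change is acceptable.
Agent = Callable[[str], Awaitable[str]]
Verifier = Callable[[str], bool]


@dataclass
class AgentResult:
    task: str
    output: str
    verified: bool


async def run_verified(task: str, agent: Agent, verify: Verifier) -> AgentResult:
    output = await agent(task)
    # The agent's own confidence is ignored; only the independent check counts.
    return AgentResult(task, output, verify(output))


async def orchestrate(
    independent: list[str],
    dependent: list[str],
    agent: Agent,
    verify: Verifier,
) -> list[AgentResult]:
    # Tasks with no shared state run concurrently; gather keeps input order.
    results = list(
        await asyncio.gather(*(run_verified(t, agent, verify) for t in independent))
    )
    # Dependent tasks run one at a time, and stop as soon as anything
    # upstream failed verification, so errors do not compound.
    for task in dependent:
        if not all(r.verified for r in results):
            break
        results.append(await run_verified(task, agent, verify))
    return results
```

The design choice worth noticing is where the confidently wrong agent gets caught: at the boundary between the parallel stage and the sequential one, before anything builds on its output.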

It’s a genuinely different job, and the failure mode scales with the leverage: a fleet of agents producing plausible, wrong work faster than you can check it. Which brings the whole thing back to verification.

The part that worries me

I’ll end on the thing I don’t have a tidy answer for. Entry-level hiring has fallen off a cliff — by some counts down more than 70% in a year. The work juniors used to cut their teeth on is exactly the routine implementation we now hand to agents.

The trouble is that juniors are how you make seniors. If the industry stops training people now, the shortage shows up in 2029 or 2030, right when these human-and-agent teams need the most experienced judgment they can get. Smaller teams are coming, that much seems settled. But cutting the bottom of the pipeline to get there is taking on a debt that comes due later, and I don’t think we’ve reckoned with it yet.

Where this leaves you

If you’re a senior engineer, this is a good moment rather than a threatening one — but only if you reorient. The people pulling ahead aren’t the best prompters. They’re the ones who already knew how to take parts that aren’t reliable and build something trustworthy on top of them. That skill just moved to the center of the job.

Write less code. Get very good at deciding whether the code is right. That’s the work now.