In 2015 I trained a text classifier using scikit-learn during my MSc. It got 94% accuracy. I shipped it into a research project, and it was mostly useless in practice. The gap between "94% accuracy" and "something that works" is where I've spent most of the last decade.
Here's the honest version of what that decade looked like.
2015–2017: It's harder than it looks
The early years were mostly about the distance between what AI could do in a controlled environment and what it could do in a real product. IBM Watson had impressive demos. The APIs were painful. The results were mixed. Sentiment analysis in production had different accuracy characteristics than sentiment analysis on a benchmark dataset. This was a pattern I would see repeatedly for ten years.
What I learned: production is the only test that counts.
2018–2020: Before the wave
Working at Zillion Pitches and then Draper University gave me a vantage point on the gap between what practitioners knew and what the public understood. By 2019, I was watching GPT-2 get released and thinking this was going to be bigger than most people realised. GPT-3 in 2020 confirmed it. The ceiling had been removed. The question was when the products would catch up to the capability.
What I learned: capabilities compound in ways that aren't visible until they suddenly are.
2021–2022: Real computer vision at scale
Countercheck was the most technically demanding work I'd done. Not because the ML was novel - we were using YOLO, not doing research - but because the gap between a model that works on a clean dataset and a model that works at the edge of a logistics facility at scale is enormous.
Getting to 99.6% accuracy in production took two years of data discipline, deployment-specific evaluation, and hardware that we owned and operated ourselves. The patent we filed wasn't for a fancy model. It was for an architecture that made the whole thing work reliably.
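Deployment-specific evaluation, in the narrow sense I mean it here, can start as something very simple: breaking accuracy out per site instead of pooling everything into one test set, so a weak deployment can't hide behind a strong aggregate. A minimal sketch (the site names and data are invented, not from our actual pipeline):

```python
# Hypothetical sketch: per-site accuracy instead of one pooled number.
# A pooled test set can report a healthy aggregate while one deployment
# site is quietly failing; grouping by site surfaces that.
from collections import defaultdict

def per_site_accuracy(predictions):
    """predictions: iterable of (site, correct) pairs."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for site, correct in predictions:
        totals[site] += 1
        hits[site] += int(correct)
    return {site: hits[site] / totals[site] for site in totals}

preds = [
    ("warehouse_a", True), ("warehouse_a", True),
    ("warehouse_b", True), ("warehouse_b", False),
]
print(per_site_accuracy(preds))
# The pooled accuracy here is 75%, which hides that warehouse_b sits at 50%.
```

The real version involved far more than this (per-site data slices, hardware variation, drift over time), but the principle is the same: evaluate against the deployment, not the benchmark.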
What I learned: ML engineering is not the same as ML research. The hard part is the deployment environment, not the algorithm.
2023: The LLM year
ChatGPT launched in November 2022 and by mid-2023 the entire stack had changed. We were integrating GPT-4 into workflows, evaluating Claude, running automated code review. The tools had crossed a threshold where integration was the right call for a broad class of tasks.
The thing I found most interesting in 2023 wasn't the model quality. It was the speed of workflow change. The way we do code review, documentation, and incident response changed significantly in a twelve-month period. I hadn't experienced that rate of workflow change before.
What I learned: when a tool is good enough, adoption is fast and the second-order effects are surprising.
2024–2025: Agentic everything
The shift from "LLMs as tools" to "LLMs as participants in the engineering loop" is still in progress. We're running agentic workflows in production. Some work well. Some have blown up in instructive ways. The failure modes are different from the failure modes I expected.
The thing I didn't predict: how much of the value would come not from dramatic autonomous tasks but from removing friction from the mechanical parts of knowledge work. First drafts, structured extraction, boilerplate. Not the flashy use cases. The ones that quietly compound.
What the decade was actually about
The hype arc goes: AI is revolutionary → AI is overhyped → AI is here and it's fine. That arc repeats every few years with a different technology. IBM Watson → AI for everyone. TensorFlow → anyone can do deep learning now. GPT-4 → AGI by Thursday.
The actual arc, from inside it: capabilities improve discontinuously, the deployment gap closes more slowly than the capability gap opens, and the value comes to the people who build things that work rather than the people who describe the future most vividly.
I've been building things in this space for ten years. The things I built in 2015 are embarrassing. The things I built in 2021 are good. The things I'm building in 2025 are faster and better than what I could have done two years ago.
That's the actual story of the decade.
With gusto, Fatih.