The endless pilot: why AI projects don't reach production in SMEs (and how to avoid it)

Only 25% of AI pilots reach production; 3 out of 4 stay in demo or proof-of-concept. In SMEs the pattern tightens because the stuck pilot drains the energy and trust of the only team that was going to operate the system. The three causes are always the same: the pilot was designed to impress rather than integrate (demo-clean data instead of real dirty data), nobody defined who would operate it once the vendor left, and there was no explicit criterion for what “going to production” meant. The way out is three clauses signed before any code is written: measurable exit criteria (for example, 80% of the flow processed for 4 weeks without the vendor), capability transfer as a deliverable (1 or 2 people by name trained as owners), and a binding production date on the calendar. Without those three, the system doesn’t leave the pilot by inertia.

25% reach production. 75% stay in pilot. 0 say it before invoicing.

And in SMEs the pattern tightens further, because the stalled pilot consumes the one resource you can’t replenish: the energy and trust of the team that was supposed to operate the system.

We call this the endless pilot. It’s the project that started with good intentions, delivered decent results in a controlled environment, and froze in that state. It doesn’t get killed officially, because “we’ll pick it back up in Q3”, and it doesn’t move to production either, because “there are still things to figure out with the ERP integration.” Months go by and the system sits there, with one user, no operational metrics, nobody on the client side actually running it.

We’ve seen this pattern at companies that invested six figures and, three quarters later, couldn’t answer a simple question: who operates this when you aren’t around?

Why pilots stay in pilot

The endless pilot is almost never a technical problem. When we run a retrospective with clients coming out of an experience like this, the causes are always the same three, in this order.

1. The pilot was designed to impress, not to integrate. The vendor shows a polished demo with sample data, clean scenarios, predictable formats. The moment it gets plugged into the real ERP with its exceptions, its blank fields and its unwritten rules, things stall. Not because the model doesn’t work, but because the pilot wasn’t designed around the dirty data your real operation produces.

2. Nobody defined who would operate it once the vendor left. This is the least visible cause. Everyone assumes that “the team will handle it”, but the team never participated in the design and doesn’t understand the system’s logic, so when the first exception appears, the response is to call the vendor. The consultancy charges for every call, and the system enters a strange equilibrium: it works technically, but the client never becomes its owner.

3. There was no explicit definition of what “going to production” meant. Without that definition, the moment of the leap gets postponed indefinitely. There’s always one more case to test, one more integration to validate, one more sign-off pending. The pilot stays as a safe harbour because moving the system to production demands a decision and nobody wants to sign it.

The three causes are solvable. But you have to address them before the pilot starts, not when it’s already stuck.

The contract that prevents the endless pilot

In every project we start at Zero Ops, three clauses get signed before any code is touched. They’re tedious to read and boring to explain, but they save entire quarters.

Aspect	Endless pilot	Productive pilot
Data	Demo with clean data and happy paths	Real dirty data from week 1
Operator	“The team will handle it” in the abstract	1-2 named people trained as owners
Exit	“When it’s ready”	Binding date on the calendar
Success criterion	General feeling	80% of the flow processed without vendor for 4 weeks
Vendor model	Bills every support call	Leaves when criteria are met

The three clauses, in detail:

One. Explicit, measurable exit criteria. Before the first line of code, the contract states what has to be true for the pilot to be considered complete. No “satisfactorily deployed”. Sentences like: the admin team processes 80% of monthly invoices with the system, without vendor intervention, for 4 consecutive weeks. If that doesn’t happen, the pilot isn’t closed and we don’t leave. If it does, we leave for sure, even if the client wants to keep us.
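A criterion written this way can be checked mechanically, which is what makes it hard to argue about later. The following is a minimal sketch in Python, not a tool from any of our projects: the `WeekStats` record, its field names and the two functions are hypothetical, and it assumes the client logs, per week, how many invoices arrived and how many were processed with no vendor involvement.

```python
from dataclasses import dataclass


@dataclass
class WeekStats:
    """Hypothetical weekly log entry kept by the client team."""
    total_invoices: int            # invoices received that week
    processed_without_vendor: int  # handled with the system, no vendor help


def week_passes(week: WeekStats, min_share: float = 0.80) -> bool:
    """True if the share processed without the vendor reaches the threshold."""
    if week.total_invoices == 0:
        return False  # assumption: an empty week proves nothing, so it does not count
    return week.processed_without_vendor / week.total_invoices >= min_share


def exit_criterion_met(
    weeks: list[WeekStats],
    min_share: float = 0.80,
    required_streak: int = 4,
) -> bool:
    """True once `required_streak` consecutive weeks pass the threshold."""
    streak = 0
    for week in weeks:
        # A failing week resets the count: the weeks must be consecutive.
        streak = streak + 1 if week_passes(week, min_share) else 0
        if streak >= required_streak:
            return True
    return False
```

For example, `exit_criterion_met([WeekStats(100, 85)] * 4)` is true, while three good weeks followed by one at 70% send the count back to zero. The thresholds are parameters because the 80% and 4 weeks above are one example of a criterion, not a universal rule.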

Two. Capability transfer as a deliverable, not a courtesy. The final system is operated by the client team, not by us. That means during the project we train 1-2 specific people, not “the team” in the abstract, so they know how to modify the system when the first odd case appears. If the admin who’s going to use it every day doesn’t participate in design decisions from week 2, the project is poorly structured and we stop it.

Three. Production date locked in the calendar. Not “when it’s ready”. A specific date, agreed upfront, that acts as a forced deadline. Whatever isn’t ready by that date gets escalated or dropped, but the system goes live. Forcing the leap to production surfaces all the details that the endless pilot keeps in the shadows.

These three clauses are uncomfortable for the vendor, because they close the door on the business model based on extending the pilot while billing hours. That’s why almost nobody signs them. And that’s why almost everybody stays in pilot.

What happens when the contract works

Two concrete cases where we applied it.

Case 1 — Invoice and delivery note OCR at an industrial distributor. The company received dozens of invoices a day from suppliers with completely different formats. The admin team was spending 3 to 5 minutes per invoice, with a 2-5% error rate, and price discrepancies were detected weeks after payment.

The admin who had spent a decade processing those invoices manually joined the design from week 2. The exit criterion was: the system processes the bulk of the daily flow without our intervention for one full calendar month. We met it. Today that same person has moved from keying 400 invoices a month to auditing the 400 the system extracts and proposes. Her judgement is still the critical layer. And when a new supplier arrives with a format the system doesn’t recognise, she adjusts the rules. Without calling us.
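To put the manual baseline in perspective, the figures above can be combined into a rough monthly estimate. This is back-of-the-envelope arithmetic, not a measured result: it assumes the 400 invoices a month, the 3 to 5 minutes per invoice and the 2-5% error rate all describe the same workload, and the function names are hypothetical.

```python
INVOICES_PER_MONTH = 400          # figure quoted for this case
MINUTES_PER_INVOICE = (3, 5)      # low and high estimate of manual keying time
ERROR_RATE = (0.02, 0.05)         # low and high manual error rate


def monthly_hours(invoices: int, minutes_per_invoice: float) -> float:
    """Hours per month spent keying invoices by hand."""
    return invoices * minutes_per_invoice / 60


def monthly_errors(invoices: int, error_rate: float) -> int:
    """Expected number of invoices per month keyed with an error."""
    return round(invoices * error_rate)


low_hours = monthly_hours(INVOICES_PER_MONTH, MINUTES_PER_INVOICE[0])
high_hours = monthly_hours(INVOICES_PER_MONTH, MINUTES_PER_INVOICE[1])
low_errors = monthly_errors(INVOICES_PER_MONTH, ERROR_RATE[0])
high_errors = monthly_errors(INVOICES_PER_MONTH, ERROR_RATE[1])

print(f"Manual keying: {low_hours:.0f} to {high_hours:.0f} hours a month")
print(f"Invoices with errors: {low_errors} to {high_errors} a month")
```

Under those assumptions the manual process took roughly 20 to 33 hours a month and produced between 8 and 20 invoices with errors. That is the time that shifted from keying to auditing.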

Case 2 — Catalog categorisation at the same company. 508 product tables, more than 3 million rows, 327 brands with all their variations (BRAND-A, BRAND-A FILTER, BRAND-A+GROUP, all the same brand but registered as three). The system ran the normalisation in 27 minutes.

But the real deliverable wasn’t the script: it was that the client’s operations team can run it again when they integrate a new manufacturer, without us. That’s what makes the system theirs.
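To make the brand-variation problem concrete, here is a minimal Python sketch of the kind of rule a client team could maintain on its own. It is an illustration only, not the script from the project: the `DESCRIPTOR_WORDS` list, the function names and the splitting rule are all assumptions, and a real catalog with 327 brands would need more rules than this.

```python
import re
from collections import defaultdict

# Hypothetical words that get appended to a brand name in the raw data.
# The idea is that the operations team extends this set when a new
# manufacturer arrives with its own naming quirks.
DESCRIPTOR_WORDS = {"FILTER", "GROUP"}


def canonical_brand(raw: str) -> str:
    """Collapse variants like 'BRAND-A FILTER' or 'BRAND-A+GROUP' to 'BRAND-A'."""
    cleaned = raw.strip().upper()
    # Split on whitespace and '+', keeping hyphens inside the brand name.
    tokens = re.split(r"[\s+]+", cleaned)
    kept = [t for t in tokens if t and t not in DESCRIPTOR_WORDS]
    # Fall back to the cleaned input if every token was a descriptor word.
    return " ".join(kept) or cleaned


def group_variants(raw_brands: list[str]) -> dict[str, set[str]]:
    """Map each canonical brand to the raw spellings found for it."""
    groups: dict[str, set[str]] = defaultdict(set)
    for raw in raw_brands:
        groups[canonical_brand(raw)].add(raw)
    return dict(groups)
```

Calling `group_variants(["BRAND-A", "BRAND-A FILTER", "BRAND-A+GROUP"])` returns a single `BRAND-A` entry holding the three raw spellings. Keeping the rules in a plain, editable set matters more than the code itself: it is what lets the client add a new case without calling the vendor.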

In neither case did we talk about “digital transformation” or sell “saving two people.” The team didn’t shrink. It got leveraged. And the projects are in production precisely because on day 1 there was a written contract defining what being in production meant.

How to know if you’re in an endless pilot

Three questions any operations director can ask themselves this week about any AI pilot in flight:

1. Is it written down anywhere what has to happen for this pilot to be considered complete? If the answer is “more or less yes but not exactly”, the pilot is going to drag on.

2. Is there someone on your team, with a first and last name, who can modify the system when a case appears that the vendor hadn’t foreseen? If the answer is “we’d have to call them”, the pilot is going to generate dependency, not capability.

3. Is there a concrete production date on someone’s calendar, beyond a slide? If the answer is no, the system isn’t going to leave the pilot by inertia. Things don’t move on their own.

If all three answers are “no” or “more or less”, you’re not in a pilot. You’re paying for a system that’s never going to reach production.

The first step is operational, not technical

When an SME contacts us saying they have a stalled pilot, the first thing we ask for isn’t access to the code. It’s the contract and the transfer plan. Most of the time neither one exists. From there, the conversation shifts: we’re no longer talking about AI, we’re talking about how the company is operated.

And that, almost always, starts with an operational diagnosis. 2 weeks. Map the key processes as they actually run today, identify where time and money are being lost, and deliver a plan with line-by-line ROI estimates. No commitment to continue. The deliverable is the client’s, to execute with whoever they choose.

Because the problem with AI in SMEs isn’t the technology. The technology works. The problem is that it’s being applied to operations nobody has reviewed, with contracts that don’t force you out of pilot, and with teams that were never designed to operate the final system.

Solving that first, before touching AI, is what makes the fourth project, the one almost nobody talks about, finally reach production.

If your company has a stalled AI pilot and you want an honest second read on what it would take to get it out of there, the 2-week operations roadmap is the first step. No hidden pitch. The plan is yours, to run with whoever you choose.