Internal AI agent builds usually fail at one of three points: no expert-input loop during the build, data access and permissions that nobody owns, and no maintenance owner after launch. The technology is rarely the problem. At TomorrowToday we have worked inside 36 client companies, and the internal builds we have watched stall or die failed at the same three points, in roughly that order, regardless of industry or team size.
How many AI projects actually fail?
Most corporate AI projects fail to deliver a measurable return: MIT puts the share of failing generative AI pilots at 95%, and S&P Global found 42% of companies abandoned most of their AI initiatives in 2025. The failure pattern is not rare, and it is not improving with better models.
- MIT’s NANDA initiative reviewed more than 300 enterprise AI initiatives and found that 95% of generative AI pilots deliver no measurable P&L return, despite an estimated $30–40 billion in enterprise spending.
- S&P Global’s 2025 survey of 1,006 IT and business leaders found the share of companies abandoning most of their AI initiatives jumped from 17% to 42% in a single year.
- The same survey found the average organization scraps 46% of its AI proofs-of-concept before they ever reach production.
The MIT authors are explicit that the divide “does not seem to be driven by model quality.” The models work. What fails is the implementation: the unglamorous work of wiring an agent into a real business, with real data, run by real people.
The pattern: the demo works, production never arrives
The most dangerous sentence in AI adoption is “the demo worked.”
A demo runs on a sample file, with the builder driving, on the happy path. Production runs on live data, with skeptical users, against edge cases, on a Tuesday when the builder is in a different meeting. Those are different environments, and the gap between them is where internal agent projects go to die. The three failure points below are the three places where the bridge across that gap collapses. Each one is predictable. Each one is avoidable if you know it is coming.
Failure point 1: No expert-input loop in the build
The person building the agent is almost never the person who does the job the agent is automating. A developer or a power user builds an RFP screener, an invoice coder, a lease summarizer. The operator who actually screens RFPs or codes invoices is busy, and nobody puts their hours into the project plan.
So the agent ships without the operator’s judgment baked in. It handles the obvious cases and misses the exceptions that make the job a job. Then comes the moment that kills the project: the agent is confidently wrong in front of the team. Not obviously broken, which would be fine. Confidently wrong. Trust collapses on the spot, the team quietly routes around the agent, and within a month nobody is using it.
The fix is structural, not technical: schedule the experts’ time as a hard requirement of the build. In our Agent Factory process we require your subject-matter experts for up to 2 hours per build day. That is not overhead. It is the difference between an agent that encodes how your best people actually make decisions and one that guesses.
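One way to make the expert-input loop concrete is to have operators label real cases, exceptions included, and check the agent against them before anyone on the team sees it. The sketch below is illustrative only: `ExpertCase`, `disagreements`, and the toy `naive_rfp_screener` are hypothetical names, the example cases are invented, and this is not a description of how any particular build process works.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ExpertCase:
    """One example labeled by the operator who actually does the job."""
    description: str
    input_text: str
    expected_decision: str


def disagreements(
    agent: Callable[[str], str], cases: list[ExpertCase]
) -> list[ExpertCase]:
    """Return the cases where the agent's decision differs from the expert's."""
    return [c for c in cases if agent(c.input_text) != c.expected_decision]


# Hypothetical stand-in for a real agent: bids on anything mentioning "paving".
def naive_rfp_screener(text: str) -> str:
    return "bid" if "paving" in text.lower() else "skip"


# Invented cases for illustration; real ones come from the operators.
cases = [
    ExpertCase("obvious fit", "County road paving, 2 miles", "bid"),
    ExpertCase("obvious miss", "Office furniture supply", "skip"),
    # The kind of exception only an operator would know to include.
    ExpertCase("bonding too high", "Paving contract, $50M bond required", "skip"),
]

missed = disagreements(naive_rfp_screener, cases)
for case in missed:
    print(f"Agent disagrees with expert on: {case.description}")
```

The toy screener passes both obvious cases and fails the exception, which is the failure described above: it surfaces in a check during the build instead of in front of the team.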
Failure point 2: Data plumbing and permissions nobody owns
The demo worked on a sample file. Production needs things no demo ever needs:
- API access to the systems the agent reads from and writes to
- Credentials, and a decision about where they live
- A security review someone has to schedule and pass
- An explicit decision about what the agent is allowed to touch, and what it is not
None of that is hard. All of it sits in the seam between IT, security, and the business unit that wants the agent. When connecting the plumbing is nobody’s actual job, the project does not fail loudly. It stalls. The pilot that impressed everyone in March is still “two weeks away” in September, and the S&P Global finding that the average organization scraps 46% of its AI proofs-of-concept before production stops being surprising.
The fix: name the owner before the build starts, and treat access and permissions as a deliverable with a date, not an assumption. In client work we handle this as an explicit connection phase before agents are built, because it fails when it is implicit.
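As a rough illustration of treating access as a deliverable, the decision about what the agent may touch can be written down as data, with a named owner and a due date, and enforced deny-by-default. Everything in this sketch is an assumption made for the example: the `AccessPlan` class, the resource names, the owner, and the date are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class AccessPlan:
    """Explicit record of who owns access work and what the agent may touch."""
    owner: str
    due: date
    read_allowed: frozenset[str] = field(default_factory=frozenset)
    write_allowed: frozenset[str] = field(default_factory=frozenset)

    def can(self, action: str, resource: str) -> bool:
        """Deny by default: only explicitly listed resources are allowed."""
        if action == "read":
            return resource in self.read_allowed
        if action == "write":
            return resource in self.write_allowed
        return False  # unknown actions are never permitted


# Hypothetical plan for an invoice-coding agent.
plan = AccessPlan(
    owner="J. Rivera (hypothetical)",
    due=date(2025, 1, 31),
    read_allowed=frozenset({"erp.invoices"}),
    write_allowed=frozenset({"erp.invoice_codes"}),
)

print(plan.can("read", "erp.invoices"))    # True
print(plan.can("write", "erp.invoices"))   # False
print(plan.can("delete", "erp.invoices"))  # False
```

Writing the plan down this way forces the two decisions that tend to stay implicit: who is responsible, and where the boundary sits.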
Failure point 3: The week-3 maintenance cliff
Every internal build has the same week-3 story. The agent launched, the builder went back to their day job, and then something changed. An edge case appeared. An upstream system updated its export format. The agent broke, or worse, kept running and produced quietly wrong output.
With no one assigned to fix it, the agent dies in silence. And the lasting output of the project is not the agent. It is organizational skepticism. The next time someone proposes using AI for real work, the room remembers the last attempt, and the bar is higher. That skepticism is the most expensive thing an internal build produces, because it taxes every future initiative. Then add the direct cost: a failed build typically eats a meaningful slice of a capable employee’s year, and that staff time alone costs more than most professionally built agents.
The fix: decide who owns the agent in production before launch. Not “the team.” A name, with time allocated, and a review rhythm that catches breakage before users do.
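One small piece of a review rhythm can be automated: checking that an upstream export still has the layout the agent was built against, and stopping loudly when it does not. The following is a minimal sketch under stated assumptions; the column names, the `ExportFormatChanged` exception, and `check_export_header` are hypothetical and would need adapting to a real export.

```python
import csv
import io

# Hypothetical layout the agent was built against.
EXPECTED_COLUMNS = ["invoice_id", "vendor", "amount", "gl_code"]


class ExportFormatChanged(Exception):
    """Raised when the upstream export no longer matches the expected layout."""


def check_export_header(csv_text: str, expected: list[str]) -> None:
    """Fail loudly if the first row of the export differs from the expected header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    if header != expected:
        missing = [c for c in expected if c not in header]
        unexpected = [c for c in header if c not in expected]
        raise ExportFormatChanged(
            f"export header changed: missing={missing}, "
            f"unexpected={unexpected}, got={header}"
        )


# Unchanged export: passes silently.
check_export_header("invoice_id,vendor,amount,gl_code\n1,Acme,10,6000\n", EXPECTED_COLUMNS)

# Upstream renamed a column and dropped another: stop before the agent runs.
try:
    check_export_header("invoice_id,vendor,total\n1,Acme,10\n", EXPECTED_COLUMNS)
except ExportFormatChanged as err:
    print(f"Blocked run, notify owner: {err}")
```

A check like this turns quietly wrong output into a visible failure, but it only helps if a named owner receives the alert and has time to act on it.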
When building in-house actually works
Honest answer: you can do this yourselves. The tools are public, the models are good, and some internal builds succeed. The successful ones share three traits:
- A dedicated owner with real time allocation, during the build and after launch
- IT and security at the table from day one, with the access work scheduled
- Operators’ hours written into the build plan, not borrowed from their evenings
If you can check all three boxes, build it in-house and skip the rest of this article. Most mid-market teams cannot check all three, not because they lack talent, but because everyone with the relevant judgment already has a full-time job. That is a staffing reality, not a failure.
Build vs buy: what changes at each failure point
The build-vs-buy decision for AI agents comes down to who owns each failure point, not who has the better technology.
| Failure point | How it shows up in-house | What a professional build changes |
|---|---|---|
| No expert-input loop | Operators’ time never scheduled; agent ships without their judgment | Expert hours are a required, scheduled input to every build day |
| Unowned data plumbing | Access and security stall between departments for months | Connection and permissions handled as an explicit first phase |
| Week-3 maintenance cliff | Builder returns to day job; agent breaks silently | Agent delivered with a named owner and a weekly review that catches breakage |
The price anchor matters here. Professionally built agents in our Agent Factory typically run $2,400–$6,000 each depending on complexity, billed weekly, with no retainer and no minimum commitment. You can stop any week. We break down what an AI agent actually costs to build, run, and maintain in a separate guide. A failed internal experiment usually consumes more than the bottom of that range in staff time alone, before counting the opportunity cost of the quarter it occupied.
What a working agent looks like in production
Kear Civil Corporation, a civil construction contractor, needed to monitor public RFPs across the country. We built them a research agent on Claude CoWork that searches 15+ public procurement platforms and emails the team a digest every Monday, early enough that it is in their inboxes when they get in. Delivery took under two weeks, as a $2,000 fixed-fee early-program build (the same agent prices at $2,400 today).
Note what it is not. It is not a platform, not a six-month roadmap, not a transformation program. It is one agent with one job, in production, with an owner, doing work a person used to do manually every morning. That is what surviving all three failure points looks like, and it is the unit of progress we think mid-market companies should buy: one working agent at a time.
FAQ
What percentage of AI projects fail?
MIT’s NANDA initiative found 95% of generative AI pilots deliver no measurable P&L return. S&P Global’s 2025 survey found 42% of companies abandoned most of their AI initiatives, up from 17% the year before, and that the average organization scraps 46% of AI proofs-of-concept before production. The consistent finding across both: the approach fails, not the models.
How much time do our subject-matter experts need to give an agent build?
Plan for up to 2 hours of expert time per build day. The agent is encoding how your best people make decisions, and that judgment only gets in if their time is scheduled into the build. Builds that skip this step ship agents that fail on the exceptions, and exceptions are most of the job.
How long does it take to build a working AI agent?
In a productized process, about a week per agent. Kear Civil’s RFP research agent took under two weeks from start to production, including connection work. Internal builds take longer not because the building is slow, but because access, permissions, and expert time are not scheduled.
Should we build AI agents in-house or hire someone?
Build in-house if you can commit a dedicated owner, involve IT and security from day one, and schedule your operators’ hours into the build. If you cannot check all three, a professional build is cheaper than a failed experiment: $2,400–$6,000 per agent versus months of diverted staff time and the skepticism a dead pilot leaves behind.
How much does a professionally built AI agent cost?
In TomorrowToday’s Agent Factory, agents typically cost $2,400–$6,000 each depending on complexity, billed weekly with no retainer and no minimum. You see each agent’s price before we build it, review the result every Friday, and can stop any week.
Start with one agent
Waiting has a price: every quarter a working agent does not exist, the manual work continues and compounds. Starting does not require a transformation budget. It requires one agent, picked well, and the Agent Factory exists to build exactly that. If you want help picking it, a free 30-minute assessment will identify 3–5 candidates from your own operations.