AI Agents Are 90% Theater

Here's the 10% That Actually Works

Productivity Tech X
November 25, 2025

Every AI company demos an agent booking complex travel plans autonomously. Except it doesn't work. It's never worked. And an OpenAI co-founder just admitted we're a decade away from making it work. Here's what actually works right now.

Last month, I watched yet another AI startup demo their "revolutionary autonomous agent."

The founder typed: "Book me a trip to Tokyo next month. Budget $2,000. Direct flights, hotels near the conference center."

The AI agent went to work. Searched airlines. Compared prices. Found hotels. Booked everything. Added to calendar. Sent itinerary.

The audience applauded. Investors nodded.

I tried it the next day. Success rate: 0%.

The flight was delayed and the agent didn't notice. The hotel was booked for the wrong dates. The restaurant was closed on Mondays and the agent never checked.

Just like every other travel booking agent I've tested.

The Dirty Secret About AI Agent Demos

That polished demo? Scripted scenario. Pre-tested routes. Edited failures. No edge cases. No delays. No sold-out hotels. Just perfect, simple, happy-path theater.

And everyone in the industry knows this.

Andrej Karpathy, an OpenAI co-founder, said we're not in the "year of AI agents." We're in the "decade of AI agents."

Translation: This doesn't work yet. Come back in 2035.

But here's what he didn't say: One category already works. Really well. Right now.

The One Place AI Agents Actually Deliver

GitHub Copilot.

If you write code, you're probably using it. Or Cursor. Or another AI coding assistant that's genuinely transformed how developers work.

Not theoretical. Not demos. Production tools millions use daily.

Results GitHub has reported for Copilot:

55% faster task completion
74% of developers focus on more satisfying work
88% report higher productivity

Reproducible across millions of users.

Why do coding assistants work when travel agents fail?

Four Reasons Coding Assistants Succeed

1. Code has rules. Travel has chaos.

Code either compiles or it doesn't. Tests pass or fail. Objective feedback.

Travel? Subjective preferences, ambiguous requirements, endless edge cases. "Near the conference center" means different things to different people. "Budget hotel" is relative.

2. Coding environments are stable. Websites are not.

GitHub Copilot works inside an editor such as VS Code. The interface rarely changes. The structure is predictable.

Travel-booking agents have to navigate different airline sites, CAPTCHAs, ever-changing interfaces. One redesigned checkout flow can break every agent that depends on it.

3. Code is text. Travel is multimedia.

Coding assistants work with text. Comments, errors, documentation: all of it structured.

Travel requires understanding maps (is this walkable?), photos (does this room look acceptable?), reviews (are noise complaints legitimate?), implicit preferences (I avoid 6 AM flights even when cheaper).

4. Coding patterns are learnable. Your preferences are unique.

AI trains on billions of lines of code. Patterns repeat. Best practices generalize.

Your travel preferences? You tolerate layovers on Fridays but not Mondays. Pay extra for hotel gyms but not pools. Avoid specific airlines for obscure loyalty reasons.

Current agents can't learn these nuances.
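
Reason 1 is easy to see in miniature. The sketch below is a made-up example, not taken from any tool mentioned here: `nights_between` and `run_checks` are hypothetical names chosen for illustration. The point is only that each check passes or fails, which is the kind of objective feedback a coding assistant can lean on.

```python
from datetime import date


def nights_between(check_in: date, check_out: date) -> int:
    """Hypothetical helper: number of hotel nights between two dates."""
    if check_out <= check_in:
        raise ValueError("check_out must be after check_in")
    return (check_out - check_in).days


def run_checks() -> bool:
    """Return True only if every check passes. There is no partial credit."""
    # Check 1: a three-night stay is counted as three nights.
    if nights_between(date(2025, 12, 1), date(2025, 12, 4)) != 3:
        return False
    # Check 2: reversed dates must be rejected, not silently accepted.
    try:
        nights_between(date(2025, 12, 4), date(2025, 12, 1))
    except ValueError:
        return True
    return False


print("checks passed" if run_checks() else "checks failed")
```

There is no equivalent pass/fail signal for "near the conference center."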

Where AI Agents Actually Fail

I tested five AI travel agents over three months. Real bookings. Real money.

Results:

Simple domestic flight: 80% success (but none found best price)
International with visa requirements: 0% (none checked transit visa needs)
Family with infant: 20% (didn't book a bassinet or verify the stroller policy)
Change existing reservation: 0% (all tried booking new tickets)
Hotel with specific requirements: 40% ("walking distance" = 40-minute walk)

Average success: 28%. Human travel agents: 95%.
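
The 28% figure is the unweighted mean of the five scenario rates above, treating every scenario as equally important. A quick check in Python (the variable names are mine; the numbers are the ones listed):

```python
# Success rates per scenario, in percent, as listed above.
success_rates = {
    "Simple domestic flight": 80,
    "International with visa requirements": 0,
    "Family with infant": 20,
    "Change existing reservation": 0,
    "Hotel with specific requirements": 40,
}

# Unweighted mean: each scenario counts equally.
average = sum(success_rates.values()) / len(success_rates)
print(f"Average success: {average:.0f}%")
```

Weighting by how often each scenario actually comes up would give a different number, so treat 28% as a rough summary.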

The Even Scarier Use Case

Some companies test AI for IT support. AI logs into your computer, diagnoses problems, installs fixes autonomously.

Would you let AI have full control over your laptop?

I asked 50 IT professionals. 48 said no.

Why it's terrifying:

Every computer is unique. An Outlook crash could come from corrupted files, proxy settings, expired certificates, or 50 other causes. AI can't reliably tell which.

I watched an AI fix a printer:

Reinstalled drivers (correct)
Restarted print spooler (correct)
Decided network was wrong (incorrect)
Changed network config (disastrous)
Broke printer AND network

One company tested this for three months. Success rate: 31%. Human IT: 87%.

They went back to humans.

What Actually Works Right Now

✅ AI agents that work:

Code assistance (Copilot, Cursor, Tabnine)

Why: Structured, clear rules, text-based
ROI: 50%+ productivity gains

Structured data entry

Why: Defined fields, validation, predictable
ROI: 80% time reduction

Simple customer service (Intercom, Zendesk AI)

Why: FAQ-style, known answers
ROI: 60% ticket deflection

Document analysis

Why: Text processing, patterns
ROI: 70% time savings

❌ AI agents that don't work:

Complex travel booking

Why it fails: Edge cases, unreliable UI navigation
When: 2028-2030

Autonomous IT support

Why it fails: Every system is unique
When: 2030+

Complex scheduling

Why it fails: Requires understanding implicit preferences
When: 2027-2029

Financial decisions

Why it fails: Requires human judgment
When: Probably never fully autonomous

The Timeline Nobody's Talking About

2025-2026: Coding and structured tasks only

2027-2028: Better UI navigation (edge-case handling still limited)

2029-2030: Multimodal understanding (preference learning still limited)

2031-2035: General-purpose agents (maybe)

Reality: 8-10 years from reliable travel booking or autonomous IT support.

What You Should Actually Do

If you're a developer: Use GitHub Copilot now. $10-20/month. 50%+ time savings. No-brainer.

If evaluating for business:

✅ Deploy: Code generation, data entry, simple support, document analysis

❌ Skip: Travel planning, IT without oversight, financial decisions, anything where failure is costly

If you're a vendor: Stop demoing complex travel. You're lying. Show what works. Build trust through honesty.

If you're an investor: Ask: "Show success rates on random requests, not cherry-picked demos."
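
What "success rates on random requests" could look like in practice: a rough sketch, assuming you have a pool of real requests and some way to judge whether the agent completed each one. Everything here is hypothetical, including `sampled_success_rate`, `toy_agent`, and the request pool; a real evaluation would plug in the vendor's agent and a human-verified pass/fail check.

```python
import random
from typing import Callable, Sequence


def sampled_success_rate(
    requests: Sequence[str],
    run_agent: Callable[[str], bool],
    sample_size: int,
    seed: int = 0,
) -> float:
    """Fraction of randomly sampled requests the agent completes."""
    if not requests:
        raise ValueError("need at least one request")
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    k = min(sample_size, len(requests))
    sample = rng.sample(list(requests), k)
    successes = sum(1 for request in sample if run_agent(request))
    return successes / len(sample)


def toy_agent(request: str) -> bool:
    """Hypothetical stand-in: only handles simple domestic flights."""
    return "domestic" in request


pool = [
    "simple domestic flight",
    "international trip with transit visa",
    "family trip with infant",
    "change an existing reservation",
    "hotel within walking distance of venue",
]

# Sampling the whole pool here; with a larger pool, pick a smaller sample.
rate = sampled_success_rate(pool, toy_agent, sample_size=len(pool))
print(f"Success rate on sampled requests: {rate:.0%}")
```

The design choice that matters is that the requests are drawn at random from real traffic, not picked by the person giving the demo.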

The Companies Getting This Right

Replit, Cursor, Lindy: Narrow focus, honest limitations, human oversight, 80%+ success rates.

The ones getting it wrong: Almost everyone claiming "autonomous agents" for complex real-world tasks.

Why Everyone's Lying

Because funding depends on vision, not reality.

VCs invest in "AI that autonomously handles travel," not "AI that helps write code."

The hype is financially necessary even when technically fraudulent.

But the backlash is coming. Companies deploying overhyped agents watch them fail. Trust erodes.

When the bubble pops, honest companies survive. Theater dies.

Your Move

AI agents work in narrow domains with structured environments.

For everything else: 5-10 years away.

Three choices:

1. Ignore entirely: Miss competitive advantage in the 10% that works

2. Believe hype, deploy blindly: Waste money, frustrate users

3. Deploy strategically: Use coding assistants now. Wait on travel. Test carefully. Scale what works.

The companies winning aren't the ones with the flashiest demos. They're the ones who deployed the 10% that works while ignoring the 90% that's theater.

Be the smart one.

That’s all for today, folks!

I hope you enjoyed this issue. We can't wait to bring you more soon. Look out for our next email.

Kira

Productivity Tech X.
