677 True AI Agents from 1,672 AI Companies
SHUR IQ classified the entire YC AI portfolio to isolate companies building autonomous agents—not wrappers, not chatbots, not fine-tuning services. The top 10 scored across five structural dimensions reveal where real autonomy is emerging and where capital is chasing vaporware.
The Classification Problem
Everyone calls themselves an "AI agent" company in 2026. Of 1,672 AI companies in the YC ecosystem, only 677 (40.5%) meet the structural criteria for agent classification: autonomous task execution, multi-step reasoning, tool integration, and persistent state. The remaining 995 are wrappers, fine-tuning services, prompt marketplaces, or chatbots with marketing departments.
The Top 10 Signal
The highest composite score is 69.0 (Persana AI). No company breaks 70. This is a structurally immature market—high capital, high traction, but shallow autonomy depth. The average Autonomy Depth score across the top 10 is 51.5/100, the weakest dimension by a wide margin. These companies have product-market fit and funding, but the agents themselves are operating at low levels of genuine independence.
Domain Concentration
Software dominates the top 10 (6 companies), followed by Productivity (2) and Health (2). The Health vertical shows an interesting structural divergence: Athelas and careCycle score identically (64.5) but through different paths. Athelas leads on Capital & Defensibility (73); careCycle leads on Autonomy Depth (60). One bet is moated infrastructure, the other is agent sophistication.
W12-2026 Stack Ranking
Top 10 AI agent companies by composite score. Five dimensions: Model Capability (20%), Market Traction (25%), Platform Ecosystem (20%), Autonomy Depth (20%), Capital & Defensibility (15%).
| # | Company | Domain | Composite | Tier | Key Signal |
|---|---|---|---|---|---|
| 1 | Persana AI | Productivity | 69.0 | Emerging | Highest Market Traction (80) in the index. AI-powered sales prospecting with autonomous lead enrichment and outreach sequencing. |
| 2 | Fiber AI | Software | 68.0 | Emerging | Balanced profile across all five dimensions. AI-powered outbound sales with deep data integration and autonomous campaign management. |
| 3 | Warmly | Software | 66.0 | Emerging | Highest Capital & Defensibility (70) among Software companies. Autonomous website visitor identification and real-time engagement. |
| 4 | Fini | Productivity | 65.0 | Emerging | AI customer support agent with knowledge base integration. Strong Model Capability (70) for intent classification and resolution. |
| 5 | Athelas | Health | 64.5 | Emerging | Highest Capital & Defensibility in the index (73). FDA-pathway medical devices combined with AI-driven remote patient monitoring agents. |
| 6 | careCycle | Health | 64.5 | Emerging | Highest Autonomy Depth (60) in the Health vertical. Patient engagement agents that autonomously manage care coordination workflows. |
| 7 | Mutiny | Software | 64.5 | Emerging | Second-lowest Autonomy Depth (45) in the top 10, offset by strong defensibility (73). AI-personalized web experiences for B2B conversion. |
| 8 | Daily | Software | 63.2 | Emerging | Strongest Platform Ecosystem (70) in the bottom half. Real-time video/audio infrastructure powering agent-to-human interactions. |
| 9 | Inkeep | Software | 63.0 | Emerging | Strong Model Capability (70) for documentation understanding. AI-powered search and support agent that ingests entire knowledge bases. |
| 10 | QueryPie AI | Software | 63.0 | Emerging | Most evenly distributed score in the index: all five dimensions within 10 points of each other. Data access governance with AI-driven policy agents. |

Structural Signals
- Persana AI: Autonomy Depth (55) is the weakest dimension. The agent executes pre-defined sequences well but lacks multi-step reasoning in ambiguous prospect scenarios. Vulnerable to commoditization as foundation model providers build native sales tools.
- Warmly: Lowest Autonomy Depth (50) in the top 3. The agent reacts to clear behavioral triggers but lacks genuine reasoning about visitor intent in ambiguous sessions.
- Mutiny: Autonomy Depth (45) is the second-lowest in the top 10, ahead of only Daily (40). The "agent" is closer to a sophisticated personalization engine than an autonomous system. The classification is borderline: Mutiny may drop from the agent index in future scoring cycles if autonomy criteria tighten.
- Daily: Autonomy Depth (40) is the lowest in the top 10. Daily is more "agent infrastructure" than "agent company." The product enables agents built by others rather than deploying its own autonomous systems.
Structural Gaps
Three structural holes in the YC AI agent ecosystem, each representing a category-defining opportunity.
Critical: The Autonomy Ceiling
The average Autonomy Depth across the top 10 is 51.5/100—the weakest dimension by a wide margin. Market Traction averages 72.5 while the agents themselves are operating at low levels of genuine independence. This market is selling "agents" that are, structurally, sophisticated automation with LLM-powered decision points.
Current leaders optimize for traction first and autonomy second. The company that inverts this—building genuinely autonomous multi-step reasoning systems, then proving market fit—creates a structural moat that "wrapper + traction" companies cannot replicate. This is the difference between Salesforce adding AI features and a company that makes Salesforce itself autonomous.
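The averages quoted above can be cross-checked against the individual Autonomy Depth scores this report discloses. The sketch below is a consistency check only: it uses the six Autonomy Depth values stated in the report and derives what the four undisclosed companies would have to average for the 51.5 figure to hold. The variable names are illustrative, and the implied values are an arithmetic consequence, not published scores.

```python
# Autonomy Depth scores explicitly stated in this report (six of the top 10).
KNOWN_AD = {
    "Persana AI": 55,
    "Warmly": 50,
    "Athelas": 55,
    "careCycle": 60,
    "Mutiny": 45,
    "Daily": 40,
}

REPORTED_AD_MEAN = 51.5   # stated top-10 average for Autonomy Depth
REPORTED_MT_MEAN = 72.5   # stated top-10 average for Market Traction
TOP_N = 10

known_sum = sum(KNOWN_AD.values())
implied_total = REPORTED_AD_MEAN * TOP_N
remaining = TOP_N - len(KNOWN_AD)

# Mean the four undisclosed companies would need for the reported average to hold.
implied_remaining_mean = (implied_total - known_sum) / remaining

# Gap between what the market buys (traction) and what the agents do (autonomy).
traction_autonomy_gap = REPORTED_MT_MEAN - REPORTED_AD_MEAN

print(known_sum)               # 305
print(implied_remaining_mean)  # 52.5
print(traction_autonomy_gap)   # 21.0
```

Under these figures, the undisclosed companies sit close to the overall average, so the low Autonomy Depth mean is not driven by one or two outliers alone.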
High: The Health Vertical Divergence
Athelas and careCycle score identically (64.5) through structurally different paths. Athelas leads Capital & Defensibility (73 vs. 68); careCycle leads Autonomy Depth (60 vs. 55). This divergence signals a vertical-specific valuation question: in regulated markets, does defensive moat or agent sophistication compound faster?
Neither company has both. The entity that merges Athelas-style regulatory defensibility with careCycle-style autonomous care coordination creates a structurally unassailable position. This is not a technology gap—it is an organizational and regulatory strategy gap.
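The tie between Athelas and careCycle can be made concrete with the published weights (Autonomy Depth 20%, Capital & Defensibility 15%). The sketch below is illustrative arithmetic only: it uses the four scores quoted above and shows how much each gap is worth in composite points. The scores for the other three dimensions are not given in this report, so the code only derives the net offset they must supply.

```python
# Published weights for the two dimensions where the companies diverge.
WEIGHTS = {"AD": 0.20, "CD": 0.15}

# Scores quoted in this report.
ATHELAS = {"AD": 55, "CD": 73}
CARECYCLE = {"AD": 60, "CD": 68}


def weighted_gap(dimension: str) -> float:
    """Composite points Athelas gains (+) or loses (-) versus careCycle on one dimension."""
    return WEIGHTS[dimension] * (ATHELAS[dimension] - CARECYCLE[dimension])


cd_gap = weighted_gap("CD")   # Athelas ahead on defensibility
ad_gap = weighted_gap("AD")   # Athelas behind on autonomy
net_gap = cd_gap + ad_gap

# Both composites are 64.5, so the remaining three dimensions
# (Model Capability, Market Traction, Platform Ecosystem) must offset the net gap.
required_offset = -net_gap

print(round(cd_gap, 2))           # 0.75
print(round(ad_gap, 2))           # -1.0
print(round(required_offset, 2))  # 0.25
```

Because Autonomy Depth carries more weight than Capital & Defensibility, an equal 5-point lead is worth more to careCycle; Athelas needs a combined 0.25 composite points from the other three dimensions to tie.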
Medium: The Infrastructure-vs-Agent Identity Crisis
Daily (rank #8) scores 70 across three dimensions but only 40 on Autonomy Depth. Mutiny (rank #7) scores 73 on defensibility but 45 on autonomy. Both are classified as "agent companies" but function more as infrastructure or optimization tools with agent marketing.
This matters for investors. A portfolio that thinks it holds 10 "AI agent" positions actually holds 6 agent companies and 4 infrastructure/optimization plays. The structural difference in exit multiples between "agent" and "SaaS with AI features" will widen as the category matures. The companies aware of this distinction are racing to increase their Autonomy Depth scores before the market reclassifies them.
Ecosystem Landscape
How the top 10 distribute across domains, and where the structural concentration reveals opportunity and risk.
Domain Distribution
Software dominates the top 10 with 6 companies, followed by Productivity (2) and Health (2). The Software concentration reflects both the market reality—developer tools and B2B SaaS are the first adopters of agent workflows—and a gap signal: consumer, finance, legal, and education verticals are underrepresented.
The Dimension Landscape
Average scores across the top 10 reveal the structural profile of the AI agent market in early 2026.
Score Distribution
The top 10 spans only 6 points (63.0 to 69.0). This is an unusually compressed range, indicating that the market has not yet produced a breakaway leader. For comparison, the K-Pop vertical's top 6 spans roughly 24 points (68.5 to 92.15). The AI agent market is structurally undifferentiated.
Three companies share the same composite score at rank 5–7 (64.5 each: Athelas, careCycle, Mutiny), arrived at through entirely different dimensional profiles. This suggests the composite score alone is insufficient for investment decisions—the dimension breakdown is where the signal lives.
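The compression and the ties described above can be reproduced directly from the composite scores in the W12-2026 ranking. This is a minimal sketch; the dictionary simply restates the published composites, and the variable names are illustrative.

```python
from collections import defaultdict

# Composite scores as published in the W12-2026 stack ranking.
COMPOSITES = {
    "Persana AI": 69.0,
    "Fiber AI": 68.0,
    "Warmly": 66.0,
    "Fini": 65.0,
    "Athelas": 64.5,
    "careCycle": 64.5,
    "Mutiny": 64.5,
    "Daily": 63.2,
    "Inkeep": 63.0,
    "QueryPie AI": 63.0,
}

# Range between the highest and lowest composite in the top 10.
spread = max(COMPOSITES.values()) - min(COMPOSITES.values())

# Group companies by composite score and keep only scores shared by more than one.
by_score = defaultdict(list)
for company, score in COMPOSITES.items():
    by_score[score].append(company)
ties = {score: names for score, names in by_score.items() if len(names) > 1}

print(spread)  # 6.0
print(ties)    # {64.5: ['Athelas', 'careCycle', 'Mutiny'], 63.0: ['Inkeep', 'QueryPie AI']}
```

Half of the top 10 share a composite with at least one other company, which is why the dimension breakdown carries more information than the rank itself.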
The "Traction-First" Pattern
Every company in the top 10 has Market Traction as its strongest or second-strongest dimension. Not a single company leads with Autonomy Depth. This is a market where companies are optimizing for customer acquisition and revenue, then retrofitting autonomy. The structural question: will the market reward this approach, or will a "deep autonomy first" entrant leapfrog the current leaders?
Methodology
Five dimensions, 100-point weighted composite scale. Every score traces to structural evidence.
SBPI Dimensions — AI Agent Vertical
Classification Criteria
Of 1,672 AI companies in the YC ecosystem, 677 (40.5%) met the structural criteria for "true AI agent" classification:
- Autonomous task execution: The system can complete multi-step tasks without human intervention at each step
- Multi-step reasoning: The system chains decisions across multiple actions, not just single-turn responses
- Tool integration: The system uses external tools, APIs, or data sources as part of its workflow
- Persistent state: The system maintains context across interactions and adapts based on prior outcomes
Companies that failed classification were categorized as: chatbot (single-turn), wrapper (thin layer over foundation model), fine-tuning service (model customization, not agent deployment), or platform/infrastructure (enables agents but does not deploy them).
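The four criteria above can be read as a conjunctive checklist. The sketch below assumes that a company must satisfy all four to be classified as a true agent; the report does not publish its classifier, so `AgentCriteria`, `unmet_criteria`, and `is_true_agent` are hypothetical names, and the two example profiles are invented for illustration rather than drawn from real companies.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AgentCriteria:
    """One boolean per structural criterion listed in the methodology."""
    autonomous_task_execution: bool
    multi_step_reasoning: bool
    tool_integration: bool
    persistent_state: bool


def unmet_criteria(profile: AgentCriteria) -> list[str]:
    """Return the names of the criteria the profile fails, in declaration order."""
    return [name for name, met in asdict(profile).items() if not met]


def is_true_agent(profile: AgentCriteria) -> bool:
    """Assumption: a system qualifies only if every criterion is met."""
    return not unmet_criteria(profile)


# Hypothetical profiles, not real companies.
full_agent = AgentCriteria(True, True, True, True)
single_turn_bot = AgentCriteria(False, False, True, False)

print(is_true_agent(full_agent))        # True
print(is_true_agent(single_turn_bot))   # False
print(unmet_criteria(single_turn_bot))
# ['autonomous_task_execution', 'multi_step_reasoning', 'persistent_state']

# Share of the portfolio that passed, using the counts reported above.
print(round(677 / 1672 * 100, 1))       # 40.5
```

Returning the list of unmet criteria, rather than a bare pass/fail, makes it possible to see why a company landed in a failed category.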
Data Sources
- YC Company Database: Batch data, founding dates, team size, category classification
- Crunchbase: Funding rounds, valuations, investor composition, financial signals
- GitHub: Commit velocity, repository activity, open-source engagement, documentation depth
- API Documentation: Endpoint coverage, SDK availability, integration breadth, developer experience quality
- Product Hunt / G2: User reviews, adoption velocity, competitive comparison signals
- Public Filings & Press: Revenue disclosures, partnership announcements, regulatory filings
Scoring Process
- Nightly pipeline: Automated extraction from data sources listed above. Each dimension scored independently using structural evidence, not sentiment.
- Composite calculation: Composite = (MC × 0.20) + (MT × 0.25) + (PE × 0.20) + (AD × 0.20) + (CD × 0.15), where MC is Model Capability, MT is Market Traction, PE is Platform Ecosystem, AD is Autonomy Depth, and CD is Capital & Defensibility.
- Tier classification: W12-2026 is the baseline week. All companies are classified as "Emerging." Tier promotions (Established, Leader) require 4+ consecutive weeks of scoring with upward delta trends.
- Evidence chain: Every dimension score traces to specific data points. No score is generated without a source document.
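The composite calculation and the tier-promotion rule in the scoring process above can be sketched as follows. The weights come from the published formula. Everything else is an assumption made for illustration: the function names, the example score profile (which is not any company's actual breakdown), and in particular the reading of "upward delta trends" as "each of the most recent four weekly composites is strictly higher than the one before it."

```python
# Published dimension weights: Model Capability, Market Traction,
# Platform Ecosystem, Autonomy Depth, Capital & Defensibility.
WEIGHTS = {"MC": 0.20, "MT": 0.25, "PE": 0.20, "AD": 0.20, "CD": 0.15}


def composite(scores: dict[str, float]) -> float:
    """Weighted sum of the five dimension scores (each on a 0-100 scale)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    return sum(weight * scores[dim] for dim, weight in WEIGHTS.items())


def eligible_for_promotion(weekly_composites: list[float], min_weeks: int = 4) -> bool:
    """Assumed rule: at least `min_weeks` scored weeks, with the most recent
    `min_weeks` composites strictly increasing week over week."""
    if len(weekly_composites) < min_weeks:
        return False
    recent = weekly_composites[-min_weeks:]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


# Hypothetical profile, chosen only to show the arithmetic.
example = {"MC": 70, "MT": 80, "PE": 60, "AD": 50, "CD": 60}
print(round(composite(example), 1))                        # 65.0

print(eligible_for_promotion([63.0, 63.5, 64.1, 64.8]))    # True
print(eligible_for_promotion([63.0, 63.5, 63.5, 64.8]))    # False (flat week)
print(eligible_for_promotion([63.0, 63.5]))                # False (too few weeks)
```

Since W12-2026 is the baseline week, no company has the history this rule requires yet, which is consistent with every entry being classified as "Emerging."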