Doodle illustration of five stacked cost layers topped by a $20 price tag, showing how AI companies decide pricing
Thought Leadership

How AI Companies Actually Decide Pricing: The Real Cost Behind Your $20 Subscription

Daniel Voss · July 2, 2026 · 13 min read · Independent Analysis · 15+ Verified Sources · Updated July 2026

How AI companies decide pricing comes down to one uncomfortable fact: the same million output tokens cost $0.28 on DeepSeek V4 Flash and $30.00 on GPT-5.5.

Definition
AI Pricing
AI pricing is the process by which providers convert per-token compute cost into a final subscription or API price by layering on training amortization, target margin, competitive positioning, and market strategy.
AI Pricing in 30 Seconds
What you need to know before reading further
AI pricing isn’t one decision — it’s five stacked on top of each other, and only the bottom layer is close to being knowable from the outside. Providers charge per token because every query costs real compute, unlike traditional software. Prices are falling fast on a per-performance basis, but usage is growing even faster, which is why bills keep climbing anyway.
55% typical AI company gross margin, vs. 80-90% for SaaS
100x cheaper: what DeepSeek’s floor tier runs vs. OpenAI’s flagship (verified provider pricing, 2026)
200x/year AI price-to-performance decline since Jan 2024
$20: the subscription price this article reverse-engineers (Anthropic / OpenAI public pricing)
At a Glance — Who Is This For?
This article gives you the mechanics behind every AI price tag you’ve paid, plus a way to price your own AI product.
If you’re an everyday ChatGPT or Claude subscriber wondering what your $20 actually covers, this article shows you the real math, worked through step by step.
If you’re weighing which AI API to build on and the pricing pages all look like noise, this article gives you the framework to compare them properly.
If you’re planning to charge for your own AI-powered product, the embedded calculator gives you a defensible price floor before you set a number.

Anthropic cut its flagship model’s price by two-thirds in one generation — same infrastructure, same tokens, a third of the bill. That gap is the real story behind how AI companies decide pricing: not a single number, but a stack of decisions about compute cost, margin, and market pressure most users never see. This piece breaks that stack apart, layer by layer, then hands you a calculator to run the math yourself.

HOW DO AI COMPANIES DECIDE PRICING?

AI companies decide pricing by stacking margin and strategy on top of real per-token compute cost. Unlike traditional software, every AI query costs money to serve, so the price you see reflects inference cost, training amortization, target margin, and competitive positioning — not just what the market will bear.

THIS ARTICLE INCLUDES A WORKING CALCULATOR — Enter your model tier, usage per user, and target margin to get a defensible subscription price for your own AI product.


Why Did AI Pricing Break the Software Rulebook?

AI pricing broke the software rulebook because it reintroduced a cost that SaaS had spent two decades eliminating: the cost of serving each individual customer. Traditional software runs on marginal cost near zero — once a CRM or project-management tool is built, the thousandth user costs almost nothing more to serve than the tenth. That’s the entire reason mature SaaS businesses converged on 80-90% gross margins and flat per-seat pricing that never had to think about what happens inside the software.

AI inference doesn’t get that luxury. Every single query sends real text through a real model on real hardware, and that hardware has a real, metered cost attached to it. There’s no build-once-serve-infinitely trick — the compute bill recurs on the next request, and the one after that, scaling directly with how much people actually use the product rather than fading into a rounding error.

Doodle graph comparing flat SaaS costs to rising AI inference costs as user count grows
Traditional SaaS cost stays flat as users scale; AI inference cost climbs with every query.
52%: average AI company gross margin in 2026, up from 41% in 2024, and still well below the 80-90% range SaaS investors expect.
Industry Position
A provider can’t collect $10 from a customer and spend ten cents of it on compute without accounting for the gap — GPUs are expensive, and they have a real footprint in electricity and heat.
Jacob Jackson — Co-founder, Supermaven; ML Lead, Cursor · Feb 2026

That’s the tension every AI pricing decision has to resolve: a bill that grows with usage, sitting inside a business model built to look like the flat SaaS subscriptions customers already expect.
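The contrast above can be sketched in a few lines. This is a minimal illustration, not a model of any real provider: the $20 price, the $2 fixed cost, and the one-cent cost per query are all hypothetical numbers chosen to make the shape of the problem visible.

```python
def gross_margin(price: float, fixed_cost: float,
                 queries: int, cost_per_query: float) -> float:
    """Gross margin (as a fraction of price) for one user in one month."""
    serving_cost = fixed_cost + queries * cost_per_query
    return (price - serving_cost) / price


# Hypothetical figures: a $20/month plan with $2 of fixed per-user overhead.
PRICE, FIXED = 20.0, 2.0

for queries in (100, 760, 2000):
    saas = gross_margin(PRICE, FIXED, queries, cost_per_query=0.0)   # near-zero marginal cost
    ai = gross_margin(PRICE, FIXED, queries, cost_per_query=0.01)    # every query is metered
    print(f"{queries:>5} queries  SaaS-style {saas:.0%}  AI-style {ai:.0%}")
```

With a zero marginal cost the margin never moves, no matter how many queries a user sends. With a metered cost per query, the same flat price goes from a healthy margin to a loss as usage climbs.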


What Does It Actually Cost to Run the AI You Use?

Running the AI you use costs a per-token bill measured in tiny fractions of a cent, split into two prices: what you send in and what the model sends back. A token is roughly three-quarters of an English word, and every provider charges input and output at different rates. Output always costs more, because generating text is a more expensive computation than reading it.

Input tokens pass through the model once, in a single forward computation. Output tokens are generated one at a time, autoregressively: the model runs a full probability calculation across its entire vocabulary for every single token it writes. That’s why output pricing runs from 2x to just over 6x the input rate across the providers in Table 1, and it’s the single most consistent pattern in AI pricing.
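To make the three-quarters-of-a-word rule of thumb concrete, here is a rough sketch for turning a word count into a token estimate. The ratio is the heuristic quoted above, not a real tokenizer; actual counts vary by model and by language, so treat the result as a ballpark.

```python
import math

WORDS_PER_TOKEN = 0.75  # rule of thumb from the text, not a tokenizer


def estimate_tokens(word_count: int) -> int:
    """Approximate token count for English text, rounded up."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


prompt_words = 300     # hypothetical prompt length
reply_words = 750      # hypothetical reply length
print(estimate_tokens(prompt_words), "input tokens (estimated)")
print(estimate_tokens(reply_words), "output tokens (estimated)")
```

For billing that has to be exact, the provider's own token count in the API response is the number to trust.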

Here’s what the four major providers charge right now, verified directly against their own pricing documentation:

| Provider | Model | Input ($/M tokens) | Output ($/M tokens) |
| --- | --- | --- | --- |
| Anthropic | Claude Opus 4.8 | $5.00 | $25.00 |
| Anthropic | Claude Sonnet 5 (intro, through Aug 31, 2026) | $2.00 | $10.00 |
| Anthropic | Claude Haiku 4.5 | $1.00 | $5.00 |
| OpenAI | GPT-5.5 | $5.00 | $30.00 |
| OpenAI | GPT-5.4 | $2.50 | $15.00 |
| OpenAI | GPT-5.4 Nano | $0.20 | $1.25 |
| Google | Gemini 3.1 Pro (≤200K context) | $2.00 | $12.00 |
| Google | Gemini 3.5 Flash | $1.50 | $9.00 |
| DeepSeek | V4 Flash (cache miss) | $0.14 | $0.28 |
| DeepSeek | V4 Pro (cache miss, promotional) | $0.435 | $0.87 |

Table 1: Standard API rates, verified against official provider documentation, checked live at time of writing. DeepSeek V4 Pro reflects an active promotional discount.
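Per-million rates are hard to feel, so here is a small sketch that prices a single request using a few rows of Table 1. The dictionary keys are informal labels made up for this example, not official API model identifiers, and the 2,000-in / 500-out request size is a hypothetical.

```python
# (input, output) rates in USD per million tokens, copied from Table 1.
# Keys are informal labels for this sketch, not real API model IDs.
RATES = {
    "claude-opus-4.8": (5.00, 25.00),
    "gpt-5.5": (5.00, 30.00),
    "gemini-3.1-pro": (2.00, 12.00),
    "deepseek-v4-flash": (0.14, 0.28),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at standard (uncached) rates."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


for model in RATES:
    cost = request_cost(model, input_tokens=2_000, output_tokens=500)
    print(f"{model:<20} ${cost:.5f} per request")
```

The per-request numbers look trivially small. They stop looking small once they are multiplied by every request from every user, every day.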

Doodle bar chart comparing AI token pricing across DeepSeek, Google, Anthropic, and OpenAI models
Output token price per million, DeepSeek V4 Flash to GPT-5.5 — roughly a 100x spread.

The spread across that table is the whole story: DeepSeek’s cheapest tier costs roughly 35 to 100 times less than OpenAI’s flagship for the same million tokens. That’s not a rounding difference — it’s the entire reason “which AI should I build with” and “which AI subscription should I pay for” are such different questions depending on what you’re actually asking the model to do.

Almost every provider also discounts repeated context. Anthropic’s prompt caching cuts cached input to roughly 10% of the standard rate. OpenAI’s cached input runs about 10% of standard on its flagship. DeepSeek’s cache-hit pricing on V4 Flash drops input to $0.0028 per million tokens — a 98% discount over a fresh request. If an AI product resends the same system prompt on every call, which most do, caching is doing a lot of quiet work on the bill you never see. If you’re comparing models directly, our ChatGPT vs Claude breakdown covers the qualitative differences this table doesn’t.
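A rough sketch of what caching does to the input side of a bill, assuming a product that resends one system prompt on every call. It deliberately simplifies: the first call pays the full rate, every later call pays the cached rate on the system prompt, and any cache-write surcharge or cache expiry is ignored. The prompt sizes and call volume are hypothetical.

```python
def input_bill(system_tokens: int, user_tokens: int, calls: int,
               rate_per_m: float, cache_multiplier: float = 1.0) -> float:
    """Input-side cost in USD for `calls` requests sharing one system prompt.

    cache_multiplier is the cached rate as a fraction of the standard rate
    (1.0 means no caching, 0.10 means cached input costs 10% of standard).
    """
    first_call = system_tokens + user_tokens
    later_calls = (calls - 1) * (system_tokens * cache_multiplier + user_tokens)
    return (first_call + later_calls) * rate_per_m / 1_000_000


# Hypothetical workload: 3,000-token system prompt, 200-token user message,
# 1,000 calls, at a $5.00/M input rate.
uncached = input_bill(3_000, 200, 1_000, rate_per_m=5.00)
cached = input_bill(3_000, 200, 1_000, rate_per_m=5.00, cache_multiplier=0.10)
print(f"uncached ${uncached:.2f}  cached ${cached:.2f}")
```

The saving depends almost entirely on how much of each request is repeated context. A long shared system prompt with short user messages benefits most.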


What Is the AI Price Stack?

The AI Price Stack is a five-layer model of how a per-token price actually gets built, from raw compute cost at the bottom to market strategy at the top. Every number on a pricing page is the sum of these five decisions, even when a provider never states them separately.

Framework
The AI Price Stack
Five layers that combine into every AI price you pay
01 The Compute Floor — the hard floor: the actual GPU cost of running one query, priced per input and output token.
02 The Training Mortgage — every query pays down the cost of building the model in the first place, folded in like a mortgage payment.
03 The Margin Layer — the margin a provider needs to survive, thinner than software has ever tolerated.
04 The Competitive Anchor — no price exists in isolation; rivals’ pricing forces every provider to justify its own.
05 The Strategy Premium — not cost-driven at all: loss-leaders, enterprise premiums, market-share bets.

Epoch AI’s research puts GPT-4’s amortized training cost near $40 million, with frontier training costs growing roughly 2.4x per year since 2016 — a trajectory that puts the largest runs past $1 billion by 2027. Anthropic CEO Dario Amodei has separately put current frontier training costs in the $100 million to $1 billion range. DeepSeek’s $0.14 floor forces every other provider to justify why their token costs 10 to 100 times more, and that justification becomes part of the price itself.
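The growth figures above can be checked with simple compounding. This sketch takes the $40 million estimate and the 2.4x annual growth rate from the text; treating 2023 as the base year for that estimate is an assumption made here for illustration, not something the cited research states.

```python
import math


def projected_cost(base_cost: float, growth_per_year: float, years: float) -> float:
    """Training cost after `years` of compounding growth."""
    return base_cost * growth_per_year ** years


def years_to_reach(base_cost: float, target_cost: float, growth_per_year: float) -> float:
    """Years of compounding needed to grow from base_cost to target_cost."""
    return math.log(target_cost / base_cost) / math.log(growth_per_year)


BASE_YEAR = 2023          # assumed base year for the $40M estimate
BASE_COST = 40e6          # USD, estimate cited in the text
GROWTH = 2.4              # per year, cited in the text

crossing = BASE_YEAR + years_to_reach(BASE_COST, 1e9, GROWTH)
print(f"$1B crossed around {crossing:.1f}")
print(f"Projected 2027 cost: ${projected_cost(BASE_COST, GROWTH, 2027 - BASE_YEAR) / 1e9:.2f}B")
```

Under that base-year assumption the $1 billion threshold falls before 2027, which is consistent with the trajectory described above.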

The five layers aren’t abstract, but it’s worth being direct about what’s actually knowable here: no AI provider discloses this breakdown. Anthropic, OpenAI, Google, and DeepSeek all publish the final number in Table 1 — none publish what’s underneath it.

The AI Price Stack framework diagram showing five pricing layers from compute cost to strategy premium
The AI Price Stack — only the Compute Floor is externally estimable; the rest is undisclosed.
| Layer | What it covers | Status |
| --- | --- | --- |
| 1. Compute Floor | Raw GPU inference cost per million tokens | Estimable: self-hosted GPU cost ~$0.79/hr, comparable cloud rental $2.82–$5.64/hr, per a 2025 inference-economics study; exact provider efficiency and batching undisclosed |
| 2. Training Mortgage | Amortized share of the original training run | Not disclosed |
| Infrastructure & reliability (overhead, not numbered in the stack above) | Redundancy, uptime, safety systems, staffing | Not disclosed |
| 3. Margin Layer | Target gross margin | Not disclosed per-model; Bessemer’s industry-wide average is ~50-60% |
| 4-5. Competitive Anchor + Strategy Premium | Positioning against rivals, market strategy | Not disclosed; inferred only by comparing published prices |
| What providers publish | Final list price only, e.g. Claude Sonnet 5 at $10.00/M output tokens (intro rate, Table 1) | Published |
Key Insight

What this table actually shows is a boundary, not a breakdown: the input side has rough, independently-sourced reference points; the decision side is entirely opaque and known only by providers themselves. Any article, including this one, that hands you a precise dollar split across these five layers is presenting an estimate as a fact.

A reader looking at a $20 subscription is looking at all five layers stacked on top of each other, collapsed into one number.

How AI Companies Decide Pricing: The Five Layers at a Glance

How AI companies decide pricing comes down to these five layers in sequence: compute cost sets the floor, training cost gets amortized on top, margin gets added to survive, competitors’ prices force justification, and strategy adds whatever premium the market will bear. Each layer is optional to disclose — only the final number is public.


Which Pricing Model Do Most AI Companies Actually Use?

Most AI companies actually use two or three pricing models stacked together, not one. Nearly half of the top-valued AI startups mapped in a 2026 analysis by Product Faculty run subscription, usage-based, and freemium pricing simultaneously — OpenAI and Anthropic both do this identically: tiered consumer subscriptions, metered API pricing for developers, and a free tier underneath both.

Doodle grid comparing subscription, usage-based, outcome-based, and hybrid AI pricing models
Four pricing models AI companies use, often two or three at once.

Subscription is what most people actually pay. A flat monthly fee — Claude Pro, ChatGPT Plus — buys access up to a soft usage limit, and it works because most individual users never come close to that limit. It fails, as covered next, when usage stops being predictable.

Usage-based (metered) pricing is the default for developers building on top of a model. Every API call bills per token, per Table 1, and cost tracks consumption exactly — fair to light users, unpredictable for anyone trying to budget a growing product.

Outcome-based pricing charges for a result instead of a token. Fin, the AI customer-support agent formerly sold under the Intercom name, charges $0.99 per resolved conversation rather than per token or per seat — customers pay only when the AI actually solves the problem, verified directly against Fin’s current pricing documentation.

Hybrid pricing — a subscription base plus usage-based overage — is where the market is converging. Zuora’s 2026 AI pricing analysis calls hybrid the common pattern in mature production AI businesses, balancing buyer predictability with seller margin protection — the structural answer to the subsidy problem covered next.
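The four models differ mainly in what gets counted on the invoice. Here is a minimal sketch of each as a billing function. The $0.99 per-resolution price is the figure quoted above; the base fee, included allowance, and overage rate in the hybrid example are hypothetical.

```python
def subscription_bill(flat_fee: float) -> float:
    """Flat monthly fee, independent of usage."""
    return flat_fee


def usage_bill(tokens: int, rate_per_m: float) -> float:
    """Metered billing: pay per token consumed."""
    return tokens * rate_per_m / 1_000_000


def outcome_bill(resolutions: int, price_per_resolution: float = 0.99) -> float:
    """Outcome billing: pay per resolved conversation."""
    return resolutions * price_per_resolution


def hybrid_bill(base_fee: float, included_tokens: int,
                tokens: int, overage_rate_per_m: float) -> float:
    """Subscription base plus metered overage beyond the included allowance."""
    overage_tokens = max(0, tokens - included_tokens)
    return base_fee + overage_tokens * overage_rate_per_m / 1_000_000


# Hypothetical hybrid plan: $20 base, 2M tokens included, $10/M overage.
for used in (1_000_000, 3_000_000):
    print(used, "tokens ->", hybrid_bill(20.0, 2_000_000, used, 10.0))
```

The hybrid function shows why the model appeals to both sides: a light user sees the same predictable fee as a plain subscription, while a heavy user's extra consumption is billed instead of absorbed.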


Why Does Everyone Pay $20 When Usage Isn’t Equal?

Everyone pays the same subscription because AI providers are betting on the average, not the individual. A casual user who sends twenty messages a month and a power user running agentic workflows all day pay the identical $20, even though the second user can cost 50 to 100 times more to serve. The subscription only works if enough light users offset the heavy ones.

Doodle illustration showing how light AI users subsidize heavy users under flat $20 subscription pricing
Light and heavy users pay the same $20 — but cost very different amounts to serve.

That bet doesn’t always pay off. GitHub Copilot’s early flat-rate plan charged $10 a month per developer, but reporting from The Wall Street Journal in October 2023 found the average user was costing Microsoft closer to $30 a month in compute, a loss of roughly $20 per user before Microsoft adjusted the pricing structure. The heaviest users, running long completions constantly, reportedly cost as much as $80 a month, eight times the subscription fee.

Key Distinction

“Flat” pricing almost never stays purely flat. Providers respond with soft caps, weekly rate limits, or usage-based tiers layered on top — not to complicate the pricing page, but because a pure average-based bet breaks once usage gets uneven enough.

The light users are quietly subsidizing the heavy ones, right up until the imbalance gets too expensive to ignore.
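The subsidy is just a weighted average, and a sketch shows how sensitive it is to the user mix. Every share and cost below is hypothetical; no provider publishes its real distribution.

```python
def blended_cost(user_mix: list[tuple[float, float]]) -> float:
    """Average monthly cost per subscriber.

    user_mix is a list of (share_of_users, monthly_cost_to_serve) pairs
    whose shares should sum to 1.
    """
    total_share = sum(share for share, _ in user_mix)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("user shares must sum to 1")
    return sum(share * cost for share, cost in user_mix)


PRICE = 20.0
# Hypothetical mixes: (share of users, cost to serve in USD/month).
healthy = [(0.80, 2.0), (0.15, 10.0), (0.05, 100.0)]
strained = [(0.70, 2.0), (0.15, 10.0), (0.15, 100.0)]

for name, mix in (("healthy", healthy), ("strained", strained)):
    cost = blended_cost(mix)
    print(f"{name}: avg cost ${cost:.2f}, margin {(PRICE - cost) / PRICE:.1%}")
```

In this illustration, moving ten percentage points of users from the light group to the heavy group is enough to wipe out most of the margin, which is the pressure behind soft caps and rate limits.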


How Much Does a $20 AI Subscription Actually Cost to Serve?

A $20 AI subscription costs roughly $5 to $10 a month to serve a light-to-moderate user and can cost $20 or more for a heavy one, with all of those figures priced at API list rates. The entire subscription model is a bet on that gap averaging out. Here’s the math, built from independently estimated usage figures since providers don’t disclose this directly.

First-Hand Disclosure

This publication runs its own editorial production, all 25 to 28 monthly articles, entirely on a single Claude Pro subscription: $20 a month, no API billing, no usage credits purchased. That’s a real, disclosed data point, not a hypothetical — and it’s the same $20 this section is about to reverse-engineer.

Anthropic doesn’t publish an exact token allowance for Claude Pro; its own documentation only says usage “varies based on message length, conversation length, model, and effort level.” Independent technical analyses that reverse-engineer the limit from real usage commonly estimate it at roughly 44,000 tokens per 5-hour session window — a third-party estimate, not an Anthropic-disclosed number.

| Usage tier | Sessions/day | Tokens/month (assumed) | Cost at Sonnet 5 rates |
| --- | --- | --- | --- |
| Light user | ~0.5 | ~660,000 | ~$5.02/month |
| Moderate user | ~1 | ~1,320,000 | ~$10.03/month |
| Heavy user | ~2 (near cap) | ~2,640,000 | ~$20.06/month |

Cost calculated using Sonnet 5’s intro rate, verified in Table 1, assuming a 30% input / 70% output split typical of conversational use — an estimate, not a disclosed figure.
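The arithmetic behind the table can be reproduced directly from its stated assumptions. This sketch uses the 44,000-token session estimate, the 30/70 input-output split, and Sonnet 5's intro rates from Table 1; the 30-day month is an assumption added here.

```python
TOKENS_PER_SESSION = 44_000   # third-party estimate cited in the text
DAYS_PER_MONTH = 30           # assumption for this sketch
INPUT_SHARE = 0.30            # assumed 30% input / 70% output split
IN_RATE, OUT_RATE = 2.00, 10.00  # Sonnet 5 intro rates, USD per million tokens


def monthly_tokens(sessions_per_day: float) -> int:
    """Tokens consumed per month at a given number of full sessions per day."""
    return round(TOKENS_PER_SESSION * sessions_per_day * DAYS_PER_MONTH)


def monthly_cost(tokens: int) -> float:
    """Cost in USD of those tokens at the blended input/output rate."""
    blended_rate = INPUT_SHARE * IN_RATE + (1 - INPUT_SHARE) * OUT_RATE
    return tokens * blended_rate / 1_000_000


for tier, sessions in (("Light", 0.5), ("Moderate", 1), ("Heavy", 2)):
    tokens = monthly_tokens(sessions)
    print(f"{tier:<9} {tokens:>9,} tokens  ${monthly_cost(tokens):.2f}/month")
```

Under these assumptions the blended rate works out to $7.60 per million tokens, and the three tiers land at roughly $5, $10, and $20 a month, close to the table's cost column. Changing the input share or the session estimate moves every row proportionally.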

Doodle bar chart estimating what light, moderate, and heavy users cost to serve on a $20 AI subscription
Estimated cost to serve by usage tier — heavy users cross the $20 break-even line.

The heavy-user row lands right around the $20 price point — close to the break-even zone. An independent June 2026 analysis comparing Anthropic and OpenAI subscription limits against equivalent API pricing put Claude Max 20x’s break-even utilization at roughly 10% of its usage cap. Below that, the subscription is comfortably profitable. Above it, every additional heavy user erodes margin fast.

The honest takeaway: a $20 subscription isn’t priced to cover the cost of one specific user. It’s priced to cover the average across everyone paying $20, which only works as long as the light users keep outnumbering the heavy ones.


What Would Your AI Tool’s Subscription Price Need to Be?

Your AI tool’s subscription price needs to cover the compute cost of serving your average user, plus your target margin, before it can be called defensible. The calculator below runs the same math as the worked example above, on your own numbers — useful whether you’re scoping a new feature or estimating what an agentic workflow actually costs to run.

Interactive Tool: AI Subscription Price Calculator
Inputs: model tier, usage per user, target margin. Output: Minimum Defensible Price ($/month).
This number is a floor, not a recommendation. It tells you what compute cost demands — not what the market will pay or what strategic premium you might add on top. The AI Price Stack’s Layers 4 and 5 still apply above this floor.
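For readers who want the floor calculation in a form they can run, here is a minimal sketch of the math described in this article: cost to serve an average user, divided by one minus the target margin. It is an approximation of the idea, not the code behind the embedded tool, and the function name and parameters are made up for this example.

```python
def minimum_defensible_price(tokens_per_user_month: int,
                             in_rate: float, out_rate: float,
                             input_share: float,
                             target_margin: float) -> float:
    """Lowest monthly price that covers compute cost at the target gross margin.

    Rates are USD per million tokens; input_share and target_margin are
    fractions between 0 and 1 (target_margin must be below 1).
    """
    if not 0 <= input_share <= 1:
        raise ValueError("input_share must be between 0 and 1")
    if not 0 <= target_margin < 1:
        raise ValueError("target_margin must be in [0, 1)")
    blended_rate = input_share * in_rate + (1 - input_share) * out_rate
    cost_to_serve = tokens_per_user_month * blended_rate / 1_000_000
    return cost_to_serve / (1 - target_margin)


# Example: the moderate user from the worked table, Sonnet 5 intro rates,
# and a hypothetical 55% target margin.
price = minimum_defensible_price(1_320_000, in_rate=2.00, out_rate=10.00,
                                 input_share=0.30, target_margin=0.55)
print(f"Minimum defensible price: ${price:.2f}/month")
```

Dividing by one minus the margin, rather than multiplying cost by one plus the margin, is what makes the result a gross margin on price instead of a markup on cost. The two are easy to confuse and give noticeably different numbers at high margins.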

Why Do AI Prices Keep Falling While Bills Keep Rising?

AI prices keep falling because compute efficiency and competition are moving faster than almost any other technology market in history — and bills keep rising anyway because usage is growing even faster than prices are dropping.

Epoch AI’s research team measured how fast the price to hit a given performance level has fallen across six benchmarks, from PhD-level science questions to coding tests. The rate varies enormously by task, ranging from 9x to 900x per year, with a median around 50x per year across the full three-year window they studied. Restricting to trends since January 2024 alone, that median rate jumps to roughly 200x per year.

Doodle chart showing AI token prices falling over time while total usage and spending rise
Price per token falls while total usage and spending climb — cheaper models get used for more.

That should make every AI product cheaper to run every year. For many, it does. But total spend on AI is still climbing for most companies in production: more capable, cheaper models don’t just get used the same amount for less money — they get used for more things. A model that costs a fraction of what it did last year invites reasoning chains, agentic loops, and longer context windows that weren’t economical before, and each of those consumes far more tokens per task than the simple queries the old pricing was built around.
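The paradox reduces to one ratio: a bill is price per token times tokens used, so total spend moves by usage growth divided by price decline. The factors below are hypothetical and only illustrate the direction of the effect.

```python
def bill_multiplier(price_drop_factor: float, usage_growth_factor: float) -> float:
    """How total spend changes when per-token price falls and token usage grows.

    A price_drop_factor of 4 means tokens cost a quarter of what they did;
    a usage_growth_factor of 10 means ten times as many tokens are consumed.
    """
    return usage_growth_factor / price_drop_factor


# Hypothetical year: tokens get 4x cheaper.
print(bill_multiplier(4, 2))    # simple chat use grows 2x: bill halves
print(bill_multiplier(4, 10))   # agentic loops push usage 10x: bill rises 2.5x
```

Whenever the second factor outruns the first, the invoice grows even though every individual token got cheaper.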

This is the same dynamic as the light-user, heavy-user subsidy covered earlier, playing out over time instead of across users: falling prices buy AI companies room to do more per query, and providers who bank that room as margin see their economics improve. ICONIQ’s reported average gross margin climbed from 41% in 2024 to 52% in 2026, even as the sticker price per token kept dropping. It’s one more layer in how AI companies decide pricing: the same efficiency gain can be kept as profit or passed on as a lower bill, and the provider makes that call, not the market.


Frequently Asked Questions

How do AI companies decide pricing?

AI companies decide pricing by layering margin and market strategy on top of real per-token compute cost, unlike traditional software where serving an extra customer costs almost nothing. The final number reflects inference cost, training amortization, target margin, and competitive positioning combined.

Why is AI subscription pricing usually $20 a month?

AI subscription pricing usually sits around $20 a month because that price covers a typical light-to-moderate user’s compute cost with margin to spare, while providers bet that enough light users offset the heavier ones who cost more to serve than the subscription collects.

Why do AI companies charge more for output tokens than input tokens?

AI companies charge more for output tokens because generating text is a more expensive computation than reading it. Input tokens pass through the model once; output tokens are generated one at a time, with a full probability calculation run for every token, which is why output pricing runs from 2x to just over 6x the input rate across the providers in Table 1.

What is a token in AI pricing?

A token is the basic unit AI providers bill by, roughly three-quarters of an English word. Every API call is priced by counting input tokens (what you send) and output tokens (what the model generates), each at a separate rate.

Why is DeepSeek so much cheaper than ChatGPT or Claude?

DeepSeek is cheaper because its models use efficient architecture and undercut Western providers on price as a competitive strategy, not because the underlying compute is fundamentally different. DeepSeek’s V4 Flash tier costs roughly 35 to 100 times less per million tokens than OpenAI’s or Anthropic’s flagship models.

Do AI companies lose money on their heaviest users?

AI companies can lose money on their heaviest users under flat subscription pricing. GitHub Copilot’s early $10-a-month plan cost Microsoft closer to $30 a month in compute for the average user, according to Wall Street Journal reporting from October 2023, before the pricing structure was adjusted.

Will AI prices keep falling?

AI prices will likely keep falling on a per-performance basis. Epoch AI’s research found the price to reach a given performance level has dropped by a median of roughly 50x per year since 2022, accelerating to around 200x per year in trends measured since January 2024.

Why do AI bills keep rising if prices keep falling?

AI bills keep rising because falling prices make more ambitious use cases affordable, so people run longer reasoning chains, agentic workflows, and bigger context windows that consume far more tokens per task than the cheaper, simpler queries the old pricing was built around.

What is prompt caching and how does it lower AI costs?

Prompt caching lowers AI costs by charging a reduced rate when a model reuses previously processed context instead of reprocessing it from scratch. Anthropic and OpenAI both discount cached input by roughly 90%, and DeepSeek discounts it by about 98% on its V4 Flash tier.

What’s the difference between AI subscription pricing and API pricing?

AI subscription pricing charges a flat monthly fee for capped usage aimed at individual consumers, while API pricing charges per token consumed with no fixed cap, aimed at developers building products on top of the model. The same underlying model can be accessed through either, at very different effective rates.

Is AI subscription pricing subsidized by venture capital?

AI subscription pricing is widely believed to be subsidized, at least partly, by venture capital in the current market. Industry analysts point to the scale of AI funding — OpenAI has raised over $78 billion and Anthropic over $33 billion, per Crunchbase-sourced reporting — and note that providers have prioritized user growth over near-term profit, similar to the early subsidized pricing seen in ride-sharing and streaming before those markets matured.

How do I calculate what to charge for my own AI product?

Calculating a defensible AI product price starts with your cost to serve an average user: model tier, tokens per user, and requests per day, multiplied by the provider’s per-token rate. Divide that cost by one minus your target margin to find the minimum price the compute cost demands, before adding any competitive or strategic premium on top.


Conclusion

That’s how AI companies decide pricing, top to bottom: a compute floor you can partially estimate, a training mortgage no provider discloses, a margin layer running thinner than software has ever tolerated, a competitive anchor set by whoever prices lowest, and a strategy premium that has nothing to do with cost at all.

The AI Price Stack explains what a $20 subscription actually buys — five decisions collapsed into one number. Prices will keep falling on a per-performance basis while bills keep rising as usage grows into the room that falling prices create.

If you’re building your own AI product, scroll back to the calculator above and run your own numbers before you set a price.

Summary diagram of the AI Price Stack framework connecting compute cost, margin, and strategy to a $20 subscription price
The AI Price Stack, start to finish — compute cost to your $20 subscription.
Daniel Voss
Technology Writer & Analyst
Daniel Voss is a technology writer and analyst with 6+ years of experience covering enterprise software, cybersecurity, and the emerging AI infrastructure redefining how SaaS is built and discovered. He writes for technical decision-makers — product leaders, engineers, and founders who want rigorous analysis with a clear point of view. His work at The SaaS Library focuses on the standards, shifts, and structural changes that most coverage reduces to hype.