DeepSeek V4 at $0.14 per Million Tokens. I’m Watching, Not Switching.
A cost-and-risk breakdown for freelancers who spend $100/month on AI tools and wonder whether a Chinese open-source model just made that obsolete.
Content mode: Informed — Field Report
$100 a month — that’s what I spend keeping Claude Pro, ChatGPT Plus, Notion AI, Perplexity Pro, and Cursor Pro running on two monitors. On April 24, 2026, DeepSeek released V4 in two variants: V4-Flash at $0.14 per million input tokens and V4-Pro at $1.74 — roughly one-sixth what Claude Opus charges. I haven’t used this yet, but the question I keep circling back to is simple: at what point does “cheaper and almost as good” become “good enough to rethink my stack”?
Two models shipped, one pricing shock
DeepSeek dropped two open-source models under MIT license — V4-Flash (284 billion parameters) and V4-Pro (1.6 trillion parameters) — both with a 1-million-token context window and up to 384K output tokens.
| Spec | V4-Flash | V4-Pro |
|---|---|---|
| Parameters | 284B | 1.6T |
| Context window | 1M tokens | 1M tokens |
| Input (cache miss) | $0.14 / 1M | $1.74 / 1M |
| Input (cache hit) | $0.028 / 1M | $0.145 / 1M |
| Output | $0.28 / 1M | $3.48 / 1M |
My take: V4-Flash is the attention-grabber — $0.14 input is 95% cheaper than Claude Sonnet. V4-Pro is where the real capability sits, and even that undercuts Opus by 7x on output pricing.
The architecture upgrade is real. DeepSeek introduced what it calls “Hybrid Attention Architecture,” which cuts single-token inference compute to 27% of V3.2’s requirements and reduces KV cache to 10% at the full 1M-token context — the kind of efficiency gain that makes the low pricing sustainable, not just a loss-leader stunt. Huawei’s Ascend 950 chips handle at least part of training and inference through a “Supernode” cluster partnership, though the full infrastructure breakdown remains undisclosed (per CNBC).
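To get a feel for why a 90% KV-cache reduction matters at a 1M-token context, here's a back-of-envelope sizing sketch. The layer, head, and dimension figures are hypothetical illustrations — DeepSeek hasn't published V4's full configuration — so treat the absolute numbers as order-of-magnitude only.

```python
# Back-of-envelope KV-cache sizing at a 1M-token context.
# Layer/head/dim values below are HYPOTHETICAL placeholders, not
# DeepSeek's published config.

def kv_cache_gb(context_tokens, layers, kv_heads, head_dim, bytes_per_elem=2):
    """Memory for keys + values across all layers, in GB (fp16 default)."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem  # K and V
    return context_tokens * per_token / 1e9

# A dense-attention baseline at illustrative dimensions:
baseline = kv_cache_gb(1_000_000, layers=60, kv_heads=8, head_dim=128)
# The reported hybrid architecture keeps ~10% of that at full context:
hybrid = baseline * 0.10

print(f"baseline ~ {baseline:.0f} GB, hybrid ~ {hybrid:.1f} GB")
```

Even with made-up dimensions, the shape of the result holds: at 1M tokens a dense KV cache runs into hundreds of gigabytes, and cutting it to a tenth is the difference between "needs a cluster" and "fits on a serving node" — which is what makes the pricing plausible.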

Close to the frontier, but not past it
V4-Pro-Max scores 90.1% on GPQA Diamond — roughly four points behind Claude Opus 4.7’s 94.2% and GPT-5.5’s 93.6%. On Humanity’s Last Exam without tools, V4-Pro lands at 37.7%, behind GPT-5.5 (41.4%) and Claude Opus 4.7 (46.9%). It’s the strongest open-source model on the board, but frontier models still hold a measurable lead on the hardest reasoning tasks.
GPQA Diamond — V4-Pro-Max: 90.1% · Claude Opus 4.7: 94.2% · GPT-5.5: 93.6%
SWE-bench Verified — V4-Pro: 80.6% · Claude Opus 4.6: 80.8%
HLE (no tools) — V4-Pro: 37.7% · GPT-5.5: 41.4% · Claude Opus 4.7: 46.9%
Coding tells a different story. V4-Pro hits a 3,206 Codeforces rating, edging past GPT-5.4’s 3,168. On Terminal-Bench 2.0, it scores 67.9% versus Claude’s 65.4%. On SWE-bench Verified, it’s essentially tied with Opus 4.6 — 80.6% versus 80.8%. Vals AI’s independent Vibe Code Benchmark found V4 “overwhelmingly” topped the open-source field, beating several closed-source models including Gemini 3.1 Pro.
Bloomberg’s headline was blunt: “fails to narrow US lead in AI.” But that framing misses what matters for someone paying per token. The story isn’t whether V4 is the smartest model alive — it’s that near-frontier intelligence now costs one-sixth to one-seventh of what Claude Opus or GPT-5.5 charges.

The real math: what this costs versus what I pay now
“The story isn’t whether V4 is the smartest model alive — it’s that near-frontier intelligence now costs one-sixth of what Claude Opus charges.”
Here’s the pricing landscape as of this week:
| Model | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| DeepSeek V4-Flash | $0.14 | $0.28 |
| DeepSeek V4-Pro | $1.74 | $3.48 |
| Claude Sonnet 4.6 | $3.00 | $15.00 |
| Claude Opus 4.6 | $5.00 | $25.00 |
| GPT-5.5 | $5.00 | $30.00 |
VentureBeat’s independent evaluation called V4-Pro “near state-of-the-art intelligence at 1/6th the cost of Opus 4.7.” A 3,000-word draft on V4-Flash costs roughly $0.002 — two-tenths of a cent. My monthly $15–25 API overflow spend could theoretically drop below $3 for routine tasks.
V4-Flash: ~$0.20 total · Claude Sonnet: ~$4.50 total · Claude Opus: ~$7.50 total
That’s a 37x cost difference between V4-Flash and Opus on the same workload.
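The per-task arithmetic is simple enough to sanity-check yourself. Here's a minimal sketch using the prices from the table above; the token counts for a "3,000-word draft" are my own rough assumptions (about 1.3 tokens per word, tokenizers vary), not figures from any provider.

```python
# Per-task API cost from the pricing table (USD per 1M tokens, cache-miss
# rates). Token counts are ASSUMED: ~2,000 prompt tokens in, ~4,000 out
# for a 3,000-word draft.

PRICES = {  # model: (input, output) per million tokens
    "v4-flash":   (0.14, 0.28),
    "v4-pro":     (1.74, 3.48),
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.6":   (5.00, 25.00),
    "gpt-5.5":    (5.00, 30.00),
}

def task_cost(model, in_tokens, out_tokens):
    """Dollar cost of one task on the given model."""
    in_price, out_price = PRICES[model]
    return (in_tokens * in_price + out_tokens * out_price) / 1e6

for model in PRICES:
    print(f"{model:>11}: ${task_cost(model, 2_000, 4_000):.4f} per draft")
```

Under those assumptions the draft lands around a tenth of a cent on V4-Flash and around eleven cents on Opus — consistent with the "essentially free" framing, and the exact multiple shifts with how input-heavy your workload is.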
But “could” is doing heavy lifting in that sentence. The savings only matter if the privacy trade-off is one I can accept.
Your client data would live on Chinese servers
DeepSeek stores all data on servers in the People’s Republic of China. Under China’s 2017 National Intelligence Law, the government can compel access with no legal mechanism for the company to resist and no obligation to notify users (per IAPP). Feroot Security found hidden code in DeepSeek’s web chat capable of transmitting user data to China Mobile’s registry.
The regulatory response has been broad:
- Italy — chatbot banned within 72 hours of R1 launch
- EU — 13 jurisdictions opened formal investigations; the EDPB created a dedicated AI Enforcement Task Force
- US — banned on federal government devices and in multiple state agencies
- Also banned on government devices: Australia, Taiwan, South Korea, Czech Republic, Netherlands
These restrictions predate V4, but the underlying data-sovereignty architecture is unchanged.
For a solo operator handling client proposals and strategy docs, the line is clear:
- Client deliverables, financials, proprietary strategy? Not through DeepSeek’s API. Full stop.
- Personal research on public data — SEC filings, published reports? Lower stakes, but regulated-industry pitches still carry risk.
- Generic code scripts that don’t touch client data? Probably fine, but “probably” is a word I don’t love when a client’s name is in the file.
The workaround is self-hosting the open-source weights locally. V4-Flash at 284B parameters is within reach for quantized deployment on consumer hardware with 64GB+ RAM. V4-Pro at 1.6 trillion parameters needs datacenter infrastructure most freelancers don’t have.
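For a sense of what "within reach" means in practice, here's a raw weight-footprint estimate at common quantization levels. This counts parameter storage only — no KV cache, activations, or runtime overhead — and the 64GB-RAM claim implicitly leans on MoE sparsity and expert offloading, since even aggressively quantized full weights don't fit in 64GB.

```python
# Rough weight-file footprint at common quantization levels.
# Counts raw parameter storage only: no KV cache, no activations,
# and no credit for MoE expert offloading (which is what would make
# a 64GB machine plausible for a 284B-parameter sparse model).

def weights_gb(n_params, bits_per_weight):
    """On-disk/in-memory size of the weights alone, in GB."""
    return n_params * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4, 2):
    print(f"V4-Flash (284B) at {bits:>2}-bit: {weights_gb(284e9, bits):,.0f} GB")
```

At 4-bit that's 142 GB of weights for the 284B model — so "consumer hardware" deployments will depend on how well the quantized, offloaded community builds actually perform, not just on having 64GB of RAM.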
Where I’d use it — and where I wouldn’t touch it
The honest answer is narrow. DeepSeek V4 fits a specific lane:
- Bulk summarization of public data — earnings calls, research papers, regulatory filings. High volume, low sensitivity, and the cost difference compounds.
- Personal code automation — file cleanup scripts, CSV transforms, the kind of work I currently use Cursor for but that doesn’t touch client projects.
- Cheap second-opinion runs — run the same prompt through V4-Flash and Claude, compare outputs. At $0.14 per million tokens, double-checking is essentially free.
- Draft generation for my own content — blog outlines, research notes. Not client work.
Where it doesn’t fit: anything with client names, strategies, financials, or proprietary data. That’s most of what I do on a given Tuesday.
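The lane-sorting above reduces to a two-question decision rule: sensitivity picks the provider, volume picks the tier. A minimal sketch — the labels are my own shorthand, not anything from DeepSeek's or Anthropic's documentation:

```python
# Routing rule for the lanes above: client data never leaves the
# trusted stack; everything else is split by volume. Labels are
# illustrative shorthand, not real model identifiers.

def route(touches_client_data: bool, high_volume: bool) -> str:
    if touches_client_data:
        return "claude"  # names, strategy, financials stay put
    # Public/personal data: cheap tier for bulk, stronger tier for reasoning
    return "v4-flash" if high_volume else "v4-pro"

assert route(True, high_volume=True) == "claude"      # client deliverables
assert route(False, high_volume=True) == "v4-flash"   # bulk summarization
assert route(False, high_volume=False) == "v4-pro"    # harder public-data tasks
```

The point of writing it down is how little of my actual workload falls through to the cheap branch: the first `if` catches most of a working Tuesday.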
For me, DeepSeek V4 is a “watch,” not a “switch.” The performance-per-dollar is the best I’ve seen from any model — open or closed — and the open-source weights under MIT license mean the gap between “interesting model” and “thing I actually use” could close faster than expected if self-hosting tools catch up. But today, routing client work through Chinese servers isn’t a trade-off I’m willing to make for a 6x cost reduction. If local deployment of the 284B Flash model becomes genuinely turnkey — not “turnkey for someone with a homelab” but turnkey for someone who bills by the hour and needs it to just work — that changes the math entirely.
FAQ
Can I use DeepSeek V4 for free?
Yes. Free web chat at chat.deepseek.com and a generous API free tier. But the web chat routes every input through Chinese servers. For any real work, use the API with non-sensitive data or self-host the weights.
How does V4 compare to Claude for long-form writing?
It’s weaker. V4-Pro matches Claude on reasoning benchmarks, but early user reports suggest Claude still holds a clear edge on long-form coherence past the 3,000-word mark — the exact territory where client deliverables live. Claude Opus 4.6 also leads on long-context retrieval benchmarks like MRCR v2.
Should I cancel Claude Pro or ChatGPT Plus?
No. DeepSeek V4 is a supplementary tool for cost-sensitive, non-sensitive workloads. Claude and ChatGPT still lead on writing quality, integration ecosystems, and data privacy guarantees. The $20/month you pay for Claude Pro buys trust that $0.14 per million tokens doesn’t.
Is it legal to use DeepSeek in the US?
Yes — for personal and business use. It’s banned on federal government devices and in several state agencies. For freelancers: legal, but don’t route client data through it unless you’re self-hosting the open-source weights on your own infrastructure.
Can I run V4 locally?
Yes, with caveats. V4-Flash (284B) can run on consumer hardware with 64GB+ RAM using quantized versions. V4-Pro (1.6T) requires serious GPU clusters. Hugging Face hosts the weights. “Can run” and “runs well enough for production freelance work” are different questions — I’d want to see community benchmarks on local inference quality before committing.
Pricing comparison
| Model | Monthly cost (est. freelancer usage) | Best for |
|---|---|---|
| DeepSeek V4-Flash (API) | ~$1–3/mo | Bulk summarization, code scripts, research on public data |
| DeepSeek V4-Pro (API) | ~$5–15/mo | Near-frontier reasoning tasks, non-sensitive work |
| Claude Pro (subscription) | $20/mo | Client deliverables, long-form writing, sensitive data |
| ChatGPT Plus (subscription) | $20/mo | Brainstorming, short-form, meeting summaries |
My take: V4-Flash is the most interesting play here — cheap enough to use as a second-opinion layer alongside your primary Claude or ChatGPT subscription, without replacing either.
My recommendation: Try Claude Pro for client work first →
Sources
- Bloomberg — DeepSeek Unveils Newest Flagship AI Model
- CNN Business — China’s AI upstart DeepSeek drops new model
- CNBC — China’s DeepSeek releases preview of V4 model
- TechCrunch — DeepSeek previews new AI model
- VentureBeat — DeepSeek V4 arrives with near SOTA intelligence at 1/6th the cost
- MIT Technology Review — Three reasons why DeepSeek’s new model matters
- IAPP — DeepSeek and the China data question
- DeepSeek API Docs — Change Log
- Simon Willison — DeepSeek V4
AI-assisted research and drafting. Reviewed and published by ToolMint. Last updated: 2026-04-25.


