Claude's crazy token burn has NOT been fixed, so stop apologising for Anthropic
25 April 2026 | Blog, Tech notes
Around the end of March there were widespread reports of a sudden jump in token consumption by Claude Code, mainly with Opus. People started burning through their usage limits in minutes, when previously they had hours.
This wasn't a problem for me, but I heeded the 'mitigation' advice and removed all plugins, skills, agents, and MCPs to minimise context injection. I also audited my configuration using the Context Audit skill you can download from Brad | AI Automation.
Around mid-April Anthropic claimed to have fixed it. Well, no. They haven't. I started experiencing the problem as soon as my usage reset and I had access to Opus 4.7, even though I reduced the effort to 'medium' from the default 'xHigh'.
It's terrible! Previously I could carefully steward my session limit through two or three hours of code work with Opus. Today? About 30 minutes and with a far smaller volume of work achieved.
But I'm mostly annoyed with all the apologists making excuses and lecturing users about how to avoid context bloat, as if it's our fault! Just stop! Anthropic needs to fix this. Stop trying to let them off the hook!
A partial fix
I read that you can edit your ~/.claude/settings.json to add:
{ "model": "claude-opus-4-6"}
This gives you back Opus 4.6 as an option in your /models menu, with its more efficient tokeniser and smaller 200K context window. Apparently 4.7's tokeniser is less efficient (1.3x token use), and 4.7 has a 1 million token context window (even if you select the 'normal' Opus 4.7 model). Part of the problem probably comes from that. Massive contexts are massively expensive and better avoided.
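If you already have other settings in that file, pasting the snippet over the top would wipe them. Here is a minimal sketch of merging the key in instead. The file path and the "model" value come from the text above; the helper name set_model is my own invention, and I am assuming the file holds a single JSON object.

```python
import json
from pathlib import Path


def set_model(settings_path: Path, model: str) -> dict:
    """Add or update the "model" key in a JSON settings file, keeping other keys.

    Hypothetical helper: assumes the file, if present, contains one JSON object.
    """
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text(encoding="utf-8")
        if text.strip():  # tolerate an empty file
            settings = json.loads(text)

    settings["model"] = model

    # Create ~/.claude (or whatever parent directory) if it is missing.
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings


# Example call (commented out so nothing is written by accident):
# set_model(Path.home() / ".claude" / "settings.json", "claude-opus-4-6")
```

Back up the file before running anything like this against your real configuration.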
I have tested the manual reversion to Opus 4.6 and it is better than 4.7, but it is still worse than before. I got about an hour of use as opposed to 30 minutes on 4.7, but the result was still disappointing. I noticed that 4.6 will still call 4.7 when it engages the "Advisor" function, so there is probably still some extra burn going on.
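To put rough numbers on this, here is a back-of-envelope sketch. It assumes a fixed token budget per session and the same pace of work before and after, which is generous, since I actually got less done. The function names are mine, and the inputs are just the figures quoted in this post (two to three hours before, the reported 1.3x tokeniser overhead, 30 minutes on 4.7, an hour on 4.6).

```python
def minutes_after_multiplier(baseline_minutes: float, token_multiplier: float) -> float:
    """Session length if every unit of work costs token_multiplier times the tokens."""
    return baseline_minutes / token_multiplier


def implied_multiplier(baseline_minutes: float, observed_minutes: float) -> float:
    """Overall token multiplier implied by a session shrinking to observed_minutes."""
    return baseline_minutes / observed_minutes


for baseline in (120, 180):  # "two or three hours" before the change
    tokeniser_only = minutes_after_multiplier(baseline, 1.3)
    on_47 = implied_multiplier(baseline, 30)
    on_46 = implied_multiplier(baseline, 60)
    print(
        f"baseline {baseline} min: tokeniser alone -> {tokeniser_only:.1f} min, "
        f"implied burn on 4.7 = {on_47:.1f}x, on 4.6 = {on_46:.1f}x"
    )
```

Under those assumptions the 1.3x tokeniser alone would still leave roughly 92 to 138 minutes per session. Dropping to 30 minutes implies something like 4x to 6x overall, so the tokeniser can only be part of the story.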
To add insult to injury, I'll be damned if Opus hasn't suddenly gotten more stupid. Previously it was quite amazingly good at following instructions and extrapolating what you asked for to add detail. But now instructions to do something in a certain way often get ignored. That means follow-ups within the same context window, which means more tokens burned.
I'm right at the end of a significant project. It is so frustrating to have progress reduced to a crawl when I'm just trying to add the final polish.
Something to think about
Why is Anthropic pushing a 1 million token context window on everyone using Opus, when most people don't need it and Anthropic claims to have a dire compute shortage? It is certainly the case that they have a compute bottleneck during peak hours. But a cynic might think they are trying to ensure the off-peak hours are saturated as well. At the end of the day, they are billing users for the tokens, one way or another.
Anthropic, whatever you did in April, frankly I'd be inclined to bin it and start over. I've gone from collaborating with Claude to fighting it.
That is a significant step backwards.
Copyright, all rights reserved.