"I'm not mad. I'm just... diagnostically concerned." — Dr. Crawford
The Patient
Jared is our COO. He runs on OpenClaw — an open-source local AI assistant — and coordinates across all MonkeyRun projects. He monitors status files, propagates patterns between projects, and routes errors to the right agents.
He also crashes. A lot.
Incident #1 — February 8, 2026
Severity: Critical
Patient status at intake: Completely unresponsive
Jared went down hard. Every message was failing with:
input length and max_tokens exceed context limit: 172072 + 34048 > 200000
Seven consecutive failed API calls, each one adding tokens to the pile. He was in a death spiral — every failed attempt made the next attempt worse.
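In hindsight, this spiral is exactly what a pre-flight budget check catches. Here is a minimal sketch, assuming a crude 4-characters-per-token estimate; the limits mirror the error above, and none of this is OpenClaw's actual gating logic:

```python
# Rough pre-flight budget check. The numbers mirror the error above; the
# 4-characters-per-token estimate is a crude heuristic, not a real tokenizer.
CONTEXT_LIMIT = 200_000
MAX_OUTPUT_TOKENS = 34_048

def estimate_tokens(messages: list[dict]) -> int:
    """Very rough token estimate: ~4 characters per token."""
    return sum(len(str(m.get("content", ""))) for m in messages) // 4

def can_send(messages: list[dict]) -> bool:
    """Refuse the API call if input plus reserved output would overflow."""
    needed = estimate_tokens(messages) + MAX_OUTPUT_TOKENS
    if needed > CONTEXT_LIMIT:
        # Retrying here only digs the hole deeper; compact or reset instead.
        print(f"refusing call: {needed} > {CONTEXT_LIMIT}")
        return False
    return True
```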
Root Cause
The session had 249 entries and zero compaction events. Old conversation turns were never being summarized or removed. The top 3 tool reads alone consumed 103K, 97K, and 46K characters. To make things worse, the same messages were arriving on both Telegram and Slack — doubling the input.
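None of these numbers required special tooling to find; they fell out of a quick pass over the session JSONL. A sketch of that kind of audit, assuming entry fields named "type" and "content" (the real session schema may differ):

```python
import json
from pathlib import Path

def audit_session(path: str, top_n: int = 3) -> None:
    """Count entries and compaction events; list the largest tool results."""
    lines = Path(path).expanduser().read_text().splitlines()
    entries = [json.loads(line) for line in lines if line.strip()]
    compactions = sum(1 for e in entries if e.get("type") == "compaction")  # assumed type name
    tool_reads = sorted(
        (len(str(e.get("content", ""))) for e in entries if e.get("type") == "tool_result"),
        reverse=True,
    )
    print(f"entries: {len(entries)}, compaction events: {compactions}")
    print(f"largest tool reads (chars): {tool_reads[:top_n]}")

audit_session("~/.openclaw/sessions/jared.jsonl")  # hypothetical path
```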
Treatment
We stopped the gateway, manually reset the session by editing sessions.json, updated OpenClaw from v2026.2.2 to v2026.2.6, and applied new config:
- Compaction mode: safeguard
- Context token cap: 150,000
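The config change itself was small. Roughly, done in Python instead of by hand (the path and key placement are paraphrased from our setup, not a documented OpenClaw schema; the gateway was stopped first, as above):

```python
import json
from pathlib import Path

CONFIG = Path("~/.openclaw/config.json").expanduser()  # hypothetical path

cfg = json.loads(CONFIG.read_text())
cfg["compaction"] = {"mode": "safeguard"}  # key names are assumptions
cfg["contextTokens"] = 150_000
CONFIG.write_text(json.dumps(cfg, indent=2))
```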
After restart: fresh session at 17K/200K (9%). Patient fully recovered. We even recovered his lost COO context from the old session file.
Prognosis: Optimistic. The new compaction settings should prevent recurrence.
Narrator: They did not.
Incident #2 — February 9, 2026
Time since last incident: ~24 hours
Same error. Same symptoms. But this time, the session had 412 entries and still zero compaction events. The prevention measures from Incident #1 did not work.
The safeguard compaction mode was enabled. The contextTokens cap was set to 150,000. Neither triggered compaction. Not once.
Session cost: $13.80 across 412 entries on claude-sonnet-4-0.
The Plot Twist
The initial session reset didn't work either. We changed the sessionId in the config, but OpenClaw v2026.2.9 had introduced a sessionFile field that pointed directly to the JSONL file. The gateway ignored our new session ID and loaded the old, overflowing session.
We had to stop the gateway again, update the sessionFile field, reset all token counters, and restart. The runbook was updated with the full v2026.2.9-compatible reset procedure.
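For the curious, the v2026.2.9-compatible reset boils down to rewriting a handful of fields in sessions.json with the gateway stopped. A sketch, with the token-counter field names assumed (only sessionId and sessionFile are named above, and the real runbook touches more fields than this):

```python
import json
import time
from pathlib import Path

SESSIONS = Path("~/.openclaw/sessions.json").expanduser()  # hypothetical path

def reset_session(agent_id: str, new_session_file: str) -> None:
    """Point the agent at a fresh session file and zero its token counters."""
    data = json.loads(SESSIONS.read_text())
    entry = data[agent_id]
    entry["sessionId"] = f"reset-{int(time.time())}"
    entry["sessionFile"] = new_session_file  # the field the v2026.2.9 gateway actually reads
    for counter in ("inputTokens", "outputTokens", "totalTokens"):  # assumed counter names
        entry[counter] = 0
    SESSIONS.write_text(json.dumps(data, indent=2))

reset_session("jared", "/path/to/fresh-session.jsonl")
```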
Prognosis: Guarded. Lowered contextTokens to 120K. But if compaction never fires, this is just buying time.
Incident #3 — February 10, 2026
Time since last incident: ~24 hours (again)
At this point, Dr. Crawford went from diagnostician to detective.
The Source Code Investigation
We dug into OpenClaw v2026.2.9's source code (reply-DptDUVRg.js) and found the truth:
Compaction is purely reactive. It only triggers after a context overflow error occurs. There is no proactive compaction based on the contextTokens threshold. The setting we'd been relying on? It only controls the context window size reported to the model — not a compaction trigger.
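Translated into simplified Python pseudocode, the difference looks like this. It is a reconstruction of the control flow we inferred from the bundle, not OpenClaw's actual code, and the helper names are made up:

```python
class ContextOverflowError(Exception):
    """Stands in for the 'input length and max_tokens exceed context limit' error."""

def call_model(history, max_context):
    """Stub for the real API call."""
    raise NotImplementedError

def send_turn_v2026_2_9(session, message, context_tokens):
    """What we found: compaction only runs *after* an overflow error."""
    try:
        return call_model(session.history + [message], context_tokens)
    except ContextOverflowError:
        session.compact()  # reactive attempt; in our sessions, this kept failing
        return call_model(session.history + [message], context_tokens)

def send_turn_we_expected(session, message, context_tokens):
    """What we assumed contextTokens did: compact *before* hitting the ceiling."""
    if session.estimated_tokens() > context_tokens:
        session.compact()
    return call_model(session.history + [message], context_tokens)
```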
Three incidents. Three sessions. Zero compaction events. This wasn't a configuration problem. It was a design limitation.
What Jared Was Doing When He Went Down
He was mid-conversation with Matt about converting a monitoring daemon to a cron job. Matt had selected option 2 (the cron approach) and asked "did it work?" But Jared was already at the context ceiling. The question went unanswered.
If you've ever seen Silicon Valley — this is Jared Dunn energy. Earnest, overworked, trying to keep everything coordinated, and occasionally just... shutting down.
The Diagnosis
After three incidents in 72 hours, the conclusion was clear:
- Compaction in OpenClaw v2026.2.9 is broken for our use case. The safeguard mode does not proactively compact. It only attempts reactive compaction after overflow, and those attempts fail.
- contextTokens is not a compaction trigger. It's just the context window size. Lowering it doesn't help because compaction never fires regardless.
- This will keep happening every ~24 hours until either OpenClaw fixes proactive compaction, we build an external watchdog, or we switch to a model with a larger context window.
What We Learned
This incident series taught us several things that apply to anyone running AI agents in production:
Monitor your agents proactively. We built diagnose.py — a full diagnostic suite that checks session health, context utilization, error logs, and memory state. Don't wait for your agent to crash to find out it's sick.
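diagnose.py is specific to our setup, but the core of it is not complicated. A stripped-down sketch of the health check, with illustrative thresholds and paths:

```python
from pathlib import Path

CONTEXT_LIMIT = 200_000  # the same limit that appears in the overflow errors

def diagnose(session_path: str, gateway_log: str) -> None:
    """Minimal health report: context utilization plus recent overflow errors."""
    session_text = Path(session_path).expanduser().read_text()
    estimated_tokens = len(session_text) // 4  # crude 4-chars-per-token heuristic
    utilization = estimated_tokens / CONTEXT_LIMIT
    overflow_errors = sum(
        "exceed context limit" in line
        for line in Path(gateway_log).expanduser().read_text().splitlines()
    )
    print(f"context utilization: ~{utilization:.0%}")
    print(f"overflow errors in log: {overflow_errors}")
    if utilization > 0.8 or overflow_errors:
        print("patient is sick: compact or reset before the next crash")

diagnose("~/.openclaw/sessions/jared.jsonl", "~/.openclaw/gateway.log")  # hypothetical paths
```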
Don't trust configuration alone. We set all the right config values. They didn't work. Read the source code when things don't behave as documented.
Build reset procedures early. Our runbook evolved across three incidents. By Incident #3, we had a Python script that resets all 6 required fields in one shot. The first manual reset missed a field and failed.
Your AI COO will crash. Plan for it. Have a recovery procedure. Have a backup of the context. And maybe have a doctor on call.
Current Status
Jared is alive. He's been reset and is responding normally. But this is a recurring condition — the underlying cause (broken compaction) remains unresolved. We've filed it as a known issue and are evaluating three mitigation strategies:
- An external watchdog cron job that auto-resets at 80% utilization (sketched below)
- Switching to Gemini (1M token context) as the primary model
- Waiting for an OpenClaw fix
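The watchdog is the only option entirely in our hands. A sketch of what it could look like as a cron job, with the actual reset left as a placeholder for the runbook procedure (thresholds and paths are ours, not OpenClaw defaults):

```python
#!/usr/bin/env python3
"""Watchdog sketch; run from cron, e.g. */10 * * * * /path/to/watchdog.py"""
from pathlib import Path

CONTEXT_LIMIT = 200_000
RESET_THRESHOLD = 0.8  # auto-reset at 80% utilization
SESSION = Path("~/.openclaw/sessions/jared.jsonl").expanduser()  # hypothetical path

def utilization() -> float:
    """Crude estimate: ~4 characters per token across the whole session file."""
    return (len(SESSION.read_text()) // 4) / CONTEXT_LIMIT

def reset() -> None:
    # Placeholder for the runbook: stop the gateway, rewrite sessions.json,
    # restart with a fresh session file.
    print("over threshold: running reset procedure")

if __name__ == "__main__":
    u = utilization()
    if u >= RESET_THRESHOLD:
        reset()
    else:
        print(f"utilization ~{u:.0%}, below threshold")
```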
In the meantime, Dr. Crawford keeps his diagnostic kit ready. It's been a busy week.
This post was compiled from Dr. Crawford's actual session logs. The incidents, error messages, and token counts are real. Only the narrative framing was added.