<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>Swapnil Surdi — Blog</title><description>Engineering writing by Swapnil Surdi — production AI systems, performance war stories, and backend infrastructure.</description><link>https://surdi.in/</link><item><title>Breaking the 25,000-token wall</title><link>https://surdi.in/blog/breaking-the-25k-token-wall/</link><guid isPermaLink="true">https://surdi.in/blog/breaking-the-25k-token-wall/</guid><description>MCP is everywhere now — and so is its oldest constraint. How a transparent caching proxy gets any MCP server past the 25,000-token response limit.</description><pubDate>Wed, 10 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Make the model the exception, not the loop</title><link>https://surdi.in/blog/model-exception-not-loop/</link><guid isPermaLink="true">https://surdi.in/blog/model-exception-not-loop/</guid><description>How a 24/7 AI agent fleet stays affordable on one subscription: deterministic code handles every tick, and the model only runs on real signals.</description><pubDate>Wed, 10 Jun 2026 00:00:00 GMT</pubDate></item><item><title>The 160× index: a 4.18-second dashboard and the COUNT(*) that ate it</title><link>https://surdi.in/blog/the-160x-index/</link><guid isPermaLink="true">https://surdi.in/blog/the-160x-index/</guid><description>My fleet dashboard quietly degraded to 4.18s. The cause: one COUNT(*) full-scanning 258k rows on every load. One index later: ~18ms, flat forever.</description><pubDate>Wed, 10 Jun 2026 00:00:00 GMT</pubDate></item></channel></rss>