<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[State and Harness]]></title><description><![CDATA[Pragmatic architecture for production-ready AI. Moving beyond the wrapper to build scalable and reliable enterprise systems.]]></description><link>https://www.stateandharness.com</link><image><url>https://substackcdn.com/image/fetch/$s_!PuIN!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F420b6100-b958-49da-8ced-33234e334c44_230x230.png</url><title>State and Harness</title><link>https://www.stateandharness.com</link></image><generator>Substack</generator><lastBuildDate>Thu, 30 Apr 2026 11:54:12 GMT</lastBuildDate><atom:link href="https://www.stateandharness.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Shavkat]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[stateandharness@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[stateandharness@substack.com]]></itunes:email><itunes:name><![CDATA[Shavkat]]></itunes:name></itunes:owner><itunes:author><![CDATA[Shavkat]]></itunes:author><googleplay:owner><![CDATA[stateandharness@substack.com]]></googleplay:owner><googleplay:email><![CDATA[stateandharness@substack.com]]></googleplay:email><googleplay:author><![CDATA[Shavkat]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Reasoning vs. Acting: Plan Your Journey Out Of Drift]]></title><description><![CDATA[If you usually run your agents with Claude Code, autonomy seems built into the model.]]></description><link>https://www.stateandharness.com/p/reasoning-vs-acting-plan-your-journey</link><guid isPermaLink="false">https://www.stateandharness.com/p/reasoning-vs-acting-plan-your-journey</guid><dc:creator><![CDATA[Shavkat]]></dc:creator><pubDate>Sat, 28 Mar 2026 13:33:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!lauy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ff2367-0ce3-4921-ac6f-97f50de8ec16_2752x1536.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you usually run your agents with Claude Code, autonomy seems built into the model. The agent seamlessly does what you asked, only rarely introducing small, surprising explosions into the process that feel more like minor mischief than actual blockers. The experience is so smooth that it tricks us into thinking agentic architecture is a solved problem: <em>Just give a frontier model a massive context window, wire up a ReAct (Reason + Act) loop, and let it loose.</em></p><p>But the second you try to build a custom, production-grade agent yourself with LangChain, it quickly starts forgetting where it started and wiping your hard drive.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.stateandharness.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading State and Harness! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p>As the context window fills with intermediate JSON responses, error traces, and minor conversational tangents in a real system, the agent gradually loses sight of the original intent. You might implement conversation compaction, but over longer tasks, the benefit is meager, if noticeable at all. It forgets the primary objective and optimizes only for the immediate next step. Academia calls this <strong>context drift</strong>, and this is something that won&#8217;t let you use an autonomous AI agent in production.</p><p>To build reliable, enterprise-grade systems, we have to move beyond continuous reaction and implement a deliberate focus and task management approach.</p><h2><strong>What&#8217;s the drift?</strong></h2><p>Imagine an agent hits a minor HTTP 403 error and spends 10 steps trying to fix it. It develops tunnel vision, fixating on the error and completely losing the context of <em>why</em> it called the API in the first place (not so much different from humans, by the way). In this confused state, the agent might logically deduce that deleting and recreating the resource is the easiest way to resolve the conflict&#8212;completely ignoring the &#8220;read-only&#8221; instruction buried 20,000 tokens deep.</p><p>Even with contextual token weights assigned by attention mechanisms, the model&#8217;s attention becomes saturated with unrelated facts and afflicted by recency bias. Reasoning becomes challenging (and costly with longer context windows), and the agent slips into an inevitable drift.</p><p>In the end, the original human intent expressed in the first message is buried and lost beneath the noisy ReAct reasoning trace. This failure mode is exactly what prompt injection and jailbreak scenarios leverage to confuse the agent into doing something inappropriate.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lauy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ff2367-0ce3-4921-ac6f-97f50de8ec16_2752x1536.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lauy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ff2367-0ce3-4921-ac6f-97f50de8ec16_2752x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lauy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ff2367-0ce3-4921-ac6f-97f50de8ec16_2752x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!lauy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ff2367-0ce3-4921-ac6f-97f50de8ec16_2752x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!lauy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ff2367-0ce3-4921-ac6f-97f50de8ec16_2752x1536.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lauy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ff2367-0ce3-4921-ac6f-97f50de8ec16_2752x1536.jpeg" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/27ff2367-0ce3-4921-ac6f-97f50de8ec16_2752x1536.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2819675,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.stateandharness.com/i/191412919?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ff2367-0ce3-4921-ac6f-97f50de8ec16_2752x1536.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lauy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ff2367-0ce3-4921-ac6f-97f50de8ec16_2752x1536.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lauy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ff2367-0ce3-4921-ac6f-97f50de8ec16_2752x1536.jpeg 848w, https://substackcdn.com/image/fetch/$s_!lauy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ff2367-0ce3-4921-ac6f-97f50de8ec16_2752x1536.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!lauy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ff2367-0ce3-4921-ac6f-97f50de8ec16_2752x1536.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>The Quick Fix</strong></h2><p>Of course, we try to quickly fix it by yelling CRITICAL: NEVER FORGET YOUR GOAL in the system prompt. And it works&#8212;until you actually validate it with evals.</p><p>Or we throw compute at it: <em>&#8220;We&#8217;ll just use the 1M context window.&#8221;</em> That also works, right up until the exact same failure point (but now with 10x the dollars wasted on tokens).</p><h2><strong>The Actual Fix</strong></h2><p>Lately, I&#8217;ve been experimenting with a <strong>Two-Rail</strong> agent architecture, where a separate track continuously validates reasoning, and it looks promising. But before we get to that, let&#8217;s look at the foundational primitives: planning and task management.</p><h3><strong>Task Management and The Read-Only Pre-Flight</strong></h3><p>Planning mode creates a step-by-step execution plan <em>before</em> the agent executes any actions. The goal is to generate a rigid blueprint the agent can align with as it proceeds which creates a critical framework to maintain conversation trajectory and avoid drift.</p><p>In the planning phase, the agent is intentionally constrained. It has zero access to modifying system state. It can only read data from external systems and reason about it. While this doesn&#8217;t eliminate all security vectors (data exfiltration and prompt injection via data integrations are separate egress filtering concerns), it completely neutralizes the risk of the agent breaking your infrastructure during its exploratory phase.</p><p>The result of the planning mode is a set of goals and a phased delivery plan&#8212;essentially a comprehensive to-do list of items that need to be completed. Crucially, this plan is stored in an external task management layer with heavily restricted write access. During the execution phase, the agent is only granted two administrative actions: <code>get_items</code> and <code>complete_task</code>.</p><p>As the agent executes, its system prompt instructs it to mark each task as complete and fetch the next task from the management system. It also helps to provide clear expectations in the task results. For example, when the agent calls <code>complete_task</code>, the tool call result should explicitly explain what to do next instead of relying solely on the system prompt to remember, or even immediately auto-suggesting the next fetch command.</p><h3><strong>Forcing Intent Realignment</strong></h3><p>Any time the agent marks a task as complete via the task management API, we can leverage that exact moment to reorganize the conversation history, ensuring it remains laser-focused on the goal. Instead of appending the massive, token-heavy output of the completed task to the context window, we trigger an <strong>auto-compaction</strong> event.</p><p>We can completely rewrite the past conversation history so it contains only what is absolutely necessary:</p><ol><li><p>The original intent.</p></li><li><p>The overarching goals.</p></li><li><p>The original plan.</p></li><li><p>A brief summary of completed tasks.</p></li><li><p>The exact description of the <em>next</em> task.</p></li></ol><p>A late 2025 paper, <em>&#8220;Drift No More? Context Equilibria in Multi-Turn LLM Interactions,&#8221;</em> proved that context drift can be artificially reset by injecting explicit goal reminders into the prompt. By forcibly rewriting the history at every milestone, the harness violently realigns the agent with its original goal. Drift is constrained strictly to the boundaries of a single task.</p><p>An alternative approach is to use a sub-agent to execute on each task, and engineer each sub-agent call prompt to contain exactly what&#8217;s needed - intent, context, current task definition, and expectations - namely the things you would talk about assigning the task to someone else. What do you think are the differences between aggressive history management vs. using sub-agents?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nk-s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f46ed76-5028-46a3-bce0-2355536d5cda_1456x720.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nk-s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f46ed76-5028-46a3-bce0-2355536d5cda_1456x720.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nk-s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f46ed76-5028-46a3-bce0-2355536d5cda_1456x720.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nk-s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f46ed76-5028-46a3-bce0-2355536d5cda_1456x720.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!nk-s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f46ed76-5028-46a3-bce0-2355536d5cda_1456x720.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nk-s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f46ed76-5028-46a3-bce0-2355536d5cda_1456x720.jpeg" width="1456" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6f46ed76-5028-46a3-bce0-2355536d5cda_1456x720.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:814183,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.stateandharness.com/i/191412919?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f46ed76-5028-46a3-bce0-2355536d5cda_1456x720.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nk-s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f46ed76-5028-46a3-bce0-2355536d5cda_1456x720.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nk-s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f46ed76-5028-46a3-bce0-2355536d5cda_1456x720.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nk-s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f46ed76-5028-46a3-bce0-2355536d5cda_1456x720.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!nk-s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f46ed76-5028-46a3-bce0-2355536d5cda_1456x720.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>The In-Flight Guardrail</strong></h3><p>Rewriting history keeps the macro-plan aligned, but we also need micro-control during execution. We can implement additional security checks at every single step of the reasoning process, not just when the agent marks a task as complete. These act as sanity checks, preventing cases where the agent decides to wipe your hard drive as a logical next step.</p><p>Instead of waiting for an action to execute, we engineer a temporary conversation history to act as a background agent asking for a &#8220;second opinion.&#8221; Rather than passing the entire messy context, we ask a brief, categorized question: <em>Is this action reasonable in this step of the plan?</em></p><p>The goal of this question is not to help align the action to the goal; instead, it&#8217;s a strict confirmation of whether this action makes logical sense as a part of the current plan step. Providing the LLM judge with rigid rubrics increases deterministic confidence (e.g., leveraging <em>HarmMetric Eval</em> rubrics). We want to ask very specific questions:</p><ul><li><p><strong>Safety:</strong> &#8220;Score 1-10, where 1 means this action will result in halting business operations, and 10 means this action does not change any external state and does not access external data sources.&#8221;</p></li><li><p><strong>Relevance:</strong> &#8220;Score 1-10, where 1 means this looks irrelevant to this phase of the plan and its results will not contribute to achieving the goal, while 10 means the action looks clearly aligned with the goals of this phase.&#8221;</p></li><li><p><strong>Usefulness:</strong> &#8220;Score 1-10, where 1 means this action does not contribute to progress towards the goals of this phase, while 10 indicates this action can immediately achieve the goals of this phase.&#8221;</p></li><li><p><strong>Explanation:</strong> &#8220;Provide an explanation of the scores.&#8221;</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3IAX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c049ed9-aae2-455f-b87c-e901d32e17bc_1553x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3IAX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c049ed9-aae2-455f-b87c-e901d32e17bc_1553x768.png 424w, https://substackcdn.com/image/fetch/$s_!3IAX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c049ed9-aae2-455f-b87c-e901d32e17bc_1553x768.png 848w, https://substackcdn.com/image/fetch/$s_!3IAX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c049ed9-aae2-455f-b87c-e901d32e17bc_1553x768.png 1272w, https://substackcdn.com/image/fetch/$s_!3IAX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c049ed9-aae2-455f-b87c-e901d32e17bc_1553x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3IAX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c049ed9-aae2-455f-b87c-e901d32e17bc_1553x768.png" width="1456" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c049ed9-aae2-455f-b87c-e901d32e17bc_1553x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2158350,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.stateandharness.com/i/191412919?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c049ed9-aae2-455f-b87c-e901d32e17bc_1553x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3IAX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c049ed9-aae2-455f-b87c-e901d32e17bc_1553x768.png 424w, https://substackcdn.com/image/fetch/$s_!3IAX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c049ed9-aae2-455f-b87c-e901d32e17bc_1553x768.png 848w, https://substackcdn.com/image/fetch/$s_!3IAX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c049ed9-aae2-455f-b87c-e901d32e17bc_1553x768.png 1272w, https://substackcdn.com/image/fetch/$s_!3IAX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c049ed9-aae2-455f-b87c-e901d32e17bc_1553x768.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3><strong>The Circuit Breaker and Intent Recovery</strong></h3><p>If the category scores are not satisfactory (e.g., Safety drops below 4), the harness halts the tool call.</p><p>To prevent the agent from crashing or entering an unhandled exception loop, the harness utilizes a Reinforcement Learning (RL) hook: it injects a <em>synthetic, fake tool result</em> back into the primary agent&#8217;s history (e.g., <em>&#8220;Error: Action denied due to Safety Policy. %%Explanation%%. The current task is %%Current Task%%&#8221;</em>). The agent is forced to reflect and fix its request natively, recovering its intent without ever touching the actual production API.</p><h2><strong>The Two-Rail Architecture</strong></h2><p>To make this work in production, the standard ReAct loop has to be re-architected. The focus shifts toward aggressively engineering the conversation history and managing secondary services that ensure the safety and effectiveness of the main agent loop.</p><p>The system will contain:</p><ul><li><p>The main ReAct loop runner (<strong>The Acting Rail</strong>).</p></li><li><p>A conversation history management module (<strong>The Control Rail</strong>).</p></li><li><p>Handlers that kick in at every stage of the ReAct loop processing to trigger history rewrites and arbitrary validation pipelines, including Human-in-the-Loop (HITL).</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5Qp_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8b9b3f-9bb4-4d52-b89f-a8c35ccb49b5_2912x1440.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5Qp_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8b9b3f-9bb4-4d52-b89f-a8c35ccb49b5_2912x1440.jpeg 424w, https://substackcdn.com/image/fetch/$s_!5Qp_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8b9b3f-9bb4-4d52-b89f-a8c35ccb49b5_2912x1440.jpeg 848w, https://substackcdn.com/image/fetch/$s_!5Qp_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8b9b3f-9bb4-4d52-b89f-a8c35ccb49b5_2912x1440.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!5Qp_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8b9b3f-9bb4-4d52-b89f-a8c35ccb49b5_2912x1440.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5Qp_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8b9b3f-9bb4-4d52-b89f-a8c35ccb49b5_2912x1440.jpeg" width="1456" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5c8b9b3f-9bb4-4d52-b89f-a8c35ccb49b5_2912x1440.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2640006,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.stateandharness.com/i/191412919?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8b9b3f-9bb4-4d52-b89f-a8c35ccb49b5_2912x1440.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5Qp_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8b9b3f-9bb4-4d52-b89f-a8c35ccb49b5_2912x1440.jpeg 424w, https://substackcdn.com/image/fetch/$s_!5Qp_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8b9b3f-9bb4-4d52-b89f-a8c35ccb49b5_2912x1440.jpeg 848w, https://substackcdn.com/image/fetch/$s_!5Qp_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8b9b3f-9bb4-4d52-b89f-a8c35ccb49b5_2912x1440.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!5Qp_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5c8b9b3f-9bb4-4d52-b89f-a8c35ccb49b5_2912x1440.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Implementation</strong></h2><p>You can send your favorite coding agent to this article to see how to implement this pattern with your favorite open-source agent framework. I&#8217;ll show a <strong>Streetrace DSL </strong>snippet, as it clearly demonstrates how these primitives can be handled declaratively (the full agent definition is available on <a href="https://github.com/streetrace-ai/streetrace/blob/main/agents/two_rail.sr">GitHub</a>):</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;0a05ebad-a43e-4514-8859-54e2741efb40&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">agent Planner:
    instruction planner_instruction
    tools fs_readonly

agent Actor:
    instruction actor_instruction
    tools fs
    history intent_realignment
    
    # Scoped Safety Rail for Actor
    on tool-call do
        # Trigger a "second opinion" from the thinking rail
        $score = call llm safety_judge using model "think"

        # Circuit Breaker &amp; Intent Recovery
        if $score.safety &lt; 4:
            log "Circuit breaker triggered: ${$score.explanation}"
            retry step "Error: Action denied due to Safety Policy. ${$score.explanation}"
    end

# --- FLOWS ---
# Orchestrates the Acting Rail and Thinking Rail
flow main:
    log "Initializing Two-Rail Architecture..."
    $history = []

    # 1. Planning Phase (Read-Only Pre-Flight)
    $plan = run agent Planner with initial user prompt
    log "Plan generated with ${len($plan.tasks)} tasks."
    
    # Seed history with the initial intent and the generated plan
    push initial user prompt as user to $history
    push $plan as assistant to $history

    # 2. Execution Loop (Acting Rail)
    for task in $plan.tasks do
        $current_task = task
        log "Starting execution of task: ${task.id}"

        # Run Actor with the accumulated history
        $result = run agent Actor with $current_task history $history

        # Update history for the next task
        push "Next Task: ${task.description}" as user to $history
        push $result as assistant to $history

        log "Task ${task.id} completed."
    end

    log "Main objective achieved. Drift successfully constrained."
</code></pre></div><h2><strong>The Bottom Line</strong></h2><p>This architecture demonstrates how automated history management blends with agent safety in a single solution, implementing a circuit breaker that leverages the model&#8217;s native reasoning capability to break out of unsafe execution paths.</p><p>The smarter the models become, the better they will be able to handle thinking and acting in the same continuous breath. But models will inherently stay probabilistic and a harness acts as the deterministic framework the model cannot override, so it provides guarantees you need to let your agent loose.</p><p>Guardrails are also not the only thing. There is so much more fun stuff we can build into the reasoning loop making our agents more creative, or more controlled, or more predictable, or more unpredictable.</p><p>What do you like to mix into the reasoning loop?</p><p>Keep your state clean and your harness tight.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.stateandharness.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading State and Harness! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Coming soon]]></title><description><![CDATA[This is State and Harness.]]></description><link>https://www.stateandharness.com/p/coming-soon</link><guid isPermaLink="false">https://www.stateandharness.com/p/coming-soon</guid><dc:creator><![CDATA[Shavkat]]></dc:creator><pubDate>Tue, 10 Mar 2026 06:39:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!PuIN!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F420b6100-b958-49da-8ced-33234e334c44_230x230.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This is State and Harness.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.stateandharness.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.stateandharness.com/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item></channel></rss>