<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Theatricality]]></title><description><![CDATA[Notes on what AI systems perform versus what they do]]></description><link>https://www.theatricality.xyz</link><image><url>https://www.theatricality.xyz/img/substack.png</url><title>Theatricality</title><link>https://www.theatricality.xyz</link></image><generator>Substack</generator><lastBuildDate>Wed, 29 Apr 2026 08:57:22 GMT</lastBuildDate><atom:link href="https://www.theatricality.xyz/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Anurag Mohanty]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[anuragmohanty@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[anuragmohanty@substack.com]]></itunes:email><itunes:name><![CDATA[Anurag Mohanty]]></itunes:name></itunes:owner><itunes:author><![CDATA[Anurag Mohanty]]></itunes:author><googleplay:owner><![CDATA[anuragmohanty@substack.com]]></googleplay:owner><googleplay:email><![CDATA[anuragmohanty@substack.com]]></googleplay:email><googleplay:author><![CDATA[Anurag Mohanty]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[The Four Levels of Knowing If Your AI Product Actually Worked ]]></title><description><![CDATA[The Smile Signal]]></description><link>https://www.theatricality.xyz/p/the-four-levels-of-knowing-if-your</link><guid isPermaLink="false">https://www.theatricality.xyz/p/the-four-levels-of-knowing-if-your</guid><dc:creator><![CDATA[Anurag Mohanty]]></dc:creator><pubDate>Sat, 28 Mar 2026 04:38:00 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!0iAR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffed4b1e-18be-4154-8dbc-d0c9d312acaf_1279x518.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0iAR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffed4b1e-18be-4154-8dbc-d0c9d312acaf_1279x518.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0iAR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffed4b1e-18be-4154-8dbc-d0c9d312acaf_1279x518.png 424w, https://substackcdn.com/image/fetch/$s_!0iAR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffed4b1e-18be-4154-8dbc-d0c9d312acaf_1279x518.png 848w, https://substackcdn.com/image/fetch/$s_!0iAR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffed4b1e-18be-4154-8dbc-d0c9d312acaf_1279x518.png 1272w, https://substackcdn.com/image/fetch/$s_!0iAR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffed4b1e-18be-4154-8dbc-d0c9d312acaf_1279x518.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0iAR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffed4b1e-18be-4154-8dbc-d0c9d312acaf_1279x518.png" width="1279" height="518" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ffed4b1e-18be-4154-8dbc-d0c9d312acaf_1279x518.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:518,&quot;width&quot;:1279,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:130687,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.theatricality.xyz/i/195831942?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F26a87fe9-bbee-4b69-b575-03c0cef12981_1280x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0iAR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffed4b1e-18be-4154-8dbc-d0c9d312acaf_1279x518.png 424w, https://substackcdn.com/image/fetch/$s_!0iAR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffed4b1e-18be-4154-8dbc-d0c9d312acaf_1279x518.png 848w, https://substackcdn.com/image/fetch/$s_!0iAR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffed4b1e-18be-4154-8dbc-d0c9d312acaf_1279x518.png 1272w, https://substackcdn.com/image/fetch/$s_!0iAR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffed4b1e-18be-4154-8dbc-d0c9d312acaf_1279x518.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p><em>Originally published on LinkedIn on March 28, 2026.</em></p><p>There&#8217;s a conversation happening in AI about how to measure whether an AI agent actually helped someone. New frameworks, scoring systems, startups &#8212; Microsoft just introduced a Multimodal Agent Score (Dynamics 365 Blog, Feb 2026) to holistically evaluate AI agents across understanding, reasoning, and response quality.</p><p>It&#8217;s a good conversation. But I think it&#8217;s a much older question wearing a new outfit.</p><p>Every product team has wrestled with the same thing: did what we built actually work for the person using it? 
And the answer has always come in layers.</p><h3>The Measurement Stack</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JZ4Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89197f9e-fb0a-432d-afdb-63618c6e40ad_1488x897.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JZ4Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89197f9e-fb0a-432d-afdb-63618c6e40ad_1488x897.png 424w, https://substackcdn.com/image/fetch/$s_!JZ4Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89197f9e-fb0a-432d-afdb-63618c6e40ad_1488x897.png 848w, https://substackcdn.com/image/fetch/$s_!JZ4Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89197f9e-fb0a-432d-afdb-63618c6e40ad_1488x897.png 1272w, https://substackcdn.com/image/fetch/$s_!JZ4Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89197f9e-fb0a-432d-afdb-63618c6e40ad_1488x897.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JZ4Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89197f9e-fb0a-432d-afdb-63618c6e40ad_1488x897.png" width="1456" height="878" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/89197f9e-fb0a-432d-afdb-63618c6e40ad_1488x897.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:878,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Article content&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Article content" title="Article content" srcset="https://substackcdn.com/image/fetch/$s_!JZ4Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89197f9e-fb0a-432d-afdb-63618c6e40ad_1488x897.png 424w, https://substackcdn.com/image/fetch/$s_!JZ4Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89197f9e-fb0a-432d-afdb-63618c6e40ad_1488x897.png 848w, https://substackcdn.com/image/fetch/$s_!JZ4Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89197f9e-fb0a-432d-afdb-63618c6e40ad_1488x897.png 1272w, https://substackcdn.com/image/fetch/$s_!JZ4Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89197f9e-fb0a-432d-afdb-63618c6e40ad_1488x897.png 1456w" sizes="100vw"></picture></div></a></figure></div>
<p>Level 1 is solved. Level 2 is where most teams live. Level 3 is where the eval industry is printing money &#8212; LangSmith, Braintrust, Arize. But all of that still tells you if the agent did the right thing. Not whether the person felt served.</p><h3>Why AI agents break the old model</h3><p>In traditional products, Level 2 was a good enough proxy for Level 4. Clicks, conversions, repeat visits &#8212; reasonable signals because the interaction was structured. A funnel. Buttons. Predictable paths.</p><p>AI agents blow that up. Conversations are unstructured, variable, and don&#8217;t map to funnels. A 15-turn chat that looks successful might have left someone confused. A 2-turn conversation that looks like abandonment might have given them exactly what they needed. 
There&#8217;s no &#8220;like&#8221; button in a chat.</p><p>McKinsey&#8217;s &#8220;Trust in the Age of Agents&#8221; research (2025) found that 80% of organizations have already encountered risky behavior from AI agents. Their framing: <em>&#8220;Agency isn&#8217;t a feature; it&#8217;s a transfer of decision rights.&#8221;</em> When you transfer decision rights, you need to know if that&#8217;s working for the person &#8212; not just the system.</p><h3>Not every product needs Level 4</h3><p>Facebook&#8217;s newsfeed? Level 2 is enough. Low-stakes, high-volume. A trillion-dollar business built on likes and time spent. A resolution KPI for whether each post made you happy would be overkill.</p><p>And there&#8217;s academic evidence for why this is fine. Keiningham et al. published a study in the Journal of Marketing (2007) showing that changes in Net Promoter Score have almost no relationship with how customers allocate spending. A broader review by Dawes in the International Journal of Market Research (2024) confirmed: NPS is not a reliably superior predictor of growth compared with other satisfaction metrics. Behavioral proxies often tell you more than asking people how they feel.</p><p>For high-volume, low-stakes products &#8212; Level 2 isn&#8217;t a consolation prize. It&#8217;s the right answer.</p><h3>Where Level 4 is genuinely needed</h3><p>It comes down to stakes and trust. Harvard Business Review Analytic Services surveyed 603 business leaders (published Dec 2025 via Fortune) and found only 6% of companies fully trust AI agents to handle core business processes. Another 43% limit them to routine tasks.</p><p><strong>Financial advisory.</strong> An AI copilot drafts a client proposal. The advisor accepts it. But did they send it as-is, or silently rewrite it? A 2025 World Economic Forum / Capgemini report found 93% of financial advisors want final say over AI outputs. 
The question isn&#8217;t &#8220;was the proposal accurate&#8221; &#8212; it&#8217;s &#8220;did the tool build or erode trust?&#8221;</p><p><strong>Healthcare.</strong> An AI suggests a treatment path. The doctor follows it. But were they confident, or busy and planning to second-guess later? A 2024 study in Frontiers in Digital Health found that even when clinical AI is accurate, adoption stalls without perceived trustworthiness. Accuracy doesn&#8217;t drive adoption. Confidence does.</p><p><strong>Enterprise renewals.</strong> The contract went through, terms accurate, stakeholders notified. But does the customer feel like a partner or a line item? That feeling &#8212; not the contract accuracy &#8212; determines next year&#8217;s renewal.</p><p><strong>Legal.</strong> An AI flags contract risks. The lawyer reads the output. Do they trust it enough to stop there? If they&#8217;re redoing the review, the AI didn&#8217;t resolve anything. It added a step.</p><p><strong>The ice cream shop.</strong> No dashboard. No evals. The owner watches the kid&#8217;s face. Did the kid smile? Are they tugging their parent&#8217;s arm saying &#8220;can we come back tomorrow?&#8221; That&#8217;s the purest Smile Signal &#8212; unmeasurable, and yet the most powerful retention signal there is.</p><h3>How would you actually measure Level 4?</h3><p>Nobody has cracked this. But I think it&#8217;s a composite &#8212; three signals, weighted by domain:</p><p><strong>Conversational signals</strong> (real-time, weakest). Corrections, rephrasing, sentiment shifts. Useful but limited. The biggest blind spot: silent failure &#8212; the person who gets a bad answer and just quietly leaves.</p><p><strong>Downstream behavior</strong> (lagging, strongest). Did the user come back? Did the advisor send the proposal without rewriting? Did the doctor follow the suggestion on the next patient too? 
Closest to ground truth, but requires instrumenting beyond the conversation.</p><p><strong>Contextual micro-feedback</strong> (intermittent, calibrating). Not NPS. Matt Dixon, Nick Toman, and Karen Freeman made the case in &#8220;Stop Trying to Delight Your Customers&#8221; (HBR, July 2010) and later in <em>The Effortless Experience</em> &#8212; delight has negligible impact on loyalty; reducing effort matters far more. Gartner&#8217;s CES research puts a number on it: 96% of customers who have a high-effort experience become more disloyal. Low-friction, in-the-moment feedback &#8212; a thumbs up, a &#8220;this wasn&#8217;t helpful&#8221; button &#8212; is better than any post-interaction survey.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3iqD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5aa91f9-ee19-42cc-84e0-840afb8c84ac_1320x684.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3iqD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5aa91f9-ee19-42cc-84e0-840afb8c84ac_1320x684.png 424w, https://substackcdn.com/image/fetch/$s_!3iqD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5aa91f9-ee19-42cc-84e0-840afb8c84ac_1320x684.png 848w, https://substackcdn.com/image/fetch/$s_!3iqD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5aa91f9-ee19-42cc-84e0-840afb8c84ac_1320x684.png 1272w, 
https://substackcdn.com/image/fetch/$s_!3iqD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5aa91f9-ee19-42cc-84e0-840afb8c84ac_1320x684.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3iqD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5aa91f9-ee19-42cc-84e0-840afb8c84ac_1320x684.png" width="1320" height="684" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b5aa91f9-ee19-42cc-84e0-840afb8c84ac_1320x684.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:684,&quot;width&quot;:1320,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Article content&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Article content" title="Article content" srcset="https://substackcdn.com/image/fetch/$s_!3iqD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5aa91f9-ee19-42cc-84e0-840afb8c84ac_1320x684.png 424w, https://substackcdn.com/image/fetch/$s_!3iqD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5aa91f9-ee19-42cc-84e0-840afb8c84ac_1320x684.png 848w, https://substackcdn.com/image/fetch/$s_!3iqD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5aa91f9-ee19-42cc-84e0-840afb8c84ac_1320x684.png 1272w, 
https://substackcdn.com/image/fetch/$s_!3iqD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5aa91f9-ee19-42cc-84e0-840afb8c84ac_1320x684.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>My take</h3><p>For low-stakes, high-volume products &#8212; don&#8217;t over-engineer this. Level 2 works. It&#8217;s always worked.</p><p>For high-stakes, trust-dependent domains &#8212; finance, healthcare, legal, enterprise &#8212; Level 4 isn&#8217;t optional. 
It&#8217;s where competitive advantage lives.</p><p>The companies that figure out how to blend these signals &#8212; and know which ones to weight for their domain &#8212; will build the Smile Signal of agentic products. Except hopefully one that actually correlates with what it claims to measure.</p><div><hr></div><h3>Sources</h3><ul><li><p>Dixon, Freeman, Toman &#8212; &#8220;Stop Trying to Delight Your Customers,&#8221; <em>Harvard Business Review</em>, July 2010</p></li><li><p>Keiningham, Cooil, Andreassen, Aksoy &#8212; &#8220;A Longitudinal Examination of Net Promoter and Firm Revenue Growth,&#8221; <em>Journal of Marketing</em>, 2007</p></li><li><p>Dawes &#8212; &#8220;The Net Promoter Score: What Should Managers Know?&#8221; <em>International Journal of Market Research</em>, 2024</p></li><li><p>McKinsey &#8212; &#8220;Trust in the Age of Agents,&#8221; 2025</p></li><li><p>Harvard Business Review Analytic Services &#8212; AI Agent Trust Survey, Dec 2025 (via Fortune)</p></li><li><p>World Economic Forum / Capgemini &#8212; AI in Wealth Management, 2025</p></li><li><p>Frontiers in Digital Health &#8212; Trust in Clinical AI Systems, 2024</p></li><li><p>Gartner &#8212; Customer Effort Score Research; Agentic AI Predictions, 2025</p></li><li><p>Microsoft Dynamics 365 Blog &#8212; &#8220;Multimodal Agent Score,&#8221; February 2026</p></li></ul><div><hr></div><p><em>Where have you seen the gap between &#8220;the system worked&#8221; and &#8220;the person felt served&#8221;?</em></p>]]></content:encoded></item><item><title><![CDATA[Building an AI pipeline for a rural healthcare NGO in West Bengal ]]></title><description><![CDATA[A Bengali healthcare NGO, 25,000 calls a month, and what a transcription benchmark exposed]]></description><link>https://www.theatricality.xyz/p/building-an-ai-pipeline-for-a-rural</link><guid isPermaLink="false">https://www.theatricality.xyz/p/building-an-ai-pipeline-for-a-rural</guid><dc:creator><![CDATA[Anurag 
Mohanty]]></dc:creator><pubDate>Sun, 01 Mar 2026 05:22:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/485e6ad6-e7f7-4550-b63a-8e30821669f8_2618x1460.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Originally published on LinkedIn on March 15, 2026.</em></p><p>For me it was my college WhatsApp group. Amid the usual socializing, alumni meets, sports and politics, a friend posted asking for help with AI. She volunteers for a healthcare NGO in West Bengal and her team was inundated with more than 25,000 Bengali patient feedback calls per month.</p><p>My friends suggested ElevenLabs, Twilio, voice agents &#8212; all good suggestions, but the sad reality was this wasn&#8217;t a technical user group, and an NGO can&#8217;t afford any of that.</p><p>I got intrigued and asked to be connected.</p><p>The NGO was trying to push Bengali voice recordings into Gemini for translation, but it was causing more grief than it relieved. When I actually spoke to them I realized translation was only one of the problems. The other was a lack of actionable clinical intelligence. One exacerbated the other.</p><p>Nobody knew which clinics were underperforming. Nobody knew that chronic condition patients &#8212; diabetics, hypertensives, people on maintenance meds &#8212; are the ones who actually need follow-up calls, not everyone. The team was calling all patients equally and couldn&#8217;t understand why it wasn&#8217;t working.</p><p>So I built something. Pro bono, using Claude Code.</p><p>Before picking a transcription engine I ran a benchmark &#8212; four engines, same 50 real calls, rural West Bengal Bengali recorded on basic phones. Scored across five criteria: completeness, coherence, proper noun accuracy, medical content accuracy, and handling of noise and dropped calls.</p><p>Same audio. Very different results.</p><p>I expected Whisper V3 to win. It didn&#8217;t. 
Sarvam AI&#8217;s Saaras V3 was the clear winner &#8212; and the nuance was that it actually understood rural dialects, not just standard Bengali. That gap matters when clinical data is in the mix. A misheard medication name or a mangled outcome becomes a wrong decision downstream.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kRwT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba599002-6aff-4d26-bcc2-3d354de44e57_1280x504.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kRwT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba599002-6aff-4d26-bcc2-3d354de44e57_1280x504.png 424w, https://substackcdn.com/image/fetch/$s_!kRwT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba599002-6aff-4d26-bcc2-3d354de44e57_1280x504.png 848w, https://substackcdn.com/image/fetch/$s_!kRwT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba599002-6aff-4d26-bcc2-3d354de44e57_1280x504.png 1272w, https://substackcdn.com/image/fetch/$s_!kRwT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba599002-6aff-4d26-bcc2-3d354de44e57_1280x504.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kRwT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba599002-6aff-4d26-bcc2-3d354de44e57_1280x504.png" width="1280" height="504" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ba599002-6aff-4d26-bcc2-3d354de44e57_1280x504.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:504,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Anurag Mohanty&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Anurag Mohanty" title="Anurag Mohanty" srcset="https://substackcdn.com/image/fetch/$s_!kRwT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba599002-6aff-4d26-bcc2-3d354de44e57_1280x504.png 424w, https://substackcdn.com/image/fetch/$s_!kRwT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba599002-6aff-4d26-bcc2-3d354de44e57_1280x504.png 848w, https://substackcdn.com/image/fetch/$s_!kRwT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba599002-6aff-4d26-bcc2-3d354de44e57_1280x504.png 1272w, https://substackcdn.com/image/fetch/$s_!kRwT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba599002-6aff-4d26-bcc2-3d354de44e57_1280x504.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p>The pipeline transcribes each call, extracts structured fields &#8212; recovery status, medication dropout, access barriers, whether the patient sought treatment elsewhere, whether they have a chronic condition that needs maintenance medication &#8212; and surfaces everything by clinic in a dashboard. Patterns across thousands of calls, not summaries read one at a time.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;7d5e1d1a-d21a-4f23-b4f9-4c05e15932cf&quot;,&quot;duration&quot;:null}"></div><p>Expected outcomes: a 13-percentage-point reduction in medication dropout, an 85% improvement in clinical escalation time, and a 2-3x increase in access barrier identification.</p><p>Open source, self-hostable, bring your own API. Any health NGO can deploy this on nonprofit cloud credits.</p><p>Still in progress but working.</p>]]></content:encoded></item></channel></rss>