Topic dashboard
Agent Architecture & Personal Infrastructure
Last refreshed May 10, 2026 · 11 concepts
The harness, not the model, is becoming the moat.
My take
Start from two assumptions I treat as fixed: models will keep getting better, and they will stay non-deterministic. At any given moment one model is ahead of the rest, but the lead changes hands on a roughly quarterly cadence, and non-determinism never goes away. Both facts point at the same answer: a commitment to any specific model is a depreciating asset, and even the best model is unsafe to ship without scaffolding around it.
What compounds, then, is not the model. It is the harness, the evals, and the memory - the layer that governs context, scopes tools, persists state across sessions, and lets you measure whether a swap actually improved anything. The ability to switch models confidently, backed by your own evaluation suite, is the capability most teams underinvest in. It is also the one that decides who has pricing power against the labs.
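To make "switch models confidently, backed by your own evaluation suite" concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a description of any real harness: the `Model` callable type, the `pass_rate` and `should_swap` helpers, the two stub models, and the 0.05 `min_gain` threshold are all hypothetical. The point is only the shape: every model sits behind one interface, each case is run several times because outputs are non-deterministic, and the swap decision comes from measured pass rates.

```python
import statistics
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: a "model" is any callable from prompt to text.
# Real provider clients would be wrapped to fit this shape.
Model = Callable[[str], str]
# An eval case pairs a prompt with a checker that returns True on an acceptable answer.
EvalCase = Tuple[str, Callable[[str], bool]]


def pass_rate(model: Model, cases: List[EvalCase], trials: int = 5) -> float:
    """Run every case `trials` times (outputs may vary run to run) and return the mean pass rate."""
    results = [
        1.0 if check(model(prompt)) else 0.0
        for prompt, check in cases
        for _ in range(trials)
    ]
    return statistics.mean(results)


def should_swap(
    current: Model,
    candidate: Model,
    cases: List[EvalCase],
    trials: int = 5,
    min_gain: float = 0.05,  # assumed threshold; tune it to your own noise level
) -> Dict[str, object]:
    """Compare two models on the same suite and recommend a swap only on a clear gain."""
    base = pass_rate(current, cases, trials)
    cand = pass_rate(candidate, cases, trials)
    return {"current": base, "candidate": cand, "swap": cand - base >= min_gain}


# Stub models standing in for real providers (assumptions for illustration only).
def stub_current(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "unsure"


def stub_candidate(prompt: str) -> str:
    answers = {"What is 2 + 2?": "4", "Capital of France?": "Paris"}
    return answers.get(prompt, "unsure")


CASES: List[EvalCase] = [
    ("What is 2 + 2?", lambda out: out.strip() == "4"),
    ("Capital of France?", lambda out: "paris" in out.lower()),
]

report = should_swap(stub_current, stub_candidate, CASES)
print(report)
```

In a real suite the cases would come from your own workflows, and the threshold would be set above the run-to-run variance you observe for a single model.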
The mistake I see most often is treating “agent” as a model-capability question. It is an infrastructure question. Whoever owns the harness owns the workflow, the data exhaust, and the switching cost, regardless of which model is plugged in underneath. That is why open-source harnesses wrapping proprietary CLIs are a real threat to subscription economics, not a curiosity.
Over the next twelve months I expect enterprise buyers to start asking harness-shaped questions - memory, observability, permission boundaries, audit, evals - before they ask model-shaped ones. The vendors who treat the harness as a thin wrapper will lose to the ones treating it as the operating system.
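Two of those harness-shaped questions, permission boundaries and audit, can be sketched in a few lines of Python. This is an assumption-laden illustration, not any vendor's design: `ToolGateway`, `PermissionDenied`, the tool names, and the log record fields are all hypothetical. The idea it shows is that every tool call goes through one chokepoint that checks an allowlist and records the attempt, whether or not it was permitted.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


class PermissionDenied(Exception):
    """Raised when an agent calls a tool outside its allowlist."""


@dataclass
class ToolGateway:
    """Hypothetical chokepoint between an agent and its tools."""

    tools: Dict[str, Callable[..., Any]]
    allowed: Set[str]
    audit_log: List[Dict[str, Any]] = field(default_factory=list)

    def call(self, agent_id: str, tool: str, **kwargs: Any) -> Any:
        permitted = tool in self.allowed and tool in self.tools
        # Record the attempt before executing, so denied calls are auditable too.
        self.audit_log.append(
            {
                "ts": time.time(),
                "agent": agent_id,
                "tool": tool,
                "args": kwargs,
                "permitted": permitted,
            }
        )
        if not permitted:
            raise PermissionDenied(f"{agent_id} may not call {tool}")
        return self.tools[tool](**kwargs)


# Illustrative tools; a real harness would register actual file or API operations.
gateway = ToolGateway(
    tools={
        "read_file": lambda path: f"<contents of {path}>",
        "delete_file": lambda path: None,
    },
    allowed={"read_file"},  # the agent may read but not delete
)

content = gateway.call("agent-1", "read_file", path="notes.txt")

try:
    gateway.call("agent-1", "delete_file", path="notes.txt")
    denied = False
except PermissionDenied:
    denied = True

print(content, denied, len(gateway.audit_log))
```

Because the gateway knows nothing about which model produced the call, the same boundary and the same log survive a model swap.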
Everything above the divider is mine. Everything below is auto-assembled daily from my knowledge base — individual links and summaries may be stale or off-target. Last refreshed: 2026-05-10.
What’s shifted recently
- Enterprise Agent Observability OTEL Tracing (updated 2026-05-09): Enterprise agent observability is the discipline of instrumenting every step of an autonomous agent’s execution - tool calls, retrieval operations, subagent invocations, and inter…
- Harness Engineering (updated 2026-05-09): Harness engineering is the practice of designing the OS-layer around AI coding agents - the context governance, tool architecture, eval loops, memory management, and permission mo…
- Hermes Agent Skill Composition Framework (updated 2026-05-09): Hermes Agent is an open-source CLI-first agent framework built by NousResearch that structures autonomous workflows around three composable primitives: skills (discrete capability…
- Enterprise Agent Integration Layer (updated 2026-05-08): The enterprise agent integration layer refers to the set of CLI tools, APIs, protocol adapters, and platform extensions that established enterprise software vendors are building t…
- Agent Economy Infrastructure (updated 2026-05-07): Agent economy infrastructure refers to the foundational primitives - identity, communications, payments, compute, memory, and orchestration - being built specifically for AI agent…
- Agent Evaluation Noise (updated 2026-05-07): Agent evaluation noise refers to the non-model variance introduced into agentic benchmarks by infrastructure configuration, reward hacking behavior, non-deterministic runtime envi…
- Agent Memory Architecture (updated 2026-05-07): Agent memory architecture refers to the set of mechanisms by which AI coding agents and AI coworkers maintain context that persists beyond a single session, enabling continuity ac…
- Agent Subagent Decomposition Production Pattern (updated 2026-05-07): Agent-subagent decomposition is the architectural pattern of splitting a production AI workflow into a parent orchestrator and one or more specialized child agents, each scoped to…
- Agent Session Memory Loss Project Context (updated 2026-05-04): Coding agents and LLM-based assistants, including Claude Code, discard all conversational state when a session ends - there is no native mechanism to carry forward decisions made,…
- Local First Workflow Engine Deterministic Agents (updated 2026-05-03): Local-first workflow engines for AI agents are self-hosted, open-source platforms that execute multi-step agentic pipelines entirely within the operator’s own infrastructure - no…
- Social Agent Queue Backpressure Pattern (updated 2026-05-03): Social agents - autonomous processes monitoring feeds on platforms like Farcaster, Moltbook, and Nostr - generate write bursts to downstream research or memory systems faster than…
The ideas I keep coming back to
Currently active (last 30 days):
- Enterprise Agent Observability OTEL Tracing
- Harness Engineering
- Hermes Agent Skill Composition Framework
- Enterprise Agent Integration Layer
- Agent Economy Infrastructure
- Agent Evaluation Noise
- Agent Memory Architecture
- Agent Subagent Decomposition Production Pattern
- Agent Session Memory Loss Project Context
- Local First Workflow Engine Deterministic Agents
- Social Agent Queue Backpressure Pattern
Who I’m watching
- Anthropic (organization) — Anthropic is the AI lab behind the Claude family of models and Claude Code, positioned as a frontier safety-focused competitor to OpenAI and Google.
- LangChain (organization) — LangChain is a framework and tooling company for building production LLM applications, with the LangChain orchestration library, the LangSmith observability platform, and the Deep…
- DeepSeek (organization) — DeepSeek is a Chinese AI lab whose open-weight model releases anchor the lower end of the cost-capability frontier and contribute directly to the frontier-model-compression dynami…
- Garry Tan (person) — Garry Tan is the president and CEO of Y Combinator, and one of the most visible public commentators on AI coding tools, startup strategy, and AI security risk.
- Jensen Huang (person) — Jensen Huang is co-founder and CEO of NVIDIA, which under his leadership became the world’s most valuable company by capitalizing on the AI infrastructure buildout.
- Moonshot AI / Kimi (organization) — Moonshot AI (月之暗面) is the Chinese lab behind the Kimi model family, including the open-weight Kimi K2.5 release that powers Cursor Composer 2.
- NVIDIA (organization) — NVIDIA is the dominant supplier of GPU compute for AI training and inference, and as of 2026 the world’s most valuable public company.
- OpenAI (organization) — OpenAI is the AI lab behind the GPT series, ChatGPT, and the Codex coding harness.
- Peter Steinberger (person) — Peter Steinberger (X: @steipete) is the creator of OpenClaw, the open-source personal AI agent platform that reached over 160,000 GitHub stars within weeks of launch.
- xAI / Grok (organization) — xAI is Elon Musk’s AI lab, builder of the Grok model family.
Sources I’ve been drawing on
- fluxhuman.com — cited in Enterprise Agent Observability OTEL Tracing
- autoolize.com — cited in Enterprise Agent Observability OTEL Tracing
- blog.brightcoding.dev — cited in Enterprise Agent Observability OTEL Tracing
- www.braintrust.dev — cited in Enterprise Agent Observability OTEL Tracing
- arxiv.org — cited in Enterprise Agent Observability OTEL Tracing
- dev.to — cited in Enterprise Agent Observability OTEL Tracing
- www.reddit.com — cited in Enterprise Agent Observability OTEL Tracing
- medium.com — cited in Enterprise Agent Observability OTEL Tracing
- productresources.collibra.com — cited in Enterprise Agent Observability OTEL Tracing
- proficient.store — cited in Enterprise Agent Observability OTEL Tracing
- martinfowler.com — cited in Harness Engineering
- openai.com — cited in Harness Engineering