人生天地之间，若白驹过隙，忽然而已——庄子

Plano：把“Agent 上生产”最难的那一段中间件，变成一个 AI‑native 数据平面（基于 README + Repo Description）

仓库：https://github.com/katanemo/plano
官网：https://planoai.dev
文档：https://docs.planoai.dev
Quickstart：https://docs.planoai.dev/get_started/quickstart.html

Repo description（原文）：

Plano is an AI-native proxy and data plane for agentic apps — with built-in orchestration, safety, observability, and smart LLM routing so you stay focused on your agents core logic.

0. 先把“Plano 是什么”讲清楚：它不是又一个 Agent 框架

如果你最近做过 agentic app，你一定懂那种落差感：

Demo 的时候：一个 prompt + 两个工具调用 + 一点链式逻辑，几小时就能跑出效果
上线的时候：开始补齐“隐藏的中间件”——路由、编排、安全、观测、评估、模型切换、provider 兼容……然后你发现自己在写的东西越来越不像“产品逻辑”

Plano 在 README 里直接点破了这件事：

Building agentic demos is easy. Shipping agentic applications safely, reliably, and repeatably to production is hard.

它给出的解法不是“再来一个更大的框架抽象”，而是把这些共性问题抽离成一个 out-of-process dataplane（进程外数据平面），也就是：
你的应用继续专注在 agent 的核心逻辑；Plano 作为一个AI‑native proxy / data plane，接管那些不该每个项目都手搓一遍的“管道活”。

1. Plano 想帮你省掉哪些“必修课”？

README 给了一个很工整的总结：当你不用 Plano 时，你最终都会自己把这些拼起来：

到底该调用哪个 agent（或哪个子服务）的routing logic
统一的 safety / moderation / guardrails “钩子”
evaluation 与 observability 的“胶水层”
不同模型/不同 provider 的 API quirks 散落在代码里

Plano 把这些集中到 dataplane 里，给你四个核心能力（README 原文要点）：

🚦 Orchestration：agents 之间低延迟编排；新增 agent 不必改应用代码
🔗 Model Agility：LLM 路由可以按 model name、alias（语义别名）或按偏好自动路由（README 链接到 “use plano as a LLM router”）
🕵 Agentic Signals™：零代码捕获 Signals + OpenTelemetry traces/metrics
🛡️ Moderation & Memory Hooks：通过 Filter Chains 一致地加越狱防护、审核策略、记忆等能力

你会发现它的表达方式很像“平台化能力清单”，而不是“框架 API 教程”。这是 dataplane 思路的典型特征：把横切关注点（cross-cutting concerns）统一托管。

2. 它的底座是什么？（README 明确写了）

Plano README 明确说它：

built on Envoy：https://envoyproxy.io
并由 Envoy 的核心贡献者构建（README 描述：built critical infrastructure at scale for modern workloads）

这句话很关键，因为它暗示了 Plano 的形态更接近“网关/代理/数据面”而不是“库”：

你不需要把它塞进你的应用进程
你可以把它放在服务之间，让它负责代理与编排
观测/路由/过滤这些事，天然适合在代理层做

README 里甚至给了 High-Level Network Sequence Diagram（配图路径）：

docs/source/_static/img/plano_network_diagram_high_level.png

3. 一次把它跑起来：README 的“旅行助理”例子非常典型

Plano README 在 “Build Agentic Apps with Plano” 里给了一个多 agent 的旅行场景示例，并指出完整代码在：

demos/agent_orchestration/travel_agents/

这个 demo 选择得很聪明：它天然需要“同一段对话里调用多个 agent”，而这正是编排与路由的刚需。

下面我按 README 的顺序，把它拆成三个你可以直接理解、直接复刻的步骤。

4. Step 1：用 YAML “声明”你的 agents（而不是手写路由逻辑）

README 的核心口号之一是：

What you declare：Agent URLs + 自然语言描述
What you don’t write：意图分类器、路由逻辑、模型 fallback、provider adapter、tracing instrumentation

示例配置（README 原文片段）大致长这样：

# config.yaml
version: v0.3.0

agents:
  - id: weather_agent
    url: http://localhost:10510
  - id: flight_agent
    url: http://localhost:10520

model_providers:
  - model: openai/gpt-4o
    access_key: $OPENAI_API_KEY
    default: true
  - model: anthropic/claude-3-5-sonnet
    access_key: $ANTHROPIC_API_KEY

listeners:
  - type: agent
    name: travel_assistant
    port: 8001
    router: plano_orchestrator_v1  # README：由 4B 参数路由模型驱动；可替换成其他模型
    agents:
      - id: weather_agent
        description: |
          Gets real-time weather and forecasts for any city worldwide.
          Handles: "What's the weather in Paris?", "Will it rain in Tokyo?"

      - id: flight_agent
        description: |
          Searches flights between airports with live status and schedules.
          Handles: "Flights from NYC to LA", "Show me flights to Seattle"

tracing:
  random_sampling: 100  # Auto-capture traces for evaluation

这里有两个“很 Plano 的点”：

agent 的 description 是“可执行的路由语义”
你不是写 if/else，而是写“这个 agent 擅长处理什么”。这更像把路由问题��给一个 router（README 点名 plano_orchestrator_v1）。
tracing 是配置项，不是你到处埋点
random_sampling: 100 这种写法表达的是：默认就把观测当作管道能力来提供。

5. Step 2：agent 代码可以极简（只要实现 OpenAI 兼容的 chat completions）

README 对 agent 的约束很干净：

Your agents are just HTTP servers that implement the OpenAI-compatible chat completions endpoint. Use any language or framework.

它给了一个 Python/FastAPI 的例子，并且有一个非常关键的注释：

Point to Plano’s LLM gateway - it handles model routing for you

也就是：你的 agent 不需要直连各家 provider；它可以把 LLM 调用统一打到 Plano 的 gateway。

示例（README 原文片段）：

# weather_agent.py
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse
from openai import AsyncOpenAI

app = FastAPI()

# Point to Plano's LLM gateway - it handles model routing for you
llm = AsyncOpenAI(base_url="http://localhost:12001/v1", api_key="EMPTY")

@app.post("/v1/chat/completions")
async def chat(request: Request):
    body = await request.json()
    messages = body.get("messages", [])
    days = 7

    # Your agent logic: fetch data, call APIs, run tools
    # See demos/agent_orchestration/travel_agents/ for the full implementation
    weather_data = await get_weather_data(request, messages, days)

    # Stream the response back through Plano
    async def generate():
        stream = await llm.chat.completions.create(
            model="openai/gpt-4o",
            messages=[{"role": "system", "content": f"Weather: {weather_data}"}, *messages],
            stream=True
        )
        async for chunk in stream:
            yield f"data: {chunk.model_dump_json()}\n\n"

    return StreamingResponse(generate(), media_type="text/event-stream")

这段例子非常适合写进博客，因为它说明了一个“生产化思路”：

agent 专注：拿数据、拼上下文、做工具调用
LLM 连接/路由/供应商差异：交给 dataplane（Plano）

6. Step 3：启动 Plano，用一个入口把对话跑通（它会在同一对话内路由多个 agent）

README 的启动与调用也很直接：

1
2
3

# Start Plano
planoai up config.yaml
...

然后通过 listener 暴露的端口（例子里是 8001）以 OpenAI 兼容接口请求：

curl http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "I want to travel from NYC to Paris next week. What is the weather like there, and can you find me some flights?"}
    ]
  }'

README 还很贴心地写了预期行为（原文箭头说明）：

先路由到 weather_agent 获取巴黎天气
再路由到 flight_agent 查 NYC→Paris 航班
返回一个综合的旅行计划

这就把“多 agent 编排”从应用层 if/else，变成了 dataplane 的职责。

7. “免费”得到可观测性：OpenTelemetry 端到端追踪（README 明确）

README 直接说：

Every request is traced end-to-end with OpenTelemetry - no instrumentation code needed.

并给了示例图：

docs/source/_static/img/demo_tracing.png

如果你写过生产系统，你会知道“观测”从来不免费：
你要统一 trace id、要跨服务传递、要挑选采样策略、要考虑日志与 spans 的关联……
Plano 选择把它内建进 dataplane，让它成为“默认能力”。

8. 它到底帮你省了什么？README 用一张表讲透

README 的对照表非常适合原样引用（我这里按它的结构复述）：

Infrastructure Concern	Without Plano	With Plano
Agent Orchestration	写 intent classifier + routing logic	YAML 声明 agent 描述
Model Management	处理每家 provider 的 quirks	统一 LLM API + 状态管理
Rich Tracing	每个服务自己埋 OTEL	自动端到端 traces & logs
Learning Signals	自建 spans 捕获/导出 pipeline	零代码 agentic signals
Adding Agents	改路由代码、测试、发版	改配置、重启

以及它为什么“高效”（README 原文）：

Plano uses purpose-built, lightweight LLMs (like our 4B-parameter orchestrator) instead of heavyweight frameworks or GPT-4 for routing - giving you production-grade routing at a fraction of the cost and latency.

这句话翻译成人话就是：
路由这件事不该用最贵的模型做；Plano 用专门的小模型（比如 4B orchestrator）来完成这类“控制面/编排面”的工作。

9. 一点容易被忽略但很实用的信息：Plano 的 LLM（Arch family）与运行方式

README 有一段 Important 提示（非常关键）：

Plano and the Arch family of LLMs (like Plano-Orchestrator-4B, Arch-Router, etc) are hosted free of charge in the US-central region to give you a great first-run developer experience of Plano. To scale and run in production, you can either run these LLMs locally or contact us on Discord for API keys.

这里明确了三点：

为了 first-run 体验，官方在 US-central 免费托管了一些相关 LLM
真要上规模/上生产：你可以本地跑这些 LLM，或联系他们拿 API keys
“Plano 不是只提供 proxy”，它还围绕路由/编排提供了模型家族（Arch family）

10. 进阶：如果你要本地开发 CLI / 构建组件，README 里还有专门文档

除了根 README，仓库里还有一些子 README 对开发者很友好，例如：

10.1 CLI 本地开发（`cli/README.md`）

它介绍用 uv 来做本地开发（README 原文命令示例）：

安装 uv：
- macOS/Linux：curl -LsSf https://astral.sh/uv/install.sh | sh
- Windows：powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
在 cli/ 目录安装依赖：uv sync
构建：uv run planoai build
看日志：uv run planoai logs --follow

这对想贡献/调试 CLI 的人很有用。

10.2 Envoy filter / gateway 相关（`config/README.md`）

里面提到 Rust + wasm 目标与构建方式（README 原文要点）：

rustup target add wasm32-wasip1
cargo build --target wasm32-wasip1 --release
cargo test
本地 dev docker compose 启动等

这也再次印证：Plano 的“代理/数据面”属性非常强，和 Envoy/filter 的世界贴得很近。

11. 最后一段：Plano 的定位，适合什么样的团队？

只基于 README 和 description，我会这样给一个务实的画像：

Plano 适合那些已经确定“要做 agentic app”，但不想让团队把时间耗在：

反复造路由与编排轮子
在不同 AI 框架抽象之间来回迁移
观测与安全策略散落在每个服务里难以统一
频繁换模型/换 provider 时改一堆业务代码

它把这些从“每个项目的手工活”，变成“一个可配置的数据平面”。

当你的业务在变化（多 agent、多个模型、多个 provider），而你又想保持架构稳定时，这种“AI‑native proxy/data plane”的思路会非常有吸引力：
把变化留给配置与路由，把稳定留给核心业务逻辑。

参考链接（README 中出现的官方入口）

Docs：https://docs.planoai.dev
Quickstart Guide：https://docs.planoai.dev/get_started/quickstart.html
LLM Routing：https://docs.planoai.dev/guides/llm_router.html
Agent Orchestration：https://docs.planoai.dev/guides/orchestration.html
Filter Chains：https://docs.planoai.dev/concepts/filter_chain.html
Prompt Targets：https://docs.planoai.dev/concepts/prompt_target.html
Observability：https://docs.planoai.dev/guides/observability/observability.html
Contact（Discord）：https://discord.gg/pGZf2gcwEc
Roadmap（GitHub Projects）：https://github.com/orgs/katanemo/projects/1