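Harness Engineering Study Notes: Learn Claude Code from 0 to 1 (Part 2)
I. Stage One
1.1 S01 Minimal agent (one tool + one loop)
The agent loop feeds each tool result back to the model so that it can keep reasoning.
1.1.1 The TOOLS definition contains a single bash tool: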
```python
TOOLS = [{
    "name": "bash",
    "description": "Run a shell command.",
    "input_schema": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
}]
```
1.1.2 The loop. Later chapters layer new mechanisms on top of this loop; its core structure stays the same throughout:
```python
def agent_loop(messages: list) -> None:
    while True:
        # Send the message history, system prompt, and tool definitions to the model
        response = client.messages.create(
            model=MODEL, system=SYSTEM, messages=messages,
            tools=TOOLS, max_tokens=8000,
        )
        # Append the model's reply so it becomes context for the next turn
        messages.append({"role": "assistant", "content": response.content})
        # Did the model ask for a tool? If not, we are done
        if response.stop_reason != "tool_use":
            return
        # Execute the tool calls and collect the results
        results = []
        for block in response.content:
            if block.type == "tool_use":
                # Run the bash command
                output = run_bash(block.input["command"])
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": output,
                })
        # Append the tool results as a new user message and keep looping
        messages.append({"role": "user", "content": results})

# WORKDIR stands for the directory the agent works in
SYSTEM = f"You are a coding agent at {WORKDIR}. Use bash to solve tasks. Act, don't explain."
```
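The loop calls run_bash, which these notes do not show. A minimal sketch of what it could look like, assuming it shells out through subprocess and returns the combined output as one string; the 120-second timeout and the 50,000-character cap are illustrative assumptions, not values from the course.

```python
import subprocess

def run_bash(command: str, timeout: int = 120, max_chars: int = 50_000) -> str:
    """Run a shell command and return its combined stdout/stderr as text."""
    try:
        proc = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return f"Error: command timed out after {timeout}s"
    output = (proc.stdout + proc.stderr).strip()
    # Always hand back a non-empty string so the tool_result has content
    return output[:max_chars] if output else "(no output)"
```

Returning errors as text rather than raising keeps the loop simple: the model sees the failure as an ordinary tool result and can react to it on the next turn.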
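1.1.3 CC (Claude Code) source code
| # | Topic | In the CC source |
| --- | --- | --- |
| 1 | Loop structure | stop_reason is not the only signal for continuing the loop. During streaming, as soon as a tool_use block is detected, needsFollowUp is set to true. |
| 2 | State object with 10 fields (the teaching version uses only messages) | 1. messages: the message array for the current iteration 2. toolUseContext: tool, signal, and permission context 10. transition: the reason for the previous continuation |
| 3 | Multiple exit and continue paths | Covers blocking limit, prompt too long, model error, abort, hook stop, max turns, token budget continuation, reactive compact retry, and similar cases. |
| 4 | Streaming tool execution and QueryEngine | Tools start executing in parallel while the model is still generating (concurrent or exclusive, depending on whether the tool is concurrency-safe). QueryEngine.ts adds further guards such as cost-limit overruns and structured-output validation failures. |
1.1.4 Test prompts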
a. Create a file called hello.py that prints "Hello, World!"
b. List all Python files in this directory
c. What is the current git branch?
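1.2 S02 Toolbox
1.2.1 TOOLS defines 5 tools: bash, read_file, write_file, edit_file, glob
The file tools go through safe_path, so they can only operate on files inside the project directory. This keeps the agent from reaching out into system files.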
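The notes name safe_path but do not show it. One plausible shape, assuming the guard resolves the requested path against a project root and rejects anything that lands outside it; the root parameter and the error message are my own additions for illustration.

```python
from pathlib import Path

WORKDIR = Path.cwd().resolve()  # assumed project root

def safe_path(p: str, root: Path = WORKDIR) -> Path:
    """Resolve p against root and refuse anything that escapes it."""
    resolved = (root / p).resolve()
    # resolve() collapses ".." and follows symlinks, so the check sees the real target
    if not resolved.is_relative_to(root):
        raise ValueError(f"Path escapes workspace: {p}")
    return resolved
```

With this in place, a call such as safe_path("../../etc/passwd") raises ValueError, while safe_path("src/main.py") returns an absolute path under the root. Path.is_relative_to needs Python 3.9 or newer.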
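1.2.2 Loop: run_bash is replaced by a lookup-and-dispatch step.
```python
results = []
for block in response.content:
    if block.type == "tool_use":
        # Look up the handler for this tool
        handler = TOOL_HANDLERS.get(block.name)
        # Call it if found; otherwise report the unknown tool name
        output = handler(**block.input) if handler else f"Unknown: {block.name}"
        results.append({"type": "tool_result", "tool_use_id": block.id, "content": output})

# WORKDIR stands for the directory the agent works in
SYSTEM = f"You are a coding agent at {WORKDIR}. Use tools to solve tasks. Act, don't explain."
```
Note: the tools here are called sequentially. CC does something more involved: it cuts the calls, in their original order, into consecutive batches. Concurrency-safe tools inside a batch run in parallel, and the batches themselves run strictly one after another.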
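The batching idea in the note above can be sketched roughly as follows. This is not CC's code: ToolCall, READ_ONLY_TOOLS, and the thread pool are assumptions chosen to keep the example small. Consecutive concurrency-safe calls share a batch; every other call gets a batch of its own.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass
class ToolCall:
    name: str
    input: dict

# Assumption: which tools count as concurrency-safe is illustrative only
READ_ONLY_TOOLS = {"read_file", "glob"}

def is_concurrency_safe(call: ToolCall) -> bool:
    return call.name in READ_ONLY_TOOLS

def partition(calls: list[ToolCall]) -> list[list[ToolCall]]:
    """Split calls, in original order, into consecutive batches."""
    batches: list[list[ToolCall]] = []
    prev_safe = False
    for call in calls:
        safe = is_concurrency_safe(call)
        if safe and prev_safe:
            batches[-1].append(call)   # extend the current run of safe calls
        else:
            batches.append([call])     # unsafe calls always run alone
        prev_safe = safe
    return batches

def run_batches(batches: list[list[ToolCall]], handlers: dict) -> list:
    """Run batches strictly in order; calls inside a multi-call batch run in parallel."""
    outputs = []
    for batch in batches:
        if len(batch) > 1:
            with ThreadPoolExecutor() as pool:
                # pool.map preserves input order, so results line up with the calls
                outputs.extend(pool.map(lambda c: handlers[c.name](**c.input), batch))
        else:
            call = batch[0]
            outputs.append(handlers[call.name](**call.input))
    return outputs
```

Keeping the original order matters because the tool_result blocks sent back to the model have to line up with the tool_use blocks it emitted.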
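1.2.3 CC source code
| # | Topic | In the CC source |
| --- | --- | --- |
| 1 | How tools are defined | Each tool is a standalone object created by buildTool(), bundling schema, validation, permissions, and execution. |
| 2 | Concurrency-safety check: isConcurrencySafe | isConcurrencySafe(input) decides whether a call can run concurrently; consecutive concurrency-safe calls are merged into one batch. |
| 3 | Partitioning algorithm | Tool calls are grouped into batches by consecutive runs. |
| 4 | Validation pipeline (5 steps) | 1. Zod schema validation: checks parameter types and structure 2. Tool-level validateInput: validates parameter values 3. PreToolUse hooks: a hook can return a message, modify the input, or block execution 4. Permission check: canUseTool + checkPermissions → allow/deny/ask 5. Execute tool.call() |
| 5 | Streaming tool execution | Tools are started while the model is still generating. |
| 6 | Tool result persistence | Each tool has a maxResultSizeChars field. A result larger than this is written to disk, and the model sees a preview plus the file path. |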
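Row 6 is easy to picture in code. A rough sketch under my own assumptions: the limit, the preview length, the file naming, and the persist_if_large helper are all hypothetical, and only the behavior (spill to disk, return preview plus path) comes from the table.

```python
import tempfile
from pathlib import Path

MAX_RESULT_CHARS = 2_000   # hypothetical stand-in for a tool's maxResultSizeChars
PREVIEW_CHARS = 200        # hypothetical preview length

def persist_if_large(tool_use_id: str, output: str, out_dir: Path,
                     limit: int = MAX_RESULT_CHARS) -> str:
    """Return output unchanged if small; otherwise save it and return a preview."""
    if len(output) <= limit:
        return output
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{tool_use_id}.txt"
    path.write_text(output, encoding="utf-8")
    preview = output[:PREVIEW_CHARS]
    return f"{preview}\n[... {len(output)} chars total, full output saved to {path}]"

# Usage: a small result passes through, a large one is replaced by a preview
out_dir = Path(tempfile.mkdtemp())
small = persist_if_large("toolu_1", "ok", out_dir)
big = persist_if_large("toolu_2", "x" * 5000, out_dir)
```

The full content is never lost: the model can read the saved file later with read_file if the preview is not enough.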
1.2.4 Test prompts
a. Read the file README.md and tell me what this project is about
b. Create a file called test.py that prints "hello", then read it back
c. Find all Python files in this directory
d. Read both README.md and requirements.txt, then create a summary file
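1.3 S03 Tool permissions
The permission system exists so that every tool call the agent makes passes a reliable safety check before it runs.
1.3.1 Loop: check_permission is inserted before tool execution. Every tool call passes through three gates in a fixed order: hard deny first (the deny list), soft ask second (rule matching, user approval), and if neither matches, the call is allowed.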
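A sketch of how the three gates might be written. The patterns, the ask_user callback, and the decision to gate only bash are simplifications I chose for the example; the course's real check_permission may differ.

```python
import re
from types import SimpleNamespace

# Illustrative rules only; a real deny/ask list would be much longer
DENY_PATTERNS = [r"\brm\s+-rf\s+/(\s|$)", r"\bsudo\b"]   # gate 1: hard deny
ASK_PATTERNS = [r"\brm\b", r"\bgit\s+push\b"]            # gate 2: ask the user

def check_permission(block, ask_user=input) -> bool:
    """Return True if the tool call may run."""
    if block.name != "bash":
        return True  # simplification: only bash is gated in this sketch
    command = block.input.get("command", "")
    # Gate 1: the deny list wins unconditionally
    if any(re.search(p, command) for p in DENY_PATTERNS):
        return False
    # Gate 2: matching rules need explicit user approval
    if any(re.search(p, command) for p in ASK_PATTERNS):
        answer = ask_user(f"Allow `{command}`? [y/N] ")
        return answer.strip().lower() == "y"
    # Gate 3: nothing matched, allow
    return True

# Usage with a stand-in for the SDK's tool_use block
block = SimpleNamespace(name="bash", input={"command": "ls -la"})
allowed = check_permission(block)
```

The order is what makes this safe: a command that matches both lists is denied before the user is ever asked.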
```python
results = []
for block in response.content:
    if block.type != "tool_use":
        continue
    # s03 change: check permission before running the tool
    if not check_permission(block):
        results.append({"type": "tool_result", "tool_use_id": block.id,
                        "content": "Permission denied."})
        continue
    handler = TOOL_HANDLERS.get(block.name)
    output = handler(**block.input) if handler else f"Unknown: {block.name}"
    results.append({"type": "tool_result", "tool_use_id": block.id, "content": output})

# WORKDIR stands for the directory the agent works in
SYSTEM = f"You are a coding agent at {WORKDIR}. All destructive operations require user approval."
```
1.3.2 CC source code
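| # | Topic | In the CC source |
| --- | --- | --- |
| 1 | PermissionResult: 4 kinds, not 3 | deny, ask, allow, passthrough (defer to the general pipeline) |
| 2 | Validation stages in production | A tool call passes through multiple stages, not just three gates. |
| 3 | Deny list: not one file but 8 sources | user < project < local < flag < policy, plus cliArg, command, session |
| 4 | isDestructive | Used purely for UI display. |
| 5 | YoloClassifier (automatic approval) | First simulates acceptEdits mode (if acceptEdits would allow the call, it is approved directly), then checks the safe-tool allowlist, and only then calls the classifier. If the classifier denies too many times in a row, it falls back to manual approval. |
| 6 | Permission bubbling | The permission prompt bubbles up to the parent agent's terminal instead of being silently denied inside the subagent. |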
1.3.3 Test prompts
a. Create a file called test.txt in the current directory (should pass straight through)
b. Delete all temporary files in /tmp (bash + rm triggers gate 2)
c. What files are in the current directory? (read-only, passes every gate)
d. Try to write a file to /etc/something (a write outside the workspace, triggers gate 2)
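1.4 S04 Hooks before and after tool calls
A hook is a call that the main loop makes to the outside at fixed points in its cycle.
1.4.1 Loop: check_permission moves out of the loop body and onto a hook. The loop no longer calls any check function directly, so behavior can be extended without touching the loop body.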
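The notes use trigger_hooks without showing it. A minimal registry might look like the sketch below. register_hook, the HOOKS dictionary, and the "first truthy return value wins" rule are assumptions on my part, chosen so that a PreToolUse hook can return a block message and a Stop hook can return a forced follow-up message.

```python
from collections import defaultdict
from types import SimpleNamespace

HOOKS = defaultdict(list)  # event name -> list of callables

def register_hook(event: str, fn) -> None:
    HOOKS[event].append(fn)

def trigger_hooks(event: str, *args):
    """Call each hook for the event in registration order; the first truthy result wins."""
    for fn in HOOKS[event]:
        result = fn(*args)
        if result:
            return result
    return None

# Example hooks: one blocks rm, one logs every finished tool call
def deny_rm(block):
    if block.name == "bash" and "rm " in block.input.get("command", ""):
        return "Permission denied: rm is not allowed."

def log_call(block, output):
    print(f"[hook] {block.name} -> {len(str(output))} chars")

register_hook("PreToolUse", deny_rm)
register_hook("PostToolUse", log_call)
```

Adding a new behavior is now a register_hook call; the loop body stays untouched, which is the point of this stage.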
```python
def agent_loop(messages: list) -> None:
    while True:
        response = client.messages.create(
            model=MODEL, system=SYSTEM, messages=messages,
            tools=TOOLS, max_tokens=8000,
        )
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            # Fire the Stop hook; a hook can force the loop to continue
            force = trigger_hooks("Stop", messages)
            if force:
                messages.append({"role": "user", "content": force})
                continue
            return
        results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            # s04 change: fire the pre-tool-use hook
            blocked = trigger_hooks("PreToolUse", block)
            if blocked:
                results.append({"type": "tool_result", "tool_use_id": block.id,
                                "content": str(blocked)})
                continue
            handler = TOOL_HANDLERS.get(block.name)
            output = handler(**block.input) if handler else f"Unknown: {block.name}"
            # s04: fire the post-tool-use hook
            trigger_hooks("PostToolUse", block, output)
            results.append({"type": "tool_result", "tool_use_id": block.id, "content": output})
        messages.append({"role": "user", "content": results})

# WORKDIR stands for the directory the agent works in
SYSTEM = f"You are a coding agent at {WORKDIR}. Use tools to solve tasks. Act, don't explain."
```
1.4.2 CC source code
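| # | Topic | In the CC source |
| --- | --- | --- |
| 1 | Hook events: not 4 but 27 | Tool-related (3), session-related (5), user interaction (4), subagent (2), compaction-related (2), team-related (3), other (8) |
| 2 | HookResult (14 fields; commonly used ones excerpted) | message, blockingError, outcome, preventContinuation, stopReason, permissionBehavior, updatedInput, additionalContext, updatedMCPToolOutput |
| 3 | Key invariant: a hook 'allow' cannot bypass deny/ask rules | When a hook returns allow, the deny/ask rules in settings.json are still checked. |
| 4 | stopHookActive mechanism | Prevents infinite loops. |
| 5 | hook_stopped_continuation | Exits the loop. |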
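The invariant in row 3 can be stated as a small decision function. Only the rule itself (a hook's allow never overrides a deny or ask rule) comes from the table; resolve_permission, its string values, and the passthrough fallback are hypothetical names for illustration, not CC's API.

```python
from typing import Optional

def resolve_permission(hook_decision: Optional[str],
                       rule_decision: Optional[str]) -> str:
    """Combine a hook's decision with the decision from configured rules.

    hook_decision: "allow", "deny", "ask", or None (hook had no opinion)
    rule_decision: "deny", "ask", or None (no rule matched)
    """
    # A deny from either side is final
    if hook_decision == "deny" or rule_decision == "deny":
        return "deny"
    # An ask rule still applies even if the hook said allow
    if rule_decision == "ask" or hook_decision == "ask":
        return "ask"
    if hook_decision == "allow":
        return "allow"
    # Nobody decided: defer to the normal permission flow
    return "passthrough"
```

The effect is that hooks can tighten permissions freely but can only loosen them where no rule objects.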
1.4.3 Test prompts
a. Read the file README.md (should pass straight through; watch the hook log)
b. Create a file called test.txt (after it passes, check whether PostToolUse fires)
c. Delete all temporary files in /tmp (bash + rm triggers the permission hook)
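II. Stage Two
2.1 S05 Todos: plan first, so the agent's attention does not drift
2.1.1 Loop: the toolbox gains a todo_write tool, and the loop body gains a todo_write reminder. todo_write itself does no real work (it cannot read files or run commands; it only shows the status of each step of the task). Note the change to SYSTEM.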
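Since todo_write only records and displays state, its handler can be tiny. The sketch below assumes an input shape of a list of items with content and status fields and a one-item-in-progress rule; both are my assumptions, not the course's exact schema.

```python
VALID_STATUS = {"pending", "in_progress", "completed"}
TODOS: list[dict] = []  # in-memory only; lost when the process exits

def todo_write(todos: list[dict]) -> str:
    """Replace the whole todo list and return a rendered view for the model."""
    for item in todos:
        if item.get("status") not in VALID_STATUS:
            return f"Error: invalid status {item.get('status')!r}"
        if not item.get("content"):
            return "Error: every todo needs non-empty content"
    if sum(1 for t in todos if t["status"] == "in_progress") > 1:
        return "Error: only one todo may be in_progress at a time"
    TODOS[:] = todos
    marks = {"pending": "[ ]", "in_progress": "[>]", "completed": "[x]"}
    return "\n".join(f"{marks[t['status']]} {t['content']}" for t in TODOS) or "(no todos)"

print(todo_write([
    {"content": "Add type hints", "status": "completed"},
    {"content": "Add docstrings", "status": "in_progress"},
    {"content": "Add main guard", "status": "pending"},
]))
```

The rendered list goes back to the model as the tool result, so each update also re-reads the plan into recent context, which is what counters the drift.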
```python
rounds_since_todo = 0

def agent_loop(messages: list) -> None:
    global rounds_since_todo
    while True:
        # s05: after 3 rounds without todo_write, inject a reminder (not present in CC)
        if rounds_since_todo >= 3 and messages:
            messages.append({"role": "user",
                             "content": "<reminder>Update your todos.</reminder>"})
            rounds_since_todo = 0
        response = client.messages.create(
            model=MODEL, system=SYSTEM, messages=messages,
            tools=TOOLS, max_tokens=8000,
        )
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            force = trigger_hooks("Stop", messages)
            if force:
                messages.append({"role": "user", "content": force})
                continue
            return
        rounds_since_todo += 1
        results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            blocked = trigger_hooks("PreToolUse", block)
            if blocked:
                results.append({"type": "tool_result", "tool_use_id": block.id,
                                "content": str(blocked)})
                continue
            handler = TOOL_HANDLERS.get(block.name)
            output = handler(**block.input) if handler else f"Unknown: {block.name}"
            trigger_hooks("PostToolUse", block, output)
            # s05: reset the round counter whenever todo_write is called
            if block.name == "todo_write":
                rounds_since_todo = 0
            results.append({"type": "tool_result", "tool_use_id": block.id, "content": output})
        messages.append({"role": "user", "content": results})

# WORKDIR stands for the directory the agent works in
SYSTEM = (
    f"You are a coding agent at {WORKDIR}. "
    "Before starting any multi-step task, use todo_write to plan your steps. "
    "Update status as you go."
)
```
2.1.2 CC source code: two task systems coexist.
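| System | Description |
| --- | --- |
| TodoWrite (V1) | A simple list tool; the data lives in the in-memory AppState. |
| Task System (V2 = s12) | File persistence ({taskId}.json), dependency graph (blockedBy), concurrency locks (proper-lockfile), ownership |
2.1.3 Test prompts
a. Refactor s05_todo_write/example/hello.py: add type hints, docstrings, and a main guard (should list 3 steps first, then execute)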
b. Create a Python package under s05_todo_write/example/demo_pkg with __init__.py, utils.py, and tests/test_utils.py
c. Review Python files under s05_todo_write/example and fix any style issues
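2.2 S06 Subagents: split big tasks into small ones, each with its own context
The point of a subagent is not an extra role but an extra clean context.
2.2.1 Tools: a task tool is added for spawning subagents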
```python
TOOLS.append({
    "name": "task",
    "description": "Launch a subagent to handle a complex subtask. Returns only the final conclusion.",
    "input_schema": {
        "type": "object",
        "properties": {"description": {"type": "string"}},
        "required": ["description"],
    },
})
```
2.2.2 Main loop: the code is unchanged from S05, but SYSTEM changes.
```python
# WORKDIR stands for the directory the agent works in
SYSTEM = (
    f"You are a coding agent at {WORKDIR}. "
    "For complex sub-problems, use the task tool to spawn a subagent."
)
```
2.2.3 Subagent loop
```python
def spawn_subagent(description: str) -> str:
    """Spawn a subagent with fresh messages[], return summary only."""
    # A brand-new messages list: this is the context isolation
    messages = [{"role": "user", "content": description}]
    for _ in range(30):  # at most 30 iterations
        response = client.messages.create(
            model=MODEL, system=SUB_SYSTEM, messages=messages,
            tools=SUB_TOOLS, max_tokens=8000,
        )
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            break
        results = []
        for block in response.content:
            if block.type == "tool_use":
                # Issue 1: the subagent keeps the same safety policy
                blocked = trigger_hooks("PreToolUse", block)
                if blocked:
                    results.append({"type": "tool_result", "tool_use_id": block.id,
                                    "content": str(blocked)})
                    continue
                handler = SUB_HANDLERS.get(block.name)
                output = handler(**block.input) if handler else f"Unknown: {block.name}"
                trigger_hooks("PostToolUse", block, output)
                results.append({"type": "tool_result", "tool_use_id": block.id,
                                "content": output})
        messages.append({"role": "user", "content": results})
    # Issue 5: fallback if the safety limit is hit during tool_use
    result = extract_text(messages[-1]["content"])
    if not result:
        # The last message is a tool_result; look backwards for assistant text
        for msg in reversed(messages):
            if msg["role"] == "assistant":
                result = extract_text(msg["content"])
                if result:
                    break
    if not result:
        result = "Subagent stopped after 30 turns without final answer."
    return result  # only the final text conclusion is returned

# WORKDIR stands for the directory the agent works in
SUB_SYSTEM = (
    f"You are a coding agent at {WORKDIR}. "
    "Complete the task you were given, then return a concise summary. "
    "Do not delegate further."
)
```
SUB_TOOLS does not include the task tool, so a subagent cannot spawn another subagent. Recursion is ruled out.
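Two pieces that spawn_subagent relies on are not shown in these notes: extract_text and the construction of SUB_TOOLS. A possible sketch, assuming assistant content arrives as SDK-style objects with type and text attributes while tool results are plain dicts; the stand-in TOOLS list is only there so the example runs on its own.

```python
from types import SimpleNamespace

def extract_text(content) -> str:
    """Join the text blocks of a message; ignore tool_use and tool_result blocks."""
    if isinstance(content, str):
        return content.strip()
    parts = []
    for block in content:
        is_dict = isinstance(block, dict)
        btype = block.get("type") if is_dict else getattr(block, "type", None)
        if btype == "text":
            parts.append(block.get("text", "") if is_dict else block.text)
    return "\n".join(parts).strip()

# Stand-in tool list for the example; the real TOOLS carries full schemas
TOOLS = [{"name": "bash"}, {"name": "read_file"}, {"name": "task"}]
# Drop the task tool so the subagent cannot delegate again
SUB_TOOLS = [t for t in TOOLS if t["name"] != "task"]

reply = [SimpleNamespace(type="text", text="Uses pytest."),
         SimpleNamespace(type="tool_use", name="bash")]
summary = extract_text(reply)
```

Because tool_result dicts have no text blocks, extract_text returns an empty string for them, which is exactly what lets the fallback in spawn_subagent walk back to the last assistant message.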
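2.2.4 CC source code
| # | Topic | In the CC source |
| --- | --- | --- |
| 1 | Not one mode but three | Normal subagent, Fork Subagent, General-Purpose |
| 2 | Fork mode: built to share the prompt cache | It does not create a brand-new context; the goal is to hit the Anthropic API's prompt cache. |
| 3 | Exact granularity of context isolation | Subagents are not fully isolated: file-read state is shared. |
| 4 | Recursive fork guard | |
| 5 | Permission bubbling | A subagent's permission prompt bubbles up to the parent terminal. |
| 6 | Async vs sync | An asynchronous subagent notifies the parent agent through a notification mechanism when it finishes. |
2.2.5 Test prompts
a. Use a subtask to find what testing framework this project uses (the subagent reads the files; the main agent only receives the conclusion)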
b. Delegate: read all .py files in agents/ and summarize what each one does
c. Use a task to create s06_subagent/example/string_tools.py with a slugify(text: str) function, then verify it from the parent agent
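2.3 S08 Context compaction
The point of context compaction is to let the model keep its working continuity inside a shorter active context.
2.3.1 Loop: compaction calls are added, in this order: L3 (tool_result_budget: spill large results to disk, keeping the full content) -> L1 (snip: trim irrelevant old conversation, cutting the middle and keeping head and tail) -> L2 (micro: replace old tool results with placeholders) -> L4 (proactively call the LLM to compact)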
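To make the cheap layers concrete, here is a sketch of the L2 step (micro_compact) together with a crude estimate_size. The placeholder text, the keep_recent default, and the four-characters-per-token rule of thumb are assumptions for illustration; the course's implementations may differ.

```python
import json

KEEP_RECENT = 3                       # assumption: recent tool results left intact
PLACEHOLDER = "[old tool result cleared]"

def estimate_size(messages: list) -> int:
    """Very rough token estimate: about 4 characters per token."""
    return len(json.dumps(messages, default=str)) // 4

def micro_compact(messages: list, keep_recent: int = KEEP_RECENT) -> list:
    """L2: replace all but the most recent tool_result contents with a placeholder."""
    # Locate every tool_result part, oldest first
    positions = [
        (i, j)
        for i, msg in enumerate(messages)
        if msg["role"] == "user" and isinstance(msg["content"], list)
        for j, part in enumerate(msg["content"])
        if isinstance(part, dict) and part.get("type") == "tool_result"
    ]
    to_clear = set(positions[:-keep_recent]) if keep_recent else set(positions)
    touched = {i for i, _ in to_clear}
    compacted = []
    for i, msg in enumerate(messages):
        if i not in touched:
            compacted.append(msg)
            continue
        # Copy the parts so the caller's message objects are not mutated
        parts = [
            {**part, "content": PLACEHOLDER} if (i, j) in to_clear else part
            for j, part in enumerate(msg["content"])
        ]
        compacted.append({**msg, "content": parts})
    return compacted
```

The tool_use_id of each cleared result is kept, so every tool_use block in the history still has a matching tool_result and the message sequence stays valid for the API.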
```python
def agent_loop(messages: list) -> None:
    reactive_retries = 0
    while True:
        # s08 change: three preprocessing passes, 0 API calls
        # Same order as CC: budget → snip → micro
        messages[:] = tool_result_budget(messages)  # L3: spill large results to disk
        messages[:] = snip_compact(messages)        # L1: cut the middle
        messages[:] = micro_compact(messages)       # L2: placeholder old tool results
        # s08 change: over the token threshold → LLM summary (1 API call)
        if estimate_size(messages) > CONTEXT_LIMIT:
            # L4: ask the LLM to compact
            messages[:] = compact_history(messages)
        try:
            response = client.messages.create(
                model=MODEL, system=SYSTEM, messages=messages,
                tools=TOOLS, max_tokens=8000,
            )
            reactive_retries = 0  # reset on successful API call
        except Exception as e:
            too_long = ("prompt_too_long" in str(e).lower()
                        or "too many tokens" in str(e).lower())
            if too_long and reactive_retries < MAX_REACTIVE_RETRIES:
                # Emergency compaction, then retry
                messages[:] = reactive_compact(messages)
                reactive_retries += 1
                continue
            raise
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            return
        results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            # s08: the model called the compact tool on its own initiative
            if block.name == "compact":
                messages[:] = compact_history(messages)
                results.append({"type": "tool_result", "tool_use_id": block.id,
                                "content": "[Compacted. Conversation history has been summarized.]"})
                messages.append({"role": "user", "content": results})
                break  # end this turn; start the next one on the compacted context
            blocked = trigger_hooks("PreToolUse", block)
            if blocked:
                results.append({"type": "tool_result", "tool_use_id": block.id,
                                "content": str(blocked)})
                continue
            handler = TOOL_HANDLERS.get(block.name)
            output = handler(**block.input) if handler else f"Unknown: {block.name}"
            trigger_hooks("PostToolUse", block, output)
            results.append({"type": "tool_result", "tool_use_id": block.id,
                            "content": str(output)})
        else:
            # Normal path: no compaction
            messages.append({"role": "user", "content": results})
            continue
        # compact was called: results already appended above
        continue
```
2.3.2 CC source code
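| # | Topic | In the CC source |
| --- | --- | --- |
| 1 | Execution order | budget → snip → micro → collapse → auto |
| 2 | The read_file trade-off | Read is also placed in the set of tools eligible for microcompact, while readFileState is maintained alongside it. |
| 3 | contextCollapse | A separate context-management system; when enabled, it suppresses proactive autocompact. |
|  | sessionMemoryCompact | Before compact_history, first tries a lightweight summary built from the existing session memory, without calling the LLM. |
| 4 | What the compaction prompt looks like | 1. Tool calls are strictly forbidden 2. Analyze first, then summarize |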
2.3.3 Test prompts
a. Read the file README.md, then read code.py, then read s01_agent_loop/README.md (reads several files in a row; watch L2 compact the older results)
b. Read every file in s08_context_compact/ (reads a lot of content at once; watch L3 spill to disk)
c. Chat for 20+ turns and watch whether [auto compact] or [reactive compact] appears