🚀 Quick Install

Copy and run the following command to install this Skill:

npx skills add https://skills.sh/supercent-io/skills-template/langsmith

💡 Tip: requires Node.js and npm

langsmith — LLM Observability, Evaluation & Prompt Management

Keywords: langsmith · llm tracing · llm evaluation · @traceable · langsmith evaluate

LangSmith is a framework-agnostic platform for developing, debugging, and deploying LLM applications.
It provides end-to-end tracing, quality evaluation, prompt versioning, and production monitoring.

When to use this skill

  • Add tracing to any LLM pipeline (OpenAI, Anthropic, LangChain, custom models)
  • Run offline evaluations against curated datasets with evaluate()
  • Set up production monitoring and online evaluation
  • Manage and version prompts in the Prompt Hub
  • Create datasets for regression testing and benchmarking
  • Attach human or automated feedback to traces
  • Score outputs with LLM-as-judge via openevals
  • Debug agent failures by inspecting end-to-end traces

Instructions

  1. Install the SDK: pip install -U langsmith (Python) or npm install langsmith (TypeScript)
  2. Set the environment variables LANGSMITH_TRACING=true and LANGSMITH_API_KEY=lsv2_...
  3. Instrument your code with the @traceable decorator or the wrap_openai() wrapper
  4. View traces at smith.langchain.com
  5. For evaluation setup, see references/python-sdk.md
  6. For CLI commands, see references/cli.md
  7. Run bash scripts/setup.sh to configure the environment automatically

API Key: get one from smith.langchain.com → Settings → API Keys
Docs: https://docs.langchain.com/langsmith


Quick Start

Python

pip install -U langsmith openai
export LANGSMITH_TRACING=true
export LANGSMITH_API_KEY="lsv2_..."
export OPENAI_API_KEY="sk-..."

from langsmith import traceable
from langsmith.wrappers import wrap_openai
from openai import OpenAI

client = wrap_openai(OpenAI())

@traceable
def rag_pipeline(question: str) -> str:
    """自动在 LangSmith 中追踪 (Automatically traced in LangSmith)"""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}]
    )
    return response.choices[0].message.content

result = rag_pipeline("What is LangSmith?")

TypeScript

npm install langsmith openai
export LANGSMITH_TRACING=true
export LANGSMITH_API_KEY="lsv2_..."

import { traceable } from "langsmith/traceable";
import { wrapOpenAI } from "langsmith/wrappers";
import { OpenAI } from "openai";

const client = wrapOpenAI(new OpenAI());

const pipeline = traceable(async (question: string): Promise<string> => {
  const res = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: question }],
  });
  return res.choices[0].message.content ?? "";
}, { name: "RAG Pipeline" });

await pipeline("What is LangSmith?");

Core Concepts

Run: a single operation (an LLM call, tool call, or retrieval). The basic unit.
Trace: all runs from a single user request, linked by trace_id.
Thread: multiple traces in one conversation, linked by session_id / thread_id.
Project: a container that groups related traces (set via LANGSMITH_PROJECT).
Dataset: a collection of {inputs, outputs} examples for offline evaluation.
Experiment: the result set produced by running evaluate() against a dataset.
Feedback: a score or label attached to a run; numeric, categorical, or free-form.

Tracing

The @traceable decorator (Python)

from langsmith import traceable

@traceable(
    run_type="chain",          # llm | chain | tool | retriever | embedding
    name="My Pipeline",
    tags=["production", "v2"],
    metadata={"version": "2.1", "env": "prod"},
    project_name="my-project"
)
def pipeline(question: str) -> str:
    return generate_answer(question)

Selective tracing context

import langsmith as ls

# Enable tracing for this block only
with ls.tracing_context(enabled=True, project_name="debug"):
    result = chain.invoke({"input": "..."})

# Disable tracing even though LANGSMITH_TRACING=true
with ls.tracing_context(enabled=False):
    result = chain.invoke({"input": "..."})

Wrapping provider clients

from langsmith.wrappers import wrap_openai, wrap_anthropic
from openai import OpenAI
import anthropic

openai_client = wrap_openai(OpenAI())           # All calls auto-traced
anthropic_client = wrap_anthropic(anthropic.Anthropic())

Distributed tracing (microservices)

from langsmith.run_helpers import get_current_run_tree
import langsmith

@langsmith.traceable
def service_a(inputs):
    rt = get_current_run_tree()
    headers = rt.to_headers()     # Pass to the child service
    return call_service_b(headers=headers)

@langsmith.traceable
def service_b(x, headers):
    with langsmith.tracing_context(parent=headers):
        return process(x)

Evaluation

Basic evaluation with evaluate()

from langsmith import Client
from langsmith.wrappers import wrap_openai
from openai import OpenAI

client = Client()
oai = wrap_openai(OpenAI())

# 1. Create a dataset
dataset = client.create_dataset("Geography QA")
client.create_examples(
    dataset_id=dataset.id,
    examples=[
        {"inputs": {"q": "Capital of France?"}, "outputs": {"a": "Paris"}},
        {"inputs": {"q": "Capital of Germany?"}, "outputs": {"a": "Berlin"}},
    ]
)

# 2. Target function
def target(inputs: dict) -> dict:
    res = oai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": inputs["q"]}]
    )
    return {"a": res.choices[0].message.content}

# 3. Evaluator
def exact_match(inputs, outputs, reference_outputs):
    return outputs["a"].strip().lower() == reference_outputs["a"].strip().lower()

# 4. Run the experiment
results = client.evaluate(
    target,
    data="Geography QA",
    evaluators=[exact_match],
    experiment_prefix="gpt-4o-mini-v1",
    max_concurrency=4
)

LLM-as-judge with openevals

pip install -U openevals

from openevals.llm import create_llm_as_judge
from openevals.prompts import CORRECTNESS_PROMPT

judge = create_llm_as_judge(
    prompt=CORRECTNESS_PROMPT,
    model="openai:o3-mini",
    feedback_key="correctness",
)

results = client.evaluate(target, data="my-dataset", evaluators=[judge])

Evaluation types

Code/Heuristic: exact match, format checks, rule-based scoring.
LLM-as-judge: subjective quality, safety, reference-free grading.
Human: annotation queues, pairwise comparison.
Pairwise: compare two versions of an application.
Online: production traces, real traffic.
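As a sketch of the code/heuristic category, the evaluator below scores 1 only when the target's output is valid JSON containing an "answer" key; the returned dict shape ({"key": ..., "score": ...}) mirrors the feedback format accepted by evaluate(). The function and feedback key names are illustrative.

```python
import json

# Heuristic evaluator: purely rule-based, no LLM involved.
def valid_json_answer(inputs: dict, outputs: dict, reference_outputs: dict) -> dict:
    try:
        parsed = json.loads(outputs["a"])
    except (json.JSONDecodeError, TypeError):
        return {"key": "valid_json", "score": 0}
    has_answer = isinstance(parsed, dict) and "answer" in parsed
    return {"key": "valid_json", "score": int(has_answer)}
```

It would then be passed alongside other evaluators, e.g. evaluators=[exact_match, valid_json_answer].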

Prompt Hub

from langsmith import Client
from langchain_core.prompts import ChatPromptTemplate

client = Client()

# Push a prompt
prompt = ChatPromptTemplate([
    ("system", "You are a helpful assistant."),
    ("user", "{question}"),
])
client.push_prompt("my-assistant-prompt", object=prompt)

# Pull and use
prompt = client.pull_prompt("my-assistant-prompt")
# Pull a specific version:
prompt = client.pull_prompt("my-assistant-prompt:abc123")

Feedback

from langsmith import Client
import uuid

client = Client()

# Use a custom run ID so feedback can be linked later
my_run_id = str(uuid.uuid4())
result = chain.invoke({"input": "..."}, {"run_id": my_run_id})

# Attach feedback
client.create_feedback(
    key="correctness",
    score=1,              # 0-1 numeric, or categorical
    run_id=my_run_id,
    comment="Accurate and concise"
)

References

📄 Original documentation

Full documentation:

https://skills.sh/supercent-io/skills-template/langsmith

