Production Prompt Engineering 提示工程

Released已發布

methodology methodology

Debug and harden production LLM prompts — handle prompt injection, output format drift, instruction forgetting in long contexts, and cross-model portability issues. Use this skill when the user ships an LLM-powered feature to production and needs to diagnose why outputs are inconsistent, unsafe, or regressed after model updates — NOT for basic 'write a better prompt' questions.

技術方法論技能：Production Prompt Engineering 分析與應用。

View on GitHub在 GitHub 查看

Overview概述

This skill addresses the failure modes that appear ONLY in production LLM applications: prompt injection, output format drift, silent regression across model versions, instruction decay in long contexts, and hallucination under pressure. It is NOT a tutorial on few-shot or chain-of-thought — assume the agent already knows basic prompting techniques.

When to Use使用時機

Trigger conditions:

A production LLM feature is misbehaving (inconsistent, unsafe, format-drifting)
Designing a system prompt for a multi-tenant application
Hardening prompts against injection or jailbreak attempts
Diagnosing regression after a model version update

When NOT to use:

Basic "how do I write a prompt" — the agent already knows few-shot, CoT, role-play
One-off content generation (just write the prompt directly)
RAG architecture design (use a RAG-specific skill)

Methodology 方法論

Phase 1: Reproduce the Failure

Collect: exact input, exact output, expected output, model + version, temperature. Reproduce in isolation (outside the app) to rule out application bugs. Gate: Failure reproduces consistently in a minimal test case.

Phase 2: Classify the Failure Mode

Match against the table above. Most production failures fall into one of 6 categories. Don't guess — identify which mode applies. Gate: Failure mode classified with evidence.

Phase 3: Apply the Targeted Fix

Fix the SPECIFIC failure mode. Don't rewrite the whole prompt. Generic rewrites often introduce new failure modes. Gate: Fix addresses root cause, not symptom.

Phase 4: Build a Regression Test

Add the failing case to a regression test suite. Run the suite before every prompt change or model version update. Gate: Test suite catches the original failure AND any reintroduction.

Output Format輸出格式

# Prompt Debug Report: {Feature Name}

Gotchas注意事項

"Ignore previous instructions" is only the beginning: Modern injection uses role-play ("Pretend you are DAN..."), language switching, Unicode tricks, and encoded payloads. Defense requires input validation AND output validation, not just instruction phrasing.
Temperature 0 is not deterministic across calls: Even at T=0, outputs can vary across API calls due to backend GPU non-determinism (batch effects). Don't rely on exact string equality in tests; use semantic or schema equality.
Few-shot examples override your instructions: If your examples show 500-word responses and you say "be concise", the model follows the examples. Examples are STRONGER than instructions.
System prompts are NOT absolute: Even with a system prompt, sufficiently adversarial user input can override behavior. System prompts are a strong hint, not a security boundary. For real security, use output validation and sandboxing.
Provider model updates are silent: OpenAI's "gpt-4" alias changes weights without notice. Pin to dated versions (gpt-4-0613) for stability. Rerun regression tests after every update.
Context window size ≠ effective context: A 128K context model may only attend well to the first 32K and last 4K. Put critical instructions at START and END, not in the middle ("lost in the middle" effect).

References參考資料

For prompt injection attack patterns, see references/injection-patterns.md
For regression testing frameworks, see references/regression-testing.md
For cross-model prompt portability, see references/cross-model-testing.md

Tags標籤

technologyllmproductiondebugging