Inputs & Outputs
Inputs
| Input | Required | Default | Description |
|---|---|---|---|
| skill-name | Yes | - | Name of the skill to evaluate |
| skill-path | Yes | - | Path to the skill directory (must contain SKILL.md and evals/) |
| anthropic-api-key | Yes | - | Anthropic API key for the claude CLI |
| pass-threshold | No | 80 | Minimum pass rate (0-100) to succeed |
| timeout | No | 120 | Timeout per eval case in seconds |
| post-comment | No | true | Post results as a PR comment |
| github-token | No | github.token | Token for PR comments |
| upload-viewer | No | true | Upload eval viewer HTML as artifact |
| node-version | No | 22 | Node.js version for claude CLI |
| max-retries | No | 3 | Max retry attempts per API call |
| retry-delay | No | 10 | Base delay between retries (seconds) |
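
As a rough illustration of how the optional inputs combine with the required ones, the step below overrides several defaults. It is a sketch only: the values are arbitrary examples chosen for illustration, not recommendations, and the skill name and path are placeholders.

```yaml
# Illustrative step: every value here is an example, not a recommended setting.
- uses: skill-bench/skill-eval-action@v1
  with:
    # Required inputs
    skill-name: my-skill
    skill-path: skills/my-skill
    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
    # Optional inputs (defaults: 80, 120, 3, 10, true)
    pass-threshold: 90    # require a pass rate of at least 90 to succeed
    timeout: 300          # allow each eval case up to 300 seconds
    max-retries: 5        # retry each API call up to 5 times
    retry-delay: 20       # base delay of 20 seconds between retries
    post-comment: false   # do not post results as a PR comment
```

Any input left out of `with:` keeps the default listed in the table.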
Outputs
| Output | Description |
|---|---|
| pass-rate | Overall pass rate as percentage (0-100) |
| passed | Total criteria passed |
| total | Total criteria evaluated |
| cases-run | Number of eval cases executed |
Using outputs
- uses: skill-bench/skill-eval-action@v1
  id: eval
  with:
    skill-name: my-skill
    skill-path: skills/my-skill
    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
- run: echo "Pass rate was ${{ steps.eval.outputs.pass-rate }}%"
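
The remaining outputs can be read through the same `steps.eval.outputs` context. The sketch below writes all four to the job summary; it assumes the eval step has `id: eval` as above. The `if: always()` condition is an assumption added here so the summary step still runs when the eval step fails, and whether the outputs are populated in that case is not stated in this document.

```yaml
# Hypothetical follow-up step: the step name and summary layout are illustrative.
- name: Summarize eval results
  if: always()   # assumption: run even when the eval step fails its threshold
  run: |
    {
      echo "Pass rate: ${{ steps.eval.outputs.pass-rate }}%"
      echo "Criteria passed: ${{ steps.eval.outputs.passed }} of ${{ steps.eval.outputs.total }}"
      echo "Eval cases run: ${{ steps.eval.outputs.cases-run }}"
    } >> "$GITHUB_STEP_SUMMARY"
```

`GITHUB_STEP_SUMMARY` is the file path GitHub Actions provides for job summaries, so the three lines appear on the workflow run page rather than only in the log.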