`EvalContext` accumulates evaluation data during a run and builds into an immutable `EvalResult`.
## Pre-populated Fields
Set context fields directly in the `@eval` decorator:
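A minimal sketch; the import path, the exact keyword arguments `@eval` accepts, and the convention of receiving the context as a `ctx` argument are all assumptions here, not confirmed API:

```python
from twevals import eval  # assumed import path (shadows the builtin eval)

@eval(
    input="What is 2 + 2?",      # pre-populates ctx.input
    reference="4",               # pre-populates ctx.reference
    metadata={"suite": "math"},  # pre-populates ctx.metadata
)
def test_addition(ctx):
    # Fields set in the decorator are already on the context
    ctx.output = "4"
    assert ctx.output == ctx.reference
```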
With `@parametrize`, the special parameter names `input`, `reference`, `metadata`, `run_data`, and `latency` auto-populate the corresponding context fields:
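A hedged sketch, assuming a pytest-style `@parametrize` signature (a names string plus a list of value tuples):

```python
from twevals import eval, parametrize  # assumed import path

@eval()
@parametrize("input,reference", [
    ("ping", "pong"),
    ("marco", "polo"),
])
def test_call_and_response(ctx, input, reference):
    # ctx.input and ctx.reference were populated from the parameters above
    ctx.output = {"ping": "pong", "marco": "polo"}[input]
    assert ctx.output == reference
```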
## Setting Fields
### Input, Output, and Reference
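Assign these fields directly on the context. A sketch, keeping the `ctx`-argument assumption from above:

```python
def translate(text: str) -> str:
    # Stand-in for the system under test
    return {"bonjour": "hello"}.get(text, text)

@eval()
def test_translation(ctx):
    ctx.input = "bonjour"
    ctx.reference = "hello"
    ctx.output = translate(ctx.input)
    assert ctx.output == ctx.reference
```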
### Metadata
Store additional context for debugging and analysis:
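A sketch, assuming `metadata` and `run_data` start as empty dicts (the property table below lists both as `dict`):

```python
@eval()
def test_summary(ctx):
    ctx.input = "long article text..."
    ctx.output = "short summary"
    # Arbitrary keys, surfaced later on the EvalResult for analysis
    ctx.metadata["model"] = "example-model-v1"
    ctx.metadata["prompt_tokens"] = 412
    ctx.run_data["raw_response"] = {"finish_reason": "stop"}
```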
## Scoring with Assertions

Use plain `assert` statements to record pass/fail scores:
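For example, under the same assumed conventions as the sketches above:

```python
@eval()
def test_capital(ctx):
    ctx.input = "What is the capital of France?"
    ctx.output = "Paris"
    # A passing assertion yields a passing score; a failed assert records a failing one
    assert "paris" in ctx.output.lower()
```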
## Advanced Scoring with `add_score()`

For numeric scores or multiple named metrics:
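The `add_score(value, notes, key)` signature comes from the API reference below; the rest of this sketch keeps the earlier assumptions:

```python
@eval()
def test_answer_quality(ctx):
    ctx.output = "Paris is the capital of France."
    # One numeric metric with an explanatory note
    ctx.add_score(0.9, notes="close paraphrase of the reference", key="similarity")
    # Several named metrics can live on the same result
    ctx.add_score(True, key="contains_answer")
    ctx.add_score(0.4, notes="slightly verbose", key="conciseness")
```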
## Auto-Return Behavior

You don't need to explicitly return anything; the context automatically builds into an `EvalResult` when the function completes:
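So the following sketch needs no `return`; `build()` (listed in the API reference) is presumably invoked by the framework on completion:

```python
@eval()
def test_no_explicit_return(ctx):
    ctx.input = "2 + 2"
    ctx.output = "4"
    ctx.add_score(True, key="exact_match")
    # No return statement: an EvalResult is built from ctx automatically
```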
## Exception Safety
If your evaluation throws an exception, partial data is preserved:
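A sketch of the failure path:

```python
@eval()
def test_flaky_pipeline(ctx):
    ctx.input = "some prompt"
    ctx.output = "partial output"
    # Everything set on ctx up to this point survives into the EvalResult
    raise RuntimeError("downstream service timed out")
```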
The resulting `EvalResult` will have:

- `input` and `output` preserved
- An `error` field with the exception message
- A failing score automatically added
## Default Scoring
If no score is added and no assertions fail, Twevals auto-adds a passing score:
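Under the same assumed conventions, the minimal evaluation is just:

```python
@eval()
def test_smoke(ctx):
    ctx.input = "ping"
    ctx.output = "pong"
    # No add_score() and no failing assertion: a passing score is added for you
```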
## Custom Parameters

For parametrized tests with custom parameter names, include them in your function signature:
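A sketch with two hypothetical custom parameters, `city` and `expected_country`:

```python
def lookup_country(city: str) -> str:
    # Stand-in for the system under test
    return {"Paris": "France", "Tokyo": "Japan"}.get(city, "unknown")

@eval()
@parametrize("city,expected_country", [
    ("Paris", "France"),
    ("Tokyo", "Japan"),
])
def test_capitals(ctx, city, expected_country):
    # Custom names are not auto-mapped, so they must appear in the signature
    ctx.input = city
    ctx.output = lookup_country(city)
    assert ctx.output == expected_country
```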
## API Reference

| Method | Description |
|---|---|
| `add_score(value, notes, key)` | Add a score (boolean or numeric) |
| `build()` | Convert to an immutable `EvalResult` |
| Property | Type | Description |
|---|---|---|
| `input` | `Any` | The test input |
| `output` | `Any` | The system output |
| `reference` | `Any` | Expected output (optional) |
| `metadata` | `dict` | Custom metadata |
| `run_data` | `dict` | Debug/trace data |
| `latency` | `float` | Execution time |
| `scores` | `list` | List of `Score` objects |
