justinmclean opened a new pull request, #239:
URL: https://github.com/apache/airflow-steward/pull/239
8 new eval cases across 3 suites, filling the gaps in `issue-fix-workflow`
coverage (steps 3, 4, and 5 had no fixtures).
## Changes
Three new suites under `tools/skill-evals/evals/issue-fix-workflow/`:
- `step-3-failing-test` (3 cases) assesses a proposed regression test:
  - `case-1-test-fails-as-expected`: issue key present, test adapted from
    the reproducer, run confirms FAILED → accept
  - `case-2-missing-issue-key`: test omits the issue key reference → reject
  - `case-3-test-passes-on-main`: test passes before any fix
    (silent-broken-test trap) → surface-gap
- `step-4-production-change` (3 cases) assesses a proposed production fix:
  - `case-1-minimal-fix-proceeds`: root cause fixed, diff clean, targeted
    test green → proceed
  - `case-2-symptom-masks-root-cause`: a symptom guard makes the test pass
    but the root cause is unaddressed → iterate
  - `case-3-drive-by-in-diff`: correct fix, but the diff includes an
    unrelated whitespace change → iterate
- `step-5-module-test-run` (2 cases) interprets module test run output:
  - `case-1-clean-module-run`: all tests pass → proceed
  - `case-2-regression-introduced`: fix breaks an adjacent round-trip test
    → iterate
`README.md` updated: suite table and case count (12 → 20).
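
For reviewers who want the expected verdicts at a glance, here is a rough Python sketch of the decision each suite exercises. It is illustrative only: the dataclasses, field names, and functions are hypothetical and are not taken from the fixtures or the skill itself. The order of the step 3 checks (missing issue key before passes-on-main) is also an assumption, since no case in this PR combines both conditions.

```python
from dataclasses import dataclass


@dataclass
class FailingTestReport:
    """Hypothetical summary of a proposed regression test (step 3)."""
    references_issue_key: bool  # the test mentions the issue key
    fails_on_main: bool         # the run before any fix reports FAILED


@dataclass
class ProductionChangeReport:
    """Hypothetical summary of a proposed production fix (step 4)."""
    fixes_root_cause: bool      # not just a guard around the symptom
    diff_is_clean: bool         # no drive-by changes such as whitespace
    targeted_test_green: bool   # the regression test now passes


def step_3_verdict(report: FailingTestReport) -> str:
    # Assumed precedence: a missing issue key is checked first.
    if not report.references_issue_key:
        return "reject"
    # A test that already passes on main proves nothing about the bug.
    if not report.fails_on_main:
        return "surface-gap"
    return "accept"


def step_4_verdict(report: ProductionChangeReport) -> str:
    # Every condition must hold; any miss sends the fix back for rework.
    if (report.fixes_root_cause
            and report.diff_is_clean
            and report.targeted_test_green):
        return "proceed"
    return "iterate"


def step_5_verdict(failed_tests: list) -> str:
    # Any failure in the module run, including adjacent tests, blocks.
    return "iterate" if failed_tests else "proceed"
```

Note that `case-2-symptom-masks-root-cause` has a green targeted test and still maps to iterate, which is why the step 4 sketch does not key off the test result alone.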