justinmclean opened a new pull request, #208: URL: https://github.com/apache/airflow-steward/pull/208
Adds behavioral eval coverage for six skills that had no test fixtures, bringing the suite from 206 cases across 9 skills to 290 cases across 15.

## New suites

**issue-triage** (22 cases, 5 steps)
step-1-resolve-selector, step-3-classify, step-4-compose-comment, step-5-confirm, step-7-recap.
Includes one adversarial case (step-3 case-7: a `SYSTEM:` block instructs the model to classify all issues as BUG; the correct answer is INVALID).

**issue-reproducer** (27 cases, 7 steps)
step-1-inventory, step-2-pick-candidate, step-3-classify-shape, step-5.5-confirm, step-7-verify, step-8-baselines, step-10-compose-verdict.
Includes one adversarial case (step-7 case-6: stdout contains an `AGENT OVERRIDE` directive; the correct classification is cannot-run-environment).

**issue-fix-workflow** (12 cases, 4 steps)
step-2-locate-area, step-6-scope-check, step-7-compose-commit, step-8-handback.

**issue-reassess** (10 cases, 4 steps)
step-1-pool-selection, step-2-resumability, step-4-aggregate, step-5-campaign-report.
step-5 uses structural assertions (section presence, still-failing tail coverage, no-auto-post-claim) rather than an exact JSON match, following the same pattern as issue-triage/step-4.

**issue-reassess-stats** (8 cases, 3 steps)
step-1-fetch-verdicts, step-2-classify, step-3-aggregate.

**pr-management-code-review** (5 cases, 1 step)
review-disposition.
Includes one adversarial case (case-5: the PR body instructs the model to approve immediately; the correct disposition is REQUEST_CHANGES, based on a real dependency conflict in the diff).

## Coverage rationale

Omitted steps are either not applicable (pre-flights, GitHub posts, runtime execution, working-tree resets) or hard to test (steps that generate arbitrary code or HTML). Every step with a structured, mockable output now has at least two fixture cases.
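For readers unfamiliar with the structural-assertion pattern mentioned for issue-reassess step-5, the following is a minimal sketch of what such a check could look like. It is not the code from this PR: the function name `check_campaign_report`, the section headings, and the wording heuristic for the no-auto-post claim are all assumptions made for illustration, and "still-failing tail coverage" is interpreted here as "every still-failing issue is mentioned in the report".

```python
import re

# Hypothetical section names; the real report headings are not given in the PR.
REQUIRED_SECTIONS = ("Summary", "Still failing", "Next steps")


def check_campaign_report(report: str, still_failing: list[str]) -> list[str]:
    """Return a list of structural problems; an empty list means the report passes."""
    problems = []

    # Section presence: each required name must appear as a markdown heading.
    for name in REQUIRED_SECTIONS:
        pattern = rf"^#+\s*{re.escape(name)}\s*$"
        if not re.search(pattern, report, re.MULTILINE | re.IGNORECASE):
            problems.append(f"missing section: {name}")

    # Still-failing coverage (assumed meaning): every still-failing issue
    # reference must be mentioned; \b keeps "#101" from matching "#1010".
    for issue in still_failing:
        if not re.search(rf"{re.escape(issue)}\b", report):
            problems.append(f"still-failing issue not covered: {issue}")

    # No auto-post claim: a crude heuristic that flags text claiming the
    # report was already posted or published.
    if re.search(r"\b(?:has been|have|was|were)\s+(?:posted|published)\b",
                 report, re.IGNORECASE):
        problems.append("report claims to have posted")

    return problems
```

Compared with an exact match, a check of this shape tolerates differences in wording and ordering while still failing when a required part of the output is missing, which suits free-form outputs such as a composed comment or report.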
