potiuk opened a new pull request, #67722: URL: https://github.com/apache/airflow/pull/67722
## Summary

`SelectiveChecks._is_large_enough_pr`'s line-count gate previously summed every changed line (minus a small set of lock/newsfragment exclusions), so a 1000-line **docs** or **test-only** PR was treated as the same risk shape as a 1000-line scheduler change and would force the full CI matrix. The file-count gate stays unchanged; only the line-count check is narrowed to production code.

The change reuses the existing `FileGroupForCi.*_PRODUCTION_FILES` infrastructure rather than inventing a new pattern set, and tightens those existing groups so they actually match production-only paths (currently `PYTHON_PRODUCTION_FILES` matches provider tests and is missing `task-sdk/src`, `airflow-ctl/src`, `shared/*/src`; `JAVASCRIPT_PRODUCTION_FILES` doesn't exclude `openapi-gen/`).

## What changes

- `FileGroupForCi.PYTHON_PRODUCTION_FILES`: drop test paths, add `task-sdk/src/`, `airflow-ctl/src/`, `shared/*/src/`, exclude `openapi-gen/`, `i18n/locales/`, and `*_generated.py` within those trees. Tighten anchors on `pyproject.toml` and `hatch_build.py`.
- `FileGroupForCi.JAVASCRIPT_PRODUCTION_FILES`: exclude `openapi-gen/` and `i18n/locales/`.
- `SelectiveChecks._is_large_enough_pr`: line-count gate now operates on the deduped union of `PYTHON_PRODUCTION_FILES`, `JAVASCRIPT_PRODUCTION_FILES`, and `HELM_FILES`. The file-count gate is unchanged.

## Side benefit

`run_python_scans` / `run_javascript_scans` (the other consumers of `*_PRODUCTION_FILES`) also become more accurate: SAST and SCA scans now target production code, not test code.
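## Illustrative sketch

To make the narrowed line-count gate concrete, here is a minimal, self-contained sketch of the idea. It is not the code in `dev/breeze`. The regex patterns, the helper names `production_files` and `is_large_by_line_count`, and the `LINE_THRESHOLD` constant are all hypothetical. The value 500 is inferred from the "production ≥ 500" test case, and applying the exclusions to every group (including Helm) is a simplification.

```python
import re

# Hypothetical patterns, loosely modelled on the groups named in this PR.
# The real FileGroupForCi definitions may look quite different.
PYTHON_PRODUCTION = [
    r"^airflow-core/src/.*\.py$",
    r"^task-sdk/src/.*\.py$",
    r"^airflow-ctl/src/.*\.py$",
    r"^shared/[^/]+/src/.*\.py$",
    r"^providers/.*/src/.*\.py$",
]
JAVASCRIPT_PRODUCTION = [r"^airflow-core/src/.*\.(js|jsx|ts|tsx)$"]
HELM = [r"^chart/"]

# Generated or translated content that should not count as production code.
EXCLUDED = [r"/openapi-gen/", r"/i18n/locales/", r"_generated\.py$"]

# Assumed threshold, inferred from the "production >= 500" test case.
LINE_THRESHOLD = 500


def production_files(changed_paths):
    """Return the deduplicated union of paths matching any production group."""
    groups = PYTHON_PRODUCTION + JAVASCRIPT_PRODUCTION + HELM
    selected = set()  # a set dedupes paths that match more than one group
    for path in changed_paths:
        if any(re.search(pattern, path) for pattern in EXCLUDED):
            continue
        if any(re.search(pattern, path) for pattern in groups):
            selected.add(path)
    return selected


def is_large_by_line_count(changed_lines):
    """changed_lines maps each changed path to its number of changed lines."""
    counted = production_files(changed_lines)
    return sum(changed_lines[path] for path in counted) >= LINE_THRESHOLD


mixed_pr = {
    "airflow-core/src/airflow/jobs/scheduler_job_runner.py": 600,
    "airflow-core/tests/unit/jobs/test_scheduler_job.py": 900,
}
print(is_large_by_line_count(mixed_pr))
```

In this sketch only the 600 production lines of `mixed_pr` are counted, so the gate trips. Shrinking the production change to 200 lines keeps it below the threshold no matter how large the test file is.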
## Tests

Five new `test_large_pr_by_line_count` cases added:

- Large test-only PR: does not trigger
- Large docs-only PR: does not trigger
- Generated-only large PR (`openapi-gen`, `*_generated.py`): does not trigger
- Mixed PR with 600 production lines + tests: triggers (production ≥ 500)
- Mixed PR with 200 production lines + tests: does not trigger (test lines excluded)

The existing three line-count cases and all 159 other selective-checks tests still pass (167/167 total).

## Test plan

- [x] `uv run --project dev/breeze pytest dev/breeze/tests/test_selective_checks.py -q`: 167 passed
- [ ] CI green against the latest `main`
- [ ] Spot-check that representative production-only PRs still trip the threshold

---

##### Was generative AI tooling used to co-author this PR?

- [X] Yes: Claude Code (Opus 4.7, 1M context)

Generated-by: Claude Code (Opus 4.7, 1M context) following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
