steveahnahn opened a new pull request, #62687: URL: https://github.com/apache/airflow/pull/62687
## Summary

Fix scheduler DagRun creation transaction poisoning after DB errors.

When `_create_dag_runs` processes multiple DAGs in one scheduling loop, a DB error during one `create_dagrun()` call can invalidate the SQLAlchemy transaction state for the shared session. That can cause unrelated DAGs later in the same loop to fail due to pending rollback state instead of their own logic.

Changes:

- Isolates each scheduled DagRun creation attempt with `session.begin_nested()` (savepoint), so a failure in one Dag is rolled back locally and does not poison the rest of the loop.
- Captures `dag_id` early and uses that value in exception logging to avoid additional ORM/session access after a transaction failure.

### Test coverage

Added the `test_create_dag_runs_recovers_after_db_error` regression test.

The test injects a real DB flush error for the first DAG creation attempt and verifies:

1. The scheduler logs the failure.
2. The first DAG run is not created.
3. A second DAG in the same `_create_dag_runs` call is still created successfully.

related: #59120

---

##### Was generative AI tooling used to co-author this PR?

- [X] Yes (OpenAI Codex)

Generated-by: OpenAI Codex following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
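For readers unfamiliar with the savepoint pattern, here is a minimal sketch of per-item isolation inside one shared transaction. It is not the scheduler code from this PR: it uses the standard-library `sqlite3` module with raw `SAVEPOINT` statements as a stand-in for SQLAlchemy's `session.begin_nested()`, and the `create_runs` function, the `audit_log` table, and the simplified `dag_run` schema are all hypothetical. Note also that SQLite does not leave the outer transaction unusable after a failed statement, so the sketch shows only the local rollback, not the poisoning itself.

```python
import sqlite3


def create_runs(conn, dag_ids):
    """Try to create one run per dag_id; return the dag_ids that failed."""
    failed = []
    conn.execute("BEGIN")  # one outer transaction shared by the whole loop
    for dag_id in dag_ids:  # dag_id is a plain str, so logging it needs no DB access
        conn.execute("SAVEPOINT create_run")
        try:
            # Two writes per attempt: the second one may fail.
            conn.execute("INSERT INTO audit_log (dag_id) VALUES (?)", (dag_id,))
            conn.execute(
                "INSERT INTO dag_run (dag_id, run_id) VALUES (?, ?)",
                (dag_id, "scheduled_1"),
            )
        except sqlite3.DatabaseError:
            # Undo only this attempt's writes; earlier items stay intact.
            conn.execute("ROLLBACK TO SAVEPOINT create_run")
            conn.execute("RELEASE SAVEPOINT create_run")
            failed.append(dag_id)
        else:
            conn.execute("RELEASE SAVEPOINT create_run")
    conn.execute("COMMIT")
    return failed


# isolation_level=None puts the connection in autocommit mode,
# so BEGIN / SAVEPOINT / COMMIT are issued explicitly above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, UNIQUE (dag_id, run_id))"
)
conn.execute("CREATE TABLE audit_log (dag_id TEXT)")
# A pre-existing row makes the insert for 'dag_a' violate the UNIQUE constraint.
conn.execute("INSERT INTO dag_run VALUES ('dag_a', 'scheduled_1')")

failed = create_runs(conn, ["dag_a", "dag_b"])
```

In this sketch the attempt for `dag_a` fails on its second insert, its `audit_log` row is rolled back with the savepoint, and `dag_b` is still created in the same transaction.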
