spider-man-tm opened a new pull request, #68678: URL: https://github.com/apache/airflow/pull/68678
This prevents a Dag parsed from one file from overwriting an existing active Dag with the same `dag_id` from a different file.

When the Dag processor parses files one at a time, duplicate `dag_id` checks inside a single `DagBag` can miss duplicates that live in separate files. Before this change, syncing the later parsed file to the metadata database could update the existing `DagModel` and serialized Dag row, effectively making the last parsed file win.

This change adds a database-level collision check during Dag parsing result sync. If an active Dag with the same `dag_id` already exists for a different file, the incoming Dag is skipped and recorded as an import error using the existing `AirflowDagDuplicatedIdException` behavior.

related: #29321

## Tests

Added a regression test covering this flow:

- sync an original Dag from `original.py`
- sync another Dag with the same `dag_id` from `duplicate.py`
- verify the original `DagModel` and serialized Dag remain unchanged
- verify the duplicate file gets a persisted import error

Validation run locally:

- `uv run --project airflow-core ruff format airflow-core/src/airflow/dag_processing/collection.py airflow-core/tests/unit/dag_processing/test_collection.py`
- `uv run --project airflow-core ruff check --fix airflow-core/src/airflow/dag_processing/collection.py airflow-core/tests/unit/dag_processing/test_collection.py`
- `breeze run pytest airflow-core/tests/unit/dag_processing/test_collection.py::TestUpdateDagParsingResults::test_duplicate_dag_id_from_different_file_is_import_error -xvs`

---

##### Was generative AI tooling used to co-author this PR?

- [X] Yes: Codex (GPT-5), used as an implementation assistant for code changes, test iteration, and validation. The issue investigation, reproduction context, and final review were performed by the PR author.
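
For readers skimming the description, the collision rule can be illustrated with a small standalone sketch. This is not the code in `collection.py`: `ActiveDag` and `split_incoming_dags` are hypothetical names, and a plain list stands in for the metadata database. It only shows the intended outcome, namely that an incoming Dag whose `dag_id` is already active under a different file is skipped and reported, while a re-sync from the owning file is accepted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActiveDag:
    """Hypothetical stand-in for an active DagModel row (not Airflow's model)."""

    dag_id: str
    fileloc: str


def split_incoming_dags(active, incoming_file, incoming_dag_ids):
    """Partition incoming dag_ids into (accepted, import_errors).

    A dag_id is rejected when an active Dag with the same id is already
    owned by a different file; the owning file always keeps the id.
    """
    owner_by_dag_id = {dag.dag_id: dag.fileloc for dag in active}
    accepted = []
    import_errors = {}
    for dag_id in incoming_dag_ids:
        owner = owner_by_dag_id.get(dag_id)
        if owner is not None and owner != incoming_file:
            # Skip the incoming Dag and record why, instead of overwriting.
            import_errors[dag_id] = (
                f"dag_id {dag_id!r} from {incoming_file} "
                f"is already defined in {owner}"
            )
        else:
            accepted.append(dag_id)
    return accepted, import_errors


if __name__ == "__main__":
    active = [ActiveDag(dag_id="example", fileloc="original.py")]
    print(split_incoming_dags(active, "duplicate.py", ["example", "fresh"]))
```

In this sketch the first file to register a `dag_id` keeps it, which matches the behavior described above where the original `DagModel` remains unchanged and the duplicate file receives an import error.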
Generated-by: Codex (GPT-5) following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
