spider-man-tm opened a new pull request, #68678:
URL: https://github.com/apache/airflow/pull/68678

   This prevents a Dag parsed from one file from overwriting an existing active 
Dag with the same `dag_id` from a different file.
   
   When the Dag processor parses files one at a time, duplicate `dag_id` checks 
inside a single `DagBag` can miss duplicates that live in separate files. 
Before this change, syncing the later parsed file to the metadata database 
could update the existing `DagModel` and serialized Dag row, effectively making 
the last parsed file win.
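   
   The failure mode can be illustrated with a toy model. This is only a sketch: the dictionary standing in for the metadata database and the helper names `duplicates_within_file` and `sync_without_collision_check` are hypothetical and are not Airflow APIs.

```python
def duplicates_within_file(dag_ids):
    """Return dag_ids that occur more than once in a single file's parse result."""
    seen, dupes = set(), set()
    for dag_id in dag_ids:
        if dag_id in seen:
            dupes.add(dag_id)
        seen.add(dag_id)
    return dupes


def sync_without_collision_check(parsed_files):
    """Upsert each file's Dags into a dict keyed by dag_id, one file at a time.

    parsed_files maps a file location to the list of dag_ids parsed from it.
    The per-file duplicate check passes for every file, yet a later file
    silently replaces the row written by an earlier one.
    """
    table = {}  # hypothetical stand-in for the DagModel table: dag_id -> fileloc
    for fileloc, dag_ids in parsed_files.items():
        # Each file is checked in isolation, so cross-file duplicates go unseen.
        assert not duplicates_within_file(dag_ids)
        for dag_id in dag_ids:
            table[dag_id] = fileloc  # plain upsert: the last writer wins
    return table
```

   Feeding this model `original.py` and then `duplicate.py`, each defining the same `dag_id`, leaves the row pointing at `duplicate.py`.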
   
   This change adds a database-level collision check when Dag parsing results 
are synced to the metadata database. If an active Dag with the same `dag_id` 
already exists for a different file, the incoming Dag is skipped and recorded 
as an import error, reusing the existing `AirflowDagDuplicatedIdException` 
behavior.
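   
   A minimal sketch of the idea, assuming an in-memory stand-in for the metadata database. `DagRecord`, `FakeMetadataDb`, and `sync_parsed_dag` are hypothetical names used for illustration; they do not reflect the actual code in `collection.py`.

```python
from dataclasses import dataclass, field


@dataclass
class DagRecord:
    """Hypothetical stand-in for a DagModel row."""
    dag_id: str
    fileloc: str
    is_active: bool = True


@dataclass
class FakeMetadataDb:
    """Hypothetical in-memory replacement for the metadata database."""
    dags: dict = field(default_factory=dict)           # dag_id -> DagRecord
    import_errors: dict = field(default_factory=dict)  # fileloc -> message


def sync_parsed_dag(db, dag_id, fileloc):
    """Sync one parsed Dag; return True if it was written, False if skipped."""
    existing = db.dags.get(dag_id)
    if existing is not None and existing.is_active and existing.fileloc != fileloc:
        # An active Dag with this dag_id is owned by another file:
        # keep the existing row and record an import error for the newcomer.
        db.import_errors[fileloc] = (
            f"dag_id {dag_id!r} is already defined in {existing.fileloc}"
        )
        return False
    # No collision: create or update the row and clear any stale error.
    db.dags[dag_id] = DagRecord(dag_id=dag_id, fileloc=fileloc)
    db.import_errors.pop(fileloc, None)
    return True
```

   In this model, syncing `original.py` and then `duplicate.py` with the same `dag_id` leaves the original record untouched and stores an error keyed by `duplicate.py`, while re-syncing `original.py` still succeeds.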
   
   related: #29321
   
   ## Tests
   
   Added a regression test covering this flow:
   
   - sync an original Dag from `original.py`
   - sync another Dag with the same `dag_id` from `duplicate.py`
   - verify the original `DagModel` and serialized Dag remain unchanged
   - verify the duplicate file gets a persisted import error
   
   Validation run locally:
   
   - `uv run --project airflow-core ruff format airflow-core/src/airflow/dag_processing/collection.py airflow-core/tests/unit/dag_processing/test_collection.py`
   - `uv run --project airflow-core ruff check --fix airflow-core/src/airflow/dag_processing/collection.py airflow-core/tests/unit/dag_processing/test_collection.py`
   - `breeze run pytest airflow-core/tests/unit/dag_processing/test_collection.py::TestUpdateDagParsingResults::test_duplicate_dag_id_from_different_file_is_import_error -xvs`
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [X] Yes — Codex (GPT-5), used as an implementation assistant for code 
changes, test iteration, and validation. The issue investigation, reproduction 
context, and final review were performed by the PR author.
   
   Generated-by: Codex (GPT-5) following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)

