spider-man-tm commented on PR #68678:
URL: https://github.com/apache/airflow/pull/68678#issuecomment-4742367272
@uranusjr
Thanks for the review. I believe this can happen when `dag_id=hoge` is moved
from file_a.py to file_b.py, but file_a.py is kept and gets a new `dag_id=fuga`
defined in it. Here are two approaches:
**Option A: Mark disappeared Dags as stale at parse time**
The fix is to explicitly mark those entries stale inside
update_dag_parsing_results_in_db:
```python
parsed_dag_ids = set(dag_op.dags.keys())
session.execute(
update(DagModel)
.where(
DagModel.bundle_name == bundle_name,
DagModel.relative_fileloc == <rel_fileloc_of_parsed_file>,
DagModel.dag_id.not_in(parsed_dag_ids),
~DagModel.is_stale,
)
.values(is_stale=True)
)
```
A transient import error may appear if file_b.py is parsed first, but it
clears up once file_a.py is re-parsed. In Cloud Composer-style environments
where files are parsed periodically, this resolves automatically.
**Option B: Downgrade the collision to a warning and let the new file
overwrite the entry**
This avoids the transient error but also silences genuine duplicates.
Which approach do you prefer, or is there a better way I'm missing?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]