TerryYin1777 commented on issue #42542: URL: https://github.com/apache/airflow/issues/42542#issuecomment-2644166002
We have a similar setup (Kubernetes deployment with git-sync) and see the same issue. After digging into the code base, I believe the root cause is the following.

When git-sync resyncs, the DAGs folder changes from `hash-123/dags/example_dag.py` to `hash-456/dags/example_dag.py`. Although a symlink exposes these directories under a constant path, [get_dag_directory](https://github.com/apache/airflow/blob/2.9.1/airflow/dag_processing/manager.py#L963) resolves that path to its canonical form, which is different after every resync.

The resolved path is passed all the way to the [deactivate_deleted_dags](https://github.com/apache/airflow/blob/2.9.1/airflow/models/dag.py#L3830) function in the DAG model, which uses it to mark a DAG inactive and therefore hide it from the UI. The deactivation does not happen because the new `processor_subdir` value does not match the previously registered `processor_subdir`.

I'm not sure why the DAG folder path needs to be resolved to its canonical form. I understand this is not an issue with Airflow itself and probably only happens when git-sync is used for DAG deployment. But given that this is a widely adopted combination, would it be possible to add a configuration option such as `GET_DAG_FOLDER_RESOLVE=False` so that the symlink path is not resolved to its canonical path?

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
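The path behaviour described in the comment can be reproduced without Airflow. The following is only a minimal sketch under stated assumptions: the symlink name `current` and the helper `simulate_resync` are hypothetical, the directory names `hash-123` and `hash-456` are taken from the comment, and a POSIX filesystem is assumed. It uses only `pathlib` and `os` and does not call any Airflow code.

```python
import os
import tempfile
from pathlib import Path


def simulate_resync():
    """Mimic a git-sync style symlink swap and resolve the DAGs path before and after."""
    # Resolve the temp root up front so only our own symlink affects the result.
    root = Path(tempfile.mkdtemp()).resolve()
    link = root / "current"  # hypothetical name for the constant symlink

    # First sync: the symlink points at the checkout for revision hash-123.
    (root / "hash-123" / "dags").mkdir(parents=True)
    os.symlink(root / "hash-123", link)
    resolved_before = (link / "dags").resolve()

    # Resync: a new checkout appears and the symlink is repointed to it.
    (root / "hash-456" / "dags").mkdir(parents=True)
    link.unlink()  # removes the symlink itself, not the target directory
    os.symlink(root / "hash-456", link)
    resolved_after = (link / "dags").resolve()

    return link / "dags", resolved_before, resolved_after


dags_folder, before, after = simulate_resync()
print("configured path:", dags_folder)   # unchanged across resyncs
print("resolved before:", before)        # ends in hash-123/dags
print("resolved after: ", after)         # ends in hash-456/dags
print("values match:   ", before == after)
```

The configured path stays the same across the swap, while its resolved form changes, so any value stored from the first resolution no longer compares equal to the value computed after the resync.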
