nicolas-gaillard opened a new issue, #26936:
URL: https://github.com/apache/airflow/issues/26936
### Apache Airflow version
2.4.1
### What happened
Hi everyone,
I am currently on Airflow 2.3.0 and in the process of migrating to 2.4.1, and
I am having an issue with DAG parsing (which also affects the UI).
In order to reuse code, we encapsulate DAGs in Python classes. Sometimes a DAG
class inherits from another one to modify its behavior while preserving the
original shape of the DAG, as shown in the example below:
```python
# airflow/app/dags/dummyA/dag.py
from datetime import datetime

from airflow.decorators import dag, task


class BaseDag:
    START_DATE = datetime(2022, 1, 1)

    def __init__(self, message: str):
        self.message = message

    def dag_wrapper(self, dag_id: str):
        @dag(dag_id=dag_id, start_date=self.START_DATE, catchup=False)
        def _base_dag():
            @task(task_id="print_message")
            def print_message(message: str):
                print(message)

            print_message(self.message)

        return _base_dag()


BaseDag("my message").dag_wrapper("BaseDag")
```
```python
# airflow/app/dags/dummyB/dag.py
from app.dags.dummyA.dag import BaseDag


class ChildDag(BaseDag):
    def __init__(self, message: str):
        self.message = f"custom {message}"


ChildDag("my message").dag_wrapper("ChildDag")
```
We use an extremely basic configuration of Airflow with a containerized
Postgres database, a container for the webserver and one for the scheduler
(which uses the LocalExecutor).
During `airflow db init`, I get the following error:
```
ERROR [airflow.models.dagbag.DagBag] Exception bagging dag: BaseDag
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dagbag.py", line 484, in _bag_dag
    raise AirflowDagDuplicatedIdException(
airflow.exceptions.AirflowDagDuplicatedIdException: Ignoring DAG BaseDag from /usr/local/airflow/app/dags/dummyB/dag.py - also found in /usr/local/airflow/app/dags/dummyA/dag.py
ERROR [airflow.models.dagbag.DagBag] Failed to bag_dag: /usr/local/airflow/app/dags/dummyB/dag.py
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dagbag.py", line 425, in _process_modules
    self.bag_dag(dag=dag, root_dag=dag)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dagbag.py", line 452, in bag_dag
    self._bag_dag(dag=dag, root_dag=root_dag, recursive=True)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/dagbag.py", line 484, in _bag_dag
    raise AirflowDagDuplicatedIdException(
airflow.exceptions.AirflowDagDuplicatedIdException: Ignoring DAG BaseDag from /usr/local/airflow/app/dags/dummyB/dag.py - also found in /usr/local/airflow/app/dags/dummyA/dag.py
```
This error does not affect how Airflow or my DAGs run, but when I open the UI
and look at the code of `BaseDag`, the `ChildDag` code is displayed instead
(the code shown for `ChildDag` itself is correct).

This behavior is confirmed by checking the `dag` table in the database.

If I run the DAG, the correct code is executed and the correct code is then
displayed, but if I reload the UI, the code of the child class is displayed
again.
What is surprising is that when I display the `fileloc` attribute of these
two DAGs, it is the file path of the `BaseDag` that is displayed.
(In case I don't do the `airflow db init`, I observe this same behavior on
the interface.)
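One possible reading of the duplicate-ID error above (an assumption, not something confirmed in this report): importing `app.dags.dummyA.dag` from `dummyB/dag.py` runs that module's top-level code, including the `BaseDag("my message").dag_wrapper("BaseDag")` call, so the `BaseDag` DAG is declared a second time while `dummyB/dag.py` is being parsed. The toy model below is only a sketch of that idea: plain functions stand in for the two files, and `bag` is a hypothetical collector, not Airflow's `DagBag`.

```python
from typing import Callable, Dict, List, Tuple

Declare = Callable[[str], None]


def dummy_a(declare: Declare) -> None:
    """Stands in for dummyA/dag.py: its top-level code declares BaseDag."""
    declare("BaseDag")


def dummy_b(declare: Declare) -> None:
    """Stands in for dummyB/dag.py."""
    # `from app.dags.dummyA.dag import BaseDag` executes dummyA's top-level
    # code on first import, so BaseDag is declared again from this file.
    dummy_a(declare)
    declare("ChildDag")


def bag(
    files: List[Tuple[str, Callable[[Declare], None]]]
) -> Tuple[Dict[str, str], List[str]]:
    """Hypothetical collector: the first file to declare a dag_id owns it."""
    owner: Dict[str, str] = {}
    errors: List[str] = []
    for path, body in files:
        declared: List[str] = []
        body(declared.append)  # "parse" the file and record what it declares
        for dag_id in declared:
            if dag_id in owner and owner[dag_id] != path:
                errors.append(
                    f"Ignoring DAG {dag_id} from {path} - also found in {owner[dag_id]}"
                )
            else:
                owner[dag_id] = path
    return owner, errors


owner, errors = bag([("dummyA/dag.py", dummy_a), ("dummyB/dag.py", dummy_b)])
print(errors)
```

If this reading is right, keeping the class definitions in a module that declares no DAG at import time, and instantiating each DAG only in its own file, should avoid the second declaration. This has not been verified against Airflow 2.4.1.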
### What you think should happen instead
The `BaseDag` code should be displayed (instead of the `ChildDag` one).
### How to reproduce
Run an Airflow 2.4.1 instance with these two DAGs and you should see the wrong
code in the UI (to display the error logs, you can just run `airflow db init`).
### Operating System
Docker image `apache/airflow:2.4.1-python3.8` (Debian GNU/Linux 11
(bullseye))
### Versions of Apache Airflow Providers
```
apache-airflow-providers-common-sql==1.2.0
apache-airflow-providers-docker==3.2.0
apache-airflow-providers-odbc==3.1.2
apache-airflow-providers-postgres==5.2.2
```
### Deployment
Docker-Compose
### Deployment details
used image: `apache/airflow:2.4.1-python3.8` (Python 3.8)
* a Postgres container (postgres:14.4)
* an airflow init container (`airflow db init; airflow db upgrade; airflow users create`)
* a scheduler (`LocalExecutor`)
* a webserver
(This is a simplified version of the official docker-compose.)
### Anything else
This problem occurs every time. It started when I upgraded from Airflow
2.3.4 to 2.4.1; no other libraries were changed.
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)