[ https://issues.apache.org/jira/browse/AIRFLOW-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fokko Driesprong resolved AIRFLOW-4510.
---------------------------------------
    Resolution: Fixed
 Fix Version/s: 2.0.0

> Timezone set incorrectly if multiple DAGs defined in the same file
> ------------------------------------------------------------------
>
>                 Key: AIRFLOW-4510
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4510
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG
>    Affects Versions: 1.10.3
>            Reporter: Abhishek Ray
>            Assignee: Abhishek Ray
>            Priority: Major
>             Fix For: 2.0.0
>
>         Attachments: Screen Shot 2019-05-13 at 2.41.54 PM.png, Screen Shot 2019-05-13 at 6.45.25 PM.png
>
> If multiple DAGs are defined in the same file and share the same default_args, the subsequent DAGs get an incorrect timezone.
>
> Steps to reproduce:
>
> Set default_timezone to a non-UTC value in airflow.cfg:
> {noformat}
> default_timezone = America/New_York
> {noformat}
>
> DAG definition file containing multiple DAGs:
> {code:python}
> from airflow import DAG
> from airflow.operators.bash_operator import BashOperator
> from datetime import datetime, timedelta
>
> default_args = {
>     'owner': 'airflow',
>     'depends_on_past': False,
>     'start_date': datetime(2019, 5, 11),
> }
>
> def make_dynamic_dag(schedule_interval, dag_name):
>     dag = DAG(f"tutorial_{dag_name}", default_args=default_args,
>               schedule_interval=schedule_interval)
>     t1 = BashOperator(task_id='print_date', bash_command='date', dag=dag)
>     return dag
>
> test_dag_1 = make_dynamic_dag("00 15 * * *", "1")
> test_dag_2 = make_dynamic_dag("00 18 * * *", "2")
> {code}
>
> test_dag_1 is expected to run at 15:00 EST (19:00 UTC) and test_dag_2 is expected to run at 18:00 EST (22:00 UTC).
>
> However, test_dag_2 runs at 18:00 UTC, which suggests it is losing its timezone information:
> !Screen Shot 2019-05-13 at 2.41.54 PM.png!
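The root cause can be reproduced without Airflow at all: a constructor that normalizes its arguments in place mutates the caller's dict, and every later object built from the same module-level dict sees the mutated values. The sketch below is illustrative only; `FakeDag` and `localize` are hypothetical stand-ins, not Airflow code.

```python
from datetime import datetime, timezone

class FakeDag:
    """Hypothetical stand-in for a DAG-like class that normalizes
    its default_args in place instead of on a private copy."""

    def __init__(self, name, default_args):
        self.name = name
        self.default_args = default_args
        # Simulates localizing a naive start_date by writing the
        # timezone-aware value back into the caller's dict.
        sd = default_args.get('start_date')
        if sd is not None and sd.tzinfo is None:
            default_args['start_date'] = sd.replace(tzinfo=timezone.utc)

def localize(args):
    return args

default_args = {'owner': 'airflow', 'start_date': datetime(2019, 5, 11)}

d1 = FakeDag('tutorial_1', default_args)
d2 = FakeDag('tutorial_2', default_args)

# Constructing d1 already rewrote the shared module-level dict,
# so d2 never saw the original naive start_date.
print(default_args['start_date'].tzinfo)
print(d1.default_args is d2.default_args)
```

Both DAG-like objects hold the very same dict object, which is exactly the state-leak the log lines below confirm.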
> I added some logging in the Airflow code around the default_args initialization, and it confirmed the hypothesis that default_args was being mutated:
>
> {noformat}
> [2019-05-13 18:40:10,409] {__init__.py:3045} INFO - default_args for DAG tutorial_1: {'owner': 'airflow', 'start_date': datetime.datetime(2019, 5, 11, 0, 0)}
> [2019-05-13 18:40:10,410] {__init__.py:3045} INFO - default_args for DAG tutorial_2: {'owner': 'airflow', 'start_date': <Pendulum [2019-05-11T04:00:00+00:00]>}
> {noformat}
>
> As a simple fix, I changed the DAG definition to pass a deep copy of the shared dict:
> {noformat}
> dag = DAG(f"tutorial_{dag_name}", default_args=deepcopy(default_args), schedule_interval=schedule_interval)
> {noformat}
> and this seems to fix the problem:
>
> {noformat}
> [2019-05-13 18:44:44,674] {__init__.py:3045} INFO - default_args for DAG tutorial_1: {'owner': 'airflow', 'start_date': datetime.datetime(2019, 5, 11, 0, 0)}
> [2019-05-13 18:44:44,676] {__init__.py:3045} INFO - default_args for DAG tutorial_2: {'owner': 'airflow', 'start_date': datetime.datetime(2019, 5, 11, 0, 0)}
> {noformat}
>
> !Screen Shot 2019-05-13 at 6.45.25 PM.png!
>
> I want to add a fix to create a deep-copy of default_args here:
> [https://github.com/apache/airflow/blob/master/airflow/models/dag.py#L197]

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
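For reference, the proposed deep-copy approach can be sketched without Airflow. This is a minimal illustration, assuming a hypothetical `localize_inplace` helper that mutates its argument the way the normalization step above does; it is not the actual `dag.py` change.

```python
import copy
from datetime import datetime, timezone

def localize_inplace(args):
    # Hypothetical stand-in for the normalization that rewrites a
    # naive start_date inside the dict it is handed.
    sd = args.get('start_date')
    if sd is not None and sd.tzinfo is None:
        args['start_date'] = sd.replace(tzinfo=timezone.utc)

default_args = {'owner': 'airflow', 'start_date': datetime(2019, 5, 11)}

# Deep-copying first gives each DAG a private dict, so the in-place
# mutation no longer leaks into the shared module-level default_args.
private_args = copy.deepcopy(default_args)
localize_inplace(private_args)

print(default_args['start_date'].tzinfo)   # original stays naive
print(private_args['start_date'].tzinfo)   # only the copy is localized
```

A shallow `dict(default_args)` would also decouple the top-level keys, but `copy.deepcopy` additionally protects nested values such as lists or dicts inside default_args.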