t oo created AIRFLOW-6335:
-----------------------------

             Summary: dag_processor_manager timeout logs should be ERROR not INFO/WARN
                 Key: AIRFLOW-6335
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6335
             Project: Apache Airflow
          Issue Type: Improvement
          Components: DAG, scheduler
    Affects Versions: 1.10.6
            Reporter: t oo

I triggered a large DAG with 18k tasks. After 30 minutes, no task had moved to the scheduled, queued, or running state, even though all Airflow processes (scheduler, worker, etc.) were up. I then searched the logs for ERROR and nothing came up. After a lot of digging, I found the following in dag_processor_manager.log:

||File Path||PID||Runtime||# DAGs||# Errors||Last Runtime||Last Run||
|/home/ec2-user/airflow/dags/redact1.py|31366|261.22s|0|-1|361.32s|2019-12-24T17:10:44|
|/home/ec2-user/airflow/dags/redact2.py| | |1|0|1.00s|2019-12-24T17:16:29|

[2019-12-24 17:40:48,739] {dag_processing.py:1324} INFO - Processor for /home/ec2-user/airflow/dags/redact1.py with PID 17307 started at 2019-12-24T17:34:47.417660+00:00 has timed out, killing it.
[2019-12-24 17:40:49,696] {dag_processing.py:1324} INFO - Processor for /home/ec2-user/airflow/dags/redact1.py with PID 17307 started at 2019-12-24T17:34:47.417660+00:00 has timed out, killing it.
[2019-12-24 17:40:49,697] {dag_processing.py:1191} WARNING - Processor for /home/ec2-user/airflow/dags/redact1.py exited with return code -9.

The timeout is logged at INFO and the resulting kill at WARNING, so a search for ERROR misses both, even though the DAG file never finishes parsing and none of its tasks get scheduled.

Solution:

Change from INFO to ERROR: [https://github.com/apache/airflow/blob/1.10.6/airflow/utils/dag_processing.py#L1321]

Change from WARN to ERROR: [https://github.com/apache/airflow/blob/1.10.6/airflow/utils/dag_processing.py#L1189]
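
To illustrate the proposed change, here is a minimal, standalone sketch of a timeout check that reports at ERROR level. It is not the code in dag_processing.py: the function name `report_timed_out`, the logger name, and the shape of the `processors` mapping are all hypothetical and chosen only for this example. The message text mirrors the log lines quoted above.

```python
import logging
from datetime import datetime, timedelta, timezone

# Hypothetical logger name, used only for this sketch.
log = logging.getLogger("dag_processor_sketch")


def report_timed_out(processors, timeout, now):
    """Return file paths whose processor has run longer than `timeout`.

    `processors` maps file path -> (pid, start_time). This layout is an
    assumption for illustration, not Airflow's internal data structure.
    """
    timed_out = []
    for file_path, (pid, start_time) in processors.items():
        if now - start_time > timeout:
            # Logged at ERROR (rather than INFO) so that a search for
            # "ERROR" in the log surfaces the stuck DAG file.
            log.error(
                "Processor for %s with PID %s started at %s has timed out, killing it.",
                file_path,
                pid,
                start_time.isoformat(),
            )
            timed_out.append(file_path)
    return timed_out


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    now = datetime.now(timezone.utc)
    # Hypothetical paths and an arbitrary timeout value for the demo.
    procs = {
        "/dags/slow.py": (17307, now - timedelta(seconds=361)),
        "/dags/fast.py": (17400, now - timedelta(seconds=1)),
    }
    report_timed_out(procs, timedelta(seconds=50), now)
```

With the message at ERROR level, a plain search for ERROR in dag_processor_manager.log would have pointed straight at the DAG file whose processor was being killed.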