[ https://issues.apache.org/jira/browse/AIRFLOW-5447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928788#comment-16928788 ]
Henry Cohen edited comment on AIRFLOW-5447 at 9/12/19 6:00 PM:
---------------------------------------------------------------
This is a sample of what I see when running with the example DAGs: tasks are queued, but when the first one tries to start it just sits there, and eventually the processes die and the scheduler hangs.
{noformat}
[2019-09-12 17:56:03,034] kubernetes_executor.py:698 INFO - TaskInstance: <TaskInstance: tutorial.print_date 2019-09-10 00:00:00+00:00 [queued]> found in queued state but was not launched, rescheduling
[2019-09-12 17:56:03,043] scheduler_job.py:1376 INFO - Resetting orphaned tasks for active dag runs
[2019-09-12 17:56:03,085] base_job.py:308 INFO - Reset the following 30 TaskInstances:
    <TaskInstance: latest_only.latest_only 2019-09-10 04:00:00+00:00 [None]>
    <TaskInstance: example_branch_operator.run_this_first 2019-09-11 00:00:00+00:00 [None]>
    <TaskInstance: tutorial.print_date 2019-09-11 00:00:00+00:00 [None]>
    <TaskInstance: example_skip_dag.skip_operator_2 2019-09-11 00:00:00+00:00 [None]>
    <TaskInstance: example_skip_dag.skip_operator_1 2019-09-11 00:00:00+00:00 [None]>
    <TaskInstance: example_skip_dag.always_true_2 2019-09-11 00:00:00+00:00 [None]>
    <TaskInstance: example_skip_dag.always_true_1 2019-09-11 00:00:00+00:00 [None]>
    <TaskInstance: example_short_circuit_operator.condition_is_True 2019-09-11 00:00:00+00:00 [None]>
    <TaskInstance: example_short_circuit_operator.condition_is_False 2019-09-11 00:00:00+00:00 [None]>
    <TaskInstance: example_subdag_operator.start 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_branch_dop_operator_v3.condition 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_http_operator.http_sensor_check 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_passing_params_via_test_command.run_this 2019-09-11 00:00:00+00:00 [None]>
    <TaskInstance: latest_only_with_trigger.task2 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: latest_only_with_trigger.latest_only 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_trigger_controller_dag.test_trigger_dagrun 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_skip_dag.skip_operator_2 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_skip_dag.skip_operator_1 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_skip_dag.always_true_2 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_skip_dag.always_true_1 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_short_circuit_operator.condition_is_True 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_short_circuit_operator.condition_is_False 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_bash_operator.runme_2 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_bash_operator.runme_1 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_bash_operator.runme_0 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_bash_operator.also_run_this 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_xcom.push_by_returning 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_xcom.push 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: example_branch_operator.run_this_first 2019-09-10 00:00:00+00:00 [None]>
    <TaskInstance: latest_only.latest_only 2019-09-10 00:00:00+00:00 [None]>
[2019-09-12 17:56:03,092] dag_processing.py:545 INFO - Launched DagFileProcessorManager with pid: 35
[2019-09-12 17:56:03,093] scheduler_job.py:1390 DEBUG - Starting Loop...
[2019-09-12 17:56:03,093] scheduler_job.py:1401 DEBUG - Harvesting DAG parsing results
[2019-09-12 17:56:03,093] scheduler_job.py:1403 DEBUG - Harvested 0 SimpleDAGs
[2019-09-12 17:56:03,093] scheduler_job.py:1438 DEBUG - Heartbeating the executor
[2019-09-12 17:56:03,093] base_executor.py:124 DEBUG - 0 running task instances
[2019-09-12 17:56:03,094] base_executor.py:125 DEBUG - 0 in queue
[2019-09-12 17:56:03,094] base_executor.py:126 DEBUG - 96 open slots
[2019-09-12 17:56:03,094] base_executor.py:135 DEBUG - Calling the <class 'airflow.contrib.executors.kubernetes_executor.KubernetesExecutor'> sync method
[2019-09-12 17:56:03,100] scheduler_job.py:1459 DEBUG - Ran scheduling loop in 0.01 seconds
[2019-09-12 17:56:03,101] scheduler_job.py:1462 DEBUG - Sleeping for 1.00 seconds
[2019-09-12 17:56:03,107] settings.py:54 INFO - Configured default timezone <Timezone [America/New_York]>
[2019-09-12 17:56:03,109] settings.py:327 DEBUG - Failed to import airflow_local_settings.
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/airflow/settings.py", line 315, in import_local_settings
    import airflow_local_settings
ModuleNotFoundError: No module named 'airflow_local_settings'
[2019-09-12 17:56:03,111] logging_config.py:47 INFO - Successfully imported user-defined logging config from log_config.LOGGING_CONFIG
[2019-09-12 17:56:03,120] settings.py:170 DEBUG - Setting up DB connection pool (PID 35)
[2019-09-12 17:56:03,121] settings.py:213 INFO - settings.configure_orm(): Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=35
[2019-09-12 17:56:03,289] settings.py:238 DEBUG - Disposing DB connection pool (PID 45)
[2019-09-12 17:56:03,356] settings.py:238 DEBUG - Disposing DB connection pool (PID 41)
[2019-09-12 17:56:04,101] scheduler_job.py:1474 DEBUG - Sleeping for 0.99 seconds to prevent excessive logging
[2019-09-12 17:56:04,126] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor0-Process, stopped)>
[2019-09-12 17:56:04,127] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor1-Process, stopped)>
[2019-09-12 17:56:04,162] settings.py:238 DEBUG - Disposing DB connection pool (PID 55)
[2019-09-12 17:56:04,223] settings.py:238 DEBUG - Disposing DB connection pool (PID 58)
[2019-09-12 17:56:05,095] scheduler_job.py:1390 DEBUG - Starting Loop...
[2019-09-12 17:56:05,095] scheduler_job.py:1401 DEBUG - Harvesting DAG parsing results
[2019-09-12 17:56:05,097] dag_processing.py:637 DEBUG - Received message of type DagParsingStat
[2019-09-12 17:56:05,098] dag_processing.py:637 DEBUG - Received message of type SimpleDag
[2019-09-12 17:56:05,098] dag_processing.py:637 DEBUG - Received message of type SimpleDag
[2019-09-12 17:56:05,099] dag_processing.py:637 DEBUG - Received message of type SimpleDag
[2019-09-12 17:56:05,099] dag_processing.py:637 DEBUG - Received message of type SimpleDag
[2019-09-12 17:56:05,100] dag_processing.py:637 DEBUG - Received message of type DagParsingStat
[2019-09-12 17:56:05,101] dag_processing.py:637 DEBUG - Received message of type DagParsingStat
[2019-09-12 17:56:05,101] scheduler_job.py:1403 DEBUG - Harvested 4 SimpleDAGs
[2019-09-12 17:56:05,128] scheduler_job.py:921 INFO - 5 tasks up for execution:
    <TaskInstance: example_subdag_operator.start 2019-09-10 00:00:00+00:00 [scheduled]>
    <TaskInstance: latest_only_with_trigger.task2 2019-09-10 04:00:00+00:00 [scheduled]>
    <TaskInstance: latest_only_with_trigger.task2 2019-09-10 00:00:00+00:00 [scheduled]>
    <TaskInstance: latest_only_with_trigger.latest_only 2019-09-10 04:00:00+00:00 [scheduled]>
    <TaskInstance: latest_only_with_trigger.latest_only 2019-09-10 00:00:00+00:00 [scheduled]>
[2019-09-12 17:56:05,138] scheduler_job.py:953 INFO - Figuring out tasks to run in Pool(name=default_pool) with 128 open slots and 5 task instances ready to be queued
[2019-09-12 17:56:05,139] scheduler_job.py:981 INFO - DAG example_subdag_operator has 0/48 running and queued tasks
[2019-09-12 17:56:05,139] scheduler_job.py:981 INFO - DAG latest_only_with_trigger has 0/48 running and queued tasks
[2019-09-12 17:56:05,139] scheduler_job.py:981 INFO - DAG latest_only_with_trigger has 1/48 running and queued tasks
[2019-09-12 17:56:05,139] scheduler_job.py:981 INFO - DAG latest_only_with_trigger has 2/48 running and queued tasks
[2019-09-12 17:56:05,139] scheduler_job.py:981 INFO - DAG latest_only_with_trigger has 3/48 running and queued tasks
[2019-09-12 17:56:05,139] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor2-Process, stopped)>
[2019-09-12 17:56:05,140] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor3-Process, stopped)>
[2019-09-12 17:56:05,157] scheduler_job.py:1031 INFO - Setting the following tasks to queued state:
    <TaskInstance: example_subdag_operator.start 2019-09-10 00:00:00+00:00 [scheduled]>
    <TaskInstance: latest_only_with_trigger.latest_only 2019-09-10 00:00:00+00:00 [scheduled]>
    <TaskInstance: latest_only_with_trigger.latest_only 2019-09-10 04:00:00+00:00 [scheduled]>
    <TaskInstance: latest_only_with_trigger.task2 2019-09-10 00:00:00+00:00 [scheduled]>
    <TaskInstance: latest_only_with_trigger.task2 2019-09-10 04:00:00+00:00 [scheduled]>
[2019-09-12 17:56:05,182] scheduler_job.py:1107 INFO - Setting the following 5 tasks to queued state:
    <TaskInstance: latest_only_with_trigger.latest_only 2019-09-10 00:00:00+00:00 [queued]>
    <TaskInstance: latest_only_with_trigger.task2 2019-09-10 00:00:00+00:00 [queued]>
    <TaskInstance: example_subdag_operator.start 2019-09-10 00:00:00+00:00 [queued]>
    <TaskInstance: latest_only_with_trigger.latest_only 2019-09-10 04:00:00+00:00 [queued]>
    <TaskInstance: latest_only_with_trigger.task2 2019-09-10 04:00:00+00:00 [queued]>
[2019-09-12 17:56:05,183] scheduler_job.py:1143 INFO - Sending ('latest_only_with_trigger', 'latest_only', datetime.datetime(2019, 9, 10, 0, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1) to executor with priority 4 and queue default
[2019-09-12 17:56:05,183] base_executor.py:59 INFO - Adding to queue: ['airflow', 'run', 'latest_only_with_trigger', 'latest_only', '2019-09-10T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/usr/local/lib/python3.7/site-packages/airflow/example_dags/example_latest_only_with_trigger.py']
[2019-09-12 17:56:05,183] scheduler_job.py:1143 INFO - Sending ('latest_only_with_trigger', 'task2', datetime.datetime(2019, 9, 10, 0, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1) to executor with priority 3 and queue default
[2019-09-12 17:56:05,183] base_executor.py:59 INFO - Adding to queue: ['airflow', 'run', 'latest_only_with_trigger', 'task2', '2019-09-10T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/usr/local/lib/python3.7/site-packages/airflow/example_dags/example_latest_only_with_trigger.py']
[2019-09-12 17:56:05,184] scheduler_job.py:1143 INFO - Sending ('example_subdag_operator', 'start', datetime.datetime(2019, 9, 10, 0, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1) to executor with priority 5 and queue default
[2019-09-12 17:56:05,184] base_executor.py:59 INFO - Adding to queue: ['airflow', 'run', 'example_subdag_operator', 'start', '2019-09-10T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/usr/local/lib/python3.7/site-packages/airflow/example_dags/example_subdag_operator.py']
[2019-09-12 17:56:05,184] scheduler_job.py:1143 INFO - Sending ('latest_only_with_trigger', 'latest_only', datetime.datetime(2019, 9, 10, 4, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1) to executor with priority 4 and queue default
[2019-09-12 17:56:05,184] base_executor.py:59 INFO - Adding to queue: ['airflow', 'run', 'latest_only_with_trigger', 'latest_only', '2019-09-10T04:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/usr/local/lib/python3.7/site-packages/airflow/example_dags/example_latest_only_with_trigger.py']
[2019-09-12 17:56:05,185] scheduler_job.py:1143 INFO - Sending ('latest_only_with_trigger', 'task2', datetime.datetime(2019, 9, 10, 4, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1) to executor with priority 3 and queue default
[2019-09-12 17:56:05,185] base_executor.py:59 INFO - Adding to queue: ['airflow', 'run', 'latest_only_with_trigger', 'task2', '2019-09-10T04:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/usr/local/lib/python3.7/site-packages/airflow/example_dags/example_latest_only_with_trigger.py']
[2019-09-12 17:56:05,185] scheduler_job.py:1438 DEBUG - Heartbeating the executor
[2019-09-12 17:56:05,185] base_executor.py:124 DEBUG - 0 running task instances
[2019-09-12 17:56:05,186] base_executor.py:125 DEBUG - 5 in queue
[2019-09-12 17:56:05,186] base_executor.py:126 DEBUG - 96 open slots
[2019-09-12 17:56:05,186] kubernetes_executor.py:764 INFO - Add task ('example_subdag_operator', 'start', datetime.datetime(2019, 9, 10, 0, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1) with command ['airflow', 'run', 'example_subdag_operator', 'start', '2019-09-10T00:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', '/usr/local/lib/python3.7/site-packages/airflow/example_dags/example_subdag_operator.py'] with executor_config {}
[2019-09-12 17:56:05,193] settings.py:238 DEBUG - Disposing DB connection pool (PID 72)
[2019-09-12 17:56:05,315] settings.py:238 DEBUG - Disposing DB connection pool (PID 69)
[2019-09-12 17:56:06,153] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor4-Process, stopped)>
[2019-09-12 17:56:06,154] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor5-Process, stopped)>
[2019-09-12 17:56:06,290] settings.py:238 DEBUG - Disposing DB connection pool (PID 83)
[2019-09-12 17:56:06,322] settings.py:238 DEBUG - Disposing DB connection pool (PID 86)
[2019-09-12 17:56:07,171] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor6-Process, stopped)>
[2019-09-12 17:56:07,171] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor7-Process, stopped)>
[2019-09-12 17:56:07,248] settings.py:238 DEBUG - Disposing DB connection pool (PID 97)
[2019-09-12 17:56:07,415] settings.py:238 DEBUG - Disposing DB connection pool (PID 100)
[2019-09-12 17:56:08,188] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor8-Process, stopped)>
[2019-09-12 17:56:08,189] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor9-Process, stopped)>
[2019-09-12 17:56:08,247] settings.py:238 DEBUG - Disposing DB connection pool (PID 111)
[2019-09-12 17:56:08,379] settings.py:238 DEBUG - Disposing DB connection pool (PID 114)
[2019-09-12 17:56:09,205] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor10-Process, stopped)>
[2019-09-12 17:56:09,206] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor11-Process, stopped)>
[2019-09-12 17:56:09,323] settings.py:238 DEBUG - Disposing DB connection pool (PID 125)
[2019-09-12 17:56:09,444] settings.py:238 DEBUG - Disposing DB connection pool (PID 127)
[2019-09-12 17:56:10,223] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor12-Process, stopped)>
[2019-09-12 17:56:10,223] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor13-Process, stopped)>
[2019-09-12 17:56:10,299] settings.py:238 DEBUG - Disposing DB connection pool (PID 139)
[2019-09-12 17:56:10,325] settings.py:238 DEBUG - Disposing DB connection pool (PID 142)
[2019-09-12 17:56:11,237] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor14-Process, stopped)>
[2019-09-12 17:56:11,238] scheduler_job.py:257 DEBUG - Waiting for <Process(DagFileProcessor15-Process, stopped)>
{noformat}

> KubernetesExecutor hangs on task queueing
> -----------------------------------------
>
>                 Key: AIRFLOW-5447
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5447
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: executor-kubernetes
>    Affects Versions: 1.10.4, 1.10.5
>        Environment: Kubernetes version v1.14.3, Airflow version 1.10.4-1.10.5
>            Reporter: Henry Cohen
>            Assignee: Daniel Imberman
>            Priority: Blocker
>
> Starting in 1.10.4, and continuing in 1.10.5, when using the KubernetesExecutor with the webserver and scheduler running in the Kubernetes cluster, tasks are scheduled, but when they are added to the task queue the executor process hangs indefinitely. Based on log messages, it appears to be stuck at this line: https://github.com/apache/airflow/blob/v1-10-stable/airflow/contrib/executors/kubernetes_executor.py#L761

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
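The log pattern above (the executor logs "Add task ... with command ..." and then nothing further happens) is consistent with a producer blocking on a shared queue whose consumer thread has silently died. The following is a minimal, hypothetical Python sketch of that generic failure mode, not Airflow's actual code; the names (`run_consumer`, `task_queue`) and the bounded queue size are illustrative assumptions only. A timeout is used so the silent hang becomes a visible `queue.Full` error instead of blocking forever:

```python
# Hypothetical sketch: a producer feeding a queue hangs when the consumer dies.
# Roughly analogous to the scheduler handing tasks to an executor whose
# watcher/launcher thread has crashed. NOT the actual Airflow implementation.
import queue
import threading


def run_consumer(task_queue: "queue.Queue[str]") -> None:
    """Consumer that crashes after its first task, like a watcher dying."""
    task_queue.get()
    raise RuntimeError("watcher died")  # thread exits; nothing drains the queue


task_queue: "queue.Queue[str]" = queue.Queue(maxsize=1)  # bounded on purpose
consumer = threading.Thread(target=run_consumer, args=(task_queue,), daemon=True)
consumer.start()

task_queue.put("task-1")   # consumed, then the consumer thread crashes
consumer.join(timeout=5)   # wait for the consumer to finish dying

task_queue.put("task-2")   # fills the only slot; no one will ever take it
try:
    # A plain put() here would block forever, reproducing the reported symptom.
    task_queue.put("task-3", timeout=1)
    hung = False
except queue.Full:
    hung = True

print(hung)
```

With a dead consumer the third `put` cannot proceed, so `hung` ends up `True`; in the real system the equivalent call has no timeout, which would match the scheduler sitting idle after the last "Add task" line.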