Looking again at my lscpu specs, I noticed...

[airflow@airflowetl airflow]$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             2

Notice Thread(s) per core: 1

Looking at my airflow.cfg settings, I see max_threads = 2. Setting max_threads
= 1 and restarting the scheduler
(see https://www.astronomer.io/guides/airflow-scaling-workers/) seems to have
fixed the problem.
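For anyone else hitting this, a quick way to sanity-check a box is to compare
lscpu's "Thread(s) per core" against max_threads. A minimal sketch (the helper
function and sample text are mine; the field names and values come straight
from the lscpu output above):

```python
def lscpu_field(text, field):
    """Return the value of a named field from `lscpu` output, or None."""
    for line in text.splitlines():
        if line.startswith(field):
            return line.split(":", 1)[1].strip()
    return None

# On a live machine you would capture the real output, e.g.:
#   text = subprocess.run(["lscpu"], capture_output=True, text=True).stdout
sample = """\
Architecture:          x86_64
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             2
"""

threads_per_core = int(lscpu_field(sample, "Thread(s) per core"))
print(threads_per_core)  # 1 -> in my case max_threads = 2 exceeded this
```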

If anyone knows more about what exactly is going wrong under the hood (eg.
why the task fails rather than just waiting for another thread to become
available), would be interested to hear about it.

On Wed, Dec 18, 2019 at 11:45 AM Reed Villanueva <[email protected]>
wrote:

> An Airflow DAG that ran fine with the SequentialExecutor now has many
> (though not all) simple tasks that fail without any log information when
> running with the LocalExecutor and minimal parallelism, eg.
>
> <airflow.cfg>
> # overall task concurrency limit for airflow
> parallelism = 8  # which is same as number of cores shown by lscpu
> # max tasks per dag
> dag_concurrency = 2
> # max instances of a given dag that can run on airflow
> max_active_runs_per_dag = 1
> # max threads used per worker / core
> max_threads = 2
>
> see https://www.astronomer.io/guides/airflow-scaling-workers/
>
> Looking at the airflow-webserver.* logs nothing looks out of the
> ordinary, but looking at airflow-scheduler.out I see...
>
> [airflow@airflowetl airflow]$ tail -n 20 airflow-scheduler.out
> ....
> [2019-12-18 11:29:17,773] {scheduler_job.py:1283} INFO - Executor reports execution of mydag.task_level1_table1 execution_date=2019-12-18 21:21:48.424900+00:00 exited with status failed for try_number 1
> [2019-12-18 11:29:17,779] {scheduler_job.py:1283} INFO - Executor reports execution of mydag.task_level1_table2 execution_date=2019-12-18 21:21:48.424900+00:00 exited with status failed for try_number 1
> [2019-12-18 11:29:17,782] {scheduler_job.py:1283} INFO - Executor reports execution of mydag.task_level1_table3 execution_date=2019-12-18 21:21:48.424900+00:00 exited with status failed for try_number 1
> [2019-12-18 11:29:18,833] {scheduler_job.py:832} WARNING - Set 1 task instances to state=None as their associated DagRun was not in RUNNING state
> [2019-12-18 11:29:18,844] {scheduler_job.py:1283} INFO - Executor reports execution of mydag.task_level1_table4 execution_date=2019-12-18 21:21:48.424900+00:00 exited with status success for try_number 1
> ....
>
> but not really sure what to take away from this.
>
> Anyone know what could be going on here or how to get more helpful
> debugging info?
>
