Hi,

We run Apache Airflow in Kubernetes in a manner very similar to what is
outlined in puckel/docker-airflow [1] (Celery Executor, Redis for
messaging, Postgres).

Lately, some of our tasks have been getting stuck in the running state
and logging the following errors:

[2018-11-20 05:31:23,009] {models.py:1329} INFO - Dependencies not met
for <TaskInstance: BLAH 2018-11-19T19:19:50.757184+00:00 [running]>,
dependency 'Task Instance Not Already Running' FAILED: Task is already
running, it started on 2018-11-19 23:29:11.974497+00:00.
[2018-11-20 05:31:23,016] {models.py:1329} INFO - Dependencies not met
for <TaskInstance: BLAH 2018-11-19T19:19:50.757184+00:00 [running]>,
dependency 'Task Instance State' FAILED: Task is in the 'running' state
which is not a valid state for execution. The task must be cleared in
order to be run.

Is there any way to avoid this? Does anyone know what causes this issue?

This is quite problematic. When the above error occurs, the task stays in
the running state without making any progress, so turning on retries
doesn't help our DAGs run reliably to completion.

Thanks!

[1] https://github.com/puckel/docker-airflow
