We saw this too, but the task instance state was generally "SUCCESS". In our case, we thought it was due to Redis being used as the results store; there is a WARNING against this right in the operational logs. Surprisingly, Google Cloud Composer is set up this way.
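If it helps, here is a rough sketch of how one might check which kind of results store a deployment is pointed at. It assumes the Airflow 1.10-style setting `[celery] result_backend`, exposed through the `AIRFLOW__CELERY__RESULT_BACKEND` environment variable, and it assumes Celery's convention of prefixing SQLAlchemy database backends with `db+`. The helper name `result_backend_kind` and the example URLs are made up for illustration; this is not something taken from our setup.

```python
import os
from urllib.parse import urlparse


def result_backend_kind(url: str) -> str:
    """Classify a Celery result backend URL by its scheme (hypothetical helper)."""
    scheme = urlparse(url).scheme.lower()
    if scheme in ("redis", "rediss"):
        return "redis"
    if scheme.startswith("db+"):
        # Celery's SQLAlchemy result backend uses a "db+" prefix,
        # e.g. db+postgresql://user:password@host/dbname
        return "database"
    return "other"


if __name__ == "__main__":
    # Assumption: the result backend is configured through this environment
    # variable rather than only in airflow.cfg.
    url = os.environ.get("AIRFLOW__CELERY__RESULT_BACKEND", "")
    if url:
        print(result_backend_kind(url))
    else:
        print("result backend not set in the environment")
```

A result of "redis" would match the configuration we had when we saw the problem. The check only reads the environment, so a value set solely in airflow.cfg would not show up here.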
We went back to running our own infrastructure and using Postgres as the results store, and those issues have not occurred since. The real downside we saw to this error was that our workers were highly underutilized, we were getting terrible overall data throughput, and the workers kept trying to run these tasks they couldn't actually run.

- Dan Stoner

On Wed, Feb 13, 2019 at 4:16 PM Kevin Lam <[email protected]> wrote:
>
> Friendly ping on the above! Has anyone encountered this by chance?
>
> We're still seeing it occasionally on longer running tasks.
>
> On Tue, Nov 20, 2018 at 10:31 AM Kevin Lam <[email protected]> wrote:
>
> > Hi,
> >
> > We run Apache Airflow in Kubernetes in a manner very similar to what is
> > outlined in puckel/docker-airflow [1] (Celery Executor, Redis for
> > messaging, Postgres).
> >
> > Lately, we've encountered some of our Tasks getting stuck in a running
> > state, and printing out the errors:
> >
> > [2018-11-20 05:31:23,009] {models.py:1329} INFO - Dependencies not met for
> > <TaskInstance: BLAH 2018-11-19T19:19:50.757184+00:00 [running]>, dependency
> > 'Task Instance Not Already Running' FAILED: Task is already running, it
> > started on 2018-11-19 23:29:11.974497+00:00.
> > [2018-11-20 05:31:23,016] {models.py:1329} INFO - Dependencies not met for
> > <TaskInstance: BLAH 2018-11-19T19:19:50.757184+00:00 [running]>,
> > dependency 'Task Instance State' FAILED: Task is in the 'running' state
> > which is not a valid state for execution. The task must be cleared in
> > order to be run.
> >
> > Is there any way to avoid this? Does anyone know what causes this issue?
> >
> > This is quite problematic. The task is stuck in running state without
> > making any progress when the above error occurs, and so turning on retries
> > doesn't help with getting our DAGs to reliably run to completion.
> >
> > Thanks!
> >
> > [1] https://github.com/puckel/docker-airflow
