Hi All,

As per the SQLAlchemy documentation at
http://docs.sqlalchemy.org/en/latest/core/connections.html, a DB engine is
not portable across process boundaries:

"For a multiple-process application that uses the os.fork system call, or
for example the Python multiprocessing module, it’s usually required that a
separate Engine be used for each child process. This is because the Engine
maintains a reference to a connection pool that ultimately references DBAPI
connections - these tend to not be portable across process boundaries."

Please correct me if I am wrong, but it seems that in Airflow 1.9 the child
processes do not create a separate DB engine, so a single engine is shared
among all child processes, which might be causing this issue.
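For illustration, here is a minimal sketch of the per-process engine pattern
the docs describe (the connection URL is hypothetical, not Airflow's actual
configuration):

    import multiprocessing
    from sqlalchemy import create_engine, text

    # Hypothetical connection URL for illustration only.
    DB_URL = "postgresql://user:pass@localhost/airflow"

    def child_task():
        # Each child creates its own Engine (and hence its own connection
        # pool) instead of reusing the parent's engine, whose pooled DBAPI
        # connections are not safe to use after fork.
        engine = create_engine(DB_URL)
        with engine.connect() as conn:
            conn.execute(text("SELECT 1"))
        engine.dispose()

    if __name__ == "__main__":
        procs = [multiprocessing.Process(target=child_task) for _ in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()

Alternatively, per the same docs, a child that inherits the parent's engine
can call engine.dispose() right after the fork so that it rebuilds its
connection pool lazily instead of reusing the parent's connections.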
Thanks,
Raman Gupta

On 2018/08/21 15:41:14, raman gupta <ramandu...@gmail.com> wrote:
> One possibility is the unavailability of a session while calling the
> self.task_instance._check_and_change_state_before_execution function.
> (The session is provided via the @provide_session decorator.)
>
> On Tue, Aug 21, 2018 at 7:09 PM vardangupta...@gmail.com <
> vardangupta...@gmail.com> wrote:
>
> > Is there any possibility that when _check_and_change_state_before_execution
> > is called at
> > https://github.com/apache/incubator-airflow/blob/v1-9-stable/airflow/jobs.py#L2500,
> > the underlying method at
> > https://github.com/apache/incubator-airflow/blob/v1-9-stable/airflow/models.py#L1299
> > is not actually being invoked? Even in a happy scenario no log is printed
> > from the method's implementation and control reaches
> > https://github.com/apache/incubator-airflow/blob/v1-9-stable/airflow/jobs.py#L2512
> > directly, while in the stuck phase we are seeing this log:
> > https://github.com/apache/incubator-airflow/blob/v1-9-stable/airflow/jobs.py#L2508,
> > i.e. "Task is not able to be run". FYI, we have not set any sort of
> > dependency on the DAG.
> >
> > Regards,
> > Vardan Gupta
> >
> > On 2018/08/16 08:25:37, ramandu...@gmail.com <ramandu...@gmail.com> wrote:
> > > Hi All,
> > >
> > > We are using Airflow 1.9 in Local Executor mode. Intermittently we are
> > > observing that tasks get stuck in the "up_for_retry" state and are
> > > retried again and again, exceeding their configured max retries count.
> > > For example, we have configured max retries as 2, but a task was retried
> > > 15 times and got stuck in the up_for_retry state.
> > > Any pointer on this would be helpful.
> > >
> > > Thanks,
> > > Raman Gupta
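For reference, a rough sketch of what a @provide_session-style decorator
does, assuming a generic SQLAlchemy session factory (an illustration of the
pattern, not Airflow's exact implementation):

    from functools import wraps
    from sqlalchemy import create_engine, text
    from sqlalchemy.orm import sessionmaker

    # Hypothetical engine and session factory for illustration; Airflow
    # builds these from its own configuration.
    engine = create_engine("sqlite://")
    Session = sessionmaker(bind=engine)

    def provide_session(func):
        # If the caller did not supply a session, open one, commit it on
        # success, and always close it afterwards.
        @wraps(func)
        def wrapper(*args, **kwargs):
            if kwargs.get("session") is not None:
                return func(*args, **kwargs)
            session = Session()
            kwargs["session"] = session
            try:
                result = func(*args, **kwargs)
                session.commit()
                return result
            finally:
                session.close()
        return wrapper

    @provide_session
    def ping(session=None):
        return session.execute(text("SELECT 1")).scalar()

If such a decorated function runs in a forked child that inherited the
parent's engine, the injected session would hand out pooled connections that
are not fork-safe, which is consistent with the hypothesis above.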