"Scheduler lost track of it" refers to cases such as the scheduler process getting killed. When the scheduler restarts, tasks in the SCHEDULED or QUEUED state are reset to the NONE state.
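To make the reset on restart concrete, here is a rough sketch in plain Python. It is not Airflow's actual code: `TaskInstance`, `reset_orphaned_tasks`, and `known_to_executor` are names made up for this illustration, and the only behavior it assumes is the one described above (SCHEDULED or QUEUED tasks that the restarted scheduler no longer tracks go back to NONE).

```python
# Illustration only: a plain-Python model, not Airflow's internal API.
from dataclasses import dataclass
from typing import List, Optional, Set

SCHEDULED = "scheduled"
QUEUED = "queued"
RUNNING = "running"


@dataclass
class TaskInstance:
    """Hypothetical stand-in for a task instance row in the metadata DB."""
    task_id: str
    state: Optional[str]  # None stands for the NONE state


def reset_orphaned_tasks(task_instances: List[TaskInstance],
                         known_to_executor: Set[str]) -> List[str]:
    """Reset SCHEDULED/QUEUED tasks the scheduler no longer tracks to NONE.

    Returns the ids of the tasks that were reset, in input order.
    """
    reset = []
    for ti in task_instances:
        is_waiting = ti.state in (SCHEDULED, QUEUED)
        if is_waiting and ti.task_id not in known_to_executor:
            ti.state = None  # eligible to be scheduled again
            reset.append(ti.task_id)
    return reset


# A freshly restarted scheduler has no in-memory record of any task.
tasks = [
    TaskInstance("extract", QUEUED),
    TaskInstance("transform", SCHEDULED),
    TaskInstance("load", RUNNING),
]
orphans = reset_orphaned_tasks(tasks, known_to_executor=set())
```

In this model the running task is left alone; a running task whose worker died is the zombie case, which is detected through missing heartbeats rather than through this reset.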

As for SLA, I think that delay is included; here is the logic Airflow uses to calculate SLA misses:
<https://github.com/apache/incubator-airflow/blob/284dbdb60ab1fec027dea4871e3013a4727f6041/airflow/jobs.py#L604-L739>.
I think SLA handling in Airflow is similar to what you describe for Quartz (e.g. you can add sla_miss_callback to your DAG); here is the doc for it:
<https://airflow.apache.org/concepts.html?highlight=slas#slas>.

Cheers,
Kevin Y

On Fri, Jul 20, 2018 at 1:49 PM Shubham Gupta <shubham180695...@gmail.com> wrote:

> Also, is the delay between adding a task to the queue and the task starting
> on the worker not included in the SLA of the task? Or does the SLA period
> begin once the task actually starts on the worker? Also, if the scheduler has
> to wait for a response from the worker for the final state of the task
> (success/failure), how can the scheduler lose track of the task?
>
> FYI, I am comparing Airflow with Quartz, which has mistrigger handling
> built in. Mistrigger in Quartz means that the task was not started within a
> pre-configured interval beginning from the scheduled start time. Isn't
> there something similar in Airflow?
>
> Regards
> Shubham Gupta
>
> On Fri, Jul 20, 2018 at 1:42 PM Shubham Gupta <shubham180695...@gmail.com>
> wrote:
>
> > Hi Ruiqin Yang,
> >
> > Can you please elaborate on what is meant by "and the scheduler lost
> > track of it" in your second paragraph? When can this happen? Also, what
> > is the default state when the scheduler restarts? Is it not *None*?
> >
> > Thanks for your quick reply.
> >
> > Regards
> > Shubham Gupta
> >
> > On Fri, Jul 20, 2018 at 1:04 AM Ruiqin Yang <yrql...@gmail.com> wrote:
> >
> >> Hi Shubham,
> >>
> >> A worker running an actual Airflow task will regularly heartbeat, which
> >> updates the task instance entry in the DB. The scheduler will kill task
> >> instances that have gone without a heartbeat for a long time (these are
> >> called zombie tasks), and if the task has retries left it will try to
> >> reschedule it (given all trigger rules are satisfied).
> >>
> >> If workers are under heavy load, the scheduler will still be able to
> >> schedule tasks (putting tasks into the worker queue), and you will just
> >> wait for workers to pick up the tasks from the queue. If the tasks never
> >> get picked up and the scheduler loses track of them, their state will be
> >> reset to NONE when the scheduler restarts; these are called orphan tasks.
> >>
> >> FYI, inside Airbnb, Alex Guziel (@saguziel <https://github.com/saguziel>)
> >> has a patch that requeues tasks if they don't get picked up by workers
> >> for a long time, and he plans to open source it.
> >>
> >> Cheers,
> >> Kevin Y
> >>
> >> On Fri, Jul 20, 2018 at 12:40 AM Shubham Gupta <shubham180695...@gmail.com>
> >> wrote:
> >>
> >> > Hi,
> >> >
> >> > I would like to know what happens if a Celery worker running one of the
> >> > tasks crashes. Will the job be rescheduled?
> >> >
> >> > Also, if the scheduler is not able to schedule a task on time due to
> >> > heavy load on all workers, what will happen to the task?
> >> >
> >> > Regards
> >> > Shubham Gupta