Thanks Fokko,

We are exploring the Kubernetes executor, but the number of such long-running jobs would be in the thousands, so having some non-blocking mechanism would help. Rescheduling in sensors sounds good; we will explore it. Is it available in Airflow 1.10.1?

Thanks,
Raman Gupta

On Fri, Feb 1, 2019 at 2:57 PM Driesprong, Fokko <[email protected]> wrote:

> Hi Raman,
>
> Right now this is the way to go.
>
> Recently there has been a change to the sensor, in which it will be
> rescheduled instead of blocking. So this is something that you might want
> to explore. Otherwise, you might want to choose a more scalable executor
> such as the Celery or Kubernetes executor to ensure you have sufficient
> slots. Adding a waiting state would be an option, but we will need to
> discuss if this justifies the additional complexity. In this case, we could
> also reuse the reschedule status. Hope this helps.
>
> Cheers, Fokko
>
> On Fri, Feb 1, 2019 at 07:27, [email protected] <[email protected]> wrote:
>
> > Hi All,
> >
> > In our workflows we trigger big data jobs which run from a few hours to a
> > few days. Currently our Airflow operator submits the job and keeps on
> > polling its status. Depending upon its status, the next task in the
> > workflow is triggered by the Airflow scheduler.
> > So currently the operator is not doing any useful work but is occupying
> > one worker slot.
> > We are exploring if we can do the following:
> > -> Operator submits the big data job and marks itself success
> > -> Once that job finishes, it sends a notification either through a
> > callback or through some event
> > -> Based on the notification, the dependent task in the workflow is
> > triggered and starts running.
> >
> > We might need to introduce a new operator state like waiting.
> > Does it make sense?
> > Is there any way/workaround to achieve this?
> >
> > Thanks,
> > Raman Gupta
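
For reference, the trade-off discussed in this thread (an operator that blocks a worker slot while polling, versus a sensor that is rescheduled between checks) can be illustrated with a toy model. This is only a sketch in plain Python, not Airflow code: the names Job, slot_ticks_blocking, slot_ticks_reschedule and check_cost, and the abstract "tick" time unit, are all made up for illustration and do not correspond to any Airflow API.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical external job that is finished once `now >= finish_at`."""
    finish_at: int

    def is_done(self, now: int) -> bool:
        return now >= self.finish_at


def slot_ticks_blocking(job: Job, poke_interval: int) -> int:
    """Blocking poll: the task keeps its worker slot while sleeping between checks."""
    now = 0
    while not job.is_done(now):
        now += poke_interval  # sleeping, but the slot stays occupied
    return now  # the slot was held from tick 0 until the job was seen as done


def slot_ticks_reschedule(job: Job, poke_interval: int, check_cost: int = 1) -> int:
    """Reschedule: each check holds a slot for `check_cost` ticks, then releases it."""
    now = 0
    held = 0
    while True:
        held += check_cost  # slot is occupied only while the check runs
        if job.is_done(now):
            return held
        now += poke_interval  # slot is free until the next scheduled check


if __name__ == "__main__":
    job = Job(finish_at=100)
    print("blocking:  ", slot_ticks_blocking(job, poke_interval=10))
    print("reschedule:", slot_ticks_reschedule(job, poke_interval=10))
```

In this model, a job finishing at tick 100 and checked every 10 ticks holds a slot for 100 ticks when blocking, but only for 11 ticks (one per check) when rescheduled. With thousands of such jobs, that difference is what decides how many worker slots are needed.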
