Thanks Fokko,
We are exploring the Kubernetes executor, but the number of such long-running
jobs would be in the thousands, so a non-blocking mechanism would help.
Rescheduling in sensors sounds good; we will explore it. Is it available in
Airflow 1.10.1?
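
To make the rescheduling idea concrete, here is a minimal pure-Python sketch
of sensors that give their worker slot back between status checks instead of
holding it while the job runs. It is not Airflow code and only illustrates
the scheduling pattern: monitor_with_reschedule, finish_ticks, slots, and
poke_interval are hypothetical names, and time is modelled as integer ticks.

```python
import heapq


def monitor_with_reschedule(finish_ticks, slots, poke_interval):
    """Simulate sensors that free their slot between status checks.

    finish_ticks: tick at which each (hypothetical) external job completes.
    slots: number of sensors that may run in the same tick.
    poke_interval: ticks a sensor waits before it is due again.
    Returns {job index: tick at which its sensor observed completion}.
    """
    # Heap of (next_due_tick, job_index); every sensor is first due at tick 0.
    due = [(0, i) for i in range(len(finish_ticks))]
    heapq.heapify(due)
    detected = {}
    tick = 0
    while due:
        ran = 0
        deferred = []
        # Run at most `slots` sensors that are due in this tick.
        while due and due[0][0] <= tick and ran < slots:
            _, job = heapq.heappop(due)
            ran += 1
            if finish_ticks[job] <= tick:
                detected[job] = tick  # job finished: the sensor succeeds
            else:
                # Not finished: release the slot and come back later.
                deferred.append((tick + poke_interval, job))
        for item in deferred:
            heapq.heappush(due, item)
        tick += 1
    return detected


if __name__ == "__main__":
    # 1000 simulated jobs watched with only 4 slots.
    jobs = [50 + (i % 200) for i in range(1000)]
    seen = monitor_with_reschedule(jobs, slots=4, poke_interval=10)
    print(len(seen), "jobs observed as finished")
```

With blocking sensors, 4 slots could watch only 4 jobs at a time; here every
job is eventually observed, at the cost of noticing completion somewhat late.
In Airflow releases that include the sensor change, this behaviour is selected
by passing mode='reschedule' to a sensor; the release notes should confirm
which version first ships it.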

Thanks,
Raman Gupta

On Fri, Feb 1, 2019 at 2:57 PM Driesprong, Fokko <[email protected]>
wrote:

> Hi Raman,
>
> Right now this is the way to go.
>
> Recently there has been a change to the sensor, in which it will be
> rescheduled instead of blocking. So this is something that you might want
> to explore. Otherwise, you might want to choose a more scalable executor
> such as the Celery or Kubernetes executor to ensure you have sufficient
> slots. Adding a waiting state would be an option, but we will need to
> discuss if this justifies the additional complexity. In this case, we could
> also reuse the reschedule status. Hope this helps.
>
> Cheers, Fokko
>
> On Fri, Feb 1, 2019 at 07:27, [email protected] <
> [email protected]> wrote:
>
> > Hi All,
> >
> > In our workflows we trigger big data jobs which run from a few hours to a
> > few days. Currently our Airflow operator submits the job and keeps polling
> > its status. Depending on that status, the next task in the workflow is
> > triggered by the Airflow scheduler.
> > So currently the operator is not doing any useful work but is occupying one
> > worker slot.
> > We are exploring whether we can do the following:
> >  -> The operator submits the big data job and marks itself as success.
> >  -> Once that job finishes, it sends a notification, either through a
> >     callback or through some event.
> >  -> Based on that notification, the dependent task in the workflow is
> >     triggered and starts running.
> >
> > We might need to introduce a new operator state, such as "waiting".
> > Does that make sense?
> > Is there any way or workaround to achieve this?
> >
> > Thanks,
> > Raman Gupta
> >
> >
>
