Naive question: instead of running the code on the scheduler, could the
condition check be delegated to the triggerer?
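
For instance, I am imagining something along the lines of the sketch below
(purely a sketch: the class name, module path and the external-check helper
are made up; only BaseTrigger/TriggerEvent are existing Airflow APIs):

import asyncio

from airflow.triggers.base import BaseTrigger, TriggerEvent


async def check_external_condition() -> bool:
    # Hypothetical placeholder for an infra-specific check
    # (GPU availability, external API readiness, ...).
    return True


class ConditionTrigger(BaseTrigger):
    """Sketch: poll a condition in the triggerer rather than in the scheduler."""

    def __init__(self, poke_interval: float = 60.0):
        super().__init__()
        self.poke_interval = poke_interval

    def serialize(self):
        # The classpath must point to wherever this class actually lives.
        return ("my_company.triggers.ConditionTrigger", {"poke_interval": self.poke_interval})

    async def run(self):
        # Runs in the triggerer's event loop; the deferred task is resumed
        # only once the condition passes, without occupying a worker slot.
        while not await check_external_condition():
            await asyncio.sleep(self.poke_interval)
        yield TriggerEvent({"status": "ready"})

A task would then defer itself onto such a trigger (self.defer(trigger=...,
method_name=...)) and only go back to the executor once the condition is met.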

On Fri, Feb 2, 2024 at 2:33 PM Pierre Jeambrun <pierrejb...@gmail.com>
wrote:

> But maybe it’s time to reconsider that :), curious to see what others
> think.
>
> On Fri 2 Feb 2024 at 20:30, Pierre Jeambrun <pierrejb...@gmail.com> wrote:
>
> > I like the idea and I understand that it might help in some use cases.
> >
> > The first concern that I have is that it would allow user code to run in
> > the scheduler, if I understand correctly. This would have big
> > implications in terms of security and how our security model works. (For
> > instance, the scheduler is a trusted component and has direct access to
> > the DB, per the AIP-44 assumption.)
> >
> > If I remember correctly this is a route that we specifically tried to
> > stay away from.
> >
> > On Fri 2 Feb 2024 at 20:03, Xiaodong (XD) DENG
> > <xd.d...@apple.com.invalid> wrote:
> >
> >> Hi folks,
> >>
> >> I’m writing to share my thoughts regarding the possibility of supporting
> >> “custom TI dependencies”.
> >>
> >> Currently we maintain the dependency check rules under
> >> “airflow.ti_deps.deps”. They cover dependency checks such as whether a
> >> pool slot is available, whether concurrency limits allow it, whether the
> >> TI trigger rules are met, whether the state is valid, etc., and they
> >> play an essential role in the scheduling process.
> >>
> >> One idea was brought up in our team's internal discussion: why shouldn’t
> >> we support custom TI dependencies?
> >>
> >> In detail: just like the cluster policies
> >> (dag_policy/task_policy/task_instance_mutation_hook/pod_mutation_hook),
> >> if we support users adding their own dependency checks as custom classes
> >> (also placed under airflow_local_settings.py), it will give users much
> >> greater flexibility in TI scheduling. These custom TI deps should be
> >> added as additions to the existing default deps (not replacing or
> >> removing any of them).
> >>
> >> For example: similar to the checks for pool availability/concurrency,
> >> the job may need to check the user’s infra-specific conditions, like
> >> whether a GPU is available right now (instead of competing with other
> >> jobs randomly), or whether an external system API is ready to be called
> >> (otherwise wait a bit). And there are a lot of other possibilities.
> >>
> >> Why won’t cluster policies help here? task_instance_mutation_hook is
> >> executed in a “worker”, not in the DAG file processor, just before the
> >> TI is executed. What we are trying to gain control over here, though, is
> >> the scheduling process (based on custom rules, deciding whether the TI
> >> state should be updated so it can be scheduled for execution).
> >>
> >> I would love to know how the community finds this idea before we start
> >> to implement anything. Any question/suggestion would be greatly
> >> appreciated. Many thanks!
> >>
> >>
> >> XD
> >>
> >>
> >>
>
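
For context, the kind of custom dep XD describes would presumably end up
looking roughly like the sketch below (names and the GPU check are made up;
BaseTIDep and its status helpers are the existing internal API, and today
there is no supported way to register extra deps, which is exactly what this
proposal is about):

from airflow.ti_deps.deps.base_ti_dep import BaseTIDep


def gpu_is_free() -> bool:
    # Hypothetical infra-specific check.
    return True


class GpuAvailableDep(BaseTIDep):
    """Sketch of a custom dep: only let the TI be scheduled while a GPU is free."""

    NAME = "GPU Available"
    IGNORABLE = True

    def _get_dep_statuses(self, ti, session, dep_context):
        if gpu_is_free():
            yield self._passing_status(reason="A GPU slot is available.")
        else:
            yield self._failing_status(reason="No GPU slot is available yet.")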
