But maybe it’s time to reconsider that :). Curious to see what others think.

On Fri 2 Feb 2024 at 20:30, Pierre Jeambrun <pierrejb...@gmail.com> wrote:

> I like the idea and I understand that it might help in some use cases.
>
> The first concern I have is that it would allow user code to run in
> the scheduler, if I understand correctly. This would have big implications
> for security and for how our security model works. (For instance, the
> scheduler is a trusted component and has direct access to the DB, per the
> AIP-44 assumption.)
>
> If I remember correctly this is a route that we specifically tried to stay
> away from.
>
> On Fri 2 Feb 2024 at 20:03, Xiaodong (XD) DENG <xd.d...@apple.com.invalid>
> wrote:
>
>> Hi folks,
>>
>> I’m writing to share a thought regarding the possibility of supporting
>> “custom TI dependencies”.
>>
>> Currently we maintain the dependency check rules under
>> “airflow.ti_deps.deps”. They cover checks such as whether a pool slot is
>> available, whether concurrency limits allow, TI trigger rules, whether the
>> state is valid, etc., and they play an essential role in the scheduling
>> process.
>>
>> One idea was brought up in our team's internal discussion: why shouldn’t
>> we support custom TI dependencies?
>>
>> In detail: just like the cluster policies
>> (dag_policy/task_policy/task_instance_mutation_hook/pod_mutation_hook), if
>> we let users add their own dependency checks as custom classes (also
>> placed under airflow_local_settings.py), it would give users much higher
>> flexibility in TI scheduling. These custom TI deps would be added on top
>> of the existing default deps (not replacing or removing any of them).
>>
>> For example: similar to the checks for pool availability/concurrency, a
>> job may need to check the user’s infra-specific conditions, like whether a
>> GPU is available right now (instead of competing with other jobs at
>> random), or whether an external system API is ready to be called
>> (otherwise wait a bit). And there are many more possibilities. A rough
>> sketch follows.
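>>
>> To make this concrete, here is a minimal sketch of what such a custom dep
>> could look like, modelled on the existing classes under
>> airflow.ti_deps.deps. Note that BaseTIDep and its _get_dep_statuses
>> contract are Airflow internals today, not a public API, and
>> gpu_is_available() is a purely hypothetical helper:
>>
>>     from airflow.ti_deps.deps.base_ti_dep import BaseTIDep
>>
>>     class GpuAvailableDep(BaseTIDep):
>>         NAME = "Custom GPU Availability"
>>         IGNORABLE = True
>>
>>         def _get_dep_statuses(self, ti, session, dep_context):
>>             # gpu_is_available() stands in for an infra-specific probe.
>>             if gpu_is_available():
>>                 yield self._passing_status(reason="A GPU is free.")
>>             else:
>>                 # A failing status keeps the TI from being scheduled
>>                 # on this pass; the scheduler would re-evaluate later.
>>                 yield self._failing_status(reason="No GPU available yet.")
>>
>> How such a class would be registered (e.g. via a hypothetical list in
>> airflow_local_settings.py, next to the policy hooks) is exactly what this
>> proposal would need to define.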
>>
>> Why won’t cluster policies help here? task_instance_mutation_hook is
>> executed in a “worker”, not in the DAG file processor, just before the TI
>> is executed. What we are trying to gain control over here, though, is the
>> scheduling process (based on custom rules, deciding whether the TI state
>> should be updated so it can be scheduled for execution).
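>>
>> For contrast, the existing hook in airflow_local_settings.py can only
>> mutate the TI object right before it runs (the retry_queue value below is
>> just illustrative):
>>
>>     def task_instance_mutation_hook(task_instance):
>>         # Runs just before the TI is executed, so it can tweak
>>         # attributes like the queue, but it cannot make the scheduler
>>         # hold the TI back based on a custom condition.
>>         if task_instance.try_number >= 2:
>>             task_instance.queue = "retry_queue"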
>>
>> I would love to know how the community finds this idea before we start
>> implementing anything. Any question/suggestion would be greatly
>> appreciated. Many thanks!
>>
>>
>> XD
>>
