This is an absolute minefield of performance issues, as this sits at the very centre
of the scheduler’s hot path, so I would advise against any casual use of
this.

That said: I think it’s possible today. Set `deps` on your operator class, as in
https://github.com/apache/airflow/blob/main/airflow/models/baseoperator.py#L1128-L1135
(I don’t know if it can be set per operator instance or only per operator
class.) So you can set this from the task_policy.
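
Very roughly, something like this in airflow_local_settings.py (untested
sketch; the dep class, the GPU check and the `needs_gpu` opt-in attribute are
all made up for illustration, and whether the instance-level assignment of
`deps` actually sticks is exactly the bit I’m not sure about):

    from airflow.ti_deps.deps.base_ti_dep import BaseTIDep


    class GpuAvailableDep(BaseTIDep):
        """Made-up example dep: only schedule the TI when a GPU slot is free."""

        NAME = "GPU Available"
        IGNORABLE = True

        def _get_dep_statuses(self, ti, session, dep_context):
            # This runs in the scheduler's hot path for every candidate TI,
            # so it has to be very cheap and must not call external systems
            # synchronously on every evaluation.
            if _gpu_slot_free():  # hypothetical, user-supplied fast check
                yield self._passing_status(reason="A GPU slot is free.")
            else:
                yield self._failing_status(reason="No GPU slot available.")


    def _gpu_slot_free() -> bool:
        # Placeholder for the user's real infra check (e.g. reading a value
        # refreshed out-of-band); it must never touch the Airflow DB.
        return True


    def task_policy(task):
        # Only opt-in tasks get the extra dep; note this adds to, not
        # replaces, the operator's default deps.
        if getattr(task, "needs_gpu", False):
            task.deps = task.deps | {GpuAvailableDep()}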

I would like us not to mention this in the docs as I think it’s the 1% of 1% of 
users who can do this without blowing their own foot off. Not until we have 
some good examples of how to do this well from outside of core.

(Not to mention the caveats raised above about this being “user/operator” code 
execution and all that entails for any work on a multi-tenant scheduler in the 
future)

-ash

> On 5 Feb 2024, at 04:09, Amogh Desai <amoghdesai....@gmail.com> wrote:
> 
> I already like the idea.
> 
> It can indeed offer better flexibility and control over the scheduling
> process, beyond the default checks.
> 
> My main concern with the approach is in line with what Jens mentioned:
> performance. But I like your mitigation through a metric. I'd suggest
> that the metric should have a threshold value, beyond which using the
> custom TI dep is not advised at all.
> 
> My other concern would be upgrades. These dependency classes could become
> outdated with Airflow version upgrades. Users should
> be willing to rewrite / update their custom code based on the Airflow
> version they upgrade to.
> 
> Thanks & Regards,
> Amogh Desai
> 
> 
> 
> 
> On Sun, Feb 4, 2024 at 5:58 AM Xiaodong (XD) DENG <xd.d...@apple.com.invalid>
> wrote:
> 
>> Hi Jens,
>> 
>> Many thanks for the feedback.
>> 
>> I fully agree on the potential performance impact if the custom dependency
>> is not written in the ideal way, and thanks for bringing it up. The deps
>> are indeed called very frequently, so attention should be paid to ensure
>> the custom dependency does not drag down performance.
>> 
>> This is actually somewhat like writing DAG files in Airflow: attention
>> should be paid, otherwise the DAG processor performance would be impacted by
>> the “bad” DAG(s).
>> 
>> Other than highlighting this to users in the documentation, maybe we
>> should also consider adding a metric to measure the custom dependency
>> performance (just like the DAG parsing speed), and give a warning in the
>> logs if it’s not performant.
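>> 
>> Roughly something like the following, as an untested sketch (the metric
>> name, the threshold and the wrapper itself are all made up here, not an
>> existing Airflow API):
>> 
>>     import logging
>>     import time
>>     from datetime import timedelta
>> 
>>     from airflow.stats import Stats
>> 
>>     log = logging.getLogger(__name__)
>> 
>>     # Made-up threshold: anything slower gets called out in the logs.
>>     CUSTOM_DEP_WARNING_THRESHOLD = 0.1  # seconds
>> 
>>     def timed_dep_statuses(dep, ti, session, dep_context):
>>         """Run one custom dep check, emit a timing metric, warn if slow."""
>>         start = time.monotonic()
>>         statuses = list(dep.get_dep_statuses(ti, session, dep_context))
>>         elapsed = time.monotonic() - start
>>         Stats.timing("custom_ti_dep.check_duration", timedelta(seconds=elapsed))
>>         if elapsed > CUSTOM_DEP_WARNING_THRESHOLD:
>>             log.warning("Custom TI dep %s took %.3fs for %s", dep.name, elapsed, ti)
>>         return statuses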
>> 
>> Regarding an AIP, I personally don’t find it necessary, as this is not going
>> to be too big a change, nor will it significantly touch the architecture. But I
>> would definitely welcome more input here in this email thread.
>> 
>> Thanks again!
>> 
>> 
>> XD
>> 
>>> On Feb 3, 2024, at 00:55, Scheffler Jens (XC-AS/EAE-ADA-T)
>>> <jens.scheff...@de.bosch.com.INVALID> wrote:
>>> 
>>> I like the idea.
>>> 
>>> I’d propose writing up an AIP so that details of the concept can be
>>> discussed before raising a PR, if you feel details might need to be aligned.
>>> 
>>> My primary concern would be that if somebody uses this with external data
>>> or parameters from the DB, it could bring down performance or reduce
>>> availability for scheduling, because the code might be called often. But
>>> this should just be documented and sit in the users'/admins' responsibility.
>>> 
>>> Sent from Outlook for iOS<https://aka.ms/o0ukef>
>>> ________________________________
>>> From: Pierre Jeambrun <pierrejb...@gmail.com>
>>> Sent: Friday, February 2, 2024 11:26:47 PM
>>> To: dev@airflow.apache.org <dev@airflow.apache.org>
>>> Subject: Re: Idea for Discussion: custom TI dependencies
>>> 
>>> I thought that it would allow users to code their own rules and provide
>>> them to the tasks somehow. If this is restricted to admins or deployment
>>> managers through plugins or another mechanism, I guess there is no issue
>>> here.
>>> 
>>> Thanks for answering my concerns.
>>> 
>>> On Fri 2 Feb 2024 at 22:53, Constance Martineau
>>> <consta...@astronomer.io.invalid> wrote:
>>> 
>>>> Not missing anything. It is mainly used for deferrable operators, but it
>>>> feels like an acceptable place to run the user-defined/non-Airflow code
>>>> portion of this and carry out the custom dependency checks. Assuming, of
>>>> course, that these checks are meant to check external conditions and will
>>>> never need access to Airflow's DB. Once the condition is met, the scheduler
>>>> could carry on scheduling the task.
>>>> 
>>>> I like the idea, and it was just a thought on how to get around the
>>>> security concern Pierre brought up.
>>>> 
>>>> On Fri, Feb 2, 2024 at 4:35 PM Xiaodong (XD) DENG
>>>> <xd.d...@apple.com.invalid>
>>>> wrote:
>>>> 
>>>>> Thanks both for your inputs!
>>>>> 
>>>>> 
>>>>> Hi Pierre,
>>>>> 
>>>>> I think the key difference here is: by doing this, we are not allowing
>>>>> Airflow “users” to run their code in the scheduler. We are only allowing
>>>>> Airflow “Admins” to deploy a plugin to run in the scheduler, just the same
>>>>> as dag_policy/task_policy/task_instance_mutation_hook/pod_mutation_hook.
>>>>> 
>>>>> So I do not think this violates our current preference in terms of
>>>>> security.
>>>>> 
>>>>> 
>>>>> Hi Constance,
>>>>> 
>>>>> I thought the triggerer is mainly for deferrable operator cases? It’s
>>>>> quite a different scenario from what I’m trying to cover here, IMHO.
>>>>> Did I miss anything? Please let me know.
>>>>> 
>>>>> 
>>>>> Thanks again! Looking forward to more questions/comments!
>>>>> 
>>>>> 
>>>>> XD
>>>>> 
>>>>> 
>>>>>> On Feb 2, 2024, at 13:29, Constance Martineau
>>>>>> <consta...@astronomer.io.INVALID> wrote:
>>>>>> 
>>>>>> Naive question: Instead of running the code on the scheduler - could
>>>> the
>>>>>> condition check be delegated to the triggerer?
>>>>>> 
>>>>>> On Fri, Feb 2, 2024 at 2:33 PM Pierre Jeambrun <pierrejb...@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> But maybe it’s time to reconsider that :), curious to see what others
>>>>>>> think.
>>>>>>> 
>>>>>>> On Fri 2 Feb 2024 at 20:30, Pierre Jeambrun <pierrejb...@gmail.com>
>>>>> wrote:
>>>>>>> 
>>>>>>>> I like the idea and I understand that it might help in some use
>>>> cases.
>>>>>>>> 
>>>>>>>> The first concern that I have is that it would allow user code to run in
>>>>>>>> the scheduler, if I understand correctly. This would have big implications
>>>>>>>> in terms of security and how our security model works. (For instance, the
>>>>>>>> scheduler is a trusted component and has direct access to the DB, an
>>>>>>>> AIP-44 assumption.)
>>>>>>>> 
>>>>>>>> If I remember correctly, this is a route that we specifically tried to
>>>>>>>> stay away from.
>>>>>>>> 
>>>>>>>> On Fri 2 Feb 2024 at 20:03, Xiaodong (XD) DENG
>>>>>>>> <xd.d...@apple.com.invalid> wrote:
>>>>>>>> 
>>>>>>>>> Hi folks,
>>>>>>>>> 
>>>>>>>>> I’m writing to share my thought regarding the possibility of
>>>>> supporting
>>>>>>>>> “custom TI dependencies”.
>>>>>>>>> 
>>>>>>>>> Currently we maintain the dependency check rules under
>>>>>>>>> “airflow.ti_deps.deps”. They cover dependency checks like whether there
>>>>>>>>> are available pool slots / whether the concurrency allows it / TI
>>>>>>>>> trigger rules / whether the state is valid, etc., and play an essential
>>>>>>>>> role in the scheduling process.
>>>>>>>>> 
>>>>>>>>> One idea was brought up in our team's internal discussion: why
>>>>>>>>> shouldn’t we support custom TI dependencies?
>>>>>>>>> 
>>>>>>>>> In detail: just like the cluster policies
>>>>>>>>> (dag_policy/task_policy/task_instance_mutation_hook/pod_mutation_hook),
>>>>>>>>> if we let users add their own dependency checks as custom classes (also
>>>>>>>>> placed under airflow_local_settings.py), it will give users much higher
>>>>>>>>> flexibility in TI scheduling. These custom TI deps should be added on
>>>>>>>>> top of the existing default deps (not replacing or removing any of them).
>>>>>>>>> 
>>>>>>>>> For example: similar to the checks for pool availability/concurrency,
>>>>>>>>> the job may need to check the user’s infra-specific conditions, like
>>>>>>>>> whether a GPU is available right now (instead of competing with other
>>>>>>>>> jobs randomly), or whether an external system API is ready to be called
>>>>>>>>> (otherwise wait a bit). And there are a lot of other possibilities.
>>>>>>>>> 
>>>>>>>>> Why won’t cluster policies help here? task_instance_mutation_hook is
>>>>>>>>> executed in a “worker”, not in the DAG file processor, just before the
>>>>>>>>> TI is executed. What we are trying to gain some control over here,
>>>>>>>>> though, is the scheduling process (deciding, based on custom rules,
>>>>>>>>> whether the TI state should be updated so it can be scheduled for
>>>>>>>>> execution).
>>>>>>>>> 
>>>>>>>>> I would love to know how the community finds this idea before we start
>>>>>>>>> to implement anything. Any question/suggestion would be greatly
>>>>>>>>> appreciated. Many thanks!
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> XD
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>> 
>> 
>> 
>> 
