100%. Adding more helpers is the way to go.

Also a side comment (but quite relevant I think): I think this should be a
good start of making a real "DAG API". We need to - finally - define which
of the utils/functions/methods are the real API of DAGs that users and
providers should be able to use without the danger of breaking
compatibility and which are the internal API of Airflow.
Currently this is a bit "implicit" and blurry so far (based mostly on
common sense). There are some features (Variables, XCom, Connections,
BaseOperator class etc. that constiute the "API" and those are pretty
clear.

But also there are some utils/helpers already which we assume are something
"usable" by DAG/Operator authors but it is not stated anywhere.

At some point of time we need to have a clear separation between those two.
I think when we add more such helpers deliberately - this is a good
opportunity to make at least an inventory and separate documentation page
about "This is what you as DAG/Operator author can use without the fear of
breaking changes".

J.


On Mon, Aug 29, 2022 at 9:31 AM Tzu-ping Chung <[email protected]>
wrote:

> I agree that templating is not the best solution for the contrived
> example. But there are more concrete use cases for the templating mechanism
> that can be useful, such as templating the content of a file (which is
> arguably a consequence of the much to magical nature of argument templating
> to begin with, of course…).
>
> Perhaps instead of trying to apply templating, we should add more XComArg
> helpers to perform argument pre-processing more explicitly? Some helpers
> are already being added for AIP-42 (to be released in 2.4.0) e.g.
> foo(f1().zip(f2())), and we can add more for reading a file, rendering a
> template, etc.
>
> Example:
>
>
> ## code.sh
> echo {{ ds }}
> ##
>
> ## dag.py
> @task
> def script():
>     # To contrived to make sense, but imagine a task that conditional
>     # returns a script to run.
>     return "code.sh"
>
> # The render_template_file helper reads out the file content and renders
> it as a template.
> BashOperator(task_id="run", bash_command=script().render_template_file())
> ##
>
>
> TP
>
>
> On 29 Aug 2022, at 15:18, Jarek Potiuk <[email protected]> wrote:
>
> Agree with Ash,
>
> And I think if we add it, it will detract people from doing the same thing
> in a better way.
>
> I think what I really like about TaskFlow is that it actually does not
> need to use Jinja2 at all.
>
> Jinja2 as I see it, is a super-convoluted way of running (subset of)
> Python via strings. Not only dangerous from a security point of view but
> also has a number of deficiencies (no type validation really possible, no
> good support in IDES, limited set of filters to use and plenty of other
> limitations and problems).
>
> You could not avoid using Jinja2 in "classic" operators which are treated
> as black-box. And this is the only reason IMHO we use JINJA2 at all
> in Airflow.
>
> If you look at what features TaskFlow provides - one of its features is to
> make our code pure-Python, and Jinja2-less. Simply, if you want to
> dynamically derive a value - do it in Python code, not in JINJA2. If we
> think that we need a feature, rather than implementing it in Jinja. For me
> this is quite simple equation:
>
> * Classic Operators = Use Jinja2
> * Task Flow = Do not use Jinja2
>
> The example of Ash is a pure manifestation of that.
>
> J.
>
>
> On Mon, Aug 29, 2022 at 9:01 AM Ash Berlin-Taylor <[email protected]> wrote:
>
>> In this example I'd say it shouldn't be templated, as you can do it via
>>
>> @task
>> def get_templated_echo(ds):
>> return f'echo "The date is {ds}"'
>>
>> (This is nothing particular to XcomArg either, just the way Xcom had
>> always worked.)
>>
>> I don't think we can change the current/default behaviour without it
>> likely being a breaking change to someone's workflow.
>>
>> If there's enough desire and a good enough use case we can add a way to
>> make it something you can turn on via one way or another.
>>
>> -ash
>>
>>
>>
>> On 29 August 2022 07:46:01 BST, Tzu-ping Chung <[email protected]>
>> wrote:
>>>
>>> Context: https://github.com/apache/airflow/issues/26016
>>>
>>> Currently if a value is passed to a downstream via XComArg, e.g.
>>>
>>> @taskdef get_templated_echo():    return 'echo "The date is {{ ds }}"'
>>> BashOperator(task_id="echo_task", bash_command=get_templated_echo())
>>>
>>>
>>> Then this value does not go through Jinja2 templating, i.e. the
>>> BashOperator above would print The date is {{ ds }} instead of The date
>>> is 2022-08-29. The question here is, is this a conscious design
>>> decision, and is there reasoning behind it? Personally I can see arguments
>>> from both sides, but some context would be very appreciated.
>>>
>>> And, if there wasn’t an explicit decision, should template-rendering an
>>> XComArg be supported?
>>>
>>> TP
>>>
>>
>

Reply via email to