Declaring connections prior to task execution was already proposed in AIP-1
:-). At that time, I had in mind passing the required settings to the task
over IPC. Registration could then happen with a manifest. Maybe this could
be obtained unobtrusively during DAG serialization? The benefit is that
tasks become truly atomic, or independent from Airflow, as long as they
communicate their exit codes (success, failed, and I think Ash had a couple
of others in mind - the fewer the better).
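
To make the manifest idea a bit more concrete, here is a rough sketch of
what I mean. Everything in it is made up for illustration (the manifest
shape, the connection store, the names); it is not an existing Airflow
interface. The parent resolves only the declared connections, hands them to
the task over stdin, and the task reports back with nothing but its exit
code.

```python
import json
import subprocess
import sys

# Hypothetical manifest: what the task declares before it runs.
MANIFEST = {"task_id": "load_orders", "connections": ["warehouse_db"]}

# Hypothetical stand-in for whatever store the parent process can read.
CONNECTION_STORE = {
    "warehouse_db": {"host": "db.example.invalid", "port": 5432},
    "unrelated": {"host": "other.example.invalid", "port": 1},
}

# The task itself: it reads its settings from stdin and knows nothing about
# the parent. Its only report back is the exit code.
CHILD_SOURCE = """
import json, sys
settings = json.load(sys.stdin)
conn = settings["connections"].get("warehouse_db")
sys.exit(0 if conn and conn["port"] == 5432 else 1)
"""


def run_task(manifest, store):
    """Pass only the declared connections to the task and map its exit code."""
    payload = {
        "connections": {name: store[name] for name in manifest["connections"]}
    }
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=json.dumps(payload),
        text=True,
    )
    return "success" if proc.returncode == 0 else "failed"


if __name__ == "__main__":
    print(run_task(MANIFEST, CONNECTION_STORE))
```

Note that the task never sees the "unrelated" connection, because it did
not declare it.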

If you want two-way communication, for example for variables, which can
change during scheduling, this can happen with AIP-44. I'd prefer it to
happen with the *executor* rather than some centralized service, though. If
the executor is used, IPC is the logical choice. The benefit is better
resiliency, and you can start to think about zero-downtime upgrades.
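
A minimal sketch of the two-way variant, again with invented names: the
message format ("op", "key", "value") is an assumption of mine and not
anything AIP-44 defines. The parent plays the role of the executor and
answers variable requests over the task's stdin/stdout pipes.

```python
import json
import subprocess
import sys

# The task asks its parent for a variable at run time, one JSON message per
# line, and decides its exit code from the reply.
CHILD_SOURCE = """
import json, sys
print(json.dumps({"op": "get_variable", "key": "batch_size"}), flush=True)
reply = json.loads(sys.stdin.readline())
sys.exit(0 if reply.get("value") == 500 else 1)
"""


def serve_task(variables):
    """Run the task and answer its variable requests over the pipes."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    requests = []
    # The loop ends when the task exits and its stdout is closed.
    for line in proc.stdout:
        request = json.loads(line)
        requests.append(request)
        reply = {"value": variables.get(request["key"])}
        proc.stdin.write(json.dumps(reply) + "\n")
        proc.stdin.flush()
    proc.stdin.close()
    return proc.wait(), requests


if __name__ == "__main__":
    exit_code, seen = serve_task({"batch_size": 500})
    print(exit_code, seen)
```

Since the value is looked up when the task asks for it, a variable that
changed after scheduling is picked up without the task talking to any
central service.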

So I hope Ash takes this to 2024 :-).

B.


On Mon, 13 May 2024 at 19:27, Ash Berlin-Taylor <a...@apache.org> wrote:

> > That would require some mechanism of declaring prior to task execution
> > what connections would be used
>
> That’s exactly what I’m proposing in the proposal doc I’m working on (It’s
> part of also overhauling and re-designing the “Task Execution interface”
> that also gives us the ability to nicely have support for running tasks in
> other languages — much more than just BashOperator)
>
> This is a bit of a fundamental shift in thinking about task execution in
> Airflow, but I think it gives us some really nice properties that the
> project is currently missing.
>
> Tl;dr: let's discuss this in my doc when it comes out (next week most
> likely) please :)
>
> -ash
>
> > On 13 May 2024, at 18:15, Daniel Standish
> > <daniel.stand...@astronomer.io.INVALID> wrote:
> >
> > re
> >
> >> As tasks require connection access, I assume connection data will somehow
> >> be passed as part of the metadata to task execution - whether it's part
> >> of the executor protocol or in some other way (I'm not an expert on that
> >> part of Airflow). Then, provided it's accessible as part of some
> >> execution context, and not only passed to the task's execute method,
> >> OpenLineage could utilize it.
> >>
> >
> > It's not strictly necessary that connection info be passed "as part of
> > task metadata".  That would require some mechanism of declaring prior to
> > task execution what connections would be used.  This is a thought that
> > has come up when thinking about execution of non-Python tasks.  But it's
> > not required from a technical perspective by AIP-44 because the
> > `get_connection` function can be made to be an RPC call so a task could
> > continue to retrieve connections at runtime.
>
>

-- 

Bolke de Bruin
bdbr...@gmail.com
