Thank you all for your answers. I appreciate your detailed responses.

I am changing my vote to +1 (binding)

Vikram

On Tue, Jan 20, 2026 at 10:22 AM Jarek Potiuk <[email protected]> wrote:

> That's a sign that the documentation for it is really needed and likely
> that it should be written by someone who has not spent hours discussing it
> and makes a lot of mental shortcuts :). Also - I think I had very similar
> questions at the beginning of discussions with David, only to find out that
> I had to challenge my own "airflow-centric" approach and then it started to
> make sense.
>
> I will leave it to David to provide his - excellent - examples from the
> work he was doing at his day job and the performance gains he got. But let
> me try to clarify the small/huge confusion.
>
> There are two performance aspects of this one and a pattern that we should
> - again indeed - document better.
>
> 1) many small async requests
> 2) that can often produce huge responses that need to be dealt with
>
> Both are addressed actually by this "syntactic sugar" - even if they might
> seem contradicting (and request/response is what matters here).
>
> 1) *many small async requests:* the case I related to: when you make a
> lot of small async requests, the overhead to run such requests in a local
> async loop is very small (and this is actually used in Triggerer - which
> shares multiple async calls from a number of tasks that are waiting for
> something - it can easily run thousands of such async requests - and handle
> responses efficiently). But if you want to run **many** similar requests
> concurrently in the "native" Airflow way - using DeferrableOperator, in
> Airflow currently you have to run Mapped tasks. This introduces overhead:
> creating DagRuns with say - thousands of TaskInstance entities, starting
> thousands of worker "processes" (David works on optimising this part as
> well in [1] [2]), communication between those processes and DB for
> triggerer, picking up the tasks by triggerer from the DB,
> serialization/deserialization of the answers and sending them to the
> mapped workers, finally - possibly "reducing" the output from those
> multiple tasks to be read in some downstream task that might need those
> outputs to be combined. All this overhead is gone (completely) if you run
> all those operations in the worker - immediately in a dedicated async loop
> - rather than doing all the worker -> DB -> Triggerer -> worker dance.
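As a minimal sketch of the "many small async requests in one loop" pattern described above: the code below uses only plain asyncio and is not the AIP-98 API. `fetch_one` is a hypothetical stand-in for an async hook call, and the `limit` parameter is an assumption added to bound concurrency.

```python
import asyncio


async def fetch_one(item_id: int) -> dict:
    # Hypothetical stand-in for one small async request (e.g. an async hook call).
    await asyncio.sleep(0.01)
    return {"id": item_id, "status": "ok"}


async def fetch_all(item_ids: list[int], limit: int = 100) -> list[dict]:
    # Bound concurrency so thousands of requests do not hit the remote side at once.
    semaphore = asyncio.Semaphore(limit)

    async def bounded(item_id: int) -> dict:
        async with semaphore:
            return await fetch_one(item_id)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(i) for i in item_ids))


def run_small_requests(n: int = 1000) -> list[dict]:
    # One worker process, one event loop, no per-request task instances.
    return asyncio.run(fetch_all(list(range(n))))
```

All requests share a single event loop in a single process, which is the overhead saving the paragraph above refers to.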
>
> This is all without any "huge" payload returned. For small tasks - the
> overhead here is very real (orders of magnitude) and David has numbers to
> back it up from his own experience.
>
> There are a few differences of course vs. Deferrable operators:
>  a) you can't track individual tasks (also you can't retry them
> individually)
>  b) you have to handle the errors in this worker that runs them (answering
> your question)
>  c) you do not handle persistence of the intermediate results - they all
> are stored in memory by default
>  d) worker is not freed while waiting - it is busy running the async
> loop.
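Point b) above can be illustrated with a small sketch, assuming plain asyncio rather than any Airflow API. `flaky_call` is a hypothetical async call that fails for some inputs. The worker code itself has to separate successes from failures, because there are no individual task instances to retry.

```python
import asyncio


async def flaky_call(item_id: int) -> int:
    # Hypothetical async call; every third item fails, purely for illustration.
    await asyncio.sleep(0)
    if item_id % 3 == 0:
        raise RuntimeError(f"item {item_id} failed")
    return item_id * 10


async def run_batch(item_ids: list[int]) -> tuple[dict, dict]:
    # return_exceptions=True keeps one failure from discarding the other results.
    outcomes = await asyncio.gather(
        *(flaky_call(i) for i in item_ids), return_exceptions=True
    )
    successes, failures = {}, {}
    for item_id, outcome in zip(item_ids, outcomes):
        if isinstance(outcome, Exception):
            failures[item_id] = outcome
        else:
            successes[item_id] = outcome
    # Both dicts live only in memory (point c); nothing is persisted per item.
    return successes, failures
```

What to do with `failures` (retry in place, raise, or log and continue) is left to the task author in this sketch.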
>
> But - you can immediately and natively use async hooks we developed for
> Deferrable Operators - without worrying about starting and managing the
> loop yourself. Today you could do the same without the `async task` sugar,
> but you would have to repeat the async loop initialization code in each
> such task and the loop would not be "airflow" managed (which will come in
> handy in the future).
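The loop boilerplate mentioned above might look roughly like the following sketch. `run_in_own_loop` is a hypothetical helper written for illustration, not Airflow's implementation, and the loop it creates is not managed by Airflow in any way.

```python
import asyncio
import functools


def run_in_own_loop(async_fn):
    """Hypothetical helper: make a coroutine function callable synchronously."""

    @functools.wraps(async_fn)
    def wrapper(*args, **kwargs):
        # asyncio.run() creates a fresh event loop, runs the coroutine, closes the loop.
        return asyncio.run(async_fn(*args, **kwargs))

    return wrapper


@run_in_own_loop
async def collect_lengths(words: list[str]) -> list[int]:
    async def length(word: str) -> int:
        await asyncio.sleep(0)  # stand-in for an awaited I/O call
        return len(word)

    return await asyncio.gather(*(length(w) for w in words))
```

Without a shared helper like this, the `asyncio.run(...)` wrapping would have to be repeated in every synchronous task that wants to use async hooks.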
>
> 2) *many async requests with large payloads*: The case that Daniel talked
> about - which is similar to that above, but also involves potentially
> "large" payload returned - in a number of cases the returned data from the
> async tasks that needs to be further processed is huge, or just "big":
> either small enough to fit in memory, or too big for that, in which case a
> streaming async interface lets us get chunks of it at a time. In such a
> case if you use the
> classic "Deferrable Operator", you would need to get that data from
> multiple mapped tasks, and store them in XCom, so that the downstream task
> will possibly aggregate and process the data. With "natively async tasks"
> from AIP-98 - all that can be "compacted" into a single async task running
> a number of parallel tasks - possibly streaming the payload and processing
> it and producing aggregated output as a smaller "in-memory" data being
> output of all those async responses - and storing that as an XCom to
> downstream tasks. So additionally to the overhead from 1) above that I
> wrote about, the 2) XCom overhead that David wrote about is added. Again -
> in this case you are not really using integration with Airflow UI - i.e.
> those monitoring and management features that we already have - seeing logs
> individually, seeing data returned individually, ability to partially
> reprocess such mapped tasks - all this is gone, because essentially we
> compact it all into a single process running separate async loop and doing
> both "map" and "reduce" part of what mapped tasks are designed for -
> without the monitoring/management, but also without the overhead it causes.
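A minimal sketch of doing both the "map" and the "reduce" in one async task, assuming plain asyncio. `stream_pages` is a hypothetical paginated source. Each stream is consumed chunk by chunk, so only small per-source summaries are kept in memory and only the final aggregate would be returned (for example as an XCom).

```python
import asyncio
from typing import AsyncIterator


async def stream_pages(source_id: int, pages: int) -> AsyncIterator[list[int]]:
    # Hypothetical paginated API: yields one page of ten records at a time.
    for page in range(pages):
        await asyncio.sleep(0)
        yield [source_id * 100 + page * 10 + k for k in range(10)]


async def summarize_source(source_id: int, pages: int) -> dict:
    # "Map" step: process each chunk as it arrives, never holding the full payload.
    count, total = 0, 0
    async for chunk in stream_pages(source_id, pages):
        count += len(chunk)
        total += sum(chunk)
    return {"source": source_id, "count": count, "total": total}


async def map_and_reduce(source_ids: list[int], pages: int = 3) -> dict:
    summaries = await asyncio.gather(
        *(summarize_source(s, pages) for s in source_ids)
    )
    # "Reduce" step: combine the small summaries into one compact result.
    return {
        "records": sum(s["count"] for s in summaries),
        "total": sum(s["total"] for s in summaries),
    }
```

The trade-off stays as described above: per-source logs, results and partial reprocessing are not visible individually.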
>
> This could - again - be implemented now as a custom code run in
> "synchronous" tasks. You can create an async loop and write your own async
> methods and add logic to start and wait for them in your synchronous tasks.
> But contrary to the AIP-98 - it can't be extended in the future to provide
> better integration with Airflow UI.
>
> When the async loop will be "managed" by Airflow - AIP-99 and "async task"
> will become a "first class citizen" - we can later leverage async loop
> monitoring for example TaskGroups that allow handling "group exceptions"
> (introduced in Python 3.11) - and better monitoring interfaces of asyncio
> (introduced later). This will allow our users to track progress of
> execution of such async calls running or introduce some more sophisticated
> retry scenarios when some of those async hook calls fail. This all could be
> exposed via Airflow UI with optimised execution API handling bulk status
> update and status querying. Not the full functionality of what Mapped Tasks
> provide - but a useful subset of those - for those who will be willing to
> trade some manageability aspects for performance.
>
> I hope it might help to clear it up further.
>
> J.
>
>
> [1] https://github.com/apache/airflow/pull/55068
> [2] https://github.com/apache/airflow/pull/53009
>
> On Mon, Jan 19, 2026 at 10:01 PM Vikram Koka <[email protected]> wrote:
>
>> David and Jarek,
>>
>> Your responses together have clarified in one dimension and confused me
>> more in another dimension.
>>
>> Clarified for me:
>> - This is not a replacement for DeferrableOperators in most cases.
>> - This is intended to be complementary to DeferrableOperators and based
>> on the use case you would use one vs. the other.
>>
>> Confused me further:
>> - What those specific use cases are:
>> From David's response, I understand that the use case best suited for
>> async Operators would be:
>> - For large payloads, these would overwhelm the Airflow metadatabase,
>> since DeferrableOperators do not have access to external XCom storage
>> systems.
>>
>> From Jarek's response, I understand that the use case best suited for
>> async Operators would be:
>> - For a number of small, async operations to be done possibly
>> concurrently, leveraging async I/O.
>>
>> I understand David's explanation a bit better, possibly because I can
>> relate to it from a concrete use case perspective.
>> The follow up questions I do have are about changes in task behavior with
>> respect to task retries, as well as how / where intermediate task failures
>> should be handled.
>> This also raises the interaction with other AIPs such as watermarks,
>> resumable operators, and so on.
>> But, setting those aside, just trying to think through how this should be
>> represented to the user i.e. DAG author. This strikes me as an "advanced
>> use case", but still learning.
>>
>> I don't understand Jarek's explanation. Can you please clarify with a
>> concrete use case?
>>
>> Best regards,
>> Vikram
>>
>>
>>
>>
>> On Sun, Jan 18, 2026 at 4:03 AM Jarek Potiuk <[email protected]> wrote:
>>
>>> Yeah. I would absolutely see this as complementary, not even trying to
>>> replace Deferrable Operators. I think we should make it clear in the
>>> documentation to not confuse people but the use cases and behaviours there
>>> are different. Really, it has one thing in common -
>>> both DeferrableOperators and Async support for Python Operators can easily
>>> leverage "async Hooks".
>>>
>>> For me, the name of DeferrableOperators explains it all (and there is a
>>> good reason we did not name it AsyncOperators). The distinction I see:
>>>
>>> * Deferrable Operators are good, when you have generally synchronous
>>> operation that should be Deferred for later (usually much later)
>>> * AsyncPythonOperator is good when you want to do a number of small,
>>> async operations possibly concurrently, but you do not want to defer those,
>>> you simply leverage capabilities of async I/O operations being able to run
>>> concurrently (note - not in parallel - but concurrently - using single
>>> worker CPU and async I/O non-GIL operations for multiplexing many
>>> operations).
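The "concurrently, not in parallel" point can be shown with a small plain-asyncio sketch (the names are illustrative only): two coroutines interleave on a single thread, and the one with the shorter wait finishes first even though it started second.

```python
import asyncio
import threading


async def worker(name: str, delay: float, log: list) -> None:
    log.append((name, "start", threading.get_ident()))
    # While this coroutine waits, the event loop runs the other one.
    await asyncio.sleep(delay)
    log.append((name, "end", threading.get_ident()))


async def main() -> list:
    log: list = []
    await asyncio.gather(worker("a", 0.02, log), worker("b", 0.01, log))
    return log


def run_demo() -> list:
    return asyncio.run(main())
```

Every log entry carries the same thread id, which is what "using single worker CPU" means here: the waits overlap, the Python code does not run in parallel.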
>>>
>>> Those are very, very distinct use cases, I would even say they do not
>>> have anything in common (except using async hooks).
>>>
>>> Also, what AsyncPythonOperator does was essentially possible before -
>>> with some boilerplate async loop utilisation code. So really what AIP-98
>>> does is adding a syntactic sugar and hiding the async loop management code,
>>> to make it a) easier b) native for airflow with `async def task()` c) more
>>> discoverable by our users (providing we will iterate on documentation and
>>> examples).
>>>
>>> In the future (what David mentioned) it opens up for better integration
>>> with async task monitoring - for example so that we could see progress of
>>> those concurrent tasks in Airflow UI other than looking at logs, and things
>>> like more reusable "standard" operators (like IterableOperator). I'd say
>>> it's a really foundational change to recognise the "single worker async
>>> multiplexing" as native-airflow feature - which will bring some nice things
>>> in the future.
>>>
>>> J.
>>>
>>>
>>>
>>> On Sat, Jan 17, 2026 at 9:29 AM Blain David <[email protected]>
>>> wrote:
>>>
>>>> Hello Vikram,
>>>>
>>>> Thank you for your reply.
>>>>
>>>> To be clear, no I'm not deprecating deferrable operators, it just
>>>> depends on what the operator does:
>>>>
>>>> 1. If the operator is deferrable because it needs to use an async hook
>>>> to retrieve huge payloads from a paginated API, then yes, I would prefer
>>>> the async operator over the deferred one, like for example the
>>>> MSGraphAsyncOperator or the HttpOperator.
>>>> The reason why is what was also explained in the devlist discussion
>>>> before, you're literally overloading the triggers (in memory) and the
>>>> Airflow metadatabase (triggers table) with huge payloads,
>>>> something triggers are not designed for (but you could) as triggers
>>>> don't have like an XCom backend which you can easily replace with another
>>>> one, so you're stuck with storing the payloads (trigger events) in the
>>>> Airflow database table.
>>>>
>>>> 2. If the operator is deferrable because it needs to do polling to
>>>> determine it succeeded or not, then yes, it makes sense, for example I just
>>>> started a PR (https://github.com/apache/airflow/pull/60651)
>>>> to fix an issue related to polling in the WinRMOperator which blocks
>>>> the worker for no reason as it just awaits an answer, it's similar to point
>>>> one but here the payload in the triggers is small and so is the execution
>>>> time.
>>>>
>>>> So yes, in some cases I would advocate to use the BaseAsyncOperator,
>>>> but in other cases not, it all depends on the responsibility of the
>>>> operator and what you're doing with it.
>>>> AIP-98 also opens the door to implement the IterableOperator in the
>>>> future which was also discussed mostly with Jarek in the devlist (
>>>> https://lists.apache.org/thread/ztnfsqolow4v1zsv4pkpnxc1fk0hbf2p ) as
>>>> he knows what the idea behind there is, but that's also still work in
>>>> progress.
>>>>
>>>> On the other hand deferrable operators also have a huge advantage as
>>>> they rely on triggers and that is that it allows us to implement the
>>>> "streaming" mechanism or the lazy dynamic task mapping expansion I
>>>> explained in AIP-88 (
>>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=334760511
>>>> ) and which I presented at the Summit in Seattle last year. Once PR #55068
>>>> (https://github.com/apache/airflow/pull/55068/) is merged, I will
>>>> continue working on that one as well.
>>>>
>>>> So yes, it all depends on the use case.
>>>>
>>>> I hope this makes it a bit more clear to you.
>>>>
>>>> David
>>>>
>>>> -----Original Message-----
>>>> From: Vikram Koka via dev <[email protected]>
>>>> Sent: 16 January 2026 19:58
>>>> To: [email protected]
>>>> Cc: Vikram Koka <[email protected]>
>>>> Subject: Re: [VOTE] AIP-98: Add async support for PythonOperator in
>>>> Airflow 3
>>>>
>>>> Hey David,
>>>>
>>>> Just read the AIP and posted questions on the Confluence page as well.
>>>>
>>>> I find this very interesting and am *overall supportive*, but I have
>>>> several questions about usage and user / developer guidance.
>>>> Specifically, what we should be recommending for which user
>>>> situations. I put the following question in the confluence page as well:
>>>>
>>>>
>>>>    - You are making a strong case for supporting async within the
>>>>    PythonOperator pattern over Deferrable Operators.
>>>>    - What I am missing is when should users be using Deferrable
>>>>    Operators instead?
>>>>    - Also, are you advocating deprecating Deferrable Operators
>>>>    entirely? I am not opposed to it, but definitely something I am
>>>>    curious about your viewpoint here.
>>>>
>>>>
>>>> Until then, I would vote
>>>> -0.5 (binding)
>>>>
>>>> I am absolutely willing and intend to change my vote, just want to get
>>>> questions answered first is all.
>>>>
>>>> These are questions which any user would have and I therefore believe
>>>> it is important to address as part of making and merging this change.
>>>>
>>>> Vikram
>>>>
>>>>
>>>>
>>>> On Fri, Jan 16, 2026 at 9:14 AM Dheeraj Turaga <[email protected]>
>>>> wrote:
>>>>
>>>> > +1 (binding)
>>>> >
>>>> > Sriraj Dheeraj Turaga
>>>> >
>>>> > On Fri, Jan 16, 2026 at 9:19 AM Shahar Epstein <[email protected]>
>>>> > wrote:
>>>> >
>>>> > > +1 (binding)
>>>> > >
>>>> > > On Fri, Jan 16, 2026 at 3:39 PM Blain David
>>>> > > <[email protected]>
>>>> > > wrote:
>>>> > >
>>>> > > > Hi Everyone,
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > > I would like to call a vote on this AIP:
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-98%3A+Add+async+support+for+PythonOperator+in+Airflow+3
>>>> > > >
>>>> > > > There was already a discussion in the devlist regarding this
>>>> > > > proposal:
>>>> > > > https://lists.apache.org/thread/ztnfsqolow4v1zsv4pkpnxc1fk0hbf2p
>>>> > > >
>>>> > > > This AIP is already implemented and merged as a PR:
>>>> > > > https://github.com/apache/airflow/pull/60268
>>>> > > >
>>>> > > > The vote will run for 5 days and last till next Thursday, the
>>>> > > > 22nd of Jan 2026 23:30 GMT.
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > > Everyone is encouraged to vote, although only PMC members and
>>>> > > > Committers' votes are considered binding.
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > > Please vote accordingly
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > > [ ] +1 Approve
>>>> > > >
>>>> > > > [ ] +0 no opinion
>>>> > > >
>>>> > > > [ ] -1 disapprove with the reason
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > > I hereby already vote my +1 binding :)
>>>> > > >
>>>> > > >
>>>> > > >
>>>> > > > Regards,
>>>> > > >
>>>> > > > David aka dabla
>>>> > > >
>>>> > >
>>>> >
>>>>
>>>>
>>>