This is an automated email from the ASF dual-hosted git repository.
dabla pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git
The following commit(s) were added to refs/heads/main by this push:
new 2af60775f3e Added documentation explaining difference deferred vs
async operators (#63500)
2af60775f3e is described below
commit 2af60775f3eefd96d022b69db519cf4e7a35f970
Author: David Blain <[email protected]>
AuthorDate: Tue Mar 17 20:16:59 2026 +0100
Added documentation explaining difference deferred vs async operators
(#63500)
* refactor: Added page explaining when to use deferred and when to use
async operators with examples and explaining what the difference is
* refactor: Already updated some parts regarding to remarks
* refactor: Improved documentation with recommendations from @kaxil
* refactor: Added section to explain when not to use async or deferrable as
suggested by TP
* refactor: Added reference from deferring as Jens suggested
* refactor: Fixed doc reference
* refactor: Reformatted deferred_http_operator_dag example
* refactor: Added MS Graph async example
* refactor: Changed order of MS Graph Async Example
* refactor: Updated MS Graph example
* refactor: Fixed indentation of bullet lists
* refactor: Reformatted Async Multiplexing Example
* refactor: Fixed reference to deferred vs async in deferring document
* refactor: Put full url instead of doc reference to tsk-sdk
---------
Co-authored-by: David Blain <[email protected]>
---
.../docs/authoring-and-scheduling/deferring.rst | 11 +
task-sdk/docs/deferred-vs-async-operators.rst | 287 +++++++++++++++++++++
task-sdk/docs/index.rst | 8 +
3 files changed, 306 insertions(+)
diff --git a/airflow-core/docs/authoring-and-scheduling/deferring.rst
b/airflow-core/docs/authoring-and-scheduling/deferring.rst
index 9344f07aa60..7d2bc648e22 100644
--- a/airflow-core/docs/authoring-and-scheduling/deferring.rst
+++ b/airflow-core/docs/authoring-and-scheduling/deferring.rst
@@ -24,6 +24,17 @@ This is where *Deferrable Operators* can be used. When it
has nothing to do but
*Triggers* are small, asynchronous pieces of Python code designed to run in a
single Python process. Because they are asynchronous, they can all co-exist
efficiently in the *triggerer* Airflow component.
+.. note::
+
+ Airflow 3.2 also supports Python-native async tasks that can perform
+ concurrent I/O operations within a single worker slot. While deferred
+ operators release the worker slot while waiting for an external event,
+ async tasks keep the task process running and use a shared event loop
+ to multiplex operations.
+
+ For guidance on when to use deferred operators versus async tasks,
+ see `Deferred vs Async Operators
<https://airflow.apache.org/docs/task-sdk/stable/deferred-vs-async-operators.html>`__.
+
An overview of how this process works:
* A task instance (running operator) reaches a point where it has to wait for
other operations or conditions, and defers itself with a trigger tied to an
event to resume it. This frees up the worker to run something else.
diff --git a/task-sdk/docs/deferred-vs-async-operators.rst
b/task-sdk/docs/deferred-vs-async-operators.rst
new file mode 100644
index 00000000000..4d77deea81d
--- /dev/null
+++ b/task-sdk/docs/deferred-vs-async-operators.rst
@@ -0,0 +1,287 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ .. http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+
+.. _sdk-deferred-vs-async-operators:
+
+Deferred vs Async Operators
+===========================
+
+.. versionadded:: 3.2.0
+
+Airflow 3.2 introduces Python-native async support for tasks, allowing
concurrent I/O within a single worker slot.
+This page explains how async operators differ from deferred operators and when
to use each.
+
+Deferred Operators
+------------------
+
+A deferred operator is an operator that can pause its execution until an
external trigger event occurs,
+without holding a worker slot. For more details see
:doc:`airflow:authoring-and-scheduling/deferring`.
+Examples include the HttpOperator in deferrable mode, sensors or operators
integrated with triggers.
+
+Key characteristics:
+
+- Execution is paused while waiting for external events or resources.
+- Worker slots are freed during the wait, improving resource efficiency.
+- Ideal for scenarios where a single external event or a small number of
events dictate task completion.
+- Typically simpler to use, as the deferred operator handles all async logic.
+
+Async Python Operators
+----------------------
+
+Python native async operators allow you to write tasks that leverage Python's
asyncio:
+
+- Tasks can perform many concurrent I/O operations efficiently within a single
worker slot sharing the same event loop.
+- Task code uses async/await syntax with async-compatible hooks, such as
HttpAsyncHook or the SFTPHookAsync.
+
+Ideal when you need to perform high-throughput operations (e.g., many HTTP
requests, database calls, or API interactions) within a single task instance,
+or when there is no deferred operator available but there is an async hook
available.
+
+When to Use Deferred Operators
+------------------------------
+
+Prefer a deferred operator when:
+
+- There is an existing deferrable operator that covers your use case (e.g.,
HttpOperator deferrable mode).
+- The task waits for a single or limited external events.
+- You want to free worker resources while waiting for triggers.
+- You don't need to loop over the same operator multiple times (e.g.
multiplexing).
+
+.. code-block:: python
+
+ from airflow.sdk import dag
+ from airflow.providers.http.operators.http import HttpOperator
+
+
+ @dag(schedule=None)
+ def deferred_http_operator_dag():
+
+ get_op_task = HttpOperator(
+ http_conn_id="http_conn_id",
+ task_id="get_op",
+ method="GET",
+ endpoint="get",
+ data={"param1": "value1", "param2": "value2"},
+ deferrable=True,
+ )
+
+
+ deferred_http_operator_dag()
+
+When to Use Async Python Operators
+----------------------------------
+
+Use async Python operators when:
+
+- The task needs to perform many concurrent requests or operations within a
single task.
+- You want to take advantage of the shared event loop to improve throughput.
+- There is simply no deferred operator available.
+- The logic depends on custom Python code (e.g. callables or lambdas) that
cannot easily be implemented in a trigger, since triggers must be serializable
and do not have access to DAG code at runtime.
+
+.. note::
+
+ The :class:`~airflow.providers.http.hooks.http.HttpAsyncHook` depends on
``aiohttp``,
+ which is installed automatically with the `apache-airflow-providers-http
<https://airflow.apache.org/docs/apache-airflow-providers-http>`_ provider.
+
+Simple Async Example
+~~~~~~~~~~~~~~~~~~~~
+
+The following example demonstrates the basic syntax for writing an async task
+using ``async``/``await``.
+
+.. note::
+
+ For a single request like this, a deferrable operator (such as
+ :class:`~airflow.providers.http.operators.http.HttpOperator` with
+ ``deferrable=True``) is usually preferred. Deferred operators release the
+ worker slot while waiting for the external request to complete.
+
+ This example is provided mainly to illustrate the structure of an async
+ task. Async operators become most useful when performing many concurrent
+ operations within the same task (see the multiplexing example below), or
+ when implementing logic such as pagination where multiple requests need to
+ be executed sequentially or concurrently within a single task instance.
+
+.. code-block:: python
+
+ from aiohttp import ClientSession
+ from airflow.providers.http.hooks.http import HttpAsyncHook
+ from airflow.sdk import dag, task
+
+
+ @dag(schedule=None)
+ def async_http_operator_dag():
+
+ @task
+ async def get_op():
+ hook = HttpAsyncHook(http_conn_id="http_conn_id", method="GET")
+
+ async with ClientSession() as session:
+ response = await hook.run(
+ session=session,
+ endpoint="get",
+ data={"param1": "value1", "param2": "value2"},
+ )
+ return await response.json()
+
+ get_op()
+
+
+ async_http_operator_dag()
+
+Async Multiplexing Example
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Async operators become particularly useful when making many concurrent requests
+within a single task. The following example executes multiple HTTP requests
+concurrently using ``asyncio.gather`` while limiting concurrency with a
semaphore.
+
+.. code-block:: python
+
+ import asyncio
+ from aiohttp import ClientSession
+ from airflow.providers.http.hooks.http import HttpAsyncHook
+ from airflow.sdk import dag, task
+
+ parameters = [
+ {"param1": "value1", "param2": "value2"},
+ {"param1": "value3", "param2": "value4"},
+ {"param1": "value5", "param2": "value6"},
+ {"param1": "value7", "param2": "value8"},
+ ]
+
+
+ @dag(schedule=None)
+ def async_http_multiplex_dag():
+
+ @task
+ async def get_op(parameters: list[dict[str, str]]):
+ hook = HttpAsyncHook(http_conn_id="http_conn_id", method="GET")
+
+ # Limit concurrent requests to avoid overwhelming downstream
services
+ semaphore = asyncio.Semaphore(5)
+
+ async def fetch(session, params):
+ async with semaphore:
+ response = await hook.run(
+ session=session,
+ endpoint="get",
+ data=params,
+ )
+ return await response.json()
+
+ async with ClientSession() as session:
+ tasks = [fetch(session, params) for params in parameters]
+
+ # Run requests concurrently in the shared event loop
+ return await asyncio.gather(*tasks)
+
+ get_op(parameters)
+
+
+ async_http_multiplex_dag()
+
+.. note::
+
+ The upcoming *Dynamic Task Iteration* feature will simplify patterns like
this.
+ Instead of manually managing concurrency with constructs such as
+ ``asyncio.gather`` and ``asyncio.Semaphore``, authors will be able to
iterate
+ over asynchronous results directly in downstream tasks while still
benefiting
+ from a shared event loop. This will make high-throughput patterns such as
+ pagination or request multiplexing easier to implement.
+
+MS Graph Async Example
+~~~~~~~~~~~~~~~~~~~~~~
+
+Another example using the
:class:`~airflow.providers.microsoft.azure.hooks.msgraph.KiotaRequestAdapterHook`,
+which is async-only, to fetch all users from the `Microsoft Graph API
<https://learn.microsoft.com/en-us/graph/azuread-users-concept-overview/>`__.
+
+In this example, multiple paginated requests are expected in order to retrieve
all users.
+Using an async Python task is appropriate here because the
:class:`~airflow.providers.microsoft.azure.hooks.msgraph.KiotaRequestAdapterHook`
+handles pagination internally and performs the requests asynchronously.
+This allows multiple paginated requests to be performed efficiently within a
single task instance and worker slot.
+
+.. code-block:: python
+
+ from airflow.providers.microsoft.azure.hooks.msgraph import
KiotaRequestAdapterHook
+ from airflow.sdk import dag, task
+
+
+ @dag(schedule=None)
+ def async_msgraph_dag():
+
+ @task
+ async def get_users():
+ hook = KiotaRequestAdapterHook.get_hook(conn_id="msgraph_default")
+
+ return await hook.paginated_run(url="users")
+
+ get_users()
+
+
+ async_msgraph_dag()
+
+When **not** to use Deferred vs Async Operators
+-----------------------------------------------
+
+While the previous sections explain when to prefer deferred or async operators,
+it is equally important to understand scenarios where one may **not** be
appropriate:
+
+- **Avoid async operators for long waits**
+ Async operators keep the task process alive while waiting. If your task
involves long-running
+ operations, such as slow APIs or external triggers, this can waste worker
resources.
+ In such cases, prefer a deferred operator, which releases the worker slot
while waiting.
+
+- **Avoid deferred operators for many short or repeated waits**
+ Deferred operators pause and resume tasks each time they defer. If deferrals
occur frequently
+ or last only a short time, the overhead of stopping and restarting the task
can reduce efficiency.
+ For high-frequency short waits, an async operator may be more suitable.
+
+- **Operator availability should not dictate choice**
+ Having a built-in deferrable operator can simplify implementation, but the
decision
+ should be driven by the use case, not just what Airflow provides.
+ If your workflow is better suited for deferring but no operator exists yet,
+ consider implementing a custom deferred operator rather than defaulting to
async.
+
+Comparison with Dynamic Task Mapping
+------------------------------------
+
+Async operators and Dynamic Task Mapping solve different problems and have
different trade-offs.
+
+.. list-table::
+ :header-rows: 1
+
+ * - Aspect
+ - Async ``@task``
+ - Dynamic Task Mapping (deferrable)
+ * - Worker slots
+ - 1 worker slot (shared event loop)
+ - N worker slots (one per mapped task instance)
+ * - Concurrency model
+ - Async I/O inside a single task
+ - Parallel task instances scheduled by the scheduler
+ * - Retry behavior
+ - Whole task retries
+ - Individual mapped tasks can retry independently
+ * - UI visibility
+ - Appears as a single task
+ - Each mapped task is visible separately
+ * - Scheduler overhead
+ - Minimal
+ - Scheduler must manage N task instances
+
+For more details about Dynamic Task Mapping, see the
+:ref:`dynamic task mapping <sdk-dynamic-task-mapping>` page.
diff --git a/task-sdk/docs/index.rst b/task-sdk/docs/index.rst
index f3258ea8243..538fdd49d44 100644
--- a/task-sdk/docs/index.rst
+++ b/task-sdk/docs/index.rst
@@ -128,6 +128,13 @@ Use instead:
# Airflow 3.x
from airflow.sdk import DAG, task
+Choosing Between Deferred and Async Tasks
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Airflow 3.2 introduces Python-native async tasks alongside deferrable
operators.
+Both approaches support non-blocking I/O, but they serve different purposes.
+For guidance on when to use each approach, see
:doc:`deferred-vs-async-operators`.
+
4. Example Dag References
-------------------------
@@ -160,5 +167,6 @@ For the full public API reference, see the :doc:`api` page.
examples
dynamic-task-mapping
+ deferred-vs-async-operators
api
concepts