Hi David et al,

I was very convinced about Dynamic Task Sharding during the call because:

 * Dynamic Task Mapping - we all know
 * Dynamic Task Iteration - the new async kid in town? Takes everything into
   a single execution (with the risk of all-or-nothing failure...)

Given how David described putting the iterations into partitions/slices/chunks, I am still up for it.

Batching would also be okay, but it feels like a closer match for what "Iteration" already does: looping asynchronously over a list. The naming discussion, though, was more about the case where you have 17,000 items in the list and would rather track 170 "batches/partitions" as supervised task processes, each running 100 list items. Since the whole list of 17,000 items is already the "batch", calling the split/partitioning "batch" sounds a bit unnatural, because the earlier "iteration" was also a kind of batch.

Or do I misinterpret?

Dynamic Task Mapping:

items = make_me_a_work_list()

serious_work = PythonOperator.partial(
    task_id="serious_work",
    ...
).expand(op_args=items)


Dynamic Task Iteration:

async_work = PythonOperator.partial(
    task_id="async_work",
    ...
).iterate(op_args=items)


Dynamic Task Iteration with "partitions/slices"?

large_async_work_in_pieces = PythonOperator.partial(
    task_id="large_async_work_in_pieces",
    ...
).iterate(op_args=items, shard=170)

large_async_work_in_pieces = PythonOperator.partial(
    task_id="large_async_work_in_pieces",
    ...
).iterate(op_args=items, slice=170)

large_async_work_in_pieces = PythonOperator.partial(
    task_id="large_async_work_in_pieces",
    ...
).iterate(op_args=items, batch=100)


(Okay, reading the code: "partition", "shard" or "slice" would describe how many pieces to cut the elephant into, and "batch" would be a convincing way to say how many tasks to put together sharing a loop... so, thinking out loud, "batch" would also be OK if we want to describe the "package size of the elephant slice".)
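
To make the two readings concrete, here is a plain-Python sketch with no Airflow involved. The helper names split_by_count and split_by_size are made up for illustration and are not part of any proposed API; the numbers are the 17,000 / 170 / 100 from the example above.

```python
def split_by_size(items, batch_size):
    """'batch' reading: fix the items per piece; the number of pieces follows."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def split_by_count(items, pieces):
    """'shard'/'slice' reading: fix the number of pieces; the size follows.

    Pieces differ in length by at most one item. If there are fewer items
    than pieces, the trailing pieces are empty.
    """
    base, extra = divmod(len(items), pieces)
    result, start = [], 0
    for index in range(pieces):
        end = start + base + (1 if index < extra else 0)
        result.append(items[start:end])
        start = end
    return result


items = list(range(17_000))              # stand-in for the work list
by_count = split_by_count(items, 170)    # "cut the elephant into 170 pieces"
by_size = split_by_size(items, 100)      # "put 100 items together in one loop"

# For these numbers both readings produce the same 170 pieces of 100 items.
print(len(by_count), len(by_size), by_count == by_size)
```

The two only coincide when the list length divides evenly; otherwise the keyword name decides which of the two numbers the user controls, which is why the term matters.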

@David ... if I misunderstood, can you share the PR link or the demo code so I can re-read what you presented?

Jens


On 04.06.26 19:54, Tzu-ping Chung via dev wrote:
I think dynamic task batch(ing) would be reasonable.

Python’s itertools has batched() that kind of is the same concept.
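
As a rough illustration of that concept (plain Python, not the proposed Airflow API): itertools.batched() is available from Python 3.12, so the sketch below includes a small fallback for older versions. The 17,000-item range is only a stand-in for a real work list.

```python
from itertools import islice

try:
    from itertools import batched  # available in Python 3.12+
except ImportError:
    def batched(iterable, n):
        """Fallback for older Pythons: yield tuples of up to n items."""
        if n < 1:
            raise ValueError("n must be at least one")
        iterator = iter(iterable)
        while batch := tuple(islice(iterator, n)):
            yield batch

items = range(17_000)                 # stand-in for the work list
batches = list(batched(items, 100))   # the argument is the batch size

print(len(batches))                   # 170
print(batches[0][:3])                 # (0, 1, 2)
```

Note that batched() takes the size of each batch, not the number of batches, which matches the "batch=100" reading rather than "shard=170".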

TP


On 5 Jun 2026, at 00:56, Blain David<[email protected]> wrote:

Hi all,

We need a better name than partition for Dynamic Task Partitioning.

The main issue is that partition already strongly suggests asset/data 
partitions in Airflow,
so using the same word here creates avoidable confusion for users and 
contributors.

We’d like a term that is clear, intuitive, and doesn’t overlap with existing 
Airflow concepts.

Some alternatives raised so far during the devcall:


  * batch (e.g. Dynamic Task Batching)
  * chunk (e.g. Dynamic Task Chunking)
  * slice (a bit confusing, but I chose to still mention it anyway)
  * shard
  * segment


My current lean is towards chunk and batch. Both feel familiar, are readable in
code and docs, and avoid the existing partition/data-partition association.

I’d love feedback on:


  * which term feels most natural
  * which term is least ambiguous
  * or whether there’s a better option we haven’t considered?


One note: map was mentioned as well, but that seems too close to existing 
task.map() terminology.

Please share thoughts, especially if you have concerns about any of the options 
above or a stronger suggestion for the long-term name.

Naming is indeed hard 🙂

Kind regards,
David
