I've updated the AIP-104 proposal based on the feedback received.

Following Ash's comment, I agree that the term Dynamic Task Iteration was somewhat misleading, since no dynamic task creation is involved. The feature has therefore been renamed to Iterable Tasks (IT), as proposed by TP, which I like a lot. The alternative would be Task Iteration (TI); I'm leaving that open for now, so please let me know which one you prefer.

In addition, Dynamic Task Partitioning has been renamed to Dynamic Task Batching (DTB), which I believe better reflects the underlying behavior of grouping items into batches for processing.

I've also added concrete examples for both Iterable Tasks and Dynamic Task Batching using the Pokémon REST API. Since the API is publicly accessible, anyone can run the examples, which should make them useful for demonstrations, experimentation, and discussion.

Feedback is very welcome: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=421954527
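For anyone who wants to see the batching idea in isolation, here is a minimal plain-Python sketch. It is not the AIP-104 implementation and uses no Airflow API; the helper `batched` and the names `items` and `batch_size` are purely illustrative assumptions. It mirrors what `itertools.batched()` does in Python 3.12+ and reuses the 17 000-item figure from the thread quoted below.

```python
from itertools import islice


def batched(iterable, n):
    """Yield successive tuples of at most n items.

    Illustrative stand-in for itertools.batched (Python 3.12+),
    written out so it also runs on older interpreters.
    """
    if n < 1:
        raise ValueError("n must be at least one")
    iterator = iter(iterable)
    # islice pulls the next n items; an empty tuple ends the loop.
    while batch := tuple(islice(iterator, n)):
        yield batch


# Hypothetical work list: 17 000 items, grouped 100 per batch.
items = range(17_000)
batch_size = 100

batches = list(batched(items, batch_size))
print(len(batches))     # number of batches to track
print(len(batches[0]))  # items handled inside one batch
```

With these numbers the 17 000 items end up in 170 batches of 100, so only 170 units would need to be tracked rather than 17 000.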
The PR has been updated as well: https://github.com/apache/airflow/pull/62922

________________________________
From: Jarek Potiuk <[email protected]>
Sent: Thursday, June 04, 2026 20:48
To: [email protected] <[email protected]>
Subject: Re: [DISCUSS] Choose a better name for Dynamic Task Partitioning and thus the partition primitive

Batch is a good name for it.

On Thu, Jun 4, 2026 at 8:35 PM Jens Scheffler <[email protected]> wrote:

> Hi David et al,
>
> I was very convinced about Dynamic Task Sharding during the call because:
>
> * Dynamic Task Mapping - we all know
> * Dynamic Task Iteration - the new async kid in town? Taking all into
>   a single execution (with risk of fail all or nothing...)
>
> As David was describing the way to put the iterations into
> (partitions/slices/chunks), I am still up for it.
>
> Batching would also be okay, but it feels like a better match for what
> "Iteration" does: looping asynchronously over a list. The term
> discussion was more that if you have 17 000 items in the list, you
> probably rather want to track 170 "batches/partitions" as supervised
> task processes, each running 100 list items. As the "batch" is 17 000
> items, naming the "split/partitioning" a "batch" sounds a bit
> unnatural, because the previous "iteration" was also a bit of a batch.
>
> Or do I misinterpret?
>
> Dynamic Task Mapping:
>
> items = make_me_a_work_list()
>
> serious_work = PythonOperator.partial(
>     task_id="serious_work",
>     ...
> ).expand(op_args=items)
>
>
> Dynamic Task Iteration:
>
> async_work = PythonOperator.partial(
>     task_id="async_work",
>     ...
> ).iterate(op_args=items)
>
>
> Dynamic Task Iteration with "partitions/slices"?
>
> large_async_work_in_pieces = PythonOperator.partial(
>     task_id="large_async_work_in_pieces",
>     ...
> ).iterate(op_args=items, shard=170)
>
> large_async_work_in_pieces = PythonOperator.partial(
>     task_id="large_async_work_in_pieces",
>     ...
> ).iterate(op_args=items, slice=170)
>
> large_async_work_in_pieces = PythonOperator.partial(
>     task_id="large_async_work_in_pieces",
>     ...
> ).iterate(op_args=items, batch=100)
>
>
> (Okay, reading the code, "partition", "shard" or "slice" would describe
> how many pieces to cut the elephant into, and "batch" would be convincing
> to tell how many tasks to put together sharing a loop... so,
> thinking out loud, "batch" would also be OK if we want to describe the
> "package side of the elephant slice".)
>
> @David ... if I misunderstood, can you share the PR link or the demo
> code so I can re-read what you presented?
>
> Jens
>
>
> On 04.06.26 19:54, Tzu-ping Chung via dev wrote:
> > I think dynamic task batch(ing) would be reasonable.
> >
> > Python’s itertools has batched(), which is more or less the same concept.
> >
> > TP
> >
> >
> >> On 5 Jun 2026, at 00:56, Blain David <[email protected]> wrote:
> >>
> >> Hi all,
> >>
> >> We need a better name than partition for Dynamic Task Partitioning.
> >>
> >> The main issue is that partition already strongly suggests asset/data
> >> partitions in Airflow, so using the same word here creates avoidable
> >> confusion for users and contributors.
> >>
> >> We’d like a term that is clear, intuitive, and doesn’t overlap with
> >> existing Airflow concepts.
> >>
> >> Some alternatives raised so far during the devcall:
> >>
> >> * batch (e.g. Dynamic Task Batching)
> >> * chunk (e.g. Dynamic Task Chunking)
> >> * slice (a bit confusing, but I chose to mention it anyway)
> >> * shard
> >> * segment
> >>
> >> My current lean is towards chunk and batch. Both feel familiar and
> >> readable in code and docs, and both avoid the existing
> >> partition/data-partition association.
> >>
> >> I’d love feedback on:
> >>
> >> * which term feels most natural
> >> * which term is least ambiguous
> >> * or whether there’s a better option we haven’t considered
> >>
> >> One note: map was mentioned as well, but that seems too close to the
> >> existing task.map() terminology.
> >>
> >> Please share thoughts, especially if you have concerns about any of the
> >> options above or a stronger suggestion for the long-term name.
> >>
> >> Naming is indeed hard 🙂
> >>
> >> Kind regards,
> >> David
