Not commenting on the Dynamic part, I would call it something like (Lazy) Iterable Task or Iterator Task. An iteration is one pass of the loop. Here we’re describing the entire loop (represented by the task).
TP

> On Jun 5, 2026, at 17:48, David Blain <[email protected]> wrote:
>
> Maybe we should just call it Task Iteration without mentioning Dynamic?
> Because in the end that is what it does.
>
> On 2026/06/04 20:38:36 Ash Berlin-Taylor wrote:
>> Sharding is usually (or at least every time I’ve seen it used) when work is
>> distributed based on some hash function, batching is a simpler thing that is
>> closer to the current implementation - take the first n items, run them, get
>> the next n items, run them etc. shard I would expect to be somewhere halfway
>> between batched and full task mapping, i.e. send the items to 3
>> workers/tasks in parallel.
>>
>> `shard(3)` to me could read as “shard into 3 tasks at a time” - or at least
>> I’d have to check the first time I come across it. This could be fixable by
>> not allowing positional args, only named ones.
>>
>> I’m also now questioning the “Dynamic” part of it. What about this is
>> dynamic? In Dynamic Task Mapping, things do change at runtime (i.e. the
>> scheduler dynamically creates more TIs as execution happens), but that isn’t
>> what is happening here, is it? From the AIP alone (I haven’t had time to read
>> the PR) - which, now I see, is a secondary issue, David: the AIP is defined
>> in terms of the PR. That is not the purpose of an AIP - it should be a
>> (mostly) stand-alone proposal of a new feature.
>>
>> Or to ask it another way: Why does it need its own name? Could it be as
>> simple as adding a batching feature on to Mapped Tasks? “This is a set of
>> dynamic mapped tasks” “this is a set of batched mapped tasks” etc? (If
>> everything is “Dynamic this” or “Dynamic that” then it starts to lose
>> meaning.)
>>
>>
>> Sorry, this turned out as more of a stream of consciousness than I intended.
>>
>> -ash
>>
>>> On 4 Jun 2026, at 21:05, Natanel <[email protected]> wrote:
>>>
>>> Hello,
>>>
>>> I think that it would be better to go with shard or chunk, personally I
>>> would prefer shard as it seems to work like sharding (at least according to
>>> what David showed during the devcall) as data was not split to slices,
>>> rather was sharded between the two dynamically created tasks.
>>> Batch just seems a little confusing as if I see a method with task.batch(3)
>>> I would not know if it meant splitting to batches of 3 or to 3 batches,
>>> same as with chunk though less confusing, if I see some code with
>>> task.shard(3) I don't think anyone would confuse it with shards of size 3,
>>> at least that is my opinion, let me know what you think
>>>
>>> Thanks,
>>> Natanel.
>>>
>>>
>>> On Thu, Jun 4, 2026, 21:53 Sameer Mesiah <[email protected]> wrote:
>>>
>>>> Hi David,
>>>>
>>>> I would go with either Batch or Chunk.
>>>>
>>>> But I must say I have no strong opinions with regards to the naming of this
>>>> feature.
>>>>
>>>> Thanks,
>>>> Sameer Mesiah.
>>>>
>>>> On Thu, 4 Jun 2026 at 17:56, Blain David <[email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> We need a better name than partition for Dynamic Task Partitioning.
>>>>>
>>>>> The main issue is that partition already strongly suggests asset/data
>>>>> partitions in Airflow,
>>>>> so using the same word here creates avoidable confusion for users and
>>>>> contributors.
>>>>>
>>>>> We’d like a term that is clear, intuitive, and doesn’t overlap with
>>>>> existing Airflow concepts.
>>>>>
>>>>> Some alternatives raised so far during the devcall:
>>>>>
>>>>> * batch (e.g. Dynamic Task Batching)
>>>>> * chunk (e.g. Dynamic Task Chunking)
>>>>> * slice (bit confusing but chose to still mention it anyway)
>>>>> * shard
>>>>> * segment
>>>>>
>>>>> My current lean is towards chunk and batch. They feel familiar, readable in
>>>>> both code and docs, and avoid the existing partition/data-partition
>>>>> association.
>>>>>
>>>>> I’d love feedback on:
>>>>>
>>>>> * which term feels most natural
>>>>> * which term is least ambiguous
>>>>> * or whether there’s a better option we haven’t considered?
>>>>>
>>>>> One note: map was mentioned as well, but that seems too close to existing
>>>>> task.map() terminology.
>>>>>
>>>>> Please share thoughts, especially if you have concerns about any of the
>>>>> options above or a stronger suggestion for the long-term name.
>>>>>
>>>>> Naming is indeed hard 🙂
>>>>>
>>>>> Kind regards,
>>>>> David
>>>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
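
For readers trying to pin down the terms debated in this thread, here is a rough Python sketch of the difference between batching (take the first n items, then the next n) and sharding (distribute items by a hash function). The helper names `batches_of` and `shards`, and the choice of CRC32 as the hash, are assumptions made for illustration only; they are not part of Airflow, the AIP, or the PR.

```python
import zlib
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches_of(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Lazily yield consecutive batches of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    # Pull `size` items at a time until the source is exhausted.
    while batch := list(islice(iterator, size)):
        yield batch


def shards(items: Iterable[str], count: int) -> List[List[str]]:
    """Distribute items over `count` shards by a stable hash (hypothetical helper)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    result: List[List[str]] = [[] for _ in range(count)]
    for item in items:
        # CRC32 is used only because it is deterministic across runs;
        # any stable hash function would serve the same purpose.
        result[zlib.crc32(item.encode("utf-8")) % count].append(item)
    return result


items = ["a", "b", "c", "d", "e", "f", "g"]
print(list(batches_of(items, size=3)))  # batches of size 3, in order
print(shards(items, count=3))           # exactly 3 shards, uneven sizes possible
```

Using keyword arguments (`size=` versus `count=`) is one way to remove the "batches of 3 or 3 batches" ambiguity raised above. Note also that `batches_of` is a lazy iterable in the sense of the top reply: each yielded batch is one iteration, while the generator as a whole represents the entire loop.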
