Though it's not obvious in the name, Stateful ParDos can only be
applied to keyed PCollections, similar to GroupByKey. (You could,
however, assign every element to the same key and then apply a
Stateful DoFn, though in that case all elements would get processed on
the same worker.)

On Thu, Jul 25, 2019 at 6:06 PM rahul patwari
<rahulpatwari8...@gmail.com> wrote:
>
> Hi,
>
> https://beam.apache.org/blog/2017/02/13/stateful-processing.html  gives an 
> example of assigning an arbitrary-but-consistent index to each element on a 
> per key-and-window basis.
>
> If the Stateful ParDo is applied on a Non-Keyed PCollection, say, 
> PCollection<Row> with Fixed Windows, the state is maintained per window and 
> every element in the window will be assigned a consistent index?
> Does this mean every element belonging to the window will be processed in a 
> single DoFn Instance, which otherwise could have been done in multiple 
> parallel instances, limiting performance?
> Similarly, How does Stateful ParDo behave on Bounded Non-Keyed PCollection?
>
> Thanks,
> Rahul

Reply via email to