Hi folks, I have a question regarding *GroupByKey* and *SortValues* (via a secondary key).
The pipeline looks like:

    ...source + initialTransformations
    .apply(Window.into(FixedWindows.of(Duration.standardMinutes(10))))
    .apply("GroupByKey", GroupByKey.create())  // using my primary key (userId)
    .apply("SortValues", SortValues.create(BufferedExternalSorter.options()))  // my secondary key is the timestamp of the incoming event
    .apply("Enrichment", ParDo.of(new BusinessEnrichment()))  // it receives a KV<userId, Iterable<KV<timestamp, String>>>
    ...SinkToBigQuery

1. Is it guaranteed that my *BusinessEnrichment* will hold all the data, grouped and sorted, on a single *machine*, and that bundles of the same key will not be parallelized during autoscaling (I am running on top of Dataflow)? I expect parallel computation across different keys (users), but the bundles within a grouped key (userId) to be processed sequentially inside *BusinessEnrichment*. Is that correct?

I am asking because I observed mixed results, which could be due either to a bug in my code or to parallel bundle computation within the same key. My expectation is sequential processing after *grouping + sorting*. For example, for a userId with 1000 sorted events, I expect processing in the following sequence:

    userId(1) with bundleA (0..250) -> userId(1) with bundleB (251..500) -> userId(1) with bundleC (501..750) -> userId(1) with bundleD (751..1000)

Please advise if my assumption is wrong.

Thanks and regards
-- JC
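
P.S. To make the expectation concrete, here is a minimal plain-Java sketch of the semantics I am assuming. It does not use Beam at all: the class ExpectedOrderSketch, the Event type, and the methods groupAndSort and enrich are hypothetical names made up for illustration, and the in-memory grouping only stands in for GroupByKey + SortValues. It says nothing about how Dataflow actually schedules bundles; it only shows the per-key, timestamp-ordered traversal I expect to see inside the enrichment step.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Beam-free illustration of the expected ordering semantics.
class ExpectedOrderSketch {

    // Hypothetical event: primary key (userId), secondary key (timestamp), payload.
    static final class Event {
        final String userId;
        final long timestamp;
        final String payload;

        Event(String userId, long timestamp, String payload) {
            this.userId = userId;
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    // Stand-in for GroupByKey + SortValues: group by userId,
    // then sort each group by timestamp (ascending).
    static Map<String, List<Event>> groupAndSort(List<Event> events) {
        Map<String, List<Event>> grouped = new TreeMap<>();
        for (Event e : events) {
            grouped.computeIfAbsent(e.userId, k -> new ArrayList<>()).add(e);
        }
        for (List<Event> perUser : grouped.values()) {
            perUser.sort(Comparator.comparingLong((Event e) -> e.timestamp));
        }
        return grouped;
    }

    // Stand-in for the enrichment step: visits one user's events
    // strictly in the order of the sorted iterable it is given.
    static List<String> enrich(String userId, Iterable<Event> sortedEvents) {
        List<String> out = new ArrayList<>();
        for (Event e : sortedEvents) {
            out.add(userId + "@" + e.timestamp + ":" + e.payload);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
            new Event("user1", 30L, "c"),
            new Event("user2", 5L, "x"),
            new Event("user1", 10L, "a"),
            new Event("user1", 20L, "b"));
        // Different users could be handled independently; within one user,
        // the events are visited in timestamp order.
        for (Map.Entry<String, List<Event>> entry : groupAndSort(events).entrySet()) {
            System.out.println(enrich(entry.getKey(), entry.getValue()));
        }
    }
}
```

In this sketch each user's events are handled in one sequential loop, which is the behavior I am hoping *BusinessEnrichment* gets for each userId.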