Per-key ordered delivery makes a ton of sense. I'd guess CDC has the same needs as retractions: the changelog has to be applied in order as it arrives. And since the ordering is per-key, you still get parallelism across keys.
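To make that concrete, here is a rough sketch in plain Python (not Beam code, and not what any proposal actually specifies). The `apply_changelog` helper and the `(key, op, value)` event shape are made up for illustration. It assumes events for a given key arrive in order, replays each key's changes sequentially, and lets different keys run in parallel.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def apply_changelog(events):
    """Replay (key, op, value) change events into a final state per key.

    Hypothetical helper: assumes events for the same key arrive in order.
    Nothing is assumed about ordering across different keys.
    """
    # Grouping preserves the arrival order within each key.
    by_key = defaultdict(list)
    for key, op, value in events:
        by_key[key].append((op, value))

    def replay(item):
        key, changes = item
        state = None
        for op, value in changes:  # sequential within one key
            state = None if op == "delete" else value
        return key, state

    # Different keys are independent, so they can be replayed in parallel.
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(replay, by_key.items()))


# Two interleavings that differ across keys but agree within each key.
interleaved = [
    ("a", "upsert", 10),
    ("b", "upsert", 7),
    ("a", "upsert", 11),
    ("b", "delete", None),
]
grouped = [
    ("b", "upsert", 7),
    ("b", "delete", None),
    ("a", "upsert", 10),
    ("a", "upsert", 11),
]

result_interleaved = apply_changelog(interleaved)
result_grouped = apply_changelog(grouped)
```

Both inputs end in the same state, which is the point: only the order within a key matters, so nothing forces a single global sequence.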
Global ordering is quite different. I know that SQL and DataFrames have global sorting operations. The question has always been how "embarrassingly parallel" processing interacts with sorting/ordering. I imagine some other systems have this feature, so we can look at how it is used?

Kenn

On Mon, May 10, 2021 at 4:39 PM Sam Rohde <sro...@google.com> wrote:

> Awesome, thanks Pablo!
>
> On Mon, May 10, 2021 at 4:05 PM Pablo Estrada <pabl...@google.com> wrote:
>
>> CDC would also benefit. I am working on a proposal for this that is
>> concerned with streaming pipelines, and per-key ordered delivery. I will
>> share with you as soon as I have a draft.
>> Best
>> -P.
>>
>> On Mon, May 10, 2021 at 2:56 PM Reuven Lax <re...@google.com> wrote:
>>
>>> There has been talk, but nothing concrete.
>>>
>>> On Mon, May 10, 2021 at 1:42 PM Sam Rohde <sro...@google.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I was wondering if there had been any plans for creating ordered
>>>> PCollections in the Beam model? Or if there might be plans for them in the
>>>> future?
>>>>
>>>> I know that Beam SQL and Beam DataFrames would directly benefit from an
>>>> ordered PCollection.
>>>>
>>>> Regards,
>>>> Sam
>>>>
>>>