*~Vincent*
On Tue, Dec 8, 2020 at 3:13 PM Boyuan Zhang <boyu...@google.com> wrote: > Please note that each record output from ReadFromKafkaDoFn is in a > GlobalWindow. The workflow is: > ReadFromKafkaDoFn -> Reshuffle -> Window.into(FixedWindows) -> > Max.longsPerKey -> CommitDoFn > | > ---> downstream consumers > > but won't there still be 5 commits that happen as fast as possible for >> each of the windows that were constructed from the initial fetch? > > I'm not sure what you mean here. Would you like to elaborate more on your > questions? > Sure, I'll try to explain, it's very possible I just am misunderstanding Windowing here. Assumption 1: Windowing works on the output timestamp. Assumption 2: Max.longsPerKey will fire as fast as it can, in other words, there is no throttling. So, if we have a topic that has the following msgs: msg | timestamp (mm,ss) ----------------------- A | 01:00 B | 01:01 D | 06:00 E | 06:04 F | 12:02 and we read them all at once, we will have one window that contains [A,B] and another one that has [D,E], and a third that has [F]. Once we get the max offset for all three, won't they fire back to back without delay? So F will fire as soon as E is finished committing, which fires immediately after B is committed?