This sounds like a bug, as described. Here's the logic, shared by all runners: https://github.com/apache/beam/blob/master/runners/core-java/src/main/java/org/apache/beam/runners/core/ReduceFnRunner.java#L958
Regarding "race condition equivalent" I mean that when you have an early trigger set up there is a benign race condition that determines the presence or absence of an on time pane. So you should never count on it unless you insist on it. Regarding Robert's point I agree that if we try to interpret triggers as an expression of human intent then it seems that AfterWatermark variants should all output an ON_TIME pane. It is really the whole windowing strategy by which the user specifies their intent. IMO there are a couple reasons. Foremost, it becomes hard/impossible to reason about systems where there are layers built upon the design philosophy of guessing intent. So it is better to focus on clear boundaries; in this case triggers govern whether the runner is permitted to produce non-lifecycle outputs, while separate flags control the lifecycle outputs. You can imagine a small DSL where the combination of trigger, on time behavior, and closing behavior are expressed together. That is more or less what WindowingStrategy is trying to get at. I also think the space of triggers is sufficiently arbitrary and unexplored that I don't want to couple them to things that are more obviously fundamental, like on time behavior, lateness, and closing behavior. Previously, triggers were a user-defined state machine, which I think is overly general and too inefficient in a portable setting. But our particular set of triggers has cruft you don't need and also sometimes lacks things you do need. If/when we invest in and move toward sink triggers (aka specify business logic and let the runner figure out the rest) seems good to re-open the design space. Kenn On Mon, Jan 13, 2020 at 7:10 PM Aaron Dixon <[email protected]> wrote: > Using ClosingBehavior I was able to see all windows get final panes fired > (w/ isLast=true). > > Still unsure why OnTimeBehavior=ALWAYS wasn't enough to ensure an on-time > pane for all windows. Though not clear on the meaning of the 'race > condition equivalency' you mentioned Kenneth. Am very interested in > understanding how this is supposed to work in general - but am very happy > that I was able to get final panes by setting > ClosingBehavior=ALWAYS....thank you for that pro-tip!! > > On Mon, Jan 13, 2020 at 7:57 PM Robert Bradshaw <[email protected]> > wrote: > >> I think AfterWatermark in particular should *alway* produce an ON_TIME >> pane, regardless of whether there were early panes. (It's less clear >> with non-watermark triggers like after count or processing time.) This >> makes it feel like the on time behavior is a property of the trigger, >> not the windowing strategy. >> >> On Mon, Jan 13, 2020 at 5:06 PM Aaron Dixon <[email protected]> wrote: >> > >> > Kenn, thank you! There is OnTimeBehavior (default FIRE_ALWAYS) and >> ClosingBehavior (default FIRE_IF_NON_EMPTY). Given that OnTimeBehavior is >> always-fire, shouldn't I see empty ON_TIME panes? >> > >> > Since my lateness config is 0, I'm going to try ClosingBehavior = >> FIRE_ALWAYS and see if I can rely on .isLast() to pick out the last pane >> downstream. But curious if given that the OnTimeBehavior default is ALWAYS, >> shouldn't I be seeing on-time panes in my current config? >> > >> > >> > >> > On Mon, Jan 13, 2020 at 6:45 PM Kenneth Knowles <[email protected]> >> wrote: >> >> >> >> On my phone, so I can't grab the jira so easily, but quickly: EARLY >> panes are "race condition equivalent" to ON_TIME panes. The early panes >> consume all the pending elements then the on time pane is "empty". This is >> WAI if it is what is causing it. You need to explicitly set >> Window.configure().fireAlways()*. I know this is counterintuitive in >> accumulating mode, where the empty pane is not the identity element. >> >> >> >> Kenn >> >> >> >> *I don't recall if this is the default or not, and also because on >> phone it is slow to look up. From your experience I think not default. >> >> >> >> On Mon, Jan 13, 2020, 15:03 Aaron Dixon <[email protected]> wrote: >> >>> >> >>> Any confirmation on this from anyone? Whether per Beam spec, runners >> are obligated to send ON_TIME panes for AfterWatermark triggers? I'm stuck >> because this seems fundamental, so it's hard to imagine this is a Dataflow >> bug, but OTOH it's also hard to imagine that trigger specs like >> AfterWatermark are "optional"... ? >> >>> >> >>> On Mon, Jan 13, 2020 at 4:18 PM Aaron Dixon <[email protected]> >> wrote: >> >>>> >> >>>> Yes. Using calendar day-based windows and watermark is completely >> caught up to today ... calendar window ends several days ago. I got EARLY >> panes for each element but never ON_TIME pane. >> >>>> >> >>>> On Mon, Jan 13, 2020 at 4:16 PM Luke Cwik <[email protected]> wrote: >> >>>>> >> >>>>> Is the watermark advancing past the end of the window? >> >>>>> >> >>>>> On Mon, Jan 13, 2020 at 2:02 PM Aaron Dixon <[email protected]> >> wrote: >> >>>>>> >> >>>>>> The window is not empty fwiw; it has elements; I get an early >> firing pane for the window but well after the watermark passes there is no >> ON_TIME pane. Would this be a bug in Dataflow? Seems fundamental, so I'm >> concerned perhaps the Beam spec doesn't obligate ON_TIME firings? >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> On Mon, Jan 13, 2020 at 3:58 PM Luke Cwik <[email protected]> >> wrote: >> >>>>>>> >> >>>>>>> I would have expected an empty on time pane since the default on >> time behavior is FIRE_ALWAYS. >> >>>>>>> >> >>>>>>> On Mon, Jan 13, 2020 at 1:54 PM Aaron Dixon <[email protected]> >> wrote: >> >>>>>>>> >> >>>>>>>> Can anyone confirm? >> >>>>>>>> >> >>>>>>>> This is intermittent. Some (it seems, sparse) windows don't get >> an ON_TIME firing after watermark. Is this a bug or is there a reason to >> not expect ON_TIME firings for every window? >> >>>>>>>> >> >>>>>>>> On Mon, Jan 13, 2020 at 3:47 PM Rui Wang <[email protected]> >> wrote: >> >>>>>>>>> >> >>>>>>>>> If it indeed happened as you have described, I will be very >> interested in the expected behaviour. >> >>>>>>>>> >> >>>>>>>>> Something I remembered before: the trigger condition meets just >> gives the runner/engine "permission" to fire, but runner/engine may not >> fire immediately. But I don't know if the engine/runner will guarantee to >> fire. >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> >> >>>>>>>>> -Rui >> >>>>>>>>> >> >>>>>>>>> On Mon, Jan 13, 2020 at 1:43 PM Aaron Dixon <[email protected]> >> wrote: >> >>>>>>>>>> >> >>>>>>>>>> I have the following trigger: >> >>>>>>>>>> >> >>>>>>>>>> .apply(Window >> >>>>>>>>>> .configure() >> >>>>>>>>>> .triggering(AfterWatermark >> >>>>>>>>>> .pastEndOfWindow() >> >>>>>>>>>> .withEarlyFirings(AfterPane >> >>>>>>>>>> .elementCountAtLeast(1))) >> >>>>>>>>>> .accumulatingFiredPanes() >> >>>>>>>>>> .withAllowedLateness(Duration.ZERO) >> >>>>>>>>>> >> >>>>>>>>>> But in Dataflow I notice that I never get an ON_TIME firing >> for my window -- I only see early firing for elements, and then nothing. >> >>>>>>>>>> >> >>>>>>>>>> My assumption is that AfterWatermark should give me a last, >> on-time pane under this configuration when the watermark surpasses the >> window's end. >> >>>>>>>>>> >> >>>>>>>>>> Is my expectation correct? >> >
