Having just re-read FLIP-134, I think it mostly makes sense, though I'm not exactly looking forward to figuring out how to explain it without making it seem overly complicated.
A few points: I'm a bit confused by the discussion around custom window Triggers. Yes, I agree that complex, mixed Triggers are sometimes useful. And I buy into the argument that we want to FAIL hard for processing-time on BATCH. But why not go ahead and FAIL Triggers that can't work, rather than ignoring all custom Triggers? A BIG +1 from me for deprecating timeWindow. I do think it's critical that bounded streaming has the same configuration as unbounded streaming. Users expect/need things like processing time timers in bounded streaming during development. If I've understood the proposal correctly, this will be the case. I would prefer WARN over IGNORE as the default for cases where users have explicitly specified something that isn’t going to happen. (I would also like to see a warning given for any job that uses event time timers without having a watermark strategy, though that's unrelated to the topic at hand.) David On Thu, Sep 10, 2020 at 2:57 PM Dawid Wysakowicz <dwysakow...@apache.org> wrote: > Thanks for the explanation! Makes sense now! What do you think about > adding this behaviour of WindowAssigner in streaming mode as well? I > mean the behaviour of emitting at the end of a Window. I think it would > make sense in the STREAM mode as well and keep the two modes more aligned. > > Best, > > Dawid > > On 10/09/2020 11:42, Aljoscha Krettek wrote: > > On 10.09.20 11:30, Dawid Wysakowicz wrote: > >> I am not sure about the option for ignoring the Triggers. Do you mean to > >> ignore all the Triggers including e.g. Flink's such as CountTrigger, > >> EventTimeTrigger etc.? Won't it effectively disable the WindowOperator > >> whatsoever. Or even worse make it unusable with ever growing state? I > >> might be wrong here but aren't Triggers required for emitting results > >> from WindowOperator? If I am correct we emit results only if a Trigger > >> returns FIRE from on of onElement, onEventTime, onProcessingTime. Why do > >> you think it does not work well with FAILing hard without this option? > >> We could fail hard e.g. if the WindowAssigner#isEventTime returns false. > > > > The problem I'm trying to solve are mixed Triggers. Say you have a > > Trigger that does "fire when watermark passes maxTimestamp() but also > > fire every 5 minutes in processing time and when the watermark passes > > maxTimestamp() fire for every 5 new records". This is something that > > the Beam API for example allows users to specify and is something that > > I think is potentially valuable in the real world. > > > > Ignoring Triggers would mean that we always fire on the maxTimestamp() > > by hardcoding this in a WindowOperator that we use for BATCH > > execution. With this, the WindowAssigner becomes the only thing that > > changes. This is similar to how Beam treats windows, where the > > WindowAssigner carries semantic content but the Trigger is only for > > optimizing streaming emission, which you don't need for BATCH where > > you always have a "perfect watermark". > > > > Coming back to the initial example, such a Trigger would not work if > > we FAIL hard for processing-time on BATCH, which I'm suggesting > > because we otherwise have potentially surprising results if business > > logic depends on processing-time timers. For Windows, on the other > > hand, we could get around it by agreeing that Triggers are ignored for > > BATCH. > > > >> As for the question with getProcessingTime(). From my point of view, it > >> would be safe to simply return the current system time. I cannot think > >> of any dangers if we do so. Moreover, frankly speaking I am not entirely > >> sure what is the purpose of the method, other than injecting a clock in > >> tests of built-in operators. Maybe it was a mistake to expose it in the > >> user's API? > > > > I agree, it was a mistake to expose getProcessingTime(). And I also > > think the same about getCurrentWatermark(), but that's neither here > > nor there. 😅 I then also agree to just return the current time, as > > you said. I will change the FLIP for this. > > > > Aljoscha > >