Having just re-read FLIP-134, I think it mostly makes sense, though I'm not
exactly looking forward to figuring out how to explain it without making it
seem overly complicated.

A few points:

I'm a bit confused by the discussion around custom window Triggers. Yes, I
agree that complex, mixed Triggers are sometimes useful. And I buy into the
argument that we want to FAIL hard for processing-time on BATCH. But why
not go ahead and FAIL Triggers that can't work, rather than ignoring all
custom Triggers?

A BIG +1 from me for deprecating timeWindow.

I do think it's critical that bounded streaming has the same configuration
as unbounded streaming. Users expect/need things like processing time
timers in bounded streaming during development. If I've understood the
proposal correctly, this will be the case.

I would prefer WARN over IGNORE as the default for cases where users have
explicitly specified something that isn’t going to happen. (I would also
like to see a warning given for any job that uses event time timers without
having a watermark strategy, though that's unrelated to the topic at hand.)

David

On Thu, Sep 10, 2020 at 2:57 PM Dawid Wysakowicz <dwysakow...@apache.org>
wrote:

> Thanks for the explanation! Makes sense now! What do you think about
> adding this behaviour of WindowAssigner in streaming mode as well? I
> mean the behaviour of emitting at the end of a Window. I think it would
> make sense in the STREAM mode as well and keep the two modes more aligned.
>
> Best,
>
> Dawid
>
> On 10/09/2020 11:42, Aljoscha Krettek wrote:
> > On 10.09.20 11:30, Dawid Wysakowicz wrote:
> >> I am not sure about the option for ignoring the Triggers. Do you mean to
> >> ignore all the Triggers including e.g. Flink's such as CountTrigger,
> >> EventTimeTrigger etc.? Won't it effectively disable the WindowOperator
> >> whatsoever. Or even worse make it unusable with ever growing state? I
> >> might be wrong here but aren't Triggers required for emitting results
> >> from WindowOperator? If I am correct we emit results only if a Trigger
> >> returns FIRE from on of onElement, onEventTime, onProcessingTime. Why do
> >> you think it does not work well with FAILing hard without this option?
> >> We could fail hard e.g. if the WindowAssigner#isEventTime returns false.
> >
> > The problem I'm trying to solve are mixed Triggers. Say you have a
> > Trigger that does "fire when watermark passes maxTimestamp() but also
> > fire every 5 minutes in processing time and when the watermark passes
> > maxTimestamp() fire for every 5 new records". This is something that
> > the Beam API for example allows users to specify and is something that
> > I think is potentially valuable in the real world.
> >
> > Ignoring Triggers would mean that we always fire on the maxTimestamp()
> > by hardcoding this in a WindowOperator that we use for BATCH
> > execution. With this, the WindowAssigner becomes the only thing that
> > changes. This is similar to how Beam treats windows, where the
> > WindowAssigner carries semantic content but the Trigger is only for
> > optimizing streaming emission, which you don't need for BATCH where
> > you always have a "perfect watermark".
> >
> > Coming back to the initial example, such a Trigger would not work if
> > we FAIL hard for processing-time on BATCH, which I'm suggesting
> > because we otherwise have potentially surprising results if business
> > logic depends on processing-time timers. For Windows, on the other
> > hand, we could get around it by agreeing that Triggers are ignored for
> > BATCH.
> >
> >> As for the question with getProcessingTime(). From my point of view, it
> >> would be safe to simply return the current system time. I cannot think
> >> of any dangers if we do so. Moreover, frankly speaking I am not entirely
> >> sure what is the purpose of the method, other than injecting a clock in
> >> tests of built-in operators. Maybe it was a mistake to expose it in the
> >> user's API?
> >
> > I agree, it was a mistake to expose getProcessingTime(). And I also
> > think the same about getCurrentWatermark(), but that's neither here
> > nor there. 😅 I then also agree to just return the current time, as
> > you said. I will change the FLIP for this.
> >
> > Aljoscha
>
>

Reply via email to