[DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-04-20 Thread Aljoscha Krettek
Hi Everyone! We would like to start a discussion on "FLIP-126: Unify (and separate) Watermark Assigners" [1]. This work was started by Stephan in an experimental branch. I expanded on that work to provide a PoC for the changes proposed in this FLIP: [2]. Currently, we have two different flav

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-04-21 Thread Timo Walther
Thanks for the proposal Aljoscha. This is a very useful unification. We have considered this FLIP already in the interfaces for FLIP-95 [1] and look forward to update to the new unified watermark generators once FLIP-126 has been accepted. Regards, Timo [1] https://github.com/apache/flink/pul

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-04-27 Thread Kostas Kloudas
Hi Aljoscha, Thanks for opening the discussion! I have two comments on the FLIP: 1) we could add lifecycle methods to the Generator, i.e. open()/ close(), probably with a Context as argument: I have not fully thought this through but I think that this is more aligned with the rest of our rich fun

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-04-27 Thread David Anderson
Overall I like this proposal; thanks for bringing it forward, Aljoscha. I also like the idea of making the Watermark generator a rich function -- this should make it more straightforward to implement smarter watermark generators. Eg, one that uses state to keep statistics about the actual out-of-o

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-04-29 Thread Aljoscha Krettek
Regarding the WatermarkGenerator (WG) interface itself. The proposal is basically to turn emitting into a "flatMap", we give the WatermarkGenerator a "collector" (the WatermarkOutput) and the WG can decide whether to output a watermark or not and can also mark the output as idle. Changing the i

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-09 Thread Stephan Ewen
Thanks, Aljoscha, for picking this up. I agree with the approach of doing the here proposed set of changes for now. It already makes things simpler and adds idleness support everywhere. Rich functions and state always add complexity, let's do this in a next step, if we have a really compelling ca

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-09 Thread Jark Wu
Hi, Regarding to the `open()/close()`, I think it's necessary for Table&SQL to compile the generated code. In Table&SQL, the watermark strategy and event-timestamp is defined using SQL expressions, we will translate and generate Java code for the expressions. If we have `open()/close()`, we don't

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-11 Thread Aljoscha Krettek
Ah, I meant to write this in my previous email, sorry about that. The WatermarkStrategy, which is basically a factory for a WatermarkGenerator is the replacement for the open() method. This is the same strategy that was followed for StreamOperatorFactory, which was introduced to allow code gen

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-11 Thread Aljoscha Krettek
We're slightly running out of time. I would propose we vote on the basic principle and remain open to later additions. This feature is quite important to make the new Kafka Source that is developed as part of FLIP-27 useful. Otherwise we would have to use the legacy interfaces in the newly adde

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-11 Thread Stephan Ewen
I am fine with that. Much of the principles seem agreed upon. I understand the need to support code-generated extractors and we should support most of it already (as Aljoscha mentioned via the factories) can extend this if needed. I think that the factory approach supports code-generated extracto

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-11 Thread Jark Wu
Thanks for the explanation. I like the fatory pattern to make the member variables immutable and final. So +1 to the proposal. Best, Jark On Mon, 11 May 2020 at 22:01, Stephan Ewen wrote: > I am fine with that. > > Much of the principles seem agreed upon. I understand the need to support > cod

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-12 Thread Dawid Wysakowicz
Hi Aljoscha, Sorry for adding comments during the vote, but I have some really minor suggestions that should not influence the voting thread imo. 1) Does it make sense to have the TimestampAssigner extend from Flink's Function? This implies it has to be serializable which with the factory pattern

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-12 Thread Stephan Ewen
+1 to all of Dawid's suggestions, makes a lot of sense to me On Tue, May 12, 2020 at 2:32 PM Dawid Wysakowicz wrote: > Hi Aljoscha, > > Sorry for adding comments during the vote, but I have some really minor > suggestions that should not influence the voting thread imo. > > 1) Does it make sense

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-12 Thread Aljoscha Krettek
Definitely +1 to point 2) raised by Dawid. I'm not sure on points 1) and 3). 1) I can see the benefit of that but in reality most timestamp assigners will probably need to be Serializable. If you look at my (updated) POC branch [1] you can see how a TimestampAssigner would be specified on the

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-12 Thread Stephan Ewen
@Aljoscha About (1) could we have an interface SerializableTimestampAssigner that simply mixes in the java.io.Serializable interface? Or will this be too clumsy? About (3) RecordTimeStamp seems to fit both cases (in-source-record timestamp, in stream-record timestamp). On Tue, May 12, 2020 at 4:

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-12 Thread Dawid Wysakowicz
I have similar thoughts to @Stephan Ad. 1 I tried something like this on your branch:     /**      * Adds the given {@link TimestampAssigner} to this {@link WatermarkStrategies}. For top-level classes that implement both Serializable and TimestampAssigner      */     public & Serializable> Water

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-12 Thread Aljoscha Krettek
Yes, I am also ok with a SerializableTimestampAssigner. This only looks a bit clumsy in the API but as a user (that uses lambdas) you should not see this. I pushed changes for this to my branch: https://github.com/aljoscha/flink/tree/flink-xxx-wm-generators-rebased And yes, recordTimestamp sou

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-12 Thread Dawid Wysakowicz
Thank you for the update and sorry again for chiming in so late... Best, Dawid On 12/05/2020 18:21, Aljoscha Krettek wrote: > Yes, I am also ok with a SerializableTimestampAssigner. This only > looks a bit clumsy in the API but as a user (that uses lambdas) you > should not see this. I pushed c

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-23 Thread Guanghui Zhang
Hi, @Aljoscha, the function param currentTimestamp comment does not match the recordTimestamp "long extractTimestamp(T element, long recordTimestamp)" on wiki. Best, Zhangguanghui Dawid Wysakowicz 于2020年5月13日周三 上午12:28写道: > Thank you for the update and sorry again for chiming in so late... > >

Re: [DISCUSS] FLIP-126: Unify (and separate) Watermark Assigners

2020-05-25 Thread Aljoscha Krettek
The specifics of naming diverged a bit from the FLIP during implementation but that should be fine. What matters in the end is the intention of the FLIP and that the code that is committed in the end is good and consistent in itself. Best, Aljoscha On 24.05.20 05:12, Guanghui Zhang wrote: Hi