Hey! A couple of weeks ago me and Arvid Heise played around with an idea to address a long standing issue of Flink: lack of watermark/event time alignment between different parallel instances of sources, that can lead to ever growing state size for downstream operators like WindowOperator.
We had an impression that this is relatively low hanging fruit that can be quite easily implemented - at least partially (the first part mentioned in the FLIP document). I have written down our proposal [1] and you can also check out our PoC that we have implemented [2]. We think that this is a quite easy proposal, that has been in large part already implemented. There is one obvious limitation of our PoC. Namely we can only easily block individual SourceOperators. This works perfectly fine as long as there is at most one split per SourceOperator. However it doesn't work with multiple splits. In that case, if a single `SourceOperator` is responsible for processing both the least and the most advanced splits, we won't be able to block this most advanced split for generating new records. I'm proposing to solve this problem in the future in another follow up FLIP, as a solution that works with a single split per operator is easier and already valuable for some of the users. What do you think about this proposal? Best, Piotrek [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-182%3A+Support+watermark+alignment+of+FLIP-27+Sources [2] https://github.com/pnowojski/flink/commits/aligned-sources