Hi all, I am opening this thread to discuss FLIP-328: Allow source operators to determine isProcessingBacklog based on watermark lag[1]. We had a several discussions with Dong Ling about the design, and thanks for all the valuable advice.
The FLIP aims to target the use-case where user want to run a Flink job to backfill historical data in a high throughput manner and continue processing real-time data with low latency. Building upon the backlog concept introduced in FLIP-309[2], this proposal enables sources to report their status of processing backlog based on the watermark lag. We would greatly appreciate any comments or feedback you may have on this proposal. Best, Xuannan [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-328%3A+Allow+source+operators+to+determine+isProcessingBacklog+based+on+watermark+lag [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-309%3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog