Re: Lazy Spark Structured Streaming

2020-08-02 Thread Jungtaek Lim
SPARK-24156 runs a no-data batch to apply the updated watermark, but the updated watermark may not be high enough to evict all state rows (this depends on, e.g., the window size and the lateness threshold of the watermark). You'll still need to provide a dummy input record to advance the watermark so that all expected state rows can be evicted.
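
To make the point above concrete, here is a toy model in plain Python. It is not Spark code and does not use any Spark API; `ToyQuery`, `run_batch`, `WINDOW_MIN` and `DELAY_MIN` are hypothetical names invented for this sketch. It assumes event times are integer minutes, windows are 10-minute tumbling windows, the watermark is the maximum event time seen so far minus a 5-minute delay, a newly computed watermark only takes effect in the next batch, and a window's state is evicted once the window end is at or below the watermark. Real Spark behavior has more detail than this.

```python
from dataclasses import dataclass, field

WINDOW_MIN = 10  # assumed tumbling window size, in minutes
DELAY_MIN = 5    # assumed watermark delay (allowed lateness), in minutes


@dataclass
class ToyQuery:
    """Toy stand-in for a windowed streaming aggregation (not a Spark API)."""
    watermark: int = 0
    max_event_time: int = 0
    state: dict = field(default_factory=dict)  # window start -> row count

    def run_batch(self, event_times):
        """Process one micro-batch and return the window starts evicted in it."""
        for t in event_times:
            start = t - t % WINDOW_MIN
            if start + WINDOW_MIN <= self.watermark:
                continue  # too late: its window is already behind the watermark
            self.state[start] = self.state.get(start, 0) + 1
            self.max_event_time = max(self.max_event_time, t)
        # Eviction uses the watermark computed by the PREVIOUS batch.
        evicted = sorted(s for s in self.state
                         if s + WINDOW_MIN <= self.watermark)
        for s in evicted:
            del self.state[s]
        # The new watermark only takes effect in the next batch.
        self.watermark = max(self.watermark, self.max_event_time - DELAY_MIN)
        return evicted


q = ToyQuery()
first = q.run_batch([1, 4, 12])   # real data; watermark becomes 12 - 5 = 7
no_data_1 = q.run_batch([])       # no-data batch applies watermark 7
remaining_1 = sorted(q.state)     # windows starting at 0 and 10 are still held

dummy = q.run_batch([25])         # dummy record; watermark becomes 25 - 5 = 20
no_data_2 = q.run_batch([])       # no-data batch applies watermark 20
remaining_2 = sorted(q.state)     # only the dummy record's own window is left

print(no_data_1, remaining_1)     # [] [0, 10]
print(no_data_2, remaining_2)     # [0, 10] [20]
```

In this model the first no-data batch evicts nothing, because a watermark of 7 is below the end of both windows (10 and 20). The windows are released only after the dummy record at minute 25 pushes the watermark to 20 and a further no-data batch applies it.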

Re: Lazy Spark Structured Streaming

2020-08-02 Thread Phillip Henry
Thanks, Jungtaek. Very useful information. Could I please trouble you with one further question? What you said makes perfect sense, but to what exactly does SPARK-24156 refer, if not fixing the "need to add a dummy record to move watermark