Hi, folks,

A few years ago, I asked about Spark Structured Streaming (SSS) not
processing the final batch left on a Kafka topic when using groupBy,
OutputMode.Append and withWatermark.

At the time, Jungtaek Lim kindly pointed out (27/7/20) that this was
expected behaviour, that (if I have this correct) a message needs to arrive
to trigger Spark to write the lingering batch. The solution was "to add a
dummy record to move [the] watermark forward."
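
To check that I have the mechanics right, here is a toy model in plain
Python. It is not Spark code: the class name, the window length and the
delay are made up for illustration, and it ignores details such as dropping
late records. It assumes only that the watermark is the largest event time
seen minus the delay, and that a window is emitted once its end is at or
below the watermark.

```python
from collections import defaultdict

WINDOW = 10  # window length in seconds (illustrative value)
DELAY = 5    # watermark delay in seconds (illustrative value)


class ToyAppendAggregator:
    """Hypothetical, simplified model of a windowed count in append mode."""

    def __init__(self):
        self.counts = defaultdict(int)  # window start -> record count
        self.max_event_time = None
        self.watermark = None

    def process_batch(self, event_times):
        # Add each record to its tumbling window and track the max event time.
        for t in event_times:
            self.counts[(t // WINDOW) * WINDOW] += 1
            if self.max_event_time is None or t > self.max_event_time:
                self.max_event_time = t

        # The watermark only moves when a newer event time has been observed.
        if self.max_event_time is not None:
            self.watermark = self.max_event_time - DELAY
        if self.watermark is None:
            return []

        # Emit (and forget) every window that ends at or before the watermark.
        ready = sorted(s for s in self.counts if s + WINDOW <= self.watermark)
        return [(s, s + WINDOW, self.counts.pop(s)) for s in ready]


agg = ToyAppendAggregator()
first = agg.process_batch([1, 3, 12])  # watermark = 7, nothing is final yet
second = agg.process_batch([])         # no new data, watermark stays at 7
third = agg.process_batch([30])        # "dummy" record moves watermark to 25
print(first, second, third)
```

In this sketch the first two batches emit nothing, and the windows [0, 10)
and [10, 20) only come out once the record at t=30 arrives. The window
holding that record then lingers in turn, which is the behaviour I was
asking about.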

Looking at the comments in SPARK-24156, it seems people still find this
unintuitive. Would there be an appetite to address this, to ensure no
messages are left behind? Or is it a Sisyphean task whose complexity I
don't appreciate?

Regards,

Phillip
