Hi,
I'm a little confused about the usage of watermark in SS. According to the
guideline, when we use a window-based grouping, SS will automatically handle
the late event and we should use watermark to limit the state like
this(specify a watermark before groupBy):
val words = ... // streaming
Hi,
I'm working with structured streaming, and I'm wondering whether there
should be some improvements about trigger.
Currently, when I specify a trigger, i.e. tigger(Trigger.ProcessingTime("10
minutes")), the engine will begin processing data at the time the trigger
begins, like 10:00:00,
Hi,
I'm working with Structured Streaming to process logs from kafka and use
watermark to handle late events. Currently the watermark is computed by (max
event time seen by the engine - late threshold), and the same watermark is
used for all partitions.
But in production environment it happens