Lateness depends on watermark. How did you configure your KafkaIO reader? Did you set custom timestamp function? By default watermark in KafkaIO is set to same as processing time, in which case, your watermark could be close to 13-38-37 (processing time). Note that this is in general true across all the runners, though I am not aware of any subtle differences in Spark runner.
On Fri, Sep 7, 2018 at 7:03 AM rahul patwari <[email protected]> wrote: > Hi, > > We are running a Beam program on Spark. We are using 2.5.0 Beam and > SparkRunner versions. We are seeing Late data in the output emitted by > Spark. As per the capability Matrix, Lateness is not supported in Spark. Is > it supported now? or Are we missing something? > > Steps: > Read from Kafka, Apply a Fixed Window of 1 Min with Lateness as 2 Min > with Late firings when an element is found with Accumulating Fired Panes, > GroupByKey, ParDo to display the result. > > Below is the output of the ParDo in which we are printing the GroupByKey > result: > Pane Timing : LATE > Processing Time : 2018-09-07----13-38-37-7290----+0000 > Element Time : 2018-09-07----13-36-59-9990----+0000 > Window Start Time : 2018-09-07----13-36-00-0000----+0000 > Window End Time : 2018-09-07----13-37-00-0000----+0000 > Pane Index : 1 > Pane NonSpeculativeIndex : 1 > > Regards, > Rahul >
