Re: Kafka backlog - spark structured streaming

2018-07-30 Thread Arun Mahadevan
Here's a proposal to add it: https://github.com/apache/spark/pull/21819 It's always good to set "maxOffsetsPerTrigger" unless you want Spark to process to the end of the stream in each micro-batch. Even without "maxOffsetsPerTrigger", the lag can be non-zero by the time the micro-batch completes, because new records keep arriving while the batch runs.
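
To make the lag behaviour concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark code: it assumes a constant number of records arriving during every micro-batch, and the names `simulate_backlog`, `arrivals_per_batch` and `initial_backlog` are made up for illustration. In a real job the limit would be set on the Kafka source with `.option("maxOffsetsPerTrigger", ...)`.

```python
def simulate_backlog(arrivals_per_batch, max_offsets_per_trigger=None,
                     triggers=3, initial_backlog=0):
    """Toy model of Kafka lag across micro-batches (hypothetical helper).

    Returns one tuple per trigger:
    (available_at_start, planned, left_unplanned, lag_at_completion)
    """
    backlog = initial_backlog
    history = []
    for _ in range(triggers):
        available = backlog
        if max_offsets_per_trigger is None:
            # No rate limit: the batch takes everything available right now.
            planned = available
        else:
            # Rate limit: the batch takes at most max_offsets_per_trigger records.
            planned = min(available, max_offsets_per_trigger)
        left_unplanned = available - planned
        # Assumption: records keep arriving while the batch is running,
        # so the lag seen when the batch completes includes them.
        lag_at_completion = left_unplanned + arrivals_per_batch
        history.append((available, planned, left_unplanned, lag_at_completion))
        backlog = lag_at_completion
    return history


# Without a limit, nothing is left unplanned at the start of a trigger,
# yet the lag at completion equals whatever arrived during the batch.
unlimited = simulate_backlog(arrivals_per_batch=100, initial_backlog=500)

# With a limit below the arrival rate, the backlog grows every trigger.
limited = simulate_backlog(arrivals_per_batch=100,
                           max_offsets_per_trigger=80,
                           initial_backlog=500)
```

In this model the unlimited run leaves zero records unplanned at each trigger but still reports a lag of 100 when each batch finishes, while the limited run falls further behind whenever the limit is lower than the arrival rate.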

Re: Kafka backlog - spark structured streaming

2018-07-30 Thread Burak Yavuz
If you don't set rate limiting through `maxOffsetsPerTrigger`, Structured Streaming will always process until the end of the stream. So the number of records waiting to be processed should be 0 at the start of each trigger. On Mon, Jul 30, 2018 at 8:03 AM, Kailash Kalahasti <