Here's a proposal to add this - https://github.com/apache/spark/pull/21819

It's always good to set "maxOffsetsPerTrigger" unless you want Spark to
process until the end of the stream in each micro-batch. Even without
"maxOffsetsPerTrigger", the lag can be non-zero by the time the micro-batch
completes, because new records can arrive while the batch is running.
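
Until something like that is available, a rough way to estimate the backlog
yourself is to compare the offsets the query has processed with the latest
offsets on the topic. The sketch below assumes the processed offsets come from
the query progress (the source's end offset) in the form
{"topic": {"partition": offset}}, and that you fetch the latest offsets
separately with a Kafka client of your choice. The names parse_offsets and
backlog are hypothetical helpers, not part of Spark or Kafka.

```python
import json


def parse_offsets(offsets):
    """Flatten offsets shaped like {"topic": {"0": 120, "1": 95}}
    into {(topic, partition): offset}.

    Accepts either a JSON string or an already-parsed dict, since the
    representation can differ depending on how the progress is read.
    """
    parsed = json.loads(offsets) if isinstance(offsets, str) else offsets
    return {
        (topic, int(partition)): offset
        for topic, partitions in parsed.items()
        for partition, offset in partitions.items()
    }


def backlog(end_offsets, latest_offsets):
    """Return records still waiting per (topic, partition).

    end_offsets:    offsets the query has processed up to (assumed to be
                    taken from the streaming query progress).
    latest_offsets: {(topic, partition): latest offset on the broker},
                    fetched by the caller; not shown here.
    """
    processed = parse_offsets(end_offsets)
    lag = {}
    for tp, latest in latest_offsets.items():
        # Simplification: a partition the query has not reported yet is
        # counted from offset 0, which overcounts if old data has expired.
        done = processed.get(tp, 0)
        # Clamp at zero in case the two snapshots were taken at slightly
        # different times.
        lag[tp] = max(latest - done, 0)
    return lag


if __name__ == "__main__":
    processed = '{"events": {"0": 120, "1": 95}}'
    latest = {("events", 0): 150, ("events", 1): 95, ("events", 2): 10}
    per_partition = backlog(processed, latest)
    print(per_partition)
    print("total backlog:", sum(per_partition.values()))
```

Since the two sets of offsets are read at different moments, treat the result
as an approximation rather than an exact count.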

On 30 July 2018 at 08:50, Burak Yavuz <brk...@gmail.com> wrote:

> If you don't set rate limiting through `maxOffsetsPerTrigger`, Structured
> Streaming will always process until the end of the stream. So the number of
> records waiting to be processed should be 0 at the start of each trigger.
>
> On Mon, Jul 30, 2018 at 8:03 AM, Kailash Kalahasti <
> kailash.kalaha...@gmail.com> wrote:
>
>> Is there any way to find out the backlog on a Kafka topic while using Spark
>> Structured Streaming? I checked a few consumer APIs, but they require
>> setting a group id for the stream, which does not seem to be allowed.
>>
>> Basically, I want to know the number of records waiting to be processed.
>>
>> Any suggestions?
>>
>
>
