It’s not intermittent; it seems to happen every time. Spark fails when it
starts up from the last checkpoint and complains that the offset is old. I
checked the offset, and it has indeed expired on the Kafka side. My Spark
version is 2.4.4, using Kafka 0.10.
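
For anyone wanting to run the same check, here is a minimal sketch of the comparison, not a definitive tool. It assumes you have already collected two partition-to-offset maps by hand: the offsets from the checkpoint and the earliest offsets Kafka still retains. The function name and the sample numbers (other than partition 1 at offset 2000) are hypothetical.

```python
def find_expired_partitions(checkpointed, earliest):
    """Return partitions whose checkpointed offset is older than the
    earliest offset still retained by Kafka.

    Both arguments map partition id -> offset. Partitions missing from
    `earliest` are skipped because nothing can be concluded about them.
    """
    expired = {}
    for partition, offset in checkpointed.items():
        earliest_available = earliest.get(partition)
        if earliest_available is not None and offset < earliest_available:
            # The checkpoint points at data that retention already removed.
            expired[partition] = (offset, earliest_available)
    return expired


# Hypothetical values: the checkpoint says partition 1 is at offset 2000,
# while Kafka's earliest retained offset for that partition is 3500.
checkpointed = {1: 2000}
earliest = {1: 3500}
expired = find_expired_partitions(checkpointed, earliest)
print(expired)
```

A non-empty result means the checkpointed position is no longer available on the Kafka side. The Kafka source's `failOnDataLoss` option (default `true`) controls whether the query fails in that situation.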
On Sun, Apr 19, 2020 at 3:38 PM
That sounds odd. Is it intermittent, or always reproducible if you start
with the same checkpoint? What's the version of Spark?
On Fri, Apr 17, 2020 at 6:17 AM Ruijing Li wrote:
> Hi all,
>
> I have a question on how structured streaming does checkpointing. I’m
> noticing that spark is not reading
Hi all,
I have a question on how Structured Streaming does checkpointing. I’m
noticing that Spark is not reading from the max / latest offset it’s seen.
For example, in HDFS, I see it stored offset file 30, which contains
partition: offset {1: 2000}
But instead after stopping the job and
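
To inspect an offset file like the one mentioned above, a small sketch along these lines may help. It assumes the file layout is a version line, then a JSON metadata line, then one JSON line per source (or `-` when a source has no offset), and that you have already copied the file's text out of HDFS. The topic name `events` and the metadata values in the sample are made up for illustration.

```python
import json


def parse_offset_file(text):
    """Parse the text of one checkpoint offsets file.

    Assumed layout: line 1 is a version marker, line 2 is a JSON metadata
    object, and each remaining line is either "-" (no offset for that
    source) or a JSON object of the form {topic: {partition: offset}}.
    """
    lines = text.strip().splitlines()
    version, metadata_line, source_lines = lines[0], lines[1], lines[2:]
    metadata = json.loads(metadata_line)

    sources = []
    for line in source_lines:
        if line.strip() == "-":
            sources.append(None)
            continue
        raw = json.loads(line)
        # JSON object keys are strings; convert partition ids to ints.
        sources.append({
            topic: {int(p): off for p, off in partitions.items()}
            for topic, partitions in raw.items()
        })
    return version, metadata, sources


# Hypothetical file contents matching the example: partition 1 at offset 2000.
sample = """v1
{"batchWatermarkMs":0,"batchTimestampMs":1587000000000}
{"events":{"1":2000}}
"""

version, metadata, sources = parse_offset_file(sample)
print(version, sources[0])
```

Running this on the highest-numbered file in the checkpoint's offsets directory shows which offsets the last planned batch recorded for each partition.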