Hi Guillermo,

What was the interval between restarts of the Spark job? As a feature of Kafka, the broker deletes the committed offsets for a consumer group after 24 hours of inactivity. In that case, a newly started Spark Streaming job using the same groupId will read offsets from the beginning.
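For reference, this is governed by a broker-side setting; the values below are the Kafka 1.0.x defaults (raise the retention if jobs may stay down longer than a day):

```properties
# server.properties (broker side)
# How long committed offsets for an inactive group are kept before deletion.
# Default in Kafka 1.0.x is 1440 minutes (24 hours).
offsets.retention.minutes=1440
# How often the broker checks for and purges expired offsets.
offsets.retention.check.interval.ms=600000
```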
Akshay Bhardwaj
+91-97111-33849

On Thu, Feb 21, 2019 at 9:08 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:

> From the info you've provided there is not much to say.
> Maybe you could collect a sample app, logs etc., open a jira and we can take
> a deeper look at it...
>
> BR,
> G
>
> On Thu, Feb 21, 2019 at 4:14 PM Guillermo Ortiz <konstt2...@gmail.com> wrote:
>
>> I'm working with Spark Streaming 2.0.2 and Kafka 1.0.0 using Direct Stream
>> as the connector. I consume data from Kafka and commit the offsets myself.
>> I can see Spark doing commits of the last processed offsets in the logs.
>> Sometimes I have restarted Spark and it starts from the beginning, even
>> though I'm using the same groupId.
>>
>> Why could this happen? It only happens rarely.
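For context, the manual-commit pattern Guillermo describes usually looks like the sketch below (spark-streaming-kafka-0-10 API; the bootstrap servers, topic name, and group id are placeholders, and `ssc` is assumed to be an existing StreamingContext):

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010._

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "broker:9092",      // placeholder
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "myGroupId",                 // placeholder
  // Only consulted when the broker has NO stored offset for the group,
  // e.g. after offsets.retention.minutes has expired them.
  "auto.offset.reset" -> "latest",
  "enable.auto.commit" -> (false: java.lang.Boolean)
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  LocationStrategies.PreferConsistent,
  ConsumerStrategies.Subscribe[String, String](Seq("myTopic"), kafkaParams)
)

stream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  // ... process rdd ...
  // Commit back to Kafka only after processing succeeds.
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}
```

If the job stays down past the retention window, those committed offsets are gone, and on restart `auto.offset.reset` decides where the group starts, which would explain the occasional restart-from-beginning.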