Re: Reset auto.offset.reset in Kafka 0.10 integ

2016-09-07 Thread Cody Koeninger
Kafka as yet doesn't have a good way to distinguish between "it's ok to reset at the beginning of a job" and "it's ok to reset any time offsets are out of range" https://issues.apache.org/jira/browse/KAFKA-3370 Because of that, it's better IMHO to err on the side of obvious failures, as opposed

Re: Reset auto.offset.reset in Kafka 0.10 integ

2016-09-07 Thread Srikanth
Yes that's right. I understand this is a data loss. The restart doesn't have to be all that silent. It requires us to set a flag. I thought auto.offset.reset is that flag. But there isn't much I can do at this point given that retention has cleaned things up. The app has to start. Let admins

Re: Reset auto.offset.reset in Kafka 0.10 integ

2016-09-07 Thread Cody Koeninger
Just so I'm clear on what's happening... - you're running a job that auto-commits offsets to kafka. - you stop that job for longer than your retention - you start that job back up, and it errors because the last committed offset is no longer available - you think that instead of erroring, the job

Re: Reset auto.offset.reset in Kafka 0.10 integ

2016-09-07 Thread Srikanth
My retention is 1d which isn't terribly low. The problem is every time I restart after retention expiry, I get this exception instead of honoring auto.offset.reset. It isn't a corner case where retention expired after driver created a batch. Its easily reproducible and consistent. On Tue, Sep 6,

Re: Reset auto.offset.reset in Kafka 0.10 integ

2016-09-06 Thread Cody Koeninger
You don't want auto.offset.reset on executors, you want executors to do what the driver told them to do. Otherwise you're going to get really horrible data inconsistency issues if the executors silently reset. If your retention is so low that retention gets expired in between when the driver

Re: Reset auto.offset.reset in Kafka 0.10 integ

2016-09-06 Thread Srikanth
This isn't a production setup. We kept retention low intentionally. My original question was why I got the exception instead of it using auto.offset.reset on restart? On Tue, Sep 6, 2016 at 10:48 AM, Cody Koeninger wrote: > If you leave enable.auto.commit set to true, it

Re: Reset auto.offset.reset in Kafka 0.10 integ

2016-09-06 Thread Cody Koeninger
If you leave enable.auto.commit set to true, it will commit offsets to kafka, but you will get undefined delivery semantics. If you just want to restart from a fresh state, the easiest thing to do is use a new consumer group name. But if that keeps happening, you should look into why your

Reset auto.offset.reset in Kafka 0.10 integ

2016-09-02 Thread Srikanth
Hi, Upon restarting my Spark Streaming app it is failing with error Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 6, localhost):