Re: How to gracefully handle Kafka OffsetOutOfRangeException

2017-03-10 Thread Justin Miller
I've created a ticket here: https://issues.apache.org/jira/browse/SPARK-19888 


Thanks,
Justin


Re: How to gracefully handle Kafka OffsetOutOfRangeException

2017-03-10 Thread Michael Armbrust
If you have a reproduction, you should open a JIRA. It would be great if
there were a fix. I'm just saying that I know a similar issue does not
exist in Structured Streaming.


Re: How to gracefully handle Kafka OffsetOutOfRangeException

2017-03-10 Thread Justin Miller
Hi Michael,

I'm experiencing a similar issue. Will this not be fixed in Spark Streaming?

Best,
Justin


Re: How to gracefully handle Kafka OffsetOutOfRangeException

2017-03-10 Thread Michael Armbrust
One option here would be to try Structured Streaming. We've added an option,
"failOnDataLoss", that will cause Spark to just skip ahead when this
exception is encountered (it's off by default, though, so you don't silently
miss data).
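
For reference, a minimal sketch of what that option looks like on a Structured Streaming Kafka source (this assumes the spark-sql-kafka connector is on the classpath; the broker address, topic name, and app name below are placeholders):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("kafka-structured-streaming-sketch") // hypothetical app name
  .getOrCreate()

// Setting failOnDataLoss to "false" tells the Kafka source to skip ahead past
// offsets that are no longer available instead of failing the query.
val kafkaStream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9092") // placeholder broker
  .option("subscribe", "my-topic")                   // placeholder topic
  .option("failOnDataLoss", "false")
  .load()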


Re: How to gracefully handle Kafka OffsetOutOfRangeException

2017-03-10 Thread Ramkumar Venkataraman
Nope, but after we migrated to Spark 1.6, we haven't seen the errors yet. Not
sure if they were fixed between releases, or if it's just a weird timing thing
that we haven't hit yet in 1.6 as well.

On Sat, Mar 4, 2017 at 12:00 AM, nimmi.cv [via Apache Spark User List] <
ml-node+s1001560n28454...@n3.nabble.com> wrote:

> Did you find out how?

Re: How to gracefully handle Kafka OffsetOutOfRangeException

2016-03-21 Thread Ramkumar Venkataraman
Which is what surprises me as well. I am able to reproduce this consistently
on my Spark 1.5.2 setup: the same Spark job crashes immediately without
checkpointing, but when I enable it, the job continues in spite of the
exceptions.


Re: How to gracefully handle Kafka OffsetOutOfRangeException

2016-03-21 Thread Cody Koeninger
Spark Streaming in general will retry a batch N times and then move on to
the next one... off the top of my head, I'm not sure why checkpointing
would have an effect on that.
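
For what it's worth, a minimal sketch of where a retry count like that is usually tuned; reading the "N times" above as the task-level spark.task.maxFailures setting is an assumption on my part, and failures in the driver behave differently:

import org.apache.spark.SparkConf

// spark.task.maxFailures (default 4) controls how many times a failing task
// is retried before the stage, and with it the batch's job, is marked failed.
val conf = new SparkConf()
  .setAppName("streaming-retry-sketch") // hypothetical app name
  .set("spark.task.maxFailures", "8")   // illustrative value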


Re: How to gracefully handle Kafka OffsetOutOfRangeException

2016-03-21 Thread Ramkumar Venkataraman
Thanks Cody for the quick help. Yes, the exception is happening in the
executors during processing. I will look into cloning the KafkaRDD and
swallowing the exception.

But something weird is happening: when I enable checkpointing on the job,
my job doesn't crash; it happily proceeds with the next batch, even though
I see tons of exceptions in the executor logs. So the question is: why is
it that the Spark job doesn't crash when checkpointing is enabled?

I have my code pasted here:
https://gist.github.com/ramkumarvenkat/00f4fc63f750c537defd

I am not too sure if this is an issue with the Spark engine or with the
streaming module. Please let me know if you need more logs, or if you want
me to raise a GitHub issue/JIRA.

Sorry for digressing on the original thread.
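
For context, a minimal sketch of the checkpointing setup being described, assuming the standard StreamingContext.getOrCreate recovery pattern (the checkpoint directory, app name, and batch interval are placeholders; the actual job is in the gist above):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val checkpointDir = "hdfs:///tmp/checkpoints/kafka-job" // placeholder path

def createContext(): StreamingContext = {
  val conf = new SparkConf().setAppName("kafka-direct-stream-sketch")
  val ssc = new StreamingContext(conf, Seconds(10))
  ssc.checkpoint(checkpointDir)
  // ... create the direct stream and attach the processing logic here ...
  ssc
}

// Recover from an existing checkpoint if present, otherwise build a fresh context.
val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
ssc.awaitTermination()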


Re: How to gracefully handle Kafka OffsetOutOfRangeException

2016-03-19 Thread Sebastian Piu
Try to troubleshoot why it is happening; maybe some messages are too big to
be read from the topic? I remember getting that error, and that was the cause.
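
In case it helps, a minimal sketch of raising the consumer fetch size in the kafkaParams passed to createDirectStream; the broker address and the 8 MB value are only illustrative, and whether an undersized fetch is actually the cause here is a guess:

// Old (0.8-era) consumer settings used by the direct stream; values are placeholders.
val kafkaParams = Map[String, String](
  "metadata.broker.list"    -> "broker1:9092", // placeholder broker
  "auto.offset.reset"       -> "smallest",
  // Raise the per-partition fetch size so larger messages can still be read.
  "fetch.message.max.bytes" -> (8 * 1024 * 1024).toString
)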


Re: How to gracefully handle Kafka OffsetOutOfRangeException

2016-03-19 Thread Cody Koeninger
Is that happening only at startup, or during processing? If that's
happening during normal operation of the stream, you don't have enough
resources to process the stream in time.

There's not a clean way to deal with that situation, because it's a
violation of preconditions. If you want to modify the code to do what
makes sense for you, start looking at handleFetchErr in KafkaRDD.scala.
Recompiling that package isn't a big deal, because it's not part of the
core Spark deployment, so you'll only have to change your job, not the
deployed version of Spark.



On Fri, Mar 18, 2016 at 6:16 AM, Ramkumar Venkataraman wrote:
> I am using Spark streaming and reading data from Kafka using
> KafkaUtils.createDirectStream. I have the "auto.offset.reset" set to
> smallest.
>
> But in some Kafka partitions, I get kafka.common.OffsetOutOfRangeException
> and my spark job crashes.
>
> I want to understand if there is a graceful way to handle this failure and
> not kill the job. I want to keep ignoring these exceptions, as some other
> partitions are fine and I am okay with data loss.
>
> Is there any way to handle this and not have my spark job crash? I have no
> option of increasing the kafka retention period.
>
> I tried to have the DStream returned by createDirectStream() wrapped in a
> Try construct, but since the exception happens in the executor, the Try
> construct didn't take effect. Do you have any ideas of how to handle this?
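
As an aside on the last point quoted above, a rough sketch of why the Try doesn't help: it only wraps the driver-side creation of the DStream, while the OffsetOutOfRangeException is thrown later inside executor tasks (ssc and kafkaParams are assumed to be defined as in the earlier sketches; the topic name is a placeholder):

import scala.util.Try
import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils

// The Try only covers building the DStream on the driver; nothing is fetched yet.
val maybeStream = Try {
  KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
    ssc, kafkaParams, Set("my-topic")) // placeholder topic
}

maybeStream.foreach { stream =>
  stream.foreachRDD { rdd =>
    // The actual Kafka fetch happens here, on the executors, when the RDD is
    // computed; an OffsetOutOfRangeException surfaces as a task failure and
    // never propagates back into the Try above.
    println(s"records in batch: ${rdd.count()}")
  }
}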