Re: kafka direct streaming with checkpointing

2015-09-25 Thread Cody Koeninger
ou may have a bug that prevents the state to be saved and you >>can’t restart the app w/o upgrade >> >> Less than ideal, yes :) >> >> -adrian >> >> From: Radu Brumariu >> Date: Friday, September 25, 2015 at 1:31 AM >> To: Cody Koeninger >

Re: kafka direct streaming with checkpointing

2015-09-25 Thread Radu Brumariu
grade > > Less than ideal, yes :) > > -adrian > > From: Radu Brumariu > Date: Friday, September 25, 2015 at 1:31 AM > To: Cody Koeninger > Cc: "user@spark.apache.org" > Subject: Re: kafka direct streaming with checkpointing > > Would changing the direct stre

Re: kafka direct streaming with checkpointing

2015-09-25 Thread Radu Brumariu
app level control of a barrier (e.g. v1 >>> reads events up to 3:00am, v2 after that). Manual state management is also >>> supported by the framework but it’s harder to control because: >>> >>>- you’re not guaranteed to shut down gracefully >>>- Y

Re: kafka direct streaming with checkpointing

2015-09-25 Thread Neelesh
versions in parallel – with some app level control of a barrier (e.g. v1 >>>> reads events up to 3:00am, v2 after that). Manual state management is also >>>> supported by the framework but it’s harder to control because: >>>> >>>>- you’re not guarantee

Re: kafka direct streaming with checkpointing

2015-09-25 Thread Cody Koeninger
>>>> I believe the simplest way to get around is to support running 2 >>>> versions in parallel – with some app level control of a barrier (e.g. v1 >>>> reads events up to 3:00am, v2 after that). Manual state management is also >>>> supported by the f
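
The barrier idea quoted above (v1 reads events up to a cutoff, v2 after that) can be sketched as a small routing rule. This is only an illustration in plain Python: `owner_version` and the `barrier` value are hypothetical names, not part of Spark or Kafka, and it assumes each event carries a timestamp that both app versions compare against the same cutoff.

```python
from datetime import datetime


def owner_version(event_time, barrier):
    """Return which app version should process an event.

    Assumption: events strictly before the barrier belong to the old
    version (v1); events at or after it belong to the new version (v2).
    """
    return "v1" if event_time < barrier else "v2"


# Hypothetical cutoff matching the "3:00am" example in the thread.
barrier = datetime(2015, 9, 25, 3, 0, 0)

before = datetime(2015, 9, 25, 2, 59, 59)
after = datetime(2015, 9, 25, 3, 0, 0)

print(owner_version(before, barrier))
print(owner_version(after, barrier))
```

With a rule like this, both versions can run in parallel and each one simply skips events it does not own, so no event is processed twice across the switch.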

Re: kafka direct streaming with checkpointing

2015-09-25 Thread Adrian Tanase
e.org<mailto:user@spark.apache.org>" Subject: Re: kafka direct streaming with checkpointing Would changing the direct stream API to support committing the offsets to Kafka's ZK (like a regular consumer) as a fallback mechanism, in case recovering from checkpoint fails, be an accepted soluti

kafka direct streaming with checkpointing

2015-09-24 Thread Radu Brumariu
Hi, in my application I use Kafka direct streaming and I have also enabled checkpointing. This seems to work fine if the application is restarted. However, if I change the code and resubmit the application, it cannot start because the checkpointed data is of different class versions. Is there

Re: kafka direct streaming with checkpointing

2015-09-24 Thread Cody Koeninger
No, you can't use checkpointing across code changes. Either store offsets yourself, or start up your new app code and let it catch up before killing the old one. On Thu, Sep 24, 2015 at 8:40 AM, Radu Brumariu wrote: > Hi, > in my application I use Kafka direct streaming and I
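
As a rough illustration of "store offsets yourself": offsets are plain data (topic, partition, offset), so unlike a checkpoint they do not depend on class versions and survive a code change. The sketch below is plain Python and shows only the bookkeeping; `save_offsets`, `load_offsets`, and the JSON file layout are hypothetical and are not Spark or Kafka API. A real job would more likely write to a database or ZooKeeper, ideally in the same transaction as its results.

```python
import json
import os
import tempfile


def save_offsets(path, offsets):
    """Atomically persist {(topic, partition): next_offset} as JSON."""
    records = [
        {"topic": topic, "partition": partition, "offset": offset}
        for (topic, partition), offset in sorted(offsets.items())
    ]
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory, then rename over the
    # target, so a crash never leaves a half-written offsets file.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(records, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)


def load_offsets(path):
    """Return saved offsets, or an empty dict on first start."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        records = json.load(f)
    return {(r["topic"], r["partition"]): r["offset"] for r in records}
```

On startup the new code would call `load_offsets` and begin each partition from the stored position; after each batch completes it would call `save_offsets` with the end offsets of that batch.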

Re: kafka direct streaming with checkpointing

2015-09-24 Thread Radu Brumariu
It seems to me that the scenario I'm facing is quite common for Spark jobs using Kafka. Is there a ticket to add this sort of semantics to checkpointing? Does it even make sense to add it there? Thanks, Radu On Thursday, September 24, 2015, Cody Koeninger wrote: >

Re: kafka direct streaming with checkpointing

2015-09-24 Thread Cody Koeninger
This has been discussed numerous times; TD's response has consistently been that it's unlikely to be possible. On Thu, Sep 24, 2015 at 12:26 PM, Radu Brumariu wrote: > It seems to me that this scenario that I'm facing, is quite common for > spark jobs using Kafka. > Is there a

Re: kafka direct streaming with checkpointing

2015-09-24 Thread Radu Brumariu
Would changing the direct stream API to support committing the offsets to Kafka's ZK (like a regular consumer) as a fallback mechanism, in case recovering from the checkpoint fails, be an acceptable solution? On Thursday, September 24, 2015, Cody Koeninger wrote: > This has been
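
The fallback order proposed in this message (try the checkpoint first, use committed offsets only if recovery fails) can be sketched at application level. This is a plain-Python illustration of the control flow, not a description of anything the direct stream API does: `CheckpointIncompatible`, `resolve_start_offsets`, and the two callables are hypothetical names introduced for the example.

```python
class CheckpointIncompatible(Exception):
    """Hypothetical error: the checkpoint exists but cannot be restored,
    for example because the application classes changed."""


def resolve_start_offsets(recover_from_checkpoint, load_committed_offsets):
    """Pick starting offsets, preferring the checkpoint.

    Both arguments are zero-argument callables supplied by the caller.
    Returns a (source, offsets) pair so the caller can log which path
    was taken.
    """
    try:
        return "checkpoint", recover_from_checkpoint()
    except CheckpointIncompatible:
        # Fallback: offsets the application committed on its own.
        return "committed", load_committed_offsets()


def broken_checkpoint():
    raise CheckpointIncompatible("class versions differ")


source, offsets = resolve_start_offsets(
    broken_checkpoint,
    lambda: {("events", 0): 42},
)
print(source, offsets)
```

Note that any in-memory state held in the checkpoint is lost on the fallback path; only the read position is recovered.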