Got it, thank you!
On Mon, Sep 28, 2015 at 11:37 AM, Cody Koeninger <c...@koeninger.org> wrote:

> Losing worker nodes without stopping is definitely possible. I haven't
> had much success adding workers to a running job, but I also haven't spent
> much time on it.
>
> If you're restarting with the same jar, you should be able to recover from
> checkpoint without losing data (usual caveats apply, e.g. you need enough
> kafka retention). Make sure to test it though, as the code paths taken
> during recovery from checkpoint are not the same as on initial startup, and
> you can run into unexpected issues (e.g. authentication).
>
> On Mon, Sep 28, 2015 at 1:27 PM, Augustus Hong <augus...@branchmetrics.io> wrote:
>
>> Hey all,
>>
>> I'm evaluating using Spark Streaming with Kafka direct streaming, and I
>> have a couple of questions:
>>
>> 1. Would it be possible to add / remove worker nodes without stopping
>> and restarting the spark streaming driver?
>>
>> 2. I understand that we can enable checkpointing to recover from node
>> failures, and that it doesn't work across code changes. What about in the
>> event that worker nodes failed due to load -> we added more worker nodes ->
>> restart Spark Streaming? Would this incur data loss as well?
>>
>> Best,
>> Augustus
>>
>> --
>> *Augustus Hong*
>> Data Analytics | Branch Metrics
>> m 650-391-3369 | e augus...@branch.io

--
*Augustus Hong*
Data Analytics | Branch Metrics
m 650-391-3369 | e augus...@branch.io
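For anyone finding this thread later: the recovery pattern Cody describes is driven by `StreamingContext.getOrCreate`, which rebuilds the context from the checkpoint directory on restart instead of calling the setup function again. A minimal sketch, using the Spark 1.x Kafka direct-stream API; the checkpoint path, broker address, and topic name here are placeholders, not values from this thread:

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object CheckpointedDirectStream {
  // Assumed checkpoint location; must survive driver restarts (e.g. HDFS/S3).
  val checkpointDir = "hdfs:///tmp/streaming-checkpoint"

  // Called only when no checkpoint exists. All DStream setup must live here,
  // because on recovery this function is skipped entirely.
  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("kafka-direct-example")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint(checkpointDir)

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092") // assumed broker
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("events")) // assumed topic

    stream.map(_._2).count().print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // On restart (e.g. after replacing failed workers), getOrCreate reloads
    // the saved offsets from the checkpoint, so unprocessed Kafka ranges are
    // replayed rather than lost -- provided the jar is unchanged and Kafka
    // retention still covers those offsets, as noted above.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```

The key design point, and the one Cody warns about, is that the recovery path deserializes the old context rather than re-running `createContext`, so anything initialized outside it (credentials, broadcast state) takes a different code path on restart and should be tested explicitly.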