You also need to enable checkpointing and make sure the driver can recreate its StreamingContext from the checkpoint after a driver failure, as described in the docs here: http://spark.apache.org/docs/latest/streaming-programming-guide.html#failure-of-the-driver-node
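
For illustration, here is a minimal sketch of the checkpoint-and-recreate pattern from that guide, written against the PySpark DStream API. The checkpoint path, app name, batch interval, and socket source are placeholders I made up, not anything from this thread, so treat it as a starting point rather than a finished setup.

```python
# Hypothetical checkpoint location; in practice this should live on a
# fault-tolerant store (e.g. HDFS or S3) that a restarted driver can reach.
CHECKPOINT_DIR = "hdfs:///tmp/streaming-checkpoint"


def create_context():
    """Build a fresh StreamingContext. Used only when no checkpoint exists."""
    # Imported here so this module can be loaded without a Spark installation.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="ha-driver-sketch")  # hypothetical app name
    ssc = StreamingContext(sc, 10)  # 10-second batches, an arbitrary choice

    # All DStream setup has to happen inside this function so that it can be
    # restored from the checkpoint. The socket source is a placeholder.
    lines = ssc.socketTextStream("localhost", 9999)
    lines.count().pprint()

    ssc.checkpoint(CHECKPOINT_DIR)  # turn on checkpointing
    return ssc


def main():
    from pyspark.streaming import StreamingContext

    # Restore the context from the checkpoint if one exists; otherwise
    # call create_context() to build a new one.
    ssc = StreamingContext.getOrCreate(CHECKPOINT_DIR, create_context)
    ssc.start()
    ssc.awaitTermination()
```

Call main() from the script you hand to spark-submit. Note that this only covers recovery once the driver process is started again; something outside the application (a cluster manager or supervisor) still has to do the restart.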

From: Matt Narrell <[email protected]>
Date: Thursday, August 14, 2014 at 10:34 AM
To: Tobias Pfeiffer <[email protected]>
Cc: salemi <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: spark streaming : what is the best way to make a driver highly available

I'd suggest something like Apache YARN, or Apache Mesos with Marathon or something similar to allow for management, in particular restart on failure.

mn

On Aug 13, 2014, at 7:15 PM, Tobias Pfeiffer <[email protected]> wrote:

Hi,

On Thu, Aug 14, 2014 at 5:49 AM, salemi <[email protected]> wrote:
what is the best way to make a spark streaming driver highly available.

I would also be interested in that. In particular for Streaming applications where the Spark driver is running for a long time, this might be important, I think.

Thanks
Tobias
