I think the YARN ResourceManager has a mechanism to relaunch the driver on failure, but I am not certain. Could someone confirm this? Thanks.
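
For what it is worth, here is a minimal sketch of the two submission styles discussed in this thread, written as a small Python helper that only assembles the spark-submit argument list (it does not launch anything). The helper name `spark_submit_args`, the host `master-host`, and the jar `my-app.jar` are hypothetical. `--supervise` is the option quoted further down in this thread. `spark.yarn.maxAppAttempts` is a Spark-on-YARN property that caps how many times YARN will attempt the application; as far as I understand, in YARN cluster mode the driver runs inside the application master, so a new attempt also starts a new driver. The value cannot exceed the cluster-wide `yarn.resourcemanager.am.max-attempts` setting, so please verify this against your own cluster.

```python
def spark_submit_args(master, app, supervise=False, max_app_attempts=None):
    """Assemble a spark-submit argument list for cluster deploy mode.

    Hypothetical helper for illustration only: it builds the argument
    list and does not run spark-submit.
    """
    args = ["spark-submit", "--master", master, "--deploy-mode", "cluster"]

    if supervise:
        # --supervise applies to Spark standalone or Mesos in cluster
        # deploy mode only; it restarts the driver on failure.
        args.append("--supervise")

    if max_app_attempts is not None:
        # Assumption: on YARN, driver relaunch is governed by application
        # attempts. This value is bounded by the cluster-wide
        # yarn.resourcemanager.am.max-attempts setting.
        args += ["--conf", "spark.yarn.maxAppAttempts=%d" % max_app_attempts]

    args.append(app)
    return args


if __name__ == "__main__":
    # Standalone cluster with driver supervision (host name is made up).
    print(" ".join(spark_submit_args("spark://master-host:7077",
                                     "my-app.jar", supervise=True)))
    # YARN cluster mode with up to four application attempts.
    print(" ".join(spark_submit_args("yarn", "my-app.jar",
                                     max_app_attempts=4)))
```

Neither variant replaces checkpointing: a relaunched driver still needs the checkpoint data to recover its state.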

At 2015-08-19 16:37:32, "Spark Enthusiast" <sparkenthusi...@yahoo.in> wrote:

Thanks for the reply. Are Standalone and Mesos the only options? Is there a way to relaunch the driver automatically if it runs as a Hadoop YARN application?

On Wednesday, 19 August 2015 12:49 PM, Todd <bit1...@163.com> wrote:

There is a spark-submit option for this (Spark standalone or Mesos with cluster deploy mode only):

  --supervise    If given, restarts the driver on failure.

At 2015-08-19 14:55:39, "Spark Enthusiast" <sparkenthusi...@yahoo.in> wrote:

Folks,

As I see it, the driver program is a single point of failure. I have seen ways to make it recover from failures on a restart (using checkpointing), but I have not seen anything on how to restart it automatically if it crashes.

Will running the driver as a Hadoop YARN application do it? Can someone educate me as to how?