There is an option for spark-submit (Spark standalone or Mesos with cluster deploy mode only): --supervise. If given, it restarts the driver automatically on failure.
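A minimal sketch of such a submission, assuming a standalone cluster; the master URL, class name, and jar path below are placeholders, not details from this thread:

```shell
# Submit in cluster deploy mode with driver supervision.
# --supervise tells the standalone master to restart the driver
# process if it exits with a non-zero status.
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  --supervise \
  --class com.example.StreamingApp \
  /path/to/streaming-app.jar
```

Note that --supervise handles restarting the driver process; recovering the streaming state across that restart is what checkpointing is for, so you would typically use both together.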
At 2015-08-19 14:55:39, "Spark Enthusiast" <sparkenthusi...@yahoo.in> wrote:
Folks,
As I see it, the Driver program is a single point of failure. I have seen how to make it recover from failures on a restart (using checkpointing), but I have not seen anything about how to restart it automatically if it crashes. Would running the Driver as a Hadoop YARN application do it? Can someone educate me as to how?