When submitting to YARN, you can choose between two deployment modes for the driver with the "--master" parameter: yarn-client (the driver runs in your local submitting process) or yarn-cluster (the driver runs inside the cluster). For more information on submitting to YARN, see this page in the Spark docs: http://spark.apache.org/docs/latest/running-on-yarn.html
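As a sketch, a submission in yarn-cluster mode might look like the following (the class name, jar path, and application arguments are placeholders, not from the original thread; spark.yarn.maxAppAttempts is the Spark-side cap on driver restart attempts):

```shell
# Hypothetical example: submit a streaming app in yarn-cluster mode so the
# driver runs inside the YARN ApplicationMaster and is restarted on failure.
# Class name and jar path are placeholders.
spark-submit \
  --master yarn-cluster \
  --class com.example.StreamingApp \
  --conf spark.yarn.maxAppAttempts=4 \
  streaming-app.jar
```

The effective number of attempts is the lower of spark.yarn.maxAppAttempts and the ResourceManager's own yarn.resourcemanager.am.max-attempts limit.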
yarn-cluster mode runs the driver inside the ApplicationMaster, which YARN will retry on failure. The number of retries is governed by the yarn.resourcemanager.am.max-attempts configuration setting on the YARN ResourceManager.

Regards,
Will

On Wed, Aug 19, 2015 at 2:55 AM, Spark Enthusiast <sparkenthusi...@yahoo.in> wrote:
> Folks,
>
> As I see it, the Driver program is a single point of failure. I have seen
> ways to make it recover from failures on a restart (using checkpointing),
> but I have not seen anything about how to restart it automatically if it
> crashes.
>
> Will running the Driver as a Hadoop YARN application do it? Can someone
> educate me on how?
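For reference, the ResourceManager-side limit mentioned above is set in yarn-site.xml; this is a minimal sketch (the value 4 is just an illustrative choice, and the default in stock Hadoop is 2):

```xml
<!-- yarn-site.xml: cap on ApplicationMaster (and thus driver) restart attempts -->
<property>
  <name>yarn.resourcemanager.am.max-attempts</name>
  <value>4</value>
</property>
```

Note this is a cluster-wide ceiling: individual applications can request fewer attempts, but not more.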