On 1 Oct 2015, at 16:52, Adrian Tanase <atan...@adobe.com> wrote:

This happens automatically as long as you submit in cluster mode rather than 
client mode (e.g. ./spark-submit --master yarn-cluster …).

The property you mention helps on top of that, although you will need to set 
it to a large value (e.g. 1000?), as there is no “infinite” setting.
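
For illustration, here's a sketch of the same submission through Spark's 
programmatic launcher (org.apache.spark.launcher.SparkLauncher); the jar path, 
main class, and attempt count are placeholders, and spark.yarn.maxAppAttempts 
should be no larger than the cluster-wide yarn.resourcemanager.am.max-attempts:

  import org.apache.spark.launcher.SparkLauncher

  // Programmatic equivalent of the spark-submit line above; the jar and
  // main class are placeholders for your streaming app.
  val app = new SparkLauncher()
    .setAppResource("/path/to/streaming-app.jar")  // placeholder
    .setMainClass("com.example.StreamingApp")      // placeholder
    .setMaster("yarn-cluster")                     // cluster mode, so YARN can restart the AM
    .setConf("spark.yarn.maxAppAttempts", "4")     // capped by yarn.resourcemanager.am.max-attempts
    .launch()
  app.waitFor()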


That doesn't catch very broken apps, though: a fixed attempt count can't tell 
occasional failures spread over weeks of streaming apart from a tight crash loop.

There is a way, during app submission, for the application launcher to specify 
a reset window: a time after which the failure count is reset.

It's launcher-API only, and Spark doesn't (currently) set it:

https://issues.apache.org/jira/browse/YARN-611
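
For illustration, a sketch of setting it directly through the YARN client API 
(Hadoop 2.6+ only; the attempt count and one-hour window are made up):

  import org.apache.hadoop.yarn.client.api.YarnClient
  import org.apache.hadoop.yarn.conf.YarnConfiguration

  val yarnClient = YarnClient.createYarnClient()
  yarnClient.init(new YarnConfiguration())
  yarnClient.start()

  val appContext = yarnClient.createApplication().getApplicationSubmissionContext
  appContext.setMaxAppAttempts(4)
  // YARN-611: attempts older than this window stop counting, so a long-running
  // streaming job is only killed after 4 failures within any one hour.
  appContext.setAttemptFailuresValidityInterval(60 * 60 * 1000L) // ms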


It could be done in a Hadoop-version-neutral way using introspection; otherwise 
you'll have to patch the source and end up with a build of Spark that only 
builds/runs against Hadoop 2.6.
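
A rough sketch of that introspection route (the helper name and its fallback 
behavior are my own; the setter is the one YARN-611 added):

  import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext

  // Look the YARN-611 setter up reflectively, so the same binary runs against
  // Hadoop < 2.6 (where the method is missing) as well as 2.6+.
  def setFailureValidityWindow(ctx: ApplicationSubmissionContext, windowMs: Long): Boolean =
    try {
      val m = ctx.getClass.getMethod("setAttemptFailuresValidityInterval", classOf[Long])
      m.invoke(ctx, java.lang.Long.valueOf(windowMs))
      true
    } catch {
      case _: NoSuchMethodException =>
        false // pre-2.6 cluster: fall back to a large max-attempts value
    }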


-adrian

From: Jeetendra Gangele
Date: Thursday, October 1, 2015 at 4:30 PM
To: user
Subject: automatic start of streaming job on failure on YARN


We have a streaming application running on YARN, and we would like to ensure 
that it is up and running 24/7.

Is there a way to tell YARN to automatically restart a specific application on 
failure?

There is a property, yarn.resourcemanager.am.max-attempts, whose default is 2. 
Is setting it to a bigger value the solution? Also, I observed that this does 
not seem to work: my application is failing and not restarting automatically.

Mesos has this support built in; I wonder why YARN is lacking here.



Regards

jeetendra
