[ 
https://issues.apache.org/jira/browse/SPARK-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174999#comment-14174999
 ] 

Thomas Graves commented on SPARK-3877:
--------------------------------------

[~vanzin]  I agree. The user code should be exiting with non-zero or throwing 
on failure.  If it isn't, then there is nothing we can do about it, other 
than tell users to change their code to properly exit if they want to see 
failure status. Perhaps we should better document what they should do on 
failure too.  It's basically the same thing I did for the exit codes in 
ApplicationMaster. It relies on user code exiting non-zero or throwing.
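
As a rough illustration of that contract, here is a minimal sketch of a driver 
entry point that turns any failure into a non-zero exit status. It is written 
in Python only for brevity; the same pattern applies to a Scala or Java 
driver. The names run_job and main and the --fail flag are hypothetical and 
stand in for the real application code.

```python
import sys
import traceback


def run_job(args):
    """Hypothetical placeholder for the real Spark work.

    It raises on failure rather than logging the error and returning normally.
    """
    if "--fail" in args:
        raise RuntimeError("simulated job failure")


def main(args):
    """Return the process exit status: 0 on success, 1 on any failure."""
    try:
        run_job(args)
    except Exception:
        # Report the error, then signal failure through the exit status
        # instead of swallowing the exception.
        traceback.print_exc(file=sys.stderr)
        return 1
    return 0

# In the driver's entry point, pass the status to the process exit:
#     sys.exit(main(sys.argv[1:]))
```

The point is only that the failure has to leave the user code, either as an 
uncaught exception or as a non-zero exit status. Code that catches the error 
and returns normally looks like a success from the outside.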

The only other option would be for us to actually look at the details in the 
scheduler ourselves to try to determine what happened, i.e. we see that 
Stage X failed or Y tasks failed, etc.  I would say we do that later if it's 
needed. 



> The exit code of spark-submit is still 0 when an yarn application fails
> -----------------------------------------------------------------------
>
>                 Key: SPARK-3877
>                 URL: https://issues.apache.org/jira/browse/SPARK-3877
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>            Reporter: Shixiong Zhu
>            Priority: Minor
>              Labels: yarn
>
> When a YARN application fails (yarn-cluster mode), the exit code of 
> spark-submit is still 0. This makes it hard to write automated scripts 
> that run Spark jobs on YARN, because the scripts cannot detect the 
> failure.



