[ https://issues.apache.org/jira/browse/SPARK-20658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001628#comment-16001628 ]

Paul Jones commented on SPARK-20658:
------------------------------------

Ah... This is using Amazon's version of Hadoop 2.7.3:

{noformat}
$ hadoop version
Hadoop 2.7.3-amzn-1
Subversion g...@aws157git.com:/pkg/Aws157BigTop -r 30eccced8ce8c483445f0aa3175ce725831ff06b
Compiled by ec2-user on 2017-02-17T17:59Z
Compiled with protoc 2.5.0
From source with checksum 1833aada17b94cfb94ad40ccd02d3df8
This command was run using /usr/lib/hadoop/hadoop-common-2.7.3-amzn-1.jar
{noformat}

> spark.yarn.am.attemptFailuresValidityInterval doesn't seem to have an effect
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-20658
>                 URL: https://issues.apache.org/jira/browse/SPARK-20658
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 2.1.0
>            Reporter: Paul Jones
>            Priority: Minor
>
> I'm running a job in YARN cluster mode using 
> `spark.yarn.am.attemptFailuresValidityInterval=1h`, specified in both 
> spark-defaults.conf and in my spark-submit command. (The flag shows up in the 
> environment tab of the Spark history server, so it seems to be specified 
> correctly.) 
> However, I just had a job die with four AM failures (three of the four 
> failures were over an hour apart), so I'm confused about what could be going 
> on. I haven't figured out the cause of the individual failures, so is it 
> possible that certain types of failures always count? E.g., do jobs that are 
> killed due to memory issues always count? 
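
For reference, a minimal sketch of how the interval is typically set, assuming YARN cluster mode as in the description above; the config file path, main class, and jar path are placeholders:

{noformat}
# spark-defaults.conf (path depends on the install, e.g. /etc/spark/conf/)
spark.yarn.am.attemptFailuresValidityInterval  1h

# or on the spark-submit command line; --conf takes precedence over spark-defaults.conf
$ spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --conf spark.yarn.am.attemptFailuresValidityInterval=1h \
    --class com.example.MyJob \
    /path/to/my-job.jar
{noformat}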


