[ 
https://issues.apache.org/jira/browse/YARN-10871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph reassigned YARN-10871:
------------------------------------

    Assignee: Srinivas S T  (was: Prabhu Joseph)

> Aborted AM is considered as App Failure when user sets MaxAttempts as 1
> -----------------------------------------------------------------------
>
>                 Key: YARN-10871
>                 URL: https://issues.apache.org/jira/browse/YARN-10871
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: RM
>    Affects Versions: 3.3.1
>            Reporter: Prabhu Joseph
>            Assignee: Srinivas S T
>            Priority: Major
>
> When an AM container is ABORTED due to node decommission, the AppAttempt 
> failure is not counted. But if the user sets the number of attempts to 1, 
> YARN counts the ABORTED AM as a failure. 
> {code}
>       int numberOfFailure = app.getNumFailedAppAttempts();
>       if (app.maxAppAttempts == 1) {
>         // If the user explicitly set the attempts to 1 then there are likely
>         // correctness issues if the AM restarts for any reason.
>         LOG.info("Max app attempts is 1 for " + app.applicationId
>             + ", preventing further attempts.");
>         numberOfFailure = app.maxAppAttempts;
>       } 
> {code}
> Livy sets the number of attempts to 1 since its RPC server does not yet 
> support multiple connections for the same registered app. But in our case the 
> AM was ABORTED before it even started (the AM container was in ACQUIRED state).
> Usually users won't decommission a node where the container is in RUNNING 
> state (where the session is established). But decommission can happen on 
> nodes where the container is in ACQUIRED or ALLOCATED state. 
> I suggest exposing a config that lets the user decide whether to consider 
> this a failure or not. 
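
To make the suggestion concrete, here is a minimal standalone sketch of how such a switch could gate the special case quoted above. It is not the actual RMAppImpl code and not a committed patch: the class, the method names, and the property name are all hypothetical, and it assumes the RM starts a new attempt only while the counted failures are below the maximum attempts.

```java
// Hypothetical sketch only: models the decision in plain Java, outside YARN.
class AbortedAmAttemptSketch {

    // Hypothetical property name, invented for illustration; it does not
    // exist in YARN.
    static final String COUNT_ABORTED_AM_AS_FAILURE =
        "yarn.resourcemanager.am.count-aborted-as-failure-when-max-attempts-one";

    /**
     * Returns the failure count used for the retry decision.
     *
     * @param numFailedAttempts     failures counted so far (ABORTED attempts
     *                              are assumed to be excluded already)
     * @param maxAppAttempts        maximum attempts requested by the user
     * @param lastAttemptAborted    true if the last AM attempt was ABORTED,
     *                              e.g. by node decommission
     * @param countAbortedAsFailure value of the hypothetical config switch
     */
    static int effectiveFailures(int numFailedAttempts, int maxAppAttempts,
            boolean lastAttemptAborted, boolean countAbortedAsFailure) {
        if (maxAppAttempts == 1) {
            if (lastAttemptAborted && !countAbortedAsFailure) {
                // Proposed behavior: an aborted AM does not use up the
                // single allowed attempt.
                return numFailedAttempts;
            }
            // Current behavior: any AM exit ends the application.
            return maxAppAttempts;
        }
        return numFailedAttempts;
    }

    /** Assumed retry rule: retry while counted failures are below the max. */
    static boolean shouldRetry(int numFailedAttempts, int maxAppAttempts,
            boolean lastAttemptAborted, boolean countAbortedAsFailure) {
        return effectiveFailures(numFailedAttempts, maxAppAttempts,
            lastAttemptAborted, countAbortedAsFailure) < maxAppAttempts;
    }

    public static void main(String[] args) {
        // Livy-style app (max attempts 1) whose AM was aborted before starting.
        System.out.println("switch on:  retry = "
            + shouldRetry(0, 1, true, true));
        System.out.println("switch off: retry = "
            + shouldRetry(0, 1, true, false));
    }
}
```

With the switch left on, the sketch keeps today's behavior, so existing users who rely on "1 means never restart" would see no change. A genuine AM failure with max attempts set to 1 still ends the application in both modes.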



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
