[ https://issues.apache.org/jira/browse/YARN-378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601625#comment-13601625 ]
Bikas Saha commented on YARN-378:
---------------------------------

+1 for Vinod's comments. Also, personally, I would break the following code down into two places. First, some init method that reads the global value from config, checks for errors, and sets a sensible default global value. Once that is done, use the appValue and globalValue to set the actual value. The current code is making me think more than I need to, IMO.

{code}
+    int numRMAMRetries = conf.getInt(YarnConfiguration.RM_AM_MAX_RETRIES,
+        YarnConfiguration.DEFAULT_RM_AM_MAX_RETRIES);
+    int numAPPAMRetries = submissionContext.getNumMaxRetries();
+    if (numAPPAMRetries <= 0) {
+      if (numRMAMRetries <= 0) {
+        // AM needs to try once at least
+        this.maxRetries = 1;
+        LOG.error("AM Retries is wrongly configured. The specific AM Retries: "
+            + numAPPAMRetries + " for application: "
+            + applicationId.getId() + ", the global AM Retries: "
+            + numRMAMRetries);
+      } else {
+        this.maxRetries = numRMAMRetries;
+      }
+    } else {
+      if (numAPPAMRetries <= numRMAMRetries) {
+        this.maxRetries = numAPPAMRetries;
+      } else {
+        this.maxRetries = numRMAMRetries;
+        LOG.warn("The specific AM Retries: " + numAPPAMRetries
+            + " for application: " + applicationId.getId()
+            + " is larger than the global AM Retries: " + numRMAMRetries
+            + ". Use the global AM Retries instead.");
+      }
+    }
{code}

Secondly, IMO the use of "Retry" in the name is confusing, since we need a minimum value of 1 for the first attempt, and the first attempt is not a retry. An alternative name could be maxAppAttempts. If we continue to use "retry" in the name, then its value should be 0 when the attempt is launched only once, since the number of retries is then 0.
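To make the suggested split concrete, here is a rough sketch of the two steps, not the actual patch. The class name, method names, and the fallback constant are all hypothetical, and the fallback of 1 is an assumption taken from the "AM needs to try once at least" comment in the quoted code. One side effect of validating the global value first: in the quoted code, a positive app value combined with a non-positive global value ends up assigning the non-positive global value, and that case goes away once the global value is sanitized up front.

```java
/**
 * Sketch only: the class, method, and constant names are hypothetical
 * and are not taken from the YARN-378 patch.
 */
final class AmAttemptLimits {

    // Assumed fallback: an AM has to be attempted at least once.
    static final int FALLBACK_GLOBAL_MAX = 1;

    // Step 1 (init time): validate the global value read from config.
    static int sanitizeGlobalMax(int configuredGlobal) {
        if (configuredGlobal <= 0) {
            System.err.println("Invalid global AM limit " + configuredGlobal
                + "; falling back to " + FALLBACK_GLOBAL_MAX);
            return FALLBACK_GLOBAL_MAX;
        }
        return configuredGlobal;
    }

    // Step 2 (per application): combine the app value with the
    // already-validated global value.
    static int resolveMax(int appValue, int globalMax) {
        if (appValue <= 0) {
            // The application did not supply a usable value.
            return globalMax;
        }
        // The global value acts as an upper bound on the app value.
        return Math.min(appValue, globalMax);
    }

    public static void main(String[] args) {
        System.out.println(resolveMax(5, sanitizeGlobalMax(0))); // prints 1
        System.out.println(resolveMax(3, sanitizeGlobalMax(4))); // prints 3
    }
}
```

With this shape, the per-application step no longer has to reason about an invalid global value at all, which is the part of the current code that takes the most effort to read.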
> ApplicationMaster retry times should be set by Client
> -----------------------------------------------------
>
>          Key: YARN-378
>          URL: https://issues.apache.org/jira/browse/YARN-378
>      Project: Hadoop YARN
>   Issue Type: Sub-task
>   Components: client, resourcemanager
>  Environment: suse
>     Reporter: xieguiming
>     Assignee: Zhijie Shen
>       Labels: usability
>  Attachments: YARN-378_1.patch, YARN-378_2.patch, YARN-378_3.patch, YARN-378_4.patch, YARN-378_5.patch, YARN-378_6.patch, YARN-378_6.patch
>
>
> We should support different clients or users having different ApplicationMaster retry times. This also means that "yarn.resourcemanager.am.max-retries" should be settable by the client.