[ https://issues.apache.org/jira/browse/SLIDER-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15022511#comment-15022511 ]

ASF subversion and git services commented on SLIDER-930:
--------------------------------------------------------

Commit 1a3fb79d5a78ad772f11b4a121b66c22ab1ba855 in incubator-slider's branch refs/heads/feature/SLIDER-82-pass-3.1 from [~gsaha]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-slider.git;h=1a3fb79 ]

SLIDER-930 Incorporate Yarn feature of resetting AM failure count into Slider AM (Sherry Guo via gourksaha)

> Incorporate Yarn feature of resetting AM failure count into Slider AM
> ---------------------------------------------------------------------
>
>                 Key: SLIDER-930
>                 URL: https://issues.apache.org/jira/browse/SLIDER-930
>             Project: Slider
>          Issue Type: Bug
>          Components: appmaster
>   Affects Versions: Slider 0.80
>            Reporter: Gour Saha
>            Assignee: Sherry Guo
>             Fix For: Slider 0.90
>
>         Attachments: SLIDER-930-001.patch, SLIDER-930-002.patch
>
>
> YARN-611 provides this feature. Currently Slider apps are bound by the number
> set for yarn.resourcemanager.am.max-retries in the cluster. By default this
> value is set to 2, which is very low for long running services.
> Slider AM should use the feature provided in YARN-611 and set an interval
> after which the failure count will be reset to 0.
> I believe the API to call on ApplicationSubmissionContext is
> attemptFailuresValidityInterval. To start with Slider can set it to 5 mins
> which should be a reasonable default.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
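
For readers following the issue description: the setter on ApplicationSubmissionContext is setAttemptFailuresValidityInterval(long), which takes milliseconds. The sketch below is not Slider or YARN code. It is a small, self-contained model of the behaviour the description asks for, under the assumption that failures older than the validity interval simply stop counting against the attempt limit and that a non-positive interval disables the feature. The class name AttemptFailureWindow and its methods are hypothetical, chosen only for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Illustrative model only (hypothetical class, not part of YARN or Slider).
 * Tracks AM attempt failures and ignores those older than the validity interval.
 */
final class AttemptFailureWindow {
    private final int maxAttempts;
    private final long validityIntervalMs;
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    /**
     * @param maxAttempts        attempt limit, e.g. 2 as in the issue description
     * @param validityIntervalMs window length in ms; a value <= 0 is assumed to
     *                           disable the window, so every failure counts forever
     */
    AttemptFailureWindow(int maxAttempts, long validityIntervalMs) {
        this.maxAttempts = maxAttempts;
        this.validityIntervalMs = validityIntervalMs;
    }

    /** Number of failures that still count at time nowMs. */
    int countedFailures(long nowMs) {
        if (validityIntervalMs > 0) {
            // Drop failures that fell out of the validity window.
            while (!failureTimes.isEmpty()
                    && nowMs - failureTimes.peekFirst() > validityIntervalMs) {
                failureTimes.removeFirst();
            }
        }
        return failureTimes.size();
    }

    /** Records a failure at nowMs and reports whether another attempt is allowed. */
    boolean recordFailure(long nowMs) {
        failureTimes.addLast(nowMs);
        return countedFailures(nowMs) < maxAttempts;
    }
}
```

With the 5 minute default suggested above (300000 ms) and a limit of 2, two failures one minute apart exhaust the attempts in this model, while two failures more than five minutes apart do not, because the first has expired by the time the second is recorded.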