[ https://issues.apache.org/jira/browse/YARN-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175814#comment-14175814 ]
Zhijie Shen commented on YARN-2673:
-----------------------------------

1. Remove RESTful to be neutral in case the channel is changed?
{code}
+  /** Timeline client RESTful call, max retries (-1 means no limit) */
{code}

2. Add the configs to yarn-default.xml as well

3. It's not necessary any more, because the vars are only set once in the constructor. In testCheckRetryCount, you can compose config to set the values you want to use.
{code}
  @Private
  @VisibleForTesting
  public synchronized void changeRetrySettings(int maxRetries, long interval) {
    this.maxRetries = maxRetries;
    this.retryInterval = interval;
  }
{code}
{code}
    // synchronously get a snapshot of current retry settings
    int leftRetries = 0;
    long sleepMs;
    retried = false;
    synchronized (this) {
      leftRetries = maxRetries;
      sleepMs = retryInterval;
    }
{code}

4. In testCheckRetryCount, response is not used.

BTW, YARN-2676 is going to change the TimelineClient code. This patch is subject to rebase.

> Add retry for timeline client put APIs
> --------------------------------------
>
>                 Key: YARN-2673
>                 URL: https://issues.apache.org/jira/browse/YARN-2673
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Li Lu
>            Assignee: Li Lu
>         Attachments: YARN-2673-101414-1.patch, YARN-2673-101414-2.patch,
> YARN-2673-101414.patch, YARN-2673-101714.patch
>
>
> Timeline client now does not handle the case gracefully when the server is
> down. Jobs from distributed shell may fail due to ATS restart. We may need to
> add some retry mechanisms to the client.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
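
To illustrate point 3 of the review, here is a minimal sketch of a retry loop whose settings are fixed at construction time. It is not the code from the patch: the names RetryingPutter, Op and runWithRetries are hypothetical, and the way the values would be read from the YARN configuration is left out. The only things taken from the comment are the two settings (max retries, with -1 meaning no limit, and a retry interval) and the observation that, once they are assigned only in the constructor, neither a synchronized setter nor a synchronized snapshot is needed.

```java
import java.io.IOException;

/** Hypothetical illustration only; not the actual TimelineClient implementation. */
final class RetryingPutter {

  /** An operation that may fail with an IOException, e.g. a put to the server. */
  interface Op<T> {
    T run() throws IOException;
  }

  // Assigned once in the constructor and never changed afterwards,
  // so they can be read without any locking.
  private final int maxRetries;       // -1 means no limit
  private final long retryIntervalMs; // pause between attempts

  RetryingPutter(int maxRetries, long retryIntervalMs) {
    this.maxRetries = maxRetries;
    this.retryIntervalMs = retryIntervalMs;
  }

  /** Runs op, retrying on IOException until it succeeds or retries run out. */
  <T> T runWithRetries(Op<T> op) throws IOException {
    int leftRetries = maxRetries;
    while (true) {
      try {
        return op.run();
      } catch (IOException e) {
        if (leftRetries == 0) {
          throw e; // retries exhausted, surface the last failure
        }
        if (leftRetries > 0) {
          leftRetries--; // a negative value is never decremented: retry forever
        }
      }
      try {
        Thread.sleep(retryIntervalMs);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        throw new IOException("Interrupted while waiting to retry", ie);
      }
    }
  }
}
```

With this shape, a test such as testCheckRetryCount would build the object with the retry count and interval it wants (in the real client, by composing a configuration) and count how many times the operation is invoked, rather than mutating the settings after construction.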