[ https://issues.apache.org/jira/browse/YARN-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15856951#comment-15856951 ]
ASF GitHub Bot commented on YARN-6061:
--------------------------------------

Github user kambatla commented on a diff in the pull request:

    https://github.com/apache/hadoop/pull/182#discussion_r99949838

    --- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java ---
    @@ -819,19 +824,39 @@ public void handle(RMFatalEvent event) {
         }
       }

    -  public void handleTransitionToStandBy() {
    -    if (rmContext.isHAEnabled()) {
    -      try {
    -        // Transition to standby and reinit active services
    -        LOG.info("Transitioning RM to Standby mode");
    -        transitionToStandby(true);
    -        EmbeddedElector elector = rmContext.getLeaderElectorService();
    -        if (elector != null) {
    -          elector.rejoinElection();
    +  /**
    +   * Transition to standby in a new thread.
    +   */
    +  public void handleTransitionToStandByInNewThread() {
    +    Thread standByTransitionThread =
    +        new Thread(activeServices.standByTransitionRunnable);
    +    standByTransitionThread.setName("StandByTransitionThread");
    +    standByTransitionThread.start();
    +  }
    +
    +  private class StandByTransitionRunnable implements Runnable {
    +    private AtomicBoolean hasRun = new AtomicBoolean(false);
    --- End diff --

    Maybe, rename this to hasAlreadyRun? And, again add some javadoc here too?


> Add a customized uncaughtexceptionhandler for critical threads in RM
> --------------------------------------------------------------------
>
>                 Key: YARN-6061
>                 URL: https://issues.apache.org/jira/browse/YARN-6061
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>         Attachments: YARN-6061.001.patch, YARN-6061.002.patch,
> YARN-6061.003.patch, YARN-6061.004.patch, YARN-6061.005.patch,
> YARN-6061.006.patch, YARN-6061.007.patch
>
>
> There are several threads in fair scheduler. The thread will quit when there
> is a runtime exception inside it. We should bring down the RM when that happens.
> Otherwise, there may be some weird behavior in RM.
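
For context on the review comment about the AtomicBoolean field: the following is a minimal, self-contained sketch of a run-once guard of the kind the diff appears to introduce, using the suggested name hasAlreadyRun and a javadoc comment. It is not the code from the pull request. The class names StandbyTransitionSketch and RunOnceRunnable, the helper runConcurrently, and the counter standing in for the real standby transition are all hypothetical, chosen only for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class StandbyTransitionSketch {

  /**
   * Runs the wrapped action at most once, even if several threads call
   * run() concurrently. Hypothetical stand-in for the
   * StandByTransitionRunnable shown in the diff.
   */
  static final class RunOnceRunnable implements Runnable {
    /** Set to true by the first caller of run(); later callers return early. */
    private final AtomicBoolean hasAlreadyRun = new AtomicBoolean(false);
    private final Runnable action;

    RunOnceRunnable(Runnable action) {
      this.action = action;
    }

    @Override
    public void run() {
      // getAndSet returns the previous value, so exactly one caller sees false.
      if (hasAlreadyRun.getAndSet(true)) {
        return;
      }
      action.run();
    }
  }

  /**
   * Starts the same RunOnceRunnable in several named threads, waits for
   * them to finish, and returns how many times the action actually ran.
   */
  static int runConcurrently(int threads) {
    AtomicInteger transitions = new AtomicInteger();
    // The counter increment stands in for the real standby transition.
    Runnable once = new RunOnceRunnable(transitions::incrementAndGet);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(once);
      workers[i].setName("StandByTransitionThread-" + i);
      workers[i].start();
    }
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        // Restore the interrupt flag and stop waiting on this worker.
        Thread.currentThread().interrupt();
      }
    }
    return transitions.get();
  }

  public static void main(String[] args) {
    System.out.println("action ran " + runConcurrently(4) + " time(s)");
  }
}
```

The sketch uses getAndSet rather than a separate get followed by set, so the check and the update happen as one atomic step and two threads cannot both pass the guard.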