[ https://issues.apache.org/jira/browse/YARN-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15388710#comment-15388710 ]
Jun Gong commented on YARN-5333:
--------------------------------

Thanks [~sunilg]. Yes, fail-fast seems better.

{quote}
However one more thing worries me. With this patch, reinitialize queue is done before starting the active services. Still I suggest to confirm once, I will also verify and will update if I find some leads.
{quote}

Thanks for it! I will check it too. If it is OK and there are no more comments, I will update the patch to address the 'fail-fast' problem.

> Some recovered apps are put into default queue when RM HA
> ---------------------------------------------------------
>
>                 Key: YARN-5333
>                 URL: https://issues.apache.org/jira/browse/YARN-5333
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jun Gong
>            Assignee: Jun Gong
>         Attachments: YARN-5333.01.patch, YARN-5333.02.patch, YARN-5333.03.patch
>
>
> Enable RM HA and use FairScheduler, with
> {{yarn.scheduler.fair.allow-undeclared-pools}} set to false and
> {{yarn.scheduler.fair.user-as-default-queue}} set to false.
> Steps to reproduce:
> 1. Start two RMs.
> 2. After the RMs are running, change {{etc/hadoop/fair-scheduler.xml}} on
> both RMs to add some queues.
> 3. Submit some apps to the newly added queues.
> 4. Stop the active RM; the standby RM will then transition to active and
> recover the apps.
> However, the new active RM will put the recovered apps into the default queue
> because it might not have loaded the new {{fair-scheduler.xml}}. We need to
> call {{initScheduler}} before starting active services, or move
> {{refreshAll()}} ahead of {{rm.transitionToActive()}}. *This seems important
> for other schedulers as well*.
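
For reference, the setup described in the issue could look roughly like the following {{yarn-site.xml}} fragment. This is a minimal sketch, not the reporter's actual configuration: the two {{yarn.scheduler.fair.*}} properties and their values come from the issue description, while the HA switch and the scheduler class property are assumed to be set in the usual way, and the remaining HA settings (RM ids, addresses, ZooKeeper) are omitted.

```xml
<!-- yarn-site.xml (fragment): sketch of the setup described in the issue. -->
<configuration>
  <!-- Assumption: RM HA enabled through the standard switch; other HA settings omitted. -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Assumption: FairScheduler selected through the scheduler class property. -->
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <!-- From the issue: queues missing from fair-scheduler.xml are not created on demand. -->
  <property>
    <name>yarn.scheduler.fair.allow-undeclared-pools</name>
    <value>false</value>
  </property>
  <!-- From the issue: apps are not placed into a per-user queue by default. -->
  <property>
    <name>yarn.scheduler.fair.user-as-default-queue</name>
    <value>false</value>
  </property>
</configuration>
```

With both properties set to false, an app whose queue is absent from the allocation file that the RM currently has loaded cannot get a queue created for it, which is consistent with the recovered apps landing in the default queue as reported.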