[jira] [Commented] (YARN-3764) CapacityScheduler should properly handle moving LeafQueue from one parent to another
[ https://issues.apache.org/jira/browse/YARN-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571479#comment-14571479 ]

Vinod Kumar Vavilapalli commented on YARN-3764:
-----------------------------------------------

bq. A short-term fix is to not allow removing a queue under a parentQueue.

We never supported removing queues. So this is not just a short-term fix, this is the right fix for now.

> CapacityScheduler should properly handle moving LeafQueue from one parent to another
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-3764
>                 URL: https://issues.apache.org/jira/browse/YARN-3764
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Blocker
>
> Currently CapacityScheduler doesn't handle this case well. For example,
> given this queue structure:
> {code}
>           root
>            |
>         a (100)
>         /     \
>        x       y
>      (50)    (50)
> {code}
> And reinitializing with the following structure:
> {code}
>           root
>          /    \
>    (50) a      x (50)
>         |
>         y
>       (100)
> {code}
> The actual queue structure after reinitialize is:
> {code}
>           root
>          /    \
>    a (50)      x (50)
>     /  \
>    x    y
>  (50) (100)
> {code}
> We should handle this case better.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-3764) CapacityScheduler should properly handle moving LeafQueue from one parent to another
[ https://issues.apache.org/jira/browse/YARN-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571477#comment-14571477 ]

Wangda Tan commented on YARN-3764:
----------------------------------

The following test case reproduces this issue:

{code}
@Test
public void testQueueParsingWithMoveQueue() throws IOException {
  YarnConfiguration conf = new YarnConfiguration();
  CapacitySchedulerConfiguration csConf =
      new CapacitySchedulerConfiguration(conf);

  // Initial hierarchy: root -> a (100) -> x (50), y (50)
  csConf.setQueues("root", new String[] { "a" });
  csConf.setQueues("root.a", new String[] { "x", "y" });
  csConf.setCapacity("root.a", 100);
  csConf.setCapacity("root.a.x", 50);
  csConf.setCapacity("root.a.y", 50);

  CapacityScheduler capacityScheduler = new CapacityScheduler();
  RMContextImpl rmContext =
      new RMContextImpl(null, null, null, null, null, null,
          new RMContainerTokenSecretManager(csConf),
          new NMTokenSecretManagerInRM(csConf),
          new ClientToAMTokenSecretManagerInRM(), null);
  // nodeLabelManager is not declared here; it is expected to be a field
  // of the enclosing test class.
  rmContext.setNodeLabelManager(nodeLabelManager);
  capacityScheduler.setConf(csConf);
  capacityScheduler.setRMContext(rmContext);
  capacityScheduler.init(csConf);
  capacityScheduler.start();

  // New hierarchy: root -> a (50) -> y (100), and root -> x (50)
  csConf.setQueues("root", new String[] { "a", "x" });
  csConf.setQueues("root.a", new String[] { "y" });
  csConf.setCapacity("root.x", 50);
  csConf.setCapacity("root.a", 50);
  csConf.setCapacity("root.a.y", 100);
  capacityScheduler.reinitialize(csConf, rmContext);

  // After the move, "a" should have y as its only child.
  Assert.assertEquals(1,
      ((ParentQueue) capacityScheduler.getQueue("a"))
          .getChildQueues().size());
}
{code}
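To illustrate how a parent can end up with a stale child after reinitialization, here is a toy model. It is not the real ParentQueue code: the class `ToyParentQueue` and its merge-only `reinitialize` are hypothetical, and the sketch simply assumes that reinitialization updates or adds children but never removes one that is missing from the new configuration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: a simplified stand-in, not the real ParentQueue class.
final class ToyParentQueue {
  final String name;
  // Child queue name -> configured capacity.
  final Map<String, Integer> children = new LinkedHashMap<>();

  ToyParentQueue(String name) {
    this.name = name;
  }

  /**
   * Assumed behaviour: known children are updated and new children are
   * added, but children absent from newChildren are never removed.
   */
  void reinitialize(Map<String, Integer> newChildren) {
    for (Map.Entry<String, Integer> e : newChildren.entrySet()) {
      children.put(e.getKey(), e.getValue());
    }
  }

  public static void main(String[] args) {
    // Old hierarchy: a -> x (50), y (50)
    ToyParentQueue a = new ToyParentQueue("a");
    a.children.put("x", 50);
    a.children.put("y", 50);

    // New configuration for a: only y (100); x has moved under root.
    Map<String, Integer> newChildrenOfA = new LinkedHashMap<>();
    newChildrenOfA.put("y", 100);
    a.reinitialize(newChildrenOfA);

    // x is still listed under a, with its old capacity.
    System.out.println(a.children); // prints {x=50, y=100}
  }
}
```

Under that assumption, "a" keeps x (50) next to y (100), which matches the structure reported in the issue description and is why the assertion on the child count fails.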
[jira] [Commented] (YARN-3764) CapacityScheduler should properly handle moving LeafQueue from one parent to another
[ https://issues.apache.org/jira/browse/YARN-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571473#comment-14571473 ]

Wangda Tan commented on YARN-3764:
----------------------------------

CS's reinitialize logic creates new queues, but it only copies their configuration properties to the old queues; the new queues are discarded after reinitialization.

A comprehensive fix is to copy the old queue's runtime information (runningApplications, etc.) to the new queue, and to discard the old queue after reinitialization.

A short-term fix is to not allow removing a queue under a parentQueue. In other words, CS will throw an exception if a LeafQueue is moved from one parent to another.

I prefer to do the comprehensive fix for 2.8.0, and the short-term fix for 2.7.1/2.6.1 (if required).

Thoughts?
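The short-term fix could be sketched roughly as below. This is only an illustration under stated assumptions, not a patch: the class `QueueMoveCheck`, the method `validateNoMoveOrRemoval`, and the representation of a hierarchy as a map from short queue name to full queue path are all hypothetical and do not come from the Hadoop code base.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Hadoop code base.
final class QueueMoveCheck {

  /**
   * Rejects a new hierarchy in which an existing queue is missing or sits
   * under a different parent. Both maps go from short queue name to full
   * queue path, e.g. "x" -> "root.a.x".
   */
  static void validateNoMoveOrRemoval(Map<String, String> oldPaths,
      Map<String, String> newPaths) throws IOException {
    for (Map.Entry<String, String> e : oldPaths.entrySet()) {
      String name = e.getKey();
      String oldPath = e.getValue();
      String newPath = newPaths.get(name);
      if (newPath == null) {
        throw new IOException(
            "Queue " + name + " (" + oldPath + ") cannot be removed");
      }
      if (!newPath.equals(oldPath)) {
        throw new IOException("Queue " + name + " cannot be moved from "
            + oldPath + " to " + newPath);
      }
    }
  }

  public static void main(String[] args) {
    // Old hierarchy from the issue description.
    Map<String, String> oldPaths = new LinkedHashMap<>();
    oldPaths.put("root", "root");
    oldPaths.put("a", "root.a");
    oldPaths.put("x", "root.a.x");
    oldPaths.put("y", "root.a.y");

    // New hierarchy: x has moved from under a to directly under root.
    Map<String, String> newPaths = new LinkedHashMap<>();
    newPaths.put("root", "root");
    newPaths.put("a", "root.a");
    newPaths.put("x", "root.x");
    newPaths.put("y", "root.a.y");

    try {
      validateNoMoveOrRemoval(oldPaths, newPaths);
      System.out.println("accepted");
    } catch (IOException ex) {
      System.out.println("rejected: " + ex.getMessage());
    }
  }
}
```

Running the check before any state is changed means a rejected configuration would leave the existing queues untouched. New queues that appear only in the new hierarchy pass the check, since only existing queues are compared.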