[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17263018#comment-17263018 ]

Wangda Tan edited comment on YARN-10504 at 1/12/21, 1:58 AM:
-

Committed ver.010 to trunk, thanks to everybody who contributed code ([~bteke], [~zhuqi], [~gandras]) and reviewed the patch ([~sunilg], [~epayne])!

was (Author: wangda):
Committed to trunk, thanks to everybody who contributed code ([~bteke], [~zhuqi], [~gandras]) and reviewed the patch ([~sunilg], [~epayne])!

> Implement weight mode in Capacity Scheduler
> ---
>
> Key: YARN-10504
> URL: https://issues.apache.org/jira/browse/YARN-10504
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Benjamin Teke
> Assignee: Benjamin Teke
> Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10504.001.patch, YARN-10504.002.patch, YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, YARN-10504.006.patch, YARN-10504.007.patch, YARN-10504.008.patch, YARN-10504.009.patch, YARN-10504.010.patch, YARN-10504.011.patch, YARN-10504.ver-1.patch, YARN-10504.ver-2.patch, YARN-10504.ver-3.patch
>
> To allow the possibility to flexibly create queues in Capacity Scheduler a weight mode should be introduced. The existing {{capacity}} property should be used with a different syntax, i.e.:
> root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0
> root.users.capacity = 1.0w
> root.users.capacity = w:1.0
> Weight support should not impact the existing functionality.
>
> The new functionality should:
> * accept and validate the new weight values
> * enforce a singular mode on the whole queue tree
> * (re)calculate the relative (percentage-based) capacities based on the weights during launch and every time the queue structure changes

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
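The last requirement in the issue description (recalculating percentage-based capacities from weights) can be illustrated with a minimal sketch. It assumes the `1.0w` suffix variant from the syntax candidates listed in the description; the class `WeightModeSketch` and its methods are hypothetical helpers written for this illustration and are not part of the Capacity Scheduler code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not taken from the YARN-10504 patch. */
final class WeightModeSketch {

  /** Parses a weight written with the assumed "1.0w" suffix syntax. */
  static double parseWeight(String value) {
    String trimmed = value.trim();
    if (!trimmed.endsWith("w")) {
      throw new IllegalArgumentException("Not a weight: " + value);
    }
    double weight =
        Double.parseDouble(trimmed.substring(0, trimmed.length() - 1));
    if (weight < 0) {
      throw new IllegalArgumentException("Negative weight: " + value);
    }
    return weight;
  }

  /** Converts the weights of sibling queues into percentages of the parent. */
  static Map<String, Double> toPercentages(Map<String, Double> weights) {
    double total = 0;
    for (double w : weights.values()) {
      total += w;
    }
    Map<String, Double> result = new LinkedHashMap<>();
    for (Map.Entry<String, Double> e : weights.entrySet()) {
      // A sibling's share is its weight divided by the sum of all weights.
      result.put(e.getKey(), total == 0 ? 0.0 : 100.0 * e.getValue() / total);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Double> weights = new LinkedHashMap<>();
    weights.put("root.users", parseWeight("1.0w"));
    weights.put("root.batch", parseWeight("3.0w"));
    // prints {root.users=25.0, root.batch=75.0}
    System.out.println(toPercentages(weights));
  }
}
```

Because the percentages are derived from the sibling weights, adding or removing a sibling only requires calling `toPercentages` again, which matches the "every time the queue structure changes" requirement.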
[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262853#comment-17262853 ]

Wangda Tan edited comment on YARN-10504 at 1/11/21, 6:35 PM:
-

It looks like folks are generally OK with getting this patch in and dealing with further cleanup for mixed config mode in a follow-up Jira. LGTM. I plan to get it in by the end of today, my time.

was (Author: wangda):
It looks like folks are generally OK with getting this patch in and deal with further clean up for mixed config mode in a follow up Jira. +_I plan to get it in by today my time.

> Implement weight mode in Capacity Scheduler
> ---
>
> Key: YARN-10504
> URL: https://issues.apache.org/jira/browse/YARN-10504
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Benjamin Teke
> Assignee: Benjamin Teke
> Priority: Major
> Attachments: YARN-10504.001.patch, YARN-10504.002.patch, YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, YARN-10504.006.patch, YARN-10504.007.patch, YARN-10504.008.patch, YARN-10504.009.patch, YARN-10504.010.patch, YARN-10504.011.patch, YARN-10504.ver-1.patch, YARN-10504.ver-2.patch, YARN-10504.ver-3.patch
>
> To allow the possibility to flexibly create queues in Capacity Scheduler a weight mode should be introduced. The existing {{capacity}} property should be used with a different syntax, i.e.:
> root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0
> root.users.capacity = 1.0w
> root.users.capacity = w:1.0
> Weight support should not impact the existing functionality.
>
> The new functionality should:
> * accept and validate the new weight values
> * enforce a singular mode on the whole queue tree
> * (re)calculate the relative (percentage-based) capacities based on the weights during launch and every time the queue structure changes
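The "enforce a singular mode on the whole queue tree" requirement, and the mixed config mode cleanup mentioned in the comment, could look roughly like the sketch below. Everything here is an assumption made for illustration: the class `CapacityModeSketch`, the detection rules (a bracketed value such as `[memory=10240,vcores=12]` is treated as an absolute resource, a trailing `w` as a weight, anything else as a percentage) and the flat queue-to-value map are not taken from the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; not taken from the YARN-10504 patch. */
final class CapacityModeSketch {

  enum Mode { PERCENTAGE, ABSOLUTE_RESOURCE, WEIGHT }

  /** Guesses the mode from the textual form of a capacity value. */
  static Mode detect(String capacity) {
    String v = capacity.trim();
    if (v.startsWith("[") && v.endsWith("]")) {
      return Mode.ABSOLUTE_RESOURCE;
    }
    if (v.endsWith("w")) {
      return Mode.WEIGHT;
    }
    return Mode.PERCENTAGE;
  }

  /**
   * Returns the one mode shared by all queues, or throws if modes are mixed.
   * Returns null for an empty configuration.
   */
  static Mode requireSingleMode(Map<String, String> capacityByQueue) {
    Mode expected = null;
    for (Map.Entry<String, String> e : capacityByQueue.entrySet()) {
      Mode mode = detect(e.getValue());
      if (expected == null) {
        expected = mode;
      } else if (expected != mode) {
        throw new IllegalStateException("Queue " + e.getKey() + " uses "
            + mode + " but the rest of the tree uses " + expected);
      }
    }
    return expected;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("root.users", "1.0w");
    conf.put("root.batch", "3.0w");
    System.out.println(requireSingleMode(conf)); // prints WEIGHT
  }
}
```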
[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262590#comment-17262590 ] Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:33 AM: - [~zhuqi], Thanks for the detailed explanation on the root cause of the Absolute mode - auto creation test issue. I came to the same conclusion in my comment [above|https://issues.apache.org/jira/browse/YARN-10504?focusedCommentId=17261549=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17261549], so I agree with your suggestion. Will upload a patch containing it, if no objections. Thanks [~sunilg] for the review. I'll create the followup jira. was (Author: bteke): [~zhuqi], As for the root cause of the Absolute mode - auto creation test issue I came to the same conclusion in my comment [above|https://issues.apache.org/jira/browse/YARN-10504?focusedCommentId=17261549=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17261549], so I agree with your suggestion. Will upload a patch containing it, if no objections. Thanks [~sunilg] for the review. I'll create the followup jira. > Implement weight mode in Capacity Scheduler > --- > > Key: YARN-10504 > URL: https://issues.apache.org/jira/browse/YARN-10504 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Benjamin Teke >Assignee: Benjamin Teke >Priority: Major > Attachments: YARN-10504.001.patch, YARN-10504.002.patch, > YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, > YARN-10504.006.patch, YARN-10504.007.patch, YARN-10504.008.patch, > YARN-10504.009.patch, YARN-10504.010.patch, YARN-10504.ver-1.patch, > YARN-10504.ver-2.patch, YARN-10504.ver-3.patch > > > To allow the possibility to flexibly create queues in Capacity Scheduler a > weight mode should be introduced. 
> The existing {{capacity}} property should be used with a different syntax, i.e.:
> root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0
> root.users.capacity = 1.0w
> root.users.capacity = w:1.0
> Weight support should not impact the existing functionality.
>
> The new functionality should:
> * accept and validate the new weight values
> * enforce a singular mode on the whole queue tree
> * (re)calculate the relative (percentage-based) capacities based on the weights during launch and every time the queue structure changes
[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261549#comment-17261549 ]

Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:32 AM:
-

Sharing my latest findings on the TestAbsoluteResourceWithAutoQueue failure:

{{AutoCreatedLeafQueue#reinitializeFromTemplate}} was refactored, so getting and merging the QueueCapacities now happens *before* calling {{ParentQueue#updateClusterResource}} (and {{LeafQueue#updateClusterResource}}). {{LeafQueue#updateClusterResource}} calls {{AbstractCSQueue#updateEffectiveResources}}, where the effectiveMinResource of the created queue is overridden with the template's effectiveMinResources, which is exactly the value the test gets in its asserts.

{code:java}
void updateEffectiveResources(Resource clusterResource) {
  Set<String> configuredNodelabels =
      csContext.getConfiguration().getConfiguredNodeLabels(getQueuePath());
  for (String label : configuredNodelabels) {
    Resource resourceByLabel = labelManager.getResourceByLabel(label,
        clusterResource);

    Resource minResource = queueResourceQuotas.getConfiguredMinResource(
        label);

    // Update effective resource (min/max) to each child queue.
    if (getCapacityConfigType().equals(
        CapacityConfigType.ABSOLUTE_RESOURCE)) {
      queueResourceQuotas.setEffectiveMinResource(label,
          getMinResourceNormalized(queuePath,
              ((ParentQueue) parent).getEffectiveMinRatioPerResource(),
              minResource));
...{code}

was (Author: bteke):
Sharing my latest findings on TestAbsoluteResourceWithAutoQueue failure: {{AutoCreatedLeafQueue#reinitializeFromTemplate }}was refactored, now the getting and merging the QueueCapacities happens{{ *before* }}calling the{{ ParentQueue#updateClusterResource}} (and {{LeafQueue#updateClusterResource}}).
In {{LeafQueue#updateClusterResource}} the AbstractCSQueue#updateEffectiveResources is called where the effectiveMinResource of the created queue is overridden with the template's effectiveMinResources which is exactly the same the test is getting in the asserts.

{code:java}
void updateEffectiveResources(Resource clusterResource) {
  Set<String> configuredNodelabels =
      csContext.getConfiguration().getConfiguredNodeLabels(getQueuePath());
  for (String label : configuredNodelabels) {
    Resource resourceByLabel = labelManager.getResourceByLabel(label,
        clusterResource);

    Resource minResource = queueResourceQuotas.getConfiguredMinResource(
        label);

    // Update effective resource (min/max) to each child queue.
    if (getCapacityConfigType().equals(
        CapacityConfigType.ABSOLUTE_RESOURCE)) {
      queueResourceQuotas.setEffectiveMinResource(label,
          getMinResourceNormalized(queuePath,
              ((ParentQueue) parent).getEffectiveMinRatioPerResource(),
              minResource));
...{code}

> Implement weight mode in Capacity Scheduler
> ---
>
> Key: YARN-10504
> URL: https://issues.apache.org/jira/browse/YARN-10504
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Benjamin Teke
> Assignee: Benjamin Teke
> Priority: Major
> Attachments: YARN-10504.001.patch, YARN-10504.002.patch, YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, YARN-10504.006.patch, YARN-10504.007.patch, YARN-10504.008.patch, YARN-10504.009.patch, YARN-10504.010.patch, YARN-10504.ver-1.patch, YARN-10504.ver-2.patch, YARN-10504.ver-3.patch
>
> To allow the possibility to flexibly create queues in Capacity Scheduler a weight mode should be introduced. The existing {{capacity}} property should be used with a different syntax, i.e.:
> root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0
> root.users.capacity = 1.0w
> root.users.capacity = w:1.0
> Weight support should not impact the existing functionality.
>
> The new functionality should:
> * accept and validate the new weight values
> * enforce a singular mode on the whole queue tree
> * (re)calculate the relative (percentage-based) capacities based on the weights during launch and every time the queue structure changes
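The ordering effect described in the findings above can be reproduced with a toy model. This is only a sketch under simplifying assumptions: `EffectiveMinOrderingSketch` and `ToyQueue` are hypothetical, resources are reduced to memory in MB, and the normalization done by `getMinResourceNormalized` is approximated by multiplying the configured minimum by a single ratio.

```java
/** Hypothetical toy model; it only mimics the order of the two updates. */
final class EffectiveMinOrderingSketch {

  static final class ToyQueue {
    final long configuredMinMb; // absolute minimum configured via the template
    double absoluteCapacity;    // fraction of the cluster, 0.0 to 1.0
    long effectiveMinMb;

    ToyQueue(long configuredMinMb) {
      this.configuredMinMb = configuredMinMb;
    }

    /** First step: derive the effective minimum from the merged capacity. */
    void mergeCapacities(double templateAbsoluteCapacity, long clusterMb) {
      absoluteCapacity = templateAbsoluteCapacity;
      effectiveMinMb = Math.round(clusterMb * absoluteCapacity);
    }

    /**
     * Second step: in absolute-resource mode the effective minimum is
     * recomputed from the configured minimum, replacing the value above.
     */
    void updateEffectiveResources(double parentMinRatio) {
      effectiveMinMb = Math.round(configuredMinMb * parentMinRatio);
    }
  }

  public static void main(String[] args) {
    ToyQueue queue = new ToyQueue(6144L);
    queue.mergeCapacities(0.25, 40960L);
    System.out.println(queue.effectiveMinMb); // prints 10240
    queue.updateEffectiveResources(1.0);
    System.out.println(queue.effectiveMinMb); // prints 6144
  }
}
```

Whichever update runs last determines the value a test observes, which is why moving the merge before the cluster resource update changes what the asserts see.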
[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261549#comment-17261549 ]

Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:31 AM:
-

Sharing my latest findings on TestAbsoluteResourceWithAutoQueue failure:

{{AutoCreatedLeafQueue#reinitializeFromTemplate }}was refactored, now the getting and merging the QueueCapacities happens{{ *before* }}calling the{{ ParentQueue#updateClusterResource}} (and {{LeafQueue#updateClusterResource}}). In {{LeafQueue#updateClusterResource}} the AbstractCSQueue#updateEffectiveResources is called where the effectiveMinResource of the created queue is overridden with the template's effectiveMinResources which is exactly the same the test is getting in the asserts.

{code:java}
void updateEffectiveResources(Resource clusterResource) {
  Set<String> configuredNodelabels =
      csContext.getConfiguration().getConfiguredNodeLabels(getQueuePath());
  for (String label : configuredNodelabels) {
    Resource resourceByLabel = labelManager.getResourceByLabel(label,
        clusterResource);

    Resource minResource = queueResourceQuotas.getConfiguredMinResource(
        label);

    // Update effective resource (min/max) to each child queue.
    if (getCapacityConfigType().equals(
        CapacityConfigType.ABSOLUTE_RESOURCE)) {
      queueResourceQuotas.setEffectiveMinResource(label,
          getMinResourceNormalized(queuePath,
              ((ParentQueue) parent).getEffectiveMinRatioPerResource(),
              minResource));
...{code}

was (Author: bteke):
Sharing my latest findings on TestAbsoluteResourceWithAutoQueue failure: {{AutoCreatedLeafQueue#reinitializeFromTemplate }}was refactored, now the getting and merging the QueueCapacities happens *before* calling the {{ParentQueue#updateClusterResource}} (and {{LeafQueue#updateClusterResource}}).
In {{LeafQueue#updateClusterResource }}the {{AbstractCSQueue#updateEffectiveResources }}is called where the effectiveMinResource of the created queue is overridden with the template's effectiveMinResources which is exactly the same the test is getting in the asserts.

{code:java}
void updateEffectiveResources(Resource clusterResource) {
  Set<String> configuredNodelabels =
      csContext.getConfiguration().getConfiguredNodeLabels(getQueuePath());
  for (String label : configuredNodelabels) {
    Resource resourceByLabel = labelManager.getResourceByLabel(label,
        clusterResource);

    Resource minResource = queueResourceQuotas.getConfiguredMinResource(
        label);

    // Update effective resource (min/max) to each child queue.
    if (getCapacityConfigType().equals(
        CapacityConfigType.ABSOLUTE_RESOURCE)) {
      queueResourceQuotas.setEffectiveMinResource(label,
          getMinResourceNormalized(queuePath,
              ((ParentQueue) parent).getEffectiveMinRatioPerResource(),
              minResource));
...{code}

> Implement weight mode in Capacity Scheduler
> ---
>
> Key: YARN-10504
> URL: https://issues.apache.org/jira/browse/YARN-10504
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Benjamin Teke
> Assignee: Benjamin Teke
> Priority: Major
> Attachments: YARN-10504.001.patch, YARN-10504.002.patch, YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, YARN-10504.006.patch, YARN-10504.007.patch, YARN-10504.008.patch, YARN-10504.009.patch, YARN-10504.010.patch, YARN-10504.ver-1.patch, YARN-10504.ver-2.patch, YARN-10504.ver-3.patch
>
> To allow the possibility to flexibly create queues in Capacity Scheduler a weight mode should be introduced. The existing {{capacity}} property should be used with a different syntax, i.e.:
> root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0
> root.users.capacity = 1.0w
> root.users.capacity = w:1.0
> Weight support should not impact the existing functionality.
>
> The new functionality should:
> * accept and validate the new weight values
> * enforce a singular mode on the whole queue tree
> * (re)calculate the relative (percentage-based) capacities based on the weights during launch and every time the queue structure changes
[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262590#comment-17262590 ] Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:29 AM: - [~zhuqi], As for the root cause of the Absolute mode - auto creation test issue I came to the same conclusion in my comment [above|https://issues.apache.org/jira/browse/YARN-10504?focusedCommentId=17261549=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17261549], so I agree with your suggestion. Will upload a patch containing it, if no objections. Thanks [~sunilg] for the review. I'll create the followup jira. was (Author: bteke): [~zhuqi], As for the root cause of the Absolute mode - auto creation test issue I came to the same conclusion in my comment above, so I agree with your suggestion. Will upload a patch containing it, if no objections. Thanks [~sunilg] for the review. I'll create the followup jira. > Implement weight mode in Capacity Scheduler > --- > > Key: YARN-10504 > URL: https://issues.apache.org/jira/browse/YARN-10504 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Benjamin Teke >Assignee: Benjamin Teke >Priority: Major > Attachments: YARN-10504.001.patch, YARN-10504.002.patch, > YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, > YARN-10504.006.patch, YARN-10504.007.patch, YARN-10504.008.patch, > YARN-10504.009.patch, YARN-10504.010.patch, YARN-10504.ver-1.patch, > YARN-10504.ver-2.patch, YARN-10504.ver-3.patch > > > To allow the possibility to flexibly create queues in Capacity Scheduler a > weight mode should be introduced. The existing \{{capacity }}property should > be used with a different syntax, i.e: > root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0 > root.users.capacity = 1.0w > root.users.capacity = w:1.0 > Weight support should not impact the existing functionality. 
>
> The new functionality should:
> * accept and validate the new weight values
> * enforce a singular mode on the whole queue tree
> * (re)calculate the relative (percentage-based) capacities based on the weights during launch and every time the queue structure changes
[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262590#comment-17262590 ] Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:29 AM: - [~zhuqi], As for the root cause of the Absolute mode - auto creation test issue I came to the same conclusion in my comment above, so I agree with your suggestion. Will upload a patch containing it, if no objections. Thanks [~sunilg] for the review. I'll create the followup jira. was (Author: bteke): [~zhuqi], I came to the same conclusion in my comment [above|https://issues.apache.org/jira/browse/YARN-10504?focusedCommentId=17261549=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17261549], so I agree with your suggestion. Will upload a patch containing it, if no objections. Thanks [~sunilg] for the review. I'll create the followup jira. > Implement weight mode in Capacity Scheduler > --- > > Key: YARN-10504 > URL: https://issues.apache.org/jira/browse/YARN-10504 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Benjamin Teke >Assignee: Benjamin Teke >Priority: Major > Attachments: YARN-10504.001.patch, YARN-10504.002.patch, > YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, > YARN-10504.006.patch, YARN-10504.007.patch, YARN-10504.008.patch, > YARN-10504.009.patch, YARN-10504.010.patch, YARN-10504.ver-1.patch, > YARN-10504.ver-2.patch, YARN-10504.ver-3.patch > > > To allow the possibility to flexibly create queues in Capacity Scheduler a > weight mode should be introduced. The existing \{{capacity }}property should > be used with a different syntax, i.e: > root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0 > root.users.capacity = 1.0w > root.users.capacity = w:1.0 > Weight support should not impact the existing functionality. 
>
> The new functionality should:
> * accept and validate the new weight values
> * enforce a singular mode on the whole queue tree
> * (re)calculate the relative (percentage-based) capacities based on the weights during launch and every time the queue structure changes
[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262590#comment-17262590 ] Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:28 AM: - [~zhuqi], I came to the same conclusion in my comment [above|https://issues.apache.org/jira/browse/YARN-10504?focusedCommentId=17261549=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17261549], so I agree with your suggestion. Will upload a patch containing it, if no objections. Thanks [~sunilg] for the review. I'll create the followup jira. was (Author: bteke): [~zhuqi], I came to the same conclusion in my comment above, so I agree with your suggestion. Will upload a patch containing it, if no objections. Thanks [~sunilg] for the review. I'll create the followup jira. > Implement weight mode in Capacity Scheduler > --- > > Key: YARN-10504 > URL: https://issues.apache.org/jira/browse/YARN-10504 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Benjamin Teke >Assignee: Benjamin Teke >Priority: Major > Attachments: YARN-10504.001.patch, YARN-10504.002.patch, > YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, > YARN-10504.006.patch, YARN-10504.007.patch, YARN-10504.008.patch, > YARN-10504.009.patch, YARN-10504.010.patch, YARN-10504.ver-1.patch, > YARN-10504.ver-2.patch, YARN-10504.ver-3.patch > > > To allow the possibility to flexibly create queues in Capacity Scheduler a > weight mode should be introduced. The existing \{{capacity }}property should > be used with a different syntax, i.e: > root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0 > root.users.capacity = 1.0w > root.users.capacity = w:1.0 > Weight support should not impact the existing functionality. 
>
> The new functionality should:
> * accept and validate the new weight values
> * enforce a singular mode on the whole queue tree
> * (re)calculate the relative (percentage-based) capacities based on the weights during launch and every time the queue structure changes
[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262590#comment-17262590 ] Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:27 AM: - [~zhuqi], I came to the same conclusion in my comment above, so I agree with your suggestion. Will upload a patch containing it, if no objections. Thanks [~sunilg] for the review. I'll create the followup jira. was (Author: bteke): [~zhuqi], I came to the same conclusion in my comment [above|https://issues.apache.org/jira/browse/YARN-10504?focusedCommentId=17261549=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17261549], so I agree with your suggestion. Will upload a patch containing it, if no objections. Thanks [~sunilg] for the review! I'll create the followup jira. > Implement weight mode in Capacity Scheduler > --- > > Key: YARN-10504 > URL: https://issues.apache.org/jira/browse/YARN-10504 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Benjamin Teke >Assignee: Benjamin Teke >Priority: Major > Attachments: YARN-10504.001.patch, YARN-10504.002.patch, > YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, > YARN-10504.006.patch, YARN-10504.007.patch, YARN-10504.008.patch, > YARN-10504.009.patch, YARN-10504.010.patch, YARN-10504.ver-1.patch, > YARN-10504.ver-2.patch, YARN-10504.ver-3.patch > > > To allow the possibility to flexibly create queues in Capacity Scheduler a > weight mode should be introduced. The existing \{{capacity }}property should > be used with a different syntax, i.e: > root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0 > root.users.capacity = 1.0w > root.users.capacity = w:1.0 > Weight support should not impact the existing functionality. 
>
> The new functionality should:
> * accept and validate the new weight values
> * enforce a singular mode on the whole queue tree
> * (re)calculate the relative (percentage-based) capacities based on the weights during launch and every time the queue structure changes
[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262562#comment-17262562 ]

zhuqi edited comment on YARN-10504 at 1/11/21, 10:44 AM:
-

[~wangda] [~bteke]

When absolute resources are used in an AutoCreatedLeafQueue, the logic is:

1. Initialize and update the absolute template:

{code:java}
protected AutoCreatedLeafQueueConfig.Builder initializeLeafQueueConfigs() throws IOException {
  // Update the absolute template
  if (this.capacityConfigType.equals(CapacityConfigType.ABSOLUTE_RESOURCE)) {
    updateQueueCapacities(queueCapacities);
  }
  ...
  builder.capacities(queueCapacities);
  return builder;
}

//get resource value in
private Resource internalGetLabeledResourceRequirementForQueue(String queue,
    String label, Set resourceTypes, String suffix) {
  ...
  if (matcher.find()) {
    // Get the sub-group.
    String subGroup = matcher.group(0);
    if (subGroup.trim().isEmpty()) {
      return Resources.none();
    }
    subGroup = subGroup.substring(1, subGroup.length() - 1);
    for (String kvPair : subGroup.trim().split(",")) {
      String[] splits = kvPair.split("=");
      // Ensure that each sub string is key value pair separated by '='.
      if (splits != null && splits.length > 1) {
        updateResourceValuesFromConfig(resourceTypes, resource, splits);
      }
    }
  }
  ...
  return resource;
}

// Update in ManagedParentQueue's updateQueueCapacities
private void updateQueueCapacities(QueueCapacities queueCapacities) {
  for (String label : queueCapacities.getExistingNodeLabels()) {
    queueCapacities.setCapacity(label,
        this.csContext.getResourceCalculator().divide(
            this.csContext.getClusterResource(),
            this.csContext.getConfiguration().getMinimumResourceRequirement(
                label,
                this.csContext.getConfiguration()
                    .getAutoCreatedQueueTemplateConfPrefix(getQueuePath()),
                resourceTypes),
            getQueueResourceQuotas().getConfiguredMinResource(label)));

    Resource childMaxResource = this.csContext.getConfiguration()
        .getMaximumResourceRequirement(label,
            this.csContext.getConfiguration()
                .getAutoCreatedQueueTemplateConfPrefix(getQueuePath()),
            resourceTypes);
    Resource parentMaxRes = getQueueResourceQuotas()
        .getConfiguredMaxResource(label);

    Resource effMaxResource = Resources.min(
        this.csContext.getResourceCalculator(),
        this.csContext.getClusterResource(),
        childMaxResource.equals(Resources.none()) ? parentMaxRes
            : childMaxResource,
        parentMaxRes);

    queueCapacities.setMaximumCapacity(
        label, this.csContext.getResourceCalculator().divide(
            this.csContext.getClusterResource(),
            effMaxResource,
            getQueueResourceQuotas().getConfiguredMaxResource(label)));

    queueCapacities.setAbsoluteCapacity(
        label, queueCapacities.getCapacity(label)
            * getQueueCapacities().getAbsoluteCapacity(label));

    queueCapacities.setAbsoluteMaximumCapacity(label,
        queueCapacities.getMaximumCapacity(label)
            * getQueueCapacities().getAbsoluteMaximumCapacity(label));
  }
}
{code}

2. Now the capacity has been updated to an absolute-resource-based value in addChildQueue. It is converted back to the absolute resource in setEffectiveMinResource:

{code:java}
public void mergeCapacities(QueueCapacities capacities) {
  for (String nodeLabel : capacities.getExistingNodeLabels()) {
    queueCapacities.setCapacity(nodeLabel,
        capacities.getCapacity(nodeLabel));
    queueCapacities.setAbsoluteCapacity(nodeLabel,
        capacities.getAbsoluteCapacity(nodeLabel));
    queueCapacities.setMaximumCapacity(nodeLabel,
        capacities.getMaximumCapacity(nodeLabel));
    queueCapacities.setAbsoluteMaximumCapacity(nodeLabel,
        capacities.getAbsoluteMaximumCapacity(nodeLabel));

    Resource resourceByLabel = labelManager.getResourceByLabel(nodeLabel,
        csContext.getClusterResource());
    getQueueResourceQuotas().setEffectiveMinResource(nodeLabel,
        Resources.multiply(resourceByLabel,
            queueCapacities.getAbsoluteCapacity(nodeLabel)));
    getQueueResourceQuotas().setEffectiveMaxResource(nodeLabel,
        Resources.multiply(resourceByLabel,
            queueCapacities.getAbsoluteMaximumCapacity(nodeLabel)));
  }
}{code}

At this point the effective resources have already been updated on the AutoCreatedLeafQueue. The result is consistent with the original test case, because the test also uses Resources.multiply(resourceByLabel, queueCapacities.getAbsoluteCapacity(nodeLabel)) to get its expected value.

3. Then, in LeafQueue's updateClusterResource:

{code:java}
public void updateClusterResource(Resource clusterResource,
    ResourceLimits currentResourceLimits) {
  writeLock.lock();
  try {
    ...
[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261872#comment-17261872 ]

zhuqi edited comment on YARN-10504 at 1/9/21, 2:07 PM:
---

[~wangda] [~bteke] [~gandras]

1. {{updateAbsoluteCapacitiesAndRelatedFields}} should update maxApplications, but in some cases it ends up as 0, for example in TestCapacitySchedulerAutoQueueCreation -> testAutoCreatedQueueActivationDeactivation:
{code:java}
//submit user_3 app. This cant be allocated since there is no capacity
// in NO_LABEL, SSD but can be in GPU label
submitApp(mockRM, parentQueue, USER3, USER3, 4, 1);
final CSQueue user3LeafQueue = cs.getQueue(USER3);
validateCapacities((AutoCreatedLeafQueue) user3LeafQueue, 0.0f, 0.0f,
    1.0f, 1.0f);
validateCapacitiesByLabel((ManagedParentQueue) parentQueue,
    (AutoCreatedLeafQueue) user3LeafQueue, NODEL_LABEL_GPU);
{code}
In this case the user_3 autoCreatedLeafQueue has no capacity, so in {{updateAbsoluteCapacitiesAndRelatedFields}}:
{code:java}
private void updateAbsoluteCapacitiesAndRelatedFields() {
  updateAbsoluteCapacities();
  CapacitySchedulerConfiguration schedulerConf = csContext.getConfiguration();

  // If maxApplications not set, use the system total max app, apply newly
  // calculated abs capacity of the queue.
  if (maxApplications <= 0) {
    int maxSystemApps = schedulerConf.getMaximumSystemApplications();
    maxApplications =
        (int) (maxSystemApps * queueCapacities.getAbsoluteCapacity());
  }
  maxApplicationsPerUser = Math.min(maxApplications,
      (int) (maxApplications * (usersManager.getUserLimit() / 100.0f)
          * usersManager.getUserLimitFactor()));
}

// because capacities will update to 0
if (availableCapacity >= leafQueueTemplateCapacities
    .getAbsoluteCapacity(nodeLabel)) {
  updateCapacityFromTemplate(capacities, nodeLabel);
  activate(leafQueue, nodeLabel);
} else{
  updateToZeroCapacity(capacities, nodeLabel);
}

// And because, the update will be after reinitializeFromTemplate
final AutoCreatedLeafQueueConfig initialLeafQueueTemplate =
    queueManagementPolicy.getInitialLeafQueueConfiguration(leafQueue);
leafQueue.reinitializeFromTemplate(initialLeafQueueTemplate);

// Do one update cluster resource call to make sure all absolute resources
// effective resources are updated.
updateClusterResource(this.csContext.getClusterResource(),
    new ResourceLimits(this.csContext.getClusterResource()));
{code}
maxApplications and maxApplicationsPerUser will therefore both be 0. We should handle this in the new logic at
//TODO recalculate max applications because they can depend on capacity
Either the TODO should be removed and the AutoCreatedLeafQueue case simply allowed to pass for now, or logic should be added that sets maxApplications to a fixed default number in this case.

2. As mentioned by [~bteke]:

"Sharing my latest findings on TestAbsoluteResourceWithAutoQueue failure: {{AutoCreatedLeafQueue#reinitializeFromTemplate}} was refactored, now the getting and merging the QueueCapacities happens *before* calling the {{ParentQueue#updateClusterResource}} (and {{LeafQueue#updateClusterResource}}). In {{LeafQueue#updateClusterResource}} the {{AbstractCSQueue#updateEffectiveResources}} is called where the effectiveMinResource of the created queue is overridden with the template's effectiveMinResources which is exactly the same the test is getting in the asserts."
We should change {{LeafQueue#updateClusterResource}} to:
{code:java}
public void updateClusterResource(Resource clusterResource,
    ResourceLimits currentResourceLimits) {
  writeLock.lock();
  try {
    ...
    if (!(this instanceof AutoCreatedLeafQueue)) {
      super.updateEffectiveResources(clusterResource);
    }
}
{code}
This fixes the absolute-resource case in TestAbsoluteResourceWithAutoQueue.

Do you have any other advice?

Thanks.
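To make point 1 concrete, here is a minimal standalone sketch of the arithmetic quoted from {{updateAbsoluteCapacitiesAndRelatedFields}}. It is not the scheduler code: the class name, the method signatures and the value 10000 for the system-wide application limit are assumptions chosen for illustration only. It shows that an absolute capacity of 0 yields 0 for both limits.

```java
// Hypothetical sketch; names and values are illustrative, not Hadoop APIs.
public class MaxApplicationsSketch {

  // Same arithmetic as the quoted snippet: the system-wide limit scaled by
  // the queue's absolute capacity, truncated to an int.
  static int maxApplications(int maxSystemApps, float absoluteCapacity) {
    return (int) (maxSystemApps * absoluteCapacity);
  }

  // Same arithmetic as the quoted snippet: the per-user limit is capped by
  // the queue-wide limit.
  static int maxApplicationsPerUser(int maxApplications,
      float userLimitPercent, float userLimitFactor) {
    return Math.min(maxApplications,
        (int) (maxApplications * (userLimitPercent / 100.0f)
            * userLimitFactor));
  }

  public static void main(String[] args) {
    int maxSystemApps = 10000; // assumed system-wide limit

    // Leaf queue whose capacity was updated to zero (the user_3 case).
    int zeroCapApps = maxApplications(maxSystemApps, 0.0f);
    int zeroCapPerUser = maxApplicationsPerUser(zeroCapApps, 100.0f, 1.0f);
    System.out.println(zeroCapApps + " " + zeroCapPerUser); // prints "0 0"

    // Leaf queue with half of the cluster, 50% user limit, factor 1.
    int halfCapApps = maxApplications(maxSystemApps, 0.5f);
    int halfCapPerUser = maxApplicationsPerUser(halfCapApps, 50.0f, 1.0f);
    System.out.println(halfCapApps + " " + halfCapPerUser); // prints "5000 2500"
  }
}
```

With a zero absolute capacity the queue-wide limit truncates to 0, and since the per-user limit is capped by it, it is 0 as well. That is why the zero-capacity AutoCreatedLeafQueue needs either special handling or a fixed default.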
[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260292#comment-17260292 ]

Benjamin Teke edited comment on YARN-10504 at 1/7/21, 7:53 AM:
---

[~wangda], Regarding the AutoCreatedLeafQueue failures: we are looking into the issue. Currently it seems like the way the mock setup is structured the updateClusterResource is not called at the correct time. We'll update the patch once the issue is verified and fixed.

was (Author: bteke):
[~wangda], Regarding the AutoCreatedLeafQueue: we are looking into the issue. Cuurently it seems like the way the mock setup is structured the updateClusterResource is not called at the correct time. We'll update the patch once the issue is fixed.