[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-11 Thread Wangda Tan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17263018#comment-17263018
 ] 

Wangda Tan edited comment on YARN-10504 at 1/12/21, 1:58 AM:
-

Committed ver.010 to trunk, thanks to everybody who contributed code ([~bteke], 
[~zhuqi], [~gandras]) and reviewed the patch ([~sunilg], [~epayne])!


was (Author: wangda):
Committed to trunk, thanks to everybody who contributed code ([~bteke], 
[~zhuqi], [~gandras]) and reviewed the patch ([~sunilg], [~epayne])!

> Implement weight mode in Capacity Scheduler
> ---
>
> Key: YARN-10504
> URL: https://issues.apache.org/jira/browse/YARN-10504
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10504.001.patch, YARN-10504.002.patch, 
> YARN-10504.003.patch, YARN-10504.004.patch, YARN-10504.005.patch, 
> YARN-10504.006.patch, YARN-10504.007.patch, YARN-10504.008.patch, 
> YARN-10504.009.patch, YARN-10504.010.patch, YARN-10504.011.patch, 
> YARN-10504.ver-1.patch, YARN-10504.ver-2.patch, YARN-10504.ver-3.patch
>
>
> To allow queues to be created flexibly in Capacity Scheduler, a weight mode 
> should be introduced. The existing {{capacity}} property should be used with 
> a different syntax, e.g.:
> root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0
> root.users.capacity = 1.0w
> root.users.capacity = w:1.0
> Weight support should not impact the existing functionality.
>  
> The new functionality should: 
>  * accept and validate the new weight values
>  * enforce a singular mode on the whole queue tree
>  * (re)calculate the relative (percentage-based) capacities based on the 
> weights during launch and every time the queue structure changes
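
A minimal sketch of the last bullet, assuming the {{1.0w}} suffix form (only one of the candidate syntaxes listed in the description). The class and method names are hypothetical and are not taken from the patch; the sketch only shows how sibling weights could be validated and normalized into percentage capacities.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WeightModeSketch {

  // Parses a value such as "1.0w" into a weight; rejects anything else.
  static double parseWeight(String raw) {
    String value = raw.trim();
    if (!value.endsWith("w")) {
      throw new IllegalArgumentException("Not a weight value: " + raw);
    }
    double weight =
        Double.parseDouble(value.substring(0, value.length() - 1));
    if (Double.isNaN(weight) || Double.isInfinite(weight) || weight < 0) {
      throw new IllegalArgumentException("Invalid weight: " + raw);
    }
    return weight;
  }

  // Turns the weights of sibling queues into percentages of their parent.
  static Map<String, Double> toRelativeCapacities(
      Map<String, String> siblingConfigs) {
    Map<String, Double> weights = new LinkedHashMap<>();
    double sum = 0;
    for (Map.Entry<String, String> e : siblingConfigs.entrySet()) {
      double w = parseWeight(e.getValue());
      weights.put(e.getKey(), w);
      sum += w;
    }
    Map<String, Double> capacities = new LinkedHashMap<>();
    for (Map.Entry<String, Double> e : weights.entrySet()) {
      // If all weights are zero, every sibling gets 0 percent.
      capacities.put(e.getKey(),
          sum == 0 ? 0.0 : e.getValue() / sum * 100.0);
    }
    return capacities;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("root.users", "1.0w");
    conf.put("root.batch", "3.0w");
    // prints {root.users=25.0, root.batch=75.0}
    System.out.println(toRelativeCapacities(conf));
  }
}
```

Re-running the normalization whenever a sibling is added or removed corresponds to the "(re)calculate" requirement: the weights stay fixed and only the derived percentages change.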



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-11 Thread Wangda Tan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262853#comment-17262853
 ] 

Wangda Tan edited comment on YARN-10504 at 1/11/21, 6:35 PM:
-

It looks like folks are generally OK with getting this patch in and dealing 
with further cleanup for mixed config mode in a follow-up Jira; the patch LGTM. 
I plan to get it in today, my time. 


was (Author: wangda):
It looks like folks are generally OK with getting this patch in and deal with 
further clean up for mixed config mode in a follow up Jira. I plan to get it 
in by today my time. 







[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-11 Thread Benjamin Teke (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262590#comment-17262590
 ] 

Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:33 AM:
-

[~zhuqi],

Thanks for the detailed explanation of the root cause of the Absolute mode - 
auto creation test issue. I came to the same conclusion in my comment 
[above|https://issues.apache.org/jira/browse/YARN-10504?focusedCommentId=17261549=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17261549],
 so I agree with your suggestion. I will upload a patch containing it if there 
are no objections.

Thanks [~sunilg] for the review. I'll create the follow-up Jira.


was (Author: bteke):
[~zhuqi],

As for the root cause of the Absolute mode - auto creation test issue I came to 
the same conclusion in my comment 
[above|https://issues.apache.org/jira/browse/YARN-10504?focusedCommentId=17261549=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17261549],
 so I agree with your suggestion. Will upload a patch containing it, if no 
objections.

Thanks [~sunilg] for the review. I'll create the followup jira.







[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-11 Thread Benjamin Teke (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261549#comment-17261549
 ] 

Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:32 AM:
-

Sharing my latest findings on the TestAbsoluteResourceWithAutoQueue failure: 
{{AutoCreatedLeafQueue#reinitializeFromTemplate}} was refactored, so getting 
and merging the QueueCapacities now happens *before* calling 
{{ParentQueue#updateClusterResource}} (and 
{{LeafQueue#updateClusterResource}}). {{LeafQueue#updateClusterResource}} 
calls {{AbstractCSQueue#updateEffectiveResources}}, where the 
effectiveMinResource of the created queue is overridden with the template's 
effectiveMinResource, which is exactly the value the test sees in its 
asserts.
{code:java}
void updateEffectiveResources(Resource clusterResource) {
  Set<String> configuredNodelabels =
      csContext.getConfiguration().getConfiguredNodeLabels(getQueuePath());
  for (String label : configuredNodelabels) {
    Resource resourceByLabel = labelManager.getResourceByLabel(label,
        clusterResource);
    Resource minResource = queueResourceQuotas.getConfiguredMinResource(
        label);

    // Update effective resource (min/max) to each child queue.
    if (getCapacityConfigType().equals(
        CapacityConfigType.ABSOLUTE_RESOURCE)) {
      queueResourceQuotas.setEffectiveMinResource(label,
          getMinResourceNormalized(queuePath,
              ((ParentQueue) parent).getEffectiveMinRatioPerResource(),
              minResource));


...{code}
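
The ordering effect described above can be reproduced with a toy model. This is only a sketch: {{ToyLeafQueue}} and its methods are hypothetical stand-ins rather than the Hadoop classes, resources are reduced to a single memory value in MB, the normalization is simplified to a multiplication by the parent's ratio, and all numbers are made up.

```java
public class EffectiveMinOrderSketch {

  enum CapacityConfigType { PERCENTAGE, ABSOLUTE_RESOURCE }

  static class ToyLeafQueue {
    final CapacityConfigType type;
    final long configuredMinMb; // minimum configured on the template
    long effectiveMinMb;

    ToyLeafQueue(CapacityConfigType type, long configuredMinMb) {
      this.type = type;
      this.configuredMinMb = configuredMinMb;
    }

    // Stand-in for merging the template capacities:
    // effective min = cluster resource * absolute capacity.
    void mergeCapacities(long clusterMb, double absoluteCapacity) {
      effectiveMinMb = Math.round(clusterMb * absoluteCapacity);
    }

    // Stand-in for updateEffectiveResources: in absolute mode the
    // effective min is overwritten from the configured min.
    void updateEffectiveResources(double parentMinRatio) {
      if (type == CapacityConfigType.ABSOLUTE_RESOURCE) {
        effectiveMinMb = Math.round(configuredMinMb * parentMinRatio);
      }
    }
  }

  public static void main(String[] args) {
    ToyLeafQueue queue =
        new ToyLeafQueue(CapacityConfigType.ABSOLUTE_RESOURCE, 8192);
    queue.mergeCapacities(102400, 0.1);
    System.out.println("after merge:  " + queue.effectiveMinMb); // 10240
    queue.updateEffectiveResources(1.0);
    System.out.println("after update: " + queue.effectiveMinMb); // 8192
  }
}
```

Whichever call runs last decides the value the test observes, which is why moving the merge before the cluster resource update changes the asserted result.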


was (Author: bteke):
Sharing my latest findings on TestAbsoluteResourceWithAutoQueue failure: 
{{AutoCreatedLeafQueue#reinitializeFromTemplate}} was refactored, now the 
getting and merging the QueueCapacities happens *before* calling the 
{{ParentQueue#updateClusterResource}} (and {{LeafQueue#updateClusterResource}}). 
In {{LeafQueue#updateClusterResource}} the 
AbstractCSQueue#updateEffectiveResources is called where the 
effectiveMinResource of the created queue is overridden with the template's 
effectiveMinResources which is exactly the same the test is getting in the 
asserts.
{code:java}
void updateEffectiveResources(Resource clusterResource) {
  Set<String> configuredNodelabels =
      csContext.getConfiguration().getConfiguredNodeLabels(getQueuePath());
  for (String label : configuredNodelabels) {
    Resource resourceByLabel = labelManager.getResourceByLabel(label,
        clusterResource);
    Resource minResource = queueResourceQuotas.getConfiguredMinResource(
        label);

    // Update effective resource (min/max) to each child queue.
    if (getCapacityConfigType().equals(
        CapacityConfigType.ABSOLUTE_RESOURCE)) {
      queueResourceQuotas.setEffectiveMinResource(label,
          getMinResourceNormalized(queuePath,
              ((ParentQueue) parent).getEffectiveMinRatioPerResource(),
              minResource));


...{code}







[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-11 Thread Benjamin Teke (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261549#comment-17261549
 ] 

Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:31 AM:
-

Sharing my latest findings on TestAbsoluteResourceWithAutoQueue failure: 
{{AutoCreatedLeafQueue#reinitializeFromTemplate}} was refactored, now the 
getting and merging the QueueCapacities happens *before* calling the 
{{ParentQueue#updateClusterResource}} (and {{LeafQueue#updateClusterResource}}). 
In {{LeafQueue#updateClusterResource}} the 
AbstractCSQueue#updateEffectiveResources is called where the 
effectiveMinResource of the created queue is overridden with the template's 
effectiveMinResources which is exactly the same the test is getting in the 
asserts.
{code:java}
void updateEffectiveResources(Resource clusterResource) {
  Set<String> configuredNodelabels =
      csContext.getConfiguration().getConfiguredNodeLabels(getQueuePath());
  for (String label : configuredNodelabels) {
    Resource resourceByLabel = labelManager.getResourceByLabel(label,
        clusterResource);
    Resource minResource = queueResourceQuotas.getConfiguredMinResource(
        label);

    // Update effective resource (min/max) to each child queue.
    if (getCapacityConfigType().equals(
        CapacityConfigType.ABSOLUTE_RESOURCE)) {
      queueResourceQuotas.setEffectiveMinResource(label,
          getMinResourceNormalized(queuePath,
              ((ParentQueue) parent).getEffectiveMinRatioPerResource(),
              minResource));


...{code}


was (Author: bteke):
Sharing my latest findings on TestAbsoluteResourceWithAutoQueue failure: 
{{AutoCreatedLeafQueue#reinitializeFromTemplate}} was refactored, now the 
getting and merging the QueueCapacities happens *before* calling the 
{{ParentQueue#updateClusterResource}} (and 
{{LeafQueue#updateClusterResource}}). In {{LeafQueue#updateClusterResource}} 
the {{AbstractCSQueue#updateEffectiveResources}} is called where the 
effectiveMinResource of the created queue is overridden with the template's 
effectiveMinResources which is exactly the same the test is getting in the 
asserts.


{code:java}
void updateEffectiveResources(Resource clusterResource) {
  Set<String> configuredNodelabels =
      csContext.getConfiguration().getConfiguredNodeLabels(getQueuePath());
  for (String label : configuredNodelabels) {
    Resource resourceByLabel = labelManager.getResourceByLabel(label,
        clusterResource);
    Resource minResource = queueResourceQuotas.getConfiguredMinResource(
        label);

    // Update effective resource (min/max) to each child queue.
    if (getCapacityConfigType().equals(
        CapacityConfigType.ABSOLUTE_RESOURCE)) {
      queueResourceQuotas.setEffectiveMinResource(label,
          getMinResourceNormalized(queuePath,
              ((ParentQueue) parent).getEffectiveMinRatioPerResource(),
              minResource));


...{code}







[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-11 Thread Benjamin Teke (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262590#comment-17262590
 ] 

Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:29 AM:
-

[~zhuqi],

As for the root cause of the Absolute mode - auto creation test issue I came to 
the same conclusion in my comment 
[above|https://issues.apache.org/jira/browse/YARN-10504?focusedCommentId=17261549=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17261549],
 so I agree with your suggestion. Will upload a patch containing it, if no 
objections.

Thanks [~sunilg] for the review. I'll create the followup jira.


was (Author: bteke):
[~zhuqi],

As for the root cause of the Absolute mode - auto creation test issue I came to 
the same conclusion in my comment above, so I agree with your suggestion. Will 
upload a patch containing it, if no objections.

Thanks [~sunilg] for the review. I'll create the followup jira.







[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-11 Thread Benjamin Teke (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262590#comment-17262590
 ] 

Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:29 AM:
-

[~zhuqi],

As for the root cause of the Absolute mode - auto creation test issue I came to 
the same conclusion in my comment above, so I agree with your suggestion. Will 
upload a patch containing it, if no objections.

Thanks [~sunilg] for the review. I'll create the followup jira.


was (Author: bteke):
[~zhuqi],

I came to the same conclusion in my comment 
[above|https://issues.apache.org/jira/browse/YARN-10504?focusedCommentId=17261549=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17261549],
 so I agree with your suggestion. Will upload a patch containing it, if no 
objections.

Thanks [~sunilg] for the review. I'll create the followup jira.







[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-11 Thread Benjamin Teke (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262590#comment-17262590
 ] 

Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:28 AM:
-

[~zhuqi],

I came to the same conclusion in my comment 
[above|https://issues.apache.org/jira/browse/YARN-10504?focusedCommentId=17261549=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17261549],
 so I agree with your suggestion. Will upload a patch containing it, if no 
objections.

Thanks [~sunilg] for the review. I'll create the followup jira.


was (Author: bteke):
[~zhuqi],

I came to the same conclusion in my comment above, so I agree with your 
suggestion. Will upload a patch containing it, if no objections.

Thanks [~sunilg] for the review. I'll create the followup jira.







[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-11 Thread Benjamin Teke (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262590#comment-17262590
 ] 

Benjamin Teke edited comment on YARN-10504 at 1/11/21, 11:27 AM:
-

[~zhuqi],

I came to the same conclusion in my comment above, so I agree with your 
suggestion. Will upload a patch containing it, if no objections.

Thanks [~sunilg] for the review. I'll create the followup jira.


was (Author: bteke):
[~zhuqi],

I came to the same conclusion in my comment 
[above|https://issues.apache.org/jira/browse/YARN-10504?focusedCommentId=17261549=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17261549],
 so I agree with your suggestion. Will upload a patch containing it, if no 
objections.

 

Thanks [~sunilg] for the review! I'll create the followup jira.







[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-11 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262562#comment-17262562
 ] 

zhuqi edited comment on YARN-10504 at 1/11/21, 10:44 AM:
-

[~wangda] [~bteke]

When absolute resources are used in the autoCreatedLeafQueue, the logic is:

1. Initialize and update the absolute template:

 
{code:java}
protected AutoCreatedLeafQueueConfig.Builder initializeLeafQueueConfigs() 
throws IOException {
  
  //Update the absolute template
  if (this.capacityConfigType.equals(CapacityConfigType.ABSOLUTE_RESOURCE)) {
updateQueueCapacities(queueCapacities);
  }
  
  ...
   
  builder.capacities(queueCapacities);
  return builder;
}

//get resource value in 
private Resource internalGetLabeledResourceRequirementForQueue(String queue,
String label, Set resourceTypes, String suffix) {
  ...
  if (matcher.find()) {
// Get the sub-group.
String subGroup = matcher.group(0);
if (subGroup.trim().isEmpty()) {
  return Resources.none();
}
subGroup = subGroup.substring(1, subGroup.length() - 1);
for (String kvPair : subGroup.trim().split(",")) {
  String[] splits = kvPair.split("=");

  // Ensure that each sub string is key value pair separated by '='.
  if (splits != null && splits.length > 1) {
updateResourceValuesFromConfig(resourceTypes, resource, splits);
  }
}
  }
  ...
  return resource;
}

// Update in ManagedParentQueue's updateQueueCapacities
private void updateQueueCapacities(QueueCapacities queueCapacities) {
  for (String label : queueCapacities.getExistingNodeLabels()) {
queueCapacities.setCapacity(label,
this.csContext.getResourceCalculator().divide(
this.csContext.getClusterResource(),
this.csContext.getConfiguration().getMinimumResourceRequirement(
label,
this.csContext.getConfiguration()
.getAutoCreatedQueueTemplateConfPrefix(getQueuePath()),
resourceTypes),
getQueueResourceQuotas().getConfiguredMinResource(label)));

Resource childMaxResource = this.csContext.getConfiguration()
.getMaximumResourceRequirement(label,
this.csContext.getConfiguration()
.getAutoCreatedQueueTemplateConfPrefix(getQueuePath()),
resourceTypes);
Resource parentMaxRes = getQueueResourceQuotas()
.getConfiguredMaxResource(label);

Resource effMaxResource = Resources.min(
this.csContext.getResourceCalculator(),
this.csContext.getClusterResource(),
childMaxResource.equals(Resources.none()) ? parentMaxRes
: childMaxResource,
parentMaxRes);

queueCapacities.setMaximumCapacity(
label, this.csContext.getResourceCalculator().divide(
 this.csContext.getClusterResource(),
 effMaxResource,
 getQueueResourceQuotas().getConfiguredMaxResource(label)));

queueCapacities.setAbsoluteCapacity(
label, queueCapacities.getCapacity(label)
* getQueueCapacities().getAbsoluteCapacity(label));

queueCapacities.setAbsoluteMaximumCapacity(label,
queueCapacities.getMaximumCapacity(label)
* getQueueCapacities().getAbsoluteMaximumCapacity(label));
  }
} {code}
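
The bracket parsing in internalGetLabeledResourceRequirementForQueue above can be tried in isolation. The following is a rough sketch only: the regular expression, the class name and the use of plain long values without units are assumptions made for illustration, not the configuration class's actual behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AbsoluteResourceParseSketch {

  // Assumed pattern: a bracketed, comma-separated list of key=value pairs.
  private static final Pattern BRACKETED = Pattern.compile("\\[[^\\]]*\\]");

  static Map<String, Long> parse(String configured) {
    Map<String, Long> resource = new LinkedHashMap<>();
    Matcher matcher = BRACKETED.matcher(configured);
    if (!matcher.find()) {
      return resource; // not an absolute resource value
    }
    // Strip the surrounding brackets, as the snippet above does.
    String subGroup = matcher.group(0);
    subGroup = subGroup.substring(1, subGroup.length() - 1);
    if (subGroup.trim().isEmpty()) {
      return resource;
    }
    for (String kvPair : subGroup.trim().split(",")) {
      String[] splits = kvPair.split("=");
      // Only accept sub strings that are key=value pairs.
      if (splits.length > 1) {
        resource.put(splits[0].trim(), Long.parseLong(splits[1].trim()));
      }
    }
    return resource;
  }

  public static void main(String[] args) {
    // prints {memory=10240, vcores=10}
    System.out.println(parse("[memory=10240,vcores=10]"));
  }
}
```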
2. Now the capacity has been updated to an absolute-resource-based value in 
addChildQueue.

It is converted back to an absolute resource in setEffectiveMinResource:

 
{code:java}
public void mergeCapacities(QueueCapacities capacities) {
  for (String nodeLabel : capacities.getExistingNodeLabels()) {
    queueCapacities.setCapacity(nodeLabel,
        capacities.getCapacity(nodeLabel));
    queueCapacities.setAbsoluteCapacity(nodeLabel, capacities
        .getAbsoluteCapacity(nodeLabel));
    queueCapacities.setMaximumCapacity(nodeLabel, capacities
        .getMaximumCapacity(nodeLabel));
    queueCapacities.setAbsoluteMaximumCapacity(nodeLabel, capacities
        .getAbsoluteMaximumCapacity(nodeLabel));

    Resource resourceByLabel = labelManager.getResourceByLabel(nodeLabel,
        csContext.getClusterResource());
    getQueueResourceQuotas().setEffectiveMinResource(nodeLabel,
        Resources.multiply(resourceByLabel,
            queueCapacities.getAbsoluteCapacity(nodeLabel)));
    getQueueResourceQuotas().setEffectiveMaxResource(nodeLabel,
        Resources.multiply(resourceByLabel, queueCapacities
            .getAbsoluteMaximumCapacity(nodeLabel)));
  }
}{code}
The effective resources have already been updated on the autoCreatedLeafQueue.

The result is now consistent with the original test case, because it also uses 
Resources.multiply(resourceByLabel, 
queueCapacities.getAbsoluteCapacity(nodeLabel)) to get the result.
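
The arithmetic of steps 1 and 2 can be followed with plain numbers. This is a sketch under simplifying assumptions: a single resource dimension (memory in MB), made-up values, hypothetical method names, an ordinary division standing in for the resource calculator's divide, and Math.round standing in for Resources.multiply.

```java
public class AbsoluteTemplateMathSketch {

  // Step 1: template minimum relative to the parent's configured minimum.
  static double capacity(long templateMinMb, long parentMinMb) {
    return (double) templateMinMb / parentMinMb;
  }

  // Step 1: absolute capacity = capacity * parent's absolute capacity.
  static double absoluteCapacity(double capacity, double parentAbsolute) {
    return capacity * parentAbsolute;
  }

  // Step 2: back to a resource = cluster resource * absolute capacity.
  static long effectiveMin(long clusterMb, double absoluteCapacity) {
    return Math.round(clusterMb * absoluteCapacity);
  }

  public static void main(String[] args) {
    long clusterMb = 102400;      // whole cluster (label resource)
    long parentMinMb = 40960;     // parent's configured minimum
    double parentAbsolute = 0.4;  // parent's share of the cluster
    long templateMinMb = 10240;   // leaf template's configured minimum

    double cap = capacity(templateMinMb, parentMinMb);
    double abs = absoluteCapacity(cap, parentAbsolute);
    // prints 10240: the template minimum is recovered
    System.out.println(effectiveMin(clusterMb, abs));
  }
}
```

With these numbers the round trip from absolute resource to ratio and back returns the template's configured minimum, which is the consistency the comment describes.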

3. Then, in LeafQueue's updateClusterResource:

 
{code:java}
public void updateClusterResource(Resource clusterResource,
    ResourceLimits currentResourceLimits) {
  writeLock.lock();
  try {
    ...

[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-11 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17262562#comment-17262562
 ] 

zhuqi edited comment on YARN-10504 at 1/11/21, 10:42 AM:
-

[~wangda] [~bteke]

When absolute resources are used in the autoCreatedLeafQueue, the logic is:

1. Initialize and update the absolute template:

 
{code:java}
protected AutoCreatedLeafQueueConfig.Builder initializeLeafQueueConfigs()
    throws IOException {

  // Update the absolute template
  if (this.capacityConfigType.equals(CapacityConfigType.ABSOLUTE_RESOURCE)) {
    updateQueueCapacities(queueCapacities);
  }

  ...

  builder.capacities(queueCapacities);
  return builder;
}

//get resource value in 
private Resource internalGetLabeledResourceRequirementForQueue(String queue,
    String label, Set<String> resourceTypes, String suffix) {
  ...
  if (matcher.find()) {
    // Get the sub-group.
    String subGroup = matcher.group(0);
    if (subGroup.trim().isEmpty()) {
      return Resources.none();
    }
    subGroup = subGroup.substring(1, subGroup.length() - 1);
    for (String kvPair : subGroup.trim().split(",")) {
      String[] splits = kvPair.split("=");

      // Ensure that each sub string is key value pair separated by '='.
      if (splits != null && splits.length > 1) {
        updateResourceValuesFromConfig(resourceTypes, resource, splits);
      }
    }
  }
  ...
  return resource;
}

// Update in ManagedParentQueue's updateQueueCapacities
private void updateQueueCapacities(QueueCapacities queueCapacities) {
  for (String label : queueCapacities.getExistingNodeLabels()) {
    queueCapacities.setCapacity(label,
        this.csContext.getResourceCalculator().divide(
            this.csContext.getClusterResource(),
            this.csContext.getConfiguration().getMinimumResourceRequirement(
                label,
                this.csContext.getConfiguration()
                    .getAutoCreatedQueueTemplateConfPrefix(getQueuePath()),
                resourceTypes),
            getQueueResourceQuotas().getConfiguredMinResource(label)));

    Resource childMaxResource = this.csContext.getConfiguration()
        .getMaximumResourceRequirement(label,
            this.csContext.getConfiguration()
                .getAutoCreatedQueueTemplateConfPrefix(getQueuePath()),
            resourceTypes);
    Resource parentMaxRes = getQueueResourceQuotas()
        .getConfiguredMaxResource(label);

    Resource effMaxResource = Resources.min(
        this.csContext.getResourceCalculator(),
        this.csContext.getClusterResource(),
        childMaxResource.equals(Resources.none()) ? parentMaxRes
            : childMaxResource,
        parentMaxRes);

    queueCapacities.setMaximumCapacity(
        label, this.csContext.getResourceCalculator().divide(
            this.csContext.getClusterResource(),
            effMaxResource,
            getQueueResourceQuotas().getConfiguredMaxResource(label)));

    queueCapacities.setAbsoluteCapacity(
        label, queueCapacities.getCapacity(label)
            * getQueueCapacities().getAbsoluteCapacity(label));

    queueCapacities.setAbsoluteMaximumCapacity(label,
        queueCapacities.getMaximumCapacity(label)
            * getQueueCapacities().getAbsoluteMaximumCapacity(label));
  }
} {code}
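To make the parsing step in the snippet above easier to follow, here is a minimal, self-contained sketch of the same idea. It is not the Hadoop code: the class name, the regular expression and the assumption that every value is a plain integer (no unit suffix) are mine, chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only, not part of the scheduler.
class AbsoluteResourceParserSketch {
  // Assumed shape of an absolute resource value: "[key=value,key=value]".
  private static final Pattern BRACKETED = Pattern.compile("\\[[^\\]]*\\]");

  static Map<String, Long> parse(String configured) {
    Map<String, Long> resource = new LinkedHashMap<>();
    Matcher matcher = BRACKETED.matcher(configured);
    if (!matcher.find()) {
      return resource; // not written in the bracketed absolute form
    }
    String subGroup = matcher.group(0);
    // Strip the surrounding brackets.
    subGroup = subGroup.substring(1, subGroup.length() - 1);
    if (subGroup.trim().isEmpty()) {
      return resource;
    }
    for (String kvPair : subGroup.trim().split(",")) {
      String[] splits = kvPair.split("=");
      // Skip anything that is not a key=value pair.
      if (splits.length > 1) {
        // Assumption: values are plain integers without units.
        resource.put(splits[0].trim(), Long.parseLong(splits[1].trim()));
      }
    }
    return resource;
  }

  public static void main(String[] args) {
    // prints {memory=10240, vcores=12}
    System.out.println(parse("[memory=10240,vcores=12]"));
  }
}
```

A percentage-style value such as "50" does not match the bracket pattern, so the sketch returns an empty map for it.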
2. Now the capacity has been updated to an absolute-resource-based value in 
addChildQueue.

It is converted back to the absolute resource in setEffectiveMinResource:

 
{code:java}
public void mergeCapacities(QueueCapacities capacities) {
  for (String nodeLabel : capacities.getExistingNodeLabels()) {
    queueCapacities.setCapacity(nodeLabel,
        capacities.getCapacity(nodeLabel));
    queueCapacities.setAbsoluteCapacity(nodeLabel, capacities
        .getAbsoluteCapacity(nodeLabel));
    queueCapacities.setMaximumCapacity(nodeLabel, capacities
        .getMaximumCapacity(nodeLabel));
    queueCapacities.setAbsoluteMaximumCapacity(nodeLabel, capacities
        .getAbsoluteMaximumCapacity(nodeLabel));

    Resource resourceByLabel = labelManager.getResourceByLabel(nodeLabel,
        csContext.getClusterResource());
    getQueueResourceQuotas().setEffectiveMinResource(nodeLabel,
        Resources.multiply(resourceByLabel,
            queueCapacities.getAbsoluteCapacity(nodeLabel)));
    getQueueResourceQuotas().setEffectiveMaxResource(nodeLabel,
        Resources.multiply(resourceByLabel, queueCapacities
            .getAbsoluteMaximumCapacity(nodeLabel)));
  }
}{code}
The effective resources have already been updated on the AutoCreatedLeafQueue.

The result is now consistent with the original test case, because it is also 
computed with Resources.multiply(resourceByLabel, 
queueCapacities.getAbsoluteCapacity(nodeLabel)).
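The round trip in steps 1 and 2 can be checked with plain numbers. The sketch below is only an illustration under stated assumptions: a single resource dimension (memory in MB), made-up cluster and queue sizes, a divide that returns the ratio of the template minimum to the parent's configured minimum, and Math.round standing in for whatever rounding Resources.multiply applies.

```java
// Hypothetical arithmetic sketch, memory only; not the scheduler's classes.
class AbsoluteRoundTripSketch {
  // Step 1: relative capacity = template minimum / parent's configured minimum.
  static double capacity(long templateMinMb, long parentConfiguredMinMb) {
    return (double) templateMinMb / parentConfiguredMinMb;
  }

  // Step 1: absolute capacity = relative capacity * parent's absolute capacity.
  static double absoluteCapacity(double capacity, double parentAbsCapacity) {
    return capacity * parentAbsCapacity;
  }

  // Step 2: effective minimum = resource of the label * absolute capacity.
  // Math.round is an assumption of this sketch.
  static long effectiveMinMb(long labelResourceMb, double absCapacity) {
    return Math.round(labelResourceMb * absCapacity);
  }

  public static void main(String[] args) {
    long clusterMb = 102400;    // illustrative cluster size for the label
    long parentMinMb = 40960;   // illustrative parent configured minimum
    long templateMinMb = 10240; // illustrative leaf template minimum

    double parentAbs = (double) parentMinMb / clusterMb;
    double cap = capacity(templateMinMb, parentMinMb);
    double abs = absoluteCapacity(cap, parentAbs);
    // The configured template minimum comes back as the effective minimum.
    System.out.println(effectiveMinMb(clusterMb, abs));
  }
}
```

With these numbers the leaf's relative capacity is 0.25 of the parent and the value recovered in step 2 equals the configured template minimum.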

3. Then, in LeafQueue's updateClusterResource:

 
{code:java}
public void updateClusterResource(Resource clusterResource,
    ResourceLimits currentResourceLimits) {
  writeLock.lock();
  try {
    ...

[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-09 Thread zhuqi (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17261872#comment-17261872
 ] 

zhuqi edited comment on YARN-10504 at 1/9/21, 2:07 PM:
---

[~wangda]  [~bteke] [~gandras]

1. {{updateAbsoluteCapacitiesAndRelatedFields}} should update 
maxApplications, but in some cases this goes wrong, for example in 
{{TestCapacitySchedulerAutoQueueCreation}} -> 
{{testAutoCreatedQueueActivationDeactivation}}:

 
{code:java}
// Submit user_3 app. This can't be allocated since there is no capacity
// in NO_LABEL, SSD but can be in GPU label
submitApp(mockRM, parentQueue, USER3, USER3, 4, 1);
final CSQueue user3LeafQueue = cs.getQueue(USER3);
validateCapacities((AutoCreatedLeafQueue) user3LeafQueue, 0.0f, 0.0f,
    1.0f, 1.0f);
validateCapacitiesByLabel((ManagedParentQueue) parentQueue,
    (AutoCreatedLeafQueue) user3LeafQueue, NODEL_LABEL_GPU);
{code}
In this case the user_3 AutoCreatedLeafQueue has no capacity, so in 
{{updateAbsoluteCapacitiesAndRelatedFields}}:

 

 
{code:java}
private void updateAbsoluteCapacitiesAndRelatedFields() {
  updateAbsoluteCapacities();
  CapacitySchedulerConfiguration schedulerConf = csContext.getConfiguration();

  // If maxApplications not set, use the system total max app, apply newly
  // calculated abs capacity of the queue.
  if (maxApplications <= 0) {
    int maxSystemApps = schedulerConf.
        getMaximumSystemApplications();
    maxApplications =
        (int) (maxSystemApps * queueCapacities.getAbsoluteCapacity());
  }
  maxApplicationsPerUser = Math.min(maxApplications,
      (int) (maxApplications * (usersManager.getUserLimit() / 100.0f)
          * usersManager.getUserLimitFactor()));
}

// The absolute capacity is 0 here because the capacities are updated to 0:
if (availableCapacity >= leafQueueTemplateCapacities
    .getAbsoluteCapacity(nodeLabel)) {
  updateCapacityFromTemplate(capacities, nodeLabel);
  activate(leafQueue, nodeLabel);
} else {
  updateToZeroCapacity(capacities, nodeLabel);
}

// And because the update runs after reinitializeFromTemplate:
final AutoCreatedLeafQueueConfig initialLeafQueueTemplate =
    queueManagementPolicy.getInitialLeafQueueConfiguration(leafQueue);
leafQueue.reinitializeFromTemplate(initialLeafQueueTemplate);

// Do one update cluster resource call to make sure all absolute resources
// effective resources are updated.
updateClusterResource(this.csContext.getClusterResource(),
    new ResourceLimits(this.csContext.getClusterResource()));{code}
As a result, maxApplications and maxApplicationsPerUser will both be 0.
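The arithmetic can be reproduced in isolation. The sketch below copies only the two formulas from {{updateAbsoluteCapacitiesAndRelatedFields}} into a hypothetical standalone class; the system-wide limit of 10000, the user limit of 100 percent and the user limit factor of 1 are illustrative inputs, not values read from the test.

```java
// Hypothetical standalone copy of the two formulas, for illustration only.
class MaxApplicationsSketch {
  // maxApplications = (int) (maxSystemApps * absoluteCapacity)
  static int maxApplications(int maxSystemApps, float absoluteCapacity) {
    return (int) (maxSystemApps * absoluteCapacity);
  }

  // maxApplicationsPerUser is capped by maxApplications.
  static int maxApplicationsPerUser(int maxApplications,
      float userLimitPercent, float userLimitFactor) {
    return Math.min(maxApplications,
        (int) (maxApplications * (userLimitPercent / 100.0f)
            * userLimitFactor));
  }

  public static void main(String[] args) {
    // A leaf queue whose capacity was set to zero.
    int zeroCapacityApps = maxApplications(10000, 0.0f);
    System.out.println(zeroCapacityApps);                              // 0
    System.out.println(maxApplicationsPerUser(zeroCapacityApps, 100f, 1f)); // 0

    // The same queue with half of the cluster, for comparison.
    System.out.println(maxApplications(10000, 0.5f));                  // 5000
  }
}
```

Because the per-user value is capped by maxApplications, any queue whose absolute capacity is 0 at the time of the calculation ends up with 0 for both limits, whatever the user limit settings are.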

 

So we should handle this in the new logic at 

//TODO recalculate max applications because they can depend on capacity 

The TODO should be removed: either skip the AutoCreatedLeafQueue case for now, 
or add logic that sets this case's maxApplications to a fixed default number.

 

2. As mentioned by [~bteke]:

"Sharing my latest findings on TestAbsoluteResourceWithAutoQueue failure: 
{{AutoCreatedLeafQueue#reinitializeFromTemplate}} was refactored, now the 
getting and merging the QueueCapacities happens *before* calling the 
{{ParentQueue#updateClusterResource}} (and 
{{LeafQueue#updateClusterResource}}). In {{LeafQueue#updateClusterResource}} 
the {{AbstractCSQueue#updateEffectiveResources}} is called where the 
effectiveMinResource of the created queue is overridden with the template's 
effectiveMinResources which is exactly the same the test is getting in the 
asserts."

We should change {{LeafQueue#updateClusterResource}} to:
{code:java}
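public void updateClusterResource(Resource clusterResource,
    ResourceLimits currentResourceLimits) {
  writeLock.lock();
  try {
    ...

    // Keep the effective resources merged from the template for
    // auto-created leaf queues; recompute them for all other queues.
    if (!(this instanceof AutoCreatedLeafQueue)) {
      super.updateEffectiveResources(clusterResource);
    }

    ...
  } finally {
    writeLock.unlock();
  }
}{code}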
This will fix the absolute resource case in TestAbsoluteResourceWithAutoQueue.
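A toy model of the ordering problem and of the proposed guard, for illustration only. The two classes and their fields are hypothetical stand-ins, not the real LeafQueue and AutoCreatedLeafQueue, and the recomputed value is just a placeholder number.

```java
// Hypothetical stand-in for a leaf queue, memory only.
class LeafQueueSketch {
  long effectiveMinMb;
  // Placeholder for whatever the generic recomputation would produce.
  private final long recomputedMinMb;

  LeafQueueSketch(long recomputedMinMb) {
    this.recomputedMinMb = recomputedMinMb;
  }

  // Runs first: takes the effective minimum from the template.
  void mergeCapacities(long templateEffectiveMinMb) {
    effectiveMinMb = templateEffectiveMinMb;
  }

  // Generic recomputation that overwrites the merged value.
  void updateEffectiveResources() {
    effectiveMinMb = recomputedMinMb;
  }

  // Runs second: with the guard, auto-created queues keep the merged value.
  void updateClusterResource() {
    if (!(this instanceof AutoCreatedLeafQueueSketch)) {
      updateEffectiveResources();
    }
  }
}

// Hypothetical stand-in for an auto-created leaf queue.
class AutoCreatedLeafQueueSketch extends LeafQueueSketch {
  AutoCreatedLeafQueueSketch(long recomputedMinMb) {
    super(recomputedMinMb);
  }

  public static void main(String[] args) {
    LeafQueueSketch auto = new AutoCreatedLeafQueueSketch(0);
    auto.mergeCapacities(10240);
    auto.updateClusterResource();
    System.out.println(auto.effectiveMinMb); // 10240, merged value kept
  }
}
```

An overridable hook in the subclass would avoid the instanceof check in the parent class; the sketch keeps the check only to mirror the proposal above.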

Do you have any other advice?

Thanks.


was (Author: zhuqi):
[~wangda]  [~bteke]

1. {{updateAbsoluteCapacitiesAndRelatedFields}} should update 
maxApplications, but in some cases this goes wrong, for example in 
{{TestCapacitySchedulerAutoQueueCreation}} -> 
{{testAutoCreatedQueueActivationDeactivation}}:

 
{code:java}
// Submit user_3 app. This can't be allocated since there is no capacity
// in NO_LABEL, SSD but can be in GPU label
submitApp(mockRM, parentQueue, USER3, USER3, 4, 1);
final CSQueue user3LeafQueue = cs.getQueue(USER3);
validateCapacities((AutoCreatedLeafQueue) user3LeafQueue, 0.0f, 0.0f,
    1.0f, 1.0f);
validateCapacitiesByLabel((ManagedParentQueue) parentQueue,
    (AutoCreatedLeafQueue) user3LeafQueue, NODEL_LABEL_GPU);
{code}
In this case the user_3 AutoCreatedLeafQueue has no capacity, so in 
{{updateAbsoluteCapacitiesAndRelatedFields}}:

 

 
{code:java}
private void updateAbsoluteCapacitiesAndRelatedFields() {
  updateAbsoluteCapacities();
  CapacitySchedulerConfiguration schedulerConf = csContext.getConfiguration();

  // If maxApplications not set, use the system total max app, apply newly
  // 

[jira] [Comment Edited] (YARN-10504) Implement weight mode in Capacity Scheduler

2021-01-06 Thread Benjamin Teke (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17260292#comment-17260292
 ] 

Benjamin Teke edited comment on YARN-10504 at 1/7/21, 7:53 AM:
---

[~wangda],

 

Regarding the AutoCreatedLeafQueue failures: we are looking into the issue. 
Currently it seems like the way the mock setup is structured the 
updateClusterResource is not called at the correct time. We'll update the patch 
once the issue is verified and fixed.


was (Author: bteke):
[~wangda],

 

Regarding the AutoCreatedLeafQueue: we are looking into the issue. Currently it 
seems like the way the mock setup is structured the updateClusterResource is 
not called at the correct time. We'll update the patch once the issue is fixed.

> Implement weight mode in Capacity Scheduler
> ---
>
> Key: YARN-10504
> URL: https://issues.apache.org/jira/browse/YARN-10504
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Benjamin Teke
>Assignee: Benjamin Teke
>Priority: Major
> Attachments: YARN-10504.001.patch, YARN-10504.002.patch, 
> YARN-10504.ver-1.patch, YARN-10504.ver-2.patch, YARN-10504.ver-3.patch
>
>
> To allow the possibility to flexibly create queues in Capacity Scheduler a 
> weight mode should be introduced. The existing {{capacity}} property should 
> be used with a different syntax, e.g.:
> root.users.capacity = (1.0) or ~1.0 or ^1.0 or @1.0
> root.users.capacity = 1.0w
> root.users.capacity = w:1.0
> Weight support should not impact the existing functionality.
>  
> The new functionality should: 
>  * accept and validate the new weight values
>  * enforce a singular mode on the whole queue tree
>  * (re)calculate the relative (percentage-based) capacities based on the 
> weights during launch and every time the queue structure changes


