[jira] [Comment Edited] (YARN-9730) Support forcing configured partitions to be exclusive based on app node label

2019-09-26 Thread Bibin Chundatt (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938302#comment-16938302
 ] 

Bibin Chundatt edited comment on YARN-9730 at 9/26/19 6:00 AM:
---

[~jhung]

Thank you for working on this, and sorry for joining the discussion so late.

{code}
if (ResourceRequest.ANY.equals(req.getResourceName())) {
  SchedulerUtils.enforcePartitionExclusivity(req,
      getRmContext().getExclusiveEnforcedPartitions(),
      asc.getNodeLabelExpression());
}
{code}

Querying the configuration on the AM allocation path is costly; I observed 
this while evaluating performance.
Could you optimize {{getRmContext().getExclusiveEnforcedPartitions()}}? It is 
invoked for every *request*.
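
A minimal sketch of one possible optimization, assuming the list of enforced partitions only changes when the configuration is reloaded: parse the property once and serve a cached, immutable set on every later call. The class {{CachedPartitionContext}}, its property key, and {{refresh()}} are hypothetical stand-ins for illustration, not the real RMContext API or the change that was committed.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the RM context; not the real RMContext API.
class CachedPartitionContext {
  // Hypothetical property name, used only for this illustration.
  static final String KEY = "example.exclusive-enforced-partitions";

  // Counts how often the raw property is parsed, to make the caching visible.
  final AtomicInteger parseCount = new AtomicInteger();

  private final Map<String, String> conf;
  // null until the first lookup, and again after refresh().
  private volatile Set<String> cached;

  CachedPartitionContext(Map<String, String> conf) {
    this.conf = conf;
  }

  // Hot path: returns the cached set and parses only when nothing is cached.
  // Two threads racing on the first call may both parse; the result is the
  // same, so the race is harmless in this sketch.
  Set<String> getExclusiveEnforcedPartitions() {
    Set<String> result = cached;
    if (result == null) {
      result = parse(conf.get(KEY));
      cached = result;
    }
    return result;
  }

  // Call when the configuration is reloaded so the next lookup re-parses.
  void refresh() {
    cached = null;
  }

  private Set<String> parse(String raw) {
    parseCount.incrementAndGet();
    if (raw == null || raw.trim().isEmpty()) {
      return Collections.emptySet();
    }
    Set<String> out = new HashSet<>();
    for (String part : raw.split(",")) {
      String trimmed = part.trim();
      if (!trimmed.isEmpty()) {
        out.add(trimmed);
      }
    }
    return Collections.unmodifiableSet(out);
  }
}
```

With this shape, the per-request cost is a single volatile read rather than a configuration lookup and string split.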











> Support forcing configured partitions to be exclusive based on app node label
> -
>
> Key: YARN-9730
> URL: https://issues.apache.org/jira/browse/YARN-9730
> Project: Hadoop YARN
> Issue Type: Task
> Reporter: Jonathan Hung
> Assignee: Jonathan Hung
> Priority: Major
> Labels: release-blocker
> Fix For: 2.10.0, 3.3.0, 3.2.2, 3.1.4
>
> Attachments: YARN-9730-branch-2.001.patch, YARN-9730.001.addendum, 
> YARN-9730.001.patch, YARN-9730.002.addendum, YARN-9730.002.patch, 
> YARN-9730.003.patch
>
>
> Use case: queue X has all of its workload in a non-default (exclusive) 
> partition P (by setting the app submission context's node label to P). A 
> node in partition Q != P heartbeats to the RM. The capacity scheduler loops 
> through every application in X, and every scheduler key in each application, 
> and fails to allocate each time because the app's requested label and the 
> node's label don't match. This causes a huge performance degradation when the 
> number of apps in X is large.
> To fix the issue, allow the RM to configure partitions as "forced-exclusive". 
> If partition P is "forced-exclusive", then:
>  * 1a. If an app sets its submission context's node label to P, all its 
> resource requests will be overridden to P.
>  * 1b. If an app sets its submission context's node label to Q, any of its 
> resource requests whose label is P will be overridden to Q.
>  * 2. In the scheduler, we add apps with node label expression P to a 
> separate data structure. When a node in partition P heartbeats to the 
> scheduler, we only try to schedule apps in this data structure. When a node 
> in partition Q heartbeats to the scheduler, we schedule the rest of the apps 
> as normal.
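
The rules in the description above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: the class and method names are hypothetical, labels are plain strings, apps are represented by their IDs, and an app without a node label is passed through as {{null}}.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; names and signatures are assumptions for illustration.
class PartitionExclusivitySketch {

  // Returns the label a resource request should carry after enforcement.
  static String enforce(String requestLabel, String appLabel,
      Set<String> enforced) {
    if (enforced == null || enforced.isEmpty()) {
      return requestLabel; // nothing is forced-exclusive
    }
    // Rule 1a: the app targets an enforced partition, so every request
    // is overridden to the app's label.
    if (appLabel != null && enforced.contains(appLabel)) {
      return appLabel;
    }
    // Rule 1b: the request targets an enforced partition that the app did
    // not select, so it is overridden to the app's label.
    if (requestLabel != null && enforced.contains(requestLabel)) {
      return appLabel;
    }
    return requestLabel;
  }

  // Rule 2: when a node heartbeats, pick which apps to consider.
  // enforcedApps maps each enforced partition to the apps that target it.
  static List<String> candidates(String nodePartition, Set<String> enforced,
      Map<String, List<String>> enforcedApps, List<String> otherApps) {
    if (nodePartition != null && enforced != null
        && enforced.contains(nodePartition)) {
      return enforcedApps.getOrDefault(nodePartition,
          Collections.emptyList());
    }
    return otherApps;
  }
}
```

Keeping the enforced apps in their own map means a heartbeat from a node outside partition P never iterates over apps that can only run in P, which is the loop the use case identifies as the bottleneck.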



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9730) Support forcing configured partitions to be exclusive based on app node label

2019-09-25 Thread Bibin Chundatt (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938302#comment-16938302
 ] 

Bibin Chundatt edited comment on YARN-9730 at 9/26/19 5:58 AM:
---

[~jhung]

Thank you for working on this, and sorry for joining the discussion so late.

{quote}
if (ResourceRequest.ANY.equals(req.getResourceName())) {
  SchedulerUtils.enforcePartitionExclusivity(req,
      getRmContext().getExclusiveEnforcedPartitions(),
      asc.getNodeLabelExpression());
}
{quote}

Querying the configuration on the AM allocation path is costly; I observed 
this while evaluating performance.
Could you optimize {{getRmContext().getExclusiveEnforcedPartitions()}}? It is 
invoked for every *request*.

















[jira] [Comment Edited] (YARN-9730) Support forcing configured partitions to be exclusive based on app node label

2019-09-25 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16938052#comment-16938052
 ] 

Jonathan Hung edited comment on YARN-9730 at 9/25/19 8:40 PM:
--

Thanks for reporting. I think this happens because we read this setting from 
the RMContext's conf, which is not initialized in the test cases. YARN-8468 
adds TestRMAppManager, which passes the test conf to RMAppManager, so it is 
fixed in later versions.

It is probably easiest to add a null check so we don't have to fix all the 
test cases. I uploaded the 001 addendum for this.
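
For illustration only, a null check of the kind described might look like the sketch below. The addendum patch itself is not shown in this thread, so the class name, method signature, and use of a plain map in place of the Hadoop configuration object are all assumptions.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; a plain Map stands in for the configuration object.
class NullSafePartitionLookup {

  // Returns the configured partitions, or an empty set when the
  // configuration (or the property) is missing, as in uninitialized tests.
  static Set<String> exclusiveEnforcedPartitions(Map<String, String> conf,
      String key) {
    if (conf == null) {
      return Collections.emptySet(); // the null check: no conf, no enforcement
    }
    String raw = conf.get(key);
    if (raw == null) {
      return Collections.emptySet();
    }
    Set<String> partitions = new LinkedHashSet<>();
    for (String part : raw.split(",")) {
      String trimmed = part.trim();
      if (!trimmed.isEmpty()) {
        partitions.add(trimmed);
      }
    }
    return partitions;
  }
}
```

Returning an empty set rather than {{null}} lets callers treat "no configuration" the same as "no enforced partitions" without further checks.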





