[jira] [Commented] (YARN-9623) Auto adjust max queue length of app activities to make sure activities on all nodes can be covered

2019-07-01 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875997#comment-16875997
 ] 

Tao Yang commented on YARN-9623:


[~cheersyang], I have created YARN-9658 to fix these UT failures.

> Auto adjust max queue length of app activities to make sure activities on all 
> nodes can be covered
> --
>
> Key: YARN-9623
> URL: https://issues.apache.org/jira/browse/YARN-9623
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9623.001.patch, YARN-9623.002.patch
>
>
> Currently we can use the configuration entry 
> "yarn.resourcemanager.activities-manager.app-activities.max-queue-length" to 
> control the max queue length of app activities, but in some scenarios this 
> configuration may need to be updated as the cluster grows. Moreover, it's 
> better if users can ignore that conf, therefore it should be auto adjusted 
> internally.
>  There are some differences among different scheduling modes:
>  * multi-node placement disabled
>  ** Heartbeat driven scheduling: the max queue length of app activities should 
> not be less than the number of nodes. Considering that nodes are not always 
> visited in order, we should leave some room for misordering; for example, we 
> can guarantee that the max queue length is not less than 1.2 * numNodes.
>  ** Async scheduling: every async scheduling thread goes through all nodes in 
> order; in this mode, we should guarantee that the max queue length is at least 
> numThreads * numNodes.
>  * multi-node placement enabled: activities on all nodes can be involved in a 
> single app allocation, therefore there's no need to adjust for this mode.
> To sum up, we can adjust the max queue length of app activities like this:
> {code}
> int configuredMaxQueueLength;
> int maxQueueLength;
> serviceInit(){
>   ...
>   configuredMaxQueueLength = ...; // read configured max queue length
>   maxQueueLength = configuredMaxQueueLength; // take configured value as default
> }
> CleanupThread#run(){
>   ...
>   if (multiNodeDisabled) {
>     if (asyncSchedulingEnabled) {
>       maxQueueLength = max(configuredMaxQueueLength,
>           numSchedulingThreads * numNodes);
>     } else {
>       // 1.2 * numNodes is a double, so cast before assigning to an int
>       maxQueueLength = max(configuredMaxQueueLength, (int) (1.2 * numNodes));
>     }
>   } else if (maxQueueLength != configuredMaxQueueLength) {
>     // multi-node placement enabled: fall back to the configured value
>     maxQueueLength = configuredMaxQueueLength;
>   }
> }
> {code}
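
The sizing rule quoted above could be sketched as a small self-contained Java class. This is only an illustration under stated assumptions: the class name, the method name, and the choice to round 1.2 * numNodes up are hypothetical and are not taken from the attached patches.

```java
// Illustrative sketch only; names are hypothetical, not from the patch.
final class AppActivitiesQueueLengthSketch {

  /**
   * Returns the effective max queue length for app activities.
   * The configured value always acts as the lower bound.
   */
  static int adjust(int configuredMax, int numNodes,
      boolean multiNodeEnabled, boolean asyncSchedulingEnabled,
      int numSchedulingThreads) {
    if (multiNodeEnabled) {
      // One allocation can already cover all nodes: no adjustment needed.
      return configuredMax;
    }
    long needed;
    if (asyncSchedulingEnabled) {
      // Every async scheduling thread walks through all nodes.
      needed = (long) numSchedulingThreads * numNodes;
    } else {
      // Heartbeat-driven: 1.2 * numNodes, rounded up with integer math
      // (rounding up is an assumption of this sketch).
      needed = (6L * numNodes + 4) / 5;
    }
    // Clamp to the int range, then apply the configured lower bound.
    long result = Math.max((long) configuredMax, needed);
    return (int) Math.min(Integer.MAX_VALUE, result);
  }

  public static void main(String[] args) {
    // Heartbeat-driven, 1000 nodes, configured 100
    System.out.println(adjust(100, 1000, false, false, 1));
    // Async scheduling with 3 threads, 1000 nodes, configured 100
    System.out.println(adjust(100, 1000, false, true, 3));
    // Multi-node placement enabled: configured value is kept
    System.out.println(adjust(100, 1000, true, true, 3));
  }
}
```

Keeping the calculation in a pure function like this makes it easy to unit test without a running scheduler.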



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9623) Auto adjust max queue length of app activities to make sure activities on all nodes can be covered

2019-07-01 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875986#comment-16875986
 ] 

Weiwei Yang commented on YARN-9623:
---

Hi [~Tao Yang], please create a new issue to fix this failure. Thanks.

 




[jira] [Commented] (YARN-9623) Auto adjust max queue length of app activities to make sure activities on all nodes can be covered

2019-06-30 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875952#comment-16875952
 ] 

Tao Yang commented on YARN-9623:


Thanks [~cheersyang].
I just noticed that the Jenkins report has related failures in TestLeafQueue. If 
there's no YARN configuration in the mock RMContext, the cleanup interval can't 
be initialized to 5 seconds by default, causing the cleanup thread to run 
repeatedly without any interval, which may cause problems for mock objects.
Adding a default value for ActivitiesManager#activitiesCleanupIntervalMs can 
solve this problem in UT.  Should I create a new issue or update this issue?
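
The idea of giving the interval a field-level default might look roughly like the sketch below. It is a simplified illustration, not the actual ActivitiesManager code: the class, the map-based configuration, and the key name are all hypothetical stand-ins.

```java
import java.util.Collections;
import java.util.Map;

// Illustrative sketch only; not the real ActivitiesManager.
final class CleanupIntervalSketch {
  // 5 seconds, the default mentioned in the comment above.
  static final long DEFAULT_CLEANUP_INTERVAL_MS = 5000L;
  // Hypothetical config key, used only for this illustration.
  static final String INTERVAL_KEY = "sketch.cleanup-interval-ms";

  // Field-level default: valid even if init() never sees a configuration.
  private long activitiesCleanupIntervalMs = DEFAULT_CLEANUP_INTERVAL_MS;

  void init(Map<String, String> conf) {
    if (conf == null) {
      return; // e.g. a mock context without any configuration
    }
    String value = conf.get(INTERVAL_KEY);
    if (value != null) {
      activitiesCleanupIntervalMs = Long.parseLong(value);
    }
  }

  long getCleanupIntervalMs() {
    return activitiesCleanupIntervalMs;
  }

  public static void main(String[] args) {
    CleanupIntervalSketch noConf = new CleanupIntervalSketch();
    noConf.init(null);
    System.out.println(noConf.getCleanupIntervalMs());

    CleanupIntervalSketch withConf = new CleanupIntervalSketch();
    withConf.init(Collections.singletonMap(INTERVAL_KEY, "1000"));
    System.out.println(withConf.getCleanupIntervalMs());
  }
}
```

With the default set on the field, a cleanup loop that sleeps for this interval never ends up with a zero sleep when the configuration is missing.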




[jira] [Commented] (YARN-9623) Auto adjust max queue length of app activities to make sure activities on all nodes can be covered

2019-06-28 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874994#comment-16874994
 ] 

Weiwei Yang commented on YARN-9623:
---

Pushed to trunk, thanks for the contribution [~Tao Yang].




[jira] [Commented] (YARN-9623) Auto adjust max queue length of app activities to make sure activities on all nodes can be covered

2019-06-28 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874995#comment-16874995
 ] 

Hudson commented on YARN-9623:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16831 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16831/])
YARN-9623. Auto adjust max queue length of app activities to make sure (wwei: 
rev cbae2413201bc470b5f16421ea69d1cd9edb64a8)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/activities/ActivitiesManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/activities/TestActivitiesManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java





[jira] [Commented] (YARN-9623) Auto adjust max queue length of app activities to make sure activities on all nodes can be covered

2019-06-28 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874991#comment-16874991
 ] 

Weiwei Yang commented on YARN-9623:
---

+1, committing shortly.




[jira] [Commented] (YARN-9623) Auto adjust max queue length of app activities to make sure activities on all nodes can be covered

2019-06-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874917#comment-16874917
 ] 

Hadoop QA commented on YARN-9623:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  5m 12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m  8s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 44s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 38s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 56s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 46s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 48s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 42s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}176m 56s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9623 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12973152/YARN-9623.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 58c6b042f9e0 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2

[jira] [Commented] (YARN-9623) Auto adjust max queue length of app activities to make sure activities on all nodes can be covered

2019-06-28 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874825#comment-16874825
 ] 

Tao Yang commented on YARN-9623:


Thanks [~cheersyang] for your comments.
{quote}
If this configuration is set, then the value should be enforced for the queue 
size and disable the auto-adjustment. Can you add that logic?
{quote}
Currently the configuration 
{{yarn.resourcemanager.activities-manager.app-activities.max-queue-length}} is 
still there and can be seen as the lower limit: the max queue length of app 
activities can only be adjusted to a value larger than the configured one. I 
think this should make sense to us as well.

Attached v2 patch, which adds the volatile modifier to 
appActivitiesMaxQueueLength so that updates become visible to other threads as 
soon as possible.




[jira] [Commented] (YARN-9623) Auto adjust max queue length of app activities to make sure activities on all nodes can be covered

2019-06-27 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874168#comment-16874168
 ] 

Weiwei Yang commented on YARN-9623:
---

Hi [~Tao Yang]

OK, I am fine with that. However, we still need the configuration 
{{yarn.resourcemanager.activities-manager.app-activities.max-queue-length}} to 
be there. If this configuration is set, then the value should be enforced for 
the queue size and disable the auto-adjustment. Can you add that logic?

This is to ensure we have a workaround if the auto-calculation is suboptimal. 
Hope that makes sense.

Thanks




[jira] [Commented] (YARN-9623) Auto adjust max queue length of app activities to make sure activities on all nodes can be covered

2019-06-27 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874001#comment-16874001
 ] 

Tao Yang commented on YARN-9623:


Thanks [~cheersyang] for the feedback.
{quote}
However, the activity manager should be a general service, it should not be 
depending on CS's configuration.
{quote}
Yes, I had this concern before, but the required number of app activities is 
indeed decided by the specific scheduler and even by the specific scheduling 
policy inside the scheduler.  So the patch does the same as some general 
services like QueueACLsManager/SchedulerPlacementProcessor/... (using {{if 
scheduler instanceof CapacityScheduler}}). We could ignore the specific 
scheduler only by always setting maxQueueLength to 
max(configuredMaxQueueLength, 1.2 * numOfNodes), but this may waste a lot in a 
large cluster with multi-node placement enabled. Thoughts?

{quote}
Another thing is appActivitiesMaxQueueLength, do we need to make it atomic 
because it is being modified in another thread.
{quote}
There's no need to make it atomic since there are no requirements on ordering 
or consistency, but volatile is necessary for this variable.
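
To illustrate the point about visibility, here is a minimal sketch assuming a single writer (a cleanup-style thread) that only overwrites the value and readers that only read it. The class and method names are hypothetical; only the field name comes from the discussion. A plain overwrite is not a read-modify-write, so a volatile int is sufficient for visibility here, whereas an AtomicInteger would be the tool if compare-and-set or increments were needed.

```java
// Illustrative sketch only; the class is hypothetical.
final class VolatileQueueLengthSketch {
  // volatile: a write by one thread is visible to later reads
  // by other threads without extra locking.
  private volatile int appActivitiesMaxQueueLength;

  VolatileQueueLengthSketch(int configuredMaxQueueLength) {
    this.appActivitiesMaxQueueLength = configuredMaxQueueLength;
  }

  // Called by the single updating thread: a plain overwrite.
  void update(int newLength) {
    appActivitiesMaxQueueLength = newLength;
  }

  // Called by scheduling threads when recording activities.
  int current() {
    return appActivitiesMaxQueueLength;
  }

  public static void main(String[] args) throws InterruptedException {
    VolatileQueueLengthSketch sketch = new VolatileQueueLengthSketch(1000);
    Thread cleanup = new Thread(() -> sketch.update(2400), "cleanup-sketch");
    cleanup.start();
    cleanup.join();
    System.out.println(sketch.current());
  }
}
```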

> Auto adjust max queue length of app activities to make sure activities on all 
> nodes can be covered
> --
>
> Key: YARN-9623
> URL: https://issues.apache.org/jira/browse/YARN-9623
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-9623.001.patch
>
>
> Currently we can use the configuration entry 
> "yarn.resourcemanager.activities-manager.app-activities.max-queue-length" to 
> control the max queue length of app activities, but in some scenarios this 
> configuration may need to be updated as the cluster grows. Moreover, it's 
> better for users not to have to care about that conf, therefore it should be 
> auto adjusted internally.
>  There are some differences among different scheduling modes:
>  * multi-node placement disabled
>  ** Heartbeat driven scheduling: max queue length of app activities should 
> not be less than the number of nodes. Since nodes are not always in order, 
> we should leave some room for misordering; for example, we can guarantee 
> that max queue length is not less than 1.2 * numNodes.
>  ** Async scheduling: every async scheduling thread goes through all nodes in 
> order; in this mode, we should guarantee that max queue length is at least 
> numThreads * numNodes.
>  * multi-node placement enabled: activities on all nodes can be involved in a 
> single app allocation, therefore there's no need to adjust for this mode.
> To sum up, we can adjust the max queue length of app activities like this:
> {code}
> int configuredMaxQueueLength;
> int maxQueueLength;
> serviceInit() {
>   ...
>   configuredMaxQueueLength = ...; // read configured max queue length
>   maxQueueLength = configuredMaxQueueLength; // take configured value as default
> }
> CleanupThread#run() {
>   ...
>   if (multiNodeDisabled) {
>     if (asyncSchedulingEnabled) {
>       // every async scheduling thread walks all nodes
>       maxQueueLength = Math.max(configuredMaxQueueLength,
>           numSchedulingThreads * numNodes);
>     } else {
>       // heartbeat driven: leave 20% room for misordered nodes
>       maxQueueLength = Math.max(configuredMaxQueueLength,
>           (int) (1.2 * numNodes));
>     }
>   } else if (maxQueueLength != configuredMaxQueueLength) {
>     // multi-node placement enabled: fall back to the configured value
>     maxQueueLength = configuredMaxQueueLength;
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9623) Auto adjust max queue length of app activities to make sure activities on all nodes can be covered

2019-06-27 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873966#comment-16873966
 ] 

Weiwei Yang commented on YARN-9623:
---

Hi [~Tao Yang]

Generally, I think this is a good approach, to have fewer configs.

However, the activity manager should be a general service, it should not be 
depending on CS's configuration, for example, the number of async threads. How 
about just letting it be {{1.2 * numOfNodes}} for both cases and seeing how this 
works? We can continue to tune this after we have more experience using it 
in real clusters.

Another thing is {{appActivitiesMaxQueueLength}}, do we need to make it atomic 
because it is being modified in another thread.

Thanks

 







[jira] [Commented] (YARN-9623) Auto adjust max queue length of app activities to make sure activities on all nodes can be covered

2019-06-26 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16873814#comment-16873814
 ] 

Tao Yang commented on YARN-9623:


[~cheersyang], hope to hear your thoughts about this. Thanks.



