[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-10-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616106#comment-15616106
 ] 

Hudson commented on YARN-4963:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10722 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10722/])
YARN-4963. capacity scheduler: Make number of OFF_SWITCH assignments per 
(jlowe: rev 1eae719bcead45915977aa220324650eab3c1b9e)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/conf/capacity-scheduler.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java


> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 2.7.2, 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
>  Labels: oct16-easy
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: YARN-4963.001.patch, YARN-4963.002.patch, 
> YARN-4963.003.patch, YARN-4963.004.patch
>
>
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment 
> per heartbeat. With more and more non MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> Proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-10-27 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613526#comment-15613526
 ] 

Jason Lowe commented on YARN-4963:
--

+1 lgtm.  Will commit this tomorrow if there are no objections.

> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 2.7.2, 3.0.0-alpha1
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
>  Labels: oct16-easy
> Attachments: YARN-4963.001.patch, YARN-4963.002.patch, 
> YARN-4963.003.patch, YARN-4963.004.patch
>
>
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment 
> per heartbeat. With more and more non MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> Proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-10-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613425#comment-15613425
 ] 

Hadoop QA commented on YARN-4963:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 21s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 4 new + 146 unchanged - 1 fixed = 150 total (was 147) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 35m 
19s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Issue | YARN-4963 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12835666/YARN-4963.004.patch |
| Optional Tests |  asflicense  xml  compile  javac  javadoc  mvninstall  
mvnsite  unit  findbugs  checkstyle  |
| uname | Linux 30921385ddd9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / dd4ed6a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/13583/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13583/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13583/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automaticall

[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-05-12 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281866#comment-15281866
 ] 

Wangda Tan commented on YARN-4963:
--

Latest patch looks good, thanks [~nroberts]!

> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 2.7.2
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-4963.001.patch, YARN-4963.002.patch, 
> YARN-4963.003.patch
>
>
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment 
> per heartbeat. With more and more non MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> Proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-05-11 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281035#comment-15281035
 ] 

Wangda Tan commented on YARN-4963:
--

Thanks [~nroberts], few comments:

1) Configuration name:
How about call it: 
yarn.scheduler.capacity.per-node-heartbeat.maximum-offswitch-assignments?
offswitch-per-node-limit is a little confusing to me. And in the future we can 
add other limits under per-node-heartbeat if needed.

2) We may only need to add getOffSwitchNodeLimit to ParentQueue (instead of 
adding it to AbstractCSQueue)

3) (Minor) Logics in ParentQueue:
Add isDebugEnabled for:
{code}
500 LOG.debug("Not assigning more than " + getOffSwitchNodeLimit() +
501 " off-switch containers," +
{code} 

And it's better to print offswitchCount with the debug log.

Thoughts?

> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 2.7.2
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-4963.001.patch, YARN-4963.002.patch
>
>
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment 
> per heartbeat. With more and more non MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> Proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-05-10 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278369#comment-15278369
 ] 

Nathan Roberts commented on YARN-4963:
--

I don't believe test failures are related to this change.

If someone has a few cycles for a review that would be great.  This patch 
covers step #1 agreed to above. step #2 will be handled via YARN-5013.

> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 2.7.2
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-4963.001.patch, YARN-4963.002.patch
>
>
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment 
> per heartbeat. With more and more non MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> Proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15273116#comment-15273116
 ] 

Hadoop QA commented on YARN-4963:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 47s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 38m 52s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 38m 16s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 98m 43s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestRMRestart |
|   | hadoop.yarn.server.resourcemanager.TestContainerResourceUsage |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hado

[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-04-28 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263109#comment-15263109
 ] 

Nathan Roberts commented on YARN-4963:
--

Sorry it took so long to get back to this. I filed YARN-5013 to handle #2. We 
can continue that discussion over there. 


> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 2.7.2
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-4963.001.patch
>
>
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment 
> per heartbeat. With more and more non MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> Proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-04-22 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253902#comment-15253902
 ] 

Naganarasimha G R commented on YARN-4963:
-

Thanks for the clarification [~wangda] & [~nroberts],  yes point 2 addresses 
the same issue and my mistake i missed to read this. And also agree to the 
focus of this jira to be specific to the system level OFF-SWITCH configuration.

bq. so I think when we do the application-level support the default would need 
to be either unlimited or some high value, otherwise we force all applications 
to set this limit to something other than 1 to get decent OFF_SWITCH scheduling 
behavior.
Once we have system level OFF-SWITCH configuration do we require app level 
default also ? IIUC by default we try to make use of system level OFF-SWITCH 
configuration unless explicitly overridden by the app (implementation can be 
further discussed in that jira)
bq. Sure, my application scheduled very quickly but my locality was terrible so 
I caused a lot of unnecessary cross-switch traffic. So I think we'll need some 
system-minimums that will prevent this type of abuse.
This point is debatable, even though i agree your point for controlling 
cross-switch traffic, but still the app is performing under its capacity limits 
so would it be good to limit it control it.
bq. If application A meets its OFF-SWITCH-per-node limit, do we offer the node 
to other applications in the same queue?
any limitations if we offer the node to other applications in the same queue ? 
it should be fine right ?





> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 2.7.2
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-4963.001.patch
>
>
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment 
> per heartbeat. With more and more non MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> Proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-04-20 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250137#comment-15250137
 ] 

Nathan Roberts commented on YARN-4963:
--

bq. IMO, I think application specific configurations should be there rather at 
scheduler level. Some applications are fine with assigning containers in 
off_switch they can specify number of containers to be assigned. But few 
applications are very strict to node locality, they can configure 1 in 
off_switch.

bq. Even i feel the same, any specfic reason it has been set only at the 
scheduler level other than the AMRM interface change ? We can keep the default 
value as 1 so that its still compatible. Also anyway allocation happens within 
app's & queue's capacity limits so i feel it would be ideal for app to decide 
how many allocations in off_switch node. thoughts ?

Thanks [~Naganarasimha], [~rohithsharma], [~leftnoteasy] for the comments. I 
think we're all in agreement that there needs to be some control at the 
application level for things like OFF_SWITCH allocations, and locality delays 
(That's what #2 was going for and I think that should be a separate jira if 
folks are agreeable to that.) This new feature will require some discussion:
- The current value of 1 is not a good value for almost all applications so I 
think when we do the application-level support the default would need to be 
either unlimited or some high value, otherwise we force all applications to set 
this limit to something other than 1 to get decent OFF_SWITCH scheduling 
behavior.
- This setting not only affects the application at hand, but can also affect 
the entire system. I can see many cases where applications will relax these 
settings significantly so that their application schedules faster, however that 
may not have been the right thing for the system as a whole. Sure, my 
application scheduled very quickly but my locality was terrible so I caused a 
lot of unnecessary cross-switch traffic. So I think we'll need some 
system-minimums that will prevent this type of abuse. 
- These changes would potentially affect the fifo-ness of the queues. If 
application A meets its OFF-SWITCH-per-node limit, do we offer the node to 
other applications in the same queue? 

So my suggestion is:
1) Have this jira make the system-level OFF-SWITCH check  configurable so 
admins can easily crank this up and dramatically improve scheduling rate. 
2) Have a second jira to address per-application settings for things like 
locality_delay and off_switch limits.

Reasonable?





> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 2.7.2
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-4963.001.patch
>
>
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment 
> per heartbeat. With more and more non MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> Proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-04-19 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248176#comment-15248176
 ] 

Wangda Tan commented on YARN-4963:
--

[~nroberts],

{code}
So maybe there are two things to do:
1) Have the global OFF_SWITCH check to handle the simple case of avoiding too 
many network-heavy applications on a node. 
2) A feature where applications can specify a 
max_containers_assigned_per_node_per_heartbeat. I think this would be checked 
down in LeafQueue.assignContainers().
{code}

Make sense to me, I agree to limit the scope to OFF_SWITCH allocations in this 
JIRA.

bq. Even i feel the same, any specfic reason it has been set only at the 
scheduler level other than the AMRM interface change ? We can keep the default 
value as 1 so that its still compatible. Also anyway allocation happens within 
app's & queue's capacity limits so i feel it would be ideal for app to decide 
how many allocations in off_switch node. thoughts ?
Beyond maximum off-switch allocation per node heartbeat, there're some other 
scheduler global options we may need to consider to move to per-app. One 
example is locality delays for different locality type.

> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 2.7.2
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-4963.001.patch
>
>
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment 
> per heartbeat. With more and more non MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> Proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-04-17 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245212#comment-15245212
 ] 

Naganarasimha G R commented on YARN-4963:
-

bq.  I think application specific configurations should be there rather at 
scheduler level. 
Even i feel the same, any specfic reason it has been set only at the scheduler 
level other than the AMRM interface change ? We can keep the default value as 1 
so that its still compatible. Also anyway allocation happens within app's & 
queue's capacity limits so i feel it would be ideal for app to decide how many 
allocations in off_switch node. thoughts ?

> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 2.7.2
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-4963.001.patch
>
>
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment 
> per heartbeat. With more and more non MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> Proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-04-17 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245187#comment-15245187
 ] 

Rohith Sharma K S commented on YARN-4963:
-

Thanks [~nroberts] for initiating discussion on this. We have seen off_switch 
assignment issue in large cluster as you described.  Especially when cluster is 
running fully occupied resource, and container release happens all together 
from one node, only 1 container is assigned this node. This makes user thinks 
that why assignment is not happening even though resource is free. 
IMO, I think  application specific configurations should be there rather at 
scheduler level. Some applications are fine with assigning containers in 
off_switch they can specify number of containers to be assigned. But few 
applications are very strict to node locality, they can configure 1 in 
off_switch.
Thoughts?

> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 2.7.2
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-4963.001.patch
>
>
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment 
> per heartbeat. With more and more non MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> Proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-04-15 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243662#comment-15243662
 ] 

Nathan Roberts commented on YARN-4963:
--

Thanks [~leftnoteasy] for the feedback. I agree that it would be a useful 
feature to be able to give some applications better spread, regardless of 
allocation type. Now we just have to figure out how to get there.

My concern is that I don't think we'd want to implement it using the same 
simple approach if it's going to apply to all container types. For example, in 
our case we almost always want NODE_LOCAL and RACK_LOCAL to get scheduled as 
quickly as possible so I'd want the limit to be high, as opposed to OFF_SWITCH 
where I want the limit to be 3-5 to keep a nice balance between scheduling 
performance and clustering. 

The reason this check was introduced in the first place (iirc) was to prevent 
network-heavy applications from loading up on specific nodes. The OFF_SWITCH 
check was a simple way of achieving this at a global level. The feature I think 
you're asking for (please correct me if I misunderstood) is that applications 
should be able to request that container spread be prioritized over timely 
scheduling (kind of like locality delay does today). I completely agree this 
would be a useful knob for applications to have. It is a trade-off though. An 
application that wants really good spread would be sacrificing scheduling 
opportunities that would probably be given to applications behind them in the 
queue (like locality delay).

So maybe there are two things to do:
1) Have the global OFF_SWITCH check to handle the simple case of avoiding too 
many network-heavy applications on a node. 
2) A feature where applications can specify a 
max_containers_assigned_per_node_per_heartbeat. I think this would be checked 
down in LeafQueue.assignContainers().

Even with #2 in place, I don't think #1 could immediately go away because the 
network-heavy applications would need to start properly specifying this limit.

The other approach to get rid of #1 would be when network is a resource. Such 
applications could then request lots of network resource, which should prevent 
clustering.

Does that make any sort of sense?


> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 2.7.2
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-4963.001.patch
>
>
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment 
> per heartbeat. With more and more non MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> Proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-04-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243343#comment-15243343
 ] 

Hadoop QA commented on YARN-4963:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
1s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 patch generated 1 new + 111 unchanged - 0 fixed = 112 total (was 111) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 0s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 39s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 55m 24s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
22s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 149m 4s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.webapp.TestRMWithCSRFFilter |
|   | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification |
|   | 
hadoop.yarn.server.resourcemanager.sch

[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-04-15 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243219#comment-15243219
 ] 

Wangda Tan commented on YARN-4963:
--

[~nroberts], +1 to this feature, this gonna be very useful.

Instead of only limit #off-switch containers allocate per heartbeat, can we 
limit #containers allocation regardless of locality? I can see some values to 
limit #containers for rack/node locality as well. For example, if user wants 
their containers allocated spread across all the cluster. 

> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 2.7.2
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-4963.001.patch
>
>
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment 
> per heartbeat. With more and more non MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> Proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)