[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-21 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110674#comment-15110674
 ] 

Jason Lowe commented on YARN-4610:
--

Whitespace complaints are bogus, and test failures are unrelated.  The 
TestClientRMTokens and TestAMAuthorization failures are the same as in trunk.  
TestSystemMetricsPublisher is failing on branch-2.7 without the patch, filed 
YARN-4623.

> Reservations continue looking for one app causes other apps to starve
> -
>
> Key: YARN-4610
> URL: https://issues.apache.org/jira/browse/YARN-4610
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.7.1
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Blocker
> Attachments: YARN-4610-branch-2.7.002.patch, YARN-4610.001.patch, 
> YARN-4610.branch-2.7.001.patch
>
>
> CapacityScheduler's LeafQueue has "reservations continue looking" logic that 
> allows an application to unreserve elsewhere to fulfil a container request on 
> a node that has available space.  However in 2.7 that logic seems to break 
> allocations for subsequent apps in the queue.  Once a user hits its user 
> limit, subsequent apps in the queue for other users receive containers at a 
> significantly reduced rate.
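For readers unfamiliar with the feature, the "reservations continue looking" behavior described above can be sketched roughly as follows. This is a hedged, self-contained simplification; the class, method, and field names (`ContinueLookingSketch`, `tryAllocate`, `used`, `reserved`, `limit`) are invented for illustration and are not the actual LeafQueue code:

```java
// Hypothetical sketch of "reservations continue looking": when a request
// would exceed the user's limit, the scheduler may drop a reservation held
// elsewhere so the allocation fits on a node that has space right now.
public class ContinueLookingSketch {
    int used;      // resource charged to the user (reservations included)
    int reserved;  // portion of 'used' that is merely reserved on other nodes
    final int limit;

    ContinueLookingSketch(int used, int reserved, int limit) {
        this.used = used;
        this.reserved = reserved;
        this.limit = limit;
    }

    /** Try to place 'needed' resource on a node with 'available' free space. */
    String tryAllocate(int needed, int available) {
        if (needed > available) {
            return "NODE_FULL";               // no space on this node at all
        }
        if (used + needed <= limit) {
            used += needed;
            return "ALLOCATED";               // fits within the user limit
        }
        // "Continue looking": if unreserving elsewhere frees enough headroom,
        // drop the old reservation and take the allocation here instead.
        if (reserved > 0 && used - reserved + needed <= limit) {
            used -= reserved;
            reserved = 0;
            used += needed;
            return "ALLOCATED_AFTER_UNRESERVE";
        }
        return "SKIPPED";                     // over limit, nothing to unreserve
    }
}
```

For example, a user at 8 of a 10-unit limit with 3 units merely reserved elsewhere can still take a 4-unit container on a node with space, because dropping the reservation brings the total back under the limit.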



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-21 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110778#comment-15110778
 ] 

Thomas Graves commented on YARN-4610:
-

+1 for branch-2.7.  After investigating this some more, the original patch of 
setting it to none() works: the parent's limit is passed down and taken into 
account in the leaf calculation.  I think the latter patch is safer, but either 
is fine with me.

As for the master patch, I'm not sure how it takes the max capacity into 
account, so I'll have to look at that more, but the unit tests are passing and 
that would be a separate issue from this fix.  +1 on that patch as well.



[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111084#comment-15111084
 ] 

Hudson commented on YARN-4610:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9153 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9153/])
YARN-4610. Reservations continue looking for one app causes other apps (jlowe: 
rev 468a53b22f4ac5bb079dff986ba849a687d709fe)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java


> Reservations continue looking for one app causes other apps to starve
> -
>
> Key: YARN-4610
> URL: https://issues.apache.org/jira/browse/YARN-4610
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 2.7.1
>Reporter: Jason Lowe
>Assignee: Jason Lowe
>Priority: Blocker
> Fix For: 2.7.3
>
> Attachments: YARN-4610-branch-2.7.002.patch, YARN-4610.001.patch, 
> YARN-4610.branch-2.7.001.patch
>
>
> CapacityScheduler's LeafQueue has "reservations continue looking" logic that 
> allows an application to unreserve elsewhere to fulfil a container request on 
> a node that has available space.  However in 2.7 that logic seems to break 
> allocations for subsequent apps in the queue.  Once a user hits its user 
> limit, subsequent apps in the queue for other users receive containers at a 
> significantly reduced rate.





[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-21 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111048#comment-15111048
 ] 

Jason Lowe commented on YARN-4610:
--

Thanks for the reviews!  Committing this.




[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109845#comment-15109845
 ] 

Hadoop QA commented on YARN-4610:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
17s {color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} branch-2.7 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s 
{color} | {color:green} branch-2.7 passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
28s {color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s 
{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} branch-2.7 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 15s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 in branch-2.7 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} branch-2.7 passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} branch-2.7 passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 27s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 patch generated 1 new + 824 unchanged - 0 fixed = 825 total (was 824) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 3s 
{color} | {color:red} The patch has 1617 line(s) that end in whitespace. Use 
git apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 47s 
{color} | {color:red} The patch has 98 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 11s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 53m 55s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 45m 5s 
{color} | {color:red} Patch generated 77 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 172m 0s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |

[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-20 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108697#comment-15108697
 ] 

Thomas Graves commented on YARN-4610:
-

+1.  Thanks for fixing this. 



[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-20 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109392#comment-15109392
 ] 

Jason Lowe commented on YARN-4610:
--

Thanks for the reviews!  Committing this.



[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-20 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109427#comment-15109427
 ] 

Jason Lowe commented on YARN-4610:
--

Actually the patch doesn't apply to branch-2.7, so I'm working on a patch for that as well.



[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-20 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109327#comment-15109327
 ] 

Jason Lowe commented on YARN-4610:
--

TestAbstractYarnScheduler is failing the same way occasionally on trunk without 
the patch applied.  Filed YARN-4615.



[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-20 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109330#comment-15109330
 ] 

Thomas Graves commented on YARN-4610:
-

Ok, thanks for investigating.  +1 from me; feel free to commit.



[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109511#comment-15109511
 ] 

Hadoop QA commented on YARN-4610:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} docker {color} | {color:blue} 0m 7s 
{color} | {color:blue} Dockerfile 
'/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/dev-support/docker/Dockerfile'
 not found, falling back to built-in. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 13m 55s 
{color} | {color:red} Docker failed to build yetus/hadoop:date2016-01-20. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12783421/YARN-4610.branch-2.7.001.patch
 |
| JIRA Issue | YARN-4610 |
| Powered by | Apache Yetus 0.2.0-SNAPSHOT   http://yetus.apache.org |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/10345/console |


This message was automatically generated.





[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-20 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108902#comment-15108902
 ] 

Jason Lowe commented on YARN-4610:
--

bq. could you take a look at failed tests? Not sure if they're related to this 
fix.

The TestApplicationPriority failure appears to be unrelated.  I tried multiple 
times to reproduce it with the patch applied, and it always passes.

The TestClientRMTokens failure is unrelated.  It's failing the same way for 
other precommit builds, and is being tracked by YARN-4306 / HADOOP-12687.

The TestAMAuthorization failure is unrelated.  It's failing the same way for 
other precommit builds, and is being tracked by YARN-4318 / HADOOP-12687.

The TestClientRMService failure is unrelated.  It's failing the same way in 
some other precommit builds (e.g.: see 
https://builds.apache.org/job/PreCommit-YARN-Build/10285/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt).
  I tried running it multiple times with the patch applied, and it always 
passed.  I couldn't find a tracking ticket for this one, so I filed YARN-4613.

I'm guessing the TestAbstractYarnScheduler failure is unrelated.  Note that it didn't 
fail with JDK7 but does with JDK8.  I tried isolating the test with JDK8, and 
it doesn't fail, yet it fails when run as part of the suite.  I'll investigate 
it further to see if I can narrow down why it's failing in JDK8, but the fact 
that it's passing on another JDK and also passes when run in isolation 
indicates it's likely not a problem with the patch.



[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-20 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108923#comment-15108923
 ] 

Jason Lowe commented on YARN-4610:
--

Forgot to mention that TestApplicationPriority is failing in the same way for 
other precommit builds (e.g.: 
https://builds.apache.org/job/PreCommit-YARN-Build/10177/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_66.txt),
and it only failed on one JDK version, not the other.  I couldn't find a 
tracking ticket for that one, so I filed YARN-4614.



[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-20 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109516#comment-15109516
 ] 

Thomas Graves commented on YARN-4610:
-

Sorry, after looking at this some more I think there might be an issue with 
parent queue max capacities; I'm investigating further.



[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-19 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107542#comment-15107542
 ] 

Jason Lowe commented on YARN-4610:
--

I believe the issue is in LeafQueue#assignToUser.  That method updates the 
amount needed to unreserve for a particular user when that user hits the 
resource limit.  However, that amount never gets reset to zero for the next 
iteration of the loop, so subsequent apps for different users can end up not 
receiving containers, because the scheduler mistakenly thinks it needs to 
unreserve based on that stale value.
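The stale-value pattern described above can be illustrated with a small sketch. This is a hypothetical simplification, not the real assignContainers/assignToUser code; the names (`StaleUnreserveSketch`, `assignPass`, `amountNeededUnreserve`) are invented. The buggy pass carries `amountNeededUnreserve` across loop iterations, so an over-limit user's app poisons allocation for the next user's app; the fixed pass clears it per app:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the stale "amount needed to unreserve" bug: the
// value set for one user's over-limit app is never cleared, so the next
// user's app is wrongly denied a container.
public class StaleUnreserveSketch {
    /** One scheduling pass; each app is {appId, user}. Returns containers granted. */
    static Map<String, Integer> assignPass(List<String[]> apps, int userLimit,
                                           boolean resetPerApp) {
        Map<String, Integer> usedByUser = new LinkedHashMap<>();
        Map<String, Integer> granted = new LinkedHashMap<>();
        int amountNeededUnreserve = 0;       // the value that goes stale
        for (String[] app : apps) {
            if (resetPerApp) {
                amountNeededUnreserve = 0;   // the fix: clear it per iteration
            }
            String user = app[1];
            int used = usedByUser.getOrDefault(user, 0);
            if (used >= userLimit) {
                amountNeededUnreserve = 1;   // this user must unreserve first
                granted.put(app[0], 0);
                continue;
            }
            if (amountNeededUnreserve > 0) {
                // Buggy path: a stale value from a *different* user's app
                // blocks this allocation even though this user is under limit.
                granted.put(app[0], 0);
                continue;
            }
            usedByUser.put(user, used + 1);
            granted.put(app[0], 1);
        }
        return granted;
    }
}
```

With apps a1 and a2 for one user and b1 for another, under a per-user limit of one container, the buggy pass denies b1 a container while the fixed pass grants it.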



[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-19 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107996#comment-15107996
 ] 

Wangda Tan commented on YARN-4610:
--

Thanks [~jlowe], it's a nice finding!

+1 to the fix, but could you take a look at failed tests? Not sure if they're 
related to this fix.



[jira] [Commented] (YARN-4610) Reservations continue looking for one app causes other apps to starve

2016-01-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107959#comment-15107959
 ] 

Hadoop QA commented on YARN-4610:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 
53s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s 
{color} | {color:green} trunk passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 20s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 patch generated 1 new + 55 unchanged - 1 fixed = 56 total (was 56) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 6s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 39s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 17s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_91. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 188m 9s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.TestClientRMService |
|   | hadoop.yarn.server.resourcemanager.scheduler.TestAbstractYarnScheduler |
| JDK v1.7.0_91 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   |