[jira] [Commented] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-26 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044565#comment-14044565 ]

Hudson commented on YARN-2171:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #595 (See
[https://builds.apache.org/job/Hadoop-Yarn-trunk/595/])
YARN-2171. Improved CapacityScheduling to not lock on nodemanager-count when
AMs heartbeat in. Contributed by Jason Lowe. (vinodkv:
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605616)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java


 AMs block on the CapacityScheduler lock during allocate()
 -

 Key: YARN-2171
 URL: https://issues.apache.org/jira/browse/YARN-2171
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.10, 2.4.0
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Fix For: 2.5.0

 Attachments: YARN-2171.patch, YARN-2171v2.patch


 When AMs heartbeat into the RM via the allocate() call they are blocking on 
 the CapacityScheduler lock when trying to get the number of nodes in the 
 cluster via getNumClusterNodes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-26 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044702#comment-14044702 ]

Hudson commented on YARN-2171:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1786 (See
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1786/])
YARN-2171. Improved CapacityScheduling to not lock on nodemanager-count when
AMs heartbeat in. Contributed by Jason Lowe. (vinodkv:
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605616)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java




[jira] [Commented] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-26 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044746#comment-14044746 ]

Hudson commented on YARN-2171:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1813 (See
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1813/])
YARN-2171. Improved CapacityScheduling to not lock on nodemanager-count when
AMs heartbeat in. Contributed by Jason Lowe. (vinodkv:
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605616)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java




[jira] [Commented] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-25 Thread Vinod Kumar Vavilapalli (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044100#comment-14044100 ]

Vinod Kumar Vavilapalli commented on YARN-2171:
---

+1, looks good. Checking this in..



[jira] [Commented] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-25 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044110#comment-14044110 ]

Hudson commented on YARN-2171:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5780 (See
[https://builds.apache.org/job/Hadoop-trunk-Commit/5780/])
YARN-2171. Improved CapacityScheduling to not lock on nodemanager-count when
AMs heartbeat in. Contributed by Jason Lowe. (vinodkv:
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605616)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java




[jira] [Commented] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-17 Thread Jason Lowe (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033864#comment-14033864 ]

Jason Lowe commented on YARN-2171:
--

When the CapacityScheduler scheduler thread is running full-time due to a
constant stream of events (e.g., a large number of running applications on a
large number of cluster nodes), the CapacityScheduler lock is held by that
scheduler loop most of the time.  As AMs heartbeat into the RM to try to get
their resources, the capacity scheduler code goes out of its way to avoid
having the AMs grab the scheduler lock.  Unfortunately, one case was missed:
the AMs still take the lock just to read a single integer value, the number of
cluster nodes.  They therefore pile up on the scheduler lock, filling all of
the IPC handlers of the ApplicationMasterService, while the remaining AMs back
up on the call queue.  Once the scheduler releases the lock it quickly tries to
grab it again, so only a few AMs get through the gate, and the IPC handlers
fill again with the next batch of AMs blocking on the scheduler lock.  This
causes average RPC response times for AMs to skyrocket.  AMs experience large
delays getting their allocations, which in turn leads to lower cluster
utilization and increased application runtimes.
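
The contention described above can be illustrated with a minimal Java sketch.
It is not taken from the attached patches: the class SchedulerSketch and the
methods holdLock, getNumClusterNodesLocked, and readCountWhileLockHeld are
hypothetical names, and using an AtomicInteger for the node count is only one
plausible way to let heartbeat threads read the value without taking the
scheduler lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a scheduler; this is NOT the real CapacityScheduler.
class SchedulerSketch {
    // The node count lives in an atomic so readers never need the scheduler lock.
    private final AtomicInteger numNodes = new AtomicInteger();

    // Mutations still run under the scheduler lock, like the scheduler loop.
    public synchronized void addNode() { numNodes.incrementAndGet(); }
    public synchronized void removeNode() { numNodes.decrementAndGet(); }

    // Problematic pattern: a synchronized getter waits whenever the lock is busy.
    public synchronized int getNumClusterNodesLocked() { return numNodes.get(); }

    // Lock-free read: safe to call from heartbeat (IPC handler) threads.
    public int getNumClusterNodes() { return numNodes.get(); }

    // Simulates a busy scheduler loop: holds the lock until 'release' is counted down.
    public synchronized void holdLock(CountDownLatch held, CountDownLatch release) {
        held.countDown();
        try {
            release.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Reads the node count while another thread is holding the scheduler lock.
    public static int readCountWhileLockHeld(SchedulerSketch s) {
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread loop = new Thread(() -> s.holdLock(held, release));
        loop.start();
        try {
            held.await();                  // the other thread now owns the lock
            return s.getNumClusterNodes(); // returns without waiting for the lock
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();           // let the simulated scheduler loop finish
        }
    }

    public static void main(String[] args) {
        SchedulerSketch s = new SchedulerSketch();
        s.addNode();
        s.addNode();
        s.removeNode();
        System.out.println("nodes seen while lock held: " + readCountWhileLockHeld(s));
    }
}
```

Calling getNumClusterNodesLocked() in place of getNumClusterNodes() inside
readCountWhileLockHeld would block until the lock is released, which mirrors
the pile-up of AMs on the IPC handlers.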



[jira] [Commented] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-17 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034034#comment-14034034 ]

Hadoop QA commented on YARN-2171:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12650819/YARN-2171.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4014//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4014//console

This message is automatically generated.



[jira] [Commented] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-17 Thread Vinod Kumar Vavilapalli (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034060#comment-14034060 ]

Vinod Kumar Vavilapalli commented on YARN-2171:
---

The code changes look fine to me.

The test is not very useful beyond validating this ticket, but that's okay. I
see that we don't have any test that explicitly validates the number of nodes
itself. Shall we add that here?



[jira] [Commented] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-17 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034456#comment-14034456 ]

Hadoop QA commented on YARN-2171:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12650880/YARN-2171v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/4016//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/4016//console

This message is automatically generated.
