[jira] [Updated] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-18 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated YARN-2171:


Priority: Major  (was: Critical)

 AMs block on the CapacityScheduler lock during allocate()
 -

 Key: YARN-2171
 URL: https://issues.apache.org/jira/browse/YARN-2171
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.10, 2.4.0
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-2171.patch, YARN-2171v2.patch


 When AMs heartbeat into the RM via the allocate() call, they block on the 
 CapacityScheduler lock while trying to get the number of nodes in the 
 cluster via getNumClusterNodes.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-18 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated YARN-2171:


Priority: Critical  (was: Major)

 AMs block on the CapacityScheduler lock during allocate()
 -

 Key: YARN-2171
 URL: https://issues.apache.org/jira/browse/YARN-2171
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.10, 2.4.0
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Attachments: YARN-2171.patch, YARN-2171v2.patch


 When AMs heartbeat into the RM via the allocate() call, they block on the 
 CapacityScheduler lock while trying to get the number of nodes in the 
 cluster via getNumClusterNodes.





[jira] [Updated] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-18 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated YARN-2171:


Target Version/s: 2.5.0  (was: 0.23.11, 2.5.0)




[jira] [Updated] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-17 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-2171:
-

Attachment: YARN-2171.patch

This patch uses an AtomicInteger for the number of nodes so we can avoid 
grabbing the lock to access the value.  I also added a unit test to verify 
that allocate() doesn't try to grab the CapacityScheduler lock.
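
As a rough illustration of the approach described above (not the contents of 
YARN-2171.patch), the sketch below keeps the node count in an AtomicInteger so 
that readers do not need the scheduler monitor. The class SchedulerSketch and 
the methods addNode() and removeNode() are hypothetical stand-ins; only 
getNumClusterNodes is a name taken from the issue.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a scheduler; this is NOT the real CapacityScheduler.
class SchedulerSketch {
    // The node count lives in an AtomicInteger so it can be read without a lock.
    private final AtomicInteger numNodes = new AtomicInteger(0);

    // Assumed: node changes still run under the scheduler monitor, since they
    // would normally update other scheduler state at the same time.
    synchronized void addNode() {
        numNodes.incrementAndGet();
    }

    synchronized void removeNode() {
        numNodes.decrementAndGet();
    }

    // Deliberately not synchronized: a heartbeat path that only needs the
    // count does not contend for the scheduler monitor.
    int getNumClusterNodes() {
        return numNodes.get();
    }

    public static void main(String[] args) {
        SchedulerSketch scheduler = new SchedulerSketch();
        scheduler.addNode();
        scheduler.addNode();
        scheduler.removeNode();
        System.out.println("nodes = " + scheduler.getNumClusterNodes());
    }
}
```

The trade-off in a design like this is that a reader may see a count that is 
momentarily out of step with other scheduler state, which is usually 
acceptable for a value that is only reported back to AMs.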

 AMs block on the CapacityScheduler lock during allocate()
 -

 Key: YARN-2171
 URL: https://issues.apache.org/jira/browse/YARN-2171
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.10, 2.4.0
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical
 Attachments: YARN-2171.patch


 When AMs heartbeat into the RM via the allocate() call they are blocking on 
 the CapacityScheduler lock when trying to get the number of nodes in the 
 cluster via getNumClusterNodes.





[jira] [Updated] (YARN-2171) AMs block on the CapacityScheduler lock during allocate()

2014-06-17 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-2171:
-

Attachment: YARN-2171v2.patch

The point of the unit test was to catch regressions at a high level.  If anyone 
changes the code so that calling allocate() grabs the scheduler lock, the test 
will fail, whether the regression is in this particular method or in some new 
method that ApplicationMasterService or CapacityScheduler itself calls and 
that grabs the lock.
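
One way such a high-level check could be structured (a sketch under stated 
assumptions, not the test in YARN-2171v2.patch) is to hold the scheduler's 
monitor in one thread and require that the call under test still completes 
within a timeout. LockFreeReadCheck, FakeScheduler, completesWhileLockHeld and 
getNumClusterNodesLocked are hypothetical names used only for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

class LockFreeReadCheck {
    // Hypothetical scheduler stand-in with a lock-free and a locking accessor.
    static class FakeScheduler {
        private final AtomicInteger numNodes = new AtomicInteger(3);

        int getNumClusterNodes() {
            return numNodes.get();
        }

        // Simulates the regression: the read now needs the scheduler monitor.
        synchronized int getNumClusterNodesLocked() {
            return numNodes.get();
        }
    }

    // Returns true if `call` finishes within timeoutMs while another thread
    // holds the monitor of `lock`; false if it is still blocked at the timeout.
    static boolean completesWhileLockHeld(Object lock, Callable<Integer> call,
                                          long timeoutMs) {
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Holder thread: take the monitor and keep it until released.
            pool.execute(() -> {
                synchronized (lock) {
                    lockHeld.countDown();
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            lockHeld.await();
            Future<Integer> result = pool.submit(call);
            try {
                result.get(timeoutMs, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        FakeScheduler s = new FakeScheduler();
        System.out.println("lock-free read completes: "
            + completesWhileLockHeld(s, s::getNumClusterNodes, 2000));
        System.out.println("locked read completes: "
            + completesWhileLockHeld(s, s::getNumClusterNodesLocked, 200));
    }
}
```

Because the check wraps the whole call rather than one method, it would also 
flag a lock acquired by any helper the call reaches, which matches the intent 
of catching regressions at a high level.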

I added a separate unit test to exercise the getNumClusterNodes method.

The AHS unit test failure seems unrelated, and it passes for me locally even 
with this change.

