[ 
https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-2171:
-----------------------------

    Attachment: YARN-2171v2.patch

The unit test is meant to catch regressions at a high level.  If anyone 
changes the code so that calling allocate() grabs the scheduler lock, the test 
will fail, whether the regression is in this particular method or in some new 
method that ApplicationMasterService or CapacityScheduler itself calls and 
that grabs the lock.
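
As a rough illustration of that kind of high-level check (not the test in the 
attached patch), the sketch below holds a scheduler's monitor in one thread 
and verifies that a call made from another thread still completes.  
ToyScheduler, LockRegressionCheck, getNumClusterNodesLocked and the timeout 
values are hypothetical stand-ins, and the sketch assumes the scheduler lock 
is the object's intrinsic monitor.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for a scheduler; not the real CapacityScheduler.
class ToyScheduler {
    private volatile int numNodes = 3;

    // Reads a volatile field, so it never needs the scheduler monitor.
    int getNumClusterNodes() {
        return numNodes;
    }

    // Example of a regression: takes the monitor just to read a counter.
    synchronized int getNumClusterNodesLocked() {
        return numNodes;
    }
}

class LockRegressionCheck {
    interface Call {
        void run(ToyScheduler scheduler);
    }

    // Returns true if the call finishes while this thread holds the
    // scheduler monitor, false if it is still blocked after timeoutMs.
    static boolean completesWhileLocked(ToyScheduler scheduler, Call call,
                                        long timeoutMs) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            synchronized (scheduler) {
                Future<?> result = pool.submit(() -> call.run(scheduler));
                try {
                    result.get(timeoutMs, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    return false;
                }
            }
        } finally {
            // The monitor is released before this runs, so a blocked
            // worker can finish and the pool thread can exit.
            pool.shutdownNow();
        }
    }
}
```

Because the check wraps whatever call is passed in, it does not care which 
method underneath takes the lock, which is what makes it useful against 
regressions in code added later.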

I added a separate unit test to exercise the getNumClusterNodes method.

The AHS unit test failure seems unrelated, and it passes for me locally even 
with this change.

> AMs block on the CapacityScheduler lock during allocate()
> ---------------------------------------------------------
>
>                 Key: YARN-2171
>                 URL: https://issues.apache.org/jira/browse/YARN-2171
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 0.23.10, 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: YARN-2171.patch, YARN-2171v2.patch
>
>
> When AMs heartbeat into the RM via the allocate() call they are blocking on 
> the CapacityScheduler lock when trying to get the number of nodes in the 
> cluster via getNumClusterNodes.
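
One common way to avoid this kind of contention, shown here only as a sketch 
and not as the change in the attached patch, is to keep the node count in an 
atomic counter so readers never need the scheduler lock.  NodeCountTracker, 
addNode and removeNode are hypothetical names; only getNumClusterNodes comes 
from the issue description.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not the real CapacityScheduler.
class NodeCountTracker {
    private final AtomicInteger numNodes = new AtomicInteger(0);

    // Node add/remove may still run under the scheduler lock because
    // they typically update other bookkeeping as well.
    synchronized void addNode() {
        numNodes.incrementAndGet();
    }

    synchronized void removeNode() {
        numNodes.decrementAndGet();
    }

    // Not synchronized: heartbeating callers read the count without
    // waiting for the monitor.
    int getNumClusterNodes() {
        return numNodes.get();
    }
}
```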



--
This message was sent by Atlassian JIRA
(v6.2#6252)
