[ https://issues.apache.org/jira/browse/YARN-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Lowe updated YARN-2171:
-----------------------------
    Attachment: YARN-2171v2.patch

The point of the unit test was to catch regressions at a high level. If anyone changes the code such that calling allocate() will grab the scheduler lock then the test will fail, whether that's a regression in this particular method or some new method that's added that ApplicationMasterService or CapacityScheduler itself calls and grabs the lock.

I added a separate unit test to exercise the getNumClusterNodes method.

The AHS unit test failure seems unrelated, and it passes for me locally even with this change.

> AMs block on the CapacityScheduler lock during allocate()
> ---------------------------------------------------------
>
>                 Key: YARN-2171
>                 URL: https://issues.apache.org/jira/browse/YARN-2171
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 0.23.10, 2.4.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: YARN-2171.patch, YARN-2171v2.patch
>
>
> When AMs heartbeat into the RM via the allocate() call they are blocking on
> the CapacityScheduler lock when trying to get the number of nodes in the
> cluster via getNumClusterNodes.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
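
For readers following the issue, the kind of high-level regression test described above can be sketched in isolation. This is only an illustration, not the code in the attached patch: SketchScheduler, NodeCountSketch, and readCompletesWhileLockHeld are hypothetical names, and the use of an AtomicInteger for the node count is an assumption about one common way to make such a read lock-free. The check holds the scheduler monitor on one thread and verifies that a read of the node count on another thread still completes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

class NodeCountSketch {

    // Hypothetical stand-in for a scheduler; NOT the real CapacityScheduler.
    static class SketchScheduler {
        // Assumption: the node count is kept in an atomic so reads need no lock.
        private final AtomicInteger numNodes = new AtomicInteger();

        // Mutations still take the scheduler monitor.
        synchronized void addNode() {
            numNodes.incrementAndGet();
        }

        // Reads do not synchronize, so a heartbeat thread never waits on the monitor.
        int getNumClusterNodes() {
            return numNodes.get();
        }
    }

    // Returns true if the getter finishes on another thread while the
    // calling thread holds the scheduler monitor, false if it times out.
    static boolean readCompletesWhileLockHeld(SketchScheduler scheduler, long timeoutMs) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            synchronized (scheduler) {
                Callable<Integer> task = scheduler::getNumClusterNodes;
                Future<Integer> read = pool.submit(task);
                read.get(timeoutMs, TimeUnit.MILLISECONDS);
                return true;
            }
        } catch (TimeoutException e) {
            return false; // the read was stuck behind the scheduler lock
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        SketchScheduler scheduler = new SketchScheduler();
        scheduler.addNode();
        scheduler.addNode();
        System.out.println("nodes=" + scheduler.getNumClusterNodes());
        System.out.println("read completed while lock held: "
            + readCompletesWhileLockHeld(scheduler, 2000));
    }
}
```

If getNumClusterNodes in the sketch were declared synchronized, the read would wait for the monitor and the check would return false after the timeout, which is the kind of regression the unit test in the message is meant to catch.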