[ https://issues.apache.org/jira/browse/YARN-325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547673#comment-13547673 ]
Arun C Murthy commented on YARN-325:
------------------------------------

[~jlowe] This seems limited to a corner case (not that it should be ignored *smile*) in LeafQueue.assignedReservedContainer.

The issue is that LeafQueue.assignReserved is a synchronized method which calls completedContainer... need to figure a way around this.

> RM CapacityScheduler can deadlock when getQueueInfo() is called and a
> container is completing
> ---------------------------------------------------------------------------------------------
>
>                 Key: YARN-325
>                 URL: https://issues.apache.org/jira/browse/YARN-325
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.0.2-alpha, 0.23.5
>            Reporter: Jason Lowe
>            Priority: Critical
>
> If a client calls getQueueInfo() on a parent queue (e.g., the root queue) while
> containers are completing, the RM can deadlock. getQueueInfo() locks the
> ParentQueue and then calls each child queue's getQueueInfo() method in turn.
> However, when a container completes, it locks the LeafQueue and then calls back
> into the ParentQueue. When the two interleave, each thread holds the lock the
> other needs, and neither can proceed.
> Stacktrace to follow.
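
The lock-order inversion described in the issue can be reproduced in isolation. The following is a minimal sketch, not the CapacityScheduler code and not the fix adopted for this issue: `LockOrderSketch`, `parentLock`, `leafLock`, and the method names are hypothetical stand-ins, and explicit `ReentrantLock`s with a timed `tryLock` replace the synchronized methods so that the demonstration reports the conflict instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderSketch {
    // Hypothetical stand-ins for the ParentQueue and LeafQueue monitors.
    static final ReentrantLock parentLock = new ReentrantLock();
    static final ReentrantLock leafLock = new ReentrantLock();

    // Holds `first`, waits until the other thread holds its own first lock,
    // then tries `second` with a timeout instead of blocking forever.
    static boolean holdThenTry(ReentrantLock first, ReentrantLock second,
                               CountDownLatch bothHolding, CountDownLatch bothTried)
            throws InterruptedException {
        first.lock();
        try {
            bothHolding.countDown();
            bothHolding.await();
            boolean acquired = second.tryLock(100, TimeUnit.MILLISECONDS);
            if (acquired) {
                second.unlock();
            }
            // Keep `first` until the other thread has finished its attempt,
            // mirroring two synchronized methods that never let go.
            bothTried.countDown();
            bothTried.await();
            return acquired;
        } finally {
            first.unlock();
        }
    }

    // One thread locks parent then leaf (like getQueueInfo), the other locks
    // leaf then parent (like a completing container calling back upward).
    static boolean[] runInvertedOrder() throws InterruptedException {
        CountDownLatch bothHolding = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] acquired = new boolean[2];
        Thread queueInfo = new Thread(() -> {
            try {
                acquired[0] = holdThenTry(parentLock, leafLock, bothHolding, bothTried);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread completion = new Thread(() -> {
            try {
                acquired[1] = holdThenTry(leafLock, parentLock, bothHolding, bothTried);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        queueInfo.start();
        completion.start();
        queueInfo.join();
        completion.join();
        return acquired;
    }

    // Both threads use the same order (parent, then leaf), so neither can
    // end up holding a lock that the other already needs.
    static boolean runConsistentOrder() throws InterruptedException {
        Runnable parentThenLeaf = () -> {
            parentLock.lock();
            try {
                leafLock.lock();
                leafLock.unlock();
            } finally {
                parentLock.unlock();
            }
        };
        Thread a = new Thread(parentThenLeaf);
        Thread b = new Thread(parentThenLeaf);
        a.start();
        b.start();
        a.join(5000);
        b.join(5000);
        return !a.isAlive() && !b.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] inverted = runInvertedOrder();
        System.out.println("inverted order: parent->leaf acquired=" + inverted[0]
                + ", leaf->parent acquired=" + inverted[1]);
        System.out.println("consistent order completed=" + runConsistentOrder());
    }
}
```

In the inverted case neither thread can take its second lock while the other still holds it. With plain synchronized methods there is no timeout, so both threads would wait indefinitely. Acquiring the locks in one agreed order is one common way to rule this out; whether that approach suits LeafQueue is a separate question that this sketch does not answer.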