[ 
https://issues.apache.org/jira/browse/YARN-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149194#comment-14149194
 ] 

Hudson commented on YARN-2608:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1883 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1883/])
YARN-2608. FairScheduler: Potential deadlocks in loading alloc files and clock 
access. (Wei Yan via kasha) (kasha: rev 
f4357240a6f81065d91d5f443ed8fc8cd2a14a8f)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* hadoop-yarn-project/CHANGES.txt


> FairScheduler: Potential deadlocks in loading alloc files and clock access
> --------------------------------------------------------------------------
>
>                 Key: YARN-2608
>                 URL: https://issues.apache.org/jira/browse/YARN-2608
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Wei Yan
>            Assignee: Wei Yan
>             Fix For: 2.6.0
>
>         Attachments: YARN-2608-1.patch, YARN-2608-2.patch, YARN-2608-3.patch
>
>
> Two potential deadlocks exist inside the FairScheduler.
> 1. AllocationFileLoaderService reloads the queue configuration, which calls 
> FairScheduler.AllocationReloadListener.onReload(). That method first acquires 
> *FairScheduler's lock*:
> {code}
>   public void onReload(AllocationConfiguration queueInfo) {
>       synchronized (FairScheduler.this) {
>           ....
>       }
>   }
> {code}
> After that, it acquires the *QueueManager's queues lock*:
> {code}
>   private FSQueue getQueue(String name, boolean create, FSQueueType queueType) {
>       name = ensureRootPrefix(name);
>       synchronized (queues) {
>           ....
>       }
>   }
> {code}
> Another thread, in FairScheduler.assignToQueue, may also need to create a new 
> queue when a new job is submitted. This thread acquires the *QueueManager's 
> queues lock* first, and then needs the *FairScheduler's lock* because 
> creating a new FSLeafQueue calls FairScheduler.getClock(). The two threads 
> take the same two locks in opposite order, so a deadlock may occur.
> 2. The AllocationFileLoaderService thread holds *AllocationFileLoaderService's 
> lock* first, and then waits for *FairScheduler's lock*. Another thread (such as 
> AdminService.refreshQueues) may call FairScheduler's reinitialize function, 
> which holds *FairScheduler's lock* first, and then waits for 
> *AllocationFileLoaderService's lock*. A deadlock may occur here as well.
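
Both cases in the description are lock-order inversions: two threads take the same pair of locks in opposite order. The following is a minimal sketch of that pattern, not the committed patch. The class LockOrderSketch and the methods acquireBoth and run are hypothetical names, and the two ReentrantLock objects are assumed stand-ins for FairScheduler's monitor and QueueManager's queues lock. A timed tryLock replaces the real synchronized blocks so that the sketch reports the failure instead of hanging.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderSketch {

    // Holds `first`, then tries `second` with a timeout instead of blocking forever.
    // When a barrier is given, both threads meet there while holding their first
    // lock, and meet again before releasing it, which forces the bad interleaving.
    static boolean acquireBoth(ReentrantLock first, ReentrantLock second,
                               CyclicBarrier barrier) {
        first.lock();
        try {
            if (barrier != null) barrier.await();
            boolean got = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (got) second.unlock();
            if (barrier != null) barrier.await();
            return got;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            first.unlock();
        }
    }

    // Returns {reloadGotBothLocks, submitGotBothLocks}.
    static boolean[] run(boolean consistentOrder) {
        // Hypothetical stand-ins for the two locks named in the issue.
        ReentrantLock scheduler = new ReentrantLock();
        ReentrantLock queues = new ReentrantLock();
        // The barrier is only used to force the inverted interleaving.
        CyclicBarrier barrier = consistentOrder ? null : new CyclicBarrier(2);

        // The submit path takes queues first in the inverted case,
        // and scheduler first once the order is made consistent.
        ReentrantLock submitFirst = consistentOrder ? scheduler : queues;
        ReentrantLock submitSecond = consistentOrder ? queues : scheduler;

        boolean[] got = new boolean[2];
        Thread reload = new Thread(() -> got[0] = acquireBoth(scheduler, queues, barrier));
        Thread submit = new Thread(() -> got[1] = acquireBoth(submitFirst, submitSecond, barrier));
        reload.start();
        submit.start();
        try {
            reload.join();
            submit.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return got;
    }

    public static void main(String[] args) {
        boolean[] inverted = run(false);
        // prints "inverted order: reload=false, submit=false"
        System.out.println("inverted order: reload=" + inverted[0] + ", submit=" + inverted[1]);
        boolean[] ordered = run(true);
        // prints "consistent order: reload=true, submit=true"
        System.out.println("consistent order: reload=" + ordered[0] + ", submit=" + ordered[1]);
    }
}
```

With plain synchronized blocks, as in the snippets quoted above, the inverted case would block indefinitely rather than time out. Taking the locks in one agreed order is only one way to remove the cycle. The issue text suggests another for the first case: if reading the clock did not need the scheduler lock, the second acquisition would not happen at all.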



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
