[ 
https://issues.apache.org/jira/browse/HDFS-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097647#comment-13097647
 ] 

Todd Lipcon commented on HDFS-2223:
-----------------------------------

Looking into the -1s, I see that this is causing 
TestEditLogRace.testSaveNamespace to time out. It's a kind of messy situation 
-- the FSN rwlock is deadlocking against its fairness policy:
- saveNamespace acquires the read lock
- spawns an image saver thread
- another thread comes in to do mkdirs and is waiting on the write lock
- the image saver thread calls getNamespaceInfo (new in this patch) and wants 
the read lock
The fairness policy says that the image saver thread can't get the read lock, 
since a writer is already queued for the write lock. So, it hangs there. The 
writer is blocked behind the read lock held by the main thread, and the main 
thread is join()ing on the saver thread, so none of the three can make progress.
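
The sequence above can be reproduced outside of HDFS with a plain 
java.util.concurrent lock. What follows is only a minimal sketch, assuming the 
FSN lock behaves like a fair ReentrantReadWriteLock; the class name, method 
name, and thread names are hypothetical and none of this is taken from the 
patch. The saver uses a timed tryLock (which honors fairness) so that the 
sketch terminates instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the situation described above; not HDFS code.
class FairLockSketch {

    /** Returns true if the "saver" thread got the read lock within 500 ms. */
    static boolean saverCanAcquireReadLock() {
        // Assumption: a fair lock, mirroring the fairness policy in the comment.
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
        AtomicBoolean acquired = new AtomicBoolean(false);

        // Step 1: the main thread ("saveNamespace") takes the read lock.
        lock.readLock().lock();
        try {
            // Step 3: another thread ("mkdirs") queues up for the write lock.
            Thread writer = new Thread(() -> {
                lock.writeLock().lock();
                lock.writeLock().unlock();
            }, "mkdirs");
            writer.setDaemon(true);
            writer.start();
            while (!lock.hasQueuedThread(writer)) {
                Thread.sleep(10); // wait until the writer is really queued
            }

            // Steps 2 and 4: the "image saver" thread wants the read lock.
            Thread saver = new Thread(() -> {
                try {
                    // Timed tryLock respects fairness, unlike untimed tryLock().
                    if (lock.readLock().tryLock(500, TimeUnit.MILLISECONDS)) {
                        acquired.set(true);
                        lock.readLock().unlock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "image-saver");
            saver.start();
            saver.join(); // the main thread join()s while still holding the read lock
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            lock.readLock().unlock(); // lets the queued writer finish
        }
        return acquired.get();
    }

    public static void main(String[] args) {
        System.out.println("saver acquired read lock: " + saverCanAcquireReadLock());
    }
}
```

With an untimed lock() in the saver, as in the real code path, the join() would 
never return, which matches the test timeout described above.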

I think the easiest solution is to unsynchronize getNamespaceInfo, since in 
practice none of those fields change after the FSN is initialized. I will 
upload a new patch.
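
As a rough illustration of that idea (not the actual patch), the sketch below 
reads fields that are assigned once at construction without taking the lock, 
while mutable state stays guarded. The class, field, and method names are all 
hypothetical. It relies on the Java guarantee that final fields are visible to 
other threads once the constructor has completed, provided "this" does not 
escape during construction.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a namesystem-like class; not HDFS code.
class NamesystemSketch {
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);

    // Assigned once in the constructor and never changed afterwards.
    private final int namespaceId;
    private final String clusterId;

    // Mutable state that still needs the lock.
    private long dirCount = 0;

    NamesystemSketch(int namespaceId, String clusterId) {
        this.namespaceId = namespaceId;
        this.clusterId = clusterId;
    }

    /** No lock taken: only immutable fields are read. */
    String getNamespaceInfo() {
        return namespaceId + "/" + clusterId;
    }

    /** Mutation still goes through the write lock. */
    void mkdirs() {
        fsLock.writeLock().lock();
        try {
            dirCount++;
        } finally {
            fsLock.writeLock().unlock();
        }
    }

    /** Reads of mutable state still go through the read lock. */
    long getDirCount() {
        fsLock.readLock().lock();
        try {
            return dirCount;
        } finally {
            fsLock.readLock().unlock();
        }
    }
}
```

Because getNamespaceInfo() never touches the lock in this sketch, a caller that 
runs while a writer is queued cannot be caught by the fairness policy.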

> Untangle dependencies between NN components
> -------------------------------------------
>
>                 Key: HDFS-2223
>                 URL: https://issues.apache.org/jira/browse/HDFS-2223
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-2223-1.txt, hdfs-2223-2.txt, hdfs-2223-3.txt, 
> hdfs-2223-4.txt, hdfs-2223-5.txt, hdfs-2223-6.txt, hdfs-2223-7.txt, 
> hdfs-2223-8.txt
>
>
> Working in the NN a lot for HA (HDFS-1623), I've come across a number of 
> situations where the tangled dependencies between NN components have been 
> problematic for adding new features and for testability. It would be good to 
> untangle some of these and clarify the distinction between the different 
> components: NameNode, FSNamesystem, FSDirectory, FSImage, NNStorage, and 
> FSEditLog.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
