[jira] Updated: (HDFS-1093) Improve namenode scalability by splitting the FSNamesystem synchronized section in a read/write lock

dhruba borthakur (JIRA) Mon, 12 Jul 2010 01:32:24 -0700

     [ 
https://issues.apache.org/jira/browse/HDFS-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


dhruba borthakur updated HDFS-1093:
-----------------------------------

    Attachment: NNreadwriteLock_trunk_3.txt

I have purposely kept the try-catch block un-indented so that it is easier for 
you to review. Once the review is complete, I will upload a final patch with 
proper indentation.

FSNamesystem.java
-------------------------
I removed the import of the ReadWriteLock. However, I kept the fsLock 
initializing inside the initialize method rather than moving it to the 
declaration. Most variables seem to be getting initialized inside the 
initialize() method, isn't it?

getBlockLocations() - I now acquire the readLock and attempt to proceed ahead. 
If the access-time has to be set, then I release the readLock, acquire the 
writeLock and start all over again.
Check for safemode should be inside the FSnamesystem lock, in fact more of it 
is coming in HDFS-988.
setTimes() move isAccessTimeSupported() call out side the lock : done.

CommitBlockSynchronization(): the key is to call the logSync() outside the 
FSnamesystem lock. And there is a LOG statement that prints out "src" after the 
call to logSync(). That is the reason why i had to declare "String src" right 
at the beginning of the method.

I agree with you that there are code portions in processReport, 
createSymLinkInternal, startFileInternal() that  can move outside the 
FSNamesystem lock. However, I would like to avoid doing this code reorganizatin 
as part of this JIRA, especially because it makes the code difficult to review. 
Also, this is not a regression because the original code has all these code 
inside the synchronized section anyway. Please let me know if you agree on this 
one.
getDatanodeListForReport - move boolean and HashMap out of read lock: done.
getSafeModeTip - readLock instead of writeLock : done

BlockManager.java
------------------------
chooseUnderReplicatedBlocks - should we grab readLock instead of writeLock? I 
wold prefer to keep the writeLock because this method updates member variables 
of the class, e.g. missingBlocksInPrevIter

removeStoredBlock - assert to be replaced by grab writeLock: I have the 
impression that all calls to removeStoredBlock already has the writeLock, that 
is the reason for the assert. Do you know of a code path via which this is not 
the case?

DecommissionManager.java - readLock instead of writeLock; The 
DecommissionManager.check() calls FSNamesystem.checkDecommissionStateInterna() 
which, in turn, invokes node.setDecommissioned(). This updates the state of the 
node, hence I was confortable keeping the writeLock in this code path.

FSDirectory.java
---------------------
 BackupStorage.loadCheckPoint() is now fixed.
Do you have a suggestion on how I can fix the code in 
FSPermissionChecker.checkPermission()?


> Improve namenode scalability by splitting the FSNamesystem synchronized 
> section in a read/write lock
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1093
>                 URL: https://issues.apache.org/jira/browse/HDFS-1093
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: NNreadwriteLock.txt, NNreadwriteLock_trunk_1.txt, 
> NNreadwriteLock_trunk_2.txt, NNreadwriteLock_trunk_3.txt
>
>
> Most critical data structures in the NameNode (NN) are protected by a 
> syncronized methods in the FSNamesystem class. This essentially makes 
> critical code paths in the NN single-threaded. However, a large percentage of 
> the NN calls are listStatus, getBlockLocations, etc which do not change 
> internal data structures at all, these are read-only calls. If we change the 
> FSNamesystem lock to a read/write lock, many of the above operations can 
> occur in parallel, thus improving the scalability of the NN.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HDFS-1093) Improve namenode scalability by splitting the FSNamesystem synchronized section in a read/write lock

Reply via email to