[ https://issues.apache.org/jira/browse/HDFS-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888473#action_12888473 ]
Suresh Srinivas commented on HDFS-1093:
---------------------------------------

bq. I agree with you that there are code portions in processReport, createSymLinkInternal, startFileInternal() that can move outside the FSNamesystem lock. However, I would like to avoid doing this code reorganization as part of this JIRA, especially because it makes the code difficult to review. Also, this is not a regression because the original code has all this code inside the synchronized section anyway. Please let me know if you agree on this one.

I agree that this jira may not be the right place for this optimization. Compared to the earlier code with synchronized methods, this change lets us shorten the synchronized sections. We can create a bug to track this optimization.

bq. getBlockLocations() - I now acquire the readLock and attempt to proceed ahead. If the access-time has to be set, then I release the readLock, acquire the writeLock and start all over again

Why not check doAccessTime up front: if true, grab the writeLock, else the readLock? Make the doAccessTime parameter final. This change seems much simpler, with no need to repeat the initial steps such as looking up the inode, computing now, etc.

bq. removeStoredBlock - assert to be replaced by grab writeLock: I have the impression that all calls to removeStoredBlock already have the writeLock, that is the reason for the assert. Do you know of a code path via which this is not the case?

I agree this method is not called without holding the writeLock. However, asserts are not turned on at run time. Also, this code changes the previous semantics.

bq. Do you have a suggestion on how I can fix the code in FSPermissionChecker.checkPermission()?

This method is called only from FSNamesystem. How about grabbing the readLock and then calling checkPermission without passing the root INodeDirectory?
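
A minimal sketch of the getBlockLocations() and removeStoredBlock suggestions above, assuming the FSNamesystem lock is a java.util.concurrent.locks.ReentrantReadWriteLock. The class, fields, and method bodies here are hypothetical stand-ins for illustration, not the actual FSNamesystem code or the attached patch.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for FSNamesystem; only the locking pattern matters.
class LockChoiceSketch {
    final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    private long accessTime = 0L;   // stands in for an inode's access time
    private int removedBlocks = 0;  // stands in for block map mutations

    // Pick the lock once, based on the final doAccessTime parameter,
    // so the lookup never has to be restarted under a different lock.
    String getBlockLocations(String src, final boolean doAccessTime) {
        Lock lock = doAccessTime ? fsLock.writeLock() : fsLock.readLock();
        lock.lock();
        try {
            if (doAccessTime) {
                accessTime = System.currentTimeMillis(); // mutation needs the write lock
            }
            return "locations:" + src; // placeholder for the real lookup
        } finally {
            lock.unlock();
        }
    }

    // Acquire the write lock instead of asserting it is held. The write lock
    // is reentrant, so callers that already hold it are not blocked.
    void removeStoredBlock(String block) {
        fsLock.writeLock().lock();
        try {
            removedBlocks++;
        } finally {
            fsLock.writeLock().unlock();
        }
    }

    long lastAccessTime() {
        fsLock.readLock().lock();
        try {
            return accessTime;
        } finally {
            fsLock.readLock().unlock();
        }
    }

    int removedBlocks() {
        fsLock.readLock().lock();
        try {
            return removedBlocks;
        } finally {
            fsLock.readLock().unlock();
        }
    }
}
```

Choosing the lock up front means a call with doAccessTime set always pays for the writeLock, even if the access time turns out not to need updating. That is the trade-off against the release-and-retry approach.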

> Improve namenode scalability by splitting the FSNamesystem synchronized
> section in a read/write lock
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1093
>                 URL: https://issues.apache.org/jira/browse/HDFS-1093
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: NNreadwriteLock.txt, NNreadwriteLock_trunk_1.txt,
>                      NNreadwriteLock_trunk_2.txt, NNreadwriteLock_trunk_3.txt
>
>
> Most critical data structures in the NameNode (NN) are protected by
> synchronized methods in the FSNamesystem class. This essentially makes
> critical code paths in the NN single-threaded. However, a large percentage of
> the NN calls are listStatus, getBlockLocations, etc., which do not change
> internal data structures at all; these are read-only calls. If we change the
> FSNamesystem lock to a read/write lock, many of the above operations can
> occur in parallel, thus improving the scalability of the NN.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.