[ https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17141111#comment-17141111 ]
Stephen O'Donnell commented on HDFS-15160:
------------------------------------------

We have been a bit nervous about committing this patch because, although we think it looks good, locking changes are very difficult to prove 100% correct.

Therefore I have uploaded a new patch which adds a configuration switch to enable or disable the read lock. It is enabled by default, as I believe this is a feature everyone should use, but if there are unexpected problems, it can easily be disabled by setting `dfs.datanode.lock.read.write.enabled=false`.

This would also allow us to easily benchmark the benefit of the change without having to run separate builds etc. Also, if it is enabled in production, it is much easier to turn off via a switch than to deploy a new build.

Due to the existing code structure, it was surprisingly easy to put this switch in place.

The logic of the patch is unchanged from 006 aside from adding the enable / disable switch.

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl
> methods should use datanode readlock
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15160
>                 URL: https://issues.apache.org/jira/browse/HDFS-15160
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 3.3.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch,
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch,
> HDFS-15160.006.patch, HDFS-15160.007.patch,
> image-2020-04-10-17-18-08-128.png, image-2020-04-10-17-18-55-938.png
>
>
> Now we have HDFS-15150, we can start to move some DN operations to use the
> read lock rather than the write lock to improve concurrency. The first step
> is to make the changes to ReplicaMap, as many other methods make calls to it.
> This Jira switches read operations against the volume map to use the readLock
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g.
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result
> in a read-only fashion, so they can also be switched to using a readLock.
> Next are the directory scanner and disk balancer, which only require a read
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender
> and FsDatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything which looks too risky, as I think
> it's better to do any larger refactoring or risky changes each in their own
> Jira.
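
For readers who want to picture the enable / disable switch described in the comment above, here is a minimal sketch of one way such a switch could be wired. It is not taken from any of the attached patches. The class `SwitchableDatasetLock`, its constructor flag and the helper `secondReaderCanEnter` are hypothetical names chosen for illustration, and the sketch uses only `java.util.concurrent` rather than Hadoop's own lock wrappers. It assumes that a boolean read from `dfs.datanode.lock.read.write.enabled` is passed in by the caller, and that "disabled" means readers fall back to the exclusive write lock.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * Hypothetical illustration only; not code from the HDFS-15160 patch.
 * When the switch is on, readers share the read lock. When it is off,
 * readLock() hands back the write lock, so every caller is exclusive.
 */
final class SwitchableDatasetLock {
  private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
  private final Lock writeLock = rwLock.writeLock();
  private final Lock readLock;

  /** @param readWriteEnabled assumed to come from the configuration switch */
  SwitchableDatasetLock(boolean readWriteEnabled) {
    this.readLock = readWriteEnabled ? rwLock.readLock() : writeLock;
  }

  Lock readLock() {
    return readLock;
  }

  Lock writeLock() {
    return writeLock;
  }

  /**
   * Holds the "read" lock on the calling thread, then reports whether a
   * second thread can also take it without blocking.
   */
  static boolean secondReaderCanEnter(SwitchableDatasetLock lock) {
    AtomicBoolean acquired = new AtomicBoolean(false);
    lock.readLock().lock();
    try {
      Thread other = new Thread(() -> {
        if (lock.readLock().tryLock()) {
          acquired.set(true);
          lock.readLock().unlock();
        }
      });
      other.start();
      try {
        other.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    } finally {
      lock.readLock().unlock();
    }
    return acquired.get();
  }
}
```

Under these assumptions, call sites always ask for `readLock()` and never need to know which mode is active, which is one reason a switch like this can be cheap to add. With the flag on, a second reader gets in while the first still holds the lock; with the flag off, it does not.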