[ 
https://issues.apache.org/jira/browse/HDFS-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17141111#comment-17141111
 ] 

Stephen O'Donnell commented on HDFS-15160:
------------------------------------------

We have been a bit nervous about committing this patch: while we think it 
looks good, locking changes are very difficult to prove 100% correct.

Therefore I have uploaded a new patch which adds a configuration switch to 
enable or disable the read lock. It is enabled by default, as I believe this is 
a feature everyone should use, but if there are unexpected problems, it can 
easily be disabled by setting `dfs.datanode.lock.read.write.enabled=false`.

The switch would also let us benchmark the benefit of the change without 
having to run separate builds. And if the read lock causes problems in 
production, turning it off via a switch is much easier than deploying a new 
build.

Due to the existing code structure, it was surprisingly easy to put this switch 
in place. The logic of the patch is unchanged from 006 aside from adding the 
enable / disable switch.
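
To illustrate the idea of the switch (this is not the code from the patch), 
here is a minimal sketch. It assumes the switch only decides which lock the 
"read" callers receive: the shared read lock when enabled, or the exclusive 
write lock when disabled, which gives the old behaviour back. The class and 
method names are hypothetical; only `ReentrantReadWriteLock` is real JDK API.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of a dataset lock whose read lock can be switched off. */
public class SwitchableDatasetLock {
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock(true);
    private final Lock writeLock = rwLock.writeLock();
    private final Lock readLock;

    /**
     * @param readLockEnabled assumed to come from a boolean config value such as
     *                        dfs.datanode.lock.read.write.enabled
     */
    public SwitchableDatasetLock(boolean readLockEnabled) {
        // Disabled: readers fall back to the exclusive lock, so every caller
        // serialises exactly as it did before any read lock existed.
        this.readLock = readLockEnabled ? rwLock.readLock() : writeLock;
    }

    public Lock readLock() {
        return readLock;
    }

    public Lock writeLock() {
        return writeLock;
    }

    /** True if the calling thread currently holds the exclusive lock. */
    public boolean isExclusivelyHeld() {
        return rwLock.isWriteLockedByCurrentThread();
    }
}
```

Call sites keep asking for `readLock()` either way, so under this assumption 
flipping the switch needs no change to the callers.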

> ReplicaMap, Disk Balancer, Directory Scanner and various FsDatasetImpl 
> methods should use datanode readlock
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15160
>                 URL: https://issues.apache.org/jira/browse/HDFS-15160
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 3.3.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: HDFS-15160.001.patch, HDFS-15160.002.patch, 
> HDFS-15160.003.patch, HDFS-15160.004.patch, HDFS-15160.005.patch, 
> HDFS-15160.006.patch, HDFS-15160.007.patch, 
> image-2020-04-10-17-18-08-128.png, image-2020-04-10-17-18-55-938.png
>
>
> Now that we have HDFS-15150, we can start to move some DN operations to use 
> the read lock rather than the write lock to improve concurrency. The first 
> step is to make the changes to ReplicaMap, as many other methods make calls 
> to it.
> This Jira switches read operations against the volume map to use the readLock 
> rather than the write lock.
> Additionally, some methods make a call to replicaMap.replicas() (e.g. 
> getBlockReports, getFinalizedBlocks, deepCopyReplica) and only use the result 
> in a read-only fashion, so they can also be switched to using a readLock.
> Next is the directory scanner and disk balancer, which only require a read 
> lock.
> Finally (for this Jira) are various "low hanging fruit" items in BlockSender 
> and FsDatasetImpl where it is fairly obvious they only need a read lock.
> For now, I have avoided changing anything which looks too risky, as I think 
> it's better to do any larger refactoring or risky changes each in their own 
> Jira.
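
As a rough illustration of the pattern the description refers to, the sketch 
below guards a map with a read/write lock: lookups and read-only copies take 
the shared read lock, while mutations take the write lock. It is a simplified 
stand-in, not the real ReplicaMap; the class name, the `Long` to `String` 
mapping and the method names are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical, simplified stand-in for a replica map guarded by a RW lock. */
public class ReplicaMapSketch {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<Long, String> replicas = new HashMap<>();

    /** Mutation: needs the exclusive write lock. */
    public void add(long blockId, String replica) {
        lock.writeLock().lock();
        try {
            replicas.put(blockId, replica);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Lookup: does not modify the map, so the shared read lock is enough. */
    public String get(long blockId) {
        lock.readLock().lock();
        try {
            return replicas.get(blockId);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Read-only copy of all values, also taken under the read lock. */
    public List<String> snapshot() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(replicas.values());
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Several threads can be inside `get` or `snapshot` at the same time, and they 
only block while a thread is inside `add`.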



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
