[ https://issues.apache.org/jira/browse/HDFS-15150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033592#comment-17033592 ]
Stephen O'Donnell commented on HDFS-15150:
------------------------------------------

Thanks for the review. Nice benchmarks at the link above. It's interesting that the unfair lock performs much better, though probably at the cost of a long tail latency in the worst cases. It is also interesting that other locking methods perform better, but we know the Reentrant RW lock does well in the Namenode, so I feel it should be good for the DN too.

We will probably need a series of small Jiras to move various code paths to use the read lock. To start with, I have created one (HDFS-15160) to address the ReplicaMap, which is called by many other methods. I have a patch ready, but I will hold off posting it until we commit this one, as it depends on this change. Then I will create a few more Jiras to tackle other code paths.

> Introduce read write lock to Datanode
> -------------------------------------
>
>                 Key: HDFS-15150
>                 URL: https://issues.apache.org/jira/browse/HDFS-15150
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 3.3.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>         Attachments: HDFS-15150.001.patch, HDFS-15150.002.patch,
> HDFS-15150.003.patch
>
>
> HDFS-9668 pointed out the issues around the DN lock being a point of
> contention some time ago, but that Jira went in a direction of creating a new
> FSDataset implementation which is very risky, and activity on the Jira has
> stalled for a few years now. Edit: Looks like HDFS-9668 eventually went in a
> similar direction to what I was thinking, so I will review that Jira in more
> detail to see if this one is necessary.
> I feel there could be significant gains by moving to a ReentrantReadWriteLock
> within the DN. The current implementation is simply a ReentrantLock, so
> any locker blocks all others.
> One place I think a read lock would benefit us significantly is when the DN
> is serving a lot of small blocks and there are jobs which perform a lot of
> reads. The start of reading any block right now takes the lock, but if we
> moved this to a read lock, many reads could do this at the same time.
> The first conservative step would be to change the current lock and then
> make all accesses to it obtain the write lock. That way, we should keep the
> current behaviour and then we can selectively move some lock accesses to the
> read lock in separate Jiras.
> I would appreciate any thoughts on this, and also if anyone has attempted it
> before and found any blockers.
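
For readers following the thread, the two-step approach described in the issue could look roughly like the sketch below. This is only an illustration, not code from any of the attached patches: the class `DatasetLockSketch`, its map of replicas, and the method names are all hypothetical stand-ins. The only real API used is `java.util.concurrent.locks.ReentrantReadWriteLock`, whose constructor takes a fairness flag (the fair versus unfair trade-off mentioned in the comment).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a dataset guarded by a single lock; not the real DN code. */
class DatasetLockSketch {
    private final ReentrantReadWriteLock lock;
    private final Map<Long, String> replicas = new HashMap<>();

    /** The constructor argument selects fair (true) or unfair (false) ordering. */
    DatasetLockSketch(boolean fair) {
        this.lock = new ReentrantReadWriteLock(fair);
    }

    /**
     * Step 1: every existing lock site takes the write lock, so callers
     * still exclude each other exactly as they did with a plain ReentrantLock.
     */
    void addReplica(long blockId, String path) {
        lock.writeLock().lock();
        try {
            replicas.put(blockId, path);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /**
     * Step 2 (follow-up changes): a path that only reads shared state is
     * moved to the read lock, so many readers can hold it at the same time.
     */
    String getReplica(long blockId) {
        lock.readLock().lock();
        try {
            return replicas.get(blockId);
        } finally {
            lock.readLock().unlock();
        }
    }

    boolean isFair() {
        return lock.isFair();
    }
}
```

Keeping every site on the write lock at first means the change is behaviour-preserving, and each later move to the read lock can be reviewed on its own for whether the code path really is read-only.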