[ https://issues.apache.org/jira/browse/HDFS-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tomasz Nykiel updated HDFS-2490:
--------------------------------

    Attachment: FSNamesystemLock.java

> Upgradable lock to allow simultaneous read operation while reportDiff is in progress in processing block reports
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-2490
>                 URL: https://issues.apache.org/jira/browse/HDFS-2490
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>            Reporter: Tomasz Nykiel
>            Assignee: Tomasz Nykiel
>         Attachments: FSNamesystemLock.java
>
>
> Currently, FSNamesystem operations are protected by a single
> ReentrantReadWriteLock, which allows multiple concurrent readers
> to perform reads, and a single writer to perform writes. There are, however,
> operations that are primarily reads but occasionally write.
> The best example is processing block reports - currently the entire
> processing is done under writeLock(). With HDFS-395 (explicit deletion acks),
> processing a block report is primarily a read operation (reportDiff()) after
> which only very few blocks need to be updated. In fact, we observed this
> number to be very low, or even zero.
> It would be desirable to have an upgradeable read lock, which would allow
> other reads to proceed during the first "read" part of reportDiff() (and
> possibly other operations).
> We implemented such a mechanism, which provides writeLock(), readLock(),
> upgradeableReadLock(), upgradeLock(), and downgradeLock(). I achieved this by
> employing two ReentrantReadWriteLocks - one protects writes (lock1), the
> other one reads (lock2).
> Hence, we have:
> writeLock()
>   lock1.writeLock().lock()
>   lock2.writeLock().lock()
> readLock()
>   lock2.readLock().lock()
> upgradeableReadLock()
>   lock1.writeLock().lock()
> upgrade()
>   lock2.writeLock().lock()
> --------------------------
> Hence a writeLock() is essentially equivalent to upgradeableLock()+upgrade()
> - two writeLocks are mutually exclusive because of lock1.writeLock
> - a writeLock and upgradeableLock are mutually exclusive as above
> - readLock is mutually exclusive with upgradeableLock()+upgrade() OR
>   writeLock because of lock2.writeLock
> - readLock() + writeLock() causes a deadlock, the same as currently
> - writeLock() + readLock() does not cause deadlocks
> --------------------------
> I am convinced of the soundness of this mechanism.
> The overhead comes from having two locks; in particular, writes need to
> acquire both of them.
> We deployed this feature and used upgradeableLock() ONLY for processing
> reports.
> Our initial, though not exhaustive, experiments showed that it had a very
> detrimental effect on the NN throughput - writes were taking up to twice as
> long.
> This is very unexpected, and hard to explain by the overhead of
> acquiring an additional lock for writes alone.
> I would like to ask for input, as maybe I am missing some fundamental problem
> here.
> I am attaching a Java class which implements this locking mechanism.
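
For readers without the attachment, the two-lock scheme described above can be sketched roughly as follows. This is only an illustration derived from the description, not the attached FSNamesystemLock.java: the class name, all of the unlock methods, the body of downgradeLock(), and the readersBlocked() helper are assumptions made for the example.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the two-lock scheme; not the attached class.
final class UpgradeableLockSketch {
    // lock1: serializes writers and upgradeable readers against each other.
    private final ReentrantReadWriteLock lock1 = new ReentrantReadWriteLock();
    // lock2: separates plain readers from the actual write phase.
    private final ReentrantReadWriteLock lock2 = new ReentrantReadWriteLock();

    // Full write lock: take both locks, always lock1 first.
    void writeLock() {
        lock1.writeLock().lock();
        lock2.writeLock().lock();
    }

    // Assumed release order: reverse of acquisition.
    void writeUnlock() {
        lock2.writeLock().unlock();
        lock1.writeLock().unlock();
    }

    // Plain readers only ever touch lock2.
    void readLock() {
        lock2.readLock().lock();
    }

    void readUnlock() {
        lock2.readLock().unlock();
    }

    // Upgradeable read: excludes writers and other upgradeable readers,
    // but leaves lock2 free so plain readers can still proceed.
    void upgradeableReadLock() {
        lock1.writeLock().lock();
    }

    void upgradeableReadUnlock() {
        lock1.writeLock().unlock();
    }

    // Upgrade: wait for in-flight plain readers to drain, then block new ones.
    void upgradeLock() {
        lock2.writeLock().lock();
    }

    // Assumed downgrade: give lock2 back, keep lock1.
    void downgradeLock() {
        lock2.writeLock().unlock();
    }

    // Illustration-only helper: true while plain readers would be blocked.
    boolean readersBlocked() {
        return lock2.isWriteLocked();
    }

    // Usage pattern for a mostly-read operation such as a block report.
    public static void main(String[] args) {
        UpgradeableLockSketch lock = new UpgradeableLockSketch();
        lock.upgradeableReadLock();
        try {
            // Read phase (e.g. reportDiff()): plain readers are not blocked here.
            System.out.println("read phase, readers blocked: " + lock.readersBlocked());
            lock.upgradeLock();
            try {
                // Write phase: apply the few updates found during the read phase.
                System.out.println("write phase, readers blocked: " + lock.readersBlocked());
            } finally {
                lock.downgradeLock();
            }
        } finally {
            lock.upgradeableReadUnlock();
        }
    }
}
```

Note that in this sketch an upgradeable reader holds lock1's write lock for its whole duration, so two upgradeable readers never run concurrently, and any writeLock() caller waits for the entire read phase before it can even begin acquiring lock2.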