[ https://issues.apache.org/jira/browse/HDFS-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497364#comment-13497364 ]
Todd Lipcon commented on HDFS-4186:
-----------------------------------

Maybe I'm being too nitpicky, but I'd rather see the optimization around skipping logSync when nothing has been logged kept separate from this bug fix for lease recovery. There have been enough synchronization bugs around the FSEditLog code paths that I think it deserves its own JIRA (which we could cautiously put in only 3.0 at first, for example). I haven't generally seen the FSEditLog monitor be a point of contention, since it's only held for very short amounts of time, so the value of the optimization seems low compared to the added complexity.

> logSync() is called with the write lock held while releasing lease
> ------------------------------------------------------------------
>
>                 Key: HDFS-4186
>                 URL: https://issues.apache.org/jira/browse/HDFS-4186
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.4, 2.0.2-alpha
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>         Attachments: hdfs-4186-trunk.patch
>
>
> As pointed out in HDFS-4138, when the lease monitor calls
> internalReleaseLease(), it acquires the namespace write lock. Inside
> internalReleaseLease(), if a block recovery is needed, the lease is
> reassigned to the namenode itself and this is logged & synced in
> logReassignLease().
> Since this is done while the write lock is held, log syncing is blocked. When
> a large number of leases are expired and blocks are recovered, namenode can
> slow down.
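
For readers unfamiliar with the locking issue in the description, here is a minimal, self-contained sketch of the two orderings. It is not HDFS code and does not reflect the attached patch: the class `LeaseSyncSketch`, its methods, and the in-memory "edit log" are all hypothetical stand-ins. It only illustrates the general pattern of buffering an edit under the namespace write lock and deferring the (slow) sync until after the lock is released.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the namesystem plus edit log; not the real HDFS classes.
public class LeaseSyncSketch {
    private final ReentrantReadWriteLock nsLock = new ReentrantReadWriteLock();
    private final List<String> buffered = new ArrayList<>();
    private final List<String> synced = new ArrayList<>();

    // Buffers an edit in memory; cheap, so doing it under the write lock is fine.
    private synchronized void logEdit(String op) {
        buffered.add(op);
    }

    // Stand-in for the slow flush to stable storage.
    // Returns true if the caller held the namespace write lock while syncing.
    private synchronized boolean logSync() {
        boolean heldWriteLock = nsLock.isWriteLockedByCurrentThread();
        synced.addAll(buffered);
        buffered.clear();
        return heldWriteLock;
    }

    // Ordering described in the issue: the sync happens inside the write lock,
    // so every other namespace operation waits for the flush to finish.
    public boolean releaseLeaseSyncUnderLock(String holder) {
        nsLock.writeLock().lock();
        try {
            logEdit("REASSIGN_LEASE " + holder);
            return logSync();
        } finally {
            nsLock.writeLock().unlock();
        }
    }

    // Alternative ordering: buffer the edit under the lock, sync after unlocking.
    public boolean releaseLeaseSyncAfterUnlock(String holder) {
        nsLock.writeLock().lock();
        try {
            logEdit("REASSIGN_LEASE " + holder);
        } finally {
            nsLock.writeLock().unlock();
        }
        return logSync();
    }

    public synchronized int syncedCount() {
        return synced.size();
    }

    public static void main(String[] args) {
        LeaseSyncSketch s = new LeaseSyncSketch();
        System.out.println("sync under lock held write lock: "
                + s.releaseLeaseSyncUnderLock("client-1"));
        System.out.println("sync after unlock held write lock: "
                + s.releaseLeaseSyncAfterUnlock("client-2"));
    }
}
```

In the second ordering the edit is still recorded while the namespace is protected, but other threads can take the lock while the flush is in progress. Whether the real fix can be structured this simply depends on FSEditLog details that this sketch does not model.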