[
https://issues.apache.org/jira/browse/HDFS-4183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495874#comment-13495874
]
Kihwal Lee commented on HDFS-4183:
----------------------------------
We've seen a burst of commitBlockSynchronization() calls make the namenode
unresponsive for a long time, causing other important RPC calls, such as lease
renewal and heartbeats, to fail. Since the blocks are copied, it can also create
a lot of cluster-wide traffic.
The commitBlockSynchronization() method logs two messages, one at the beginning
after acquiring the write lock and another after releasing it and syncing the
edit log. The time between the two is usually less than 1-2 ms, so the actual
processing and sync time don't seem long. But when the namenode gets a burst of
these calls, it can only sustain a rate of 20-30 per second, with almost no
other requests being served. When these calls are served back-to-back, the gap
between calls ranges from 20-100 ms.
The calls are supposed to be blocked at the write lock. Although enabling
fairness is known to cause significant performance degradation on a write-heavy
ReadWriteLock (about 80% degradation with 100 threads in my experiment), that
overhead is still very small compared to the wait time of 20-100 ms we saw.
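
For anyone who wants to measure the fairness overhead themselves, here is a
rough sketch of a write-only contention test. It is not the experiment
mentioned above; the class name FairnessBench and the run() method are made up
for illustration, and the numbers will vary by JVM and hardware.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical micro-benchmark, not taken from the HDFS code base.
final class FairnessBench {
    /**
     * Starts `threads` writers that each take the write lock `iters` times
     * and returns the elapsed wall-clock time in nanoseconds.
     */
    static long run(boolean fair, int threads, int iters) throws InterruptedException {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock(fair);
        long[] counter = {0};                       // guarded by the write lock
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                try {
                    start.await();                  // release all writers together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < iters; i++) {
                    lock.writeLock().lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.writeLock().unlock();
                    }
                }
            });
            workers[t].start();
        }
        long begin = System.nanoTime();
        start.countDown();
        for (Thread w : workers) {
            w.join();
        }
        long elapsed = System.nanoTime() - begin;
        if (counter[0] != (long) threads * iters) {
            throw new IllegalStateException("lost updates");
        }
        return elapsed;
    }
}
```

Comparing run(true, 100, n) against run(false, 100, n) for the same n gives a
crude idea of the relative cost of the fair mode under write-heavy load.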
Regardless of the performance and efficiency of commitBlockSynchronization(), I
think it is reasonable to throttle block recovery, so that the namenode can
avoid shooting itself in the foot. It would be nice to have feedback-based
dynamic asynchronous work scheduling, but simple throttling may do for now. I
propose a configurable rate with 300/min as the default.
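
To make the proposal concrete, a simple throttle could look roughly like the
sketch below. This is only an illustration under stated assumptions, not a
patch: the class RecoveryThrottle, its tryAcquire() method, and the injectable
clock are hypothetical names, and it uses a fixed one-minute window rather
than anything feedback-based.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch: caps how many block recoveries may start per minute.
final class RecoveryThrottle {
    private static final long WINDOW_MS = 60_000L;

    private final int maxPerMinute;      // e.g. 300, the proposed default
    private final LongSupplier clockMs;  // injectable clock, handy for tests
    private long windowStartMs;
    private int startedInWindow;

    RecoveryThrottle(int maxPerMinute, LongSupplier clockMs) {
        this.maxPerMinute = maxPerMinute;
        this.clockMs = clockMs;
        this.windowStartMs = clockMs.getAsLong();
    }

    /** Returns true if a recovery may start now; false means defer it. */
    synchronized boolean tryAcquire() {
        long now = clockMs.getAsLong();
        if (now - windowStartMs >= WINDOW_MS) {
            // A new one-minute window begins; reset the count.
            windowStartMs = now;
            startedInWindow = 0;
        }
        if (startedInWindow < maxPerMinute) {
            startedInWindow++;
            return true;
        }
        return false;
    }
}
```

A caller would construct it with something like
new RecoveryThrottle(300, System::currentTimeMillis) and leave a lease in the
queue for a later pass whenever tryAcquire() returns false. A fixed window can
still admit a short burst at a window boundary; a token bucket would smooth
that out at the cost of a little more code.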
> Throttle block recovery
> -----------------------
>
> Key: HDFS-4183
> URL: https://issues.apache.org/jira/browse/HDFS-4183
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: name-node
> Affects Versions: 0.23.4, 2.0.2-alpha
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Priority: Critical
>
> When a large number of files are abandoned without closing, a storm of lease
> expirations follows in about an hour (lease hard limit). For the last block of
> each file, block recovery is initiated and when the datanode is done,
> commitBlockSynchronization() is called against the namenode. A burst of
> these calls can slow down the namenode considerably. We need to throttle block
> recovery and/or speed up the rate at which commitBlockSynchronization() is
> served.