Liang Xie created HDFS-5664:
-------------------------------
Summary: try to relieve the BlockReaderLocal read() synchronized
hotspot
Key: HDFS-5664
URL: https://issues.apache.org/jira/browse/HDFS-5664
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs-client
Affects Versions: 2.2.0, 3.0.0
Reporter: Liang Xie
Assignee: Liang Xie
Currently, BlockReaderLocal's read() method has a synchronized modifier:
{code}
public synchronized int read(byte[] buf, int off, int len) throws IOException {
{code}
In an HBase cluster with heavy physical reads, we observed some hotspots in
the dfsclient path. The detailed trace can be found at:
https://issues.apache.org/jira/browse/HDFS-1605?focusedCommentId=13843241&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13843241
I haven't looked into the details yet, but here are some rough ideas to start with:
1) replace synchronized with a try-lock-with-timeout pattern, so a read can fail fast,
2) fall back to non-SSR (non-short-circuit-read) mode if acquiring the local reader lock fails.
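A minimal sketch of how ideas 1) and 2) above might fit together. This is not the actual BlockReaderLocal code: the class name TryLockReadSketch, the Reader interface, the fallbackReader field, and the timeoutMs parameter are all hypothetical, and a real fallback path would also have to keep the read position consistent between the two readers.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch only; not the real BlockReaderLocal implementation. */
public class TryLockReadSketch {

    /** Hypothetical stand-in for a block reader's read method. */
    public interface Reader {
        int read(byte[] buf, int off, int len) throws IOException;
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final Reader localReader;    // short-circuit (local) read path
    private final Reader fallbackReader; // assumed non-SSR read path
    private final long timeoutMs;        // assumed lock wait budget

    public TryLockReadSketch(Reader localReader, Reader fallbackReader, long timeoutMs) {
        this.localReader = localReader;
        this.fallbackReader = fallbackReader;
        this.timeoutMs = timeoutMs;
    }

    public int read(byte[] buf, int off, int len) throws IOException {
        boolean locked;
        try {
            // Idea 1: wait for the lock only up to timeoutMs instead of blocking
            // indefinitely as a synchronized method would.
            locked = lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IOException("Interrupted while waiting for the local reader lock", e);
        }
        if (!locked) {
            // Idea 2: the local reader is busy, so use the non-SSR path.
            return fallbackReader.read(buf, off, len);
        }
        try {
            return localReader.read(buf, off, len);
        } finally {
            lock.unlock();
        }
    }
}
```

With this shape, a reader stuck behind a slow disk delays other callers by at most the timeout before they take the fallback path; choosing a sensible timeout value is left open here.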
There are at least two scenarios where removing this hotspot would help:
1) heavy local physical reads, e.g. when the HBase block cache miss ratio is high,
2) a slow or bad disk.
This should help achieve a lower 99th-percentile HBase read latency.