Liang Xie created HDFS-5664:
-------------------------------

             Summary: try to relieve the BlockReaderLocal read() synchronized hotspot
                 Key: HDFS-5664
                 URL: https://issues.apache.org/jira/browse/HDFS-5664
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs-client
    Affects Versions: 2.2.0, 3.0.0
            Reporter: Liang Xie
            Assignee: Liang Xie

Currently, BlockReaderLocal's read() has a synchronized modifier:
{code}
public synchronized int read(byte[] buf, int off, int len) throws IOException {
{code}

In an HBase cluster with heavy physical reads, we observed some hotspots in the DFSClient path. The detailed stack trace can be found at:
https://issues.apache.org/jira/browse/HDFS-1605?focusedCommentId=13843241&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13843241

I haven't looked into the details yet, so here are some raw ideas first:
1) Replace synchronized with a try-lock-with-timeout pattern, so that a contended read can fail fast.
2) Fall back to non-ssr mode (i.e. a regular, non-short-circuit read) if acquiring the local reader lock fails.

There are at least two scenarios where removing this hotspot would help:
1) Heavy local physical reads, e.g. when the HBase block cache miss ratio is high.
2) A slow or bad disk.

This should help achieve a lower 99th percentile HBase read latency.
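
To make the two raw ideas concrete, here is a minimal sketch of a try-lock-with-timeout read that falls back to another reader on contention. It is not the actual BlockReaderLocal code: the class TryLockReadSketch, the Reader interface, the fallback reader, and the 10 ms timeout are all hypothetical names and values chosen for illustration. Only java.util.concurrent.locks.ReentrantLock and its tryLock(long, TimeUnit) method are real JDK APIs.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch only; not the real BlockReaderLocal implementation. */
class TryLockReadSketch {

    /** Hypothetical stand-in for a block reader (local or remote). */
    interface Reader {
        int read(byte[] buf, int off, int len) throws IOException;
    }

    private final ReentrantLock lock = new ReentrantLock();
    private final Reader local;        // assumed short-circuit (local) reader
    private final Reader fallback;     // assumed non-short-circuit reader
    private final long timeoutMillis;  // assumed tunable; no such setting exists today

    TryLockReadSketch(Reader local, Reader fallback, long timeoutMillis) {
        this.local = local;
        this.fallback = fallback;
        this.timeoutMillis = timeoutMillis;
    }

    /** Not synchronized: waits for the lock at most timeoutMillis. */
    int read(byte[] buf, int off, int len) throws IOException {
        boolean acquired;
        try {
            acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new InterruptedIOException("interrupted waiting for local reader lock");
        }
        if (!acquired) {
            // Idea 2: fail fast and serve the read through the fallback path.
            return fallback.read(buf, off, len);
        }
        try {
            // Idea 1: the local read is still serialized, but waiters are bounded.
            return local.read(buf, off, len);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws IOException {
        Reader localReader = (buf, off, len) -> len;   // pretends to fill the buffer
        Reader remoteReader = (buf, off, len) -> len;  // pretends to fill the buffer
        TryLockReadSketch reader = new TryLockReadSketch(localReader, remoteReader, 10);
        System.out.println(reader.read(new byte[8], 0, 8));
    }
}
```

Whether a fallback read is actually cheaper than waiting depends on the workload. With a slow or bad disk the fallback may hit the same device, so the timeout would need tuning or an alternate replica.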