[ https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881281#comment-13881281 ]
stack commented on HDFS-5776:
-----------------------------

So, to enable, we set DFS_DFSCLIENT_HEDGED_READ_THREADPOOL_SIZE in DN config. Should the number of threads in the hbase case be greater than NUMBER_OF_HBASE_OPEN_FILES (though this is most often an unknown number, one that changes over the life of the hbase process and is frequently up in the thousands)? Otherwise we could set it to some 'sensible' number like 16 and then just watch the metrics this patch also adds. If we are too often running the requests in the current thread because the executor has none to spare, then we can up the number of pool threads (though it requires a DN restart, a PITA). That should work for the first cut at this feature.

nit: You could declare and assign in one go rather than postpone the assign to the constructor:

  HEDGED_READ_METRIC = new DFSHedgedReadMetrics();

What is your thinking regarding the boolean enabling/disabling hedged reads in DFSClient [~xieliang007]? On the one hand, there is a problem: the pool size is set in DN config, yet we enable/disable hedged reads in the API. If the DN config has the pool size set to 0, then hedged reads are off (as was noted above), and though we may 'enable' hedged reads in the API, we won't be getting the behaviour we think we should be getting. On the other hand, it looks like this boolean could be used for 'conserving' resources, disabling hedged reads on a per-request basis even though hedged reads have been marked globally 'on' in the DN. Is that your thinking? I'm inclined to agree with the previous reviewers that this may verge on the 'exotic'. For the first cut at this feature, let's have a global on/off switch, with the number of threads being the means of constraining how much hedged reading we do?

Otherwise patch looks great to me.
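To make the pool-sizing question concrete, here is a minimal sketch of a bounded pool that falls back to running the request in the current thread when the executor has none to spare, and counts how often that happens. It is not the code from the patch: the class name HedgedPoolSketch, the counter RAN_IN_CALLER, and the helper overflowCount are all hypothetical, and the "read" is just a task that blocks until released.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class HedgedPoolSketch {
  // Hypothetical metric, declared and assigned in one go.
  static final AtomicLong RAN_IN_CALLER = new AtomicLong();

  /** Builds a pool of at most poolSize threads with no work queue. */
  static ThreadPoolExecutor newPool(int poolSize) {
    // When every pool thread is busy, count the event and run the
    // task in the submitting thread instead of dropping it.
    RejectedExecutionHandler countAndRunInCaller = (task, executor) -> {
      RAN_IN_CALLER.incrementAndGet();
      task.run();
    };
    return new ThreadPoolExecutor(1, poolSize, 60, TimeUnit.SECONDS,
        new SynchronousQueue<>(), countAndRunInCaller);
  }

  /** Submits `tasks` slow reads and returns how many overflowed into the caller. */
  static long overflowCount(int poolSize, int tasks) {
    RAN_IN_CALLER.set(0);
    ThreadPoolExecutor pool = newPool(poolSize);
    CountDownLatch release = new CountDownLatch(1);
    Thread caller = Thread.currentThread();
    Runnable slowRead = () -> {
      // Pool threads stay busy until released; the caller returns at once
      // so that this demo cannot block itself.
      if (Thread.currentThread() != caller) {
        try {
          release.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    };
    for (int i = 0; i < tasks; i++) {
      pool.execute(slowRead);
    }
    long overflow = RAN_IN_CALLER.get();
    release.countDown();
    pool.shutdown();
    try {
      pool.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return overflow;
  }

  public static void main(String[] args) {
    System.out.println("ran in caller: " + overflowCount(2, 5));
  }
}
```

With a pool of 2 and 5 concurrent slow reads, 3 of them run in the caller. A counter like this that climbs steadily would be the signal to raise the pool size.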
> Support 'hedged' reads in DFSClient
> -----------------------------------
>
>                 Key: HDFS-5776
>                 URL: https://issues.apache.org/jira/browse/HDFS-5776
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HDFS-5776-v2.txt, HDFS-5776-v3.txt, HDFS-5776-v4.txt,
> HDFS-5776-v5.txt, HDFS-5776-v6.txt, HDFS-5776-v7.txt, HDFS-5776-v8.txt,
> HDFS-5776.txt
>
>
> This is a placeholder for the hdfs-related work backported from
> https://issues.apache.org/jira/browse/HBASE-7509
> The quorum read ability should be helpful, especially to optimize read outliers.
> We can utilize "dfs.dfsclient.quorum.read.threshold.millis" &
> "dfs.dfsclient.quorum.read.threadpool.size" to enable/disable the hedged read
> ability from the client side (e.g. HBase), and by using DFSQuorumReadMetrics, we
> could export the metric values of interest into the client system (e.g. HBase's
> regionserver metrics).
> The core logic is in the pread code path: we decide to go to the original
> fetchBlockByteRange or the newly introduced fetchBlockByteRangeSpeculative per
> the above config items.
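
For readers unfamiliar with the idea, the threshold-based behaviour described in the issue can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch's fetchBlockByteRangeSpeculative: the class HedgedReadSketch, the methods hedgedRead and demo, and the "primary"/"backup" replicas are hypothetical, and a failed first read is simply rethrown rather than retried.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class HedgedReadSketch {
  /**
   * Starts a read against the first replica; if it has not finished within
   * thresholdMillis, starts a second read and returns whichever ends first.
   */
  static <T> T hedgedRead(ExecutorService pool, List<Callable<T>> replicas,
      long thresholdMillis) throws Exception {
    CompletionService<T> reads = new ExecutorCompletionService<>(pool);
    List<Future<T>> started = new ArrayList<>();
    started.add(reads.submit(replicas.get(0)));
    // Wait up to the threshold for the first read to complete.
    Future<T> done = reads.poll(thresholdMillis, TimeUnit.MILLISECONDS);
    if (done == null) {
      if (replicas.size() > 1) {
        // Too slow: hedge with a second read against another replica.
        started.add(reads.submit(replicas.get(1)));
      }
      done = reads.take();
    }
    try {
      return done.get();
    } finally {
      // Cancel whichever read is still outstanding.
      for (Future<T> f : started) {
        f.cancel(true);
      }
    }
  }

  /** Runs one hedged read where the primary takes primaryDelayMillis. */
  static String demo(long primaryDelayMillis, long thresholdMillis) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    Callable<String> primary = () -> {
      Thread.sleep(primaryDelayMillis);
      return "primary";
    };
    Callable<String> backup = () -> "backup";
    try {
      return hedgedRead(pool, Arrays.asList(primary, backup), thresholdMillis);
    } catch (Exception e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println(demo(2000, 50));
  }
}
```

A threshold of 0 in this sketch would hedge almost every read, so the threshold and the pool size together bound how much extra load hedging adds.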