[ 
https://issues.apache.org/jira/browse/HDFS-5776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886172#comment-13886172
 ] 

Jing Zhao commented on HDFS-5776:
---------------------------------

Thanks for the feedback [~stack]. 

My first question is, how will HBase use the 
enable/disable/setThreadsNumForHedgedReads APIs defined in DFSClient? 
DFSClient's interface audience is private, and DistributedFileSystem#getClient 
is also private in HDFS. I do not see these APIs defined in 
DistributedFileSystem/FileContext in the current patch. Does that mean they will 
be added in a separate jira? If so, we can remove all these APIs 
from the current patch and discuss how to define them in that new jira.

bq. the enable will have no effect
Yes. If the size of the thread pool is still 0 after enableHedgedRead is 
called, hedged reads will not really be enabled, right? This makes the API 
really confusing. Or we could add a javadoc to this method saying "note: this 
method may not really enable hedged reads; you still need to check the 
size of the thread pool..."?
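
To make the confusion concrete, here is a minimal sketch. The class and method 
names below are hypothetical stand-ins, not the real DFSClient or the code in 
the patch; the only assumption taken from the discussion is that the enable 
call flips a flag without creating a thread pool.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the client under discussion; NOT the real DFSClient.
class HedgedReadToggleSketch {
    private ExecutorService pool;        // stays null when the configured size is 0
    private boolean enabledFlag = false;

    // poolSize is assumed to come from the client configuration.
    HedgedReadToggleSketch(int poolSize) {
        if (poolSize > 0) {
            pool = Executors.newFixedThreadPool(poolSize);
        }
    }

    // Flips the flag only; it does not create a thread pool.
    void enableHedgedReads() { enabledFlag = true; }

    void disableHedgedReads() { enabledFlag = false; }

    // Hedged reads can run only if the flag is set AND a pool exists.
    boolean isHedgedReadEffective() { return enabledFlag && pool != null; }

    void close() {
        if (pool != null) {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        HedgedReadToggleSketch client = new HedgedReadToggleSketch(0);
        client.enableHedgedReads();
        // The caller asked for hedged reads, yet nothing is effectively enabled.
        System.out.println(client.isHedgedReadEffective()); // prints "false"
        client.close();
    }
}
```

With a pool size of 0 the enable call succeeds silently, which is the case the 
proposed javadoc note would have to warn about.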

bq. do some heavyweight gymnastics creating your own Configuration – expensive 
– and a new DFSClient – ditto
I assume we can have multiple DFSClient instances here, since we want to 
enable/disable per DFSClient instance? And calling Configuration#set to 
programmatically change the thread pool size is hardly heavyweight 
gymnastics. So even if we disallow changing the thread number dynamically 
from the client side, users can still easily change the thread pool setting 
in an existing configuration object and use it when creating the next 
DFSClient instance.
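
A rough sketch of that per-instance approach follows. To keep it runnable 
without Hadoop on the classpath, a plain Map stands in for Configuration and 
SketchClient is a hypothetical stand-in for DFSClient; only the config key is 
taken from the issue description.

```java
import java.util.HashMap;
import java.util.Map;

class PerClientConfigSketch {
    // Key name taken from the issue description.
    static final String POOL_SIZE_KEY = "dfs.dfsclient.quorum.read.threadpool.size";

    // Hypothetical client: reads the pool size once, at construction time.
    static final class SketchClient {
        final int hedgedPoolSize;

        SketchClient(Map<String, String> conf) {
            hedgedPoolSize = Integer.parseInt(conf.getOrDefault(POOL_SIZE_KEY, "0"));
        }

        boolean hedgedReadsPossible() { return hedgedPoolSize > 0; }
    }

    public static void main(String[] args) {
        // Stand-in for an existing Configuration object (assumption).
        Map<String, String> conf = new HashMap<>();

        SketchClient plain = new SketchClient(conf);   // key unset, size 0

        conf.put(POOL_SIZE_KEY, "10");                 // analogous to Configuration#set
        SketchClient hedged = new SketchClient(conf);  // next instance sees the new size

        System.out.println(plain.hedgedReadsPossible());  // prints "false"
        System.out.println(hedged.hedgedReadsPossible()); // prints "true"
    }
}
```

The first client is unaffected by the later change, so each instance keeps the 
setting it was created with.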

For 2, I do not quite understand why the thread pool size needs to change on 
the fly. I think we should rename setThreadsNumForHedgedReads to 
initializeThreadPoolForHedgedReads and remove the "else" section from that 
method. But if this functionality is really necessary, let's define a clear 
setThreadsNumForHedgedReads method instead of silently changing the thread 
pool size in the constructor of DFSClient.
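
As a sketch of the two options (hypothetical code, not the patch itself; the 
queue type and keep-alive values are assumptions made for illustration): the 
initialize method creates the pool at most once and never resizes it, while 
resizing, if we decide to support it, lives in a separate and explicit setter.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical holder for the hedged-read pool; NOT the real DFSClient.
class HedgedPoolInitSketch {
    private ThreadPoolExecutor pool; // null until initialized

    // Option 1: create the pool at most once. There is no "else" branch,
    // so a later call never resizes an existing pool.
    synchronized void initializeThreadPoolForHedgedReads(int numThreads) {
        if (numThreads <= 0 || pool != null) {
            return;
        }
        // Core size 0, 60s keep-alive and SynchronousQueue are assumptions.
        pool = new ThreadPoolExecutor(0, numThreads, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    // Option 2: an explicit resize, only if on-the-fly changes are required.
    synchronized void setThreadsNumForHedgedReads(int numThreads) {
        if (numThreads <= 0) {
            throw new IllegalArgumentException("numThreads must be positive");
        }
        if (pool == null) {
            initializeThreadPoolForHedgedReads(numThreads);
        } else {
            pool.setMaximumPoolSize(numThreads);
        }
    }

    // Returns 0 when no pool has been created.
    synchronized int getMaxThreads() {
        return pool == null ? 0 : pool.getMaximumPoolSize();
    }

    public static void main(String[] args) {
        HedgedPoolInitSketch sketch = new HedgedPoolInitSketch();
        sketch.initializeThreadPoolForHedgedReads(4);
        sketch.initializeThreadPoolForHedgedReads(8); // ignored, pool already exists
        System.out.println(sketch.getMaxThreads());   // prints "4"
        sketch.setThreadsNumForHedgedReads(8);        // explicit, visible resize
        System.out.println(sketch.getMaxThreads());   // prints "8"
    }
}
```

Either way, the caller can tell from the method name whether the pool size can 
change after construction.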





> Support 'hedged' reads in DFSClient
> -----------------------------------
>
>                 Key: HDFS-5776
>                 URL: https://issues.apache.org/jira/browse/HDFS-5776
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HDFS-5776-v10.txt, HDFS-5776-v11.txt, HDFS-5776-v12.txt, 
> HDFS-5776-v12.txt, HDFS-5776-v2.txt, HDFS-5776-v3.txt, HDFS-5776-v4.txt, 
> HDFS-5776-v5.txt, HDFS-5776-v6.txt, HDFS-5776-v7.txt, HDFS-5776-v8.txt, 
> HDFS-5776-v9.txt, HDFS-5776.txt
>
>
> This is a placeholder for the HDFS-related changes backported from 
> https://issues.apache.org/jira/browse/HBASE-7509
> The quorum read ability should be helpful, especially for optimizing read 
> outliers.
> We can use "dfs.dfsclient.quorum.read.threshold.millis" & 
> "dfs.dfsclient.quorum.read.threadpool.size" to enable/disable the hedged read 
> ability from the client side (e.g. HBase), and by using DFSQuorumReadMetrics we 
> can export the metric values of interest into the client system (e.g. HBase's 
> regionserver metrics).
> The core logic is in the pread code path: we decide whether to go to the 
> original fetchBlockByteRange or the newly introduced 
> fetchBlockByteRangeSpeculative based on the above config items.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
