[ 
https://issues.apache.org/jira/browse/HDFS-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985849#comment-13985849
 ] 

Colin Patrick McCabe commented on HDFS-6286:
--------------------------------------------

bq. 2) read(buf) inside DFSInputStream is synchronized, which means that if an 
HBase scan() hits a sick disk or severe I/O contention, it will block all 
subsequent read(buf) requests, right? 

It will block all subsequent read requests on that stream, yes.  We don't 
guarantee thread-safety if you start using a stream from multiple threads 
anyway.

bq. 1) Hedged reads currently work only for pread, not for the read(buf) 
operation, and HBase scan() calls read(buf).

Yes, hedged reads only work for {{pread()}} now.  We ought to extend them to 
all forms of {{read()}}.  That would be a big latency win across the board, 
not only for local reads.
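
For anyone following along, here is a rough sketch of the hedging idea itself. It is not the DFSInputStream code: the class {{HedgedReadSketch}}, the method {{hedgedRead}}, and the {{hedgeDelayMs}} parameter are hypothetical names used only for illustration, and each replica read is modeled as a plain {{Callable}}. The first read is started, and if it has not finished within the delay, the next replica is tried in parallel; whichever succeeds first wins.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Illustrative only: all names here are hypothetical, not HDFS APIs.
class HedgedReadSketch {

    // Starts replica reads one at a time. If nothing has completed within
    // hedgeDelayMs, the next replica is started alongside the earlier ones.
    // Returns the first successful result and cancels whatever is left.
    static <T> T hedgedRead(List<Callable<T>> replicas, long hedgeDelayMs,
                            ExecutorService pool)
            throws IOException, InterruptedException {
        CompletionService<T> done = new ExecutorCompletionService<>(pool);
        List<Future<T>> started = new ArrayList<>();
        int failures = 0;
        try {
            for (Callable<T> replica : replicas) {
                started.add(done.submit(replica));
                // Wait briefly for any in-flight read before hedging again.
                Future<T> f = done.poll(hedgeDelayMs, TimeUnit.MILLISECONDS);
                if (f != null) {
                    try {
                        return f.get();
                    } catch (ExecutionException e) {
                        failures++; // that replica failed; try the next one
                    }
                }
            }
            // Every replica has been started; wait for the first success.
            while (failures < started.size()) {
                try {
                    return done.take().get();
                } catch (ExecutionException e) {
                    failures++;
                }
            }
            throw new IOException("all " + started.size() + " replica reads failed");
        } finally {
            // Best effort: interrupt reads that lost the race.
            for (Future<T> f : started) {
                f.cancel(true);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<Callable<String>> replicas = new ArrayList<>();
        replicas.add(() -> { Thread.sleep(2000); return "slow replica"; });
        replicas.add(() -> "fast replica");
        System.out.println(hedgedRead(replicas, 50, pool));
        pool.shutdownNow();
    }
}
```

Note that cancelling the losing read is only best effort in this sketch; a thread stuck in blocking disk I/O may not respond to interruption.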

> adding a timeout setting for local read io
> ------------------------------------------
>
>                 Key: HDFS-6286
>                 URL: https://issues.apache.org/jira/browse/HDFS-6286
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>
> Currently, if a write or remote read hits a sick disk, 
> DFSClient.hdfsTimeout gives the caller a bounded time in which the call 
> returns, but it does not apply to local reads. Take an HBase scan, for example:
> DFSInputStream.read -> readWithStrategy -> readBuffer -> 
> BlockReaderLocal.read ->  dataIn.read -> FileChannelImpl.read
> If it hits a bad disk, the slow read I/O can take tens of seconds, and 
> what's worse, "DFSInputStream.read" holds a lock the whole time.
> To my knowledge, there is no good mechanism to cancel a running read I/O 
> (please correct me if I'm wrong), so my suggestion is to wrap the read 
> request in a future and set a timeout on it; if the threshold is reached, 
> we could probably add the local node to the dead-node list.
> Any thoughts?
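
A minimal sketch of the future-plus-timeout idea from the description above, under stated assumptions: the class {{TimedReadSketch}} and the method {{readWithTimeout}} are hypothetical names, and a plain {{java.io.InputStream}} stands in for the local block reader. It only bounds how long the caller waits. It does not show how the client would mark the node dead, and it cannot unblock a read that is stuck inside the kernel.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative only: all names here are hypothetical, not HDFS APIs.
class TimedReadSketch {

    // Runs in.read(buf) on a pool thread and waits at most timeoutMs for it.
    // On timeout the caller gets an IOException whose cause is the
    // TimeoutException; the read itself may still be running in the pool.
    static int readWithTimeout(InputStream in, byte[] buf, long timeoutMs,
                               ExecutorService pool) throws IOException {
        Callable<Integer> task = () -> in.read(buf);
        Future<Integer> f = pool.submit(task);
        try {
            return f.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Best effort: blocked file I/O may ignore the interrupt, and the
            // abandoned read may still write into buf, so do not reuse buf.
            f.cancel(true);
            throw new IOException("read timed out after " + timeoutMs + " ms", e);
        } catch (ExecutionException e) {
            throw new IOException("read failed", e.getCause());
        } catch (InterruptedException e) {
            f.cancel(true);
            Thread.currentThread().interrupt();
            throw new InterruptedIOException("interrupted while waiting for read");
        }
    }
}
```

A caller could catch the timeout and retry against another replica. If the underlying read holds a lock, as the description says "DFSInputStream.read" does, that lock stays held until the stuck read returns, so the wrapper would need to sit below the lock to help other readers of the same stream.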



