[ 
https://issues.apache.org/jira/browse/HDFS-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983194#comment-13983194
 ] 

Colin Patrick McCabe commented on HDFS-6286:
--------------------------------------------

It seems like enabling hedged reads (which has been merged as HDFS-5776) is a 
better solution to the problem of high-latency local reads.

bq. Per my knowledge, there's no good mechanism to cancel a running read 
io(Please correct me if it's wrong),

You are correct that there is no mechanism for userspace to cancel a 
synchronous I/O operation in the kernel.

bq. my opinion is adding a future around the read request, and we could set a 
timeout there, if the threshold reached, we can add the local node into 
deadnode probably... Any thought?

We can't afford to construct a future on each read.  Reads are often quite 
small and that would generate too much garbage.  We could potentially measure 
how long each read took by calling {{System.nanoTime}} or similar.  (On most 
Linux variants, this is a low-cost call which doesn't need to transition to 
kernel space.)
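
As a rough illustration of the timing idea, here is a minimal sketch that 
wraps a read with {{System.nanoTime}} and counts slow reads without 
allocating anything per call.  The class name {{ReadTimer}}, its fields, and 
the threshold parameter are hypothetical and are not part of the HDFS client.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not HDFS code: times each read and counts slow ones.
final class ReadTimer {
    private final long slowThresholdNanos; // assumed cutoff for a "slow" read
    private long totalReads;
    private long slowReads;

    ReadTimer(long slowThresholdNanos) {
        this.slowThresholdNanos = slowThresholdNanos;
    }

    // Performs one read and records its latency; no per-read object allocation.
    int timedRead(InputStream in, byte[] buf) throws IOException {
        long start = System.nanoTime();
        int n = in.read(buf, 0, buf.length);
        long elapsed = System.nanoTime() - start;
        totalReads++;
        if (elapsed > slowThresholdNanos) {
            slowReads++;
        }
        return n;
    }

    long getTotalReads() { return totalReads; }

    long getSlowReads() { return slowReads; }

    public static void main(String[] args) throws IOException {
        ReadTimer timer = new ReadTimer(50_000_000L); // 50 ms, arbitrary
        byte[] buf = new byte[4];
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5});
        int n = timer.timedRead(in, buf);
        System.out.println("read " + n + " bytes, slow reads so far: "
                + timer.getSlowReads());
    }
}
```

A counter like this only records latency after the fact.  It does not 
interrupt a read that is already blocked in the kernel.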

But setting a timeout is going to be very problematic.  For one thing, if the 
client hits a GC pause, all of its local reads might then shut down due to the 
timeout, which would just make performance worse.  I've also seen perfectly 
good disks occasionally become slow under heavy load, so a timeout could end 
up marking a healthy local node as dead.

I think it's better just to use hedged reads when latency is a concern (such as 
in HBase).  This gets you all the same benefits, and doesn't require any code 
changes.  It also helps when you are doing non-local reads, which this 
change would not.
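
For readers unfamiliar with the pattern, here is a minimal, generic sketch of 
a hedged read: start the primary read, and if it has not finished within a 
threshold, start a backup read and return whichever completes first.  This is 
not the HDFS-5776 implementation; {{HedgedRead}}, its parameters, and the use 
of {{ExecutorCompletionService}} are assumptions made for illustration only.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of the hedged-read pattern, not HDFS code.
final class HedgedRead {
    // Runs primary; if it is not done within thresholdMillis, also runs backup
    // and returns the result of whichever task completes first.
    static <T> T read(ExecutorService pool, Callable<T> primary,
                      Callable<T> backup, long thresholdMillis) throws Exception {
        CompletionService<T> cs = new ExecutorCompletionService<>(pool);
        Future<T> first = cs.submit(primary);
        Future<T> done = cs.poll(thresholdMillis, TimeUnit.MILLISECONDS);
        if (done != null) {
            return done.get(); // primary finished in time (or failed; get() throws)
        }
        Future<T> second = cs.submit(backup);
        try {
            // Sketch only: if the first task to complete failed, this throws
            // instead of waiting for the other one.
            return cs.take().get();
        } finally {
            // Best-effort cancel of the loser. An interrupt does not guarantee
            // that a read already blocked in the kernel stops immediately.
            first.cancel(true);
            second.cancel(true);
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            String result = HedgedRead.<String>read(
                    pool,
                    () -> { Thread.sleep(2000); return "slow replica"; },
                    () -> "fast replica",
                    50);
            System.out.println(result);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Unlike a timeout that marks the local node dead, a slow primary here only 
costs one extra read, and the caller still gets the primary's result if it 
happens to finish first.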

> adding a timeout setting for local read io
> ------------------------------------------
>
>                 Key: HDFS-6286
>                 URL: https://issues.apache.org/jira/browse/HDFS-6286
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>
> Currently, if a write or remote read is issued against a sick disk, 
> DFSClient.hdfsTimeout gives the caller a bounded time in which the call 
> returns, but it doesn't apply to local reads. Take an HBase scan for example:
> DFSInputStream.read -> readWithStrategy -> readBuffer -> 
> BlockReaderLocal.read -> dataIn.read -> FileChannelImpl.read
> If it hits a bad disk, the read I/O may take tens of seconds, and 
> what's worse, "DFSInputStream.read" holds a lock the whole time.
> To my knowledge, there's no good mechanism to cancel a running read 
> I/O (please correct me if I'm wrong), so my suggestion is to wrap the read 
> request in a future and set a timeout on it; if the threshold is reached, 
> we could probably add the local node to the dead nodes...
> Any thoughts?



--
This message was sent by Atlassian JIRA
(v6.2#6252)
