[ https://issues.apache.org/jira/browse/HDFS-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985096#comment-13985096 ]
Liang Xie commented on HDFS-6286:
---------------------------------

Thanks [~cmccabe] for your comment. My concerns are:
1) Hedged read currently works only on pread, not on the read(buf) operation, and HBase scan() calls into read(buf).
2) read(buf) inside DFSInputStream is synchronized, which means that if an HBase scan() hits a sick disk or severe I/O contention, it will block all subsequent read(buf) requests, right? :)

> adding a timeout setting for local read io
> ------------------------------------------
>
>                 Key: HDFS-6286
>                 URL: https://issues.apache.org/jira/browse/HDFS-6286
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>
> Currently, if a write or remote read is issued against a sick disk,
> DFSClient.hdfsTimeout helps guarantee the caller an upper bound on the time
> before the call returns, but it does not work for local reads. Take an HBase
> scan for example:
> DFSInputStream.read -> readWithStrategy -> readBuffer ->
> BlockReaderLocal.read -> dataIn.read -> FileChannelImpl.read
> If it hits a bad disk, the low-level read I/O can take tens of seconds, and
> what's worse, "DFSInputStream.read" holds a lock the whole time.
> To my knowledge, there is no good mechanism to cancel a running read I/O
> (please correct me if I'm wrong), so my suggestion is to wrap the read
> request in a future and set a timeout on it. If the threshold is reached,
> we could probably add the local node to the dead nodes...
> Any thoughts?

--
This message was sent by Atlassian JIRA
(v6.2#6252)
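
For illustration only, the "future around the read request" idea from the issue description might look roughly like the sketch below. It is not DFSClient code: the class TimedLocalRead, the method readWithTimeout, and the TIMED_OUT marker are hypothetical names, and the Callable stands in for the blocking local read (for example the BlockReaderLocal.read call in the stack above). Only standard java.util.concurrent calls are used.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedLocalRead {
    /** Hypothetical marker meaning "the read did not finish within the timeout". */
    static final int TIMED_OUT = Integer.MIN_VALUE;

    /**
     * Runs a blocking read on a pool thread and waits at most timeoutMs for it.
     * Returns the byte count reported by the read, or TIMED_OUT if the wait expired.
     */
    static int readWithTimeout(ExecutorService pool, Callable<Integer> read, long timeoutMs)
            throws InterruptedException, ExecutionException {
        Future<Integer> pending = pool.submit(read);
        try {
            return pending.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // cancel(true) only requests interruption of the worker thread;
            // it does not guarantee that the underlying disk read stops.
            pending.cancel(true);
            // A caller could react here, e.g. by treating the local node as dead
            // and retrying against another replica (assumption, not shown).
            return TIMED_OUT;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            // A healthy read returns immediately.
            int fast = readWithTimeout(pool, () -> 4096, 1000);
            // A read on a "sick disk", simulated here with a long sleep.
            int slow = readWithTimeout(pool, () -> { Thread.sleep(5000); return 4096; }, 100);
            System.out.println("fast=" + fast + " slowTimedOut=" + (slow == TIMED_OUT));
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Note that the caller gets a bounded wait, but the worker thread may remain stuck in the read until the disk responds, so the pool would need some bound of its own. Whether the wait happens inside or outside the DFSInputStream lock also matters for point 2) above; this sketch does not address that.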