[jira] [Commented] (HDFS-4273) Problem in DFSInputStream read retry logic may cause early failure

Jing Zhao (JIRA) Wed, 05 Dec 2012 17:33:00 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-4273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511012#comment-13511012
 ]


Jing Zhao commented on HDFS-4273:
---------------------------------

I do not think DFSInputStream#read will be called concurrently, so resetting 
the failure variable should be correct. Clearing deadNodes also looks 
reasonable, especially when the block has multiple replicas: at that point you 
have no candidate nodes left to try, and some of the earlier temporary "deaths" 
may already have recovered. 
                
> Problem in DFSInputStream read retry logic may cause early failure
> ------------------------------------------------------------------
>
>                 Key: HDFS-4273
>                 URL: https://issues.apache.org/jira/browse/HDFS-4273
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Binglin Chang
>            Assignee: Binglin Chang
>            Priority: Minor
>         Attachments: TestDFSInputStream.java
>
>
> Assume the following call logic
> {noformat} 
> readWithStrategy()
>   -> blockSeekTo()
>   -> readBuffer()
>      -> reader.doRead()
>      -> seekToNewSource() adds currentNode to deadNodes, hoping to get a different datanode
>         -> blockSeekTo()
>            -> chooseDataNode()
>               -> block missing, clear deadNodes and pick the currentNode again
>         seekToNewSource() returns false
>      readBuffer() re-throws the exception and quits the loop
> readWithStrategy() gets the exception and may fail the read call before 
> MaxBlockAcquireFailures attempts have been made.
> {noformat} 
> Some issues with this logic:
> 1. seekToNewSource() is broken because deadNodes may be cleared in the 
> middle of it.
> 2. The variable "int retries=2" in readWithStrategy seems to conflict with 
> MaxBlockAcquireFailures; should it be removed?
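
The call trace quoted above can be reproduced with a small standalone model. This is only a sketch under stated assumptions, not the actual DFSInputStream code: the class RetrySketch, the limit of 3, the node name "dn1", and the simplified method bodies are all hypothetical, and the "int retries=2" loop in readWithStrategy is left out. It models just the steps in the trace: a failed read, seekToNewSource() marking the current node dead, chooseDataNode() clearing deadNodes and handing back the same node, and the exception being re-thrown.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the retry path in the trace; NOT the real Hadoop classes.
class RetrySketch {
    // Hypothetical limit standing in for MaxBlockAcquireFailures.
    static final int MAX_BLOCK_ACQUIRE_FAILURES = 3;

    final List<String> replicas;
    final Set<String> deadNodes = new HashSet<>();
    String currentNode;
    int failures = 0;      // block-acquire failures counted so far
    int readAttempts = 0;  // how many times doRead() was called

    RetrySketch(List<String> replicas) {
        this.replicas = replicas;
    }

    // Return the first replica not marked dead. If every replica is dead,
    // clear deadNodes and try again, up to the failure limit.
    String chooseDataNode() throws IOException {
        while (true) {
            for (String node : replicas) {
                if (!deadNodes.contains(node)) {
                    return node;
                }
            }
            if (failures >= MAX_BLOCK_ACQUIRE_FAILURES) {
                throw new IOException("could not obtain block");
            }
            deadNodes.clear(); // this also forgets the node that just failed
            failures++;
        }
    }

    // Mark the current node dead and look for a different one.
    // Returns false when the "new" node is the same as the old one.
    boolean seekToNewSource() throws IOException {
        String oldNode = currentNode;
        deadNodes.add(oldNode);
        currentNode = chooseDataNode();
        return !currentNode.equals(oldNode);
    }

    // Simulated read that always fails.
    int doRead() throws IOException {
        readAttempts++;
        throw new IOException("simulated read error from " + currentNode);
    }

    // One read with a single switch to another source, as in the trace.
    int readBuffer() throws IOException {
        try {
            return doRead();
        } catch (IOException e) {
            if (seekToNewSource()) {
                return doRead();
            }
            throw e; // no different node was found: re-throw
        }
    }

    int read() throws IOException {
        failures = 0; // the per-call reset discussed in the comment
        currentNode = chooseDataNode();
        return readBuffer();
    }

    public static void main(String[] args) {
        RetrySketch stream = new RetrySketch(Arrays.asList("dn1"));
        try {
            stream.read();
        } catch (IOException e) {
            System.out.println("read failed: " + e.getMessage());
            System.out.println("read attempts: " + stream.readAttempts);
            System.out.println("failures counted: " + stream.failures);
        }
    }
}
```

In this model, with a single replica, the read fails after one read attempt while the failure count is still below the hypothetical limit, which is the early failure the report describes.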

