When DFS client fails to read from a datanode, the failed datanode is not 
excluded from target reselection 
-----------------------------------------------------------------------------------------------------------

                 Key: HADOOP-698
                 URL: http://issues.apache.org/jira/browse/HADOOP-698
             Project: Hadoop
          Issue Type: Bug
          Components: dfs
            Reporter: Hairong Kuang


In the method read(byte buf[ ], int off, int len) of DFSInputStream, when a read 
fails, it calls "blockSeekTo" to reselect a datanode. However, the failed 
datanode is not passed back to blockSeekTo. The datanode selection algorithm 
works as follows:
* If the machine that the client is running on has a local copy, return the 
local machine;
* Otherwise, pick one location at random.
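
To make the two rules concrete, here is a minimal sketch of the selection as 
described above. It is not the actual DFSClient code: the class name, the 
method signature, and the use of plain host-name strings for replica 
locations are assumptions made only for illustration.

```java
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; names and types are not taken from the Hadoop source.
final class NaiveSelection {
    // Returns the local host if it holds a replica, otherwise a random replica location.
    static String chooseDatanode(List<String> locations, String localHost, Random rng) {
        if (locations.isEmpty()) {
            throw new IllegalArgumentException("no replica locations");
        }
        for (String node : locations) {
            if (node.equals(localHost)) {
                return node; // rule 1: prefer the local copy
            }
        }
        // rule 2: no local copy, so pick any location at random
        return locations.get(rng.nextInt(locations.size()));
    }
}
```

Note that nothing in this signature carries information about earlier 
failures, so a retry sees exactly the same inputs as the first attempt.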

Because the failed datanode is not fed back into target reselection, there are 
two flaws:
1. When a client fails to read from the local copy, for example, because of a 
checksum error, the local machine will be chosen again on every retry.
2. Random selection may still return the same failed node.
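
One possible way to address both flaws is to pass a set of failed nodes into 
the selection and filter them out first. The sketch below is only an 
illustration under stated assumptions, not a proposed patch: the class name, 
the "failed" parameter, the string host names, and the choice to return null 
when every replica has failed are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical illustration only; names and types are not taken from the Hadoop source.
final class ExcludingSelection {
    // Chooses a replica location, skipping any node recorded in the failed set.
    // Returns null when every known location has already failed.
    static String chooseDatanode(List<String> locations, String localHost,
                                 Set<String> failed, Random rng) {
        List<String> candidates = new ArrayList<>();
        for (String node : locations) {
            if (!failed.contains(node)) {
                candidates.add(node);
            }
        }
        if (candidates.isEmpty()) {
            return null; // nothing left to try
        }
        if (candidates.contains(localHost)) {
            return localHost; // local copy is still preferred, unless it has failed
        }
        return candidates.get(rng.nextInt(candidates.size()));
    }
}
```

In this sketch the caller would add a node to the failed set after each 
unsuccessful read and then call chooseDatanode again, so neither the local 
machine nor a randomly chosen node can be returned twice after failing.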



-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
