[ https://issues.apache.org/jira/browse/HADOOP-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Raghu Angadi updated HADOOP-3339:
---------------------------------

      Resolution: Fixed
    Release Note: Some failures on the 3rd datanode in the DFS write pipeline are not detected properly. This could lead to hard failure of the client's write operation.
          Status: Resolved  (was: Patch Available)

I just committed this.

> DFS Write pipeline does not detect defective datanode correctly if it times out.
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-3339
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3339
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.16.0
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>             Fix For: 0.18.0
>
>         Attachments: HADOOP-3339.patch, tmp-3339-dn.patch
>
>
> When DFSClient is writing to DFS, it does not correctly detect the culprit datanode (rather, the datanodes do not inform the client properly) if the bad node times out.
> Say the last datanode in a 3-node pipeline is too slow or defective. In this case, the pipeline removes the first two datanodes in the first two attempts. The third attempt has only the 3rd datanode in the pipeline, and it fails too. If the pipeline detected the bad 3rd node when the first failure occurs, the write would succeed in the second attempt.
> I will attach example logs of such cases. I think this should be fixed in 0.17.x.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
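
To illustrate the failure mode quoted above: the sketch below is a hypothetical simplification, not Hadoop's actual DFSClient/DataNode code. The node names, the errorIndex choice, and the writeWithRetries helper are invented for illustration. It shows how blaming the wrong node on each failure removes the two healthy datanodes and hard-fails the write, while blaming the node that actually timed out lets the write succeed on the second attempt.

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Illustrative sketch only (not the real DFSClient/DataNode code): models how
 * a write pipeline retry behaves when the *last* datanode is the slow or
 * defective one. The errorIndex logic and retry loop are hypothetical
 * simplifications of the behavior described in HADOOP-3339.
 */
public class PipelineRecoverySketch {

    /** The (pretend) write succeeds only if the defective node is gone. */
    static boolean write(List<String> pipeline, String badNode) {
        return !pipeline.contains(badNode);
    }

    /**
     * Retry loop: on each failure, drop the node identified by errorIndex.
     * If errorIndex never points at the real culprit (the bug described in
     * this issue), healthy upstream nodes get removed one by one until only
     * the bad node is left and the write hard-fails.
     */
    static boolean writeWithRetries(List<String> nodes, String badNode,
                                    boolean detectCulpritCorrectly) {
        List<String> pipeline = new ArrayList<>(nodes);
        while (!pipeline.isEmpty()) {
            if (write(pipeline, badNode)) {
                System.out.println("write succeeded with pipeline " + pipeline);
                return true;
            }
            // Pick which node to blame for the failure.
            int errorIndex = detectCulpritCorrectly
                    ? pipeline.indexOf(badNode)  // fixed behavior: blame the node that timed out
                    : 0;                         // buggy behavior: blame the first (healthy) node
            System.out.println("write failed, removing " + pipeline.get(errorIndex));
            pipeline.remove(errorIndex);
        }
        System.out.println("write hard-failed: no datanodes left");
        return false;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("dn1", "dn2", "dn3");
        String badNode = "dn3"; // the last node in the pipeline times out

        // Before the fix: dn1 and dn2 are removed first, then the attempt
        // with dn3 alone fails as well, so the client's write hard-fails.
        writeWithRetries(nodes, badNode, false);

        // After the fix: dn3 is removed after the first failure, and the
        // write succeeds on the second attempt with dn1 and dn2.
        writeWithRetries(nodes, badNode, true);
    }
}
{code}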