[ https://issues.apache.org/jira/browse/HDFS-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dhruba borthakur resolved HDFS-1384.
------------------------------------

    Resolution: Duplicate

This bug has been fixed in trunk because the client sends the excluded list to the namenode with the addBlock RPC. The NN ensures that it does not return a datanode from the excluded list. This bug is still present in the 0.20-append branch.

> NameNode should give client the first node in the pipeline from different rack other than that of excludedNodes list in the same rack.
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1384
>                 URL: https://issues.apache.org/jira/browse/HDFS-1384
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.20-append, 0.20.1
>            Reporter: Thanh Do
>
> We saw a case where the NN keeps giving the client nodes from the same rack,
> which causes an exception on the client when it tries to set up the pipeline.
> The client retries 5 times and fails.
>
> Here are more details. Suppose we have 2 racks:
> - Rack 0: from dn1 to dn7
> - Rack 1: from dn8 to dn14
> The client asks for 3 dns and the NN replies with dn1, dn8 and dn9, for example.
> Because of a network partition, the client cannot see any node in Rack 0.
> Hence, the client adds dn1 to the excludedNodes list and asks the NN again.
> Interestingly, the NN picks a different node in Rack 0 (different from those
> in excludedNodes) and gives it back to the client, and so on. The client keeps
> retrying, and after 5 retries the write fails.
>
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (than...@cs.wisc.edu) and
> Haryadi Gunawi (hary...@eecs.berkeley.edu)
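
The scenario in the report can be reproduced with a small toy model. The sketch below is not HDFS code: the function names, the deterministic "lowest-numbered datanode first" choice, and the limit of 5 attempts are all assumptions made for illustration (the real placement policy is not deterministic in this way). It compares a namenode that only skips the excluded datanodes with one that also prefers a rack containing no excluded node, which is the behaviour the issue title asks for.

```python
from typing import Dict, List, Optional, Set

# Topology from the report: dn1..dn7 in rack 0, dn8..dn14 in rack 1.
DATANODES: List[str] = [f"dn{i}" for i in range(1, 15)]
RACK_OF: Dict[str, str] = {
    dn: ("rack0" if i <= 7 else "rack1")
    for i, dn in enumerate(DATANODES, start=1)
}


def choose_first_node(excluded: Set[str], prefer_other_rack: bool) -> Optional[str]:
    """Hypothetical namenode choice of the first pipeline node.

    Excluded datanodes are never returned. When prefer_other_rack is set,
    racks that already contain an excluded node are avoided if possible.
    The lowest-numbered candidate is chosen (an assumption, for repeatability).
    """
    candidates = [dn for dn in DATANODES if dn not in excluded]
    if prefer_other_rack:
        suspect_racks = {RACK_OF[dn] for dn in excluded}
        other_rack = [dn for dn in candidates if RACK_OF[dn] not in suspect_racks]
        if other_rack:
            candidates = other_rack
    return candidates[0] if candidates else None


def write_block(reachable_racks: Set[str], prefer_other_rack: bool,
                max_attempts: int = 5) -> Optional[str]:
    """Hypothetical client loop: exclude each unreachable node and ask again."""
    excluded: Set[str] = set()
    for _ in range(max_attempts):
        node = choose_first_node(excluded, prefer_other_rack)
        if node is None:
            return None          # no candidate left
        if RACK_OF[node] in reachable_racks:
            return node          # pipeline can be set up
        excluded.add(node)       # unreachable: exclude it and retry
    return None                  # attempts exhausted, write fails


# Network partition: the client can only reach rack 1.
print(write_block({"rack1"}, prefer_other_rack=False))
print(write_block({"rack1"}, prefer_other_rack=True))
```

In this toy model, skipping only the excluded datanodes walks through dn1 to dn5 and then gives up, while the rack-aware variant switches to Rack 1 on the second attempt. The worst-case ordering is deliberate, so the first result says nothing about how often the trunk behaviour would fail in practice.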