Hi Mei,

These questions are a better fit for the whole hdfs-dev@ list, which I am adding to this reply.

You're correct that when the local node is not chosen, a random node on the same rack as the writer may be tried. However, if the writer node itself had no local DN in the first place, a completely random node on any rack may be selected. This is visible at http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java?view=markup (around line 263).

On Thu, Apr 4, 2013 at 12:53 AM, Mei Long <lon...@gmail.com> wrote:
> Hi Harsh,
>
> I found your post on JIRA regarding the comments. I find them very confusing
> as well. I'm trying to figure out the following comment:
>
> * the 1st replica is placed on the local machine,
> * otherwise a random datanode.
>
> Do you know when it says "otherwise a random datanode," does it mean any
> random datanode anywhere on the network or a random datanode on the same
> rack as the local machine? I've been looking at the code for an hour and I'm
> getting more confused with the comment and code in chooseLocalNode()
>
> /* choose <i>localMachine</i> as the target.
>
> * if <i>localMachine</i> is not availabe,
>
> * choose a node on the same rack
>
> * @return the choosen node
>
> */
>
> Your help is much appreciated!
>
> Mei

--
Harsh J
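
P.S. In case it helps, the fallback order described at the top of this mail can be sketched roughly as below. This is only an illustration, not the actual HDFS code: `FirstReplicaSketch`, `Node`, `pickRandom` and `chooseFirstReplica` are made-up names, a `null` writer stands in for "the writer has no local DN", and the final step (falling back to any rack when the writer's rack has no usable node) is a simplifying assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class FirstReplicaSketch {
    /** Hypothetical stand-in for a datanode: a name, a rack path, and a usability flag. */
    static final class Node {
        final String name;
        final String rack;
        final boolean usable;

        Node(String name, String rack, boolean usable) {
            this.name = name;
            this.rack = rack;
            this.usable = usable;
        }
    }

    /** Picks a random usable node, optionally restricted to one rack (rack == null means any rack). */
    static Node pickRandom(List<Node> cluster, String rack, Node exclude, Random rnd) {
        List<Node> candidates = new ArrayList<>();
        for (Node n : cluster) {
            boolean rackOk = rack == null || rack.equals(n.rack);
            if (n.usable && n != exclude && rackOk) {
                candidates.add(n);
            }
        }
        return candidates.isEmpty() ? null : candidates.get(rnd.nextInt(candidates.size()));
    }

    /** Chooses a target for the first replica; writer == null models a writer with no local DN. */
    static Node chooseFirstReplica(Node writer, List<Node> cluster, Random rnd) {
        if (writer == null) {
            // No local datanode at all: any usable node on any rack.
            return pickRandom(cluster, null, null, rnd);
        }
        if (writer.usable) {
            // The local machine is a good target, so use it.
            return writer;
        }
        // Local machine rejected: try a random node on the writer's rack.
        Node sameRack = pickRandom(cluster, writer.rack, writer, rnd);
        if (sameRack != null) {
            return sameRack;
        }
        // Assumption of this sketch: nothing usable on that rack, so widen to any rack.
        return pickRandom(cluster, null, writer, rnd);
    }
}
```

So "otherwise a random datanode" only means "anywhere in the cluster" in the first branch; with a local DN present but unusable, the same-rack attempt comes first.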