Re: datanode replication

Steve Loughran Mon, 11 May 2009 03:13:20 -0700

Jeff Hammerbacher wrote:

Hey Vishal,


Check out the chooseTarget() method(s) of ReplicationTargetChooser.java in
the org.apache.hadoop.hdfs.server.namenode package:
http://svn.apache.org/viewvc/hadoop/core/trunk/src/hdfs/org/apache/hadoop/hdfs/server/namenode/ReplicationTargetChooser.java?view=markup
.

In words: assuming you're using the default replication level (3), the
default strategy will put one replica of each block on the local node, one
on a node in a remote rack, and another on a different node in that same
remote rack.
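
To make that description concrete, here is a rough Python sketch of the three-replica rule as stated above. It is not the code in ReplicationTargetChooser.java: the function name choose_targets, the rack-to-nodes dictionary, and the assumption that the writer is itself a datanode are all made up for illustration. The fallbacks the real chooser needs (full or dead nodes, single-rack clusters) are left out.

```python
import random
from typing import Dict, List, Optional, Tuple

# Hypothetical cluster layout: rack name -> names of the datanodes in that rack.
Cluster = Dict[str, List[str]]


def choose_targets(cluster: Cluster, writer_node: str,
                   rng: Optional[random.Random] = None) -> List[Tuple[str, str]]:
    """Return three (rack, node) targets following the rule described above.

    Illustrative only: assumes the writer runs on a datanode and that at
    least one other rack holds two or more nodes.
    """
    rng = rng or random.Random()

    # Replica 1: the node the writer is running on.
    local_rack = None
    for rack, nodes in cluster.items():
        if writer_node in nodes:
            local_rack = rack
            break
    if local_rack is None:
        raise ValueError("writer node %r is not in the cluster" % writer_node)

    # Replicas 2 and 3: two different nodes on one randomly chosen remote rack.
    remote_racks = [rack for rack, nodes in cluster.items()
                    if rack != local_rack and len(nodes) >= 2]
    if not remote_racks:
        raise ValueError("no remote rack with at least two nodes")
    remote_rack = rng.choice(remote_racks)
    second, third = rng.sample(cluster[remote_rack], 2)

    return [(local_rack, writer_node),
            (remote_rack, second),
            (remote_rack, third)]
```

Calling choose_targets({"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}, "dn1") always returns ("rack1", "dn1") first, followed by the two rack2 nodes in random order. Losing the writer's rack therefore still leaves two copies, while only one of the three writes has to cross racks twice.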

Note that HADOOP-3799 (http://issues.apache.org/jira/browse/HADOOP-3799)
proposes making this strategy pluggable.

Yes, there are some good reasons for having different placement algorithms for different datacentres, and I could even imagine different MR sequences providing hints about where they want data, depending on what they want to do afterwards.
