[ 
http://issues.apache.org/jira/browse/HADOOP-692?page=comments#action_12448880 ] 
            
Sameer Paranjpye commented on HADOOP-692:
-----------------------------------------

Declaring the topology in the slaves file could be brittle. Nodes go down, get 
repaired, and may or may not return to the same location. Particularly if nodes 
are spread across datacenters, keeping the slaves file in sync with every 
topology change will quickly become cumbersome.

Ditto for constructing the topology with timing experiments on the namenode. 
Timing experiments can be unreliable and biased by transient network 
congestion. A persistent topology map is error prone because nodes can move 
around. It's also not clear how frequently the topology map would be 
constructed. Every time node(s) are added or return? 

It appears feasible to determine the location of a node in the network with a 
local operation, probably by running an installation-specific script. The 
Datanode reports its location to the namenode upon registration, and the 
namenode updates a dynamic topology map. This map can be cheaply reconstructed 
at startup while processing Datanode registrations and block reports.
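To make the idea concrete, here is a minimal sketch of the script-based approach. Everything here is an assumption for illustration: the class name, the method signature, and the script path are all hypothetical, and the only contract assumed is that the admin-supplied script prints a rack id (e.g. "/datacenter1/rack7") on its first output line.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

// Hypothetical sketch: resolve this node's rack id by running an
// installation-specific script, falling back to a default rack if the
// script is missing or fails. The Datanode would send the resulting
// string to the namenode as part of its registration.
public class RackResolver {
    public static String resolveRack(String scriptPath) {
        try {
            // Run the admin-provided script (path is an assumption);
            // its first line of output is taken as the rack id.
            Process p = Runtime.getRuntime().exec(scriptPath);
            BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()));
            String rack = r.readLine();
            return (rack == null || rack.trim().isEmpty())
                ? "/default-rack" : rack.trim();
        } catch (Exception e) {
            // Script missing or not executable: place the node in a
            // catch-all rack rather than failing registration.
            return "/default-rack";
        }
    }
}
```

The fallback matters operationally: a node with a broken script still registers, it just loses rack locality until the script is fixed.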

---

On the subject of replica placement, would there be any value in co-locating 
blocks from a single file at a rack or datacenter level? 

Say block 0 is placed on <node a, rack p>, <node b, rack p> and <node c, rack 
q>, i.e. the first replica is node-local, the next rack-local, and the third on 
a different rack. Would it make sense to place block 1 and subsequent blocks 
similarly?

So block 1 could be on <node a, rack p>, <node d, rack p> and <node e, rack q>. 
Then all blocks from the file would end up on racks 'p' and 'q', and it would 
be easy to get rack locality for the entire file.
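A toy sketch of that co-location rule, just to pin down the example above. All names here (the class, the map of racks to nodes, the node/rack labels) are illustrative assumptions, not a proposed implementation: the point is only that once block 0 fixes the rack pair (p, q), later blocks of the same file reuse it.

```java
import java.util.*;

// Hypothetical sketch: place three replicas of a block so that the whole
// file stays on the same two racks. 'racks' maps rack id -> nodes in that
// rack; localNode is the writer's node, localRack its rack, and remoteRack
// the second rack chosen for block 0.
public class ColocatedPlacement {
    static List<String> place(Map<String, List<String>> racks,
                              String localNode, String localRack,
                              String remoteRack) {
        List<String> replicas = new ArrayList<>();
        replicas.add(localNode);                    // 1st replica: writer's node
        for (String n : racks.get(localRack)) {     // 2nd: another node, same rack
            if (!n.equals(localNode)) { replicas.add(n); break; }
        }
        replicas.add(racks.get(remoteRack).get(0)); // 3rd: a node on the remote rack
        return replicas;
    }
}
```

With racks p = {a, b, d} and q = {c, e}, placing a block for a writer on node a yields [a, b, c]: every block of the file lands on racks p and q, so a rack-local reader on either rack sees the entire file.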

This probably needs a lot more modeling and analysis...



> Rack-aware Replica Placement
> ----------------------------
>
>                 Key: HADOOP-692
>                 URL: http://issues.apache.org/jira/browse/HADOOP-692
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>         Assigned To: Hairong Kuang
>             Fix For: 0.9.0
>
>
> This issue assumes that HDFS runs on a cluster of computers spread 
> across many racks. Communication between two nodes on different racks needs 
> to go through switches. Bandwidth in/out of a rack may be less than the total 
> bandwidth of machines in the rack. The purpose of rack-aware replica 
> placement is to improve data reliability, availability, and network bandwidth 
> utilization. The basic idea is that each data node determines which rack it 
> belongs to at startup and notifies the name node of its rack id upon 
> registration. The name node maintains a rackid-to-datanode map and tries to 
> place replicas across racks.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
