[ https://issues.apache.org/jira/browse/HDFS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13983960#comment-13983960 ]
Tsz Wo Nicholas Sze commented on HDFS-5168:
-------------------------------------------

DNSToSwitchMapping is a @Public @Evolving interface, so we have to change it in a compatible manner (otherwise, we cannot commit this to branch-2). We should avoid adding the new getDependency(..) method to it. How about we add another interface, say DNSToSwitchMappingWithDependency, and keep DNSToSwitchMapping unchanged?

More details:
- DNSToSwitchMappingWithDependency extends DNSToSwitchMapping and adds the new getDependency(..) method.
- ScriptBasedMappingWithDependency extends ScriptBasedMapping and RawScriptBasedMappingWithDependency extends RawScriptBasedMapping; change ScriptBasedMapping and RawScriptBasedMapping to allow inheritance.
- Add dependency cache support to ScriptBasedMappingWithDependency.
- DatanodeManager checks if dnsToSwitchMapping instanceof DNSToSwitchMappingWithDependency. If yes, cast the object and get dependencies; otherwise, use an empty list.
- CachedDNSToSwitchMapping and TableMapping remain unchanged.

> BlockPlacementPolicy does not work for cross node group dependencies
> --------------------------------------------------------------------
>
>                 Key: HDFS-5168
>                 URL: https://issues.apache.org/jira/browse/HDFS-5168
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Nikola Vujic
>            Assignee: Nikola Vujic
>            Priority: Critical
>         Attachments: HDFS-5168.patch, HDFS-5168.patch, HDFS-5168.patch,
>                      HDFS-5168.patch, HDFS-5168.patch, HDFS-5168.patch
>
>
> Block placement policies do not work for cross rack/node group dependencies.
> In practice, this is needed when compute servers and storage fall into two
> independent fault domains; in that case, neither BlockPlacementPolicyDefault nor
> BlockPlacementPolicyWithNodeGroup is able to provide proper block
> placement.
> Let's suppose that we have a Hadoop cluster with one rack with two servers, and
> we run 2 VMs per server.
> Node group topology for this cluster would be:
>  server1-vm1 -> /d1/r1/n1
>  server1-vm2 -> /d1/r1/n1
>  server2-vm1 -> /d1/r1/n2
>  server2-vm2 -> /d1/r1/n2
> This works fine as long as server and storage fall into the same fault
> domain, but if storage is in a different fault domain from the server, we
> cannot handle that. For example, if the storage of server1-vm1 is in the
> same fault domain as the storage of server2-vm1, then we must not place two
> replicas on these two nodes even though they are in different node groups.
> Two possible approaches:
> - One approach would be to define cross rack/node group dependencies and to
> use them when excluding nodes from the search space. This looks like the
> cleanest way to fix this, as it requires only minor changes in the
> BlockPlacementPolicy classes.
> - The other approach would be to allow nodes to fall into more than one node group.
> When we choose a node to hold a replica, we have to exclude from the search
> space all nodes from the node groups to which the chosen node belongs. This
> approach may require major changes in NetworkTopology.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
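
For illustration only, here is a minimal sketch of the interface arrangement proposed in the comment: the existing interface stays untouched, a sub-interface carries getDependency, and the caller falls back to an empty list when the mapping does not implement it. This is not the patch itself. The DNSToSwitchMapping shown is a simplified stand-in for the real Hadoop interface, the getDependency(String) signature is an assumption (the comment elides the parameters), and FlatMapping, TableWithDependency, and dependenciesOf are hypothetical names invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DependencySketch {

  // Simplified stand-in for the existing public interface; it is left unchanged.
  interface DNSToSwitchMapping {
    List<String> resolve(List<String> names);
  }

  // New sub-interface. The parameter and return types are assumptions.
  interface DNSToSwitchMappingWithDependency extends DNSToSwitchMapping {
    List<String> getDependency(String name);
  }

  // Hypothetical legacy implementation that knows nothing about dependencies.
  static class FlatMapping implements DNSToSwitchMapping {
    @Override
    public List<String> resolve(List<String> names) {
      List<String> racks = new ArrayList<>();
      for (int i = 0; i < names.size(); i++) {
        racks.add("/default-rack");
      }
      return racks;
    }
  }

  // Hypothetical implementation backed by a fixed host -> dependencies table.
  static class TableWithDependency extends FlatMapping
      implements DNSToSwitchMappingWithDependency {
    private final Map<String, List<String>> deps = new HashMap<>();

    void addDependency(String host, List<String> dependsOn) {
      deps.put(host, dependsOn);
    }

    @Override
    public List<String> getDependency(String name) {
      List<String> found = deps.get(name);
      return found != null ? found : Collections.<String>emptyList();
    }
  }

  // Caller-side check: cast only if the mapping supports dependencies,
  // otherwise use an empty list, so old implementations keep working.
  static List<String> dependenciesOf(DNSToSwitchMapping mapping, String host) {
    if (mapping instanceof DNSToSwitchMappingWithDependency) {
      return ((DNSToSwitchMappingWithDependency) mapping).getDependency(host);
    }
    return Collections.emptyList();
  }

  public static void main(String[] args) {
    TableWithDependency withDeps = new TableWithDependency();
    withDeps.addDependency("server1-vm1", Collections.singletonList("server2-vm1"));
    System.out.println(dependenciesOf(withDeps, "server1-vm1"));          // [server2-vm1]
    System.out.println(dependenciesOf(new FlatMapping(), "server1-vm1")); // []
  }
}
```

Because the check happens at the call site, existing DNSToSwitchMapping implementations compile and behave as before, which is the compatibility property the comment is after.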