[ 
https://issues.apache.org/jira/browse/HDFS-5168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nikola Vujic updated HDFS-5168:
-------------------------------

    Attachment: HDFS-5168.patch

attaching original patch again.

> BlockPlacementPolicy does not work for cross node group dependencies
> --------------------------------------------------------------------
>
>                 Key: HDFS-5168
>                 URL: https://issues.apache.org/jira/browse/HDFS-5168
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Nikola Vujic
>            Assignee: Nikola Vujic
>            Priority: Critical
>         Attachments: HDFS-5168.patch, HDFS-5168.patch, HDFS-5168.patch, 
> HDFS-5168.patch, HDFS-5168.patch, HDFS-5168.patch
>
>
> Block placement policies do not work for cross rack/node group dependencies. 
> In reality this is needed when compute servers and storage fall in two 
> independent fault domains, then both BlockPlacementPolicyDefault and 
> BlockPlacementPolicyWithNodeGroup are not able to provide proper block 
> placement.
> Let's suppose that we have Hadoop cluster with one rack with two servers, and 
> we run 2 VMs per server. Node group topology for this cluster would be:
>  server1-vm1 -> /d1/r1/n1
>  server1-vm2 -> /d1/r1/n1
>  server2-vm1 -> /d1/r1/n2
>  server2-vm2 -> /d1/r1/n2
> This is working fine as long as server and storage fall into the same fault 
> domain but if storage is in a different fault domain from the server, we will 
> not be able to handle that. For example, if storage of server1-vm1 is in the 
> same fault domain as storage of server2-vm1, then we must not place two 
> replicas on these two nodes although they are in different node groups.
> Two possible approaches:
> - One approach would be to define cross rack/node group dependencies and to 
> use them when excluding nodes from the search space. This looks as the 
> cleanest way to fix this as it requires minor changes in the 
> BlockPlacementPolicy classes.
> - Other approach would be to allow nodes to fall in more than one node group. 
> When we chose a node to hold a replica we have to exclude from the search 
> space all nodes from the node groups where the chosen node belongs. This 
> approach may require major changes in the NetworkTopology.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to