[ https://issues.apache.org/jira/browse/HADOOP-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398294#comment-13398294 ]
Junping Du commented on HADOOP-8468:
------------------------------------

Hi Konstantin,

Thanks for your comments. Please see my reply:

> If you put it in terms when virtual nodes are added as the fourth level, then
> you don't need to change a word in the old policy.

A slight change is still needed, because the first replica should be placed on the local virtual node rather than on the local node. Let me show two different ways of translating the original rules you listed above (in rule 2, I omit "on two different nodes" because it duplicates rule 0).

Original:
0. No more than one replica is placed at any one node
1. First replica on the local node
2. Second and third replicas are in the same rack
3. Other replicas on random nodes, with the restriction that no more than two replicas are placed in the same rack, if there are enough racks

Two ways:
1) node, rack -> node, *nodegroup*, rack
2) node, rack -> *virtual node*, node, rack
The starred (bold) words mark the additional layer.

Way 1:
0. No more than one replica is placed at any one *nodegroup*
1. First replica on the local node
2. Second and third replicas are in the same rack
3. Other replicas on random nodes, with the restriction that no more than two replicas are placed in the same rack, if there are enough racks

Way 2:
0. No more than one replica is placed at any one node
1. First replica on the local *virtual node*
2. Second and third replicas are in the same rack
3. Other replicas on random nodes, with the restriction that no more than two replicas are placed in the same rack, if there are enough racks

So you can see the two are equivalent in wording; only the name given to the extra layer differs.
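
To make the "way 1" rules concrete, here is a minimal sketch of a checker for a proposed replica placement. It is only an illustration, not code from the patch: the names `Location` and `check_way1` are hypothetical, it assumes there are enough racks for rule 3 to apply, and it treats a nodegroup as one physical host holding several VMs.

```python
from collections import Counter
from typing import List, NamedTuple


class Location(NamedTuple):
    """Hypothetical topology path: rack -> nodegroup (host) -> node (VM)."""
    rack: str
    nodegroup: str
    node: str


def check_way1(writer: Location, replicas: List[Location]) -> List[str]:
    """Return the names of the 'way 1' rules that a placement violates.

    An empty list means the placement satisfies all four rules.
    Assumes there are enough racks, so rule 3 is always enforced.
    """
    violations = []

    # Rule 0: no more than one replica in any one nodegroup.
    per_group = Counter((r.rack, r.nodegroup) for r in replicas)
    if any(count > 1 for count in per_group.values()):
        violations.append("rule 0")

    # Rule 1: first replica on the local node (the writer's node).
    if replicas and replicas[0] != writer:
        violations.append("rule 1")

    # Rule 2: second and third replicas are in the same rack.
    if len(replicas) >= 3 and replicas[1].rack != replicas[2].rack:
        violations.append("rule 2")

    # Rule 3: no more than two replicas in the same rack.
    per_rack = Counter(r.rack for r in replicas)
    if any(count > 2 for count in per_rack.values()):
        violations.append("rule 3")

    return violations


writer = Location("rack1", "host1", "vm1")

# Second and third replicas share a rack but sit on different hosts.
good = [writer,
        Location("rack2", "host3", "vm5"),
        Location("rack2", "host4", "vm7")]

# Second and third replicas are different VMs on the same host: the
# original node-level rule 0 would accept this, the nodegroup rule does not.
same_host = [writer,
             Location("rack2", "host3", "vm5"),
             Location("rack2", "host3", "vm6")]

print(check_way1(writer, good))
print(check_way1(writer, same_host))
```

Reading `nodegroup` as "node" and `node` as "virtual node" gives the same check for way 2, which is the sense in which the two translations are equivalent.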

> Umbrella of enhancements to support different failure and locality topologies
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-8468
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8468
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ha, io
>    Affects Versions: 1.0.0, 2.0.0-alpha
>            Reporter: Junping Du
>            Assignee: Junping Du
>            Priority: Critical
>         Attachments: HADOOP-8468-total-v3.patch, HADOOP-8468-total.patch,
> Proposal for enchanced failure and locality topologies (revised-1.0).pdf,
> Proposal for enchanced failure and locality topologies.pdf
>
>
> The current hadoop network topology (described in some previous issues like:
> Hadoop-692) works well in classic three-tiers network when it comes out.
> However, it does not take into account other failure models or changes in the
> infrastructure that can affect network bandwidth efficiency like:
> virtualization.
> Virtualized platform has following genes that shouldn't been ignored by
> hadoop topology in scheduling tasks, placing replica, do balancing or
> fetching block for reading:
> 1. VMs on the same physical host are affected by the same hardware failure.
> In order to match the reliability of a physical deployment, replication of
> data across two virtual machines on the same host should be avoided.
> 2. The network between VMs on the same physical host has higher throughput
> and lower latency and does not consume any physical switch bandwidth.
> Thus, we propose to make hadoop network topology extend-able and introduce a
> new level in the hierarchical topology, a node group level, which maps well
> onto an infrastructure that is based on a virtualized environment.