[ 
https://issues.apache.org/jira/browse/HDFS-9149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941243#comment-14941243
 ] 

He Xiaoqiao commented on HDFS-9149:
-----------------------------------

Hi [~He Tianyi], thank you for your comments.
{quote}
The only thing is that, I'm not aware why did getWeight designed to be like 
this in the first place, i.e. whether there is some particular concern. 
{quote}
Maybe there was no particular concern. From the original implementation of 
{{pseudoSortByDistance}} through 
[HDFS-6268|https://issues.apache.org/jira/browse/HDFS-6268], which was the first 
restructuring into {{sortByDistance}}, there is no indication that the 
multi-IDC scenario was ever considered.
{quote}
One simple idea is generalizes getWeight into a function that calculates 
distance between two locations (more like getDistance), regardless of the 
meaning of each hierarchy.
{quote}
I think it would be simple and reasonable to add one more branch to the 
existing {{getWeight}}:
{code:java}
   protected int getWeight(Node reader, Node node) {
-    // 0 is local, 1 is same rack, 2 is off rack
+    // 0 is local, 1 is same rack, 2 is same IDC, 3 is off IDC
-    // Start off by initializing to off rack
+    // Start off by initializing to off IDC
-    int weight = 2;
+    int weight = 3;
     if (reader != null) {
       if (reader.equals(node)) {
         weight = 0;
       } else if (isOnSameRack(reader, node)) {
         weight = 1;
+      } else {
+        // Compare the parents (racks) of the two nodes; racks that share a
+        // parent are assumed to be in the same IDC
+        Node rParent = reader.getParent();
+        Node nParent = node.getParent();
+        if (null != rParent && null != nParent && isSameParent(rParent, nParent)) {
+          weight = 2;
+        }
       }
     }
     return weight;
   }
{code}
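
To make the intended weights concrete, here is a minimal, self-contained sketch outside of Hadoop. It assumes a three-level /datacenter/rack/host topology; {{WeightSketch}} and {{TopoNode}} are hypothetical stand-ins invented for illustration (they are not {{NetworkTopology}} or {{org.apache.hadoop.net.Node}}), and parents are compared by reference for simplicity.

```java
/** Illustration only: a toy model of the proposed four-level weight. */
final class WeightSketch {

    /** Hypothetical stand-in for a topology node (host, rack, or datacenter). */
    static final class TopoNode {
        final String name;
        final TopoNode parent; // null for the root of the tree

        TopoNode(String name, TopoNode parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    /** 0 is local, 1 is same rack, 2 is same IDC, 3 is off IDC. */
    static int getWeight(TopoNode reader, TopoNode node) {
        int weight = 3; // start off by assuming off IDC
        if (reader == null || node == null) {
            return weight;
        }
        if (reader == node) {
            weight = 0;
        } else if (reader.parent != null && reader.parent == node.parent) {
            weight = 1; // both hosts hang off the same rack
        } else {
            TopoNode rRack = reader.parent;
            TopoNode nRack = node.parent;
            // Different racks that share a parent are in the same IDC.
            if (rRack != null && nRack != null
                    && rRack.parent != null && rRack.parent == nRack.parent) {
                weight = 2;
            }
        }
        return weight;
    }

    public static void main(String[] args) {
        TopoNode root = new TopoNode("", null);
        TopoNode dc1 = new TopoNode("dc1", root);
        TopoNode dc2 = new TopoNode("dc2", root);
        TopoNode rackA = new TopoNode("rackA", dc1);
        TopoNode rackB = new TopoNode("rackB", dc1);
        TopoNode rackC = new TopoNode("rackC", dc2);
        TopoNode n1 = new TopoNode("n1", rackA);
        TopoNode n2 = new TopoNode("n2", rackA);
        TopoNode n3 = new TopoNode("n3", rackB);
        TopoNode n4 = new TopoNode("n4", rackC);

        System.out.println(getWeight(n1, n1)); // local
        System.out.println(getWeight(n1, n2)); // same rack
        System.out.println(getWeight(n1, n3)); // same IDC, different rack
        System.out.println(getWeight(n1, n4)); // different IDC
    }
}
```

With this ordering a replica in the reader's own IDC always sorts ahead of one in a remote IDC, which is the behaviour this issue asks for.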

> Consider multi datacenter when sortByDistance
> ---------------------------------------------
>
>                 Key: HDFS-9149
>                 URL: https://issues.apache.org/jira/browse/HDFS-9149
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: He Xiaoqiao
>            Assignee: He Tianyi
>
> {{sortByDistance}} doesn't consider multiple datacenters when reading data, so 
> reads may go through another datacenter when Hadoop is deployed across multiple IDCs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
