[ https://issues.apache.org/jira/browse/HDFS-10206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15651089#comment-15651089 ]
Nandakumar commented on HDFS-10206: ----------------------------------- {{NetworkTopology#sortByDistance}} uses {{NetworkTopology#getWeight}} to calculate the distance between reader and node. Additional logic is added in {{NetworkTopology#getWeight}} to calculate the distance based on networkLocation of reader and the node when the following conditions are not satisfy bq. reader.equals(node) & isOnSameRack(reader, node) This will work for DFSClient machine which is not a datanode, since the distance calculation depends on networkLocation and not the parent Node. Please review the patch. Thanks, Nanda > getBlockLocations might not sort datanodes properly by distance > --------------------------------------------------------------- > > Key: HDFS-10206 > URL: https://issues.apache.org/jira/browse/HDFS-10206 > Project: Hadoop HDFS > Issue Type: Improvement > Reporter: Ming Ma > Assignee: Nandakumar > Attachments: HDFS-10206.000.patch > > > If the DFSClient machine is not a datanode, but it shares its rack with some > datanodes of the HDFS block requested, {{DatanodeManager#sortLocatedBlocks}} > might not put the local-rack datanodes at the beginning of the sorted list. > That is because the function didn't call {{networktopology.add(client);}} to > properly set the node's parent node; something required by > {{networktopology.sortByDistance}} to compute distance between two nodes in > the same topology tree. > Another issue with {{networktopology.sortByDistance}} is it only > distinguishes local rack from remote rack, but it doesn't support general > distance calculation to tell how remote the rack is. > {noformat} > NetworkTopology.java > protected int getWeight(Node reader, Node node) { > // 0 is local, 1 is same rack, 2 is off rack > // Start off by initializing to off rack > int weight = 2; > if (reader != null) { > if (reader.equals(node)) { > weight = 0; > } else if (isOnSameRack(reader, node)) { > weight = 1; > } > } > return weight; > } > {noformat} > HDFS-10203 has suggested moving the sorting from namenode to DFSClient to > address another issue. Regardless of where we do the sorting, we still need > fix the issues outline here. > Note that BlockPlacementPolicyDefault shares the same NetworkTopology object > used by DatanodeManager and requires Nodes stored in the topology to be > {{DatanodeDescriptor}} for block placement. So we need to make sure we don't > pollute the NetworkTopology if we plan to fix it on the server side. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org