[ https://issues.apache.org/jira/browse/HDFS-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869262#comment-15869262 ]

Yiqun Lin commented on HDFS-11419:
----------------------------------

I looked into this again, and I found something that does not seem entirely accurate.
{quote}
So, for the most part, this code blindly picks random datanodes that do not 
satisfy the storage type, and adds the node to excluded and tries again and 
again.
{quote}
From the code of {{BlockPlacementPolicyDefault#chooseRandom}}, it does not add the node to the excluded set when the random node fails to satisfy the storage type. The node is added only after a suitable storage is found. The related code:
{code}
...
storage = chooseStorage4Block(
    chosenNode, blocksize, results, entry.getKey());
if (storage != null) {   <=== the node is added to excluded only when a storage was found
  numOfReplicas--;
  if (firstChosen == null) {
    firstChosen = storage;
  }
  // add node (subclasses may also add related nodes) to excludedNode
  addToExcludedNodes(chosenNode, excludedNodes);
...
{code}
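
For context, the shape of the surrounding retry loop is roughly as follows (a simplified paraphrase based only on the snippet above, not the verbatim trunk source; the {{storageTypes}} iteration and the loop condition are my shorthand):
{code}
// Sketch of the retry loop around the snippet above (a paraphrase for
// discussion, not the verbatim Hadoop source).
while (numOfReplicas > 0) {
  // A random node is drawn first, with no storage-type knowledge.
  DatanodeDescriptor chosenNode = chooseDataNode(scope, excludedNodes);
  for (Map.Entry<StorageType, Integer> entry : storageTypes.entrySet()) {
    storage = chooseStorage4Block(
        chosenNode, blocksize, results, entry.getKey());
    if (storage != null) {
      numOfReplicas--;
      // Only this success path reaches addToExcludedNodes(...); a node that
      // fails the storage-type check stays eligible and can be drawn again.
      addToExcludedNodes(chosenNode, excludedNodes);
    }
  }
}
{code}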

> BlockPlacementPolicyDefault is choosing datanode in an inefficient way
> ----------------------------------------------------------------------
>
>                 Key: HDFS-11419
>                 URL: https://issues.apache.org/jira/browse/HDFS-11419
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Chen Liang
>
> Currently in {{BlockPlacementPolicyDefault}}, {{chooseTarget}} will end up 
> calling into {{chooseRandom}}, which will first find a random datanode by 
> calling
> {code}DatanodeDescriptor chosenNode = chooseDataNode(scope, 
> excludedNodes);{code}, then it checks whether that returned datanode 
> satisfies storage type requirement
> {code}storage = chooseStorage4Block(
>               chosenNode, blocksize, results, entry.getKey());{code}
> If yes, {{numOfReplicas--;}}; otherwise, the node is added to the excluded 
> nodes, and the loop runs again until {{numOfReplicas}} is down to 0.
> A problem here is that storage type is not considered until after a random 
> node has already been returned. We've seen a case where a cluster has a 
> large number of datanodes while only a few satisfy the storage type 
> condition. So, for the most part, this code blindly picks random datanodes 
> that do not satisfy the storage type, adds each to the excluded set, and 
> tries again and again.
> To make matters worse, the way {{NetworkTopology#chooseRandom}} works is 
> that, given a set of excluded nodes, it first finds a random datanode, and 
> if that node is in the excluded set, it tries another random node. So the 
> more excluded nodes there are, the more likely a random node is to be in 
> the excluded set, in which case one iteration is basically wasted (see the 
> sketch after this quoted description).
> Therefore, this JIRA proposes to augment/modify the relevant classes in a way 
> that datanodes can be found more efficiently. There are currently two 
> different high level solutions we are considering:
> 1. add a field to the Node base types to describe the storage type info, 
> and when searching for a node, take such field(s) into account and do not 
> return a node that does not meet the storage type requirement.
> 2. change the {{NetworkTopology}} class to be aware of storage types: for 
> each storage type, there is one tree subset that connects all the nodes 
> with that type, and a search happens on only one such subset, so unexpected 
> storage types are simply not in the search space.
> Thanks [~szetszwo] for the offline discussion, and any comments are more than 
> welcome.
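
To put a rough number on the retry cost described above: if nodes are drawn uniformly from a cluster of N nodes and E of them are already excluded, each draw hits the excluded set with probability E/N, so landing on a non-excluded node takes N/(N-E) draws in expectation. A small back-of-the-envelope illustration (plain Java, not HDFS code; the cluster size is hypothetical):
{code}
// Expected number of random draws per accepted node as the excluded set
// grows, for a hypothetical cluster of 1000 datanodes.
public class RetryCost {
  public static void main(String[] args) {
    int totalNodes = 1000;
    for (int excluded : new int[] {0, 500, 900, 990}) {
      double expectedDraws = (double) totalNodes / (totalNodes - excluded);
      System.out.printf("excluded=%d -> expected draws: %.1f%n",
          excluded, expectedDraws);
    }
  }
}
{code}
With 990 of 1000 nodes excluded, each successful pick costs about 100 random draws, which matches the "tries again and again" behavior described above.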


