sodonnel commented on a change in pull request #668: HDDS-3139 Pipeline placement should max out pipeline usage
URL: https://github.com/apache/hadoop-ozone/pull/668#discussion_r395615512
 
 

 ##########
 File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/PipelinePlacementPolicy.java
 ##########
 @@ -315,6 +314,50 @@ DatanodeDetails fallBackPickNodes(
     return results;
   }
 
+  private DatanodeDetails randomPick(List<DatanodeDetails> healthyNodes) {
+    DatanodeDetails datanodeDetails;
+    int firstNodeNdx = getRand().nextInt(healthyNodes.size());
+    int secondNodeNdx = getRand().nextInt(healthyNodes.size());
+
+    // There is a possibility that both numbers will be the same.
+    // If that is so, we just return the node.
+    if (firstNodeNdx == secondNodeNdx) {
+      datanodeDetails = healthyNodes.get(firstNodeNdx);
+    } else {
+      DatanodeDetails firstNodeDetails = healthyNodes.get(firstNodeNdx);
+      DatanodeDetails secondNodeDetails = healthyNodes.get(secondNodeNdx);
+      datanodeDetails = nodeManager.getPipelinesCount(firstNodeDetails)
+          >= nodeManager.getPipelinesCount(secondNodeDetails)
+          ? secondNodeDetails : firstNodeDetails;
+    }
+    return datanodeDetails;
+  }
+
+  private List<DatanodeDetails> getLowerLoadNodes(
+      List<DatanodeDetails> nodes, int num) {
+    int maxPipelineUsage = nodes.size() * heavyNodeCriteria /
+        HddsProtos.ReplicationFactor.THREE.getNumber();
+    return nodes.stream()
+        // Skip the nodes which exceed the load limit.
+        .filter(p -> nodeManager.getPipelinesCount(p) < num - maxPipelineUsage)
+        .collect(Collectors.toList());
+  }
+
+  private DatanodeDetails lowerLoadPick(List<DatanodeDetails> healthyNodes) {
+    int curPipelineCounts = stateManager
+        .getPipelines(HddsProtos.ReplicationType.RATIS).size();
+    DatanodeDetails datanodeDetails;
+    List<DatanodeDetails> nodes = getLowerLoadNodes(
+        healthyNodes, curPipelineCounts);
+    if (nodes.isEmpty()) {
+      // Randomly pick a node if node loads are at the same level.
+      datanodeDetails = randomPick(healthyNodes);
+    } else {
+      datanodeDetails = nodes.stream().findFirst().get();
 
 Review comment:
   When I fixed the suspected bug I mentioned above and then ran the test, the nodes do appear to fill their pipelines on a node-by-node basis, and the test failed because each node did not have at least the average number of pipelines.
   
   Making this change got it to pass again:
   
   ```
   datanodeDetails = nodes.get(getRand().nextInt(nodes.size())); // was: stream().findFirst().get();
   ```
   
   But it might be even better if we sorted the list by pipeline count ascending and then took the first one, though it would be more expensive.
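
   For reference, a rough sketch of that alternative (taking the minimum directly rather than doing a full sort; it leans on the `nodeManager.getPipelinesCount` call already used in the diff above and would need a `java.util.Comparator` import):

   ```
   datanodeDetails = nodes.stream()
       .min(Comparator.comparingInt(nodeManager::getPipelinesCount))
       .get(); // safe here: this branch only runs when nodes is non-empty
   ```

   Taking the minimum is equivalent to sorting ascending and taking the first element, but it is a single O(n) scan rather than an O(n log n) sort, so the cost over `findFirst()` should be small.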
