[ https://issues.apache.org/jira/browse/HDFS-17646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890816#comment-17890816 ]

ASF GitHub Bot commented on HDFS-17646:
---------------------------------------

Hexiaoqiao commented on code in PR #7120:
URL: https://github.com/apache/hadoop/pull/7120#discussion_r1805979088


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java:
##########
@@ -461,27 +466,31 @@ private long init(List<DatanodeStorageReport> reports) {
     metrics.setNumOfUnderUtilizedNodes(underUtilized.size());

     Preconditions.checkState(dispatcher.getStorageGroupMap().size()
-        == overUtilized.size() + underUtilized.size() + aboveAvgUtilized.size()
-        + belowAvgUtilized.size(),
+        == overUtilizedDiffNum + overUtilized.size() + underUtilized.size()

Review Comment:
   a. Isn't the new `overUtilizedDiffNum + overUtilized.size()` just equal to the original `overUtilized.size()`? If so, does this line need to change at all?
   b. Please align the formatting.
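A minimal sketch of the arithmetic behind point (a), under the assumption that the patch trims `overUtilized` down to the top `limitTopNodesNum` entries before this `checkState` runs (the truncated `int size = overUtilized.size();` line in the second `Balancer.java` hunk below hints that the trimming happens in `sortOverUtilized`). Only the identifier names come from the diff; the values are made up:

```java
/**
 * Sketch only: the arithmetic behind the extra overUtilizedDiffNum term,
 * assuming overUtilized has been trimmed to the top limitTopNodesNum
 * entries before the checkState runs. All values here are hypothetical.
 */
public class OverUtilizedInvariantSketch {
  public static void main(String[] args) {
    int limitTopNodesNum = 2;  // hypothetical configured limit
    int overUtilizedSize = 5;  // nodes originally classified over-utilized

    // Nodes excluded from this iteration once the limit applies.
    int overUtilizedDiffNum = Math.max(overUtilizedSize - limitTopNodesNum, 0);

    // After trimming, overUtilized.size() shrinks to limitTopNodesNum, so
    // adding the excluded count back recovers the original total:
    int trimmedSize = overUtilizedSize - overUtilizedDiffNum;
    System.out.println(overUtilizedDiffNum + trimmedSize == overUtilizedSize); // true

    // If the list were NOT trimmed before the checkState, the extra term
    // would double-count the excluded nodes; that is what point (a) probes.
  }
}
```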
##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerLongRunningTasks.java:
##########
@@ -672,6 +672,97 @@ public void testBalancerWithSortTopNodes() throws Exception {
     assertEquals(900, maxUsage);
   }

+  @Test(timeout = 60000)
+  public void testBalancerWithLimitTopNodesNum() throws Exception {
+    final Configuration conf = new HdfsConfiguration();
+    // Init the config (block size to 100)
+    initConf(conf);
+    conf.setInt(DFS_HEARTBEAT_INTERVAL_KEY, 30000);
+
+    final long totalCapacity = 1000L;
+    final int diffBetweenNodes = 50;
+
+    // Set up the nodes with two groups:
+    // 5 over-utilized nodes with 80%, 85%, 90%, 95%, 100% usage
+    // 2 under-utilized nodes with 0%, 5% usage
+    // With sortTopNodes and limitTopNodesNum option, 100% used ones will be chosen
+    final int numOfOverUtilizedDn = 5;
+    final int numOfUnderUtilizedDn = 2;
+    final int totalNumOfDn = numOfOverUtilizedDn + numOfUnderUtilizedDn;
+    final long[] capacityArray = new long[totalNumOfDn];
+    Arrays.fill(capacityArray, totalCapacity);
+
+    try (MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
+        .numDataNodes(totalNumOfDn)
+        .simulatedCapacities(capacityArray)
+        .build()) {
+      cluster.setDataNodesDead();
+      List<DataNode> dataNodes = cluster.getDataNodes();
+      // Create top used nodes
+      for (int i = 0; i < numOfOverUtilizedDn; i++) {
+        // Bring one node alive
+        DataNodeTestUtils.triggerHeartbeat(dataNodes.get(i));
+        DataNodeTestUtils.triggerBlockReport(dataNodes.get(i));
+        // Create nodes with: 80%, 85%, 90%, 95%, 100%
+        int nodeCapacity = (int) totalCapacity - diffBetweenNodes * (numOfOverUtilizedDn - i - 1);
+        TestBalancer.createFile(cluster, new Path("test_big" + i), nodeCapacity, (short) 1, 0);
+        cluster.setDataNodesDead();
+      }
+
+      // Create under utilized nodes
+      for (int i = 0; i < numOfUnderUtilizedDn; i++) {
+        // Bring one node alive
+        DataNodeTestUtils.triggerHeartbeat(dataNodes.get(i));

Review Comment:
   Doesn't this loop write data to dataNodes[0] and dataNodes[1] again, the same nodes already filled as over-utilized above? I am not sure the resulting capacityUsed percentages come out as expected. Would you mind double-checking? (A sketch of the intended usage layout follows after the next hunk.)

##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java:
##########
@@ -461,27 +466,31 @@ private long init(List<DatanodeStorageReport> reports) {
     metrics.setNumOfUnderUtilizedNodes(underUtilized.size());

     Preconditions.checkState(dispatcher.getStorageGroupMap().size()
-        == overUtilized.size() + underUtilized.size() + aboveAvgUtilized.size()
-        + belowAvgUtilized.size(),
+        == overUtilizedDiffNum + overUtilized.size() + underUtilized.size()
+        + aboveAvgUtilized.size() + belowAvgUtilized.size(),
         "Mismatched number of storage groups");

     // return number of bytes to be moved in order to make the cluster balanced
     return Math.max(overLoadedBytes, underLoadedBytes);
   }

   private void sortOverUtilized(Map<Source, Double> overUtilizedPercentage) {
-    Preconditions.checkState(overUtilized instanceof List,
-        "Collection overUtilized is not a List.");
+    Preconditions.checkState(overUtilized instanceof LinkedList,
+        "Collection overUtilized is not a LinkedList.");

     LOG.info("Sorting over-utilized nodes by capacity"
         + " to bring down top used datanode capacity faster");

-    List<Source> list = (List<Source>) overUtilized;
+    LinkedList<Source> list = (LinkedList<Source>) overUtilized;
     list.sort(
         (Source source1, Source source2) ->
             (Double.compare(overUtilizedPercentage.get(source2),
                 overUtilizedPercentage.get(source1)))
     );
+    int size = overUtilized.size();

Review Comment:
   Suggestion: move this limiting logic into its own separate function rather than placing it here, since this method's job, per its name, is to `sort`.
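For the suggestion above, one possible shape; the helper name, signature, and call site are assumptions, not part of the patch:

```java
// Hypothetical helper inside Balancer, sketched against the fields visible
// in the diff: sortOverUtilized() stays purely a sort, and the limit is
// applied by a separately named method invoked right after it.
private void limitOverUtilizedNum(int limitTopNodesNum) {
  Preconditions.checkState(overUtilized instanceof LinkedList,
      "Collection overUtilized is not a LinkedList.");
  LinkedList<Source> list = (LinkedList<Source>) overUtilized;
  // The list is already sorted in descending utilization order by
  // sortOverUtilized(), so dropping from the tail keeps the top-N nodes.
  while (list.size() > limitTopNodesNum) {
    list.removeLast();
  }
}
```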
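Returning to the `TestBalancerLongRunningTasks` comment on the previous file: a standalone sketch of the usage layout the first loop produces, mirroring only the constants visible in the test. The truncated part of the diff may already address the index reuse:

```java
/**
 * Sketch only: the per-node usage the first loop of the test aims for.
 * The concern is that the second loop triggers dataNodes[0] and
 * dataNodes[1] again, nodes that already hold the 80% and 85% files.
 */
public class LimitTopNodesLayoutSketch {
  public static void main(String[] args) {
    long totalCapacity = 1000L;
    int diffBetweenNodes = 50;
    int numOfOverUtilizedDn = 5;

    // First loop: with replication 1 and only node i alive, each file
    // should land on dataNodes[i].
    for (int i = 0; i < numOfOverUtilizedDn; i++) {
      long fileSize =
          totalCapacity - (long) diffBetweenNodes * (numOfOverUtilizedDn - i - 1);
      System.out.printf("dataNodes[%d]: %d bytes (%d%% used)%n",
          i, fileSize, fileSize * 100 / totalCapacity);
    }
    // Prints 800/850/900/950/1000 bytes, i.e. 80%..100% usage. If the
    // under-utilized loop then writes to dataNodes[0] and dataNodes[1],
    // the intended 0% and 5% nodes would not exist.
  }
}
```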
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java:
##########
@@ -452,6 +456,7 @@ private long init(List<DatanodeStorageReport> reports) {
       }
     }

+    int overUtilizedDiffNum = Math.max(overUtilized.size() - limitTopNodesNum, 0);

Review Comment:
   I didn't get the purpose of this change.


> Add Option to limit Balancer prefer highly utilized nodes num in each
> iteration
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-17646
>                 URL: https://issues.apache.org/jira/browse/HDFS-17646
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Zhaobo Huang
>            Assignee: Zhaobo Huang
>            Priority: Major
>              Labels: pull-request-available
>
> Limit the number of topN nodes to avoid excessive nodes affecting cluster
> stability.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org