[ https://issues.apache.org/jira/browse/HDFS-17646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890816#comment-17890816 ]

ASF GitHub Bot commented on HDFS-17646:
---------------------------------------

Hexiaoqiao commented on code in PR #7120:
URL: https://github.com/apache/hadoop/pull/7120#discussion_r1805979088


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java:
##########
@@ -461,27 +466,31 @@ private long init(List<DatanodeStorageReport> reports) {
     metrics.setNumOfUnderUtilizedNodes(underUtilized.size());

     Preconditions.checkState(dispatcher.getStorageGroupMap().size()
-        == overUtilized.size() + underUtilized.size() + aboveAvgUtilized.size()
-        + belowAvgUtilized.size(),
+        == overUtilizedDiffNum + overUtilized.size() + underUtilized.size()

Review Comment:
   a. Isn't the new `overUtilizedDiffNum + overUtilized.size()` just equal to the original `overUtilized.size()`? If so, does this line need to change at all?
   b. Please align the formatting.
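A minimal sketch of the arithmetic behind point (a), under the assumption that the patch trims `overUtilized` down to the top `limitTopNodesNum` entries before this `checkState` runs (the truncated `int size = overUtilized.size();` line in the second `Balancer.java` hunk below hints that the trimming happens in `sortOverUtilized`). Only the identifier names come from the diff; the values are made up:

```java
/**
 * Sketch only: the arithmetic behind the extra overUtilizedDiffNum term,
 * assuming overUtilized has been trimmed to the top limitTopNodesNum
 * entries before the checkState runs. All values here are hypothetical.
 */
public class OverUtilizedInvariantSketch {
  public static void main(String[] args) {
    int limitTopNodesNum = 2;  // hypothetical configured limit
    int overUtilizedSize = 5;  // nodes originally classified over-utilized

    // Nodes excluded from this iteration once the limit applies.
    int overUtilizedDiffNum = Math.max(overUtilizedSize - limitTopNodesNum, 0);

    // After trimming, overUtilized.size() shrinks to limitTopNodesNum, so
    // adding the excluded count back recovers the original total:
    int trimmedSize = overUtilizedSize - overUtilizedDiffNum;
    System.out.println(overUtilizedDiffNum + trimmedSize == overUtilizedSize); // true

    // If the list were NOT trimmed before the checkState, the extra term
    // would double-count the excluded nodes; that is what point (a) probes.
  }
}
```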
##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancerLongRunningTasks.java:
##########
@@ -672,6 +672,97 @@ public void testBalancerWithSortTopNodes() throws Exception {
     assertEquals(900, maxUsage);
   }

+  @Test(timeout = 60000)
+  public void testBalancerWithLimitTopNodesNum() throws Exception {
+    final Configuration conf = new HdfsConfiguration();
+    // Init the config (block size to 100)
+    initConf(conf);
+    conf.setInt(DFS_HEARTBEAT_INTERVAL_KEY, 30000);
+
+    final long totalCapacity = 1000L;
+    final int diffBetweenNodes = 50;
+
+    // Set up the nodes with two groups:
+    // 5 over-utilized nodes with 80%, 85%, 90%, 95%, 100% usage
+    // 2 under-utilized nodes with 0%, 5% usage
+    // With sortTopNodes and limitTopNodesNum option, 100% used ones will be chosen
+    final int numOfOverUtilizedDn = 5;
+    final int numOfUnderUtilizedDn = 2;
+    final int totalNumOfDn = numOfOverUtilizedDn + numOfUnderUtilizedDn;
+    final long[] capacityArray = new long[totalNumOfDn];
+    Arrays.fill(capacityArray, totalCapacity);
+
+    try (MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
+        .numDataNodes(totalNumOfDn)
+        .simulatedCapacities(capacityArray)
+        .build()) {
+      cluster.setDataNodesDead();
+      List<DataNode> dataNodes = cluster.getDataNodes();
+      // Create top used nodes
+      for (int i = 0; i < numOfOverUtilizedDn; i++) {
+        // Bring one node alive
+        DataNodeTestUtils.triggerHeartbeat(dataNodes.get(i));
+        DataNodeTestUtils.triggerBlockReport(dataNodes.get(i));
+        // Create nodes with: 80%, 85%, 90%, 95%, 100%
+        int nodeCapacity = (int) totalCapacity - diffBetweenNodes * (numOfOverUtilizedDn - i - 1);
+        TestBalancer.createFile(cluster, new Path("test_big" + i), nodeCapacity, (short) 1, 0);
+        cluster.setDataNodesDead();
+      }
+
+      // Create under utilized nodes
+      for (int i = 0; i < numOfUnderUtilizedDn; i++) {
+        // Bring one node alive
+        DataNodeTestUtils.triggerHeartbeat(dataNodes.get(i));

Review Comment:
   Doesn't this loop write data to dataNodes[0] and dataNodes[1] again, the same nodes already filled as over-utilized above? I am not sure the resulting capacityUsed percentages come out as expected. Would you mind double-checking? (A sketch of the intended usage layout follows after the next hunk.)

##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java:
##########
@@ -461,27 +466,31 @@ private long init(List<DatanodeStorageReport> reports) {
     metrics.setNumOfUnderUtilizedNodes(underUtilized.size());

     Preconditions.checkState(dispatcher.getStorageGroupMap().size()
-        == overUtilized.size() + underUtilized.size() + aboveAvgUtilized.size()
-        + belowAvgUtilized.size(),
+        == overUtilizedDiffNum + overUtilized.size() + underUtilized.size()
+        + aboveAvgUtilized.size() + belowAvgUtilized.size(),
         "Mismatched number of storage groups");

     // return number of bytes to be moved in order to make the cluster balanced
     return Math.max(overLoadedBytes, underLoadedBytes);
   }

   private void sortOverUtilized(Map<Source, Double> overUtilizedPercentage) {
-    Preconditions.checkState(overUtilized instanceof List,
-        "Collection overUtilized is not a List.");
+    Preconditions.checkState(overUtilized instanceof LinkedList,
+        "Collection overUtilized is not a LinkedList.");

     LOG.info("Sorting over-utilized nodes by capacity"
         + " to bring down top used datanode capacity faster");

-    List<Source> list = (List<Source>) overUtilized;
+    LinkedList<Source> list = (LinkedList<Source>) overUtilized;
     list.sort(
         (Source source1, Source source2) ->
             (Double.compare(overUtilizedPercentage.get(source2),
                 overUtilizedPercentage.get(source1)))
     );
+    int size = overUtilized.size();

Review Comment:
   Suggestion: move this limiting logic into its own separate function rather than placing it here, since this method's job, per its name, is to `sort`.
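For the suggestion above, one possible shape; the helper name, signature, and call site are assumptions, not part of the patch:

```java
// Hypothetical helper inside Balancer, sketched against the fields visible
// in the diff: sortOverUtilized() stays purely a sort, and the limit is
// applied by a separately named method invoked right after it.
private void limitOverUtilizedNum(int limitTopNodesNum) {
  Preconditions.checkState(overUtilized instanceof LinkedList,
      "Collection overUtilized is not a LinkedList.");
  LinkedList<Source> list = (LinkedList<Source>) overUtilized;
  // The list is already sorted in descending utilization order by
  // sortOverUtilized(), so dropping from the tail keeps the top-N nodes.
  while (list.size() > limitTopNodesNum) {
    list.removeLast();
  }
}
```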
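Returning to the `TestBalancerLongRunningTasks` comment on the previous file: a standalone sketch of the usage layout the first loop produces, mirroring only the constants visible in the test. The truncated part of the diff may already address the index reuse:

```java
/**
 * Sketch only: the per-node usage the first loop of the test aims for.
 * The concern is that the second loop triggers dataNodes[0] and
 * dataNodes[1] again, nodes that already hold the 80% and 85% files.
 */
public class LimitTopNodesLayoutSketch {
  public static void main(String[] args) {
    long totalCapacity = 1000L;
    int diffBetweenNodes = 50;
    int numOfOverUtilizedDn = 5;

    // First loop: with replication 1 and only node i alive, each file
    // should land on dataNodes[i].
    for (int i = 0; i < numOfOverUtilizedDn; i++) {
      long fileSize =
          totalCapacity - (long) diffBetweenNodes * (numOfOverUtilizedDn - i - 1);
      System.out.printf("dataNodes[%d]: %d bytes (%d%% used)%n",
          i, fileSize, fileSize * 100 / totalCapacity);
    }
    // Prints 800/850/900/950/1000 bytes, i.e. 80%..100% usage. If the
    // under-utilized loop then writes to dataNodes[0] and dataNodes[1],
    // the intended 0% and 5% nodes would not exist.
  }
}
```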
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java:
##########
@@ -452,6 +456,7 @@ private long init(List<DatanodeStorageReport> reports) {
       }
     }

+    int overUtilizedDiffNum = Math.max(overUtilized.size() - limitTopNodesNum, 0);

Review Comment:
   I didn't get the purpose of this change.


> Add Option to limit Balancer prefer highly utilized nodes num in each
> iteration
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-17646
>                 URL: https://issues.apache.org/jira/browse/HDFS-17646
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Zhaobo Huang
>            Assignee: Zhaobo Huang
>            Priority: Major
>              Labels: pull-request-available
>
> Limit the number of topN nodes to avoid excessive nodes affecting cluster
> stability.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org