[ https://issues.apache.org/jira/browse/HDDS-1637?focusedWorklogId=253790&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-253790 ]
ASF GitHub Bot logged work on HDDS-1637:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Jun/19 14:57
            Start Date: 04/Jun/19 14:57
    Worklog Time Spent: 10m
      Work Description: xiaoyuyao commented on pull request #904: HDDS-1637. Fix random test failure TestSCMContainerPlacementRackAware.
URL: https://github.com/apache/hadoop/pull/904#discussion_r290342177

 ##########
 File path: hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/placement/algorithms/TestSCMContainerPlacementRackAware.java
 ##########
 @@ -82,7 +82,7 @@ public void setup() {
     when(nodeManager.getNodeStat(anyObject()))
         .thenReturn(new SCMNodeMetric(STORAGE_CAPACITY, 0L, 100L));
     when(nodeManager.getNodeStat(datanodes.get(2)))
-        .thenReturn(new SCMNodeMetric(STORAGE_CAPACITY, 90L, 10L));
+        .thenReturn(new SCMNodeMetric(STORAGE_CAPACITY, 90L, 20L));

 Review comment:
   Thanks @ChenSammi for the patch. I had thought about a similar test-only fix yesterday, but it may hide a code bug when we deal with a mix of nodes where some have enough space and others don't. If chooseNodes() keeps picking nodes without enough capacity 3 times (the default), we end up with a failure even though other nodes have enough space. Here are two proposals (a rough sketch of proposal 1 is appended after this message):
   1. Filter nodes without enough capacity out of the candidate node list up front, so that the placement algorithm never has to deal with them.
   2. Handle capacity inside the placement algorithm by adding each node detected without enough capacity to the exclude node list, so that the algorithm won't choose it again in the next round.
   I prefer 1 because it seems cleaner and more efficient than 2. I'm OK with 2 as well if you feel it is easier to adjust the placement algorithm.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 253790)
            Time Spent: 0.5h  (was: 20m)

> Fix random test failure TestSCMContainerPlacementRackAware
> ----------------------------------------------------------
>
>                 Key: HDDS-1637
>                 URL: https://issues.apache.org/jira/browse/HDDS-1637
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This has been seen randomly in latest trunk CI, e.g.,
> [https://ci.anzix.net/job/ozone/16980/testReport/org.apache.hadoop.hdds.scm.container.placement.algorithms/TestSCMContainerPlacementRackAware/testFallback/]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
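
A minimal Java sketch of proposal 1 from the review comment above (pre-filtering the candidate list by capacity before the rack-aware algorithm runs). The NodeCandidate interface, remainingBytes() accessor, and filterByCapacity() helper are hypothetical names used only for illustration; they are not the actual SCM placement API.

    import java.util.List;
    import java.util.stream.Collectors;

    // Proposal 1 sketch: drop candidates that cannot hold the requested
    // replica before the placement algorithm ever sees them.
    public class CapacityPreFilter {

      /** Hypothetical view of a datanode's remaining space, in bytes. */
      public interface NodeCandidate {
        long remainingBytes();
      }

      /**
       * Returns only the nodes with enough remaining space for the request,
       * so the placement algorithm does not waste its retry budget on
       * nodes that can never satisfy it.
       */
      public static List<NodeCandidate> filterByCapacity(
          List<NodeCandidate> healthyNodes, long requestedSizeBytes) {
        return healthyNodes.stream()
            .filter(n -> n.remainingBytes() >= requestedSizeBytes)
            .collect(Collectors.toList());
      }
    }

With pre-filtering like this, a cluster where only some nodes are full no longer makes the algorithm fail after exhausting its retries on full nodes, which is the mixed-node case the comment is concerned about. Proposal 2 would instead have the placement algorithm itself move such nodes to its exclude list as it encounters them.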