[ https://issues.apache.org/jira/browse/HDFS-11860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiaoyu Yao updated HDFS-11860:
------------------------------
    Attachment: HDFS-11860-HDFS-7240.002.patch

> Ozone: SCM: SCMContainerPlacementCapacity#chooseNode sometimes does not remove chosen node from healthy list.
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11860
>                 URL: https://issues.apache.org/jira/browse/HDFS-11860
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>         Attachments: HDFS-11860-HDFS-7240.001.patch, HDFS-11860-HDFS-7240.002.patch
>
>
> This was caught randomly in a Jenkins run. After debugging, I found the cause
> is the logic below: when the two random indexes happen to be the same, the
> node id is returned without being removed from the healthy list for the next
> round of selection. As a result, duplicate datanodes can be chosen for the
> pipeline, and the machine list size ends up smaller than expected. I will
> post a fix soon.
> {code}
> SCMContainerPlacementCapacity#chooseNode
> // There is a possibility that both numbers will be same.
> // if that is so, we just return the node.
> if (firstNodeNdx == secondNodeNdx) {
>   return healthyNodes.get(firstNodeNdx);
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
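
For illustration only, here is a minimal, self-contained sketch of the failure mode described in the issue and one way to avoid it. It is not the attached patch. The class name ChooseNodeSketch, the use of String node ids, and the "lower index wins" tie-break (a stand-in for the real capacity comparison) are all assumptions made for this example. The point it shows is that the chosen node is removed from the healthy list on every path, including the path where both random indexes are equal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Random;

// Hypothetical sketch; names and types are NOT taken from the Ozone code base.
class ChooseNodeSketch {

  /**
   * Picks two random indexes and returns one node, always removing the
   * returned node from healthyNodes so it cannot be picked again.
   */
  static String chooseNode(List<String> healthyNodes, Random random) {
    if (healthyNodes.isEmpty()) {
      throw new IllegalStateException("No healthy nodes left to choose from");
    }
    int firstNodeNdx = random.nextInt(healthyNodes.size());
    int secondNodeNdx = random.nextInt(healthyNodes.size());

    // Assumption: the lower index stands in for the capacity comparison
    // that the real placement policy would perform between two candidates.
    int chosenNdx = Math.min(firstNodeNdx, secondNodeNdx);

    // When both indexes are equal, chosenNdx is that same index. Using
    // remove(int) instead of get(int) takes the node out of the list on
    // this path as well, which is the step the reported code skipped.
    return healthyNodes.remove(chosenNdx);
  }

  /** Chooses count distinct nodes by repeatedly calling chooseNode. */
  static List<String> chooseNodes(List<String> healthyNodes, int count,
      Random random) {
    if (count > healthyNodes.size()) {
      throw new IllegalArgumentException("Not enough healthy nodes");
    }
    List<String> chosen = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      chosen.add(chooseNode(healthyNodes, random));
    }
    return chosen;
  }

  public static void main(String[] args) {
    List<String> healthy =
        new ArrayList<>(Arrays.asList("dn1", "dn2", "dn3", "dn4"));
    List<String> pipeline = chooseNodes(healthy, 3, new Random());
    // Every chosen node is distinct because each one left the healthy list.
    System.out.println("chosen=" + pipeline
        + " distinct=" + (new HashSet<>(pipeline).size() == pipeline.size()));
  }
}
```

With get(int) in place of remove(int) on the equal-index path, a later call could return the same node again, which matches the duplicated datanodes and the smaller-than-expected machine list reported above.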