[ https://issues.apache.org/jira/browse/HDDS-15350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDDS-15350:
-----------------------------------
    Summary: Divide by zero bug crashed SCM when decommissioning a datanode  
(was: Divide by zero bug crash SCM when decommission datanode)

> Divide by zero bug crashed SCM when decommissioning a datanode
> --------------------------------------------------------------
>
>                 Key: HDDS-15350
>                 URL: https://issues.apache.org/jira/browse/HDDS-15350
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: SCM
>            Reporter: Wei-Chiu Chuang
>            Priority: Major
>
> SCM exited after the ReplicationMonitor thread hit a divide-by-zero error while a datanode was being decommissioned:
>  
> {noformat}
> 2026-05-22 16:23:39,439 ERROR [ReplicationMonitor]-org.apache.hadoop.hdds.scm.container.replication.ReplicationManager: Exception in Replication Monitor Thread.
> java.lang.ArithmeticException: / by zero
>         at org.apache.hadoop.hdds.scm.SCMCommonPlacementPolicy.getMaxReplicasPerRack(SCMCommonPlacementPolicy.java:419)
>         at org.apache.hadoop.hdds.scm.SCMCommonPlacementPolicy.validateContainerPlacement(SCMCommonPlacementPolicy.java:466)
>         at org.apache.hadoop.hdds.scm.container.replication.health.ECMisReplicationCheckHandler.getPlacementStatus(ECMisReplicationCheckHandler.java:138)
>         at org.apache.hadoop.hdds.scm.container.replication.health.ECMisReplicationCheckHandler.checkMisReplication(ECMisReplicationCheckHandler.java:93)
>         at org.apache.hadoop.hdds.scm.container.replication.health.ECMisReplicationCheckHandler.handle(ECMisReplicationCheckHandler.java:69)
>         at org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:38)
>         at org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
>         at org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
>         at org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
>         at org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
>         at org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
>         at org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
>         at org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
>         at org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
>         at org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
>         at org.apache.hadoop.hdds.scm.container.replication.health.AbstractCheck.handleChain(AbstractCheck.java:40)
>         at org.apache.hadoop.hdds.scm.container.replication.ReplicationManager.processContainer(ReplicationManager.java:899)
>         at org.apache.hadoop.hdds.scm.container.replication.ReplicationManager.processContainer(ReplicationManager.java:872)
>         at org.apache.hadoop.hdds.scm.container.replication.ReplicationManager.processAll(ReplicationManager.java:399)
>         at org.apache.hadoop.hdds.scm.container.replication.ReplicationManager.run(ReplicationManager.java:953)
>         at java.lang.Thread.run(Thread.java:748)
> 2026-05-22 16:23:39,442 INFO [ReplicationMonitor]-org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.lang.ArithmeticException: / by zero
> {noformat}
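
The trace shows an integer division with a zero divisor inside getMaxReplicasPerRack. The following is only a minimal sketch of that failure mode, not the Ozone source: the class RackMathSketch, both method names, and the assumption that the divisor is a rack count that can reach zero (for example when no healthy rack remains during decommissioning) are hypothetical and used for illustration. The fallback chosen in the guarded variant is also an assumption, not a proposed fix.

```java
// Hypothetical illustration only; none of these names come from Apache Ozone.
final class RackMathSketch {

    // Ceiling of replicas / racks using integer arithmetic.
    // Throws java.lang.ArithmeticException when racks == 0.
    static int unguarded(int replicas, int racks) {
        return (replicas + racks - 1) / racks;
    }

    // Same computation with an explicit guard for a non-positive rack count.
    // Assumption: with no racks available, no per-rack limit can be derived,
    // so the replica count itself is returned as the upper bound.
    static int guarded(int replicas, int racks) {
        if (racks <= 0) {
            return replicas;
        }
        return (replicas + racks - 1) / racks;
    }

    public static void main(String[] args) {
        System.out.println("guarded(5, 2) = " + guarded(5, 2));
        System.out.println("guarded(5, 0) = " + guarded(5, 0));
        try {
            unguarded(5, 0);
        } catch (ArithmeticException e) {
            // Integer division by zero surfaces as an unchecked exception.
            System.out.println("unguarded(5, 0) threw: " + e.getMessage());
        }
    }
}
```

Because ArithmeticException is unchecked, it propagates up the call stack unless some caller catches it, which is consistent with the exception reaching the Replication Monitor thread in the log above.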



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

