Sammi Chen created HDDS-3945:
--------------------------------

             Summary: ContainerReplicaNotFoundException when removing a replica 
in ContainerReportHandler
                 Key: HDDS-3945
                 URL: https://issues.apache.org/jira/browse/HDDS-3945
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
            Reporter: Sammi Chen


This issue is not easy to reproduce.


2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Container #54339 is 
over replicated. Expected replica count is 3, but found 16.
2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete 
container command for container #54339 to datanode 
826dda09-1259-4c5c-9a80-56b985665dc4{ip: 9.180.6.157, host: host-9-180-6-157, 
networkLocation: /rack10, certSerialId: null}
2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete 
container command for container #54339 to datanode 
6f87886a-745b-4eb6-9b4b-54e1f909f20c{ip: 9.180.13.218, host: host-9-180-13-218, 
networkLocation: /rack2, certSerialId: null}
2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete 
container command for container #54339 to datanode 
d3336357-8920-4a4e-a12f-e57da1640c4d{ip: 9.180.20.94, host: host-9-180-20-94, 
networkLocation: /rack1, certSerialId: null}
2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete 
container command for container #54339 to datanode 
7b4edd6e-5787-4574-9928-810514a05d2b{ip: 9.179.142.222, host: host222, 
networkLocation: /rack2, certSerialId: null}
2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete 
container command for container #54339 to datanode 
5b36ed4f-4a6b-4014-b181-235789956d34{ip: 9.180.8.67, host: host-9-180-8-67, 
networkLocation: /rack10, certSerialId: null}
2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete 
container command for container #54339 to datanode 
d35f7754-3914-4e3a-ac91-4ae26e08e8a7{ip: 9.180.19.144, host: host-9-180-19-144, 
networkLocation: /rack3, certSerialId: null}
2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete 
container command for container #54339 to datanode 
db854037-4846-4093-89de-e492e0f14239{ip: 9.179.142.198, host: host198, 
networkLocation: /rack3, certSerialId: null}
2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete 
container command for container #54339 to datanode 
228dacd3-36cf-4473-93ec-c06a739a8a2d{ip: 9.180.8.87, host: host-9-180-8-87, 
networkLocation: /rack10, certSerialId: null}
2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete 
container command for container #54339 to datanode 
2e1b2fdd-f8fb-4252-bfc1-31d5339681be{ip: 9.179.144.104, host: 
host-9-179-144-104, networkLocation: /rack2, certSerialId: null}
2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete 
container command for container #54339 to datanode 
1904b912-998d-43ba-9e54-f7e7c40c1759{ip: 9.180.21.100, host: host-9-180-21-100, 
networkLocation: /rack2, certSerialId: null}
2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete 
container command for container #54339 to datanode 
dd64e953-bdef-4dae-a4c5-51aa7114ea0a{ip: 9.180.8.40, host: host-9-180-8-40, 
networkLocation: /rack10, certSerialId: null}
2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete 
container command for container #54339 to datanode 
47cdfded-e88f-44f3-81b9-4f95e65e364f{ip: 9.180.8.78, host: host-9-180-8-78, 
networkLocation: /rack10, certSerialId: null}
2020-07-04 16:14:19,820 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending delete 
container command for container #54339 to datanode 
11974d80-c4ff-4963-81fa-873888feaa24{ip: 9.180.8.58, host: host-9-180-8-58, 
networkLocation: /rack10, certSerialId: null}

2020-07-04 16:18:29,709 [EventQueue-ContainerReportForContainerReportHandler] 
ERROR org.apache.hadoop.hdds.scm.container.ContainerReportHandler: Exception 
while processing container report for container 54339 from datanode 
7b4edd6e-5787-4574-9928-810514a05d2b{ip: 9.179.142.222, host: host222, 
networkLocation: /rack2, certSerialId: null}.
org.apache.hadoop.hdds.scm.container.ContainerReplicaNotFoundException: 
Container #54339, replica: ContainerReplica{containerID=#54339, 
datanodeDetails=7b4edd6e-5787-4574-9928-810514a05d2b{ip: 9.179.142.222, host: 
host222, networkLocation: /rack2, certSerialId: null}, 
placeOfBirth=ca0dedd0-f586-4f99-986b-3a953dfc2dde, sequenceId=4249}
        at 
org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.removeContainerReplica(ContainerStateMap.java:256)
        at 
org.apache.hadoop.hdds.scm.container.ContainerStateManager.removeContainerReplica(ContainerStateManager.java:534)
        at 
org.apache.hadoop.hdds.scm.container.SCMContainerManager.removeContainerReplica(SCMContainerManager.java:560)
        at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.updateContainerReplica(AbstractContainerReportHandler.java:234)
        at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:81)
        at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:163)
        at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:131)
        at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:51)
        at 
org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
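
The report does not state a root cause. As a purely illustrative sketch of the failure mode visible in the trace (a removal requested for a replica that the in-memory map no longer holds), the following self-contained Java class contrasts a strict removal that throws with a tolerant one that returns false. Every name here (ReplicaMapSketch, ReplicaNotFoundException, removeReplicaIfPresent) is hypothetical and is not the Ozone API or the eventual fix.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for replica bookkeeping; NOT the Ozone code.
final class ReplicaMapSketch {

  // Hypothetical exception, named after the one seen in the log above.
  public static class ReplicaNotFoundException extends Exception {
    public ReplicaNotFoundException(String message) {
      super(message);
    }
  }

  // containerId -> UUIDs of the datanodes currently recorded as holding a replica
  private final Map<Long, Set<String>> replicas = new ConcurrentHashMap<>();

  public void addReplica(long containerId, String datanodeUuid) {
    replicas
        .computeIfAbsent(containerId, id -> ConcurrentHashMap.newKeySet())
        .add(datanodeUuid);
  }

  // Strict removal: throws if the replica is not recorded, which is the
  // behaviour the stack trace suggests.
  public void removeReplica(long containerId, String datanodeUuid)
      throws ReplicaNotFoundException {
    Set<String> holders = replicas.get(containerId);
    if (holders == null || !holders.remove(datanodeUuid)) {
      throw new ReplicaNotFoundException(
          "Container #" + containerId + ", replica: " + datanodeUuid);
    }
  }

  // Tolerant removal (an assumption about one possible mitigation): a second
  // removal of the same replica is treated as a no-op and reported as false.
  public boolean removeReplicaIfPresent(long containerId, String datanodeUuid) {
    Set<String> holders = replicas.get(containerId);
    return holders != null && holders.remove(datanodeUuid);
  }

  public int replicaCount(long containerId) {
    Set<String> holders = replicas.get(containerId);
    return holders == null ? 0 : holders.size();
  }
}
```

With the strict variant, removing the same replica twice raises the exception on the second call; with the tolerant variant the second call simply returns false. Whether silently ignoring the missing replica is appropriate for SCM is a design question this sketch does not answer.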


