[ 
https://issues.apache.org/jira/browse/HDDS-3920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3920:
-----------------------------
    Description: 
2020-07-03 16:26:45,200 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Container #105228 is 
over replicated. Expected replica count is 3, but found 31.


2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Handling 
underreplicated container: 210413
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: deletionInFlight of 
container {}#210413
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: replicationInFlight of 
container {}#210413
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.20.43
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: source of container 
{}#210413
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.5.41
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.179.142.251
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.8.85
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.179.142.250
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.8.35
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.8.67
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.179.142.135
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.179.144.104
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.20.58
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.179.142.198
2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.20.222
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Process container 
#210413 error:
java.lang.IllegalArgumentException
        at 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:128)
        at 
org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware.chooseDatanodes(SCMContainerPlacementRackAware.java:101)
        at 
org.apache.hadoop.hdds.scm.container.ReplicationManager.handleUnderReplicatedContainer(ReplicationManager.java:568)
        at 
org.apache.hadoop.hdds.scm.container.ReplicationManager.processContainer(ReplicationManager.java:331)
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.NetUtils: Fail to get ancestor generation 1 of 
node :f8d9ccf6-20c6-4dfa-8a49-012f43a1b27e{ip: 9.179.142.251, host: host251, 
networkLocation: /rack3, certSerialId: null}
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.NetUtils: Fail to get ancestor generation 1 of 
node :826dda09-1259-4c5c-9a80-56b985665dc4{ip: 9.180.6.157, host: 
host-9-180-6-157, networkLocation: /rack10, certSerialId: null}
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.NetUtils: Fail to get ancestor generation 1 of 
node :b85962f2-6647-463b-9944-3c9b24e4e313{ip: 9.180.19.148, host: 
host-9-180-19-148, networkLocation: /rack3, certSerialId: null}
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.NetUtils: Fail to get ancestor generation 1 of 
node :039cb21e-4e2e-47e2-bf3e-b025319ee856{ip: 9.179.142.158, host: host158, 
networkLocation: /rack1, certSerialId: null}
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
/rack1/33b49c34-caa2-4b4f-894e-dce7db4f97b9, generation to exclude: 1, 
generation to return: 1
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
/rack3/b1e555d4-7114-4b80-b425-93086b0f2036, generation to exclude: 1, 
generation to return: 1
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
/rack1/55148789-0cdb-4631-a3b3-c1da774523aa, generation to exclude: 1, 
generation to return: 1
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
/rack3/32e8d855-b702-438d-b829-ac43dc567afc, generation to exclude: 1, 
generation to return: 1
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
/rack2/2e1b2fdd-f8fb-4252-bfc1-31d5339681be, generation to exclude: 1, 
generation to return: 1
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
/rack3/db854037-4846-4093-89de-e492e0f14239, generation to exclude: 1, 
generation to return: 1
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
/rack3/f8d9ccf6-20c6-4dfa-8a49-012f43a1b27e, generation to exclude: 1, 
generation to return: 1
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
/rack10/826dda09-1259-4c5c-9a80-56b985665dc4, generation to exclude: 1, 
generation to return: 1
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
/rack3/b85962f2-6647-463b-9944-3c9b24e4e313, generation to exclude: 1, 
generation to return: 1
2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
/rack1/039cb21e-4e2e-47e2-bf3e-b025319ee856, generation to exclude: 1, 
generation to return: 1
2020-07-03 10:48:00,161 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Container: #210419. 
The container is mis-replicated as it is on 1 racks but should be on 2 racks.
2020-07-03 10:48:00,161 [ReplicationMonitor] INFO 
org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending replicate 
container command for container #210419 to datanode 
5cb315e9-7326-4592-8dd6-21f4342b09c1{ip: 9.180.8.85, host: host-9-180-8-85, 
networkLocation: /rack10, certSerialId: null}


Log messages not clear enough:
1. 
2020-07-03 22:01:47,780 
[EventQueue-IncrementalContainerReportForIncrementalContainerReportHandler] 
ERROR org.apache.hadoop.hdds.scm.container.IncrementalContainerReportHandler: 
Exception while processing ICR for container 41138

2.  
2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.





> Redundant replications due to failure to get node's ancestor in ReplicationManager
> ----------------------------------------------------------------------------------
>
>                 Key: HDDS-3920
>                 URL: https://issues.apache.org/jira/browse/HDDS-3920
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Sammi Chen
>            Priority: Critical
>
> 2020-07-03 16:26:45,200 [ReplicationMonitor] INFO 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: Container #105228 is 
> over replicated. Expected replica count is 3, but found 31.
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: Handling 
> underreplicated container: 210413
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: deletionInFlight of 
> container {}#210413
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: replicationInFlight 
> of container {}#210413
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.20.43
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: source of container 
> {}#210413
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.5.41
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.179.142.251
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.8.85
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.179.142.250
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.8.35
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.8.67
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.179.142.135
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.179.144.104
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.20.58
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.179.142.198
> 2020-07-03 10:48:00,161 [ReplicationMonitor] DEBUG 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: 9.180.20.222
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: Process container 
> #210413 error:
> java.lang.IllegalArgumentException
>         at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:128)
>         at 
> org.apache.hadoop.hdds.scm.container.placement.algorithms.SCMContainerPlacementRackAware.chooseDatanodes(SCMContainerPlacementRackAware.java:101)
>         at 
> org.apache.hadoop.hdds.scm.container.ReplicationManager.handleUnderReplicatedContainer(ReplicationManager.java:568)
>         at 
> org.apache.hadoop.hdds.scm.container.ReplicationManager.processContainer(ReplicationManager.java:331)
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.NetUtils: Fail to get ancestor generation 1 of 
> node :f8d9ccf6-20c6-4dfa-8a49-012f43a1b27e{ip: 9.179.142.251, host: host251, 
> networkLocation: /rack3, certSerialId: null}
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.NetUtils: Fail to get ancestor generation 1 of 
> node :826dda09-1259-4c5c-9a80-56b985665dc4{ip: 9.180.6.157, host: 
> host-9-180-6-157, networkLocation: /rack10, certSerialId: null}
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.NetUtils: Fail to get ancestor generation 1 of 
> node :b85962f2-6647-463b-9944-3c9b24e4e313{ip: 9.180.19.148, host: 
> host-9-180-19-148, networkLocation: /rack3, certSerialId: null}
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.NetUtils: Fail to get ancestor generation 1 of 
> node :039cb21e-4e2e-47e2-bf3e-b025319ee856{ip: 9.179.142.158, host: host158, 
> networkLocation: /rack1, certSerialId: null}
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
> /rack1/33b49c34-caa2-4b4f-894e-dce7db4f97b9, generation to exclude: 1, 
> generation to return: 1
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
> /rack3/b1e555d4-7114-4b80-b425-93086b0f2036, generation to exclude: 1, 
> generation to return: 1
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
> /rack1/55148789-0cdb-4631-a3b3-c1da774523aa, generation to exclude: 1, 
> generation to return: 1
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
> /rack3/32e8d855-b702-438d-b829-ac43dc567afc, generation to exclude: 1, 
> generation to return: 1
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
> /rack2/2e1b2fdd-f8fb-4252-bfc1-31d5339681be, generation to exclude: 1, 
> generation to return: 1
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
> /rack3/db854037-4846-4093-89de-e492e0f14239, generation to exclude: 1, 
> generation to return: 1
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
> /rack3/f8d9ccf6-20c6-4dfa-8a49-012f43a1b27e, generation to exclude: 1, 
> generation to return: 1
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
> /rack10/826dda09-1259-4c5c-9a80-56b985665dc4, generation to exclude: 1, 
> generation to return: 1
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
> /rack3/b85962f2-6647-463b-9944-3c9b24e4e313, generation to exclude: 1, 
> generation to return: 1
> 2020-07-03 10:48:00,161 [ReplicationMonitor] WARN 
> org.apache.hadoop.hdds.scm.net.InnerNodeImpl: Ancestor not found, node: 
> /rack1/039cb21e-4e2e-47e2-bf3e-b025319ee856, generation to exclude: 1, 
> generation to return: 1
> 2020-07-03 10:48:00,161 [ReplicationMonitor] INFO 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: Container: #210419. 
> The container is mis-replicated as it is on 1 racks but should be on 2 racks.
> 2020-07-03 10:48:00,161 [ReplicationMonitor] INFO 
> org.apache.hadoop.hdds.scm.container.ReplicationManager: Sending replicate 
> container command for container #210419 to datanode 
> 5cb315e9-7326-4592-8dd6-21f4342b09c1{ip: 9.180.8.85, host: host-9-180-8-85, 
> networkLocation: /rack10, certSerialId: null}
> Log messages not clear enough:
> 1. 
> 2020-07-03 22:01:47,780 
> [EventQueue-IncrementalContainerReportForIncrementalContainerReportHandler] 
> ERROR org.apache.hadoop.hdds.scm.container.IncrementalContainerReportHandler: 
> Exception while processing ICR for container 41138
> 2.  
> 2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
> org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
> 2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
> org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
> 2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
> org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
> 2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
> org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
> 2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
> org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
> 2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
> org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
> 2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
> org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
> 2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
> org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
> 2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
> org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
> 2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
> org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
> 2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
> org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.
> 2020-07-03 22:23:38,297 [IPC Server handler 39 on default port 9861] WARN 
> org.apache.hadoop.hdds.scm.block.DeletedBlockLogImpl: Deleted TXID not found.


