ayushtkn commented on PR #5583: URL: https://github.com/apache/hadoop/pull/5583#issuecomment-1520440465
I think reproducing this requires that the replica with the older genstamp be on the same datanode as the replica with the newer genstamp. Are you able to reproduce it when the 1001 replicas and the 1002 replicas are on different datanodes? Otherwise this block deletes the corrupt replica immediately:

```java
// the block is over-replicated so invalidate the replicas immediately
invalidateBlock(b, node, numberOfReplicas);
```

If you debug your test and step inside `invalidateBlock()`:

```java
// we already checked the number of replicas in the caller of this
// function and know there are enough live replicas, so we can delete it.
addToInvalidates(b.getCorrupted(), dn);
removeStoredBlock(b.getStored(), node);
```

The `removeStoredBlock(b.getStored(), node)` call removes `node` from the stored block, and that node also holds the 1002-genstamp replica, since all three 1001 and 1002 replicas sit on the same three datanodes. For the first two nodes,

```java
boolean minReplicationSatisfied =
    hasMinStorage(b.getStored(), numUsableReplicas);
```

stays satisfied. But because those first two invalidations removed from the blockMap the very storages that contained the 1002 replicas (in order to get rid of the 1001 ones), the check comes back false on the last iteration, and hence the last replica isn't deleted.

So I feel that for this to trigger, the 1001 and 1002 replicas need to be on the same datanodes.
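To make the counting concrete, here is a toy simulation of the three iterations described above. This is not Hadoop code: the datanode names, the `MIN_REPLICATION` constant, and the usable-replica counting are simplified stand-ins for the `BlockManager` logic quoted earlier.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SameNodeInvalidationSketch {
    // Stand-in for dfs.namenode.replication.min (default 1); illustrative only.
    static final int MIN_REPLICATION = 1;

    public static void main(String[] args) {
        // NameNode's view of the stored block (genstamp 1002): mapped to three
        // datanodes, each of which also reported a stale 1001 replica.
        Set<String> blockMap = new LinkedHashSet<>(List.of("dn1", "dn2", "dn3"));

        for (String node : List.of("dn1", "dn2", "dn3")) {
            // The 1001 replica on `node` is the corrupt one, so only the other
            // entries still in the block map count as usable copies.
            long numUsableReplicas = blockMap.stream()
                    .filter(dn -> !dn.equals(node))
                    .count();
            // Stands in for hasMinStorage(b.getStored(), numUsableReplicas).
            boolean minReplicationSatisfied = numUsableReplicas >= MIN_REPLICATION;

            if (minReplicationSatisfied) {
                // Stands in for invalidateBlock(): the 1001 copy is queued for
                // deletion and `node` is dropped from the stored block's map,
                // taking the only record of its 1002 copy with it.
                blockMap.remove(node);
                System.out.println(node + ": corrupt replica invalidated, blockMap now " + blockMap);
            } else {
                System.out.println(node + ": no usable copy left elsewhere, corrupt replica kept");
            }
        }
    }
}
```

Running this, dn1 and dn2 pass the check and get invalidated, leaving the blockMap with only dn3; on the third iteration there are zero usable copies elsewhere, so the last corrupt replica survives. That only happens because each invalidation also erased a 1002 entry, which is exactly why the 1001 and 1002 replicas must be co-located for the bug to show.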