[ https://issues.apache.org/jira/browse/HDFS-15200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17049899#comment-17049899 ]

Akira Ajisaka commented on HDFS-15200:
--------------------------------------

Thanks [~ayushtkn] for the report and thanks [~surendrasingh] for pinging me.

I think there is a rare case in which an admin would like to inspect the data of
a corrupt replica by accessing the local disk of the DataNode. That way the
admin can understand what the corrupt replica contains and how to fix the
corrupt data.

Therefore I would like to make this behavior configurable and delete corrupt
replicas immediately by default.
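
A minimal sketch of how such a gate might look, assuming a new boolean
configuration key (the key name and wiring below are illustrative, not the
final patch):

{code:java}
// Sketch only: the key name is illustrative. Default true means corrupt
// replicas are deleted immediately, matching the proposed default behavior.
boolean deleteCorruptReplicaImmediately = conf.getBoolean(
    "dfs.namenode.corrupt.block.delete.immediately.enabled", true);

// Inside invalidateBlock(..): postpone only when immediate deletion is disabled.
if (!deleteCorruptReplicaImmediately && nr.replicasOnStaleNodes() > 0) {
  postponeBlock(b.getCorrupted());
  return false;
}
{code}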

> Delete Corrupt Replica Immediately Irrespective of Replicas On Stale Storage 
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-15200
>                 URL: https://issues.apache.org/jira/browse/HDFS-15200
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Critical
>
> Presently, {{invalidateBlock(..)}}, before adding a replica to the invalidates
> list, checks whether any replica of the block is on stale storage; if any
> replica is on stale storage, it postpones deletion of the replica.
> Here:
> {code:java}
>     // Check how many copies we have of the block
>     if (nr.replicasOnStaleNodes() > 0) {
>       blockLog.debug("BLOCK* invalidateBlocks: postponing " +
>           "invalidation of {} on {} because {} replica(s) are located on " +
>           "nodes with potentially out-of-date block reports", b, dn,
>           nr.replicasOnStaleNodes());
>       postponeBlock(b.getCorrupted());
>       return false;
>     }
> {code}
>  
> In the case of a corrupt replica, we can skip this logic and delete the
> replica immediately, as a corrupt replica can't be corrected.
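> A rough sketch of the intended skip (simplified; {{isCorruptReplica}} here is a
> hypothetical flag for illustration, not the actual patch):
> {code:java}
> // Sketch only: when the replica being invalidated is a corrupt one, do not
> // postpone on stale storages; let it be added to invalidateBlocks right away.
> if (!isCorruptReplica && nr.replicasOnStaleNodes() > 0) {
>   postponeBlock(b.getCorrupted());
>   return false;
> }
> {code}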
> One outcome of this behavior at present is that the NameNodes show different
> block states after a failover:
> If a replica is marked corrupt, the Active NN will mark it as corrupt, mark it
> for deletion, and remove it from corruptReplicas and the excessRedundancyMap.
> If a failover happens before the replica is deleted, the Standby NameNode will
> mark all the storages as stale.
> It will then start processing IBRs; since the replicas would be on stale
> storage, it will skip the deletion and the removal from corruptReplicas.
> Hence the two NameNodes will show different numbers and different corrupt
> replicas.



