Replication policy for corrupted block
---------------------------------------
Key: HADOOP-2065
URL: https://issues.apache.org/jira/browse/HADOOP-2065
Project: Hadoop
Issue Type: Bug
Components: dfs
Affects Versions: 0.14.1
Reporter: Koji Noguchi
Thanks to HADOOP-1955, even if one of the replicas is corrupted, the block
should get re-replicated from a good replica relatively quickly.
Created this ticket to continue the discussion from
http://issues.apache.org/jira/browse/HADOOP-1955#action_12531162.
bq. 2. Delete corrupted source replica
bq. 3. If all replicas are corrupt, stop replication.
For (2), it would be nice if the namenode could delete a corrupted replica
when a good replica exists on another node.
For (3), I would prefer that the namenode still replicate the block; a rough
sketch of both points follows below.
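To make the intended policy concrete, here is a minimal, hypothetical sketch in
Java. It is not actual namenode/FSNamesystem code; the Replica and Decision
types and the decide() method are made up purely to illustrate points (2) and
(3) above.
{code:java}
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch of the proposed policy; names are illustrative only. */
class CorruptReplicaPolicy {

  /** Minimal stand-in for what the namenode knows about one replica. */
  static class Replica {
    final String datanode;
    final boolean corrupt;
    Replica(String datanode, boolean corrupt) {
      this.datanode = datanode;
      this.corrupt = corrupt;
    }
  }

  /** The outcome for a block: which replicas to drop, whether to re-replicate. */
  static class Decision {
    final List<Replica> toDelete;
    final boolean replicate;
    Decision(List<Replica> toDelete, boolean replicate) {
      this.toDelete = toDelete;
      this.replicate = replicate;
    }
  }

  static Decision decide(List<Replica> replicas) {
    List<Replica> good = replicas.stream()
        .filter(r -> !r.corrupt)
        .collect(Collectors.toList());
    List<Replica> bad = replicas.stream()
        .filter(r -> r.corrupt)
        .collect(Collectors.toList());

    if (!good.isEmpty()) {
      // (2) A good source exists: replicate from it and drop the corrupt copies.
      return new Decision(bad, true);
    }
    // (3) Every replica is corrupt: keep them all and still replicate,
    // so the data stays available for users who may want to salvage it.
    return new Decision(Collections.emptyList(), true);
  }
}
{code}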
Before 0.14, if a file was corrupted, users could still pull the data and
decide whether to delete those files (HADOOP-2063).
In 0.14 and later, we do not replicate these blocks, so they are eventually
lost.
To make matters worse, if the corrupted file is accessed, all but one of the
corrupted replicas are deleted, and the block then stays at a replication
factor of 1 forever.
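For reference, pulling the bytes of a corrupted file from the client side can
look something like the sketch below. This is hedged: it relies on
FileSystem.setVerifyChecksum(false), which exists in later releases; whether
0.14 exposes it (or whether the shell option added by HADOOP-2063 is the
intended route) is an assumption here.
{code:java}
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

/** Sketch: copy a possibly-corrupt DFS file to local disk, skipping CRC checks. */
public class PullCorruptFile {
  public static void main(String[] args) throws IOException {
    Path src = new Path(args[0]);   // DFS source path
    String dst = args[1];           // local destination path

    Configuration conf = new Configuration();
    FileSystem fs = src.getFileSystem(conf);
    fs.setVerifyChecksum(false);    // read the bytes even if the checksum mismatches

    try (FSDataInputStream in = fs.open(src);
         OutputStream out = Files.newOutputStream(Paths.get(dst))) {
      IOUtils.copyBytes(in, out, conf, false);
    }
  }
}
{code}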