Periodic verification at the Datanode
-------------------------------------

                 Key: HADOOP-2012
                 URL: https://issues.apache.org/jira/browse/HADOOP-2012
             Project: Hadoop
          Issue Type: New Feature
            Reporter: Raghu Angadi
            Assignee: Raghu Angadi



Currently, on-disk corruption of data blocks is detected only when a block is 
read by a client or by another datanode.  Such errors would be detected much 
earlier if the datanode periodically verified the checksums of its local 
blocks.

Some of the issues to consider:

- How often should we check the blocks (no more often than once every couple 
of weeks?)
- How do we keep track of when a block was last verified? (There is a .meta 
file associated with each block.)
- What action should be taken once corruption is detected?
- Scanning should be done at very low priority, keeping the rest of the 
datanode's disk traffic in mind.
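
To make the questions above concrete, here is a minimal sketch in Java of one 
possible shape for such a scanner.  It is only an illustration under stated 
assumptions, not a proposed implementation: the class name 
BlockVerifierSketch, the 512-byte checksum chunk, the two-week period, the use 
of CRC32, and the in-memory "last verified" map are all hypothetical.  A real 
datanode would read the stored checksums from the block's .meta file and 
persist the verification times.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.zip.CRC32;

/** Hypothetical sketch of periodic block verification; all names are assumptions. */
public class BlockVerifierSketch {
    static final int BYTES_PER_CHECKSUM = 512;                    // assumed chunk size
    static final long SCAN_PERIOD_MS = 14L * 24 * 60 * 60 * 1000; // assumed: two weeks

    // blockId -> time of last successful verification (in memory only in this sketch)
    private final Map<Long, Long> lastVerified = new HashMap<>();

    /** Computes one CRC32 value per chunk of the block data. */
    static long[] checksums(byte[] data) {
        int chunks = (data.length + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM;
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int off = i * BYTES_PER_CHECKSUM;
            int len = Math.min(BYTES_PER_CHECKSUM, data.length - off);
            crc.reset();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    /** Returns the index of the first mismatching chunk, or -1 if all chunks match. */
    static int firstCorruptChunk(byte[] data, long[] stored) {
        long[] actual = checksums(data);
        int common = Math.min(actual.length, stored.length);
        for (int i = 0; i < common; i++) {
            if (actual[i] != stored[i]) {
                return i;
            }
        }
        // A differing chunk count is also treated as corruption.
        return actual.length == stored.length ? -1 : common;
    }

    /** A block is due if it was never verified or the scan period has elapsed. */
    boolean isDue(long blockId, long nowMs) {
        Long last = lastVerified.get(blockId);
        return last == null || nowMs - last >= SCAN_PERIOD_MS;
    }

    /** Verifies a block and records the time only when verification succeeds. */
    boolean verify(long blockId, byte[] data, long[] stored, long nowMs) {
        boolean ok = firstCorruptChunk(data, stored) == -1;
        if (ok) {
            lastVerified.put(blockId, nowMs);
        }
        return ok;
    }

    /** Minimum time (ms) that reading bytesRead should take under a bandwidth cap. */
    static long throttleDelayMs(long bytesRead, long maxBytesPerSec) {
        return bytesRead * 1000 / maxBytesPerSec;
    }
}
```

In this sketch a scanner thread would call isDue() for each local block, 
verify() those that are due, and sleep until at least throttleDelayMs() has 
elapsed for the bytes it read so that it stays below a configured bandwidth 
cap.  What to do when verify() returns false (for example, reporting the block 
so that it can be re-replicated) is deliberately left open, as in the list 
above.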

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
