[ https://issues.apache.org/jira/browse/HADOOP-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537676 ]
Raghu Angadi commented on HADOOP-2012:
--------------------------------------

> 2) It seems to me it might be better to try to repair the block if possible,
> rather than just delete it. This avoids bad corner cases. It adds complexity
> though. Thoughts? A simple variant is just to copy a new version locally.

The Datanode does not physically delete the block after detecting corruption. It asks the Namenode to delete the block (just as a client does when it detects corruption). The Namenode deletes the block only if other replicas remain, and then re-replicates the block. Does this address the case?

Regarding (1) and Dhruba's comment, a log file would be my preferred approach too. But the main question I get asked is "Why add another file?".

Will think about (3).

> Periodic verification at the Datanode
> -------------------------------------
>
>                 Key: HADOOP-2012
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2012
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>             Fix For: 0.16.0
>
>         Attachments: HADOOP-2012.patch, HADOOP-2012.patch, HADOOP-2012.patch,
> HADOOP-2012.patch
>
>
> Currently, on-disk corruption of data blocks is detected only when a block is
> read by a client or by another datanode. These errors could be detected much
> earlier if the datanode periodically verified the data checksums for its local
> blocks.
> Some of the issues to consider:
> - How often should we check the blocks (no more often than once every couple of
> weeks?)
> - How do we keep track of when a block was last verified (there is a .meta
> file associated with each block)?
> - What action should be taken once corruption is detected?
> - Scanning should be done at a very low priority, with the rest of the datanode
> disk traffic in mind.
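
For readers following the thread, the corruption-handling flow described in the comment above can be sketched roughly as below. This is a toy Python model, not Hadoop code: the `Namenode` class, the `report_corrupt` method, the `pending_replication` list, and the target replication of 3 are all hypothetical names and values chosen for illustration. It only shows the rule stated in the comment: a reported replica is dropped only when other replicas remain, and the block is then queued for re-replication.

```python
from dataclasses import dataclass, field


@dataclass
class Namenode:
    """Toy model of the namenode's view of block replicas (hypothetical)."""
    # block id -> set of datanode names that hold a replica
    replicas: dict = field(default_factory=dict)
    # assumed target replication factor, for illustration only
    target_replication: int = 3
    # blocks queued for re-replication
    pending_replication: list = field(default_factory=list)

    def report_corrupt(self, block_id, datanode):
        """Handle a corruption report from a datanode (or a client).

        Returns True if the corrupt replica was dropped, False if it
        was kept because no other replica exists.
        """
        holders = self.replicas.get(block_id, set())
        if datanode not in holders:
            return False  # unknown replica, nothing to do
        if len(holders) <= 1:
            return False  # last remaining copy: do not delete it
        holders.remove(datanode)
        if len(holders) < self.target_replication:
            # schedule a fresh copy from one of the surviving replicas
            self.pending_replication.append(block_id)
        return True


# Example: blk_1 has two replicas, blk_2 has only one.
nn = Namenode(replicas={"blk_1": {"dn1", "dn2"}, "blk_2": {"dn1"}})
dropped = nn.report_corrupt("blk_1", "dn1")  # replica dropped, block queued
kept = nn.report_corrupt("blk_2", "dn1")     # sole replica is kept
```

In this model the datanode never removes the file itself; the decision stays with the namenode, which is the point made in the comment.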