[ https://issues.apache.org/jira/browse/HADOOP-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534140 ]
Raghu Angadi commented on HADOOP-2012:
--------------------------------------

> Would writing more to the .meta file increase the chance of the .meta file
> being corrupted?

I'm not sure I see that. Even if we are writing only 8 bytes, a problem during the write (a hardware or driver error) would usually corrupt a few sectors rather than just a few bytes. The more times we write to the same location of a file, the greater the chance of an error. I might be wrong; maybe errors that occur during writes are much rarer than the other ways disks go corrupt, but I have seen quite a few kernel / driver errors while writing.

> Periodic verification at the Datanode
> -------------------------------------
>
>                 Key: HADOOP-2012
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2012
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>
> Currently, on-disk corruption of data blocks is detected only when a block is
> read by the client or by another datanode. These errors would be detected much
> earlier if the datanode could periodically verify the data checksums for its
> local blocks.
> Some of the issues to consider:
> - How often should we check the blocks (no more often than once every couple
>   of weeks?)
> - How do we keep track of when a block was last verified (there is a .meta
>   file associated with each block)?
> - What action to take once a corruption is detected.
> - Scanning should be done at a very low priority, with the rest of the
>   datanode disk traffic in mind.
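For what it's worth, here is a minimal sketch of the kind of scan being discussed. The assumptions are mine, not the issue's: checksums are CRC32 values over 512-byte chunks stored in the .meta file after a short header, and the last-verified time goes into a separate append-only scan log instead of being written back into the .meta file (which sidesteps the repeated-write concern above). The class and method names are made up for illustration, not taken from the DataNode code.

{code:java}
// Sketch only -- not the actual DataNode implementation. The chunk size,
// .meta header length, and scan-log format are assumptions for illustration.
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.util.zip.CRC32;

public class BlockScanSketch {

    static final int BYTES_PER_CHECKSUM = 512; // assumed checksum chunk size
    static final int META_HEADER_LEN = 7;      // assumed .meta header length

    // Reads as many bytes as possible into buf; returns the count (0 at EOF).
    static int readFully(DataInputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) break;
            off += n;
        }
        return off;
    }

    // Recomputes the CRC32 of every chunk in the block and compares it with
    // the value stored in the .meta file. Returns false on any mismatch.
    static boolean verify(File blockFile, File metaFile) throws IOException {
        try (DataInputStream data = new DataInputStream(new FileInputStream(blockFile));
             DataInputStream meta = new DataInputStream(new FileInputStream(metaFile))) {
            meta.skipBytes(META_HEADER_LEN);
            byte[] chunk = new byte[BYTES_PER_CHECKSUM];
            int n;
            while ((n = readFully(data, chunk)) > 0) {
                CRC32 crc = new CRC32();
                crc.update(chunk, 0, n);
                long stored = meta.readInt() & 0xffffffffL;
                if (stored != crc.getValue()) {
                    return false; // corruption detected
                }
            }
        }
        return true;
    }

    // Records the verification time in a separate append-only log rather than
    // rewriting the .meta file, so the .meta sectors are never written again
    // once the block is finalized.
    static void logVerification(File scanLog, String blockName, boolean ok) throws IOException {
        try (FileWriter w = new FileWriter(scanLog, true)) {
            w.write(System.currentTimeMillis() + " " + blockName
                    + " " + (ok ? "OK" : "CORRUPT") + "\n");
        }
    }
}
{code}

A real scanner would also need to throttle these reads (e.g. sleep between chunks or cap the bytes read per second) so that, per the last bullet above, the scan stays well below the regular datanode disk traffic.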