[ https://issues.apache.org/jira/browse/HADOOP-2012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558953#action_12558953 ]

dhruba borthakur commented on HADOOP-2012:
------------------------------------------

Code looks good. +1. Some issues related to the design of this patch:

1. The DFSClient makes an additional RPC to the namenode after it has received and verified the block contents. It might be a good idea to run a DFSIO benchmark to validate that this does not impact read performance.
2. This patch adds additional simon metrics to record block validation statistics. The simon config file for existing clusters might need to be updated to view these new metrics.
3. The additional data structure that maps Blocks to BlockInfo may be merged with the existing blocksmap in FSDataset.java.

> Periodic verification at the Datanode
> -------------------------------------
>
>                 Key: HADOOP-2012
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2012
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>             Fix For: 0.16.0
>
>         Attachments: HADOOP-2012.patch, HADOOP-2012.patch, HADOOP-2012.patch,
> HADOOP-2012.patch, HADOOP-2012.patch, HADOOP-2012.patch, HADOOP-2012.patch,
> HADOOP-2012.patch, HADOOP-2012.patch
>
>
> Currently, on-disk corruption of data blocks is detected only when a block is
> read by the client or by another datanode. These errors would be detected much
> earlier if the datanode periodically verified the data checksums for its local
> blocks.
> Some of the issues to consider:
> - How often should we check the blocks (no more often than once every couple of
> weeks?)
> - How do we keep track of when a block was last verified (there is a .meta
> file associated with each block).
> - What action to take once corruption is detected.
> - Scanning should be done at a very low priority, with the rest of the datanode
> disk traffic in mind.
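
For illustration only, the scanning idea in the issue description (re-read a local block, recompute its checksums, and keep the read rate low) might be sketched as follows. This is not the code from the attached patch: the class name, method names, the 512-byte chunk size, and the bytes-per-second throttle are all assumptions made for this example, and the checksums are held in memory rather than read from a .meta file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;

/** Hypothetical sketch; none of these names come from the HADOOP-2012 patch. */
class BlockVerifierSketch {
    /** Assumed number of data bytes covered by each stored checksum. */
    static final int BYTES_PER_CHECKSUM = 512;

    /** Computes one CRC32 per chunk, standing in for the checksums saved at write time. */
    static long[] computeChecksums(byte[] data) {
        int chunks = (data.length + BYTES_PER_CHECKSUM - 1) / BYTES_PER_CHECKSUM;
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int off = i * BYTES_PER_CHECKSUM;
            int len = Math.min(BYTES_PER_CHECKSUM, data.length - off);
            crc.reset();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    /**
     * Re-reads a block and compares each chunk against its expected checksum,
     * sleeping as needed so the average read rate stays under maxBytesPerSec.
     * Returns the index of the first bad chunk, or -1 if the block is clean.
     */
    static int verify(InputStream block, long[] expected, long maxBytesPerSec)
            throws IOException, InterruptedException {
        byte[] buf = new byte[BYTES_PER_CHECKSUM];
        CRC32 crc = new CRC32();
        long startNanos = System.nanoTime();
        long totalBytes = 0;
        int chunk = 0;
        while (true) {
            int n = readChunk(block, buf);
            if (n == 0) {
                // End of data: a truncated block is reported at the first missing chunk.
                return chunk == expected.length ? -1 : chunk;
            }
            if (chunk >= expected.length) {
                return chunk; // extra data that has no checksum
            }
            crc.reset();
            crc.update(buf, 0, n);
            if (crc.getValue() != expected[chunk]) {
                return chunk;
            }
            chunk++;
            totalBytes += n;
            throttle(startNanos, totalBytes, maxBytesPerSec);
        }
    }

    /** Fills buf as far as possible; returns the number of bytes read (0 at end of stream). */
    static int readChunk(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                break;
            }
            off += n;
        }
        return off;
    }

    /** Sleeps until reading totalBytes would have taken at least totalBytes / maxBytesPerSec seconds. */
    static void throttle(long startNanos, long totalBytes, long maxBytesPerSec)
            throws InterruptedException {
        long minElapsedNanos = totalBytes * 1_000_000_000L / maxBytesPerSec;
        long sleepNanos = minElapsedNanos - (System.nanoTime() - startNanos);
        if (sleepNanos > 0) {
            Thread.sleep(sleepNanos / 1_000_000L, (int) (sleepNanos % 1_000_000L));
        }
    }
}
```

A caller would run `verify` over each local block on a long schedule and record the time of the last successful check. What to do when it returns a chunk index (for example, reporting the block to the namenode) is one of the open questions listed in the description and is left out here.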