[ https://issues.apache.org/jira/browse/HDFS-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319015#comment-14319015 ]
Colin Patrick McCabe commented on HDFS-7648:
--------------------------------------------

Hi [~szetszwo], since you suggested splitting this JIRA into two, I had assumed that you wanted to have the discussion about "automatic fixing" on the second JIRA. However if you want to have it now, I'll share my thoughts.

As I stated earlier, I don't think we should do automatic fixing. We simply don't know *why* the DataNode got into a state where the directory layout is wrong. This is similar to "what happens if there is no VERSION file?" We don't try to automatically fix this. If there is no VERSION file, then it's very likely that there is a serious misconfiguration and/or filesystem bug, and our attempts to fix it would only make things worse.

The same logic applies here. If there are blocks in the wrong location, why is that happening? It could be because there is a serious bug in the software. In that case, deleting the blocks, as you have suggested, would only lead to data loss. It could be because the sysadmin manually edited a {{VERSION}} file for an old (pre HDFS-6482) datanode directory to look like it was post-HDFS-6482, bypassing the upgrade process. In this case, deleting *all* the data is still the wrong thing to do... the sysadmin should instead see logs telling him that this configuration is wrong. Finally, blocks could be in the wrong place because there is a serious disk drive or local FS error. In this case, deletion will still do no good, because the device is in a seriously unusable state.

I'd also like to note that we've spent quite a lot of time discussing theoretical failures that may or may not ever happen. Who knows whether we actually will ever find blocks in the wrong place? You are asking for automatic handling of something that, to our knowledge, has never even happened once. That seems like putting the cart before the horse.
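For illustration only, a report-only check along these lines might look like the sketch below. It is not taken from the attached patch. The class {{LayoutCheck}}, its method names, and the exact bit shifts and masks used to derive the two subdirectory levels from the block ID are assumptions made for the example. The point it shows is that the check compares each block file's actual parent directory against the directory implied by its block ID and only collects warnings for the log, without moving or deleting anything.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class LayoutCheck {

    // Hypothetical ID-to-directory mapping: two directory levels taken from
    // bits 16-23 and bits 8-15 of the block ID. The real mapping used by the
    // DataNode may differ; these shifts and masks are assumptions.
    static Path expectedDir(Path finalizedRoot, long blockId) {
        int d1 = (int) ((blockId >> 16) & 0xFF);
        int d2 = (int) ((blockId >> 8) & 0xFF);
        return finalizedRoot.resolve("subdir" + d1).resolve("subdir" + d2);
    }

    // Compares each block file's parent directory with the expected one.
    // Returns one warning per misplaced block; never moves or deletes files.
    static List<String> verify(Path finalizedRoot, Map<Long, Path> blockFiles) {
        List<String> warnings = new ArrayList<>();
        for (Map.Entry<Long, Path> e : blockFiles.entrySet()) {
            Path expected = expectedDir(finalizedRoot, e.getKey());
            Path actual = e.getValue().getParent();
            if (!expected.equals(actual)) {
                warnings.add("Block " + e.getKey() + " found in " + actual
                        + ", expected " + expected);
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        Path root = Paths.get("/data/finalized");
        Map<Long, Path> blocks = Map.of(
                0x123456L, root.resolve("subdir18/subdir52/blk_1193046"),
                1L, root.resolve("subdir9/subdir9/blk_1"));
        // Only the second block is misplaced under the assumed mapping.
        verify(root, blocks).forEach(System.out::println);
    }
}
```

In a real DataNode the warnings would go to the log so that the sysadmin can investigate the underlying cause.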
> Verify the datanode directory layout
> ------------------------------------
>
> Key: HDFS-7648
> URL: https://issues.apache.org/jira/browse/HDFS-7648
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Reporter: Tsz Wo Nicholas Sze
> Assignee: Rakesh R
> Attachments: HDFS-7648.patch, HDFS-7648.patch
>
>
> HDFS-6482 changed datanode layout to use block ID to determine the directory
> to store the block. We should have some mechanism to verify it. Either
> DirectoryScanner or block report generation could do the check.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)