[ https://issues.apache.org/jira/browse/HDFS-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235950#comment-13235950 ]
Hadoop QA commented on HDFS-3044:
---------------------------------

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12519115/HDFS-3044.003.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2070//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2070//console

This message is automatically generated.

> fsck move should be non-destructive by default
> ----------------------------------------------
>
>         Key: HDFS-3044
>         URL: https://issues.apache.org/jira/browse/HDFS-3044
>     Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>    Reporter: Eli Collins
>    Assignee: Colin Patrick McCabe
> Attachments: HDFS-3044.002.patch, HDFS-3044.003.patch
>
>
> The fsck move behavior in the code, and as originally articulated in HADOOP-101, is:
> {quote}Current failure modes for DFS involve blocks that are completely missing. The only way to "fix" them would be to recover chains of blocks and put them into lost+found{quote}
> A directory is created with the file name, the blocks that are accessible are created as individual files in this directory, then the original file is removed.
> I suspect the rationale for this behavior was that you can't use files that are missing block locations, and copying the blocks out as files at least makes part of the file accessible. However, this behavior can also result in permanent data loss. Eg:
> - Some datanodes don't come up on cluster startup (eg due to HW issues) and check in; files whose blocks have all their replicas on this set of datanodes are marked corrupt
> - Admin does fsck move, which deletes the "corrupt" files and saves whatever blocks were available
> - The HW issues with the datanodes are resolved, and they are started and join the cluster. The NN tells them to delete their blocks for the corrupt files, since those files were deleted.
> I think we should:
> - Make fsck move non-destructive by default (eg just do a move into lost+found)
> - Make the destructive behavior optional (eg "--destructive", so admins think about what they're doing)
> - Provide better sanity checks and warnings, eg if you're running fsck and not all the slaves have checked in (if using dfs.hosts), fsck should print a warning that an admin has to explicitly override before doing anything destructive

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
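To make the proposed split concrete, here is a hedged sketch of the admin workflow in fsck terms. The `-list-corruptfileblocks`, `-move`, and `-delete` flags exist in `hadoop fsck`, but exactly how `-move` behaves (non-destructive copy vs. move-and-delete) depends on whether this patch's semantics are in your Hadoop version; the paths are illustrative and the commands assume a running HDFS cluster.

```shell
# Step 1: inspect only -- report corrupt/missing blocks without modifying anything.
hadoop fsck / -list-corruptfileblocks

# Step 2 (proposed default): salvage readable blocks of corrupt files into
# /lost+found WITHOUT deleting the originals, so late-arriving datanodes can
# still restore the missing replicas.
hadoop fsck /user/data -move

# Step 3 (explicit, destructive, separate step): remove the corrupt files
# outright -- only after confirming all expected datanodes have checked in.
hadoop fsck /user/data -delete
```

Keeping deletion behind a separate flag means the NN never schedules block deletions for files that might still be recoverable once absent datanodes rejoin, which is exactly the data-loss scenario described above.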