[ https://issues.apache.org/jira/browse/HDFS-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236082#comment-13236082 ]
Hudson commented on HDFS-3044:
------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #1925 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1925/])
HDFS-3044. fsck move should be non-destructive by default. Contributed by Colin Patrick McCabe (Revision 1304063)

Result = ABORTED
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1304063
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFsck.java

> fsck move should be non-destructive by default
> ----------------------------------------------
>
>                 Key: HDFS-3044
>                 URL: https://issues.apache.org/jira/browse/HDFS-3044
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Eli Collins
>            Assignee: Colin Patrick McCabe
>             Fix For: 0.23.3
>
>         Attachments: HDFS-3044.002.patch, HDFS-3044.003.patch
>
>
> The fsck move behavior in the code, as originally articulated in HADOOP-101, is:
> {quote}Current failure modes for DFS involve blocks that are completely missing. The only way to "fix" them would be to recover chains of blocks and put them into lost+found{quote}
> A directory is created with the file's name, the blocks that are still accessible are copied into this directory as individual files, and then the original file is removed.
> I suspect the rationale for this behavior was that you can't use files that are missing block locations, and copying the blocks out as files at least makes part of the data accessible. However, this behavior can also result in permanent data loss.
> For example:
> - Some datanodes don't come up and check in on cluster startup (e.g. due to hardware issues); files whose blocks have all their replicas on this set of datanodes are marked corrupt.
> - An admin runs fsck move, which deletes the "corrupt" files and saves whatever blocks were available.
> - The hardware issues with the datanodes are resolved, they are started, and they join the cluster. The NN then tells them to delete their blocks for the corrupt files, since the files were deleted.
> I think we should:
> - Make fsck move non-destructive by default (i.e. just move into lost+found).
> - Make the destructive behavior optional (e.g. a "--destructive" flag, so admins think about what they're doing).
> - Provide better sanity checks and warnings, e.g. if you're running fsck and not all the slaves have checked in (if using dfs.hosts), fsck should print a warning that an admin must explicitly override before doing anything destructive.
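The move semantics described above can be sketched as follows. This is a toy illustration, not the actual NamenodeFsck implementation: the class and method names (`LostFoundMover`, `moveToLostFound`) are hypothetical, and it operates on a local filesystem rather than HDFS. It shows the key point of the proposal: salvaged blocks are copied into a per-file directory under lost+found, and the original is deleted only when destructive mode is explicitly requested.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/**
 * Toy sketch (not the real NamenodeFsck code) of fsck "move" semantics:
 * recoverable blocks of a corrupt file are copied as individual files into
 * a directory named after the file under lost+found. Non-destructive by
 * default; the original is removed only if destructive mode is requested.
 */
public class LostFoundMover {
    /**
     * @param corruptFile     the file whose blocks are partially available
     * @param availableBlocks contents of the blocks that could still be read
     * @param lostFound       root of the lost+found area
     * @param destructive     if true, delete the original after copying
     * @return the directory holding the salvaged blocks
     */
    public static Path moveToLostFound(Path corruptFile,
                                       List<byte[]> availableBlocks,
                                       Path lostFound,
                                       boolean destructive) throws IOException {
        // A directory named after the corrupt file holds its salvaged blocks.
        Path target = lostFound.resolve(corruptFile.getFileName().toString());
        Files.createDirectories(target);
        int i = 0;
        for (byte[] block : availableBlocks) {
            // Each accessible block becomes an individual file.
            Files.write(target.resolve("block-" + i++), block);
        }
        // Non-destructive by default: the original survives unless the
        // admin explicitly asked for the destructive behavior.
        if (destructive) {
            Files.deleteIfExists(corruptFile);
        }
        return target;
    }
}
```

With destructive=false the corrupt file remains in place (still unreadable, but its remaining replicas are preserved should the missing datanodes return), which is exactly what prevents the data-loss scenario above.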
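The suggested sanity check (warn when hosts listed in dfs.hosts have not checked in) amounts to a set difference between configured and live datanodes. The sketch below is illustrative only; the names (`FsckSanityCheck`, `pendingHosts`, `shouldWarnBeforeDestructiveFsck`) are hypothetical and not part of any NameNode API.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Illustrative sketch of the suggested pre-fsck sanity check: if any host
 * listed in dfs.hosts has not checked in with the NameNode, destructive
 * fsck operations should require an explicit admin override.
 */
public class FsckSanityCheck {
    /** Returns the dfs.hosts entries that have not yet reported in. */
    public static Set<String> pendingHosts(Set<String> dfsHosts,
                                           Set<String> checkedIn) {
        Set<String> pending = new HashSet<>(dfsHosts);
        pending.removeAll(checkedIn);
        return pending;
    }

    /** True if a destructive fsck should warn and demand an override. */
    public static boolean shouldWarnBeforeDestructiveFsck(Set<String> dfsHosts,
                                                          Set<String> checkedIn) {
        return !pendingHosts(dfsHosts, checkedIn).isEmpty();
    }
}
```

In the data-loss scenario above, the missing datanodes would still appear in the pending set, so the warning would fire before the admin could delete the "corrupt" files.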