[ https://issues.apache.org/jira/browse/HADOOP-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557861#action_12557861 ]
Konstantin Shvachko commented on HADOOP-2540:
---------------------------------------------

> The namenode was not cleaning up the last block on lease recovery.

You mean it was not cleaning up the last block if it is a one-block file, right? This looks right to me.

I only don't like exposing the ability to change lease intervals directly through NameNode calls. I'd rather introduce undocumented configuration variables; we used to do this in the past, AFAIR. (A sketch of that approach follows the quoted issue below.)

- FSDataOutputStream stm in TestFileCreation.testFileCreationError2() is not used anywhere. Could you please also remove the other warnings in this file:
- import org.apache.hadoop.fs.FsShell; is redundant.
- import org.apache.hadoop.util.StringUtils; is redundant.
- TEST_ROOT_DIR is never read locally.
- "this.assertEquals" should be replaced by simply "assertEquals", because assertEquals() is a static method (see the example after the quoted issue below).

> Empty blocks make fsck report corrupt, even when it isn't
> ---------------------------------------------------------
>
>                 Key: HADOOP-2540
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2540
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.15.1
>            Reporter: Allen Wittenauer
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.15.3
>
>         Attachments: recoverLastBlock.patch
>
>
> If the name node crashes after blocks have been allocated but before the content has been uploaded, fsck will report the zero-sized files as corrupt upon restart:
>
>   /user/rajive/rand0/_task_200712121358_0001_m_000808_0/part-00808: MISSING 1 blocks of total size 0 B
>
> ... even though all blocks are accounted for:
>
>   Status: CORRUPT
>   Total size:                2932802658847 B
>   Total blocks:              26603 (avg. block size 110243305 B)
>   Total dirs:                419
>   Total files:               5031
>   Over-replicated blocks:    197 (0.740518 %)
>   Under-replicated blocks:   0 (0.0 %)
>   Target replication factor: 3
>   Real replication factor:   3.0074053
>
>   The filesystem under path '/' is CORRUPT
>
> In UFS and related filesystems, such files would get put into lost+found after an fsck and the filesystem would return to normal. It would be super if HDFS could do a similar thing. Perhaps once all of the nodes listed in the name node's 'includes' file have reported in, HDFS could automatically run an fsck and store these not-necessarily-broken files in something like lost+found.
>
> Files that are actually missing blocks, however, should not be touched.
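For illustration, a minimal sketch of the configuration-variable approach suggested above, as a test might use it. The key names (dfs.lease.softlimit.period, dfs.lease.hardlimit.period) and values are hypothetical, not actual Hadoop keys; the point is only that a test shortens the lease intervals through the Configuration handed to the (test-only) MiniDFSCluster at startup, rather than through a public NameNode call that clients could abuse.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.dfs.MiniDFSCluster;

public class LeaseIntervalSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical, undocumented keys that the NameNode would read once
    // at startup; the real key names would have to be defined in the
    // namesystem code as part of the patch.
    conf.setLong("dfs.lease.softlimit.period", 1000L);  // ms, hypothetical
    conf.setLong("dfs.lease.hardlimit.period", 2000L);  // ms, hypothetical
    // The shortened intervals take effect for the whole mini cluster,
    // with no client-visible API for changing them at run time.
    MiniDFSCluster cluster = new MiniDFSCluster(conf, 1, true, null);
    try {
      // ... exercise lease recovery against cluster.getFileSystem() ...
    } finally {
      cluster.shutdown();
    }
  }
}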
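And a tiny illustration of the last review nit, using a made-up test class and assuming the JUnit 3 style Hadoop's tests used at the time (TestFileCreation extends junit.framework.TestCase): assertEquals() is a static method inherited from junit.framework.Assert, so qualifying the call with "this." compiles but is misleading.

import junit.framework.TestCase;

// Made-up test class for illustration only.
public class AssertStyleExample extends TestCase {
  public void testBlockCount() {
    int expected = 1;
    int actual = 1;
    // Flagged in review: assertEquals() is static, so the "this."
    // qualifier below suggests an instance method that isn't there.
    // this.assertEquals(expected, actual);
    assertEquals(expected, actual);  // preferred form
  }
}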