[ 
https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335523#comment-14335523
 ] 

Allen Wittenauer commented on HDFS-7537:
----------------------------------------

bq. When numUnderMinimalReplicatedBlocks > 0 and there is no missing/corrupted 
block, all under minimal replicated blocks have at least one good replica so 
that they can be replicated and there is no data loss. It makes sense to 
consider the file system as healthy.

Exactly this.

I made a prototype to play with.  One of the things I did was surround the 
number of blocks that didn't meet the replication minimum with the same 
asterisks that the corrupted output uses.  This made it absolutely crystal 
clear why the NN wasn't coming out of safemode.
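A minimal sketch of what such fsck output might look like (the class, method, 
and exact label text here are illustrative assumptions, not taken from the 
actual prototype or patch):

```java
// Hypothetical sketch: flag blocks below dfs.namenode.replication.min in
// fsck output using the same asterisk framing fsck uses for CORRUPT files.
// All names and label text are illustrative, not from the HDFS-7537 patch.
public class FsckMinReplicationReport {

    static String report(long underMinBlocks, int minReplication) {
        if (underMinBlocks == 0) {
            // Nothing below the minimum: report healthy as usual.
            return "Status: HEALTHY";
        }
        // Surround the count with asterisks so it is obvious why the
        // NameNode is refusing to leave safemode.
        return String.format(
            "********************************%n"
          + " UNDER MIN REPLICATED BLOCKS:\t%d%n"
          + "   (dfs.namenode.replication.min:\t%d)%n"
          + "********************************",
            underMinBlocks, minReplication);
    }

    public static void main(String[] args) {
        System.out.println(report(3, 2));
    }
}
```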

> fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas 
> && NN restart
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-7537
>                 URL: https://issues.apache.org/jira/browse/HDFS-7537
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Allen Wittenauer
>            Assignee: GAO Rui
>         Attachments: HDFS-7537.1.patch, dfs-min-2-fsck.png, dfs-min-2.png
>
>
> If minimum replication is set to 2 or higher and some of those replicas are 
> missing and the namenode restarts, it isn't always obvious that the missing 
> replicas are the reason why the namenode isn't leaving safemode.  We should 
> improve the output of fsck and the web UI to make it obvious that the missing 
> blocks are from an unmet replication minimum vs. completely missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
