[ https://issues.apache.org/jira/browse/HDFS-13295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16405605#comment-16405605 ]

Bharat Viswanadham commented on HDFS-13295:
-------------------------------------------

Hi [~nfraison.criteo]

Thanks for the info. I am not clear on what caused the above stack trace and how it is related here.

If completeBlock is called from forceCompleteBlock, force will be set to true, so neither of these checks will throw an exception:

if (!force && !hasMinStorage(curBlock, numNodes)) {
  throw new IOException("Cannot complete block: "
      + "block does not satisfy minimal replication requirement.");
}
if (!force && curBlock.getBlockUCState() != BlockUCState.COMMITTED) {
  throw new IOException(
      "Cannot complete block: block has not been COMMITTED by the client");
}

If the block being reported comes from the edit logs, it will not have any storages set, so numNodes will be 0, right?

int numNodes = curBlock.numNodes();

So, when calling bmSafeMode.incrementSafeBlockCount(Math.min(numNodes, minStorage), curBlock), it will effectively be invoked as

bmSafeMode.incrementSafeBlockCount(Math.min(0, <<value>>), curBlock).

So I am still not convinced that reading from edit logs causes this issue.
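
For context, my understanding of the safemode accounting (a paraphrased sketch of BlockManagerSafeMode.incrementSafeBlockCount as it looks after HDFS-8716, not the exact source; details may differ between branches) is that a block is only counted as safe once the reported storage count reaches the safemode replication minimum:

// Paraphrased sketch, not the exact source code.
void incrementSafeBlockCount(int storageNum, BlockInfo storedBlock) {
  // safeReplication is taken from dfs.namenode.safemode.replication.min
  if (storageNum == safeReplication) {
    this.blockSafe++;
    checkSafeMode(); // may let the NN leave safemode once the threshold is met
  }
}

Under that reading, passing 0 would simply not count the block as safe at that point.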

 

 

Can we add a simple UT for this behavior? (The stack trace in question is included below for reference, followed by a rough sketch of such a test.)
  java.lang.Thread.State: RUNNABLE
          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.completeBlock(BlockManager.java:811)
          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.completeBlock(BlockManager.java:823)
          at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.forceCompleteBlock(BlockManager.java:836)
          at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:954)
          at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:433)
          at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:231)
          at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:140)
          at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:856)
          at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:837)
          at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:262)
          at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:395)
          at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$300(EditLogTailer.java:348)
          at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:365)
          at java.security.AccessController.doPrivileged(AccessController.java:-1)
          at javax.security.auth.Subject.doAs(Subject.java:360)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1900)
          at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:442)
          at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:361)
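
As a starting point, a rough sketch of such a test (untested; it assumes the standard MiniDFSCluster/DFSTestUtil/GenericTestUtils test utilities and the config values from this issue's environment; class and method names, paths, and timeouts are placeholders):

// Rough sketch only, not a tested patch.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DFSTestUtil;
import org.apache.hadoop.hdfs.HdfsConfiguration;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.test.GenericTestUtils;
import org.junit.Test;

public class TestSafeModeWithLowerSafemodeReplicationMin {

  @Test(timeout = 180000)
  public void testNnLeavesSafemodeAfterEditLogReplay() throws Exception {
    Configuration conf = new HdfsConfiguration();
    // Reproduce the configuration reported in this issue.
    conf.setInt("dfs.namenode.replication.min", 2);
    conf.setInt("dfs.namenode.safemode.replication.min", 1);

    MiniDFSCluster cluster =
        new MiniDFSCluster.Builder(conf).numDataNodes(3).build();
    try {
      cluster.waitActive();
      FileSystem fs = cluster.getFileSystem();
      // Write a file so its blocks are only recorded in the edit log.
      DFSTestUtil.createFile(fs, new Path("/test"), 1024L, (short) 3, 0L);

      // Restart the NameNode so the blocks are loaded back from edits.
      cluster.restartNameNode(false);

      // The NameNode should leave safemode on its own within the wait.
      GenericTestUtils.waitFor(
          () -> !cluster.getNamesystem().isInSafeMode(), 1000, 120000);
    } finally {
      cluster.shutdown();
    }
  }
}

It may need an HA setup with an edit-log-tailing standby to hit the exact EditLogTailer path from the stack above, but something along these lines should show whether the safe block count stalls.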

> Namenode doesn't leave safemode if dfs.namenode.safemode.replication.min set 
> < dfs.namenode.replication.min
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13295
>                 URL: https://issues.apache.org/jira/browse/HDFS-13295
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>         Environment: CDH 5.11 with HDFS-8716 backported.
> dfs.namenode.replication.min=2
> dfs.namenode.safemode.replication.min=1
>  
>            Reporter: Nicolas Fraison
>            Assignee: Nicolas Fraison
>            Priority: Major
>         Attachments: HDFS-13295.patch
>
>
> When dfs.namenode.safemode.replication.min is set lower than dfs.namenode.replication.min 
> (possible since HDFS-8716), the replica count at which a block is added to the safe block 
> count should be dfs.namenode.safemode.replication.min, as used in 
> `FSNamesystem.incrementSafeBlockCount`.
> However, when reading modifications from edits, the replica number for new blocks is set to 
> min(numNodes, dfs.namenode.replication.min) in BlockManager.completeBlock, which is greater 
> than dfs.namenode.safemode.replication.min.
> Because of that, the safe block count never reaches the number of available blocks and the 
> namenode does not leave safemode automatically.


