[ https://issues.apache.org/jira/browse/HDFS-16999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17720740#comment-17720740 ]
ASF GitHub Bot commented on HDFS-16999: --------------------------------------- Hexiaoqiao merged PR #5622: URL: https://github.com/apache/hadoop/pull/5622 > Fix wrong use of processFirstBlockReport() > ------------------------------------------ > > Key: HDFS-16999 > URL: https://issues.apache.org/jira/browse/HDFS-16999 > Project: Hadoop HDFS > Issue Type: Bug > Reporter: Shuyan Zhang > Assignee: Shuyan Zhang > Priority: Major > Labels: pull-request-available > > `processFirstBlockReport()` is used to process first block report from > datanode. It does not calculating `toRemove` list because it believes that > there is no metadata about the datanode in the namenode. However, If a > datanode is re registered after restarting, its `blockReportCount` will be > updated to 0. That is to say, the first block report after a datanode > restarts will be processed by `processFirstBlockReport()`. This is > unreasonable because the metadata of the datanode already exists in namenode > at this time, and if redundant replica metadata is not removed in time, the > blocks with insufficient replicas cannot be reconstruct in time, which > increases the risk of missing block. In summary, `processFirstBlockReport()` > should only be used when the namenode restarts, not when the datanode > restarts. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org