[
https://issues.apache.org/jira/browse/HADOOP-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672357#action_12672357
]
Hairong Kuang commented on HADOOP-5133:
---------------------------------------
Next two lines of the log:
WARN hdfs.StateChange (FSNamesystem.java:addStoredBlock(2872)) - BLOCK*
NameSystem.addStoredBlock: Redundant addStoredBlock request received for
blk_2248817250507458558_1011 on 127.0.0.1:51024 size 63
WARN hdfs.StateChange (FSNamesystem.java:addStoredBlock(2872)) - BLOCK*
NameSystem.addStoredBlock: Redundant addStoredBlock request received for
blk_2248817250507458558_1011 on 127.0.0.1:51021 size 63
The blockReceived from 127.0.0.1:51021 did arrive. This time the NN did not
complain about the length, only about a redundant addStoredBlock. The replica
was added to the blocksMap, but to no effect, because the block had already
been marked as corrupt. The problem here is that 127.0.0.1:51021 had a
perfectly good replica, but the NN wrongly marked it as corrupt based on
stale information.
> FSNameSystem#addStoredBlock does not handle inconsistent block length
> correctly
> -------------------------------------------------------------------------------
>
> Key: HADOOP-5133
> URL: https://issues.apache.org/jira/browse/HADOOP-5133
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.18.2
> Reporter: Hairong Kuang
> Assignee: Hairong Kuang
> Priority: Blocker
> Fix For: 0.18.4
>
>
> Currently the NameNode treats either the new replica or the existing replicas
> as corrupt if the new replica's length is inconsistent with the block length
> recorded by the NN. The correct behavior should be:
> 1. For a block that is not under construction, the new replica should be
> marked as corrupt if its length is inconsistent (whether shorter or longer)
> with the NN recorded block length;
> 2. For an under construction block, if the new replica's length is shorter
> than the NN recorded block length, the new replica could be marked as
> corrupt; if the new replica's length is longer, the NN should update its
> recorded block length, but it should not mark existing replicas as corrupt.
> This is because the NN recorded length for an under construction block does
> not accurately match the block length on the datanode's disk. The NN should
> not judge an under construction replica to be corrupt based on inaccurate
> information, namely its own recorded block length.
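
The two rules in the description can be sketched as a small decision
function. This is only an illustration of the proposed behavior, not the
actual FSNamesystem#addStoredBlock code; the class name ReplicaLengthCheck,
the Action enum, and the decide method are all hypothetical names introduced
here. Note that no branch marks the existing replicas as corrupt.

```java
/**
 * Hypothetical sketch of the length check proposed in HADOOP-5133.
 * None of these names exist in Hadoop; they are assumptions for illustration.
 */
final class ReplicaLengthCheck {

    /** What the NN would do with a newly reported replica. */
    enum Action {
        ACCEPT,                    // lengths agree, nothing to fix
        MARK_NEW_REPLICA_CORRUPT,  // only the new replica is suspect
        UPDATE_RECORDED_LENGTH     // NN's recorded length was stale
    }

    /**
     * @param underConstruction whether the block is still being written
     * @param recordedLen       block length currently recorded by the NN
     * @param reportedLen       length of the replica reported by the datanode
     */
    static Action decide(boolean underConstruction,
                         long recordedLen,
                         long reportedLen) {
        if (reportedLen == recordedLen) {
            return Action.ACCEPT;
        }
        if (!underConstruction) {
            // Rule 1: any mismatch, shorter or longer, implicates the new replica.
            return Action.MARK_NEW_REPLICA_CORRUPT;
        }
        if (reportedLen < recordedLen) {
            // Rule 2, shorter: the description says the new replica "could be"
            // marked corrupt; this sketch chooses to do so.
            return Action.MARK_NEW_REPLICA_CORRUPT;
        }
        // Rule 2, longer: the recorded length is the inaccurate side, so the
        // NN updates it and leaves the existing replicas alone.
        return Action.UPDATE_RECORDED_LENGTH;
    }

    private ReplicaLengthCheck() {}
}
```

For example, under these assumptions a 63-byte replica reported for an under
construction block whose recorded length is smaller would lead the NN to
update its recorded length rather than mark any replica as corrupt.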