[ 
https://issues.apache.org/jira/browse/HADOOP-5133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668214#action_12668214
 ] 

dhruba borthakur commented on HADOOP-5133:
------------------------------------------

> addStoredBlock can not completely ignore it. It should at least update the 
> stored block length and add the replica to the blocksMap.

Agreed.

Suppose two datanodes report inconsistent block lengths in their blockReceived 
confirmations of the same block, and both replicas have the same generation 
stamp.
  1. If the file is not under construction, or the block is not the last block 
of the file, then the shorter replica should be treated as corrupt. The 
longer replica should be in the blocksMap.
  2. If the block is the last block of a file that is under construction, then 
keep the longer replica in the blocksMap and remove the shorter replica from 
the blocksMap, but do not delete the shorter replica from the corresponding 
datanode (i.e. do not treat the shorter replica as corrupt).
 
Case 1 typically happens when the lazy flush of OS buffers in the datanode 
encounters a transient error and one copy of a good replica is truncated on 
disk.

Case 2 could occur because a datanode prematurely (because of buggy code 
somewhere) sends a blockReceived to the NN. In this case, it is safe to not 
treat the replica as corrupt because the existence of the lease indicates that 
the NN does not "own" this block. This situation will be fixed when a block 
report is processed after the lease is closed.
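
The two cases above can be sketched roughly as follows. This is only an 
illustration of the proposed rule, not the actual addStoredBlock code: the 
class, enum, and method names are hypothetical, and it assumes both replicas 
carry the same generation stamp.

```java
// Hypothetical sketch; none of these names come from the Hadoop code base.
final class ReplicaLengthSketch {

    /** What to do with the shorter of two replicas of the same block. */
    enum ShorterReplicaAction {
        NONE,               // lengths agree, nothing to resolve
        MARK_CORRUPT,       // case 1: treat the shorter replica as corrupt
        DROP_FROM_MAP_ONLY  // case 2: remove from the blocksMap, keep it on the datanode
    }

    /** Length to keep in the blocksMap plus the action for the shorter replica. */
    static final class Decision {
        final long keptLength;
        final ShorterReplicaAction shorterAction;

        Decision(long keptLength, ShorterReplicaAction shorterAction) {
            this.keptLength = keptLength;
            this.shorterAction = shorterAction;
        }

        @Override
        public String toString() {
            return "keep length " + keptLength + ", shorter replica: " + shorterAction;
        }
    }

    /**
     * Resolves two reported lengths for the same block.
     * Assumes both replicas have the same generation stamp.
     */
    static Decision resolve(long lengthA, long lengthB,
                            boolean fileUnderConstruction, boolean isLastBlock) {
        // In every case the longer replica is the one kept in the blocksMap.
        long kept = Math.max(lengthA, lengthB);
        if (lengthA == lengthB) {
            return new Decision(kept, ShorterReplicaAction.NONE);
        }
        if (fileUnderConstruction && isLastBlock) {
            // Case 2: the lease still exists, so do not judge the shorter replica.
            return new Decision(kept, ShorterReplicaAction.DROP_FROM_MAP_ONLY);
        }
        // Case 1: the block is complete, so the shorter replica is truncated.
        return new Decision(kept, ShorterReplicaAction.MARK_CORRUPT);
    }

    public static void main(String[] args) {
        System.out.println(resolve(100L, 64L, false, true)); // case 1
        System.out.println(resolve(100L, 64L, true, true));  // case 2
    }
}
```

In this sketch the only difference between the two cases is what happens to 
the shorter replica; the longer one is kept in the blocksMap either way.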

> FSNameSystem#addStoredBlock does not handle inconsistent block length 
> correctly
> -------------------------------------------------------------------------------
>
>                 Key: HADOOP-5133
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5133
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.2
>            Reporter: Hairong Kuang
>             Fix For: 0.19.1
>
>
> Currently NameNode treats either the new replica or existing replicas as 
> corrupt if the new replica's length is inconsistent with NN recorded block 
> length. The correct behavior should be
> 1. For a block that is not under construction, the new replica should be 
> marked as corrupt if its length is inconsistent (no matter shorter or longer) 
> with the NN recorded block length;
> 2. For an under construction block, if the new replica's length is shorter 
> than the NN recorded block length, the new replica could be marked as 
> corrupt; if the new replica's length is longer, NN should update its recorded 
> block length. But it should not mark existing replicas as corrupt. This is 
> because NN recorded length for an under construction block does not 
> accurately match the block length on datanode disk. NN should not judge an 
> under construction replica to be corrupt by looking at the inaccurate 
> information:  its recorded block length.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
