[ 
https://issues.apache.org/jira/browse/HADOOP-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666301#action_12666301
 ] 

Hairong Kuang commented on HADOOP-4692:
---------------------------------------

In the current trunk, the source datanode ignores the block length sent by the 
NameNode (NN) and uses the on-disk block length when transferring the block.

My plan is that, on receiving a block replication request, the datanode first 
checks whether the block is under construction by looking at the 
ongoingCreates list. If it is, the datanode does not replicate the block. 
Otherwise it checks whether the on-disk block length matches the block length 
sent by the NN. If the lengths differ, it reports the block to the NN as 
corrupt and does not replicate it. Otherwise, it starts replicating the block.
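
The check order described above can be sketched roughly as follows. This is
only an illustration of the proposed decision logic, not the attached patch:
the class ReplicationCheck, the Decision enum, and the decide() method are
hypothetical names, and ongoingCreates is modeled as a plain set of block ids
rather than the datanode's real data structure. The lengths in main() are the
two sizes that appear in the NameNode log quoted below.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the proposed datanode-side check; none of these
// names come from the Hadoop source.
class ReplicationCheck {

    enum Decision { SKIP_UNDER_CONSTRUCTION, REPORT_CORRUPT, REPLICATE }

    // Decide what the source datanode does with a replication request.
    // ongoingCreates is assumed to hold the ids of blocks still being written.
    static Decision decide(long blockId,
                           long lengthFromNameNode,
                           long onDiskLength,
                           Set<Long> ongoingCreates) {
        // Step 1: a block under construction is never replicated.
        if (ongoingCreates.contains(blockId)) {
            return Decision.SKIP_UNDER_CONSTRUCTION;
        }
        // Step 2: a length mismatch means the local replica is reported
        // to the NN as corrupt instead of being transferred.
        if (onDiskLength != lengthFromNameNode) {
            return Decision.REPORT_CORRUPT;
        }
        // Step 3: lengths agree, so the block can be transferred.
        return Decision.REPLICATE;
    }

    public static void main(String[] args) {
        Set<Long> ongoingCreates = new HashSet<>();
        long nnLength = 134217728L;    // size recorded by the NN in the log
        long diskLength = 134205440L;  // shorter size reported in the log
        System.out.println(decide(42L, nnLength, diskLength, ongoingCreates));
        System.out.println(decide(42L, nnLength, nnLength, ongoingCreates));
    }
}
```

With a check like this, a source replica shorter than the length the NN holds
would be reported once as corrupt rather than being copied to new targets
again and again.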

>  Namenode in infinite loop for replicating/deleting corrupted block
> -------------------------------------------------------------------
>
>                 Key: HADOOP-4692
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4692
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.20.0
>
>         Attachments: namenode_inconsistent_size.patch, 
> truncateBlockReplication.patch
>
>
> Our cluster has an under-replicated block with only one replica, assuming its 
> block id is B. NameNode log shows that NameNode is in an infinite loop 
> replicating/deleting the block.
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* ask DN1 to replicate blk_B to 
> datanode(s) DN2, DN3
> WARN org.apache.hadoop.fs.FSNamesystem: Inconsistent size for block blk_B 
> reported from DN2  current size is 134217728 reported size is 134205440
> WARN org.apache.hadoop.fs.FSNamesystem: Deleting block blk_B from DN2
> INFO org.apache.hadoop.dfs.StateChange: DIR* NameSystem.invalidateBlock: 
> blk_B on DN2
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.delete: blk_B is 
> added to invalidSet of DN2
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
> blockMap updated: DN2 is added to blk_B size 134217728
> WARN org.apache.hadoop.fs.FSNamesystem: Inconsistent size for block blk_-B 
> reported from DN3 current size is 134217728 reported size is 134205440
> WARN org.apache.hadoop.fs.FSNamesystem: Deleting block blk_B from DN3
> INFO org.apache.hadoop.dfs.StateChange: DIR* NameSystem.invalidateBlock: 
> blk_B on DN3
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.delete: blk_B is 
> added to invalidSet of DN3
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
> blockMap updated: DN3 is added to blk_B size 134217728
> INFO org.apache.hadoop.dfs.StateChange: BLOCK* ask DN1 to replicate blk_B  to 
> datanode(s) DN4, DN5
> ...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
