[ 
https://issues.apache.org/jira/browse/HADOOP-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066974#comment-14066974
 ] 

Wang Peipei commented on HADOOP-2890:
-------------------------------------

Hi dhruba borthakur, isn't this problem because the NN incorrectly uses the 
block object received over RPC when queueing to the neededReplications queue, 
instead of using its internal block object? Note that 134217728 is exactly 
128MB, the full block size. I got this from HADOOP-5605. A minimal sketch of 
the pattern follows.
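
The sketch below is a simplified, self-contained illustration of that 
hypothesis, not the actual Hadoop source: the class and field names (Block, 
blocksMap, neededReplications) are stand-ins. It shows why queueing the 
RPC-supplied object would propagate the reported (possibly truncated) size 
instead of the namenode's expected size.

{code:java}
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class Block {
    final long blockId;
    long numBytes; // size as known to whoever constructed this object

    Block(long blockId, long numBytes) {
        this.blockId = blockId;
        this.numBytes = numBytes;
    }
}

class NameNodeSketch {
    // The namenode's authoritative view of each block.
    private final Map<Long, Block> blocksMap = new HashMap<>();
    // Blocks waiting to be re-replicated.
    private final Queue<Block> neededReplications = new ArrayDeque<>();

    void addStoredBlock(Block stored) {
        blocksMap.put(stored.blockId, stored);
    }

    // Called when a datanode reports a replica over RPC.
    void processReportedBlock(Block reported) {
        Block stored = blocksMap.get(reported.blockId);
        if (stored == null || stored.numBytes == reported.numBytes) {
            return; // unknown block, or sizes agree
        }
        // Buggy pattern: neededReplications.add(reported) would carry the
        // reported (possibly truncated) size into replication decisions.
        // Correct pattern: queue the internal object, which carries the
        // expected size (134217728 in this issue).
        neededReplications.add(stored);
    }
}
{code}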

> HDFS should recover when replicas of a block have different sizes (due to 
> corrupted block)
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2890
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2890
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.16.0
>            Reporter: Lohit Vijayarenu
>            Assignee: dhruba borthakur
>             Fix For: 0.17.0
>
>         Attachments: inconsistentSize.patch, inconsistentSize.patch, 
> inconsistentSize.patch, inconsistentSize.patch
>
>
> We had a case where reading a file caused IOException.
> 08/02/25 17:23:02 INFO fs.DFSClient: Could not obtain block 
> blk_-8333897631311887285 from any node:  java.io.IOException: No live nodes 
> contain current block
> hadoop fsck said the block was healthy.
> [lohit]$ hadoop fsck part-04344 -files -blocks -locations | grep 
> 8333897631311887285
> 21. -8333897631311887285 len=134217728 repl=3 [74.6.129.238:50010, 
> 74.6.133.231:50010, 74.6.128.158:50010]
> Looking for logs about the block showed this message in the namenode log:
> 17:26:23,543 WARN org.apache.hadoop.fs.FSNamesystem: Inconsistent size for 
> block blk_-8333897631311887285 reported from 74.6.133.231:50010 current size 
> is 134217728 reported size is 134205440
> So, the namenode was expecting 134217728 while the actual block size was 
> 134205440
> Dhruba took a further look at the logs and we found out this is what had 
> happened:
> 1. While the file was being created, this block was replicated to three 
> nodes, of which two had the correct-sized block while the third had a 
> partial/truncated block (but the metadata was the same on all nodes).
> 2. Three days later the namenode was restarted, at which point the third 
> node's report triggered the warning about the incorrect block size (the 
> namenode logged this).
> 3. A few days later the first two nodes went down, and the third node 
> replicated the partial/truncated block to two new nodes.
> 4. Now when we tried to read this block, we hit the IOException.
> 5. On all the nodes, the metadata corresponded to the original valid block 
> while the block itself was missing about 12K of data (134217728 - 134205440 
> = 12288 bytes).
> Two problems could be fixed here (both are sketched after this quote):
> 1. When the namenode identifies replicas with different block sizes (point 
> 2 above), it could choose the biggest block and discard the smaller ones. 
> If the block is not the last block, then its size has to equal the block 
> size; anything less than that can be considered a bad block.
> 2. The datanode's periodic block verifier could also verify that the 
> metadata records the same size as the actual block present on disk. Any 
> mismatch should be reported/recovered in line with the step above.
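
Both proposed checks are easy to express. Here is a minimal, self-contained 
Java sketch under simplified assumptions: ReplicaReport and the 
metadata-length parameter are hypothetical stand-ins, not actual Hadoop 
classes. It shows the biggest-replica rule for fix 1 and the size 
cross-check for fix 2.

{code:java}
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class ReplicaReport {
    final String datanode;
    final long numBytes;

    ReplicaReport(String datanode, long numBytes) {
        this.datanode = datanode;
        this.numBytes = numBytes;
    }
}

class ReplicaSizeChecks {
    // Fix 1: keep the largest replica. For a non-last block, anything below
    // the file's block size (134217728 here) is treated as corrupt outright.
    static List<ReplicaReport> findCorruptReplicas(
            List<ReplicaReport> replicas, boolean isLastBlock, long blockSize) {
        long best = 0;
        for (ReplicaReport r : replicas) {
            best = Math.max(best, r.numBytes);
        }
        long expected = isLastBlock ? best : blockSize;
        List<ReplicaReport> corrupt = new ArrayList<>();
        for (ReplicaReport r : replicas) {
            if (r.numBytes < expected) {
                corrupt.add(r); // candidates to invalidate and re-replicate
            }
        }
        return corrupt;
    }

    // Fix 2: the datanode's periodic verifier cross-checks the length
    // recorded in the block metadata against the block file on disk, and
    // reports a mismatch to the namenode so fix 1 can take over.
    static boolean sizeMatchesMeta(File blockFile, long lengthFromMeta) {
        return blockFile.length() == lengthFromMeta;
    }
}
{code}

In this issue's scenario, findCorruptReplicas would have flagged the 
134205440-byte replica as soon as the namenode saw the mismatch, rather 
than letting it become the replication source.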


