[ 
https://issues.apache.org/jira/browse/HDFS-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14368053#comment-14368053
 ] 

Konstantin Shvachko commented on HDFS-7930:
-------------------------------------------

This will not fix {{testTruncateWithDataNodesRestart()}} completely, though. 
The location is correctly invalidated on the NN, but the NN then postpones 
invalidation on the DN and waits for the next block report.
{code}
2015-03-18 15:11:02,922 INFO  BlockStateChange 
(CorruptReplicasMap.java:addToCorruptReplicasMap(76)) - BLOCK 
NameSystem.addToCorruptReplicasMap: blk_1073741827 added as corrupt on 
127.0.0.1:46044 by localhost/127.0.0.1  because block is COMPLETE and reported 
genstamp 1003 does not match genstamp in block map 1004
2015-03-18 15:11:02,922 INFO  BlockStateChange 
(BlockManager.java:invalidateBlock(1215)) - BLOCK* invalidateBlock: 
blk_1073741827_1003(stored=blk_1073741827_1004) on 127.0.0.1:46044
2015-03-18 15:11:02,922 INFO  BlockStateChange 
(BlockManager.java:invalidateBlock(1225)) - BLOCK* invalidateBlocks: postponing 
invalidation of blk_1073741827_1003(stored=blk_1073741827_1004) on 
127.0.0.1:46044 because 1 replica(s) are located on nodes with potentially 
out-of-date block reports
{code}
If I add {{triggerBlockReports()}} before {{waitReplication()}}, the test 
passes, because the block report finally triggers deletion of the replica on 
the DN.
I am fine with fixing the test by adding {{triggerBlockReports()}} as above, 
but I don't know the reason for postponing replica deletion. Postponing 
should probably be avoided in this case, since {{commitBlockSync()}} is as 
good as a block report for that particular block.
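
To illustrate the behavior described above, here is a toy model of 
postponed invalidation. It is only a sketch and not Hadoop code: the class 
{{PostponedInvalidationSketch}} and all of its methods are hypothetical names 
made up for illustration, and the "postpone while any replica holder is stale" 
rule is a simplification inferred from the log message above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only; none of these names are Hadoop APIs.
public class PostponedInvalidationSketch {
    // Nodes holding a replica whose last block report may be out of date.
    private final Set<String> staleNodes = new HashSet<>();
    // Corrupt replicas (identified by node) whose deletion is postponed.
    private final List<String> postponed = new ArrayList<>();
    // Corrupt replicas (identified by node) whose deletion is scheduled.
    private final List<String> scheduled = new ArrayList<>();

    // Mark a node as possibly out of date, e.g. after a DN restart.
    public void markStale(String node) {
        staleNodes.add(node);
    }

    // Returns true if deletion is scheduled now, false if it is postponed.
    public boolean invalidate(String node) {
        if (!staleNodes.isEmpty()) {
            postponed.add(node);
            return false;
        }
        scheduled.add(node);
        return true;
    }

    // A block report refreshes the node. Once no stale nodes remain,
    // every postponed deletion is released.
    public void blockReport(String node) {
        staleNodes.remove(node);
        if (staleNodes.isEmpty()) {
            scheduled.addAll(postponed);
            postponed.clear();
        }
    }

    public List<String> scheduled() {
        return scheduled;
    }

    public List<String> postponed() {
        return postponed;
    }

    public static void main(String[] args) {
        PostponedInvalidationSketch nn = new PostponedInvalidationSketch();
        nn.markStale("127.0.0.1:46044");   // DN restarted, no report yet
        System.out.println("scheduled immediately: "
            + nn.invalidate("127.0.0.1:46044"));
        nn.blockReport("127.0.0.1:46044"); // the step the test has to force
        System.out.println("scheduled after report: " + nn.scheduled());
    }
}
```

In this model the corrupt replica stays in the postponed list until the block 
report arrives, which mirrors why the test only passes once a report is 
forced before waiting for replication.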

BTW, your change completely eliminates the failure of 
{{testTruncateWithDataNodesRestartImmediately()}} from HDFS-7886, which I ran 
without {{triggerBlockReports()}}.

> commitBlockSynchronization() does not remove locations
> ------------------------------------------------------
>
>                 Key: HDFS-7930
>                 URL: https://issues.apache.org/jira/browse/HDFS-7930
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.7.0
>            Reporter: Konstantin Shvachko
>            Assignee: Yi Liu
>            Priority: Blocker
>         Attachments: HDFS-7930.001.patch, HDFS-7930.002.patch
>
>
> When {{commitBlockSynchronization()}} has fewer {{newTargets}} than the 
> original block has locations, it does not remove the unconfirmed locations. 
> As a result, the block stores locations with different lengths or genStamps 
> (corrupt).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
