[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241145#comment-13241145 ]
Ashish Singhi commented on HDFS-3157:
-------------------------------------

After the block is deleted on DN1, pipeline recovery updates the block's generation stamp, say from blk_blockId_1002 to blk_blockId_1003. The NameNode then marks DN1's replica, which still carries the old generation stamp, as corrupt. In BlockManager#processReportedBlock(), storedBlock gets assigned blk_blockId_1003, because the blocksMap has already been updated with the new generation stamp for this block id, and so the NameNode asks DN1 to delete blk_blockId_1003. Since DN1's volumeMap does not contain blk_blockId_1003, the delete throws an exception.

> Error in deleting block is keep on coming from DN even after the block report
> and directory scanning has happened
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-3157
>                 URL: https://issues.apache.org/jira/browse/HDFS-3157
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: J.Andreina
>             Fix For: 0.24.0
>
>
> Cluster setup:
> 1 NN, three DNs (DN1, DN2, DN3), replication factor 2,
> "dfs.blockreport.intervalMsec" = 300, "dfs.datanode.directoryscan.interval" = 1
>
> Step 1: Write one file "a.txt" with sync (not closed).
> Step 2: Delete the block replica on one of the datanodes to which replication
> happened, say DN1 (from rbw).
> Step 3: Close the file.
>
> Since the replication factor is 2, the block is replicated to the other
> datanode.
> Then at the NN side the following cmd is issued to the DN from which the
> block was deleted:
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK
> NameSystem.addToCorruptReplicasMap: duplicate requested for
> blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX
> because reported RBW replica with genstamp 1002 does not match COMPLETE
> block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> Removing block blk_2903555284838653156_1003 from neededReplications as it has
> enough replicas.
> {noformat}
> On the datanode from which the block was deleted, the following exception
> occurred:
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode:
> Unexpected error trying to delete block blk_2903555284838653156_1003.
> BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode:
> Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>         at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>         at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>         at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>         at java.lang.Thread.run(Thread.java:619)
> {noformat}

--
This message is automatically generated by JIRA.
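The sequence described in the comment can be sketched as a toy model. This is not the actual Hadoop code: the maps below are simplified stand-ins for the NameNode's blocksMap and the DataNode's volumeMap, and the two methods only mirror the relevant decisions in BlockManager#processReportedBlock() and FSDataset.invalidate() under that assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class GenStampMismatchSketch {

    // NN side (simplified): on a gen-stamp mismatch, queue an invalidation
    // that names the STORED (new) gen stamp, not the one the DN reported.
    static String invalidationTarget(long blockId, long reportedGs,
                                     Map<Long, Long> blocksMap) {
        long storedGs = blocksMap.get(blockId);
        return (reportedGs != storedGs) ? "blk_" + blockId + "_" + storedGs : null;
    }

    // DN side (simplified): the delete succeeds only if the volumeMap holds
    // the block with the exact gen stamp named in the command.
    static boolean canDelete(long blockId, long commandGs,
                             Map<Long, Long> volumeMap) {
        Long gs = volumeMap.get(blockId);
        return gs != null && gs == commandGs;
    }

    public static void main(String[] args) {
        long blockId = 2903555284838653156L;

        // NN: blocksMap already updated to gen stamp 1003 by pipeline recovery.
        Map<Long, Long> blocksMap = new HashMap<>();
        blocksMap.put(blockId, 1003L);

        // DN1: still has the stale replica entry with the old gen stamp 1002.
        Map<Long, Long> dn1VolumeMap = new HashMap<>();
        dn1VolumeMap.put(blockId, 1002L);

        // DN1 reports gen stamp 1002; the NN queues a delete for _1003.
        String cmd = invalidationTarget(blockId, 1002L, blocksMap);
        System.out.println("NN asks DN1 to delete: " + cmd);

        // DN1 never had _1003 in its volumeMap, so the delete fails each
        // time the command is reissued -- the repeating IOException above.
        System.out.println("DN1 can delete it: "
                + canDelete(blockId, 1003L, dn1VolumeMap));
    }
}
```

Because the failed delete never clears the pending invalidation, the same command is resent on every heartbeat, which matches the error recurring even after block reports and directory scans.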