[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407133#comment-13407133 ]

Hudson commented on HDFS-3157:
------------------------------

Integrated in Hadoop-Mapreduce-trunk #1127 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1127/])
HDFS-3157. Fix a bug in the case that the generation stamps of the stored block in a namenode and the reported block from a datanode do not match. Contributed by Ashish Singhi (Revision 1356086)

Result = SUCCESS
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1356086
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java

> Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3157
>                 URL: https://issues.apache.org/jira/browse/HDFS-3157
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: J.Andreina
>            Assignee: Ashish Singhi
>             Fix For: 2.0.1-alpha
>
>         Attachments: HDFS-3157-1.patch, HDFS-3157-1.patch, HDFS-3157-2.patch, HDFS-3157-3.patch, HDFS-3157-3.patch, HDFS-3157-4.patch, HDFS-3157-5.patch, HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch, h3157_20120618.patch
>
>
> Cluster setup:
> 1NN,Three DN(DN1,DN2,DN3),replication factor-2,"dfs.blockreport.intervalMsec" 300,"dfs.datanode.directoryscan.interval" 1
> step 1: write one file "a.txt" with sync(not closed)
> step 2: Delete the blocks in one of the datanode say DN1(from rbw) to which replication happened.
> step 3: close the file.
> Since the replication factor is 2 the blocks are replicated to the other datanode.
> Then at the NN side the following cmd is issued to DN from which the block is deleted
> -------------------------------------------------------------------------------------
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas.
> {noformat}
> From the datanode side in which the block is deleted the following exception occurred
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>     at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>     at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>     at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>     at java.lang.Thread.run(Thread.java:619)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
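The commit message and logs in this notification concern a replica whose generation stamp on the datanode (1002) differs from the one the namenode has stored (1003). As a rough illustration of why a delete request naming the wrong generation stamp can miss, here is a minimal sketch. `ToyReplicaMap` and `GenStampMismatchDemo` are hypothetical names, and the rule that a replica is matched on both block ID and generation stamp is an assumption made for this example, not a description of the real FSDataset code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a datanode's replica map; not the real FSDataset.
class ToyReplicaMap {
    // block ID -> generation stamp of the replica held locally
    private final Map<Long, Long> replicas = new HashMap<>();

    void add(long blockId, long genStamp) {
        replicas.put(blockId, genStamp);
    }

    // Removes the replica only when both the ID and the generation stamp match
    // (an assumption made for this illustration).
    boolean invalidate(long blockId, long genStamp) {
        Long held = replicas.get(blockId);
        if (held == null || held != genStamp) {
            return false; // nothing matching was found, so nothing is deleted
        }
        replicas.remove(blockId);
        return true;
    }

    boolean contains(long blockId) {
        return replicas.containsKey(blockId);
    }
}

class GenStampMismatchDemo {
    public static void main(String[] args) {
        long blockId = 2903555284838653156L;
        ToyReplicaMap dn = new ToyReplicaMap();
        dn.add(blockId, 1002L); // stale replica with the older generation stamp

        // A request naming the namenode's stored generation stamp does not match.
        System.out.println("delete with 1003: " + dn.invalidate(blockId, 1003L));
        // A request naming the reported generation stamp does.
        System.out.println("delete with 1002: " + dn.invalidate(blockId, 1002L));
    }
}
```

In this toy model the request only succeeds when it carries the generation stamp the datanode actually holds, which is the kind of mismatch the commit message refers to.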
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405016#comment-13405016 ]

Hudson commented on HDFS-3157:
------------------------------

Integrated in Hadoop-Hdfs-trunk #1093 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1093/])
HDFS-3157. Fix a bug in the case that the generation stamps of the stored block in a namenode and the reported block from a datanode do not match. Contributed by Ashish Singhi (Revision 1356086)

Result = FAILURE
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1356086
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404940#comment-13404940 ]

Hudson commented on HDFS-3157:
------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #2436 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2436/])
HDFS-3157. Fix a bug in the case that the generation stamps of the stored block in a namenode and the reported block from a datanode do not match. Contributed by Ashish Singhi (Revision 1356086)

Result = FAILURE
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1356086
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404928#comment-13404928 ]

Hudson commented on HDFS-3157:
------------------------------

Integrated in Hadoop-Common-trunk-Commit #2419 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2419/])
HDFS-3157. Fix a bug in the case that the generation stamps of the stored block in a namenode and the reported block from a datanode do not match. Contributed by Ashish Singhi (Revision 1356086)

Result = SUCCESS
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1356086
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404925#comment-13404925 ]

Hudson commented on HDFS-3157:
------------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #2487 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2487/])
HDFS-3157. Fix a bug in the case that the generation stamps of the stored block in a namenode and the reported block from a datanode do not match. Contributed by Ashish Singhi (Revision 1356086)

Result = SUCCESS
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1356086
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404913#comment-13404913 ]

Uma Maheswara Rao G commented on HDFS-3157:
-------------------------------------------

Thanks Nicholas for the explanation.
+1 on the patch.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404907#comment-13404907 ]

Tsz Wo (Nicholas), SZE commented on HDFS-3157:
----------------------------------------------

Hi Uma,

Thanks for taking a look. The reason for using == instead of equals(..) is that one of the constructors (see below) sets stored and corrupted to the same object. So == is fine.
{code}
+BlockToMarkCorrupt(BlockInfo stored, String reason) {
+  this(stored, stored, reason);
+}
{code}
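The point about `==` versus `equals(..)` in this reply can be illustrated with a small self-contained sketch. `ToyBlock` and `ToyBlockToMarkCorrupt` are hypothetical, simplified stand-ins written for this example; only the shape of the one-argument-block constructor follows the snippet quoted in the comment, and the rest is assumed.

```java
// Hypothetical, simplified stand-ins for illustration; not the real BlockManager types.
class ToyBlock {
    final long id;
    final long genStamp;

    ToyBlock(long id, long genStamp) {
        this.id = id;
        this.genStamp = genStamp;
    }
}

class ToyBlockToMarkCorrupt {
    final ToyBlock corrupted;
    final ToyBlock stored;
    final String reason;

    ToyBlockToMarkCorrupt(ToyBlock corrupted, ToyBlock stored, String reason) {
        this.corrupted = corrupted;
        this.stored = stored;
        this.reason = reason;
    }

    // Same shape as the constructor quoted in the comment: both fields
    // end up referring to the very same object.
    ToyBlockToMarkCorrupt(ToyBlock stored, String reason) {
        this(stored, stored, reason);
    }

    // Reference comparison is enough to detect the "no separate corrupted block" case.
    boolean corruptedIsStored() {
        return corrupted == stored;
    }
}

class ReferenceEqualityDemo {
    public static void main(String[] args) {
        ToyBlock stored = new ToyBlock(1L, 1003L);
        ToyBlock reported = new ToyBlock(1L, 1002L);

        System.out.println(new ToyBlockToMarkCorrupt(stored, "same").corruptedIsStored());
        System.out.println(new ToyBlockToMarkCorrupt(reported, stored, "differs").corruptedIsStored());
    }
}
```

When the delegating constructor is used, the two fields alias one object, so identity comparison distinguishes that case from one where a separate reported block was supplied.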
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13404788#comment-13404788 ] Uma Maheswara Rao G commented on HDFS-3157: --- Hi Nicholas, Latest Patch looks great. I have one comment: {code} (corrupted == stored? {code} This should be .equals? as we creating new reference of BlockInfo explicitly in some of the ctors right? And other question is: if (countNodes(b.stored).liveReplicas() >= bc.getReplication()) { This point may not be related to this patch, but considering one case I wanted to point it. Due to several pipeline failure in cluster, only 2 live replicas present in the cluster and all other nodes has the partial block(corrupt) present in RBW. Now NN can not invalidat that blocks as it did not meet the enough replication and may try to replicate them to other nodes first. But unfortunately other nodes already have the block with older genstamp. volumes map may have that blocks already and I remember it will reject the replication. So, we have only 2 live replicas even though we have more DNs. But this situation should be very rare and almost no possibility in bigger clusters. Worth considering the case for small clusters. Brahma reported this in one small cluster of 5 nodes. Anyway I will ask him to file separate one, we can discuss there. 
Also, thanks a lot Ashish for your efforts on this issue :-)

Thanks,
Uma

> Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
> ------------------------------------------------------------
>
>                 Key: HDFS-3157
>                 URL: https://issues.apache.org/jira/browse/HDFS-3157
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: J.Andreina
>            Assignee: Ashish Singhi
>         Attachments: HDFS-3157-1.patch, HDFS-3157-1.patch, HDFS-3157-2.patch, HDFS-3157-3.patch, HDFS-3157-3.patch, HDFS-3157-4.patch, HDFS-3157-5.patch, HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch, h3157_20120618.patch
>
>
> Cluster setup:
> 1 NN, three DNs (DN1, DN2, DN3), replication factor 2, "dfs.blockreport.intervalMsec" 300, "dfs.datanode.directoryscan.interval" 1
> step 1: write one file "a.txt" with sync (not closed)
> step 2: delete the blocks on one of the datanodes, say DN1 (from rbw), to which replication happened.
> step 3: close the file.
> Since the replication factor is 2, the blocks are replicated to the other datanode.
> Then, on the NN side, the following command is issued to the DN from which the block was deleted:
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas.
> {noformat}
> On the datanode from which the block was deleted, the following exception occurred:
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>       at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>       at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>       at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>       at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>       at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>       at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>       at java.lang.Thread.run(Thread.java:619)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
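
Uma's question about "corrupted == stored" versus .equals comes down to reference identity versus value equality. The following is only a minimal sketch of that difference. SketchBlock and EqualityDemo are hypothetical stand-ins, not the real Hadoop BlockInfo, and the choice to compare both the block ID and the generation stamp in equals is an assumption made for illustration.

```java
import java.util.Objects;

/** Hypothetical stand-in for a block record; NOT the real Hadoop BlockInfo. */
final class SketchBlock {
    final long blockId;
    final long genStamp;

    SketchBlock(long blockId, long genStamp) {
        this.blockId = blockId;
        this.genStamp = genStamp;
    }

    /** Copy constructor: yields a new reference carrying the same field values. */
    SketchBlock(SketchBlock other) {
        this(other.blockId, other.genStamp);
    }

    /** Assumption for this sketch: equality covers both ID and generation stamp. */
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SketchBlock)) {
            return false;
        }
        SketchBlock b = (SketchBlock) o;
        return blockId == b.blockId && genStamp == b.genStamp;
    }

    @Override
    public int hashCode() {
        return Objects.hash(blockId, genStamp);
    }
}

final class EqualityDemo {
    /** True only when both variables point at the very same object. */
    static boolean sameReference(SketchBlock a, SketchBlock b) {
        return a == b;
    }

    /** True when the two objects carry equal field values. */
    static boolean sameValue(SketchBlock a, SketchBlock b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        SketchBlock stored = new SketchBlock(2903555284838653156L, 1003L);
        SketchBlock copy = new SketchBlock(stored);
        System.out.println("same reference: " + sameReference(stored, copy));
        System.out.println("same value:     " + sameValue(stored, copy));
    }
}
```

If a code path builds a fresh object through a copy constructor, a == check against the original no longer holds even though the contents match, which is the situation the comment asks about. Whether the patch actually relies on identity on purpose is not settled by this sketch.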
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403263#comment-13403263 ]

Hadoop QA commented on HDFS-3157:
---------------------------------

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12533840/HDFS-3157-5.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified test files.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse. The patch built with eclipse:eclipse.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2718//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2718//console

This message is automatically generated.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396764#comment-13396764 ]

Ashish Singhi commented on HDFS-3157:
-------------------------------------

Thanks a lot, Nicholas, for your time and for reviewing the patch.

{quote}
1. The new BlockInfo(storedBlock) constructor won't copy triplets. So the blockInfo in BlockToMarkCorrupt has the GS in DN but don't have locations.
{quote}
Yes, the blockInfo in BlockToMarkCorrupt has only the GS from the DN and does not have the locations.

{quote}
2. In markBlockAsCorrupt(..), since the location could be empty and the GS could be different from the one in the blocksMap, we lookup the block again.
{quote}
When we look up a block in the blocksMap, we check only the block ID (which is the same for both the reported block and the storedBlock here), not the GS. So we will still get the locations of the storedBlock.

{quote}
could you combine it with your test if you think they are good?
{quote}
I went through the patch and it looks good. I will upload a patch tomorrow with the test case.
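
The lookup behaviour Ashish describes (match on block ID only, ignore the generation stamp) can be sketched as follows. This is a simplified illustration under stated assumptions, not the real blocksMap: BlocksMapSketch, StoredEntry, and the String datanode names are hypothetical, and the map is a plain HashMap keyed by block ID.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for a block map keyed by block ID only. */
final class BlocksMapSketch {

    /** Stored entry: the NN-side generation stamp plus the replica locations. */
    static final class StoredEntry {
        final long blockId;
        final long genStamp;
        final List<String> locations = new ArrayList<>();

        StoredEntry(long blockId, long genStamp) {
            this.blockId = blockId;
            this.genStamp = genStamp;
        }
    }

    private final Map<Long, StoredEntry> byId = new HashMap<>();

    void add(StoredEntry entry) {
        byId.put(entry.blockId, entry);
    }

    /**
     * Looks up by block ID alone. The reported generation stamp is accepted
     * only to make the point that it plays no part in the match.
     */
    StoredEntry lookup(long reportedBlockId, long reportedGenStamp) {
        return byId.get(reportedBlockId);
    }

    public static void main(String[] args) {
        BlocksMapSketch map = new BlocksMapSketch();
        StoredEntry stored = new StoredEntry(2903555284838653156L, 1003L);
        stored.locations.add("DN2");
        stored.locations.add("DN3");
        map.add(stored);

        // A replica reported with the older genstamp 1002 still finds the stored entry.
        StoredEntry found = map.lookup(2903555284838653156L, 1002L);
        System.out.println("stored genstamp: " + found.genStamp);
        System.out.println("locations:       " + found.locations);
    }
}
```

Because the generation stamp is not part of the key, a block object built from the DN's report can be used to reach the stored entry and its locations, which is the point of the reply above.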
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396273#comment-13396273 ]

Tsz Wo (Nicholas), SZE commented on HDFS-3157:
----------------------------------------------

See if I understand the patch correctly:
# The new BlockInfo(storedBlock) constructor won't copy triplets. So the blockInfo in BlockToMarkCorrupt has the GS from the DN but doesn't have locations.
# In markBlockAsCorrupt(..), since the locations could be empty and the GS could differ from the one in the blocksMap, we look up the block again.

If my understanding is correct, I have the following suggestions:
- Add storedBlock to BlockToMarkCorrupt so that no additional lookup is required.
- We have to be very careful about when to use the block with the stored GS and when to use the block with the reported GS. In markBlockAsCorrupt(..), calls to addToCorruptReplicasMap and addToInvalidates should pass the block with the DN's GS. Other calls (addBlock, countNodes, updateNeededReplications) should pass the block with the stored GS. Similar changes have to be made in invalidateBlock(..). It is lengthy to describe all the changes, so I put them in h3157_20120618.patch. Ashish, could you combine it with your test if you think they are good?
- I think there are similar bugs in processMisReplicatedBlock(..) and the related code, since they do not handle the case where the generation stamps differ. This is new code introduced for HA. Let's fix it separately.
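
The second suggestion above (carry both block identities and pick the right one per call) can be sketched roughly as below. This is not the code in h3157_20120618.patch. CorruptMarkSketch, Blk, ToMarkCorrupt, and the two selector methods are hypothetical names, and the split between "invalidation" and "replication accounting" simply restates the rule given in the comment.

```java
/** Hypothetical sketch of pairing the reported and stored views of one block. */
final class CorruptMarkSketch {

    /** Minimal block identity: ID plus generation stamp (GS). */
    static final class Blk {
        final long id;
        final long gs;

        Blk(long id, long gs) {
            this.id = id;
            this.gs = gs;
        }

        String name() {
            return "blk_" + id + "_" + gs;
        }
    }

    /** Holds both views so that no second lookup is needed later. */
    static final class ToMarkCorrupt {
        final Blk corrupted; // block as reported by the DN (DN's GS)
        final Blk stored;    // block as recorded on the NN (stored GS)

        ToMarkCorrupt(Blk corrupted, Blk stored) {
            this.corrupted = corrupted;
            this.stored = stored;
        }
    }

    /** Corrupt-replica and invalidation bookkeeping uses the DN's GS. */
    static Blk forInvalidation(ToMarkCorrupt b) {
        return b.corrupted;
    }

    /** Replica counting and needed-replication updates use the stored GS. */
    static Blk forReplicationAccounting(ToMarkCorrupt b) {
        return b.stored;
    }

    public static void main(String[] args) {
        long id = 2903555284838653156L;
        ToMarkCorrupt b = new ToMarkCorrupt(new Blk(id, 1002L), new Blk(id, 1003L));
        System.out.println("invalidate on DN: " + forInvalidation(b).name());
        System.out.println("count replicas:   " + forReplicationAccounting(b).name());
    }
}
```

The motivation, as far as the logs in the issue description suggest, is that a delete command naming GS 1003 does not match the replica the DN reported with GS 1002, so the invalidation has to name the block as the DN knows it.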
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295487#comment-13295487 ]

Ashish Singhi commented on HDFS-3157:
-------------------------------------

Added a sleep of 100 ms to each while loop in the test case, so that a single busy-waiting thread does not take most of the CPU.
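
The change described above (sleeping inside each polling loop instead of spinning) could look roughly like this. It is a generic sketch, not the loop from the actual test case. PollingSketch and waitFor are hypothetical names, and the timeout is an addition of this sketch that the comment does not mention.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper showing a polling loop that sleeps between checks. */
final class PollingSketch {

    /**
     * Re-checks the condition every sleepMillis until it holds or the timeout
     * expires. Returns true if the condition became true, false otherwise.
     * The timeout is an assumption of this sketch.
     */
    static boolean waitFor(BooleanSupplier condition, long sleepMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up waiting
            }
            try {
                // Yield the CPU between checks instead of busy-waiting.
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true roughly 300 ms after start.
        boolean ok = waitFor(() -> System.currentTimeMillis() - start >= 300, 100, 5000);
        System.out.println("condition met: " + ok);
    }
}
```

With a 100 ms sleep the condition is checked about ten times per second, which keeps the waiting thread nearly idle at the cost of up to 100 ms of extra latency per wait.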
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295265#comment-13295265 ]

Tsz Wo (Nicholas), SZE commented on HDFS-3157:
----------------------------------------------

Hi Uma, I will take a look at the patch tomorrow.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13295065#comment-13295065 ]

Uma Maheswara Rao G commented on HDFS-3157:
-------------------------------------------

{code}
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.TestFileLengthOnClusterRestart
{code}
This failure is unrelated to this patch.

Nicholas, do you have any more comments on this patch?
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293490#comment-13293490 ]

Hadoop QA commented on HDFS-3157:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12531062/HDFS-3157-3.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified test files.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    +1 javadoc. The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse. The patch built with eclipse:eclipse.

    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
                  org.apache.hadoop.hdfs.TestFileLengthOnClusterRestart

    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2641//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2641//console

This message is automatically generated.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293381#comment-13293381 ]

Uma Maheswara Rao G commented on HDFS-3157:
-------------------------------------------

Thanks a lot, Nicholas. Even though I have a Jenkins login, I am not able to see the BuildNow option :-(

BTW, do you have any comments on this patch?
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293380#comment-13293380 ]

Tsz Wo (Nicholas), SZE commented on HDFS-3157:

Hi Uma, I have just started a build.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293373#comment-13293373 ]

Uma Maheswara Rao G commented on HDFS-3157:

The test failure seems to be due to HDFS-3492, which has now been reverted. Ashish, could you please reattach the patch to get a clean QA report?
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291320#comment-13291320 ]

Hadoop QA commented on HDFS-3157:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12531062/HDFS-3157-3.patch
against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
        org.apache.hadoop.hdfs.TestShortCircuitLocalRead
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2615//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2615//console

This message is automatically generated.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290049#comment-13290049 ]

Ashish Singhi commented on HDFS-3157:

{quote}
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.TestListFilesInFileContext
org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
{quote}
Not related to the patch.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290035#comment-13290035 ]

Hadoop QA commented on HDFS-3157:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12531062/HDFS-3157-3.patch
against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
        org.apache.hadoop.hdfs.TestListFilesInFileContext
        org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2604//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2604//console

This message is automatically generated.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288560#comment-13288560 ]

Ashish Singhi commented on HDFS-3157:

Uploaded the patch addressing Uma's comment.
Since we now add the datanode that reports the corrupt block to the triplets of the storedBlock (the block in blocksMap), there is no need to copy the storedBlock's triplets to the reported corrupt block. I have therefore removed all the changes made in the BlockInfo class.
Can someone please review the patch?
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288544#comment-13288544 ]

Hadoop QA commented on HDFS-3157:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12530773/HDFS-3157-3.patch
against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified test files.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2580//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2580//console

This message is automatically generated.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288442#comment-13288442 ]

Ashish Singhi commented on HDFS-3157:

Thanks Uma.
Yes, you're right. I need to handle this case and will upload a patch addressing it.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
    [ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287446#comment-13287446 ]

Uma Maheswara Rao G commented on HDFS-3157:
-------------------------------------------

I think you have to handle one more case:
{code}
    // Add replica to the data-node if it is not already there
    node.addBlock(storedBlock);

    // Add this replica to corruptReplicas Map
    corruptReplicas.addToCorruptReplicasMap(storedBlock, node, reason);
    if (countNodes(storedBlock).liveReplicas() >= bc.getReplication()) {
      // the block is over-replicated so invalidate the replicas immediately
      invalidateBlock(storedBlock, node);
    } else if (namesystem.isPopulatingReplQueues()) {
      // add the block to neededReplication
      updateNeededReplications(storedBlock, -1, 0);
    }
{code}
Here you are adding storedBlock, which carries the reported genstamp (assume the genstamp is 1).
When invalidateBlock is called, it will try to remove the block with the newer genstamp from the node, because blocksMap#removeNode looks the block up again in blocksMap.
{code}
    if (!blocksMap.removeNode(block, node)) {
      if(NameNode.stateChangeLog.isDebugEnabled()) {
        NameNode.stateChangeLog.debug("BLOCK* removeStoredBlock: "
            + block + " has already been removed from node " + node);
      }
      return;
    }
{code}
How about adding the block that is present in blocksMap, so that the block can be removed successfully when blocksMap.removeNode is called?

> Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3157
>                 URL: https://issues.apache.org/jira/browse/HDFS-3157
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: J.Andreina
>            Assignee: Ashish Singhi
>         Attachments: HDFS-3157-1.patch, HDFS-3157-1.patch, HDFS-3157-2.patch, HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch
>
>
> Cluster setup:
> 1NN,Three DN(DN1,DN2,DN3),replication factor-2,"dfs.blockreport.intervalMsec" 300,"dfs.datanode.directoryscan.interval" 1
> step 1: write one file "a.txt" with sync(not closed)
> step 2: Delete the blocks in one of the datanode say DN1(from rbw) to which replication happened.
> step 3: close the file.
> Since the replication factor is 2 the blocks are replicated to the other datanode.
> Then at the NN side the following cmd is issued to DN from which the block is deleted
> -------------------------------------------------------------------------------------
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas.
> {noformat}
> From the datanode side in which the block is deleted the following exception occurred
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>       at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>       at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>       at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>       at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>       at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>       at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>       at java.lang.Thread.run(Thread.java:619)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
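The lookup behaviour that Uma's comment relies on can be illustrated with a small stand-alone sketch. It assumes that block identity (equals/hashCode) depends on the block ID only and ignores the generation stamp, so a lookup with the reported block (genstamp 1002) returns the stored entry (genstamp 1003). ToyBlock, GenStampLookupDemo, and the plain HashMap are hypothetical stand-ins for illustration, not the real Block or BlocksMap classes.

```java
import java.util.HashMap;
import java.util.Map;

class GenStampLookupDemo {
  // Look up the stored entry that corresponds to a reported block.
  static ToyBlock lookupStored(Map<ToyBlock, ToyBlock> blocksMap, ToyBlock reported) {
    return blocksMap.get(reported);
  }

  public static void main(String[] args) {
    Map<ToyBlock, ToyBlock> blocksMap = new HashMap<>();
    ToyBlock stored = new ToyBlock(2903555284838653156L, 1003L);
    blocksMap.put(stored, stored);

    // Same block ID, older generation stamp, as in the RBW replica from the logs.
    ToyBlock reported = new ToyBlock(2903555284838653156L, 1002L);
    ToyBlock found = lookupStored(blocksMap, reported);
    System.out.println("reported genstamp=" + reported.genStamp
        + ", genstamp of entry found in map=" + found.genStamp);
  }
}

// Hypothetical stand-in for a block. Assumption: identity is the block ID only.
final class ToyBlock {
  final long blockId;
  final long genStamp;

  ToyBlock(long blockId, long genStamp) {
    this.blockId = blockId;
    this.genStamp = genStamp;
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof ToyBlock && ((ToyBlock) o).blockId == blockId;
  }

  @Override
  public int hashCode() {
    return Long.hashCode(blockId);
  }
}
```

Under that assumption, any code that re-resolves a block through the map ends up operating on the stored genstamp, whichever genstamp the caller passed in.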
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
    [ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285372#comment-13285372 ]

Ashish Singhi commented on HDFS-3157:
-------------------------------------

A correction to my earlier comment.
{quote}
In System.arraycopy it will create a new reference
{quote}
System.arraycopy(...) will not create any new reference.
What I wanted to say was that if we use System.arraycopy(...), any changes made to this.triplets will not be reflected in from.triplets, because the two fields will point to different arrays in memory.
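The difference described in the correction above can be shown with a minimal sketch. The class and method names are made up for illustration, and the arrays merely stand in for the triplets fields. Note that System.arraycopy copies the element references (a shallow copy) into an array that must already exist; it does not allocate anything itself.

```java
class TripletsAliasingDemo {
  // Returns true when a write made through 'other' is visible through 'original',
  // that is, when both variables refer to the same array object.
  static boolean writeIsShared(Object[] original, Object[] other) {
    Object marker = new Object();
    other[0] = marker;
    return original[0] == marker;
  }

  public static void main(String[] args) {
    Object[] from = {"dn1", null, null};

    // Like 'this.triplets = from.triplets': one array, two references.
    Object[] aliased = from;

    // Like System.arraycopy(from.triplets, 0, this.triplets, 0, ...):
    // the destination is a separate, pre-allocated array.
    Object[] copied = new Object[from.length];
    System.arraycopy(from, 0, copied, 0, from.length);

    System.out.println("aliased array shares writes: " + writeIsShared(from, aliased));
    System.out.println("copied array shares writes:  " + writeIsShared(from, copied));
  }
}
```

With the aliased array a later write (for example adding a datanode) is seen by both holders; with the copied array the two holders diverge after the copy.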
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
    [ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285368#comment-13285368 ]

Ashish Singhi commented on HDFS-3157:
-------------------------------------

I forgot to mention that I have used
{code}+ this.triplets = from.triplets;{code}
instead of
{code}+ System.arraycopy(from.triplets, 0, this.triplets, 0, from.triplets.length);{code}
In System.arraycopy it will create a new reference.
The problem is in markBlockAsCorrupt(...): at node.addBlock(storedBlock) we add the datanode to the triplets of corruptBlock, but when countNodes(...) then looks up storedBlock in blocksMap, it returns an iterator over only one datanode, i.e., the one holding the live replica.
To avoid this I have used this.triplets = from.triplets, so that both point to the same array and the problem described above does not occur.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
    [ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283960#comment-13283960 ]

Hadoop QA commented on HDFS-3157:
---------------------------------

+1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12529856/HDFS-3157-2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified test files.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2523//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2523//console

This message is automatically generated.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
    [ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279964#comment-13279964 ]

Hadoop QA commented on HDFS-3157:
---------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12528379/HDFS-3157-1.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified test files.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
                  org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2492//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2492//console

This message is automatically generated.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
    [ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279343#comment-13279343 ]

Tsz Wo (Nicholas), SZE commented on HDFS-3157:
----------------------------------------------

Hi Ashish,

Thanks for the update.  I think it is better to use BlockInfo(BlockInfo from) instead of adding getTriplets() and setTriplets(..).  We may change BlockInfo(BlockInfo from) as follows:
{code}
-  protected BlockInfo(BlockInfo from) {
+  protected BlockInfo(BlockInfo from, boolean copyLocations) {
     this(from, from.bc.getReplication());
     this.bc = from.bc;
+    if (copyLocations) {
+      System.arraycopy(from.triplets, 0, this.triplets, 0, from.triplets.length);
+    }
   }
{code}
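The constructor change suggested above can be tried out in isolation with a toy class. This is only a sketch under stated assumptions: ToyBlockInfo is a hypothetical, much-simplified stand-in for BlockInfo, the extra genStamp parameter is added purely to make the example self-explanatory, and the array size of three slots per replica is an assumption suggested by the field name.

```java
class CopyLocationsDemo {
  // Hypothetical, simplified stand-in for BlockInfo.
  static final class ToyBlockInfo {
    final long genStamp;
    final Object[] triplets;

    ToyBlockInfo(long genStamp, int replication) {
      this.genStamp = genStamp;
      this.triplets = new Object[3 * replication]; // assumption: three slots per replica
    }

    // Copy constructor with an opt-in flag, in the spirit of the suggestion above.
    ToyBlockInfo(ToyBlockInfo from, long genStamp, boolean copyLocations) {
      this.genStamp = genStamp;
      this.triplets = new Object[from.triplets.length];
      if (copyLocations) {
        // Shallow copy: the new array holds the same element references.
        System.arraycopy(from.triplets, 0, this.triplets, 0, from.triplets.length);
      }
    }
  }

  public static void main(String[] args) {
    ToyBlockInfo stored = new ToyBlockInfo(1003L, 2);
    stored.triplets[0] = "DN2"; // pretend DN2 holds the live replica

    ToyBlockInfo withLocations = new ToyBlockInfo(stored, 1002L, true);
    ToyBlockInfo withoutLocations = new ToyBlockInfo(stored, 1002L, false);

    System.out.println("copyLocations=true,  first slot: " + withLocations.triplets[0]);
    System.out.println("copyLocations=false, first slot: " + withoutLocations.triplets[0]);
  }
}
```

The flag keeps the existing copy behaviour available to current callers while letting the corrupt-block path start with the locations already known for the stored block.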
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
    [ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278879#comment-13278879 ]

Ashish Singhi commented on HDFS-3157:
-------------------------------------

TestReplicationPolicy is passing locally for me with the patch.
{code}
Running org.apache.hadoop.hdfs.server.blockmanagement.TestReplicationPolicy
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.038 sec

Results :

Tests run: 10, Failures: 0, Errors: 0, Skipped: 0
{code}
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
    [ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278822#comment-13278822 ]

Hadoop QA commented on HDFS-3157:
---------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12528043/HDFS-3157-1.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified test files.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 javadoc.  The javadoc tool appears to have generated 2 warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
                  org.apache.hadoop.hdfs.server.blockmanagement.TestReplicationPolicy

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2473//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2473//console

This message is automatically generated.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278778#comment-13278778 ] Ashish Singhi commented on HDFS-3157: - Patch updated and ready for review. Please provide your review comments. {code} + /* + * Look up again to storedBlock as it might be a reported block also. + * @see BlockManager#checkReplicaCorrupt(...) + */ + BlockInfo blkInfo = blocksMap.getStoredBlock(storedBlock); {code} I have added this because in updatedNeededReplication we want namenode to ask datanode to replicate the storedBlock which is there in its blockMap not with the reported block with datanode is reporting as corrupt. Now in the test case I am asserting 3 things, First - There should be one block in the corruptReplicasMap. Second - After marking the block as corrupt, ReplicationMonitor thread should replicate a live replica to one of the datanode. Third - After replicating the live replica, the corrupt replica in corruptReplicasMap should get invalidated. > Error in deleting block is keep on coming from DN even after the block report > and directory scanning has happened > - > > Key: HDFS-3157 > URL: https://issues.apache.org/jira/browse/HDFS-3157 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.23.0, 0.24.0 >Reporter: J.Andreina >Assignee: Ashish Singhi > Fix For: 2.0.0, 3.0.0 > > Attachments: HDFS-3157-1.patch, HDFS-3157.patch, HDFS-3157.patch, > HDFS-3157.patch > > > Cluster setup: > 1NN,Three DN(DN1,DN2,DN3),replication factor-2,"dfs.blockreport.intervalMsec" > 300,"dfs.datanode.directoryscan.interval" 1 > step 1: write one file "a.txt" with sync(not closed) > step 2: Delete the blocks in one of the datanode say DN1(from rbw) to which > replication happened. > step 3: close the file. > Since the replication factor is 2 the blocks are replicated to the other > datanode. 
> Then at the NN side the following cmd is issued to DN from which the block is > deleted > - > {noformat} > 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK > NameSystem.addToCorruptReplicasMap: duplicate requested for > blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX > because reported RBW replica with genstamp 1002 does not match COMPLETE > block's genstamp in block map 1003 > 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* > Removing block blk_2903555284838653156_1003 from neededReplications as it has > enough replicas. > {noformat} > From the datanode side in which the block is deleted the following exception > occured > {noformat} > 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Unexpected error trying to delete block blk_2903555284838653156_1003. > BlockInfo not found in volumeMap. > 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: > Error processing datanode Command > java.io.IOException: Error in deleting blocks. > at > org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061) > at > org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581) > at > org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662) > at java.lang.Thread.run(Thread.java:619) > {noformat} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
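
The lookup that the comment above describes can be illustrated with a small toy model. This is only a sketch: `Block`, `BlocksMap`, and `blockToReplicate` are hypothetical stand-ins written for this example, not the HDFS classes, and the sketch assumes the block map is keyed by block ID alone, so that a reported block with a stale genstamp still resolves to the namenode's own entry.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these types are hypothetical stand-ins, not HDFS classes.
final class StoredBlockLookupSketch {

    /** A block identity plus the generation stamp one party believes it has. */
    static final class Block {
        final long blockId;
        final long genStamp;

        Block(long blockId, long genStamp) {
            this.blockId = blockId;
            this.genStamp = genStamp;
        }
    }

    /** Stand-in for the namenode's block map; assumed to be keyed by block ID only. */
    static final class BlocksMap {
        private final Map<Long, Block> byId = new HashMap<>();

        void add(Block block) {
            byId.put(block.blockId, block);
        }

        /** Returns the namenode's own copy (or null), ignoring the argument's genstamp. */
        Block getStoredBlock(Block block) {
            return byId.get(block.blockId);
        }
    }

    /** Resolves whichever block was passed in to the copy that replication should use. */
    static Block blockToReplicate(BlocksMap map, Block storedOrReported) {
        return map.getStoredBlock(storedOrReported);
    }

    public static void main(String[] args) {
        BlocksMap map = new BlocksMap();
        map.add(new Block(2903555284838653156L, 1003L)); // namenode's COMPLETE block

        // A datanode reports a stale RBW replica of the same block.
        Block reported = new Block(2903555284838653156L, 1002L);

        Block chosen = blockToReplicate(map, reported);
        System.out.println("reported genstamp: " + reported.genStamp);
        System.out.println("genstamp used for replication: " + chosen.genStamp);
    }
}
```

In this model the replication request is built from the stored genstamp (1003 in the issue's logs) even though the caller handed in the reported block with genstamp 1002.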
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13274245#comment-13274245 ]

Ashish Singhi commented on HDFS-3157:
-

I am currently working on the following solution for the patch: rebuild the BlockInfo with only the reported block's genstamp, keeping all other state the same as storedBlock.

Even with this solution, the test case may fail randomly.
Reason - Although the reported block is now added to corruptReplicasMap, it is not invalidated on the DN that is reporting the corrupt block, because a corrupt block is invalidated only after the number of live replicas of the block equals the configured replication factor.
Problem - If chooseTarget() picks the same DN that is reporting the corrupt block, the transfer fails with ReplicaAlreadyExistsException.
Now the question is: why does the NN pick the same DN that is reporting the corrupt block, and not the third DN?
Answer - The excludedNodes map contains only one DN, the one that has the live replica of the block (that is, the one that has the block in its finalized folder).

The following partial logs depict the above scenario.

{code}
excludedNodes contains the following datanode/s.
{127.0.0.1:54681=127.0.0.1:54681}
2012-05-12 23:57:33,773 INFO hdfs.StateChange (BlockManager.java:computeReplicationWorkForBlocks(1226)) - BLOCK* ask 127.0.0.1:54681 to replicate blk_3471690017167574595_1003 to datanode(s) 127.0.0.1:54041
2012-05-12 23:57:33,791 INFO datanode.DataNode (DataNode.java:transferBlock(1221)) - DatanodeRegistration(127.0.0.1, storageID=DS-1047816814-192.168.44.128-54681-1336847251649, infoPort=62840, ipcPort=26036, storageInfo=lv=-40;cid=testClusterID;nsid=1646783488;c=0) Starting thread to transfer block BP-1770179175-192.168.44.128-1336847247907:blk_3471690017167574595_1003 to 127.0.0.1:54041
2012-05-12 23:57:33,795 INFO hdfs.StateChange (BlockManager.java:processReport(1450)) - BLOCK* processReport: from DatanodeRegistration(127.0.0.1, storageID=DS-1047816814-192.168.44.128-54681-1336847251649, infoPort=62840, ipcPort=26036, storageInfo=lv=-40;cid=testClusterID;nsid=1646783488;c=0), blocks: 1, processing time: 0 msecs
2012-05-12 23:57:33,796 INFO datanode.DataNode (BPServiceActor.java:blockReport(404)) - BlockReport of 1 blocks took 0 msec to generate and 2 msecs for RPC and NN processing
2012-05-12 23:57:33,796 INFO datanode.DataNode (BPServiceActor.java:blockReport(423)) - sent block report, processed command:org.apache.hadoop.hdfs.server.protocol.FinalizeCommand@12eb0b3
2012-05-12 23:57:33,811 INFO datanode.DataNode (DataXceiver.java:writeBlock(342)) - Receiving block BP-1770179175-192.168.44.128-1336847247907:blk_3471690017167574595_1003 src: /127.0.0.1:33583 dest: /127.0.0.1:54041
2012-05-12 23:57:33,812 INFO datanode.DataNode (DataXceiver.java:writeBlock(495)) - opWriteBlock BP-1770179175-192.168.44.128-1336847247907:blk_3471690017167574595_1003 received exception org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-1770179175-192.168.44.128-1336847247907:blk_3471690017167574595_1003 already exists in state RBW and thus cannot be created.
2012-05-12 23:57:33,814 ERROR datanode.DataNode (DataXceiver.java:run(193)) - 127.0.0.1:54041:DataXceiver error processing WRITE_BLOCK operation src: /127.0.0.1:33583 dest: /127.0.0.1:54041
org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-1770179175-192.168.44.128-1336847247907:blk_3471690017167574595_1003 already exists in state RBW and thus cannot be created.
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:795)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:151)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:365)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:66)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:189)
    at java.lang.Thread.run(Thread.java:619)
2012-05-12 23:57:33,815 INFO datanode.DataNode (DataNode.java:run(1406)) - DataTransfer: Transmitted BP-1770179175-192.168.44.128-1336847247907:blk_3471690017167574595_1003 (numBytes=100) to /127.0.0.1:54041
2012-05-12 23:57:34,066 INFO hdfs.StateChange (BlockManager.java:processReport(1450)) - BLOCK* processReport: from DatanodeRegistration(127.0.0.1, storageID=DS-610636930-192.168.44.128-20029-1336847250644, infoPort=52843, ipcPort=46734, storageInfo=lv=-40;cid=testClusterID;nsid=1646783488;c=0), blocks: 0, processing time: 0 msecs
2012-05-12 23:57:34,067 INFO datanode.DataNode (BPServic
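
The target-selection problem described in the comment above can be sketched with a toy filter. This is not the HDFS placement policy: `eligibleTargets` and the datanode labels are hypothetical names for this example. The sketch only assumes what the comment states, namely that excludedNodes holds just the datanode with the live replica.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Toy model only: a hypothetical filter, not the real chooseTarget() logic.
final class TargetCandidatesSketch {

    /** Returns the datanodes that are still eligible as replication targets. */
    static List<String> eligibleTargets(List<String> datanodes, Set<String> excludedNodes) {
        List<String> eligible = new ArrayList<>();
        for (String dn : datanodes) {
            if (!excludedNodes.contains(dn)) {
                eligible.add(dn);
            }
        }
        return eligible;
    }

    public static void main(String[] args) {
        // dn-live holds the finalized replica, dn-stale-rbw reported the
        // mismatching RBW replica, dn-empty has no copy of the block.
        List<String> datanodes = Arrays.asList("dn-live", "dn-stale-rbw", "dn-empty");

        // Only the holder of the live replica is excluded, as in the logs above.
        Set<String> excludedNodes = Collections.singleton("dn-live");

        System.out.println(eligibleTargets(datanodes, excludedNodes));
    }
}
```

Because the datanode holding the stale RBW replica stays in the candidate list, a random pick between the two remaining nodes can land on it, which would explain why the test fails only intermittently.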
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273231#comment-13273231 ]

Hudson commented on HDFS-3157:
--

Integrated in Hadoop-Hdfs-trunk #1040 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1040/])
Reverting (Need to re-do the patch. new BlockInfo does not set iNode )
HDFS-3157. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened. (Revision 1336572)

Result = FAILURE
umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336572
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273091#comment-13273091 ]

Uma Maheswara Rao G commented on HDFS-3157:
---

{quote}
One potential issue with this patch: Because it creates a new BlockInfo object, that BlockInfo doesn't have any pointer to the associated inode. Hence when we call markBlockAsCorrupt, it doesn't go through the normal corrupt replica handling path – instead, it gets immediately enqueued for deletion.
{quote}
You are right. In fact, we have reverted this patch for that reason.

{quote}
This makes me a little bit nervous – if we had a bug, for example, which caused the NN's view of the gen stamp to get increased without the DNs being increased, we would issue deletions for all replicas. If instead we were going through the normal corrupt replica handling path, it would first make sure it had good replicas of the "correct" genstamp before invalidating the corrupt replicas. That would prevent the data loss, instead turning into an unavailability. Does that make sense?
{quote}
Right, that makes sense to me. We have to go through the normal corruption flow. We will update the patch soon to do that.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273075#comment-13273075 ]

Todd Lipcon commented on HDFS-3157:
---

One potential issue with this patch: Because it creates a new BlockInfo object, that BlockInfo doesn't have any pointer to the associated inode. Hence when we call markBlockAsCorrupt, it doesn't go through the normal corrupt replica handling path -- instead, it gets immediately enqueued for deletion.

This makes me a little bit nervous -- if we had a bug, for example, which caused the NN's view of the gen stamp to get increased without the DNs being increased, we would issue deletions for all replicas. If instead we were going through the normal corrupt replica handling path, it would first make sure it had good replicas of the "correct" genstamp before invalidating the corrupt replicas. That would prevent the data loss, instead turning into an unavailability. Does that make sense?
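
The difference between the two paths in the comment above can be made concrete with a small sketch. It is a simplified model under stated assumptions, not the BlockManager logic: `mayInvalidateCorrupt` and `copiesLeftOnDisk` are hypothetical helpers, and the rule "invalidate mismatching replicas only once live replicas reach the replication factor" is taken from the discussion in this thread.

```java
// Toy model only: hypothetical helpers, not the HDFS corrupt-replica handling code.
final class CorruptReplicaPathSketch {

    /** Normal path: mismatching replicas may go only once enough good ones exist. */
    static boolean mayInvalidateCorrupt(int liveReplicas, int replicationFactor) {
        return liveReplicas >= replicationFactor;
    }

    /**
     * Counts the replicas still on disk after the namenode has acted.
     * With normalPath == false, mismatching replicas are deleted immediately.
     */
    static int copiesLeftOnDisk(int live, int mismatched, int replicationFactor,
                                boolean normalPath) {
        boolean deleteMismatched =
            !normalPath || mayInvalidateCorrupt(live, replicationFactor);
        return deleteMismatched ? live : live + mismatched;
    }

    public static void main(String[] args) {
        // Hypothetical bug: the NN's genstamp was bumped but no DN followed,
        // so both replicas look mismatched and none looks live.
        int live = 0;
        int mismatched = 2;
        int replicationFactor = 2;

        System.out.println("immediate deletion leaves "
            + copiesLeftOnDisk(live, mismatched, replicationFactor, false) + " copies");
        System.out.println("normal path leaves "
            + copiesLeftOnDisk(live, mismatched, replicationFactor, true) + " copies");
    }
}
```

In this scenario the immediate path leaves no copy of the data, while the normal path keeps both replicas on disk. The block is unavailable until the mismatch is resolved, but nothing has been lost.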
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272345#comment-13272345 ]

Hudson commented on HDFS-3157:
--

Integrated in Hadoop-Mapreduce-trunk #1075 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1075/])
Reverting (Need to re-do the patch. new BlockInfo does not set iNode )
HDFS-3157. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened. (Revision 1336572)

Result = SUCCESS
umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336572
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272326#comment-13272326 ]

Hudson commented on HDFS-3157:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2237 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2237/])
Reverting (Need to re-do the patch. new BlockInfo does not set iNode )
HDFS-3157. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened. (Revision 1336572)

Result = ABORTED
umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336572
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272297#comment-13272297 ]

Hudson commented on HDFS-3157:
--

Integrated in Hadoop-Common-trunk-Commit #2220 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2220/])
Reverting (Need to re-do the patch. new BlockInfo does not set iNode )
HDFS-3157. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened. (Revision 1336572)

Result = SUCCESS
umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336572
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272294#comment-13272294 ]

Hudson commented on HDFS-3157:
------------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #2295 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2295/])
Reverting (Need to re-do the patch. new BlockInfo does not set iNode ) HDFS-3157. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened. (Revision 1336572)

Result = SUCCESS
umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336572
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java

> Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
> ------------------------------
>
> Key: HDFS-3157
> URL: https://issues.apache.org/jira/browse/HDFS-3157
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.23.0, 0.24.0
> Reporter: J.Andreina
> Assignee: Ashish Singhi
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch
>
>
> Cluster setup:
> 1NN,Three DN(DN1,DN2,DN3),replication factor-2,"dfs.blockreport.intervalMsec" 300,"dfs.datanode.directoryscan.interval" 1
> step 1: write one file "a.txt" with sync(not closed)
> step 2: Delete the blocks in one of the datanode say DN1(from rbw) to which replication happened.
> step 3: close the file.
> Since the replication factor is 2 the blocks are replicated to the other datanode.
> Then at the NN side the following cmd is issued to DN from which the block is deleted
> ------------------------------
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas.
> {noformat}
> From the datanode side in which the block is deleted the following exception occurred
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>   at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>   at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>   at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>   at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>   at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>   at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>   at java.lang.Thread.run(Thread.java:619)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272240#comment-13272240 ]

Uma Maheswara Rao G commented on HDFS-3157:
-------------------------------------------

Yes, Nicholas, thanks a lot for checking this. Because of that inode check, it actually will not mark the block as corrupt. We may have to rebuild the blockInfo with just the reported block's genstamp, keeping all other state the same as storedBlock. Let's fix this in the next patch.

I have just reverted the changes. Ashish is working on it.
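The genstamp mismatch discussed above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the BlockManager code or the eventual patch: `GenstampSketch`, `Block`, `ToyDatanode` and `withReportedGenStamp` are hypothetical names, and the rule that the toy datanode deletes a replica only when both block ID and generation stamp match is an assumption made for illustration. It shows why a delete request carrying the stored genstamp (1003) could miss a replica held under the reported genstamp (1002), and how a copy of the stored block with only the genstamp replaced would address the right replica.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the genstamp mismatch; every name here is hypothetical. */
class GenstampSketch {

    /** Minimal block identity: block ID plus generation stamp. */
    static final class Block {
        final long id;
        final long genStamp;

        Block(long id, long genStamp) {
            this.id = id;
            this.genStamp = genStamp;
        }

        @Override
        public String toString() {
            return "blk_" + id + "_" + genStamp;
        }
    }

    /** Toy datanode: maps block ID to the genstamp of the replica it holds. */
    static final class ToyDatanode {
        private final Map<Long, Long> volumeMap = new HashMap<>();

        void addReplica(Block b) {
            volumeMap.put(b.id, b.genStamp);
        }

        /**
         * ASSUMPTION for this sketch: a replica is deleted only when both
         * the ID and the genstamp of the request match what is held.
         */
        boolean invalidate(Block b) {
            Long held = volumeMap.get(b.id);
            if (held == null || held != b.genStamp) {
                return false; // request does not match any held replica
            }
            volumeMap.remove(b.id);
            return true;
        }
    }

    /** Copy of the stored block that carries the reported genstamp instead. */
    static Block withReportedGenStamp(Block stored, Block reported) {
        return new Block(stored.id, reported.genStamp);
    }

    public static void main(String[] args) {
        Block reported = new Block(42L, 1002L); // what the datanode holds
        Block stored = new Block(42L, 1003L);   // what the block map holds

        ToyDatanode dn = new ToyDatanode();
        dn.addReplica(reported);

        System.out.println("delete " + stored + ": " + dn.invalidate(stored));
        Block rebuilt = withReportedGenStamp(stored, reported);
        System.out.println("delete " + rebuilt + ": " + dn.invalidate(rebuilt));
    }
}
```

In this model the first delete is rejected and the second succeeds, which is the effect the comment is aiming for. Whether the real datanode matches on genstamp in exactly this way is not something the thread states.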
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271758#comment-13271758 ]

Tsz Wo (Nicholas), SZE commented on HDFS-3157:
----------------------------------------------

I think we only have to fix the test. The behavior of dn0 is expected. DirectoryScanner.reconcile() should be able to remove the block from the replica map later on.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271736#comment-13271736 ]

Tsz Wo (Nicholas), SZE commented on HDFS-3157:
----------------------------------------------

Here is the reason TestRBWBlockInvalidation fails: the block and meta files are deleted on dn0, but the block is still in the replica map (FsDatasetImpl.volumeMap). When replication happens, it fails because the block is still in the replica map, so it throws ReplicaAlreadyExistsException. Therefore, the number of live replicas remains 2.

> ... I am wondering, we got +1 here from QA.

I also don't understand why Jenkins +1'ed it. It seems that the test must always fail.
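The failure mode described in this comment can be sketched with a toy datanode that tracks replicas in an in-memory map separately from the files on disk. This is a hedged illustration only: `ReplicaMapSketch`, its methods and `ReplicaExistsException` are hypothetical stand-ins, not Hadoop classes, and the `reconcile()` step is only loosely modelled on the idea that a later directory scan drops map entries whose files are gone.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy datanode with a replica map and a set of on-disk files; names are hypothetical. */
class ReplicaMapSketch {

    /** Stand-in for ReplicaAlreadyExistsException in this sketch. */
    static class ReplicaExistsException extends RuntimeException {
        ReplicaExistsException(String message) {
            super(message);
        }
    }

    private final Map<Long, String> replicaMap = new HashMap<>(); // block ID -> state
    private final Set<Long> filesOnDisk = new HashSet<>();

    /** Creates an RBW replica; refuses if the map already knows the block. */
    void createRbw(long blockId) {
        if (replicaMap.containsKey(blockId)) {
            throw new ReplicaExistsException(
                "Block " + blockId + " already exists in state " + replicaMap.get(blockId));
        }
        replicaMap.put(blockId, "RBW");
        filesOnDisk.add(blockId);
    }

    /** Simulates the test deleting block and meta files without telling the datanode. */
    void deleteFilesDirectly(long blockId) {
        filesOnDisk.remove(blockId);
    }

    /** Drops map entries whose files are missing, as a later scan would. */
    void reconcile() {
        replicaMap.keySet().removeIf(id -> !filesOnDisk.contains(id));
    }

    /** Returns true if a replication write of the block is accepted. */
    static boolean tryReplicate(ReplicaMapSketch dn, long blockId) {
        try {
            dn.createRbw(blockId);
            return true;
        } catch (ReplicaExistsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ReplicaMapSketch dn0 = new ReplicaMapSketch();
        dn0.createRbw(1L);
        dn0.deleteFilesDirectly(1L);

        // Stale map entry: the replication write is rejected.
        System.out.println("before reconcile: " + tryReplicate(dn0, 1L));

        dn0.reconcile();
        // Entry removed: the same write is now accepted.
        System.out.println("after reconcile: " + tryReplicate(dn0, 1L));
    }
}
```

The point of the model is the ordering: as long as the stale entry is in the map, re-replication to that node cannot succeed, so a test that checks replica counts before the entry is cleaned up would see the old count.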
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271725#comment-13271725 ]

Tsz Wo (Nicholas), SZE commented on HDFS-3157:
----------------------------------------------

When replication happens, somehow the replica already exists.
{noformat}
//TestRBWBlockInvalidation output
2012-05-09 12:30:04,122 INFO datanode.DataNode (DataXceiver.java:writeBlock(495)) - opWriteBlock BP-2087796974-10.10.11.90-1336591801017:blk_-571802999240948417_1003 received exception org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-2087796974-10.10.11.90-1336591801017:blk_-571802999240948417_1003 already exists in state RBW and thus cannot be created.
{noformat}
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271639#comment-13271639 ]

Uma Maheswara Rao G commented on HDFS-3157:
-------------------------------------------

Hi John,

Thanks for digging into HDFS-3391. I am wondering, we got +1 here from QA. Anyway, I have discussed this with Ashish, who will take a look. Let him find the actual cause.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271514#comment-13271514 ]

John George commented on HDFS-3157:
-----------------------------------

I believe this JIRA broke TestPipelinesFailover as stated in HDFS-3391. Could you take a look?
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271425#comment-13271425 ]

Hudson commented on HDFS-3157:
------------------------------

Integrated in Hadoop-Mapreduce-trunk #1074 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1074/])
HDFS-3157. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened. Contributed by Ashish Singhi. (Revision 1335719)

Result = SUCCESS
umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335719
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271383#comment-13271383 ]

Hudson commented on HDFS-3157:
------------------------------

Integrated in Hadoop-Hdfs-trunk #1039 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1039/])
HDFS-3157. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened. Contributed by Ashish Singhi. (Revision 1335719)

Result = FAILURE
umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335719
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270782#comment-13270782 ]

Hudson commented on HDFS-3157:
------------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #2226 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2226/])
HDFS-3157. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened. Contributed by Ashish Singhi. (Revision 1335719)

Result = SUCCESS
umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335719
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java

> Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
> ------------------------------
>
>                 Key: HDFS-3157
>                 URL: https://issues.apache.org/jira/browse/HDFS-3157
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: J.Andreina
>            Assignee: Ashish Singhi
>             Fix For: 0.24.0
>
>         Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch
>
>
> Cluster setup:
> 1 NN, three DNs (DN1, DN2, DN3), replication factor 2, "dfs.blockreport.intervalMsec" 300, "dfs.datanode.directoryscan.interval" 1
> step 1: write one file "a.txt" with sync (not closed)
> step 2: Delete the blocks in one of the datanodes, say DN1 (from rbw), to which replication happened.
> step 3: close the file.
> Since the replication factor is 2, the blocks are replicated to the other datanode.
> Then, on the NN side, the following command is issued to the DN from which the block was deleted:
> ------------------------------
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas.
> {noformat}
> On the datanode from which the block was deleted, the following exception occurred:
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>     at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>     at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>     at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>     at java.lang.Thread.run(Thread.java:619)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
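The datanode-side failure quoted in the report can be illustrated with a small model. This is only a sketch under stated assumptions: `ToyVolumeMap` and its methods are hypothetical names, not the real `FSDataset`, and it assumes that the datanode looks up a replica by block ID and generation stamp together, which is what the "BlockInfo not found in volumeMap" warning and the genstamp mismatch in the namenode log suggest.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a datanode's replica map; NOT the real FSDataset. */
final class ToyVolumeMap {
    // Block ID -> generation stamp of the replica this datanode holds.
    private final Map<Long, Long> genStampByBlockId = new HashMap<>();

    void addReplica(long blockId, long genStamp) {
        genStampByBlockId.put(blockId, genStamp);
    }

    boolean hasReplica(long blockId) {
        return genStampByBlockId.containsKey(blockId);
    }

    /** Deletes the replica only when both ID and generation stamp match (assumed rule). */
    void invalidate(long blockId, long genStamp) throws IOException {
        Long held = genStampByBlockId.get(blockId);
        if (held == null || held != genStamp) {
            // Same message as in the quoted stack trace.
            throw new IOException("Error in deleting blocks.");
        }
        genStampByBlockId.remove(blockId);
    }

    /** Convenience wrapper: true if the delete succeeded, false if it threw. */
    boolean tryInvalidate(long blockId, long genStamp) {
        try {
            invalidate(blockId, genStamp);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        long id = 2903555284838653156L;  // block ID from the quoted logs
        ToyVolumeMap dn1 = new ToyVolumeMap();
        dn1.addReplica(id, 1002L);       // DN1 still knows the RBW replica at genstamp 1002

        // Command built from the namenode's stored block (genstamp 1003): lookup fails.
        System.out.println(dn1.tryInvalidate(id, 1003L)); // prints false
        // Command built from the reported replica (genstamp 1002): delete succeeds.
        System.out.println(dn1.tryInvalidate(id, 1002L)); // prints true
    }
}
```

In this model the stale replica stays on the datanode for as long as the delete command carries genstamp 1003, which would be consistent with the error recurring after every block report.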
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270781#comment-13270781 ]

Hadoop QA commented on HDFS-3157:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12526038/HDFS-3157.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    +1 tests included. The patch appears to include 3 new or modified test files.

    -1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2391//console

This message is automatically generated.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270772#comment-13270772 ]

Hudson commented on HDFS-3157:
------------------------------

Integrated in Hadoop-Common-trunk-Commit #2209 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2209/])
HDFS-3157. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened. Contributed by Ashish Singhi. (Revision 1335719)

Result = SUCCESS
umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335719
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270774#comment-13270774 ]

Uma Maheswara Rao G commented on HDFS-3157:
-------------------------------------------

I have committed this to trunk and branch-2.

Thanks a lot Ashish for the contribution!
Thanks Nicholas, for the review!
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270769#comment-13270769 ]

Hudson commented on HDFS-3157:
------------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #2284 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2284/])
HDFS-3157. Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened. Contributed by Ashish Singhi. (Revision 1335719)

Result = SUCCESS
umamahesh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1335719
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestRBWBlockInvalidation.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/DataNodeTestUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetTestUtil.java
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269494#comment-13269494 ]

Ashish Singhi commented on HDFS-3157:
-------------------------------------

Thanks a lot Uma and Nicholas for reviewing the patch.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269475#comment-13269475 ]

Uma Maheswara Rao G commented on HDFS-3157:
-------------------------------------------

Thanks, Nicholas, for the clarification. I will commit the patch later today.

bq. -1 javadoc. The javadoc tool appears to have generated 16 warning messages.

The javadoc warnings are unrelated to this patch.

Thanks a lot, Ashish, for the patch.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268849#comment-13268849 ]

Tsz Wo (Nicholas), SZE commented on HDFS-3157:
----------------------------------------------

I think it is a bug and the fix is correct. Good work!

+1 on the patch.
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268105#comment-13268105 ]

Uma Maheswara Rao G commented on HDFS-3157:
-------------------------------------------

Hi Ashish,

The patch makes sense to me. Let me get some clarification on the old behaviour.

@Nicholas, do you have any idea why we use storedBlock to mark the replica as corrupt when the genstamps do not match? The DN may not be able to find that stored block if its genstamp differs from that of the block in the DN's volumeMap. Is there any specific reason for it?
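The question above concerns which generation stamp the namenode puts in its delete command. The following toy simulation is a sketch of that reasoning only, not the `BlockManager` code or the committed patch; the class name, method name, and the rule that a delete succeeds only on an exact genstamp match are assumptions made for illustration.

```java
/** Toy simulation of repeated delete commands; NOT BlockManager code. */
final class InvalidateLoopSketch {
    /**
     * Counts failed deletes over a number of block reports.
     *
     * @param storedGs            genstamp of the block in the namenode's block map
     * @param reportedGs          genstamp of the replica the datanode reports
     * @param useReportedGenStamp true if the delete command carries the reported genstamp
     * @param blockReports        number of block reports to simulate
     */
    static int failedDeletes(long storedGs, long reportedGs,
                             boolean useReportedGenStamp, int blockReports) {
        boolean replicaOnDatanode = true;
        int failures = 0;
        for (int i = 0; i < blockReports && replicaOnDatanode; i++) {
            // The namenode sees the mismatch and queues a delete command.
            long commandGs = useReportedGenStamp ? reportedGs : storedGs;
            if (commandGs == reportedGs) {
                replicaOnDatanode = false;  // datanode finds the replica and deletes it
            } else {
                failures++;                 // datanode cannot find it; replica stays
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Genstamps taken from the quoted namenode log: stored 1003, reported 1002.
        System.out.println(failedDeletes(1003L, 1002L, false, 5)); // prints 5
        System.out.println(failedDeletes(1003L, 1002L, true, 5));  // prints 0
    }
}
```

With the stored genstamp, every simulated block report ends in another failed delete; with the reported genstamp, the replica is removed on the first command and no failure is counted.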
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265771#comment-13265771 ] Hadoop QA commented on HDFS-3157: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12525165/HDFS-3157.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified test files. -1 javadoc. The javadoc tool appears to have generated 16 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2352//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2352//console This message is automatically generated. > Error in deleting block is keep on coming from DN even after the block report > and directory scanning has happened > - > > Key: HDFS-3157 > URL: https://issues.apache.org/jira/browse/HDFS-3157 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.23.0, 0.24.0 >Reporter: J.Andreina >Assignee: Ashish Singhi > Fix For: 0.24.0 > > Attachments: HDFS-3157.patch, HDFS-3157.patch > > > Cluster setup: > 1NN,Three DN(DN1,DN2,DN3),replication factor-2,"dfs.blockreport.intervalMsec" > 300,"dfs.datanode.directoryscan.interval" 1 > step 1: write one file "a.txt" with sync(not closed) > step 2: Delete the blocks in one of the datanode say DN1(from rbw) to which > replication happened. > step 3: close the file. 
> Since the replication factor is 2, the blocks are replicated to the other datanode.
> Then at the NN side the following cmd is issued to the DN from which the block was deleted:
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas.
> {noformat}
> On the datanode from which the block was deleted, the following exception occurred:
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>     at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>     at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>     at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>     at java.lang.Thread.run(Thread.java:619)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256478#comment-13256478 ]

Hadoop QA commented on HDFS-3157:
---------------------------------

+1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12523172/HDFS-3157.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified test files.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    +1 eclipse:eclipse. The patch built with eclipse:eclipse.
    +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    +1 core tests. The patch passed unit tests in .
    +1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2295//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2295//console

This message is automatically generated.

> Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3157
>                 URL: https://issues.apache.org/jira/browse/HDFS-3157
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: J.Andreina
>             Fix For: 0.24.0
>
>         Attachments: HDFS-3157.patch
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241145#comment-13241145 ]

Ashish Singhi commented on HDFS-3157:
-------------------------------------

After the block is deleted, the pipeline updates the generation stamp of the block, say from blk_blockId_1002 to blk_blockId_1003. The replica that DN1 reports with the old gen stamp is then marked as corrupt. In BlockManager#processReportedBlock(), storedBlock is assigned blk_blockId_1003, because blockMap has already been updated with the new gen stamp for this blockId, and the NameNode then asks DN1 to delete blk_blockId_1003. Since DN1's volumeMap does not contain blk_blockId_1003, DN1 throws an exception.

> Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3157
>                 URL: https://issues.apache.org/jira/browse/HDFS-3157
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: J.Andreina
>             Fix For: 0.24.0
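The sequence described in Ashish Singhi's comment above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `GenStampMismatchSketch`, `Block`, `ToyDataNode`, `ToyNameNode` and their methods are hypothetical stand-ins, not the real HDFS classes, and `toInvalidateUsingReported` is one possible remedy assumed for illustration, not necessarily what any attached patch does.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the generation-stamp mismatch; none of these types are real HDFS classes. */
final class GenStampMismatchSketch {

    /** Block identity: ID plus generation stamp, named like blk_<id>_<genStamp>. */
    static final class Block {
        final long id;
        final long genStamp;

        Block(long id, long genStamp) {
            this.id = id;
            this.genStamp = genStamp;
        }

        String name() {
            return "blk_" + id + "_" + genStamp;
        }
    }

    /** Hypothetical DataNode stand-in: volumeMap holds the names of local replicas. */
    static final class ToyDataNode {
        private final Set<String> volumeMap = new HashSet<>();

        void addReplica(Block b) {
            volumeMap.add(b.name());
        }

        boolean hasReplica(Block b) {
            return volumeMap.contains(b.name());
        }

        /** Deleting a replica that is not in volumeMap fails, as in the reported log. */
        void invalidate(Block b) throws IOException {
            if (!volumeMap.remove(b.name())) {
                throw new IOException("Error in deleting blocks: " + b.name()
                        + " not found in volumeMap");
            }
        }
    }

    /** Hypothetical NameNode stand-in: blockMap is keyed by block ID only. */
    static final class ToyNameNode {
        private final Map<Long, Block> blockMap = new HashMap<>();

        void store(Block b) {
            blockMap.put(b.id, b);
        }

        /** Behaviour described in the comment: the delete command names the stored block. */
        Block toInvalidateUsingStored(Block reported) {
            return blockMap.get(reported.id);
        }

        /** Assumed remedy: on a gen stamp mismatch, name the replica the DN reported. */
        Block toInvalidateUsingReported(Block reported) {
            Block stored = blockMap.get(reported.id);
            boolean mismatch = stored != null && stored.genStamp != reported.genStamp;
            return mismatch ? reported : stored;
        }
    }

    public static void main(String[] args) throws IOException {
        ToyNameNode nn = new ToyNameNode();
        ToyDataNode dn1 = new ToyDataNode();
        Block reported = new Block(1L, 1002L);
        dn1.addReplica(reported);        // DN1 knows the replica under gen stamp 1002
        nn.store(new Block(1L, 1003L));  // blockMap already carries gen stamp 1003

        try {
            dn1.invalidate(nn.toInvalidateUsingStored(reported));
        } catch (IOException e) {
            System.out.println("stored-block path failed: " + e.getMessage());
        }

        dn1.invalidate(nn.toInvalidateUsingReported(reported));
        System.out.println("reported-block path removed " + reported.name());
    }
}
```

In this model the delete request that names the stored block (gen stamp 1003) fails on DN1 every time it is sent, because DN1 only ever held the replica under gen stamp 1002; naming the reported replica lets the delete succeed.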
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240476#comment-13240476 ]

Uma Maheswara Rao G commented on HDFS-3157:
-------------------------------------------

Hi Andreina,

It would be good to keep the description field short and add further details as comments. This avoids generating big emails for every update on this issue.

Thanks,
Uma

> Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3157
>                 URL: https://issues.apache.org/jira/browse/HDFS-3157
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: J.Andreina
>             Fix For: 0.24.0