[jira] [Commented] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273087#comment-13273087 ] Uma Maheswara Rao G commented on HDFS-3391: --- Thanks, Todd. We can discuss this in HDFS-3157. > TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing > --- > > Key: HDFS-3391 > URL: https://issues.apache.org/jira/browse/HDFS-3391 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Arun C Murthy >Assignee: Todd Lipcon >Priority: Critical > Attachments: hdfs-3391.txt > > > Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< > FAILURE! > -- > Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover > Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec > <<< FAILURE! > -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3026) HA: Handle failure during HA state transition
[ https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273085#comment-13273085 ] Hudson commented on HDFS-3026: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2246 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2246/]) HDFS-3026. HA: Handle failure during HA state transition. Contributed by Aaron T. Myers. (Revision 1337030) Result = ABORTED atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337030 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStateTransitionFailure.java > HA: Handle failure during HA state transition > - > > Key: HDFS-3026 > URL: https://issues.apache.org/jira/browse/HDFS-3026 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, name-node >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: 2.0.0 > > Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, > HDFS-3026.patch, HDFS-3026.patch > > > This JIRA is to address a TODO in NameNode about handling the possibility of > an incomplete HA state transition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273082#comment-13273082 ] Todd Lipcon commented on HDFS-3391: --- Hi Uma. I commented on HDFS-3157 as well, so let's continue that discussion there. On this JIRA let's discuss the improvement to InvalidateBlocks -- I think this bug fix is a good improvement regardless of whether 3157 is in. > TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing > --- > > Key: HDFS-3391 > URL: https://issues.apache.org/jira/browse/HDFS-3391 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Arun C Murthy >Assignee: Todd Lipcon >Priority: Critical > Attachments: hdfs-3391.txt > > > Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< > FAILURE! > -- > Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover > Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec > <<< FAILURE! > -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273080#comment-13273080 ]

Uma Maheswara Rao G commented on HDFS-3391:
---

In one respect, HDFS-3157 handled this incorrectly: it creates a new BlockInfo, but the new BlockInfo constructor sets the inode to null. When we then mark the block as corrupt, the block is simply invalidated, with a message saying that it does not belong to any file.

Setting the inode from storedBlock on the newly created BlockInfo does not help either. Strangely, I have seen that the triplets do not contain that block info. With that change the block can be added to the corrupt replicas map, but the nodeIterator for the BlockMap has no information about this block.

{noformat}
2012-05-10 21:30:04,378 WARN blockmanagement.BlockManager (BlockManager.java:createLocatedBlock(666)) - Inconsistent number of corrupt replicas for blk_-6411755644530997250_1003 blockMap has 0 but corrupt replicas map has 1
2012-05-10 21:30:04,381 WARN blockmanagement.BlockManager (BlockManager.java:createLocatedBlock(666)) - Inconsistent number of corrupt replicas for blk_-6411755644530997250_1003 blockMap has 0 but corrupt replicas map has 1
{noformat}

Let me dig into it and check whether there is any other bug along these lines that we did not notice.

> TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
> ---
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Arun C Murthy
> Assignee: Todd Lipcon
> Priority: Critical
> Attachments: hdfs-3391.txt
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec <<< FAILURE!
[jira] [Commented] (HDFS-3157) Error in deleting block is keep on coming from DN even after the block report and directory scanning has happened
[ https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273075#comment-13273075 ] Todd Lipcon commented on HDFS-3157: --- One potential issue with this patch: Because it creates a new BlockInfo object, that BlockInfo doesn't have any pointer to the associated inode. Hence when we call markBlockAsCorrupt, it doesn't go through the normal corrupt replica handling path -- instead, it gets immediately enqueued for deletion. This makes me a little bit nervous -- if we had a bug, for example, which caused the NN's view of the gen stamp to get increased without the DNs being increased, we would issue deletions for all replicas. If instead we were going through the normal corrupt replica handling path, it would first make sure it had good replicas of the "correct" genstamp before invalidating the corrupt replicas. That would prevent the data loss, instead turning into an unavailability. Does that make sense? > Error in deleting block is keep on coming from DN even after the block report > and directory scanning has happened > - > > Key: HDFS-3157 > URL: https://issues.apache.org/jira/browse/HDFS-3157 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.23.0, 0.24.0 >Reporter: J.Andreina >Assignee: Ashish Singhi > Fix For: 2.0.0, 3.0.0 > > Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch > > > Cluster setup: > 1NN,Three DN(DN1,DN2,DN3),replication factor-2,"dfs.blockreport.intervalMsec" > 300,"dfs.datanode.directoryscan.interval" 1 > step 1: write one file "a.txt" with sync(not closed) > step 2: Delete the blocks in one of the datanode say DN1(from rbw) to which > replication happened. > step 3: close the file. > Since the replication factor is 2 the blocks are replicated to the other > datanode. 
> Then at the NN side the following cmd is issued to DN from which the block is deleted
> -
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX because reported RBW replica with genstamp 1002 does not match COMPLETE block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_2903555284838653156_1003 from neededReplications as it has enough replicas.
> {noformat}
> From the datanode side in which the block is deleted the following exception occurred
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Unexpected error trying to delete block blk_2903555284838653156_1003. BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>     at org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>     at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>     at org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>     at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>     at java.lang.Thread.run(Thread.java:619)
> {noformat}
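The concern raised in Todd's comment on HDFS-3157 above can be illustrated with a small sketch. This is not the BlockManager code: the class, enum, and parameter names are hypothetical, and treating "enough good replicas" as "at least the expected replication" is an assumption made only for this illustration. It encodes the rule stated in the comment: a replica with no associated inode is queued for deletion immediately, while the normal path invalidates a corrupt replica only once good replicas at the correct genstamp exist.

```java
/** Hypothetical sketch of the two handling paths; not the actual BlockManager logic. */
final class CorruptReplicaSketch {

  enum Action {
    INVALIDATE_IMMEDIATELY,     // no inode: replica is queued for deletion right away
    INVALIDATE_CORRUPT_REPLICA, // normal path, good replicas already exist
    KEEP_AND_RE_REPLICATE       // normal path, not yet safe to delete
  }

  /**
   * @param hasInode            whether the BlockInfo points at a file
   * @param goodReplicas        replicas at the "correct" genstamp (assumed input)
   * @param expectedReplication replication factor of the file (assumed threshold)
   */
  static Action decide(boolean hasInode, int goodReplicas, int expectedReplication) {
    if (!hasInode) {
      return Action.INVALIDATE_IMMEDIATELY;
    }
    if (goodReplicas >= expectedReplication) {
      return Action.INVALIDATE_CORRUPT_REPLICA;
    }
    return Action.KEEP_AND_RE_REPLICATE;
  }
}
```

In the failure scenario from the comment (the NN's genstamp advances but no DN's does), `goodReplicas` would be 0, so the normal path keeps every replica and the result is unavailability rather than data loss.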
[jira] [Updated] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-3391:
--
Attachment: hdfs-3391.txt

This attached patch seems to fix the issue, even with HDFS-3157 and the above troublesome sleep() call in place. I think what was happening here was the following:

- in some cases, the block synchronization path can run twice, if the first attempt is slow. This ends up first finalizing the block at genstamp 1005, and then again at 1006 or 1007.
- for each of those genstamps, the DNs report FINALIZED replicas to both NNs.
- When the new NN becomes active, then, it replays the block reports -- first FINALIZED for blk_N_1005, and then FINALIZED for blk_N_1006.
- When it sees the blk_N_1005 genstamp, it already knows that 1006 is the "correct" latest genstamp for the block, so it wants to mark it as corrupt.

Here is where the behavior differs:

Prior to HDFS-3157, it was marking blk_N_1006 as corrupt instead of blk_N_1005. Thus the markBlockAsCorrupt() call would succeed. When processing the FINALIZED blk_N_1006, it would remove it from the corrupt list, and everything would be fine.

With HDFS-3157 in place, it instead marks blk_N_1005 as corrupt. However, the BlockInfo object it creates to do so has no attached inode (BlockCollection in new parlance). So, markBlockAsCorrupt immediately enqueued the replica for invalidation, rather than treating it like a normal corrupt replica. Then, upon seeing the report of the blk_N_1006 FINALIZED replica, the check against invalidateBlocks.contains(block) caused it to be skipped, and thus addStoredBlock() didn't get called.

The fix in this patch is to change invalidateBlocks so that its contains() call can check for genstamp match as well. So, even though blk_N_1005 has been enqueued for deletion, we should still accept a block report for blk_N_1006.
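A minimal sketch of the idea behind the fix described above, not the attached patch itself. It assumes a hypothetical per-datanode map from block ID to the genstamp queued for deletion; the real InvalidateBlocks class may be structured quite differently, and every name below is illustrative.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an invalidation queue; not the real InvalidateBlocks. */
final class InvalidateQueueSketch {

  // storage ID -> (block ID -> genstamp of the replica queued for deletion)
  private final Map<String, Map<Long, Long>> queued = new HashMap<>();

  void add(String storageId, long blockId, long genStamp) {
    queued.computeIfAbsent(storageId, k -> new HashMap<>()).put(blockId, genStamp);
  }

  /** ID-only check: any queued replica of the block matches, whatever its genstamp. */
  boolean containsIgnoringGenStamp(String storageId, long blockId) {
    Map<Long, Long> blocks = queued.get(storageId);
    return blocks != null && blocks.containsKey(blockId);
  }

  /** Genstamp-aware check: matches only the exact replica that was queued. */
  boolean contains(String storageId, long blockId, long genStamp) {
    Map<Long, Long> blocks = queued.get(storageId);
    if (blocks == null) {
      return false;
    }
    Long queuedGenStamp = blocks.get(blockId);
    return queuedGenStamp != null && queuedGenStamp.longValue() == genStamp;
  }
}
```

With blk_N_1005 queued, the ID-only check also matches a report for blk_N_1006, which is the skip described in the comment; the genstamp-aware check does not, so the newer replica can still be accepted.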
> TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing > --- > > Key: HDFS-3391 > URL: https://issues.apache.org/jira/browse/HDFS-3391 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Arun C Murthy >Assignee: Todd Lipcon >Priority: Critical > Attachments: hdfs-3391.txt > > > Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< > FAILURE! > -- > Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover > Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec > <<< FAILURE! > -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3026) HA: Handle failure during HA state transition
[ https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273066#comment-13273066 ] Hudson commented on HDFS-3026: -- Integrated in Hadoop-Hdfs-trunk-Commit #2303 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2303/]) HDFS-3026. HA: Handle failure during HA state transition. Contributed by Aaron T. Myers. (Revision 1337030) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337030 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStateTransitionFailure.java > HA: Handle failure during HA state transition > - > > Key: HDFS-3026 > URL: https://issues.apache.org/jira/browse/HDFS-3026 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, name-node >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: 2.0.0 > > Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, > HDFS-3026.patch, HDFS-3026.patch > > > This JIRA is to address a TODO in NameNode about handling the possibility of > an incomplete HA state transition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3026) HA: Handle failure during HA state transition
[ https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3026: - Fix Version/s: 2.0.0 > HA: Handle failure during HA state transition > - > > Key: HDFS-3026 > URL: https://issues.apache.org/jira/browse/HDFS-3026 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, name-node >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: 2.0.0 > > Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, > HDFS-3026.patch, HDFS-3026.patch > > > This JIRA is to address a TODO in NameNode about handling the possibility of > an incomplete HA state transition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3026) HA: Handle failure during HA state transition
[ https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273063#comment-13273063 ] Hudson commented on HDFS-3026: -- Integrated in Hadoop-Common-trunk-Commit #2229 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2229/]) HDFS-3026. HA: Handle failure during HA state transition. Contributed by Aaron T. Myers. (Revision 1337030) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337030 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStateTransitionFailure.java > HA: Handle failure during HA state transition > - > > Key: HDFS-3026 > URL: https://issues.apache.org/jira/browse/HDFS-3026 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, name-node >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: 2.0.0 > > Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, > HDFS-3026.patch, HDFS-3026.patch > > > This JIRA is to address a TODO in NameNode about handling the possibility of > an incomplete HA state transition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3026) HA: Handle failure during HA state transition
[ https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3026: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks a lot for the reviews, Eli. I've just committed this to trunk, branch-2, and branch-2.0.0-alpha. > HA: Handle failure during HA state transition > - > > Key: HDFS-3026 > URL: https://issues.apache.org/jira/browse/HDFS-3026 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, name-node >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, > HDFS-3026.patch, HDFS-3026.patch > > > This JIRA is to address a TODO in NameNode about handling the possibility of > an incomplete HA state transition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from
[ https://issues.apache.org/jira/browse/HDFS-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3404: - Status: Patch Available (was: Open) > Make putImage in GetImageServlet infer remote address to fetch from > --- > > Key: HDFS-3404 > URL: https://issues.apache.org/jira/browse/HDFS-3404 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3404.patch, HDFS-3404.patch > > > As it stands, daemons which perform checkpointing must determine their own > address on which they can be reached, so that the NN which they checkpoint > against knows what address to fetch a merged fsimage from. This causes > problems if, for example, the daemon performing checkpointing binds to > 0.0.0.0, and thus can't be sure of what address the NN can reach it at. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from
[ https://issues.apache.org/jira/browse/HDFS-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3404: - Attachment: HDFS-3404.patch Here's an updated patch which addresses Eli's review feedback. I struggled for a while with how to write an automated test for this, and ultimately concluded it's not really possible on a single host, since connecting to 0.0.0.0 will work on a single box, whereas it wouldn't in a multi-box setup. I'll test this patch manually in a multi-node setup tomorrow. > Make putImage in GetImageServlet infer remote address to fetch from > --- > > Key: HDFS-3404 > URL: https://issues.apache.org/jira/browse/HDFS-3404 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3404.patch, HDFS-3404.patch > > > As it stands, daemons which perform checkpointing must determine their own > address on which they can be reached, so that the NN which they checkpoint > against knows what address to fetch a merged fsimage from. This causes > problems if, for example, the daemon performing checkpointing binds to > 0.0.0.0, and thus can't be sure of what address the NN can reach it at. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273052#comment-13273052 ]

Todd Lipcon commented on HDFS-3391:
---

I was able to reproduce this by reapplying HDFS-3157 and adding the following in DataNode.java:

{code}
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
@@ -1983,6 +1983,13 @@ public class DataNode extends Configured
         datanodes[i] = r.id;
         storages[i] = r.storageID;
       }
+      if (newBlock.getGenerationStamp() == 1005) {
+        try {
+          Thread.sleep(1500);
+        } catch (InterruptedException ie) {
+          Thread.currentThread().interrupt();
+        }
+      }
       nn.commitBlockSynchronization(block,
           newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
           datanodes, storages);
{code}

I have to think through whether this is a bug which we've had for a while which is uncovered by HDFS-3157, or if HDFS-3157 itself was incorrect.

> TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
> ---
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Arun C Murthy
> Assignee: Todd Lipcon
> Priority: Critical
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec <<< FAILURE!
> --
[jira] [Commented] (HDFS-2391) Newly set BalancerBandwidth value is not displayed anywhere
[ https://issues.apache.org/jira/browse/HDFS-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273048#comment-13273048 ] Hadoop QA commented on HDFS-2391: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526471/HDFS-2391.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2423//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2423//console This message is automatically generated. 
> Newly set BalancerBandwidth value is not displayed anywhere > --- > > Key: HDFS-2391 > URL: https://issues.apache.org/jira/browse/HDFS-2391 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer >Affects Versions: 0.20.205.0 >Reporter: Rajit Saha >Assignee: Harsh J > Labels: newbie > Attachments: HDFS-2391.patch, HDFS-2391.patch > > > with current implementation of > $ hadoop dfsadmin -setBalancerBandwidth > only shows following message in DN log > INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand > action: DNA_BALANCERBANDWIDTHUPDATE > But it would be nice to have the value of > be displayed in DN log or any other > suitable place, so that we can have a track. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from
[ https://issues.apache.org/jira/browse/HDFS-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273044#comment-13273044 ]

Aaron T. Myers commented on HDFS-3404:
--

bq. This change needs to be made to the 2NN as well right or were you thinking just the SBN?

Nope, there's no change to be made to the 2NN. Unlike the SBN, the 2NN doesn't validate that the configured NN HTTP address is not INADDR_ANY. The 2NN will automatically start behaving in the same way the SBN does just by virtue of the fact that it's connecting to an NN which doesn't look at the machine name in the param string. The 2NN will also stop sending the machine name in the param string, by virtue of the fact that it uses GetImageServlet#getParamStringToPutImage to form the param string. I also tested this patch with an NN/2NN, and it works just fine.

bq. NetUtils#isIpAddress actually checks ip:port, seems like we'll always have an IP here. Perhaps better to use InetAddresses.isInetAddress.

Sure, makes sense. I'll update the patch to suit.

bq. How much more difficult would it be to just have it do a straight HTTP POST or PUT of the new image instead of the "I'll ask you to ask me for this image" dance?

I investigated what it would take to do this a little bit, and concluded that to do it right would take a fair bit of refactoring that's well outside the modest scope of this JIRA. I've filed a separate JIRA to make this change; I hope that's OK: HDFS-3405

> Make putImage in GetImageServlet infer remote address to fetch from
> ---
>
> Key: HDFS-3404
> URL: https://issues.apache.org/jira/browse/HDFS-3404
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.0.0
> Reporter: Aaron T. Myers
> Assignee: Aaron T. Myers
> Attachments: HDFS-3404.patch
>
> As it stands, daemons which perform checkpointing must determine their own address on which they can be reached, so that the NN which they checkpoint against knows what address to fetch a merged fsimage from. This causes problems if, for example, the daemon performing checkpointing binds to 0.0.0.0, and thus can't be sure of what address the NN can reach it at.
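The approach discussed in this thread, where the NN works out for itself which host to fetch the merged fsimage from instead of trusting a machine name sent by the checkpointing daemon, might look roughly like the following. This is an illustrative sketch only: the class, method, and parameter names and the URL path and query layout are invented, and the actual GetImageServlet code is not shown in this thread. In a servlet the remote address would presumably come from the incoming request (for example `HttpServletRequest#getRemoteAddr()`); here it is passed in as a plain string so the example has no servlet dependency.

```java
/** Hypothetical sketch: build the fetch URL from the caller's observed address. */
final class FetchAddressSketch {

  /**
   * @param remoteAddr address the request was received from (assumed to be an IP literal)
   * @param port       port the checkpointing daemon says its HTTP server listens on
   * @param txid       transaction ID of the merged image (illustrative query parameter)
   */
  static String fetchUrl(String remoteAddr, int port, long txid) {
    // IPv6 literals contain ':' and must be bracketed inside a URL authority.
    String host = remoteAddr.indexOf(':') >= 0 ? "[" + remoteAddr + "]" : remoteAddr;
    // The "/image?txid=" layout is made up for this sketch.
    return "http://" + host + ":" + port + "/image?txid=" + txid;
  }
}
```

Because the host comes from the connection rather than from the daemon's configuration, a daemon bound to 0.0.0.0 no longer needs to know the address at which the NN can reach it.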
[jira] [Commented] (HDFS-3335) check for edit log corruption at the end of the log
[ https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273041#comment-13273041 ] Hadoop QA commented on HDFS-3335: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526470/HDFS-3335.008.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2422//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2422//console This message is automatically generated. > check for edit log corruption at the end of the log > --- > > Key: HDFS-3335 > URL: https://issues.apache.org/jira/browse/HDFS-3335 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, > HDFS-3335-b1.003.patch, HDFS-3335-b1.004.patch, HDFS-3335.001.patch, > HDFS-3335.002.patch, HDFS-3335.003.patch, HDFS-3335.004.patch, > HDFS-3335.005.patch, HDFS-3335.006.patch, HDFS-3335.007.patch, > HDFS-3335.008.patch > > > Even after encountering an OP_INVALID, we should check the end of the edit > log to make sure that it contains no more edits. 
> This will catch things like rare race conditions or log corruptions that > would otherwise remain undetected. They will go from being silent data loss > scenarios to being cases that we can detect and fix. > Using recovery mode, we can choose to ignore the end of the log if necessary. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
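The check described in the HDFS-3335 summary above can be sketched as follows. This is an illustration under stated assumptions, not the patch: it supposes that everything after the last valid edit is padding made up of 0x00 or 0xFF bytes, and the class and method names are invented. Any other byte in the tail is reported as a sign that more data follows the apparent end of the log.

```java
/** Hypothetical sketch of an end-of-log check; padding values are an assumption. */
final class EditLogTailSketch {

  /**
   * Scans the bytes from {@code from} to the end of {@code log}.
   *
   * @return the offset of the first byte that is neither 0x00 nor 0xFF,
   *         or -1 if the whole tail looks like padding
   */
  static int firstUnexpectedByte(byte[] log, int from) {
    for (int i = from; i < log.length; i++) {
      byte b = log[i];
      if (b != (byte) 0x00 && b != (byte) 0xFF) {
        return i; // something other than padding follows the apparent end
      }
    }
    return -1;
  }
}
```

A caller could treat a non-negative result as an error by default and, in a recovery mode, choose to ignore it, mirroring the last sentence of the description.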
[jira] [Created] (HDFS-3405) Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged fsimages
Aaron T. Myers created HDFS-3405: Summary: Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged fsimages Key: HDFS-3405 URL: https://issues.apache.org/jira/browse/HDFS-3405 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 1.0.0 Reporter: Aaron T. Myers As Todd points out in [this comment|https://issues.apache.org/jira/browse/HDFS-3404?focusedCommentId=13272986&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13272986], the current scheme for a checkpointing daemon to upload a merged fsimage file to an NN is to issue an HTTP GET request to tell the target NN to issue another GET request back to the checkpointing daemon to retrieve the merged fsimage file. There's no fundamental reason the checkpointing daemon can't just use an HTTP POST or PUT to send back the merged fsimage file, rather than the double-GET scheme.
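The single-request upload proposed above could look roughly like the following sketch. It is not taken from any HDFS patch: the class name, the `/putimage` path, and the in-process `com.sun.net.httpserver.HttpServer` standing in for the NN are all assumptions made for illustration, and only JDK classes are used (Java 9+ for `readAllBytes`).

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReference;

class PutImageSketch {
    /** Sends the image bytes to the target in one PUT request and returns the HTTP status. */
    static int putImage(URL target, byte[] image) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        // Stream the body with a known length instead of buffering it all in the client.
        conn.setFixedLengthStreamingMode(image.length);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(image);
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }

    /** Round trip against a throwaway local server (hypothetical stand-in for the NN). */
    static boolean roundTripOk(byte[] image) {
        HttpServer server = null;
        try {
            AtomicReference<byte[]> received = new AtomicReference<>();
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/putimage", exchange -> {
                try (InputStream in = exchange.getRequestBody()) {
                    received.set(in.readAllBytes());
                }
                exchange.sendResponseHeaders(200, -1); // -1: no response body
                exchange.close();
            });
            server.start();
            URL url = URI.create("http://127.0.0.1:" + server.getAddress().getPort()
                    + "/putimage").toURL();
            return putImage(url, image) == 200 && Arrays.equals(image, received.get());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("round trip ok: " + roundTripOk(new byte[] {1, 2, 3, 4}));
    }
}
```

With this shape the receiver never has to connect back to the sender, so the sender does not need to advertise an address at all.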
[jira] [Commented] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273034#comment-13273034 ] Todd Lipcon commented on HDFS-3391: --- I looped TestPipelinesFailover for quite some time and could not get a failure. In the logs you pointed to on build #2397, I traced the issue to the following: {code} 2012-05-09 23:50:33,074 DEBUG namenode.FSNamesystem (FSEditLogLoader.java:applyEditLogOp(296)) - OP_CLOSE: /test-file numblocks : 2 clientHolder clientMachine 2012-05-09 23:50:33,074 DEBUG blockmanagement.BlockManager (BlockManager.java:processQueuedMessages(1760)) - Processing previouly queued message ReportedBlockInfo [block=blk_-3039116449792967513_1005, dn=127.0.0.1:45674, reportedState=FINALIZED] 2012-05-09 23:50:33,074 DEBUG blockmanagement.BlockManager (BlockManager.java:processReportedBlock(1660)) - Reported block blk_-3039116449792967513_1005 on 127.0.0.1:45674 size 2048 replicaState = FINALIZED 2012-05-09 23:50:33,074 DEBUG blockmanagement.BlockManager (BlockManager.java:processReportedBlock(1684)) - In memory blockUCState = COMPLETE 2012-05-09 23:50:33,075 INFO hdfs.StateChange (BlockManager.java:markBlockAsCorrupt(926)) - BLOCK markBlockAsCorrupt: block blk_-3039116449792967513_1005 could not be marked as corrupt as it does not belong to any file 2012-05-09 23:50:33,076 INFO hdfs.StateChange (InvalidateBlocks.java:add(77)) - BLOCK* InvalidateBlocks: add blk_-3039116449792967513_1005 to 127.0.0.1:45674 2012-05-09 23:50:33,076 DEBUG blockmanagement.BlockManager (BlockManager.java:processQueuedMessages(1760)) - Processing previouly queued message ReportedBlockInfo [block=blk_-3039116449792967513_1005, dn=127.0.0.1:35659, reportedState=FINALIZED] 2012-05-09 23:50:33,076 DEBUG blockmanagement.BlockManager (BlockManager.java:processReportedBlock(1660)) - Reported block blk_-3039116449792967513_1005 on 127.0.0.1:35659 size 2048 replicaState = FINALIZED 2012-05-09 23:50:33,076 DEBUG 
blockmanagement.BlockManager (BlockManager.java:processReportedBlock(1684)) - In memory blockUCState = COMPLETE 2012-05-09 23:50:33,077 INFO hdfs.StateChange (BlockManager.java:markBlockAsCorrupt(926)) - BLOCK markBlockAsCorrupt: block blk_-3039116449792967513_1005 could not be marked as corrupt as it does not belong to any file 2012-05-09 23:50:33,077 INFO hdfs.StateChange (InvalidateBlocks.java:add(77)) - BLOCK* InvalidateBlocks: add blk_-3039116449792967513_1005 to 127.0.0.1:35659 2012-05-09 23:50:33,077 DEBUG blockmanagement.BlockManager (BlockManager.java:processQueuedMessages(1760)) - Processing previouly queued message ReportedBlockInfo [block=blk_-3039116449792967513_1005, dn=127.0.0.1:59499, reportedState=FINALIZED] 2012-05-09 23:50:33,077 DEBUG blockmanagement.BlockManager (BlockManager.java:processReportedBlock(1660)) - Reported block blk_-3039116449792967513_1005 on 127.0.0.1:59499 size 2048 replicaState = FINALIZED 2012-05-09 23:50:33,078 DEBUG blockmanagement.BlockManager (BlockManager.java:processReportedBlock(1684)) - In memory blockUCState = COMPLETE 2012-05-09 23:50:33,078 INFO hdfs.StateChange (BlockManager.java:markBlockAsCorrupt(926)) - BLOCK markBlockAsCorrupt: block blk_-3039116449792967513_1005 could not be marked as corrupt as it does not belong to any file 2012-05-09 23:50:33,078 INFO hdfs.StateChange (InvalidateBlocks.java:add(77)) - BLOCK* InvalidateBlocks: add blk_-3039116449792967513_1005 to 127.0.0.1:59499 2012-05-09 23:50:33,078 DEBUG blockmanagement.BlockManager (BlockManager.java:processQueuedMessages(1760)) - Processing previouly queued message ReportedBlockInfo [block=blk_-3039116449792967513_1006, dn=127.0.0.1:45674, reportedState=FINALIZED] 2012-05-09 23:50:33,079 DEBUG blockmanagement.BlockManager (BlockManager.java:processReportedBlock(1660)) - Reported block blk_-3039116449792967513_1006 on 127.0.0.1:45674 size 2048 replicaState = FINALIZED 2012-05-09 23:50:33,079 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processReportedBlock(1684)) - In memory blockUCState = COMPLETE 2012-05-09 23:50:33,079 DEBUG blockmanagement.BlockManager (BlockManager.java:processQueuedMessages(1760)) - Processing previouly queued message ReportedBlockInfo [block=blk_-3039116449792967513_1006, dn=127.0.0.1:59499, reportedState=FINALIZED] 2012-05-09 23:50:33,079 DEBUG blockmanagement.BlockManager (BlockManager.java:processReportedBlock(1660)) - Reported block blk_-3039116449792967513_1006 on 127.0.0.1:59499 size 2048 replicaState = FINALIZED 2012-05-09 23:50:33,080 DEBUG blockmanagement.BlockManager (BlockManager.java:processReportedBlock(1684)) - In memory blockUCState = COMPLETE 2012-05-09 23:50:33,080 DEBUG blockmanagement.BlockManager (BlockManager.java:processQueuedMessages(1760)) - Processing previouly queued message ReportedBlockInfo [block=blk_-3039116449792967513_1006, dn=127.0.0.1:35659, reportedState=FINALIZED] 2012-05-
[jira] [Commented] (HDFS-3368) Missing blocks due to bad DataNodes comming up and down.
[ https://issues.apache.org/jira/browse/HDFS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273032#comment-13273032 ] Hadoop QA commented on HDFS-3368: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526469/blockDeletePolicy-trunk.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2421//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2421//console This message is automatically generated. > Missing blocks due to bad DataNodes comming up and down. > > > Key: HDFS-3368 > URL: https://issues.apache.org/jira/browse/HDFS-3368 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.22.0, 1.0.0, 2.0.0, 3.0.0 >Reporter: Konstantin Shvachko >Assignee: Konstantin Shvachko > Attachments: blockDeletePolicy-0.22.patch, > blockDeletePolicy-trunk.patch, blockDeletePolicy.patch > > > All replicas of a block can be removed if bad DataNodes come up and down > during cluster restart resulting in data loss.
[jira] [Commented] (HDFS-3400) DNs should be able start with jsvc even if security is disabled
[ https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273025#comment-13273025 ] Hudson commented on HDFS-3400: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2245 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2245/]) HDFS-3400. DNs should be able start with jsvc even if security is disabled. Contributed by Aaron T. Myers (Revision 1337017) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337017 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/SecureDataNodeStarter.java > DNs should be able start with jsvc even if security is disabled > --- > > Key: HDFS-3400 > URL: https://issues.apache.org/jira/browse/HDFS-3400 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, scripts >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: 2.0.0 > > Attachments: HDFS-3400.patch > > > Currently if one tries to start a DN with security disabled (via > hadoop.security.authentication = "simple" in the configs), but JSVC is > correctly configured, the DN will refuse to start.
[jira] [Updated] (HDFS-2391) Newly set BalancerBandwidth value is not displayed anywhere
[ https://issues.apache.org/jira/browse/HDFS-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HDFS-2391: -- Attachment: HDFS-2391.patch Done. > Newly set BalancerBandwidth value is not displayed anywhere > --- > > Key: HDFS-2391 > URL: https://issues.apache.org/jira/browse/HDFS-2391 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer >Affects Versions: 0.20.205.0 >Reporter: Rajit Saha >Assignee: Harsh J > Labels: newbie > Attachments: HDFS-2391.patch, HDFS-2391.patch > > > with current implementation of > $ hadoop dfsadmin -setBalancerBandwidth > only shows following message in DN log > INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand > action: DNA_BALANCERBANDWIDTHUPDATE > But it would be nice to have the value of > be displayed in DN log or any other > suitable place, so that we can have a track.
[jira] [Updated] (HDFS-3335) check for edit log corruption at the end of the log
[ https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3335: --- Attachment: HDFS-3335.008.patch * EditLogFileInputStream: warn when skipping the last few bytes in a file. * rename GarbageAfterTerminatorException#offset to numAfterTerminator. * Rename FSImage#setEditLog to FSImage#setEditLogForTesting and add @VisibleForTesting annotation to it * avoid making some unnecessary whitespace changes > check for edit log corruption at the end of the log > --- > > Key: HDFS-3335 > URL: https://issues.apache.org/jira/browse/HDFS-3335 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, > HDFS-3335-b1.003.patch, HDFS-3335-b1.004.patch, HDFS-3335.001.patch, > HDFS-3335.002.patch, HDFS-3335.003.patch, HDFS-3335.004.patch, > HDFS-3335.005.patch, HDFS-3335.006.patch, HDFS-3335.007.patch, > HDFS-3335.008.patch > > > Even after encountering an OP_INVALID, we should check the end of the edit > log to make sure that it contains no more edits. > This will catch things like rare race conditions or log corruptions that > would otherwise remain undetected. They will go from being silent data loss > scenarios to being cases that we can detect and fix. > Using recovery mode, we can choose to ignore the end of the log if necessary.
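The end-of-log check described in HDFS-3335 can be illustrated with a minimal sketch. This is not the attached patch: it assumes a simplified layout in which the reader already knows the offset of the terminator byte and everything after it should be padding (0x00 or 0xFF). The class name, the padding convention, and the use of `IllegalStateException` are illustrative assumptions; only the idea of counting unexpected bytes after the terminator, and of tolerating them in recovery mode, comes from the thread.

```java
class TrailingGarbageCheck {
    static final byte PAD_FF = (byte) 0xFF; // assumed padding / terminator byte
    static final byte PAD_00 = 0x00;        // assumed preallocation padding

    /** Counts bytes after the terminator offset that are not padding. */
    static int countGarbageAfterTerminator(byte[] log, int terminatorOffset) {
        int garbage = 0;
        for (int i = terminatorOffset + 1; i < log.length; i++) {
            if (log[i] != PAD_FF && log[i] != PAD_00) {
                garbage++;
            }
        }
        return garbage;
    }

    /** Fails loudly on trailing garbage unless the caller opted into recovery mode. */
    static void verifyEnd(byte[] log, int terminatorOffset, boolean recoveryMode) {
        int garbage = countGarbageAfterTerminator(log, terminatorOffset);
        if (garbage > 0 && !recoveryMode) {
            throw new IllegalStateException(
                    garbage + " unexpected bytes after the terminator at offset " + terminatorOffset);
        }
    }

    public static void main(String[] args) {
        byte[] clean = {1, 2, 3, (byte) 0xFF, (byte) 0xFF, 0, 0};
        byte[] corrupt = {1, 2, 3, (byte) 0xFF, 0, 7, 9, (byte) 0xFF};
        System.out.println(countGarbageAfterTerminator(clean, 3));
        System.out.println(countGarbageAfterTerminator(corrupt, 3));
        verifyEnd(corrupt, 3, true); // recovery mode: tolerated
    }
}
```

The point of the second method is the behavior change the issue asks for: garbage after the terminator becomes a detectable error by default rather than something silently skipped.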
[jira] [Commented] (HDFS-2391) Newly set BalancerBandwidth value is not displayed anywhere
[ https://issues.apache.org/jira/browse/HDFS-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273008#comment-13273008 ] Eli Collins commented on HDFS-2391: --- How about just one info log? {code} + LOG.info("Updating balancer bandwidth from " + dxcs.balanceThrottler.getBandwidth() + " to " + bandwidth + " bytes/s."); dxcs.balanceThrottler.setBandwidth(bandwidth); {code} > Newly set BalancerBandwidth value is not displayed anywhere > --- > > Key: HDFS-2391 > URL: https://issues.apache.org/jira/browse/HDFS-2391 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer >Affects Versions: 0.20.205.0 >Reporter: Rajit Saha >Assignee: Harsh J > Labels: newbie > Attachments: HDFS-2391.patch > > > with current implementation of > $ hadoop dfsadmin -setBalancerBandwidth > only shows following message in DN log > INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand > action: DNA_BALANCERBANDWIDTHUPDATE > But it would be nice to have the value of > be displayed in DN log or any other > suitable place, so that we can have a track.
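The single-log-line suggestion in the comment above can be tried out in isolation with a sketch like this one. The class below is a hypothetical stand-in, not the DataNode's throttler: it uses `java.util.logging` instead of Hadoop's logging, and it returns the message only so that the behavior is easy to check.

```java
import java.util.logging.Logger;

class BandwidthThrottlerSketch {
    private static final Logger LOG =
            Logger.getLogger(BandwidthThrottlerSketch.class.getName());

    private long bandwidth; // bytes per second

    BandwidthThrottlerSketch(long initialBandwidth) {
        this.bandwidth = initialBandwidth;
    }

    synchronized long getBandwidth() {
        return bandwidth;
    }

    /** Logs the old and new values in one line, then applies the new value. */
    synchronized String updateBandwidth(long newBandwidth) {
        String message = "Updating balancer bandwidth from " + bandwidth
                + " to " + newBandwidth + " bytes/s.";
        LOG.info(message);
        bandwidth = newBandwidth;
        return message;
    }

    public static void main(String[] args) {
        BandwidthThrottlerSketch throttler = new BandwidthThrottlerSketch(1048576L);
        throttler.updateBandwidth(5242880L);
    }
}
```

Logging before the assignment is what lets one line carry both the previous and the new value.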
[jira] [Commented] (HDFS-3400) DNs should be able start with jsvc even if security is disabled
[ https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273007#comment-13273007 ] Hudson commented on HDFS-3400: -- Integrated in Hadoop-Common-trunk-Commit #2228 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2228/]) HDFS-3400. DNs should be able start with jsvc even if security is disabled. Contributed by Aaron T. Myers (Revision 1337017) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337017 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/SecureDataNodeStarter.java > DNs should be able start with jsvc even if security is disabled > --- > > Key: HDFS-3400 > URL: https://issues.apache.org/jira/browse/HDFS-3400 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, scripts >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: 2.0.0 > > Attachments: HDFS-3400.patch > > > Currently if one tries to start a DN with security disabled (via > hadoop.security.authentication = "simple" in the configs), but JSVC is > correctly configured, the DN will refuse to start.
[jira] [Commented] (HDFS-3400) DNs should be able start with jsvc even if security is disabled
[ https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273006#comment-13273006 ] Hudson commented on HDFS-3400: -- Integrated in Hadoop-Hdfs-trunk-Commit #2302 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2302/]) HDFS-3400. DNs should be able start with jsvc even if security is disabled. Contributed by Aaron T. Myers (Revision 1337017) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337017 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/SecureDataNodeStarter.java > DNs should be able start with jsvc even if security is disabled > --- > > Key: HDFS-3400 > URL: https://issues.apache.org/jira/browse/HDFS-3400 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, scripts >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: 2.0.0 > > Attachments: HDFS-3400.patch > > > Currently if one tries to start a DN with security disabled (via > hadoop.security.authentication = "simple" in the configs), but JSVC is > correctly configured, the DN will refuse to start.
[jira] [Updated] (HDFS-3400) DNs should be able start with jsvc even if security is disabled
[ https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3400: -- Resolution: Fixed Fix Version/s: 2.0.0 Target Version/s: (was: 2.0.0) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed this and merged to branch-2. Thanks ATM! > DNs should be able start with jsvc even if security is disabled > --- > > Key: HDFS-3400 > URL: https://issues.apache.org/jira/browse/HDFS-3400 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, scripts >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Fix For: 2.0.0 > > Attachments: HDFS-3400.patch > > > Currently if one tries to start a DN with security disabled (via > hadoop.security.authentication = "simple" in the configs), but JSVC is > correctly configured, the DN will refuse to start.
[jira] [Commented] (HDFS-3394) Do not use generic in INodeFile.getLastBlock()
[ https://issues.apache.org/jira/browse/HDFS-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272997#comment-13272997 ] Hadoop QA commented on HDFS-3394: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526465/h3394_20120510.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2420//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2420//console This message is automatically generated. > Do not use generic in INodeFile.getLastBlock() > -- > > Key: HDFS-3394 > URL: https://issues.apache.org/jira/browse/HDFS-3394 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE >Priority: Minor > Attachments: h3394_20120510.patch > > > As shown in HDFS-3385, the ClassCastException check in > INodeFile.getLastBlock() is useless since generic type information is only > available in compile-time but not run-time.
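The erasure point made in the HDFS-3394 description can be reproduced outside HDFS with a small sketch. The class and method names below are hypothetical and unrelated to `INodeFile`; the sketch only shows that a `catch (ClassCastException)` around a cast to a type parameter never fires inside the generic method, because the cast is erased, and that the exception surfaces at the call site instead.

```java
class ErasureDemo {
    static boolean caughtInsideMethod = false;

    @SuppressWarnings("unchecked")
    static <T> T castOrNull(Object value) {
        try {
            // T is erased at run time, so this cast performs no check here.
            return (T) value;
        } catch (ClassCastException e) {
            // Not reached for a mismatched type: nothing above can throw it.
            caughtInsideMethod = true;
            return null;
        }
    }

    public static void main(String[] args) {
        Object value = Integer.valueOf(42);
        try {
            // The compiler inserts the real cast to String here, at the call site.
            String s = ErasureDemo.<String>castOrNull(value);
            System.out.println("no exception: " + s);
        } catch (ClassCastException e) {
            System.out.println("exception at call site; caught inside method = "
                    + caughtInsideMethod);
        }
    }
}
```

A check that is meant to run inside the method needs run-time type information, for example a `Class<T>` argument used with `Class.isInstance`, or a non-generic signature as the issue title proposes.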
[jira] [Updated] (HDFS-3368) Missing blocks due to bad DataNodes comming up and down.
[ https://issues.apache.org/jira/browse/HDFS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-3368: -- Attachment: (was: blockDeletePolicy.patch) > Missing blocks due to bad DataNodes comming up and down. > > > Key: HDFS-3368 > URL: https://issues.apache.org/jira/browse/HDFS-3368 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.22.0, 1.0.0, 2.0.0, 3.0.0 >Reporter: Konstantin Shvachko >Assignee: Konstantin Shvachko > Attachments: blockDeletePolicy-0.22.patch, > blockDeletePolicy-trunk.patch, blockDeletePolicy.patch > > > All replicas of a block can be removed if bad DataNodes come up and down > during cluster restart resulting in data loss.
[jira] [Updated] (HDFS-3368) Missing blocks due to bad DataNodes comming up and down.
[ https://issues.apache.org/jira/browse/HDFS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-3368: -- Target Version/s: 0.22.1, 2.0.0, 3.0.0 (was: 3.0.0, 2.0.0, 0.22.1) Status: Patch Available (was: Open) Submitting patch for trunk. > Missing blocks due to bad DataNodes comming up and down. > > > Key: HDFS-3368 > URL: https://issues.apache.org/jira/browse/HDFS-3368 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 0.22.0, 2.0.0, 3.0.0 >Reporter: Konstantin Shvachko >Assignee: Konstantin Shvachko > Attachments: blockDeletePolicy-0.22.patch, > blockDeletePolicy-trunk.patch, blockDeletePolicy.patch > > > All replicas of a block can be removed if bad DataNodes come up and down > during cluster restart resulting in data loss.
[jira] [Updated] (HDFS-3368) Missing blocks due to bad DataNodes comming up and down.
[ https://issues.apache.org/jira/browse/HDFS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-3368: -- Attachment: blockDeletePolicy-trunk.patch > Missing blocks due to bad DataNodes comming up and down. > > > Key: HDFS-3368 > URL: https://issues.apache.org/jira/browse/HDFS-3368 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.22.0, 1.0.0, 2.0.0, 3.0.0 >Reporter: Konstantin Shvachko >Assignee: Konstantin Shvachko > Attachments: blockDeletePolicy-0.22.patch, > blockDeletePolicy-trunk.patch, blockDeletePolicy.patch > > > All replicas of a block can be removed if bad DataNodes come up and down > during cluster restart resulting in data loss.
[jira] [Updated] (HDFS-3368) Missing blocks due to bad DataNodes comming up and down.
[ https://issues.apache.org/jira/browse/HDFS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-3368: -- Attachment: blockDeletePolicy-0.22.patch > Missing blocks due to bad DataNodes comming up and down. > > > Key: HDFS-3368 > URL: https://issues.apache.org/jira/browse/HDFS-3368 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.22.0, 1.0.0, 2.0.0, 3.0.0 >Reporter: Konstantin Shvachko >Assignee: Konstantin Shvachko > Attachments: blockDeletePolicy-0.22.patch, blockDeletePolicy.patch, > blockDeletePolicy.patch > > > All replicas of a block can be removed if bad DataNodes come up and down > during cluster restart resulting in data loss.
[jira] [Commented] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from
[ https://issues.apache.org/jira/browse/HDFS-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272986#comment-13272986 ] Todd Lipcon commented on HDFS-3404: --- How much more difficult would it be to just have it do a straight HTTP POST or PUT of the new image instead of the "I'll ask you to ask me for this image" dance? > Make putImage in GetImageServlet infer remote address to fetch from > --- > > Key: HDFS-3404 > URL: https://issues.apache.org/jira/browse/HDFS-3404 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3404.patch > > > As it stands, daemons which perform checkpointing must determine their own > address on which they can be reached, so that the NN which they checkpoint > against knows what address to fetch a merged fsimage from. This causes > problems if, for example, the daemon performing checkpointing binds to > 0.0.0.0, and thus can't be sure of what address the NN can reach it at.
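One way to picture the "infer the remote address" idea from the HDFS-3404 description is the sketch below. It is an assumption about the shape of such logic, not the attached patch: the class and method names are hypothetical, and `remoteAddr` stands for whatever the receiving side observes as the peer address (in a servlet, the value of `getRemoteAddr()` on the request).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class FetchAddressSketch {
    /**
     * Picks the host to fetch the merged image from. Falls back to the observed
     * peer address when the sender advertised nothing usable, such as the
     * wildcard address 0.0.0.0 it is bound to.
     */
    static String chooseFetchHost(String advertisedHost, String remoteAddr) {
        if (advertisedHost == null || advertisedHost.isEmpty()) {
            return remoteAddr;
        }
        try {
            // Note: for a host name (rather than an IP literal) this does a lookup.
            if (InetAddress.getByName(advertisedHost).isAnyLocalAddress()) {
                return remoteAddr;
            }
        } catch (UnknownHostException e) {
            return remoteAddr; // unresolvable advertised name: trust the connection
        }
        return advertisedHost;
    }

    public static void main(String[] args) {
        System.out.println(chooseFetchHost("0.0.0.0", "192.168.1.20"));
        System.out.println(chooseFetchHost("10.0.0.5", "192.168.1.20"));
    }
}
```

The observed peer address is by construction one the receiver can route back to, which is exactly what a daemon bound to 0.0.0.0 cannot determine on its own.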
[jira] [Commented] (HDFS-3031) HA: Error (failed to close file) when uploading large file + kill active NN + manual failover
[ https://issues.apache.org/jira/browse/HDFS-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272985#comment-13272985 ] Hadoop QA commented on HDFS-3031: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526461/hdfs-3031.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2419//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2419//console This message is automatically generated. > HA: Error (failed to close file) when uploading large file + kill active NN + > manual failover > - > > Key: HDFS-3031 > URL: https://issues.apache.org/jira/browse/HDFS-3031 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 0.24.0 >Reporter: Stephen Chu >Assignee: Todd Lipcon > Attachments: hdfs-3031.txt, hdfs-3031.txt, hdfs-3031.txt, > styx01_killNNfailover, styx01_uploadLargeFile > > > I executed section 3.4 of Todd's HA test plan. > https://issues.apache.org/jira/browse/HDFS-1623 > 1. A large file upload is started. > 2. While the file is being uploaded, the administrator kills the first NN and > performs a failover. > 3. 
After the file finishes being uploaded, it is verified for correct length > and contents. > For the test, I have a vm_template styx01:/home/schu/centos64-2-5.5.qcow2. > styx01 hosted the active NN and styx02 hosted the standby NN. > In the log files I attached, you can see that on styx01 I began file upload. > hadoop fs -put centos64-2.5.5.qcow2 > After waiting several seconds, I kill -9'd the active NN on styx01 and > manually failed over to the NN on styx02. I ran into exception below. (rest > of the stacktrace in the attached file styx01_uploadLargeFile) > 12/02/29 14:12:52 WARN retry.RetryInvocationHandler: A failover has occurred > since the start of this method invocation attempt. > put: Failed on local exception: java.io.EOFException; Host Details : local > host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: > ""styx01.sf.cloudera.com"\ > :12020; > 12/02/29 14:12:52 ERROR hdfs.DFSClient: Failed to close file > /user/schu/centos64-2-5.5.qcow2._COPYING_ > java.io.IOException: Failed on local exception: java.io.EOFException; Host > Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination > host is: ""styx01.\ > sf.cloudera.com":12020; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731) > at org.apache.hadoop.ipc.Client.call(Client.java:1145) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:188) > at $Proxy9.addBlock(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:302) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164) > at > 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83) > at $Proxy10.addBlock(Unknown Source) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1097) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:973) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455) > Caused by: java.io.EOFException > at java.io.DataInputStream.readInt(DataInputStream.java:375) > at > org.apache.ha
[jira] [Resolved] (HDFS-3388) GetJournalEditServlet should close output stream only if the stream is used.
[ https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE resolved HDFS-3388. -- Resolution: Fixed Fix Version/s: Shared journals (HDFS-3092) I have committed this. Thanks, Brandon! > GetJournalEditServlet should close output stream only if the stream is used. > > > Key: HDFS-3388 > URL: https://issues.apache.org/jira/browse/HDFS-3388 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, name-node >Reporter: Brandon Li >Assignee: Brandon Li > Fix For: Shared journals (HDFS-3092) > > Attachments: HDFS-3388.HDFS-3092.patch, HDFS-3388.HDFS-3092.patch, > HDFS-3388.HDFS-3092.patch > > > GetJournalEditServlet has the same problem as that of GetImageServlet > (HDFS-3330). It should be fixed in the same way. Also need to make > CheckpointFaultInjector visible for journal service tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3388) GetJournalEditServlet should close output stream only if the stream is used.
[ https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3388: - Summary: GetJournalEditServlet should close output stream only if the stream is used. (was: GetJournalEditServlet should catch more exceptions, not just IOException) Hadoop Flags: Reviewed +1 patch looks good. (revised the summary.) > GetJournalEditServlet should close output stream only if the stream is used. > > > Key: HDFS-3388 > URL: https://issues.apache.org/jira/browse/HDFS-3388 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, name-node >Reporter: Brandon Li >Assignee: Brandon Li > Fix For: Shared journals (HDFS-3092) > > Attachments: HDFS-3388.HDFS-3092.patch, HDFS-3388.HDFS-3092.patch, > HDFS-3388.HDFS-3092.patch > > > GetJournalEditServlet has the same problem as that of GetImageServlet > (HDFS-3330). It should be fixed in the same way. Also need to make > CheckpointFaultInjector visible for journal service tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
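Editorial sketch for HDFS-3388: the notification above does not include the patch, so the following only illustrates the idea in the summary, "close output stream only if the stream is used". FakeResponse and serve are hypothetical stand-ins and are not the servlet API or the Hadoop code. The stream reference stays null until the handler actually asks for it, and the finally block closes it only in that case.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class CloseIfUsed {
    /** Hypothetical response object that records whether its stream was opened or closed. */
    static class FakeResponse {
        boolean opened = false;
        boolean closed = false;

        OutputStream getOutputStream() {
            opened = true;
            return new ByteArrayOutputStream() {
                @Override
                public void close() throws IOException {
                    FakeResponse.this.closed = true;
                    super.close();
                }
            };
        }
    }

    /** Writes a payload for valid requests; on the early-exit path the stream is never touched. */
    static void serve(FakeResponse response, boolean requestIsValid) {
        OutputStream out = null; // stays null until the stream is really needed
        try {
            if (!requestIsValid) {
                return; // error path: no stream was obtained, so none is closed
            }
            out = response.getOutputStream();
            out.write("edits".getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (out != null) { // close only if the stream was used
                try {
                    out.close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
    }

    public static void main(String[] args) {
        FakeResponse rejected = new FakeResponse();
        serve(rejected, false);
        System.out.println("rejected: opened=" + rejected.opened + " closed=" + rejected.closed);

        FakeResponse served = new FakeResponse();
        serve(served, true);
        System.out.println("served: opened=" + served.opened + " closed=" + served.closed);
    }
}
```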
[jira] [Updated] (HDFS-3394) Do not use generic in INodeFile.getLastBlock()
[ https://issues.apache.org/jira/browse/HDFS-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3394: - Status: Patch Available (was: Open) > Do not use generic in INodeFile.getLastBlock() > -- > > Key: HDFS-3394 > URL: https://issues.apache.org/jira/browse/HDFS-3394 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE >Priority: Minor > Attachments: h3394_20120510.patch > > > As shown in HDFS-3385, the ClassCastException check in > INodeFile.getLastBlock() is useless since generic type information is only > available in compile-time but not run-time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3394) Do not use generic in INodeFile.getLastBlock()
[ https://issues.apache.org/jira/browse/HDFS-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3394: - Attachment: h3394_20120510.patch h3394_20120510.patch: - removes generic from INodeFile.getLastBlock(); - changes some public/protected methods package private/private; - rewrites some javadoc. > Do not use generic in INodeFile.getLastBlock() > -- > > Key: HDFS-3394 > URL: https://issues.apache.org/jira/browse/HDFS-3394 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE >Priority: Minor > Attachments: h3394_20120510.patch > > > As shown in HDFS-3385, the ClassCastException check in > INodeFile.getLastBlock() is useless since generic type information is only > available in compile-time but not run-time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
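Editorial sketch for HDFS-3394: the description says a ClassCastException check inside a generic accessor is useless because generic type information exists only at compile time. The classes below are simplified stand-ins that reuse the names BlockInfo and BlockInfoUnderConstruction for readability; they are not the Hadoop classes, and getLastBlockGeneric is a hypothetical method. The cast to T is erased to a cast to BlockInfo, so the catch block inside the accessor never fires; the failure surfaces at the call site, where the compiler inserts the real checked cast.

```java
public class ErasureDemo {
    static class BlockInfo {}
    static class BlockInfoUnderConstruction extends BlockInfo {}

    /** Generic accessor: the cast to T is unchecked and erased to BlockInfo at run time. */
    @SuppressWarnings("unchecked")
    static <T extends BlockInfo> T getLastBlockGeneric(BlockInfo[] blocks) {
        try {
            return (T) blocks[blocks.length - 1];
        } catch (ClassCastException e) {
            // Not reached: after erasure this is a cast to BlockInfo, which cannot fail here.
            throw new IllegalStateException("caught inside accessor", e);
        }
    }

    /** Non-generic accessor: callers downcast explicitly where they need the subtype. */
    static BlockInfo getLastBlock(BlockInfo[] blocks) {
        return blocks[blocks.length - 1];
    }

    /** Reports where the bad cast is detected: "accessor", "caller", or "none". */
    static String whereDoesItFail() {
        BlockInfo[] blocks = { new BlockInfo() };
        try {
            BlockInfoUnderConstruction b =
                ErasureDemo.<BlockInfoUnderConstruction>getLastBlockGeneric(blocks);
            return "none";
        } catch (IllegalStateException e) {
            return "accessor";
        } catch (ClassCastException e) {
            return "caller";
        }
    }

    public static void main(String[] args) {
        System.out.println("bad cast detected in: " + whereDoesItFail());
    }
}
```

With the non-generic getLastBlock, the downcast is written out by the caller, which makes the place where it can fail visible in the source.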
[jira] [Commented] (HDFS-3163) TestHDFSCLI.testAll fails if the user name is not all lowercase
[ https://issues.apache.org/jira/browse/HDFS-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272970#comment-13272970 ] Hadoop QA commented on HDFS-3163: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526447/HDFS-3163.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestHFlush +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2416//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2416//console This message is automatically generated. > TestHDFSCLI.testAll fails if the user name is not all lowercase > --- > > Key: HDFS-3163 > URL: https://issues.apache.org/jira/browse/HDFS-3163 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Reporter: Brandon Li >Assignee: Brandon Li >Priority: Trivial > Attachments: HDFS-3163.patch > > > In the test resource file testHDFSConf.xml, the test comparators expect the user > name to be all lowercase. > If the user running the test has an uppercase letter in the username (e.g., Brandon > instead of brandon), many RegexpComparator tests will fail. 
The following is > one example: > {noformat} > > RegexpComparator > ^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( > )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( > )*/file1 > > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
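Editorial sketch for HDFS-3163: the expected-output pattern quoted above uses [a-z]* for the owner column, so a listing line whose owner contains an uppercase letter cannot match. The class below reproduces that with java.util.regex. The sample listing line, the class and method names, and the relaxed [a-zA-Z]* variant are assumptions made for illustration; the notification does not show what the attached patch actually changes.

```java
import java.util.regex.Pattern;

public class OwnerRegexDemo {
    // Pattern as quoted in the issue: the owner column only admits lowercase letters.
    static final Pattern LOWERCASE_ONLY = Pattern.compile(
        "^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( )*"
        + "[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( )*/file1");

    // Hypothetical relaxation: also accept uppercase letters in the owner column.
    static final Pattern ANY_CASE = Pattern.compile(
        "^-rw-r--r--( )*1( )*[a-zA-Z]*( )*supergroup( )*0( )*"
        + "[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( )*/file1");

    /** Builds an illustrative listing line for the given owner (the format is an assumption). */
    static String lsLine(String owner) {
        return "-rw-r--r--   1 " + owner + " supergroup          0 2012-05-10 12:00 /file1";
    }

    /** True if the pattern is found in the listing line generated for this owner. */
    static boolean matches(Pattern pattern, String owner) {
        return pattern.matcher(lsLine(owner)).find();
    }

    public static void main(String[] args) {
        System.out.println("lowercase-only, brandon: " + matches(LOWERCASE_ONLY, "brandon"));
        System.out.println("lowercase-only, Brandon: " + matches(LOWERCASE_ONLY, "Brandon"));
        System.out.println("any-case, Brandon: " + matches(ANY_CASE, "Brandon"));
    }
}
```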
[jira] [Commented] (HDFS-3385) ClassCastException when trying to append a file
[ https://issues.apache.org/jira/browse/HDFS-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272963#comment-13272963 ] Hudson commented on HDFS-3385: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2244 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2244/]) HDFS-3385. The last block of INodeFileUnderConstruction is not necessarily a BlockInfoUnderConstruction, so do not cast it in FSNamesystem.recoverLeaseInternal(..). (Revision 1336976) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336976 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java > ClassCastException when trying to append a file > --- > > Key: HDFS-3385 > URL: https://issues.apache.org/jira/browse/HDFS-3385 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node > Environment: HDFS >Reporter: Brahma Reddy Battula >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 2.0.0 > > Attachments: h3385_20120508.patch, h3385_20120509.patch > > > When I try to append a file I got > {noformat} > 2012-05-08 18:13:40,506 WARN util.KerberosName > (KerberosName.java:(87)) - Kerberos krb5 configuration not found, > setting default realm to empty > Exception in thread "main" java.lang.ClassCastException: > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425) > ... > at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177) > at > org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221) > at > org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1) > at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981) > at > org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26) > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
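Editorial sketch for HDFS-3385: the commit message says the last block of a file under construction is not necessarily a BlockInfoUnderConstruction, and that the fix is to stop casting it in FSNamesystem.recoverLeaseInternal(..). The classes below are simplified stand-ins that borrow the two class names from the stack trace; the methods are hypothetical. The sketch only shows why an unconditional downcast throws and how a run-time type check avoids it, which is one possible way to handle the case and not necessarily what the committed patch does.

```java
public class LastBlockCastDemo {
    static class BlockInfo {}
    static class BlockInfoUnderConstruction extends BlockInfo {}

    /** Unconditional downcast: throws ClassCastException when given a plain BlockInfo. */
    static String unguarded(BlockInfo last) {
        BlockInfoUnderConstruction uc = (BlockInfoUnderConstruction) last;
        return "under construction";
    }

    /** Checks the run-time type first, so a completed block is handled without a cast. */
    static String guarded(BlockInfo last) {
        if (last instanceof BlockInfoUnderConstruction) {
            BlockInfoUnderConstruction uc = (BlockInfoUnderConstruction) last;
            return "under construction";
        }
        return "complete";
    }

    public static void main(String[] args) {
        System.out.println(guarded(new BlockInfo()));
        System.out.println(guarded(new BlockInfoUnderConstruction()));
        try {
            unguarded(new BlockInfo());
        } catch (ClassCastException e) {
            System.out.println("unguarded cast failed with ClassCastException");
        }
    }
}
```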
[jira] [Commented] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests
[ https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272957#comment-13272957 ] Hudson commented on HDFS-3401: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2243 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2243/]) HDFS-3401. Cleanup DatanodeDescriptor creation in the tests. Contributed by Eli Collins (Revision 1336972) Result = ABORTED eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336972 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHost2NodesMap.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java > Cleanup DatanodeDescriptor creation in the tests > > > Key: HDFS-3401 > URL: https://issues.apache.org/jira/browse/HDFS-3401 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, test >Affects Versions: 2.0.0 >Reporter: Eli Collins >Assignee: Eli Collins > Fix For: 2.0.0 > > Attachments: hdfs-3401.txt > > > Like HDFS-3230 but for DatanodeDescriptor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3385) ClassCastException when trying to append a file
[ https://issues.apache.org/jira/browse/HDFS-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272950#comment-13272950 ] Hudson commented on HDFS-3385: -- Integrated in Hadoop-Common-trunk-Commit #2227 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2227/]) HDFS-3385. The last block of INodeFileUnderConstruction is not necessarily a BlockInfoUnderConstruction, so do not cast it in FSNamesystem.recoverLeaseInternal(..). (Revision 1336976) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336976 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java > ClassCastException when trying to append a file > --- > > Key: HDFS-3385 > URL: https://issues.apache.org/jira/browse/HDFS-3385 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node > Environment: HDFS >Reporter: Brahma Reddy Battula >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 2.0.0 > > Attachments: h3385_20120508.patch, h3385_20120509.patch > > > When I try to append a file I got > {noformat} > 2012-05-08 18:13:40,506 WARN util.KerberosName > (KerberosName.java:(87)) - Kerberos krb5 configuration not found, > setting default realm to empty > Exception in thread "main" java.lang.ClassCastException: > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425) > ... > at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177) > at > org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221) > at > org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1) > at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981) > at > org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26) > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3335) check for edit log corruption at the end of the log
[ https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272945#comment-13272945 ] Colin Patrick McCabe commented on HDFS-3335: * Yeah, your understanding of GarbageAfterTerminatorException#getOffset is correct. I'll rename it to something clearer. bq. Part of what is confusing me is this: does padding after OP_INVALID count as garbage or not? No, padding is just zeros or 0xffs. Garbage is something you wouldn't expect to be there, like more opcodes, random bytes, or something like that. * I'll see if I can remove the unnecessary whitespace diffs... > check for edit log corruption at the end of the log > --- > > Key: HDFS-3335 > URL: https://issues.apache.org/jira/browse/HDFS-3335 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, > HDFS-3335-b1.003.patch, HDFS-3335-b1.004.patch, HDFS-3335.001.patch, > HDFS-3335.002.patch, HDFS-3335.003.patch, HDFS-3335.004.patch, > HDFS-3335.005.patch, HDFS-3335.006.patch, HDFS-3335.007.patch > > > Even after encountering an OP_INVALID, we should check the end of the edit > log to make sure that it contains no more edits. > This will catch things like rare race conditions or log corruptions that > would otherwise remain undetected. They will go from being silent data loss > scenarios to being cases that we can detect and fix. > Using recovery mode, we can choose to ignore the end of the log if necessary. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
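Editorial sketch for HDFS-3335: the comment above distinguishes padding (zeros or 0xff bytes) from garbage (anything else, such as more opcodes or random bytes) after the terminator. The method below applies that rule to a byte array. The class name, method name, and the idea of returning the first offending offset are assumptions for illustration and are not taken from the attached patches.

```java
public class TrailingBytesCheck {
    /**
     * Scans the bytes from 'start' to the end of the array.
     * Returns the offset of the first byte that is neither 0x00 nor 0xFF,
     * or -1 if everything from 'start' onward is padding.
     */
    static long firstGarbageOffset(byte[] log, int start) {
        for (int i = start; i < log.length; i++) {
            int b = log[i] & 0xFF; // treat the byte as unsigned
            if (b != 0x00 && b != 0xFF) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] padded = { 1, 2, (byte) 0xFF, 0, 0, (byte) 0xFF };
        byte[] corrupt = { 1, 2, (byte) 0xFF, 0, 7, 0 };
        System.out.println("padded log: " + firstGarbageOffset(padded, 2));
        System.out.println("corrupt log: " + firstGarbageOffset(corrupt, 2));
    }
}
```

A caller could treat a non-negative result as a reason to stop and report the offset, and a recovery mode could choose to ignore it, in line with the issue description.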
[jira] [Commented] (HDFS-3385) ClassCastException when trying to append a file
[ https://issues.apache.org/jira/browse/HDFS-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272946#comment-13272946 ] Hudson commented on HDFS-3385: -- Integrated in Hadoop-Hdfs-trunk-Commit #2301 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2301/]) HDFS-3385. The last block of INodeFileUnderConstruction is not necessarily a BlockInfoUnderConstruction, so do not cast it in FSNamesystem.recoverLeaseInternal(..). (Revision 1336976) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336976 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java > ClassCastException when trying to append a file > --- > > Key: HDFS-3385 > URL: https://issues.apache.org/jira/browse/HDFS-3385 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node > Environment: HDFS >Reporter: Brahma Reddy Battula >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 2.0.0 > > Attachments: h3385_20120508.patch, h3385_20120509.patch > > > When I try to append a file I got > {noformat} > 2012-05-08 18:13:40,506 WARN util.KerberosName > (KerberosName.java:(87)) - Kerberos krb5 configuration not found, > setting default realm to empty > Exception in thread "main" java.lang.ClassCastException: > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425) > ... > at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177) > at > org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221) > at > org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1) > at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981) > at > org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26) > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3031) HA: Error (failed to close file) when uploading large file + kill active NN + manual failover
[ https://issues.apache.org/jira/browse/HDFS-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3031: -- Attachment: hdfs-3031.txt Woops, my previously uploaded patch was accidentally against the auto-HA branch. New patch is against trunk (only a trivial difference in tests) > HA: Error (failed to close file) when uploading large file + kill active NN + > manual failover > - > > Key: HDFS-3031 > URL: https://issues.apache.org/jira/browse/HDFS-3031 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 0.24.0 >Reporter: Stephen Chu >Assignee: Todd Lipcon > Attachments: hdfs-3031.txt, hdfs-3031.txt, hdfs-3031.txt, > styx01_killNNfailover, styx01_uploadLargeFile > > > I executed section 3.4 of Todd's HA test plan. > https://issues.apache.org/jira/browse/HDFS-1623 > 1. A large file upload is started. > 2. While the file is being uploaded, the administrator kills the first NN and > performs a failover. > 3. After the file finishes being uploaded, it is verified for correct length > and contents. > For the test, I have a vm_template styx01:/home/schu/centos64-2-5.5.qcow2. > styx01 hosted the active NN and styx02 hosted the standby NN. > In the log files I attached, you can see that on styx01 I began file upload. > hadoop fs -put centos64-2.5.5.qcow2 > After waiting several seconds, I kill -9'd the active NN on styx01 and > manually failed over to the NN on styx02. I ran into exception below. (rest > of the stacktrace in the attached file styx01_uploadLargeFile) > 12/02/29 14:12:52 WARN retry.RetryInvocationHandler: A failover has occurred > since the start of this method invocation attempt. 
> put: Failed on local exception: java.io.EOFException; Host Details : local > host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: > ""styx01.sf.cloudera.com"\ > :12020; > 12/02/29 14:12:52 ERROR hdfs.DFSClient: Failed to close file > /user/schu/centos64-2-5.5.qcow2._COPYING_ > java.io.IOException: Failed on local exception: java.io.EOFException; Host > Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination > host is: ""styx01.\ > sf.cloudera.com":12020; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731) > at org.apache.hadoop.ipc.Client.call(Client.java:1145) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:188) > at $Proxy9.addBlock(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:302) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83) > at $Proxy10.addBlock(Unknown Source) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1097) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:973) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455) > Caused by: java.io.EOFException > at java.io.DataInputStream.readInt(DataInputStream.java:375) > at > org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:830) > at org.apache.hadoop.ipc.Client$Connection.run(Client.java:762) -- This message is automatically 
generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3385) ClassCastException when trying to append a file
[ https://issues.apache.org/jira/browse/HDFS-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3385: - Resolution: Fixed Fix Version/s: 2.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Suresh for the review. I have committed this. > ClassCastException when trying to append a file > --- > > Key: HDFS-3385 > URL: https://issues.apache.org/jira/browse/HDFS-3385 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node > Environment: HDFS >Reporter: Brahma Reddy Battula >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 2.0.0 > > Attachments: h3385_20120508.patch, h3385_20120509.patch > > > When I try to append a file I got > {noformat} > 2012-05-08 18:13:40,506 WARN util.KerberosName > (KerberosName.java:(87)) - Kerberos krb5 configuration not found, > setting default realm to empty > Exception in thread "main" java.lang.ClassCastException: > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425) > ... 
> at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177) > at > org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221) > at > org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1) > at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981) > at > org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26) > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3385) ClassCastException when trying to append a file
[ https://issues.apache.org/jira/browse/HDFS-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272934#comment-13272934 ] Suresh Srinivas commented on HDFS-3385: --- +1 for the patch. > ClassCastException when trying to append a file > --- > > Key: HDFS-3385 > URL: https://issues.apache.org/jira/browse/HDFS-3385 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node > Environment: HDFS >Reporter: Brahma Reddy Battula >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h3385_20120508.patch, h3385_20120509.patch > > > When I try to append a file I got > {noformat} > 2012-05-08 18:13:40,506 WARN util.KerberosName > (KerberosName.java:(87)) - Kerberos krb5 configuration not found, > setting default realm to empty > Exception in thread "main" java.lang.ClassCastException: > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to > org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425) > ... > at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189) > at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177) > at > org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221) > at > org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1) > at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981) > at > org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26) > {noformat} -- This message is automatically generated by JIRA. 
[jira] [Commented] (HDFS-3031) HA: Error (failed to close file) when uploading large file + kill active NN + manual failover
[ https://issues.apache.org/jira/browse/HDFS-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272933#comment-13272933 ] Hadoop QA commented on HDFS-3031: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526455/hdfs-3031.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 4 new or modified test files. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2418//console This message is automatically generated. > HA: Error (failed to close file) when uploading large file + kill active NN + > manual failover > - > > Key: HDFS-3031 > URL: https://issues.apache.org/jira/browse/HDFS-3031 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 0.24.0 >Reporter: Stephen Chu >Assignee: Todd Lipcon > Attachments: hdfs-3031.txt, hdfs-3031.txt, styx01_killNNfailover, > styx01_uploadLargeFile > > > I executed section 3.4 of Todd's HA test plan. > https://issues.apache.org/jira/browse/HDFS-1623 > 1. A large file upload is started. > 2. While the file is being uploaded, the administrator kills the first NN and > performs a failover. > 3. After the file finishes being uploaded, it is verified for correct length > and contents. > For the test, I have a vm_template styx01:/home/schu/centos64-2-5.5.qcow2. > styx01 hosted the active NN and styx02 hosted the standby NN. > In the log files I attached, you can see that on styx01 I began file upload. > hadoop fs -put centos64-2.5.5.qcow2 > After waiting several seconds, I kill -9'd the active NN on styx01 and > manually failed over to the NN on styx02. I ran into exception below. 
(rest > of the stacktrace in the attached file styx01_uploadLargeFile) > 12/02/29 14:12:52 WARN retry.RetryInvocationHandler: A failover has occurred > since the start of this method invocation attempt. > put: Failed on local exception: java.io.EOFException; Host Details : local > host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: > ""styx01.sf.cloudera.com"\ > :12020; > 12/02/29 14:12:52 ERROR hdfs.DFSClient: Failed to close file > /user/schu/centos64-2-5.5.qcow2._COPYING_ > java.io.IOException: Failed on local exception: java.io.EOFException; Host > Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination > host is: ""styx01.\ > sf.cloudera.com":12020; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731) > at org.apache.hadoop.ipc.Client.call(Client.java:1145) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:188) > at $Proxy9.addBlock(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:302) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83) > at $Proxy10.addBlock(Unknown Source) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1097) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:973) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455) > Caused by: java.io.EOFException > at 
java.io.DataInputStream.readInt(DataInputStream.java:375) > at > org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:830) > at org.apache.hadoop.ipc.Client$Connection.run(Client.java:762) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests
[ https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272931#comment-13272931 ] Hudson commented on HDFS-3401: -- Integrated in Hadoop-Common-trunk-Commit #2225 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2225/]) HDFS-3401. Cleanup DatanodeDescriptor creation in the tests. Contributed by Eli Collins (Revision 1336972) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336972 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHost2NodesMap.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java > Cleanup DatanodeDescriptor creation in the tests > > > Key: HDFS-3401 > URL: https://issues.apache.org/jira/browse/HDFS-3401 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, test >Affects Versions: 2.0.0 >Reporter: Eli Collins >Assignee: Eli Collins > Fix For: 2.0.0 > > Attachments: hdfs-3401.txt > > > Like HDFS-3230 but for DatanodeDescriptor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3163) TestHDFSCLI.testAll fails if the user name is not all lowercase
[ https://issues.apache.org/jira/browse/HDFS-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272932#comment-13272932 ] Brandon Li commented on HDFS-3163: -- I tested the patch by running TestHDFSCLI. It passed with different users like admin, Brandon and test1. > TestHDFSCLI.testAll fails if the user name is not all lowercase > --- > > Key: HDFS-3163 > URL: https://issues.apache.org/jira/browse/HDFS-3163 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Reporter: Brandon Li >Assignee: Brandon Li >Priority: Trivial > Attachments: HDFS-3163.patch > > > In the test resource file testHDFSConf.xml, the test comparators expect the user > name to be all lowercase. > If the user running the test has an uppercase letter in the username (e.g., Brandon > instead of brandon), many RegexpComparator tests will fail. The following is > one example: > {noformat} > > RegexpComparator > ^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( > )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( > )*/file1 > > {noformat}
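The failure described in HDFS-3163 can be reproduced with plain java.util.regex. What follows is only a sketch: the class name, the sample listing line, and the widened character class are assumptions made for illustration, and the widened pattern is one possible fix, not necessarily what HDFS-3163.patch does.

```java
import java.util.regex.Pattern;

/** Hypothetical demo class; not part of TestHDFSCLI. */
final class OwnerRegexDemo {
    // The expected-output pattern quoted in the issue: the owner must match [a-z]*.
    static final Pattern LOWERCASE_ONLY = Pattern.compile(
        "^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( )*"
        + "[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( )*/file1");

    // Assumed alternative that also accepts uppercase letters and digits in the owner.
    static final Pattern MIXED_CASE = Pattern.compile(
        "^-rw-r--r--( )*1( )*[a-zA-Z0-9]*( )*supergroup( )*0( )*"
        + "[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( )*/file1");

    /** Builds a made-up listing line for the given owner. */
    static String lsLine(String user) {
        return "-rw-r--r--   1 " + user + " supergroup          0 2012-05-10 14:12 /file1";
    }

    /** True if the pattern is found in the listing line for this owner. */
    static boolean matches(Pattern pattern, String user) {
        return pattern.matcher(lsLine(user)).find();
    }
}
```

With the original character class, an owner such as Brandon stops matching at the first uppercase letter, so the comparator fails even though the listing itself is correct.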
[jira] [Commented] (HDFS-3335) check for edit log corruption at the end of the log
[ https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272929#comment-13272929 ] Todd Lipcon commented on HDFS-3335: --- In {{EditLogFileInputStream.nextOp}}, we should log a WARN message with the file name and data on how many bytes are skipped at the end of the file. This way, if there is an error replaying later, you might notice that in fact you did want to recover some of these edits. Having the warning in the log will make it easier to find where they went. In this place, it would also be nice to detect how many of those bytes were just 0x padding vs data that potentially looks like transactions. - Rename {{GarbageAfterTerminatorException.getOffset}} to something a little more clear -- right now it's not obvious that this is a relative offset/length after the OP_INVALID, versus an offset since the beginning of the file, etc. Perhaps {{getPaddingLengthAfterEofMarker}}? I'm still not entirely clear what this length represents... by my reading of the javadoc, it is: {code} <--- valid edits ---> < OP_INVALID > <-- N bytes of padding --> <-- non-padding data --> EOF {code} where {{N}} above is what you're talking about? Maybe some ASCII art like the above in the javadoc would be helpful. Part of what is confusing me is this: does padding after OP_INVALID count as garbage or not? {code} + /** Testing hook */ + void setEditLog(FSEditLog newLog) { {code} Can you add @VisibleForTesting and change to {{setEditLogForTesting}} so no one starts to use it in non-test code? - Lots of spurious whitespace changes in TestNameNodeRecovery - Can you add brief javadoc to the three implementations of Corruptor? eg "/** Truncate the last byte of the file */", "/* Add padding followed by some non-padding bytes to the end of the file */" and "/** Add only padding to the end of the file */"? Otherwise really nice tests. 
> check for edit log corruption at the end of the log > --- > > Key: HDFS-3335 > URL: https://issues.apache.org/jira/browse/HDFS-3335 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, > HDFS-3335-b1.003.patch, HDFS-3335-b1.004.patch, HDFS-3335.001.patch, > HDFS-3335.002.patch, HDFS-3335.003.patch, HDFS-3335.004.patch, > HDFS-3335.005.patch, HDFS-3335.006.patch, HDFS-3335.007.patch > > > Even after encountering an OP_INVALID, we should check the end of the edit > log to make sure that it contains no more edits. > This will catch things like rare race conditions or log corruptions that > would otherwise remain undetected. They will go from being silent data loss > scenarios to being cases that we can detect and fix. > Using recovery mode, we can choose to ignore the end of the log if necessary.
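The review comment on HDFS-3335 asks for a way to tell how many bytes after the end marker are padding and how many look like real data. A minimal sketch of that classification follows. It assumes the bytes after OP_INVALID have already been read into an array and that the padding value is supplied by the caller (the comment does not spell out the value). The class and method names are hypothetical and this is not the EditLogFileInputStream code.

```java
/** Hypothetical helper; not the actual EditLogFileInputStream logic. */
final class TrailingBytesScan {
    /** Number of trailing bytes equal to the assumed padding value. */
    final long paddingBytes;
    /** Number of trailing bytes that are something else, possibly lost edits. */
    final long nonPaddingBytes;

    private TrailingBytesScan(long paddingBytes, long nonPaddingBytes) {
        this.paddingBytes = paddingBytes;
        this.nonPaddingBytes = nonPaddingBytes;
    }

    /** Classifies every byte found after the end-of-log marker. */
    static TrailingBytesScan scan(byte[] afterMarker, byte padding) {
        long pad = 0;
        long other = 0;
        for (byte b : afterMarker) {
            if (b == padding) {
                pad++;
            } else {
                other++;
            }
        }
        return new TrailingBytesScan(pad, other);
    }

    /** Under this sketch, only non-padding bytes count as garbage. */
    boolean looksLikeGarbage() {
        return nonPaddingBytes > 0;
    }

    /** Text suitable for the WARN message suggested in the review. */
    String describe(String fileName) {
        return "Skipped " + (paddingBytes + nonPaddingBytes) + " bytes at the end of "
            + fileName + " (" + paddingBytes + " padding, " + nonPaddingBytes + " non-padding)";
    }
}
```

Whether padding on its own should count as garbage is exactly the open question in the review; this sketch takes the position that it does not.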
[jira] [Commented] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests
[ https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272927#comment-13272927 ] Hudson commented on HDFS-3401: -- Integrated in Hadoop-Hdfs-trunk-Commit #2300 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2300/]) HDFS-3401. Cleanup DatanodeDescriptor creation in the tests. Contributed by Eli Collins (Revision 1336972) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336972 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHost2NodesMap.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java 
> Cleanup DatanodeDescriptor creation in the tests > > > Key: HDFS-3401 > URL: https://issues.apache.org/jira/browse/HDFS-3401 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, test >Affects Versions: 2.0.0 >Reporter: Eli Collins >Assignee: Eli Collins > Fix For: 2.0.0 > > Attachments: hdfs-3401.txt > > > Like HDFS-3230 but for DatanodeDescriptor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3134) Harden edit log loader against malformed or malicious input
[ https://issues.apache.org/jira/browse/HDFS-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272924#comment-13272924 ] Hudson commented on HDFS-3134: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2242 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2242/]) HDFS-3134. harden edit log loader against malformed or malicious input. Contributed by Colin Patrick McCabe (Revision 1336943) Result = ABORTED eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336943 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenIdentifier.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java > Harden edit log loader against malformed or malicious input > --- > > Key: HDFS-3134 > URL: https://issues.apache.org/jira/browse/HDFS-3134 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 2.0.0 > > Attachments: HDFS-3134.001.patch, HDFS-3134.002.patch, > HDFS-3134.003.patch, HDFS-3134.004.patch, HDFS-3134.005.patch, > HDFS-3134.006.patch, HDFS-3134.007.patch, HDFS-3134.009.patch > > > Currently, the edit log loader does not handle bad or malicious input > sensibly. > We can often cause OutOfMemory exceptions, null pointer exceptions, or other > unchecked exceptions to be thrown by feeding the edit log loader bad input. > In some environments, an out of memory error can cause the JVM process to be > terminated. > It's clear that we want these exceptions to be thrown as IOException instead > of as unchecked exceptions. 
We also want to avoid out of memory situations. > The main task here is to put a sensible upper limit on the lengths of arrays > and strings we allocate on command. The other task is to try to avoid > creating unchecked exceptions (by dereferencing potentially-NULL pointers, > for example). Instead, we should verify ahead of time and give a more > sensible error message that reflects the problem with the input. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
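The two tasks named in HDFS-3134, bounding allocations and reporting bad input as IOException, can be sketched with a length-prefixed read. The limit, class name, and method names below are assumptions made for illustration; the committed patch to FSEditLogOp may be structured differently.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical sketch of defensive deserialization; not the FSEditLogOp code. */
final class BoundedReads {
    /** Assumed upper limit, chosen only for this example. */
    static final int MAX_ARRAY_LENGTH = 1 << 20;

    /** Reads a 4-byte length followed by that many bytes, rejecting absurd lengths. */
    static byte[] readLengthPrefixedBytes(DataInputStream in) throws IOException {
        int len = in.readInt();
        // Validate before allocating, so corrupt input cannot trigger
        // NegativeArraySizeException or an OutOfMemoryError.
        if (len < 0 || len > MAX_ARRAY_LENGTH) {
            throw new IOException("Invalid array length " + len
                + "; expected 0.." + MAX_ARRAY_LENGTH);
        }
        byte[] buf = new byte[len];
        in.readFully(buf); // throws EOFException (an IOException) on truncated input
        return buf;
    }

    /** Demo wrapper that reports the outcome as a string instead of throwing. */
    static String tryRead(byte[] encoded) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded))) {
            return "ok:" + readLengthPrefixedBytes(in).length;
        } catch (IOException e) {
            return "rejected";
        }
    }
}
```

The check runs before the allocation, which is the point of the exercise: the caller sees a checked exception describing the bad length rather than an unchecked error from the JVM.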
[jira] [Updated] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests
[ https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3401: -- Resolution: Fixed Fix Version/s: 2.0.0 Target Version/s: (was: 2.0.0) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks for the review ATM. I've committed this and merged to branch-2. > Cleanup DatanodeDescriptor creation in the tests > > > Key: HDFS-3401 > URL: https://issues.apache.org/jira/browse/HDFS-3401 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, test >Affects Versions: 2.0.0 >Reporter: Eli Collins >Assignee: Eli Collins > Fix For: 2.0.0 > > Attachments: hdfs-3401.txt > > > Like HDFS-3230 but for DatanodeDescriptor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3388) GetJournalEditServlet should catch more exceptions, not just IOException
[ https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-3388: - Attachment: HDFS-3388.HDFS-3092.patch The new patch addressed Nicholas' comments. > GetJournalEditServlet should catch more exceptions, not just IOException > > > Key: HDFS-3388 > URL: https://issues.apache.org/jira/browse/HDFS-3388 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, name-node >Reporter: Brandon Li >Assignee: Brandon Li > Attachments: HDFS-3388.HDFS-3092.patch, HDFS-3388.HDFS-3092.patch, > HDFS-3388.HDFS-3092.patch > > > GetJournalEditServlet has the same problem as that of GetImageServlet > (HDFS-3330). It should be fixed in the same way. Also need to make > CheckpointFaultInjector visible for journal service tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3335) check for edit log corruption at the end of the log
[ https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272919#comment-13272919 ] Hadoop QA commented on HDFS-3335: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526442/HDFS-3335-b1.004.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2417//console This message is automatically generated. > check for edit log corruption at the end of the log > --- > > Key: HDFS-3335 > URL: https://issues.apache.org/jira/browse/HDFS-3335 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, > HDFS-3335-b1.003.patch, HDFS-3335-b1.004.patch, HDFS-3335.001.patch, > HDFS-3335.002.patch, HDFS-3335.003.patch, HDFS-3335.004.patch, > HDFS-3335.005.patch, HDFS-3335.006.patch, HDFS-3335.007.patch > > > Even after encountering an OP_INVALID, we should check the end of the edit > log to make sure that it contains no more edits. > This will catch things like rare race conditions or log corruptions that > would otherwise remain undetected. They will go from being silent data loss > scenarios to being cases that we can detect and fix. > Using recovery mode, we can choose to ignore the end of the log if necessary.
[jira] [Commented] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from
[ https://issues.apache.org/jira/browse/HDFS-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272916#comment-13272916 ] Eli Collins commented on HDFS-3404: --- The approach - have the NN determine the hostname of the checkpointer from the request rather than have it passed as a parameter - seems more sane to me. - This change needs to be made to the 2NN as well, right? Or were you thinking just the SBN? - NetUtils#isIpAddress actually checks ip:port, and it seems like we'll always have an IP here. Perhaps better to use InetAddresses.isInetAddress. > Make putImage in GetImageServlet infer remote address to fetch from > --- > > Key: HDFS-3404 > URL: https://issues.apache.org/jira/browse/HDFS-3404 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3404.patch > > > As it stands, daemons which perform checkpointing must determine their own > address on which they can be reached, so that the NN which they checkpoint > against knows what address to fetch a merged fsimage from. This causes > problems if, for example, the daemon performing checkpointing binds to > 0.0.0.0, and thus can't be sure of what address the NN can reach it at.
[jira] [Updated] (HDFS-3031) HA: Error (failed to close file) when uploading large file + kill active NN + manual failover
[ https://issues.apache.org/jira/browse/HDFS-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3031: -- Attachment: hdfs-3031.txt New rev fixes the issue with append(): the problem is that the client doesn't send any previous block when appending to a file, when the append starts exactly at a block boundary. I attempted to make the client cleaner, but DFSOutputStream is a hairball. Without a substantial reworking of that, it was cleaner to do this on the server side. > HA: Error (failed to close file) when uploading large file + kill active NN + > manual failover > - > > Key: HDFS-3031 > URL: https://issues.apache.org/jira/browse/HDFS-3031 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 0.24.0 >Reporter: Stephen Chu >Assignee: Todd Lipcon > Attachments: hdfs-3031.txt, hdfs-3031.txt, styx01_killNNfailover, > styx01_uploadLargeFile > > > I executed section 3.4 of Todd's HA test plan. > https://issues.apache.org/jira/browse/HDFS-1623 > 1. A large file upload is started. > 2. While the file is being uploaded, the administrator kills the first NN and > performs a failover. > 3. After the file finishes being uploaded, it is verified for correct length > and contents. > For the test, I have a vm_template styx01:/home/schu/centos64-2-5.5.qcow2. > styx01 hosted the active NN and styx02 hosted the standby NN. > In the log files I attached, you can see that on styx01 I began file upload. > hadoop fs -put centos64-2.5.5.qcow2 > After waiting several seconds, I kill -9'd the active NN on styx01 and > manually failed over to the NN on styx02. I ran into exception below. (rest > of the stacktrace in the attached file styx01_uploadLargeFile) > 12/02/29 14:12:52 WARN retry.RetryInvocationHandler: A failover has occurred > since the start of this method invocation attempt. 
> put: Failed on local exception: java.io.EOFException; Host Details : local > host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: > ""styx01.sf.cloudera.com"\ > :12020; > 12/02/29 14:12:52 ERROR hdfs.DFSClient: Failed to close file > /user/schu/centos64-2-5.5.qcow2._COPYING_ > java.io.IOException: Failed on local exception: java.io.EOFException; Host > Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination > host is: ""styx01.\ > sf.cloudera.com":12020; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731) > at org.apache.hadoop.ipc.Client.call(Client.java:1145) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:188) > at $Proxy9.addBlock(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:302) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83) > at $Proxy10.addBlock(Unknown Source) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1097) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:973) > at > org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455) > Caused by: java.io.EOFException > at java.io.DataInputStream.readInt(DataInputStream.java:375) > at > org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:830) > at org.apache.hadoop.ipc.Client$Connection.run(Client.java:762) -- This message is automatically 
generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests
[ https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272909#comment-13272909 ] Hadoop QA commented on HDFS-3401: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526427/hdfs-3401.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 11 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2415//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2415//console This message is automatically generated. > Cleanup DatanodeDescriptor creation in the tests > > > Key: HDFS-3401 > URL: https://issues.apache.org/jira/browse/HDFS-3401 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, test >Affects Versions: 2.0.0 >Reporter: Eli Collins >Assignee: Eli Collins > Attachments: hdfs-3401.txt > > > Like HDFS-3230 but for DatanodeDescriptor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3134) Harden edit log loader against malformed or malicious input
[ https://issues.apache.org/jira/browse/HDFS-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272902#comment-13272902 ] Hudson commented on HDFS-3134: -- Integrated in Hadoop-Common-trunk-Commit #2224 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2224/]) HDFS-3134. harden edit log loader against malformed or malicious input. Contributed by Colin Patrick McCabe (Revision 1336943) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336943 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenIdentifier.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java > Harden edit log loader against malformed or malicious input > --- > > Key: HDFS-3134 > URL: https://issues.apache.org/jira/browse/HDFS-3134 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 2.0.0 > > Attachments: HDFS-3134.001.patch, HDFS-3134.002.patch, > HDFS-3134.003.patch, HDFS-3134.004.patch, HDFS-3134.005.patch, > HDFS-3134.006.patch, HDFS-3134.007.patch, HDFS-3134.009.patch > > > Currently, the edit log loader does not handle bad or malicious input > sensibly. > We can often cause OutOfMemory exceptions, null pointer exceptions, or other > unchecked exceptions to be thrown by feeding the edit log loader bad input. > In some environments, an out of memory error can cause the JVM process to be > terminated. > It's clear that we want these exceptions to be thrown as IOException instead > of as unchecked exceptions. 
We also want to avoid out of memory situations. > The main task here is to put a sensible upper limit on the lengths of arrays > and strings we allocate on command. The other task is to try to avoid > creating unchecked exceptions (by dereferencing potentially-NULL pointers, > for example). Instead, we should verify ahead of time and give a more > sensible error message that reflects the problem with the input. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3388) GetJournalEditServlet should catch more exceptions, not just IOException
[ https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272896#comment-13272896 ] Tsz Wo (Nicholas), SZE commented on HDFS-3388: -- - Since GetJournalEditServletFaultInjector is an inner class, let's simply call it FaultInjector. - GetJournalEditServletFaultInjector.getInstance() is not used (and it should be static if you want to use it.) - change {code} new String(path1.toString() + "/current") {code} to {code} path1 + "/current" {code} > GetJournalEditServlet should catch more exceptions, not just IOException > > > Key: HDFS-3388 > URL: https://issues.apache.org/jira/browse/HDFS-3388 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha, name-node >Reporter: Brandon Li >Assignee: Brandon Li > Attachments: HDFS-3388.HDFS-3092.patch, HDFS-3388.HDFS-3092.patch > > > GetJournalEditServlet has the same problem as that of GetImageServlet > (HDFS-3330). It should be fixed in the same way. Also need to make > CheckpointFaultInjector visible for journal service tests.
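The three review points on HDFS-3388 might look roughly like this once applied. This is a sketch under stated assumptions: the outer class is a stand-in for the servlet, and the hook method beforeServingEdits and the setter setInstanceForTesting are hypothetical names that do not come from the patch.

```java
/** Hypothetical stand-in for the servlet class; not the real GetJournalEditServlet. */
final class JournalServletSketch {

    /** Nested class: the outer class name already gives the context, so a short name suffices. */
    static class FaultInjector {
        private static FaultInjector instance = new FaultInjector();

        /** Static accessor, so callers do not need an existing instance to obtain one. */
        static FaultInjector getInstance() {
            return instance;
        }

        /** Assumed test-only setter that swaps in an injector which throws or counts calls. */
        static void setInstanceForTesting(FaultInjector replacement) {
            instance = replacement;
        }

        /** Hypothetical hook; the production implementation does nothing. */
        void beforeServingEdits() {
        }
    }

    /** String concatenation already calls toString(), so no explicit new String(...) is needed. */
    static String currentDir(Object path1) {
        return path1 + "/current";
    }
}
```

A test would install a subclass through the setter, exercise the servlet path, and restore the default injector afterwards.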
[jira] [Commented] (HDFS-3134) Harden edit log loader against malformed or malicious input
[ https://issues.apache.org/jira/browse/HDFS-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272895#comment-13272895 ] Hudson commented on HDFS-3134: -- Integrated in Hadoop-Hdfs-trunk-Commit #2299 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2299/]) HDFS-3134. harden edit log loader against malformed or malicious input. Contributed by Colin Patrick McCabe (Revision 1336943) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336943 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenIdentifier.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java > Harden edit log loader against malformed or malicious input > --- > > Key: HDFS-3134 > URL: https://issues.apache.org/jira/browse/HDFS-3134 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 2.0.0 > > Attachments: HDFS-3134.001.patch, HDFS-3134.002.patch, > HDFS-3134.003.patch, HDFS-3134.004.patch, HDFS-3134.005.patch, > HDFS-3134.006.patch, HDFS-3134.007.patch, HDFS-3134.009.patch > > > Currently, the edit log loader does not handle bad or malicious input > sensibly. > We can often cause OutOfMemory exceptions, null pointer exceptions, or other > unchecked exceptions to be thrown by feeding the edit log loader bad input. > In some environments, an out of memory error can cause the JVM process to be > terminated. > It's clear that we want these exceptions to be thrown as IOException instead > of as unchecked exceptions. 
We also want to avoid out of memory situations. > The main task here is to put a sensible upper limit on the lengths of arrays > and strings we allocate on command. The other task is to try to avoid > creating unchecked exceptions (by dereferencing potentially-NULL pointers, > for example). Instead, we should verify ahead of time and give a more > sensible error message that reflects the problem with the input. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-744) Support hsync in HDFS
[ https://issues.apache.org/jira/browse/HDFS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HDFS-744: --- Attachment: HDFS-744-trunk-v2.patch Patch that I tested against HBase. (I post the required HBase changes on the linked jira.) HBase starts up, and I can flush and compact tables. I verified via debugger that the sync path is correctly triggered. *Please* have a look. For users like us (Salesforce.com) this is an important data safety feature. > Support hsync in HDFS > - > > Key: HDFS-744 > URL: https://issues.apache.org/jira/browse/HDFS-744 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Hairong Kuang > Attachments: HDFS-744-trunk-v2.patch, HDFS-744-trunk.patch, > hdfs-744-v2.txt, hdfs-744-v3.txt, hdfs-744.txt > > > HDFS-731 implements hsync by default as hflush. As described in HADOOP-6313, > the real expected semantics should be "flushes out to all replicas and all > replicas have done posix fsync equivalent - ie the OS has flushed it to the > disk device (but the disk may have it in its cache)." This jira aims to > implement the expected behaviour.
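For readers unfamiliar with the "posix fsync equivalent" mentioned above, the following local-file sketch shows the same idea on a single machine using the standard `FileChannel.force` call. It does not show the HDFS client or DataNode code path; the class `LocalSyncSketch` and the method `writeDurably` are hypothetical names.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical local-file illustration; not the HDFS-744 implementation.
final class LocalSyncSketch {
  /** Appends data and returns only after the OS was asked to flush it to the device. */
  static void writeDurably(Path file, byte[] data) throws IOException {
    try (FileChannel ch = FileChannel.open(file,
        StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
      ByteBuffer buf = ByteBuffer.wrap(data);
      while (buf.hasRemaining()) {
        ch.write(buf); // data may still sit in OS buffers after this call
      }
      // force(true) requests that file content and metadata be written to storage,
      // which is the local analogue of an fsync on each replica.
      ch.force(true);
    }
  }
}
```

As the quoted semantics note, the disk itself may still hold the data in its own cache after the call returns.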
[jira] [Updated] (HDFS-3134) Harden edit log loader against malformed or malicious input
[ https://issues.apache.org/jira/browse/HDFS-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3134: -- Target Version/s: (was: 2.0.0) Fix Version/s: 2.0.0 Issue Type: Improvement (was: Bug) Hadoop Flags: Reviewed Summary: Harden edit log loader against malformed or malicious input (was: harden edit log loader against malformed or malicious input) I've committed this and merged to branch-2, thanks Colin! > Harden edit log loader against malformed or malicious input > --- > > Key: HDFS-3134 > URL: https://issues.apache.org/jira/browse/HDFS-3134 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Fix For: 2.0.0 > > Attachments: HDFS-3134.001.patch, HDFS-3134.002.patch, > HDFS-3134.003.patch, HDFS-3134.004.patch, HDFS-3134.005.patch, > HDFS-3134.006.patch, HDFS-3134.007.patch, HDFS-3134.009.patch > > > Currently, the edit log loader does not handle bad or malicious input > sensibly. > We can often cause OutOfMemory exceptions, null pointer exceptions, or other > unchecked exceptions to be thrown by feeding the edit log loader bad input. > In some environments, an out of memory error can cause the JVM process to be > terminated. > It's clear that we want these exceptions to be thrown as IOException instead > of as unchecked exceptions. We also want to avoid out of memory situations. > The main task here is to put a sensible upper limit on the lengths of arrays > and strings we allocate on command. The other task is to try to avoid > creating unchecked exceptions (by dereferencing potentially-NULL pointers, > for example). Instead, we should verify ahead of time and give a more > sensible error message that reflects the problem with the input. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HDFS-3134) harden edit log loader against malformed or malicious input
[ https://issues.apache.org/jira/browse/HDFS-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272879#comment-13272879 ] Eli Collins commented on HDFS-3134: --- +1 looks good > harden edit log loader against malformed or malicious input > --- > > Key: HDFS-3134 > URL: https://issues.apache.org/jira/browse/HDFS-3134 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-3134.001.patch, HDFS-3134.002.patch, > HDFS-3134.003.patch, HDFS-3134.004.patch, HDFS-3134.005.patch, > HDFS-3134.006.patch, HDFS-3134.007.patch, HDFS-3134.009.patch > > > Currently, the edit log loader does not handle bad or malicious input > sensibly. > We can often cause OutOfMemory exceptions, null pointer exceptions, or other > unchecked exceptions to be thrown by feeding the edit log loader bad input. > In some environments, an out of memory error can cause the JVM process to be > terminated. > It's clear that we want these exceptions to be thrown as IOException instead > of as unchecked exceptions. We also want to avoid out of memory situations. > The main task here is to put a sensible upper limit on the lengths of arrays > and strings we allocate on command. The other task is to try to avoid > creating unchecked exceptions (by dereferencing potentially-NULL pointers, > for example). Instead, we should verify ahead of time and give a more > sensible error message that reflects the problem with the input. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3163) TestHDFSCLI.testAll fails if the user name is not all lowercase
[ https://issues.apache.org/jira/browse/HDFS-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-3163: - Assignee: Brandon Li Status: Patch Available (was: Open) > TestHDFSCLI.testAll fails if the user name is not all lowercase > --- > > Key: HDFS-3163 > URL: https://issues.apache.org/jira/browse/HDFS-3163 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Reporter: Brandon Li >Assignee: Brandon Li >Priority: Trivial > Attachments: HDFS-3163.patch > > > In the test resource file testHDFSConf.xml, the test comparators expect the user > name to be all lowercase. > If the user running the test has an uppercase letter in the username (e.g., Brandon > instead of brandon), many RegexpComparator tests will fail. The following is > one example: > {noformat} > > RegexpComparator > ^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( > )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( > )*/file1 > > {noformat}
[jira] [Updated] (HDFS-3163) TestHDFSCLI.testAll fails if the user name is not all lowercase
[ https://issues.apache.org/jira/browse/HDFS-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-3163: - Attachment: HDFS-3163.patch Changed testHDFSConf.xml to support user names containing uppercase letters and digits. > TestHDFSCLI.testAll fails if the user name is not all lowercase > --- > > Key: HDFS-3163 > URL: https://issues.apache.org/jira/browse/HDFS-3163 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Reporter: Brandon Li >Priority: Trivial > Attachments: HDFS-3163.patch > > > In the test resource file testHDFSConf.xml, the test comparators expect the user > name to be all lowercase. > If the user running the test has an uppercase letter in the username (e.g., Brandon > instead of brandon), many RegexpComparator tests will fail. The following is > one example: > {noformat} > > RegexpComparator > ^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( > )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( > )*/file1 > > {noformat}
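The failure described above can be reproduced with the quoted expression in isolation. In the sketch below, the lowercase-only pattern is the one from the issue description; the relaxed character class `[a-zA-Z0-9]*` is an assumption about what "uppercase and number" support might look like and may not match the attached patch. The class and constant names are hypothetical, and the sample listing line is made up for the test.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the comparator regex; not code from HDFS-3163.patch.
final class UserNameRegexSketch {
  static final String PREFIX = "^-rw-r--r--( )*1( )*";
  static final String SUFFIX =
      "( )*supergroup( )*0( )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( )*/file1";

  // Pattern quoted in the issue: the user name may contain only lowercase letters.
  static final Pattern LOWERCASE_ONLY = Pattern.compile(PREFIX + "[a-z]*" + SUFFIX);

  // Assumed relaxation: also accept uppercase letters and digits.
  static final Pattern RELAXED = Pattern.compile(PREFIX + "[a-zA-Z0-9]*" + SUFFIX);

  static boolean matches(Pattern p, String line) {
    return p.matcher(line).find();
  }
}
```

With a listing line whose owner is `Brandon`, `LOWERCASE_ONLY` does not match while `RELAXED` does; with `brandon`, both match.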
[jira] [Updated] (HDFS-3335) check for edit log corruption at the end of the log
[ https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3335: --- Attachment: HDFS-3335-b1.004.patch For the branch-1 patch: Ignore corruption after the sentinel as long as it takes place in the last 2 megabytes of the log. Test this exception. > check for edit log corruption at the end of the log > --- > > Key: HDFS-3335 > URL: https://issues.apache.org/jira/browse/HDFS-3335 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, > HDFS-3335-b1.003.patch, HDFS-3335-b1.004.patch, HDFS-3335.001.patch, > HDFS-3335.002.patch, HDFS-3335.003.patch, HDFS-3335.004.patch, > HDFS-3335.005.patch, HDFS-3335.006.patch, HDFS-3335.007.patch > > > Even after encountering an OP_INVALID, we should check the end of the edit > log to make sure that it contains no more edits. > This will catch things like rare race conditions or log corruptions that > would otherwise remain undetected. They will go from being silent data loss > scenarios to being cases that we can detect and fix. > Using recovery mode, we can choose to ignore the end of the log if necessary.
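The check described above, including the branch-1 tolerance for a corrupt tail, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the patch itself: the names `TrailingEditsCheck` and `verifyNoTrailingEdits` are hypothetical, the whole log is assumed to fit in a byte array, and padding is assumed to consist of `0xFF` (the OP_INVALID sentinel) or `0x00` bytes.

```java
import java.io.IOException;

// Hypothetical sketch; the actual HDFS-3335 patches may work differently.
final class TrailingEditsCheck {
  /**
   * Verifies that everything from the sentinel onward is padding, except for the
   * final toleratedTailBytes bytes, which are not inspected.
   */
  static void verifyNoTrailingEdits(byte[] log, int sentinelOffset, int toleratedTailBytes)
      throws IOException {
    // Bytes at or beyond 'end' fall into the tolerated tail and are skipped.
    int end = Math.max(sentinelOffset, log.length - toleratedTailBytes);
    for (int i = sentinelOffset; i < end; i++) {
      byte b = log[i];
      // Assumption: padding is 0xFF or 0x00; any other byte suggests a lost edit.
      if (b != (byte) 0xFF && b != 0) {
        throw new IOException("Unexpected non-padding byte at offset " + i
            + " after end-of-log sentinel at offset " + sentinelOffset);
      }
    }
  }
}
```

Passing a tolerance of zero corresponds to the strict behaviour in the issue description; a non-zero tolerance corresponds to the relaxed behaviour described for branch-1.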
[jira] [Commented] (HDFS-3369) change variable names referring to inode in blockmanagement to more appropriate
[ https://issues.apache.org/jira/browse/HDFS-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272858#comment-13272858 ] Hudson commented on HDFS-3369: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2241 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2241/]) HDFS-3369. Rename {get|set|add}INode(..) methods in BlockManager and BlocksMap to {get|set|add}BlockCollection(..). Contributed by John George (Revision 1336909) Result = ABORTED szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336909 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFileUnderConstruction.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeJspHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyRaid.java * /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/test/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockPlacementPolicyRaid.java > change variable names referring to inode in blockmanagement to more > appropriate > --- > > Key: HDFS-3369 > URL: https://issues.apache.org/jira/browse/HDFS-3369 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: name-node >Affects Versions: 2.0.0, 3.0.0 >Reporter: John George >Assignee: John George >Priority: Minor > Fix For: 2.0.0 > > Attachments: HDFS-3369.patch > > > We should rename BlocksMap.getINode(..) and, in addition, the local variable > names such as fileInode to match 'block collection' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests
[ https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272855#comment-13272855 ] Aaron T. Myers commented on HDFS-3401: -- Patch looks good to me. +1 pending Jenkins. > Cleanup DatanodeDescriptor creation in the tests > > > Key: HDFS-3401 > URL: https://issues.apache.org/jira/browse/HDFS-3401 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, test >Affects Versions: 2.0.0 >Reporter: Eli Collins >Assignee: Eli Collins > Attachments: hdfs-3401.txt > > > Like HDFS-3230 but for DatanodeDescriptor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from
[ https://issues.apache.org/jira/browse/HDFS-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3404: - Attachment: HDFS-3404.patch Here's an initial patch to make sure folks are OK with the approach. I'm still mulling over how best to write tests for this, which is a tad difficult on a single-node machine. I tested this manually by setting up an HA setup where each NN itself binds to 0.0.0.0, but has actual addresses for the other NN. It worked as expected. > Make putImage in GetImageServlet infer remote address to fetch from > --- > > Key: HDFS-3404 > URL: https://issues.apache.org/jira/browse/HDFS-3404 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3404.patch > > > As it stands, daemons which perform checkpointing must determine their own > address on which they can be reached, so that the NN which they checkpoint > against knows what address to fetch a merged fsimage from. This causes > problems if, for example, the daemon performing checkpointing binds to > 0.0.0.0, and thus can't be sure of what address the NN can reach it at. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
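One way to picture the approach: when the checkpointing daemon advertises a wildcard bind address, the receiving NN falls back to the address it actually observed on the incoming request. The sketch below is a simplified, hypothetical rendering of that idea and is not taken from HDFS-3404.patch. The names `FetchAddressSketch` and `chooseFetchHost` are invented, and the wildcard check is a plain string comparison chosen to avoid DNS lookups.

```java
// Hypothetical sketch of inferring the fetch address; not the HDFS-3404 patch.
final class FetchAddressSketch {
  /**
   * Returns the host to fetch the merged fsimage from.
   *
   * @param advertisedHost host the checkpointing daemon reported for itself
   * @param observedRemoteAddr source address seen on the incoming request
   */
  static String chooseFetchHost(String advertisedHost, String observedRemoteAddr) {
    // Simplification: only the common textual wildcard forms are recognized.
    boolean wildcard = advertisedHost == null
        || advertisedHost.isEmpty()
        || advertisedHost.equals("0.0.0.0")
        || advertisedHost.equals("::");
    return wildcard ? observedRemoteAddr : advertisedHost;
  }
}
```

In a servlet, the observed address would typically come from `HttpServletRequest.getRemoteAddr()`, which keeps the daemon from having to work out its own reachable address.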
[jira] [Created] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from
Aaron T. Myers created HDFS-3404: Summary: Make putImage in GetImageServlet infer remote address to fetch from Key: HDFS-3404 URL: https://issues.apache.org/jira/browse/HDFS-3404 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Aaron T. Myers Assignee: Aaron T. Myers As it stands, daemons which perform checkpointing must determine their own address on which they can be reached, so that the NN which they checkpoint against knows what address to fetch a merged fsimage from. This causes problems if, for example, the daemon performing checkpointing binds to 0.0.0.0, and thus can't be sure of what address the NN can reach it at. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3391: -- Summary: TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing (was: Failing tests in branch-2) > TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing > --- > > Key: HDFS-3391 > URL: https://issues.apache.org/jira/browse/HDFS-3391 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Arun C Murthy >Assignee: Todd Lipcon >Priority: Critical > > Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< > FAILURE! > -- > Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover > Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec > <<< FAILURE! > -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3026) HA: Handle failure during HA state transition
[ https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272847#comment-13272847 ] Hadoop QA commented on HDFS-3026: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526418/HDFS-3026.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2414//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2414//console This message is automatically generated. > HA: Handle failure during HA state transition > - > > Key: HDFS-3026 > URL: https://issues.apache.org/jira/browse/HDFS-3026 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, name-node >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, > HDFS-3026.patch, HDFS-3026.patch > > > This JIRA is to address a TODO in NameNode about handling the possibility of > an incomplete HA state transition. -- This message is automatically generated by JIRA. 
[jira] [Moved] (HDFS-3403) SecondaryNamenode doesn't start up in secure cluster
[ https://issues.apache.org/jira/browse/HDFS-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony moved MAPREDUCE-4245 to HDFS-3403: --- Component/s: (was: security) security Fix Version/s: (was: 0.22.1) 0.22.1 Affects Version/s: (was: 0.22.0) 0.22.0 Key: HDFS-3403 (was: MAPREDUCE-4245) Project: Hadoop HDFS (was: Hadoop Map/Reduce) > SecondaryNamenode doesn't start up in secure cluster > > > Key: HDFS-3403 > URL: https://issues.apache.org/jira/browse/HDFS-3403 > Project: Hadoop HDFS > Issue Type: Task > Components: security >Affects Versions: 0.22.0 >Reporter: Benoy Antony >Assignee: Benoy Antony >Priority: Minor > Fix For: 0.22.1 > > Attachments: incorrect-sn-principal.patch > > > SN fails to startup due to access control error. This is an authorization > issue and not authentication issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3400) DNs should be able start with jsvc even if security is disabled
[ https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272843#comment-13272843 ] Hadoop QA commented on HDFS-3400: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526406/HDFS-3400.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2413//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2413//console This message is automatically generated. > DNs should be able start with jsvc even if security is disabled > --- > > Key: HDFS-3400 > URL: https://issues.apache.org/jira/browse/HDFS-3400 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, scripts >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3400.patch > > > Currently if one tries to start a DN with security disabled (via > hadoop.security.authentication = "simple" in the configs), but JSVC is > correctly configured, the DN will refuse to start. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HDFS-3391) Failing tests in branch-2
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272840#comment-13272840 ] Tsz Wo (Nicholas), SZE commented on HDFS-3391: -- It does fail in trunk as in [build #2397|https://builds.apache.org/job/PreCommit-HDFS-Build/2397//testReport/org.apache.hadoop.hdfs.server.namenode.ha/TestPipelinesFailover/testLeaseRecoveryAfterFailover/]. The error is the same as Eli got. > Failing tests in branch-2 > - > > Key: HDFS-3391 > URL: https://issues.apache.org/jira/browse/HDFS-3391 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Arun C Murthy >Assignee: Todd Lipcon >Priority: Critical > > Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< > FAILURE! > -- > Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover > Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec > <<< FAILURE! > -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3026) HA: Handle failure during HA state transition
[ https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272838#comment-13272838 ] Eli Collins commented on HDFS-3026: --- +1 looks great > HA: Handle failure during HA state transition > - > > Key: HDFS-3026 > URL: https://issues.apache.org/jira/browse/HDFS-3026 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, name-node >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, > HDFS-3026.patch, HDFS-3026.patch > > > This JIRA is to address a TODO in NameNode about handling the possibility of > an incomplete HA state transition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Moved] (HDFS-3402) Fix hdfs script for for secure datanodes
[ https://issues.apache.org/jira/browse/HDFS-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony moved HADOOP-8376 to HDFS-3402: Component/s: (was: security) security Fix Version/s: (was: 0.22.1) 0.22.1 Affects Version/s: (was: 0.22.0) 0.22.0 Key: HDFS-3402 (was: HADOOP-8376) Project: Hadoop HDFS (was: Hadoop Common) > Fix hdfs script for for secure datanodes > > > Key: HDFS-3402 > URL: https://issues.apache.org/jira/browse/HDFS-3402 > Project: Hadoop HDFS > Issue Type: Task > Components: security >Affects Versions: 0.22.0 >Reporter: Benoy Antony >Assignee: Benoy Antony >Priority: Minor > Fix For: 0.22.1 > > Attachments: hdfs-jsvc.patch > > > Starting secure datanode gives out the following error : > Error thrown : > 09/04/2012 12:09:30 2524 jsvc error: Invalid option -server > 09/04/2012 12:09:30 2524 jsvc error: Cannot parse command line arguments -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3400) DNs should be able start with jsvc even if security is disabled
[ https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272834#comment-13272834 ] Eli Collins commented on HDFS-3400: --- +1 looks good to me as well > DNs should be able start with jsvc even if security is disabled > --- > > Key: HDFS-3400 > URL: https://issues.apache.org/jira/browse/HDFS-3400 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, scripts >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3400.patch > > > Currently if one tries to start a DN with security disabled (via > hadoop.security.authentication = "simple" in the configs), but JSVC is > correctly configured, the DN will refuse to start. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3049) During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt
[ https://issues.apache.org/jira/browse/HDFS-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3049: --- Attachment: HDFS-3049.005.against3335.patch * patch against 3335 > During the normal loading NN startup process, fall back on a different > EditLog if we see one that is corrupt > > > Key: HDFS-3049 > URL: https://issues.apache.org/jira/browse/HDFS-3049 > Project: Hadoop HDFS > Issue Type: New Feature > Components: name-node >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe >Priority: Minor > Attachments: HDFS-3049.001.patch, HDFS-3049.002.patch, > HDFS-3049.003.patch, HDFS-3049.005.against3335.patch > > > During the NameNode startup process, we load an image, and then apply edit > logs to it until we believe that we have all the latest changes. > Unfortunately, if there is an I/O error while reading any of these files, in > most cases, we simply abort the startup process. We should try harder to > locate a readable edit log and/or image file. > *There are three main use cases for this feature:* > 1. If the operating system does not honor fsync (usually due to a > misconfiguration), a file may end up in an inconsistent state. > 2. In certain older releases where we did not use fallocate() or similar to > pre-reserve blocks, a disk full condition may cause a truncated log in one > edit directory. > 3. There may be a bug in HDFS which results in some of the data directories > receiving corrupt data, but not all. This is the least likely use case. > *Proposed changes to normal NN startup* > * We should try a different FSImage if we can't load the first one we try. > * We should examine other FSEditLogs if we can't load the first one(s) we try. > * We should fail if we can't find EditLogs that would bring us up to what we > believe is the latest transaction ID. 
> Proposed changes to recovery mode NN startup: > we should list out all the available storage directories and allow the > operator to select which one he wants to use. > Something like this: > {code} > Multiple storage directories found. > 1. /foo/bar > edits__curent__XYZ size:213421345 md5:2345345 > image size:213421345 md5:2345345 > 2. /foo/baz > edits__curent__XYZ size:213421345 md5:2345345345 > image size:213421345 md5:2345345 > Which one would you like to use? (1/2) > {code} > As usual in recovery mode, we want to be flexible about error handling. In > this case, this means that we should NOT fail if we can't find EditLogs that > would bring us up to what we believe is the latest transaction ID. > *Not addressed by this feature* > This feature will not address the case where an attempt to access the > NameNode name directory or directories hangs because of an I/O error. This > may happen, for example, when trying to load an image from a hard-mounted NFS > directory, when the NFS server has gone away. Just as now, the operator will > have to notice this problem and take steps to correct it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
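The proposed normal-startup behaviour (try another copy when one fails, and fail only if no copy reaches the expected transaction ID) can be sketched as a simple fallback loop. This is an illustrative outline only: `FallbackLoaderSketch`, `Source`, and `loadFirstReadable` are hypothetical names, and each candidate is reduced to a callback that returns the last transaction ID it could load.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical sketch of the fallback idea; not the HDFS-3049 patch.
final class FallbackLoaderSketch {
  /** One candidate edit log (or image); load() returns the last txid it could read. */
  interface Source {
    long load() throws IOException;
  }

  /** Tries candidates in order and returns the txid from the first adequate one. */
  static long loadFirstReadable(List<Source> candidates, long expectedLastTxId)
      throws IOException {
    IOException lastFailure = null;
    for (Source candidate : candidates) {
      try {
        long txid = candidate.load();
        if (txid >= expectedLastTxId) {
          return txid;
        }
        // Readable but too short: remember why and keep looking.
        lastFailure = new IOException("Candidate stops at txid " + txid
            + ", expected " + expectedLastTxId);
      } catch (IOException e) {
        // Corrupt or unreadable: fall back to the next candidate.
        lastFailure = e;
      }
    }
    throw new IOException("No readable edit log reaches txid " + expectedLastTxId,
        lastFailure);
  }
}
```

Recovery mode, as described above, would relax the final check and let the operator choose a directory instead of failing.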
[jira] [Commented] (HDFS-3391) Failing tests in branch-2
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272832#comment-13272832 ] Eli Collins commented on HDFS-3391: --- Forgot to mention, I only see TestPipelinesFailover fail on branch-2-alpha, not trunk. > Failing tests in branch-2 > - > > Key: HDFS-3391 > URL: https://issues.apache.org/jira/browse/HDFS-3391 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Arun C Murthy >Assignee: Todd Lipcon >Priority: Critical > > Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< > FAILURE! > -- > Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover > Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec > <<< FAILURE! > -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3049) During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt
[ https://issues.apache.org/jira/browse/HDFS-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3049: --- Status: Open (was: Patch Available) > During the normal loading NN startup process, fall back on a different > EditLog if we see one that is corrupt > > > Key: HDFS-3049 > URL: https://issues.apache.org/jira/browse/HDFS-3049 > Project: Hadoop HDFS > Issue Type: New Feature > Components: name-node >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe >Priority: Minor > Attachments: HDFS-3049.001.patch, HDFS-3049.002.patch, > HDFS-3049.003.patch > > > During the NameNode startup process, we load an image, and then apply edit > logs to it until we believe that we have all the latest changes. > Unfortunately, if there is an I/O error while reading any of these files, in > most cases, we simply abort the startup process. We should try harder to > locate a readable edit log and/or image file. > *There are three main use cases for this feature:* > 1. If the operating system does not honor fsync (usually due to a > misconfiguration), a file may end up in an inconsistent state. > 2. In certain older releases where we did not use fallocate() or similar to > pre-reserve blocks, a disk full condition may cause a truncated log in one > edit directory. > 3. There may be a bug in HDFS which results in some of the data directories > receiving corrupt data, but not all. This is the least likely use case. > *Proposed changes to normal NN startup* > * We should try a different FSImage if we can't load the first one we try. > * We should examine other FSEditLogs if we can't load the first one(s) we try. > * We should fail if we can't find EditLogs that would bring us up to what we > believe is the latest transaction ID. > Proposed changes to recovery mode NN startup: > we should list out all the available storage directories and allow the > operator to select which one he wants to use. 
> Something like this: > {code} > Multiple storage directories found. > 1. /foo/bar > edits__curent__XYZ size:213421345 md5:2345345 > image size:213421345 md5:2345345 > 2. /foo/baz > edits__curent__XYZ size:213421345 md5:2345345345 > image size:213421345 md5:2345345 > Which one would you like to use? (1/2) > {code} > As usual in recovery mode, we want to be flexible about error handling. In > this case, this means that we should NOT fail if we can't find EditLogs that > would bring us up to what we believe is the latest transaction ID. > *Not addressed by this feature* > This feature will not address the case where an attempt to access the > NameNode name directory or directories hangs because of an I/O error. This > may happen, for example, when trying to load an image from a hard-mounted NFS > directory, when the NFS server has gone away. Just as now, the operator will > have to notice this problem and take steps to correct it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3400) DNs should be able start with jsvc even if security is disabled
[ https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272827#comment-13272827 ] Jakob Homan commented on HDFS-3400: --- +1 > DNs should be able start with jsvc even if security is disabled > --- > > Key: HDFS-3400 > URL: https://issues.apache.org/jira/browse/HDFS-3400 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, scripts >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3400.patch > > > Currently if one tries to start a DN with security disabled (via > hadoop.security.authentication = "simple" in the configs), but JSVC is > correctly configured, the DN will refuse to start. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3369) change variable names referring to inode in blockmanagement to more appropriate
[ https://issues.apache.org/jira/browse/HDFS-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272825#comment-13272825 ] Hudson commented on HDFS-3369: -- Integrated in Hadoop-Common-trunk-Commit #2223 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2223/]) HDFS-3369. Rename {get|set|add}INode(..) methods in BlockManager and BlocksMap to {get|set|add}BlockCollection(..). Contributed by John George (Revision 1336909) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336909 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFileUnderConstruction.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeJspHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyRaid.java * /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/test/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockPlacementPolicyRaid.java > change variable names referring to inode in blockmanagement to more > appropriate > --- > > Key: HDFS-3369 > URL: https://issues.apache.org/jira/browse/HDFS-3369 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: name-node >Affects Versions: 2.0.0, 3.0.0 >Reporter: John George >Assignee: John George >Priority: Minor > Fix For: 2.0.0 > > Attachments: HDFS-3369.patch > > > We should rename BlocksMap.getINode(..) and, in addition, the local variable > names such as fileInode to match 'block collection' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3369) change variable names referring to inode in blockmanagement to more appropriate
[ https://issues.apache.org/jira/browse/HDFS-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272815#comment-13272815 ] Hudson commented on HDFS-3369: -- Integrated in Hadoop-Hdfs-trunk-Commit #2298 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2298/]) HDFS-3369. Rename {get|set|add}INode(..) methods in BlockManager and BlocksMap to {get|set|add}BlockCollection(..). Contributed by John George (Revision 1336909) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336909 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFileUnderConstruction.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeJspHelper.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java * /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyRaid.java * /hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/test/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockPlacementPolicyRaid.java > change variable names referring to inode in blockmanagement to more > appropriate > --- > > Key: HDFS-3369 > URL: https://issues.apache.org/jira/browse/HDFS-3369 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: name-node >Affects Versions: 2.0.0, 3.0.0 >Reporter: John George >Assignee: John George >Priority: Minor > Fix For: 2.0.0 > > Attachments: HDFS-3369.patch > > > We should rename BlocksMap.getINode(..) and, in addition, the local variable > names such as fileInode to match 'block collection' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3335) check for edit log corruption at the end of the log
[ https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272808#comment-13272808 ] Hadoop QA commented on HDFS-3335: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526396/HDFS-3335.007.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified test files. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2411//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2411//console This message is automatically generated. > check for edit log corruption at the end of the log > --- > > Key: HDFS-3335 > URL: https://issues.apache.org/jira/browse/HDFS-3335 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 0.23.0 >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe > Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, > HDFS-3335-b1.003.patch, HDFS-3335.001.patch, HDFS-3335.002.patch, > HDFS-3335.003.patch, HDFS-3335.004.patch, HDFS-3335.005.patch, > HDFS-3335.006.patch, HDFS-3335.007.patch > > > Even after encountering an OP_INVALID, we should check the end of the edit > log to make sure that it contains no more edits. > This will catch things like rare race conditions or log corruptions that > would otherwise remain undetected. 
They will go from being silent data loss > scenarios to being cases that we can detect and fix. > Using recovery mode, we can choose to ignore the end of the log if necessary.
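The end-of-log check described in this issue can be illustrated with a short sketch. It is not the HDFS-3335 patch: `EditLogTailCheck` and `firstNonPaddingOffset` are hypothetical names, and the assumption that padding after the last edit consists only of 0xFF or 0x00 bytes is made for illustration.

```java
// Hypothetical helper (not an HDFS class): scans the bytes that follow the
// point where the reader saw OP_INVALID and reports anything that is not padding.
final class EditLogTailCheck {
    // Assumption for this sketch: valid padding is 0xFF or 0x00.
    static boolean isPadding(int unsignedByte) {
        return unsignedByte == 0xFF || unsignedByte == 0x00;
    }

    /**
     * Returns the offset within tail of the first non-padding byte,
     * or -1 if the tail is empty or consists only of padding.
     */
    static int firstNonPaddingOffset(byte[] tail) {
        for (int i = 0; i < tail.length; i++) {
            int b = tail[i] & 0xFF; // read the byte as unsigned
            if (!isPadding(b)) {
                return i; // possible lost edits or corruption after OP_INVALID
            }
        }
        return -1;
    }
}
```

A caller could treat a non-negative result as a corruption error during normal startup and, in recovery mode, offer to ignore the rest of the log.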
[jira] [Updated] (HDFS-3369) change variable names referring to inode in blockmanagement to more appropriate
[ https://issues.apache.org/jira/browse/HDFS-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3369: - Resolution: Fixed Fix Version/s: 2.0.0 Target Version/s: 2.0.0, 3.0.0 (was: 3.0.0, 2.0.0) Status: Resolved (was: Patch Available) I have committed this. Thanks, John! > change variable names referring to inode in blockmanagement to more > appropriate > --- > > Key: HDFS-3369 > URL: https://issues.apache.org/jira/browse/HDFS-3369 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: name-node >Affects Versions: 2.0.0, 3.0.0 >Reporter: John George >Assignee: John George >Priority: Minor > Fix For: 2.0.0 > > Attachments: HDFS-3369.patch > > > We should rename BlocksMap.getINode(..) and, in addition, the local variable > names such as fileInode to match 'block collection' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests
[ https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3401: -- Status: Patch Available (was: Open) > Cleanup DatanodeDescriptor creation in the tests > > > Key: HDFS-3401 > URL: https://issues.apache.org/jira/browse/HDFS-3401 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, test >Affects Versions: 2.0.0 >Reporter: Eli Collins >Assignee: Eli Collins > Attachments: hdfs-3401.txt > > > Like HDFS-3230 but for DatanodeDescriptor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests
[ https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3401: -- Attachment: hdfs-3401.txt Patch attached. > Cleanup DatanodeDescriptor creation in the tests > > > Key: HDFS-3401 > URL: https://issues.apache.org/jira/browse/HDFS-3401 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, test >Affects Versions: 2.0.0 >Reporter: Eli Collins >Assignee: Eli Collins > Attachments: hdfs-3401.txt > > > Like HDFS-3230 but for DatanodeDescriptor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests
Eli Collins created HDFS-3401: - Summary: Cleanup DatanodeDescriptor creation in the tests Key: HDFS-3401 URL: https://issues.apache.org/jira/browse/HDFS-3401 Project: Hadoop HDFS Issue Type: Improvement Components: data-node, test Affects Versions: 2.0.0 Reporter: Eli Collins Assignee: Eli Collins Like HDFS-3230 but for DatanodeDescriptor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3026) HA: Handle failure during HA state transition
[ https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3026: - Attachment: HDFS-3026.patch Forgot to address the findbugs warning - just need to synchronize NameNode#setRuntimeForTesting. > HA: Handle failure during HA state transition > - > > Key: HDFS-3026 > URL: https://issues.apache.org/jira/browse/HDFS-3026 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, name-node >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, > HDFS-3026.patch, HDFS-3026.patch > > > This JIRA is to address a TODO in NameNode about handling the possibility of > an incomplete HA state transition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3369) change variable names referring to inode in blockmanagement to more appropriate
[ https://issues.apache.org/jira/browse/HDFS-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272741#comment-13272741 ] Hadoop QA commented on HDFS-3369: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526180/HDFS-3369.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified test files. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2410//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2410//console This message is automatically generated. > change variable names referring to inode in blockmanagement to more > appropriate > --- > > Key: HDFS-3369 > URL: https://issues.apache.org/jira/browse/HDFS-3369 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: name-node >Affects Versions: 2.0.0, 3.0.0 >Reporter: John George >Assignee: John George >Priority: Minor > Attachments: HDFS-3369.patch > > > We should rename BlocksMap.getINode(..) and, in addition, the local variable > names such as fileInode to match 'block collection' -- This message is automatically generated by JIRA. 
[jira] [Commented] (HDFS-3372) offlineEditsViewer should be able to read a binary edits file with recovery mode
[ https://issues.apache.org/jira/browse/HDFS-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272743#comment-13272743 ] Hadoop QA commented on HDFS-3372: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12526388/HDFS-3372.002.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2409//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2409//console This message is automatically generated. > offlineEditsViewer should be able to read a binary edits file with recovery > mode > > > Key: HDFS-3372 > URL: https://issues.apache.org/jira/browse/HDFS-3372 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe >Priority: Minor > Attachments: HDFS-3372.001.patch, HDFS-3372.002.patch > > > It would be nice if oev (the offline edits viewer) had a switch that allowed > us to read a binary edits file using recovery mode. oev can be very useful > when working with corrupt or messed up edit log files, and this would make it > even more so. -- This message is automatically generated by JIRA. 
[jira] [Updated] (HDFS-3026) HA: Handle failure during HA state transition
[ https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3026: - Attachment: HDFS-3026.patch Thanks a lot for the review, Eli. Here's an updated patch. Good idea re: trash emptier thread. I've done that in this patch. As for the other exits in NameNode - all of those are exit codes from shell commands (e.g. format, bootstrapStandby, etc.) or from the static main function, none of which I think really benefit from calling this method. Good point about making the error message more generic, though. I've gone ahead and done that. > HA: Handle failure during HA state transition > - > > Key: HDFS-3026 > URL: https://issues.apache.org/jira/browse/HDFS-3026 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha, name-node >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, > HDFS-3026.patch > > > This JIRA is to address a TODO in NameNode about handling the possibility of > an incomplete HA state transition.
[jira] [Commented] (HDFS-3391) Failing tests in branch-2
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272714#comment-13272714 ] Todd Lipcon commented on HDFS-3391: --- I'll investigate TestPipelinesFailover, since I wrote it. > Failing tests in branch-2 > - > > Key: HDFS-3391 > URL: https://issues.apache.org/jira/browse/HDFS-3391 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Arun C Murthy >Assignee: Todd Lipcon >Priority: Critical > > Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< > FAILURE! > -- > Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover > Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec > <<< FAILURE! > -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-3391) Failing tests in branch-2
[ https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reassigned HDFS-3391: - Assignee: Todd Lipcon > Failing tests in branch-2 > - > > Key: HDFS-3391 > URL: https://issues.apache.org/jira/browse/HDFS-3391 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Arun C Murthy >Assignee: Todd Lipcon >Priority: Critical > > Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< > FAILURE! > -- > Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover > Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec > <<< FAILURE! > -- -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3368) Missing blocks due to bad DataNodes coming up and down.
[ https://issues.apache.org/jira/browse/HDFS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-3368: -- Attachment: blockDeletePolicy.patch I ended up using 4 as the multiplier for heartbeatInterval. I looked at my busy but healthy cluster: there are always some nodes whose last heartbeat is around 10, so a multiplier of 4 should cover that. If some nodes are permanently late with heartbeats, this policy will eventually reduce the block count on those nodes, which will reduce the load on them and potentially help with heartbeats. > Missing blocks due to bad DataNodes coming up and down. > > > Key: HDFS-3368 > URL: https://issues.apache.org/jira/browse/HDFS-3368 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.22.0, 1.0.0, 2.0.0, 3.0.0 >Reporter: Konstantin Shvachko >Assignee: Konstantin Shvachko > Attachments: blockDeletePolicy.patch, blockDeletePolicy.patch > > > All replicas of a block can be removed if bad DataNodes come up and down > during cluster restart resulting in data loss.
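The arithmetic behind the multiplier can be sketched as below. This is an illustration only, not code from blockDeletePolicy.patch: `HeartbeatStaleness` and `isLate` are hypothetical names, the "10" in the comment is read as seconds, and the 3-second heartbeat interval used in the usage note is an assumed value, not one stated in the comment.

```java
// Hypothetical helper (not an HDFS class): decides whether a DataNode's
// heartbeat is late enough for the block deletion policy to treat it differently.
final class HeartbeatStaleness {
    // Multiplier taken from the comment above.
    static final int MULTIPLIER = 4;

    /** True if the node has been silent for more than MULTIPLIER heartbeat intervals. */
    static boolean isLate(long nowMs, long lastHeartbeatMs, long heartbeatIntervalMs) {
        return nowMs - lastHeartbeatMs > MULTIPLIER * heartbeatIntervalMs;
    }
}
```

With an assumed interval of 3000 ms the threshold is 12000 ms, so a node last heard from about 10 seconds ago is still within bounds, which matches the reasoning that a multiplier of 4 covers the delays seen on a busy but healthy cluster.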
[jira] [Updated] (HDFS-3230) Cleanup DatanodeID creation in the tests
[ https://issues.apache.org/jira/browse/HDFS-3230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3230: -- Resolution: Fixed Fix Version/s: 2.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks ATM. I've committed this and merged to branch-2. > Cleanup DatanodeID creation in the tests > > > Key: HDFS-3230 > URL: https://issues.apache.org/jira/browse/HDFS-3230 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Reporter: Eli Collins >Assignee: Eli Collins >Priority: Minor > Fix For: 2.0.0 > > Attachments: hdfs-3230.txt, hdfs-3230.txt > > > A lot of tests create dummy DatanodeIDs for testing, often using bogus values > when creating the objects (e.g. hostname in the IP field), which they can get > away with because the IDs aren't actually used. Let's add a test utility > method for creating a DatanodeID for testing and use it throughout.
[jira] [Updated] (HDFS-3400) DNs should be able start with jsvc even if security is disabled
[ https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3400: - Status: Patch Available (was: Open) > DNs should be able start with jsvc even if security is disabled > --- > > Key: HDFS-3400 > URL: https://issues.apache.org/jira/browse/HDFS-3400 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, scripts >Affects Versions: 2.0.0 >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-3400.patch > > > Currently if one tries to start a DN with security disabled (via > hadoop.security.authentication = "simple" in the configs), but JSVC is > correctly configured, the DN will refuse to start. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3400) DNs should be able start with jsvc even if security is disabled
[ https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HDFS-3400:
-
Attachment: HDFS-3400.patch

Here's a patch which makes it so that DNs will start up even if JSVC is configured properly and security is disabled via the XML confs. The only things that will cause the DN to not start are now:

* Security is enabled but the DN is not configured to use low ports.
* JSVC_HOME is configured, but $JSVC_HOME/jsvc is not executable.

No tests are included since security and root access must be available to test this. I tested it manually by:

# Starting a DN with security disabled, and all security-related env vars set.
# Starting a DN with security enabled, and all security-related env vars set.
# Starting a DN with security disabled, and none of the security-related env vars set.
# Starting a DN with security enabled, and none of the security-related env vars set.
# Starting a DN with security enabled, all of the security-related env vars set, but not configured with low ports.

The DN now starts properly in the first three cases. It does not start in the last two. This is the expected behavior after this patch.

> DNs should be able start with jsvc even if security is disabled
> ---
>
> Key: HDFS-3400
> URL: https://issues.apache.org/jira/browse/HDFS-3400
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: data-node, scripts
> Affects Versions: 2.0.0
> Reporter: Aaron T. Myers
> Assignee: Aaron T. Myers
> Attachments: HDFS-3400.patch
>
> Currently if one tries to start a DN with security disabled (via
> hadoop.security.authentication = "simple" in the configs), but JSVC is
> correctly configured, the DN will refuse to start.
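For readers following the HDFS-3400 comment above, the startup conditions and the five manual test cases can be restated as a small truth table. This is only an illustrative sketch in Python, not the patch itself (the attached HDFS-3400.patch is not reproduced here). The names `DnStartupEnv` and `dn_can_start` are hypothetical. The sketch also assumes that case 4 fails because a DN started without the security-related env vars is not launched through jsvc and so cannot obtain low ports; the comment does not spell out that reasoning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnStartupEnv:
    """Hypothetical model of the inputs described in the HDFS-3400 comment."""
    security_enabled: bool   # False corresponds to hadoop.security.authentication = "simple"
    secure_env_set: bool     # security-related env vars (e.g. JSVC_HOME) are set
    jsvc_executable: bool    # $JSVC_HOME/jsvc exists and is executable
    low_ports: bool          # DN is configured to use low (privileged) ports


def dn_can_start(env: DnStartupEnv) -> bool:
    """Return True if the DN is expected to start after the patch (sketch only)."""
    # Condition from the comment: JSVC_HOME configured but jsvc not executable.
    if env.secure_env_set and not env.jsvc_executable:
        return False
    if env.security_enabled:
        # Condition from the comment: security enabled requires low ports.
        # Assumption (not stated in the source): low ports are only usable
        # when the DN is launched through jsvc, i.e. the env vars are set.
        return env.secure_env_set and env.low_ports
    # Security disabled: the DN starts whether or not jsvc is configured.
    return True


# The five manual test cases from the comment, in order.
MANUAL_CASES = [
    DnStartupEnv(security_enabled=False, secure_env_set=True, jsvc_executable=True, low_ports=True),
    DnStartupEnv(security_enabled=True, secure_env_set=True, jsvc_executable=True, low_ports=True),
    DnStartupEnv(security_enabled=False, secure_env_set=False, jsvc_executable=False, low_ports=False),
    DnStartupEnv(security_enabled=True, secure_env_set=False, jsvc_executable=False, low_ports=False),
    DnStartupEnv(security_enabled=True, secure_env_set=True, jsvc_executable=True, low_ports=False),
]

if __name__ == "__main__":
    for number, case in enumerate(MANUAL_CASES, start=1):
        outcome = "starts" if dn_can_start(case) else "does not start"
        print(f"case {number}: DN {outcome}")
```

Under these assumptions the sketch reproduces the reported outcome: the DN starts in the first three cases and not in the last two.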