[jira] [Commented] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing

2012-05-10 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273087#comment-13273087
 ] 

Uma Maheswara Rao G commented on HDFS-3391:
---

Thanks Todd, we can discuss this in HDFS-3157.

> TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
> ---
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Arun C Murthy
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-3391.txt
>
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< 
> FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec 
> <<< FAILURE!
> --





[jira] [Commented] (HDFS-3026) HA: Handle failure during HA state transition

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273085#comment-13273085
 ] 

Hudson commented on HDFS-3026:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2246 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2246/])
HDFS-3026. HA: Handle failure during HA state transition. Contributed by 
Aaron T. Myers. (Revision 1337030)

 Result = ABORTED
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337030
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStateTransitionFailure.java


> HA: Handle failure during HA state transition
> -
>
> Key: HDFS-3026
> URL: https://issues.apache.org/jira/browse/HDFS-3026
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 2.0.0
>
> Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, 
> HDFS-3026.patch, HDFS-3026.patch
>
>
> This JIRA is to address a TODO in NameNode about handling the possibility of 
> an incomplete HA state transition.





[jira] [Commented] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing

2012-05-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273082#comment-13273082
 ] 

Todd Lipcon commented on HDFS-3391:
---

Hi Uma. I commented on HDFS-3157 as well, so let's continue that discussion 
there.

On this JIRA let's discuss the improvement to InvalidateBlocks -- I think this 
bug fix is a good improvement regardless of whether 3157 is in.

> TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
> ---
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Arun C Murthy
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-3391.txt
>
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< 
> FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec 
> <<< FAILURE!
> --





[jira] [Commented] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing

2012-05-10 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273080#comment-13273080
 ] 

Uma Maheswara Rao G commented on HDFS-3391:
---

In one way, HDFS-3157 handled this incorrectly: it was creating a new block 
info, but unfortunately the new BlockInfo constructor sets the inode to null. 
When we mark it corrupt, that will just invalidate the block and report that 
the block does not belong to any file. Setting the inode from storedBlock on 
the newly created BlockInfo doesn't help either; strangely, I have seen that 
the triplets do not contain that block info. It is then able to add to the 
corrupt replicas map, but the nodeIterator for the BlocksMap has no 
information about this block.
{noformat}
2012-05-10 21:30:04,378 WARN  blockmanagement.BlockManager 
(BlockManager.java:createLocatedBlock(666)) - Inconsistent number of corrupt 
replicas for blk_-6411755644530997250_1003 blockMap has 0 but corrupt replicas 
map has 1
2012-05-10 21:30:04,381 WARN  blockmanagement.BlockManager 
(BlockManager.java:createLocatedBlock(666)) - Inconsistent number of corrupt 
replicas for blk_-6411755644530997250_1003 blockMap has 0 but corrupt replicas 
map has 1{noformat}

Let me dig into whether there is any other bug along these lines that we did 
not notice.
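
For illustration, here is a rough toy sketch of the failure mode (the class and 
field names are assumptions for the sketch, not the real 
org.apache.hadoop.hdfs.server.blockmanagement classes):

{code}
// Toy model only: a block record built from scratch has no back-pointer
// to its owning file, so corrupt-marking cannot take the normal
// corrupt-replica path and falls through to plain invalidation.
class ToyBlockInfo {
  final long blockId;
  Object inode; // owning file; a freshly constructed instance leaves this null
  ToyBlockInfo(long blockId) { this.blockId = blockId; }
}

class ToyBlockManager {
  void markBlockAsCorrupt(ToyBlockInfo blk, String datanode) {
    if (blk.inode == null) {
      // no owning file known: the replica is enqueued for deletion right away
      System.out.println("invalidateBlocks.add(" + blk.blockId + ", "
          + datanode + ")");
    } else {
      // normal path: remember the corrupt replica and delete it only
      // after good replicas are confirmed
      System.out.println("corruptReplicas.add(" + blk.blockId + ", "
          + datanode + ")");
    }
  }
}
{code}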

> TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
> ---
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Arun C Murthy
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-3391.txt
>
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< 
> FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec 
> <<< FAILURE!
> --





[jira] [Commented] (HDFS-3157) Error in deleting block keeps on coming from DN even after the block report and directory scanning have happened

2012-05-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273075#comment-13273075
 ] 

Todd Lipcon commented on HDFS-3157:
---

One potential issue with this patch:
Because it creates a new BlockInfo object, that BlockInfo doesn't have any 
pointer to the associated inode. Hence when we call markBlockAsCorrupt, it 
doesn't go through the normal corrupt replica handling path -- instead, it gets 
immediately enqueued for deletion.

This makes me a little bit nervous -- if we had a bug, for example, which 
caused the NN's view of the gen stamp to get increased without the DNs being 
increased, we would issue deletions for all replicas. If instead we were going 
through the normal corrupt replica handling path, it would first make sure it 
had good replicas of the "correct" genstamp before invalidating the corrupt 
replicas. That would prevent the data loss, instead turning into an 
unavailability.

Does that make sense?
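
To spell out the safety property I mean, here is a minimal sketch (assumed 
names, not the actual BlockManager code):

{code}
// Sketch only: schedule deletion of a corrupt replica only once at least
// one live replica with the expected genstamp is known. If the NN's view
// of the genstamp is ever wrong, this turns potential data loss into
// temporary unavailability instead.
class CorruptReplicaGuard {
  static boolean safeToInvalidate(int liveReplicasAtExpectedGenstamp) {
    return liveReplicasAtExpectedGenstamp >= 1;
  }
}
{code}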

> Error in deleting block keeps on coming from DN even after the block report 
> and directory scanning have happened
> -
>
> Key: HDFS-3157
> URL: https://issues.apache.org/jira/browse/HDFS-3157
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0, 0.24.0
>Reporter: J.Andreina
>Assignee: Ashish Singhi
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HDFS-3157.patch, HDFS-3157.patch, HDFS-3157.patch
>
>
> Cluster setup:
> 1 NN, three DNs (DN1, DN2, DN3), replication factor 2, "dfs.blockreport.intervalMsec" 
> 300, "dfs.datanode.directoryscan.interval" 1
> step 1: write one file "a.txt" with sync (not closed)
> step 2: delete the blocks (from rbw) on one of the datanodes, say DN1, to which 
> replication happened.
> step 3: close the file.
> Since the replication factor is 2, the blocks are replicated to the other 
> datanode.
> Then, on the NN side, the following command is issued to the DN from which the 
> block was deleted:
> -
> {noformat}
> 2012-03-19 13:41:36,905 INFO org.apache.hadoop.hdfs.StateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: duplicate requested for 
> blk_2903555284838653156 to add as corrupt on XX.XX.XX.XX by /XX.XX.XX.XX 
> because reported RBW replica with genstamp 1002 does not match COMPLETE 
> block's genstamp in block map 1003
> 2012-03-19 13:41:39,588 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> Removing block blk_2903555284838653156_1003 from neededReplications as it has 
> enough replicas.
> {noformat}
> On the datanode side where the block was deleted, the following exception 
> occurred:
> {noformat}
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Unexpected error trying to delete block blk_2903555284838653156_1003. 
> BlockInfo not found in volumeMap.
> 2012-02-29 13:54:13,126 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Error processing datanode Command
> java.io.IOException: Error in deleting blocks.
>   at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset.invalidate(FSDataset.java:2061)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActive(BPOfferService.java:581)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.processCommandFromActor(BPOfferService.java:545)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processCommand(BPServiceActor.java:690)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:522)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:662)
>   at java.lang.Thread.run(Thread.java:619)
> {noformat}





[jira] [Updated] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing

2012-05-10 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3391:
--

Attachment: hdfs-3391.txt

The attached patch seems to fix the issue, even with HDFS-3157 and the 
troublesome sleep() call above in place.

I think what was happening here was the following:

- in some cases, the block synchronization path can run twice, if the first 
attempt is slow. This ends up first finalizing the block at genstamp 1005, and 
then again at 1006 or 1007.
- for each of those genstamps, the DNs report FINALIZED replicas to both NNs.
- When the new NN becomes active, then, it replays the block reports -- first 
FINALIZED for blk_N_1005, and then FINALIZED for blk_N_1006.
- When it sees the blk_N_1005 genstamp, it already knows that 1006 is the 
"correct" latest genstamp for the block, so it wants to mark it as corrupt.

Here is where the behavior differs:

Prior to HDFS-3157, it was marking blk_N_1006 as corrupt instead of blk_N_1005. 
Thus the markBlockAsCorrupt() call would succeed. When processing the FINALIZED 
blk_N_1006, it would remove it from the corrupt list, and everything would be 
fine.

With HDFS-3157 in place, it instead marks blk_N_1005 as corrupt. However, the 
BlockInfo object it creates to do so has no attached inode (BlockCollection in 
new parlance). So, markBlockAsCorrupt immediately enqueues the replica for 
invalidation, rather than treating it like a normal corrupt replica. Then, upon 
seeing the report of the blk_N_1006 FINALIZED replica, the check against 
invalidateBlocks.contains(block) causes it to be skipped, and thus 
addStoredBlock() never gets called.

The fix in this patch is to change invalidateBlocks so that its contains() call 
can check for genstamp match as well. So, even though blk_N_1005 has been 
enqueued for deletion, we should still accept a block report for blk_N_1006.
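
Roughly, the idea is the following (a standalone toy model of the check, not 
the attached patch itself):

{code}
import java.util.HashMap;
import java.util.Map;

// Toy model: a replica queued for deletion at genstamp 1005 should not
// cause a report of the same block ID at genstamp 1006 to be skipped.
class ToyInvalidateBlocks {
  // block ID -> generation stamp that was queued for deletion
  private final Map<Long, Long> queued = new HashMap<Long, Long>();

  void add(long blockId, long genStamp) {
    queued.put(blockId, genStamp);
  }

  // Match only when both the ID and the genstamp agree, so a newer
  // genstamp for the same block still reaches addStoredBlock().
  boolean contains(long blockId, long genStamp) {
    Long queuedGs = queued.get(blockId);
    return queuedGs != null && queuedGs.longValue() == genStamp;
  }

  public static void main(String[] args) {
    ToyInvalidateBlocks ib = new ToyInvalidateBlocks();
    ib.add(42L, 1005L);
    System.out.println(ib.contains(42L, 1005L)); // true: stale replica, skip it
    System.out.println(ib.contains(42L, 1006L)); // false: accept the report
  }
}
{code}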

> TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
> ---
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Arun C Murthy
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hdfs-3391.txt
>
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< 
> FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec 
> <<< FAILURE!
> --





[jira] [Commented] (HDFS-3026) HA: Handle failure during HA state transition

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273066#comment-13273066
 ] 

Hudson commented on HDFS-3026:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2303 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2303/])
HDFS-3026. HA: Handle failure during HA state transition. Contributed by 
Aaron T. Myers. (Revision 1337030)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337030
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStateTransitionFailure.java


> HA: Handle failure during HA state transition
> -
>
> Key: HDFS-3026
> URL: https://issues.apache.org/jira/browse/HDFS-3026
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 2.0.0
>
> Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, 
> HDFS-3026.patch, HDFS-3026.patch
>
>
> This JIRA is to address a TODO in NameNode about handling the possibility of 
> an incomplete HA state transition.





[jira] [Updated] (HDFS-3026) HA: Handle failure during HA state transition

2012-05-10 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3026:
-

Fix Version/s: 2.0.0

> HA: Handle failure during HA state transition
> -
>
> Key: HDFS-3026
> URL: https://issues.apache.org/jira/browse/HDFS-3026
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 2.0.0
>
> Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, 
> HDFS-3026.patch, HDFS-3026.patch
>
>
> This JIRA is to address a TODO in NameNode about handling the possibility of 
> an incomplete HA state transition.





[jira] [Commented] (HDFS-3026) HA: Handle failure during HA state transition

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273063#comment-13273063
 ] 

Hudson commented on HDFS-3026:
--

Integrated in Hadoop-Common-trunk-Commit #2229 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2229/])
HDFS-3026. HA: Handle failure during HA state transition. Contributed by 
Aaron T. Myers. (Revision 1337030)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337030
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStateTransitionFailure.java


> HA: Handle failure during HA state transition
> -
>
> Key: HDFS-3026
> URL: https://issues.apache.org/jira/browse/HDFS-3026
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 2.0.0
>
> Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, 
> HDFS-3026.patch, HDFS-3026.patch
>
>
> This JIRA is to address a TODO in NameNode about handling the possibility of 
> an incomplete HA state transition.





[jira] [Updated] (HDFS-3026) HA: Handle failure during HA state transition

2012-05-10 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3026:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks a lot for the reviews, Eli. I've just committed this to trunk, branch-2, 
and branch-2.0.0-alpha.

> HA: Handle failure during HA state transition
> -
>
> Key: HDFS-3026
> URL: https://issues.apache.org/jira/browse/HDFS-3026
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, 
> HDFS-3026.patch, HDFS-3026.patch
>
>
> This JIRA is to address a TODO in NameNode about handling the possibility of 
> an incomplete HA state transition.





[jira] [Updated] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from

2012-05-10 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3404:
-

Status: Patch Available  (was: Open)

> Make putImage in GetImageServlet infer remote address to fetch from
> ---
>
> Key: HDFS-3404
> URL: https://issues.apache.org/jira/browse/HDFS-3404
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3404.patch, HDFS-3404.patch
>
>
> As it stands, daemons which perform checkpointing must determine their own 
> address on which they can be reached, so that the NN which they checkpoint 
> against knows what address to fetch a merged fsimage from. This causes 
> problems if, for example, the daemon performing checkpointing binds to 
> 0.0.0.0, and thus can't be sure of what address the NN can reach it at.





[jira] [Updated] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from

2012-05-10 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3404:
-

Attachment: HDFS-3404.patch

Here's an updated patch which addresses Eli's review feedback.

I struggled for a while with how to write an automated test for this, and 
ultimately concluded it's not really possible on a single host, since 
connecting to 0.0.0.0 will work on a single box, whereas it wouldn't in a 
multi-box setup. I'll test this patch manually in a multi-node setup tomorrow.

> Make putImage in GetImageServlet infer remote address to fetch from
> ---
>
> Key: HDFS-3404
> URL: https://issues.apache.org/jira/browse/HDFS-3404
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3404.patch, HDFS-3404.patch
>
>
> As it stands, daemons which perform checkpointing must determine their own 
> address on which they can be reached, so that the NN which they checkpoint 
> against knows what address to fetch a merged fsimage from. This causes 
> problems if, for example, the daemon performing checkpointing binds to 
> 0.0.0.0, and thus can't be sure of what address the NN can reach it at.





[jira] [Commented] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing

2012-05-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273052#comment-13273052
 ] 

Todd Lipcon commented on HDFS-3391:
---

I was able to reproduce this by reapplying HDFS-3157 and adding the following 
in DataNode.java:

{code}
--- 
a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
+++ 
b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
@@ -1983,6 +1983,13 @@ public class DataNode extends Configured
   datanodes[i] = r.id;
   storages[i] = r.storageID;
 }
+if (newBlock.getGenerationStamp() == 1005) {
+  try {
+Thread.sleep(1500);
+  } catch (InterruptedException ie) {
+Thread.currentThread().interrupt();
+  }
+}
 nn.commitBlockSynchronization(block,
 newBlock.getGenerationStamp(), newBlock.getNumBytes(), true, false,
 datanodes, storages);
{code}

I have to think through whether this is a bug we've had for a while that is 
merely uncovered by HDFS-3157, or whether HDFS-3157 itself was incorrect.

> TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
> ---
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Arun C Murthy
>Assignee: Todd Lipcon
>Priority: Critical
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< 
> FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec 
> <<< FAILURE!
> --





[jira] [Commented] (HDFS-2391) Newly set BalancerBandwidth value is not displayed anywhere

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273048#comment-13273048
 ] 

Hadoop QA commented on HDFS-2391:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526471/HDFS-2391.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2423//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2423//console

This message is automatically generated.

> Newly set BalancerBandwidth value is not displayed anywhere
> ---
>
> Key: HDFS-2391
> URL: https://issues.apache.org/jira/browse/HDFS-2391
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Affects Versions: 0.20.205.0
>Reporter: Rajit Saha
>Assignee: Harsh J
>  Labels: newbie
> Attachments: HDFS-2391.patch, HDFS-2391.patch
>
>
> With the current implementation,
> $ hadoop dfsadmin -setBalancerBandwidth <bandwidth>
> only shows the following message in the DN log:
>  INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand 
> action: DNA_BALANCERBANDWIDTHUPDATE
> But it would be nice to have the value of <bandwidth>
> displayed in the DN log or some other
> suitable place, so that we can keep track of it.





[jira] [Commented] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from

2012-05-10 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273044#comment-13273044
 ] 

Aaron T. Myers commented on HDFS-3404:
--

bq. This change needs to be made to the 2NN as well right or were you thinking 
just the SBN?

Nope, there's no change to be made to the 2NN. The 2NN doesn't do the same sort 
of validation that the SBN does, i.e. checking that the configured NN HTTP 
address is not INADDR_ANY. The 2NN will automatically start behaving the same 
way the SBN does, just by virtue of the fact that it connects to an NN which 
doesn't look at the machine name in the param string. The 2NN will also stop 
sending the machine name in the param string, since it uses 
GetImageServlet#getParamStringToPutImage to form the param string.

I also tested this patch with an NN/2NN, and it works just fine.

bq. NetUtils#isIpAddress actually checks ip:port, seems like we'll always have 
an IP here. Perhaps better to use InetAddresses.isInetAddress.

Sure, makes sense. I'll update the patch to suit.

bq. How much more difficult would it be to just have it do a straight HTTP POST 
or PUT of the new image instead of the "I'll ask you to ask me for this image" 
dance?

I investigated a little what it would take to do this, and concluded that doing 
it right would take a fair bit of refactoring that's well outside the modest 
scope of this JIRA. I've filed a separate JIRA to make this change; I hope 
that's OK: HDFS-3405

> Make putImage in GetImageServlet infer remote address to fetch from
> ---
>
> Key: HDFS-3404
> URL: https://issues.apache.org/jira/browse/HDFS-3404
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3404.patch
>
>
> As it stands, daemons which perform checkpointing must determine their own 
> address on which they can be reached, so that the NN which they checkpoint 
> against knows what address to fetch a merged fsimage from. This causes 
> problems if, for example, the daemon performing checkpointing binds to 
> 0.0.0.0, and thus can't be sure of what address the NN can reach it at.





[jira] [Commented] (HDFS-3335) check for edit log corruption at the end of the log

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273041#comment-13273041
 ] 

Hadoop QA commented on HDFS-3335:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526470/HDFS-3335.008.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2422//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2422//console

This message is automatically generated.

> check for edit log corruption at the end of the log
> ---
>
> Key: HDFS-3335
> URL: https://issues.apache.org/jira/browse/HDFS-3335
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, 
> HDFS-3335-b1.003.patch, HDFS-3335-b1.004.patch, HDFS-3335.001.patch, 
> HDFS-3335.002.patch, HDFS-3335.003.patch, HDFS-3335.004.patch, 
> HDFS-3335.005.patch, HDFS-3335.006.patch, HDFS-3335.007.patch, 
> HDFS-3335.008.patch
>
>
> Even after encountering an OP_INVALID, we should check the end of the edit 
> log to make sure that it contains no more edits.
> This will catch things like rare race conditions or log corruptions that 
> would otherwise remain undetected.  They will go from being silent data loss 
> scenarios to being cases that we can detect and fix.
> Using recovery mode, we can choose to ignore the end of the log if necessary.





[jira] [Created] (HDFS-3405) Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged fsimages

2012-05-10 Thread Aaron T. Myers (JIRA)
Aaron T. Myers created HDFS-3405:


 Summary: Checkpointing should use HTTP POST or PUT instead of 
GET-GET to send merged fsimages
 Key: HDFS-3405
 URL: https://issues.apache.org/jira/browse/HDFS-3405
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 1.0.0
Reporter: Aaron T. Myers


As Todd points out in [this 
comment|https://issues.apache.org/jira/browse/HDFS-3404?focusedCommentId=13272986&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13272986],
 the current scheme for a checkpointing daemon to upload a merged fsimage file 
to an NN is to issue an HTTP GET request telling the target NN to issue another 
GET request back to the checkpointing daemon to retrieve the merged fsimage 
file. There's no fundamental reason the checkpointing daemon can't just use an 
HTTP POST or PUT to send the merged fsimage file directly, rather than the 
double-GET scheme.
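
For illustration only, a direct upload could look roughly like the following 
sketch (the "/putimage" endpoint and all surrounding details are hypothetical, 
not an existing HDFS servlet):

{code}
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical sketch: stream the merged fsimage to the NN in a single
// PUT request instead of the GET-then-GET-back dance.
class ImageUploadSketch {
  static void putImage(String nnHttpAddr, String imagePath) throws Exception {
    URL url = new URL("http://" + nnHttpAddr + "/putimage");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setDoOutput(true);
    InputStream in = new FileInputStream(imagePath);
    OutputStream out = conn.getOutputStream();
    try {
      byte[] buf = new byte[8192];
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
    } finally {
      in.close();
      out.close();
    }
    if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
      throw new RuntimeException("image upload failed: HTTP "
          + conn.getResponseCode());
    }
  }
}
{code}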





[jira] [Commented] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing

2012-05-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273034#comment-13273034
 ] 

Todd Lipcon commented on HDFS-3391:
---

I looped TestPipelinesFailover for quite some time and could not get a failure. 
In the logs you pointed to on build #2397, I traced the issue to the following:

{code}
2012-05-09 23:50:33,074 DEBUG namenode.FSNamesystem 
(FSEditLogLoader.java:applyEditLogOp(296)) - OP_CLOSE: /test-file numblocks : 2 
clientHolder  clientMachine
2012-05-09 23:50:33,074 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processQueuedMessages(1760)) - Processing previouly queued 
message ReportedBlockInfo [block=blk_-3039116449792967513_1005, 
dn=127.0.0.1:45674, reportedState=FINALIZED]
2012-05-09 23:50:33,074 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processReportedBlock(1660)) - Reported block 
blk_-3039116449792967513_1005 on 127.0.0.1:45674 size 2048 replicaState = 
FINALIZED
2012-05-09 23:50:33,074 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processReportedBlock(1684)) - In memory blockUCState = 
COMPLETE
2012-05-09 23:50:33,075 INFO  hdfs.StateChange 
(BlockManager.java:markBlockAsCorrupt(926)) - BLOCK markBlockAsCorrupt: block 
blk_-3039116449792967513_1005 could not be marked as corrupt as it does not 
belong to any file
2012-05-09 23:50:33,076 INFO  hdfs.StateChange (InvalidateBlocks.java:add(77)) 
- BLOCK* InvalidateBlocks: add blk_-3039116449792967513_1005 to 127.0.0.1:45674
2012-05-09 23:50:33,076 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processQueuedMessages(1760)) - Processing previouly queued 
message ReportedBlockInfo [block=blk_-3039116449792967513_1005, 
dn=127.0.0.1:35659, reportedState=FINALIZED]
2012-05-09 23:50:33,076 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processReportedBlock(1660)) - Reported block 
blk_-3039116449792967513_1005 on 127.0.0.1:35659 size 2048 replicaState = 
FINALIZED
2012-05-09 23:50:33,076 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processReportedBlock(1684)) - In memory blockUCState = 
COMPLETE
2012-05-09 23:50:33,077 INFO  hdfs.StateChange 
(BlockManager.java:markBlockAsCorrupt(926)) - BLOCK markBlockAsCorrupt: block 
blk_-3039116449792967513_1005 could not be marked as corrupt as it does not 
belong to any file
2012-05-09 23:50:33,077 INFO  hdfs.StateChange (InvalidateBlocks.java:add(77)) 
- BLOCK* InvalidateBlocks: add blk_-3039116449792967513_1005 to 127.0.0.1:35659
2012-05-09 23:50:33,077 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processQueuedMessages(1760)) - Processing previouly queued 
message ReportedBlockInfo [block=blk_-3039116449792967513_1005, 
dn=127.0.0.1:59499, reportedState=FINALIZED]
2012-05-09 23:50:33,077 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processReportedBlock(1660)) - Reported block 
blk_-3039116449792967513_1005 on 127.0.0.1:59499 size 2048 replicaState = 
FINALIZED
2012-05-09 23:50:33,078 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processReportedBlock(1684)) - In memory blockUCState = 
COMPLETE
2012-05-09 23:50:33,078 INFO  hdfs.StateChange 
(BlockManager.java:markBlockAsCorrupt(926)) - BLOCK markBlockAsCorrupt: block 
blk_-3039116449792967513_1005 could not be marked as corrupt as it does not 
belong to any file
2012-05-09 23:50:33,078 INFO  hdfs.StateChange (InvalidateBlocks.java:add(77)) 
- BLOCK* InvalidateBlocks: add blk_-3039116449792967513_1005 to 127.0.0.1:59499
2012-05-09 23:50:33,078 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processQueuedMessages(1760)) - Processing previouly queued 
message ReportedBlockInfo [block=blk_-3039116449792967513_1006, 
dn=127.0.0.1:45674, reportedState=FINALIZED]
2012-05-09 23:50:33,079 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processReportedBlock(1660)) - Reported block 
blk_-3039116449792967513_1006 on 127.0.0.1:45674 size 2048 replicaState = 
FINALIZED
2012-05-09 23:50:33,079 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processReportedBlock(1684)) - In memory blockUCState = 
COMPLETE
2012-05-09 23:50:33,079 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processQueuedMessages(1760)) - Processing previouly queued 
message ReportedBlockInfo [block=blk_-3039116449792967513_1006, 
dn=127.0.0.1:59499, reportedState=FINALIZED]
2012-05-09 23:50:33,079 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processReportedBlock(1660)) - Reported block 
blk_-3039116449792967513_1006 on 127.0.0.1:59499 size 2048 replicaState = 
FINALIZED
2012-05-09 23:50:33,080 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processReportedBlock(1684)) - In memory blockUCState = 
COMPLETE
2012-05-09 23:50:33,080 DEBUG blockmanagement.BlockManager 
(BlockManager.java:processQueuedMessages(1760)) - Processing previouly queued 
message ReportedBlockInfo [block=blk_-3039116449792967513_1006, 
dn=127.0.0.1:35659, reportedState=FINALIZED]
2012-05-
{code}

[jira] [Commented] (HDFS-3368) Missing blocks due to bad DataNodes coming up and down.

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273032#comment-13273032
 ] 

Hadoop QA commented on HDFS-3368:
-

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12526469/blockDeletePolicy-trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2421//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2421//console

This message is automatically generated.

> Missing blocks due to bad DataNodes coming up and down.
> 
>
> Key: HDFS-3368
> URL: https://issues.apache.org/jira/browse/HDFS-3368
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 1.0.0, 2.0.0, 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Attachments: blockDeletePolicy-0.22.patch, 
> blockDeletePolicy-trunk.patch, blockDeletePolicy.patch
>
>
> All replicas of a block can be removed if bad DataNodes come up and down 
> during cluster restart resulting in data loss.





[jira] [Commented] (HDFS-3400) DNs should be able start with jsvc even if security is disabled

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273025#comment-13273025
 ] 

Hudson commented on HDFS-3400:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2245 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2245/])
HDFS-3400. DNs should be able start with jsvc even if security is disabled. 
Contributed by Aaron T. Myers (Revision 1337017)

 Result = SUCCESS
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337017
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/SecureDataNodeStarter.java


> DNs should be able start with jsvc even if security is disabled
> ---
>
> Key: HDFS-3400
> URL: https://issues.apache.org/jira/browse/HDFS-3400
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, scripts
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 2.0.0
>
> Attachments: HDFS-3400.patch
>
>
> Currently if one tries to start a DN with security disabled (via 
> hadoop.security.authentication = "simple" in the configs), but JSVC is 
> correctly configured, the DN will refuse to start.





[jira] [Updated] (HDFS-2391) Newly set BalancerBandwidth value is not displayed anywhere

2012-05-10 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-2391:
--

Attachment: HDFS-2391.patch

Done.

> Newly set BalancerBandwidth value is not displayed anywhere
> ---
>
> Key: HDFS-2391
> URL: https://issues.apache.org/jira/browse/HDFS-2391
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Affects Versions: 0.20.205.0
>Reporter: Rajit Saha
>Assignee: Harsh J
>  Labels: newbie
> Attachments: HDFS-2391.patch, HDFS-2391.patch
>
>
> With the current implementation,
> $ hadoop dfsadmin -setBalancerBandwidth <bandwidth>
> only shows the following message in the DN log:
>  INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand 
> action: DNA_BALANCERBANDWIDTHUPDATE
> But it would be nice to have the value of <bandwidth>
> displayed in the DN log or some other
> suitable place, so that we can keep track of it.





[jira] [Updated] (HDFS-3335) check for edit log corruption at the end of the log

2012-05-10 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3335:
---

Attachment: HDFS-3335.008.patch

* EditLogFileInputStream: warn when skipping the last few bytes in a file.

* Rename GarbageAfterTerminatorException#offset to numAfterTerminator.

* Rename FSImage#setEditLog to FSImage#setEditLogForTesting and add the 
@VisibleForTesting annotation to it.

* Avoid making some unnecessary whitespace changes.

> check for edit log corruption at the end of the log
> ---
>
> Key: HDFS-3335
> URL: https://issues.apache.org/jira/browse/HDFS-3335
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, 
> HDFS-3335-b1.003.patch, HDFS-3335-b1.004.patch, HDFS-3335.001.patch, 
> HDFS-3335.002.patch, HDFS-3335.003.patch, HDFS-3335.004.patch, 
> HDFS-3335.005.patch, HDFS-3335.006.patch, HDFS-3335.007.patch, 
> HDFS-3335.008.patch
>
>
> Even after encountering an OP_INVALID, we should check the end of the edit 
> log to make sure that it contains no more edits.
> This will catch things like rare race conditions or log corruptions that 
> would otherwise remain undetected.  They will go from being silent data loss 
> scenarios to being cases that we can detect and fix.
> Using recovery mode, we can choose to ignore the end of the log if necessary.





[jira] [Commented] (HDFS-2391) Newly set BalancerBandwidth value is not displayed anywhere

2012-05-10 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273008#comment-13273008
 ] 

Eli Collins commented on HDFS-2391:
---

How about just one info log?

{code}
+ LOG.info("Updating balancer bandwidth from " + 
dxcs.balanceThrottler.getBandwidth() + "  to " + bandwidth + " bytes/s.");
dxcs.balanceThrottler.setBandwidth(bandwidth);
{code}



> Newly set BalancerBandwidth value is not displayed anywhere
> ---
>
> Key: HDFS-2391
> URL: https://issues.apache.org/jira/browse/HDFS-2391
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer
>Affects Versions: 0.20.205.0
>Reporter: Rajit Saha
>Assignee: Harsh J
>  Labels: newbie
> Attachments: HDFS-2391.patch
>
>
> With the current implementation,
> $ hadoop dfsadmin -setBalancerBandwidth <bandwidth>
> only shows the following message in the DN log:
>  INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand 
> action: DNA_BALANCERBANDWIDTHUPDATE
> But it would be nice to have the value of <bandwidth>
> displayed in the DN log or some other
> suitable place, so that we can keep track of it.





[jira] [Commented] (HDFS-3400) DNs should be able start with jsvc even if security is disabled

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273007#comment-13273007
 ] 

Hudson commented on HDFS-3400:
--

Integrated in Hadoop-Common-trunk-Commit #2228 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2228/])
HDFS-3400. DNs should be able start with jsvc even if security is disabled. 
Contributed by Aaron T. Myers (Revision 1337017)

 Result = SUCCESS
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337017
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/SecureDataNodeStarter.java


> DNs should be able start with jsvc even if security is disabled
> ---
>
> Key: HDFS-3400
> URL: https://issues.apache.org/jira/browse/HDFS-3400
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, scripts
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 2.0.0
>
> Attachments: HDFS-3400.patch
>
>
> Currently if one tries to start a DN with security disabled (via 
> hadoop.security.authentication = "simple" in the configs), but JSVC is 
> correctly configured, the DN will refuse to start.





[jira] [Commented] (HDFS-3400) DNs should be able start with jsvc even if security is disabled

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13273006#comment-13273006
 ] 

Hudson commented on HDFS-3400:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2302 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2302/])
HDFS-3400. DNs should be able start with jsvc even if security is disabled. 
Contributed by Aaron T. Myers (Revision 1337017)

 Result = SUCCESS
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1337017
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/SecureDataNodeStarter.java


> DNs should be able start with jsvc even if security is disabled
> ---
>
> Key: HDFS-3400
> URL: https://issues.apache.org/jira/browse/HDFS-3400
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, scripts
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 2.0.0
>
> Attachments: HDFS-3400.patch
>
>
> Currently if one tries to start a DN with security disabled (via 
> hadoop.security.authentication = "simple" in the configs), but JSVC is 
> correctly configured, the DN will refuse to start.





[jira] [Updated] (HDFS-3400) DNs should be able start with jsvc even if security is disabled

2012-05-10 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3400:
--

  Resolution: Fixed
   Fix Version/s: 2.0.0
Target Version/s:   (was: 2.0.0)
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

I've committed this and merged to branch-2. Thanks ATM!

> DNs should be able start with jsvc even if security is disabled
> ---
>
> Key: HDFS-3400
> URL: https://issues.apache.org/jira/browse/HDFS-3400
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, scripts
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 2.0.0
>
> Attachments: HDFS-3400.patch
>
>
> Currently if one tries to start a DN with security disabled (via 
> hadoop.security.authentication = "simple" in the configs), but JSVC is 
> correctly configured, the DN will refuse to start.





[jira] [Commented] (HDFS-3394) Do not use generic in INodeFile.getLastBlock()

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272997#comment-13272997
 ] 

Hadoop QA commented on HDFS-3394:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526465/h3394_20120510.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2420//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2420//console

This message is automatically generated.

> Do not use generic in INodeFile.getLastBlock()
> --
>
> Key: HDFS-3394
> URL: https://issues.apache.org/jira/browse/HDFS-3394
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Minor
> Attachments: h3394_20120510.patch
>
>
> As shown in HDFS-3385, the ClassCastException check in 
> INodeFile.getLastBlock() is useless since generic type information is only 
> available at compile time, not at run time.





[jira] [Updated] (HDFS-3368) Missing blocks due to bad DataNodes coming up and down.

2012-05-10 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-3368:
--

Attachment: (was: blockDeletePolicy.patch)

> Missing blocks due to bad DataNodes coming up and down.
> 
>
> Key: HDFS-3368
> URL: https://issues.apache.org/jira/browse/HDFS-3368
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 1.0.0, 2.0.0, 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Attachments: blockDeletePolicy-0.22.patch, 
> blockDeletePolicy-trunk.patch, blockDeletePolicy.patch
>
>
> All replicas of a block can be removed if bad DataNodes come up and down 
> during cluster restart resulting in data loss.





[jira] [Updated] (HDFS-3368) Missing blocks due to bad DataNodes coming up and down.

2012-05-10 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-3368:
--

Target Version/s: 0.22.1, 2.0.0, 3.0.0  (was: 3.0.0, 2.0.0, 0.22.1)
  Status: Patch Available  (was: Open)

Submitting patch for trunk.

> Missing blocks due to bad DataNodes coming up and down.
> 
>
> Key: HDFS-3368
> URL: https://issues.apache.org/jira/browse/HDFS-3368
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 1.0.0, 0.22.0, 2.0.0, 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Attachments: blockDeletePolicy-0.22.patch, 
> blockDeletePolicy-trunk.patch, blockDeletePolicy.patch
>
>
> All replicas of a block can be removed if bad DataNodes come up and down 
> during cluster restart resulting in data loss.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3368) Missing blocks due to bad DataNodes coming up and down.

2012-05-10 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-3368:
--

Attachment: blockDeletePolicy-trunk.patch

> Missing blocks due to bad DataNodes coming up and down.
> 
>
> Key: HDFS-3368
> URL: https://issues.apache.org/jira/browse/HDFS-3368
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 1.0.0, 2.0.0, 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Attachments: blockDeletePolicy-0.22.patch, 
> blockDeletePolicy-trunk.patch, blockDeletePolicy.patch
>
>
> All replicas of a block can be removed if bad DataNodes come up and down 
> during cluster restart resulting in data loss.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3368) Missing blocks due to bad DataNodes coming up and down.

2012-05-10 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-3368:
--

Attachment: blockDeletePolicy-0.22.patch

> Missing blocks due to bad DataNodes coming up and down.
> 
>
> Key: HDFS-3368
> URL: https://issues.apache.org/jira/browse/HDFS-3368
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 1.0.0, 2.0.0, 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Attachments: blockDeletePolicy-0.22.patch, blockDeletePolicy.patch, 
> blockDeletePolicy.patch
>
>
> All replicas of a block can be removed if bad DataNodes come up and down 
> during cluster restart resulting in data loss.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from

2012-05-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272986#comment-13272986
 ] 

Todd Lipcon commented on HDFS-3404:
---

How much more difficult would it be to just have it do a straight HTTP POST or 
PUT of the new image instead of the "I'll ask you to ask me for this image" 
dance?
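
For reference, a minimal sketch of the push-style upload Todd is suggesting; the endpoint, port, and file name below are illustrative assumptions, not the actual servlet API:

{code}
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class PutImageSketch {
  public static void main(String[] args) throws Exception {
    // Hypothetical endpoint and file name -- the checkpointer pushes the
    // merged image in a single PUT, so the NN never needs to learn the
    // checkpointer's reachable address.
    URL url = new URL("http://namenode:50070/putimage");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setDoOutput(true);
    InputStream in = new FileInputStream("fsimage.ckpt");
    OutputStream out = conn.getOutputStream();
    try {
      byte[] buf = new byte[8192];
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
    } finally {
      in.close();
      out.close();
    }
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}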

> Make putImage in GetImageServlet infer remote address to fetch from
> ---
>
> Key: HDFS-3404
> URL: https://issues.apache.org/jira/browse/HDFS-3404
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3404.patch
>
>
> As it stands, daemons which perform checkpointing must determine their own 
> address on which they can be reached, so that the NN which they checkpoint 
> against knows what address to fetch a merged fsimage from. This causes 
> problems if, for example, the daemon performing checkpointing binds to 
> 0.0.0.0, and thus can't be sure of what address the NN can reach it at.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3031) HA: Error (failed to close file) when uploading large file + kill active NN + manual failover

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272985#comment-13272985
 ] 

Hadoop QA commented on HDFS-3031:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526461/hdfs-3031.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2419//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2419//console

This message is automatically generated.

> HA: Error (failed to close file) when uploading large file + kill active NN + 
> manual failover
> -
>
> Key: HDFS-3031
> URL: https://issues.apache.org/jira/browse/HDFS-3031
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 0.24.0
>Reporter: Stephen Chu
>Assignee: Todd Lipcon
> Attachments: hdfs-3031.txt, hdfs-3031.txt, hdfs-3031.txt, 
> styx01_killNNfailover, styx01_uploadLargeFile
>
>
> I executed section 3.4 of Todd's HA test plan. 
> https://issues.apache.org/jira/browse/HDFS-1623
> 1. A large file upload is started.
> 2. While the file is being uploaded, the administrator kills the first NN and 
> performs a failover.
> 3. After the file finishes being uploaded, it is verified for correct length 
> and contents.
> For the test, I have a vm_template styx01:/home/schu/centos64-2-5.5.qcow2. 
> styx01 hosted the active NN and styx02 hosted the standby NN.
> In the log files I attached, you can see that on styx01 I began file upload.
> hadoop fs -put centos64-2.5.5.qcow2
> After waiting several seconds, I kill -9'd the active NN on styx01 and 
> manually failed over to the NN on styx02. I ran into exception below. (rest 
> of the stacktrace in the attached file styx01_uploadLargeFile)
> 12/02/29 14:12:52 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of this method invocation attempt.
> put: Failed on local exception: java.io.EOFException; Host Details : local 
> host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: 
> ""styx01.sf.cloudera.com"\
> :12020;
> 12/02/29 14:12:52 ERROR hdfs.DFSClient: Failed to close file 
> /user/schu/centos64-2-5.5.qcow2._COPYING_
> java.io.IOException: Failed on local exception: java.io.EOFException; Host 
> Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination 
> host is: ""styx01.\
> sf.cloudera.com":12020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
> at org.apache.hadoop.ipc.Client.call(Client.java:1145)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:188)
> at $Proxy9.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:302)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
> at $Proxy10.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1097)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:973)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:830)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:762)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



[jira] [Resolved] (HDFS-3388) GetJournalEditServlet should close output stream only if the stream is used.

2012-05-10 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE resolved HDFS-3388.
--

   Resolution: Fixed
Fix Version/s: Shared journals (HDFS-3092)

I have committed this.  Thanks, Brandon!

> GetJournalEditServlet should close output stream only if the stream is used.
> 
>
> Key: HDFS-3388
> URL: https://issues.apache.org/jira/browse/HDFS-3388
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Reporter: Brandon Li
>Assignee: Brandon Li
> Fix For: Shared journals (HDFS-3092)
>
> Attachments: HDFS-3388.HDFS-3092.patch, HDFS-3388.HDFS-3092.patch, 
> HDFS-3388.HDFS-3092.patch
>
>
> GetJournalEditServlet has the same problem as that of GetImageServlet 
> (HDFS-3330). It should be fixed in the same way. Also need to make 
> CheckpointFaultInjector visible for journal service tests.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3388) GetJournalEditServlet should close output stream only if the stream is used.

2012-05-10 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3388:
-

 Summary: GetJournalEditServlet should close output stream only if the 
stream is used.  (was: GetJournalEditServlet should catch more exceptions, not 
just IOException)
Hadoop Flags: Reviewed

+1 patch looks good.

(revised the summary.)
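
A minimal sketch of the pattern the revised summary names -- close the output stream only if it was actually used; {{fetchEdits}} and the servlet wiring here are illustrative, not the patch's actual code:

{code}
import java.io.IOException;
import java.io.OutputStream;
import javax.servlet.http.HttpServletResponse;

public class CloseOnlyIfUsedSketch {
  void doGet(HttpServletResponse response) throws IOException {
    OutputStream out = null;
    try {
      byte[] edits = fetchEdits();      // may throw before any output exists
      out = response.getOutputStream(); // opened only once there is data
      out.write(edits);
    } finally {
      if (out != null) {
        out.close();                    // close only a stream we actually used
      }
    }
  }

  private byte[] fetchEdits() throws IOException {
    return new byte[0]; // placeholder for reading the requested edit segment
  }
}
{code}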

> GetJournalEditServlet should close output stream only if the stream is used.
> 
>
> Key: HDFS-3388
> URL: https://issues.apache.org/jira/browse/HDFS-3388
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Reporter: Brandon Li
>Assignee: Brandon Li
> Fix For: Shared journals (HDFS-3092)
>
> Attachments: HDFS-3388.HDFS-3092.patch, HDFS-3388.HDFS-3092.patch, 
> HDFS-3388.HDFS-3092.patch
>
>
> GetJournalEditServlet has the same problem as that of GetImageServlet 
> (HDFS-3330). It should be fixed in the same way. Also need to make 
> CheckpointFaultInjector visible for journal service tests.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3394) Do not use generic in INodeFile.getLastBlock()

2012-05-10 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3394:
-

Status: Patch Available  (was: Open)

> Do not use generic in INodeFile.getLastBlock()
> --
>
> Key: HDFS-3394
> URL: https://issues.apache.org/jira/browse/HDFS-3394
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Minor
> Attachments: h3394_20120510.patch
>
>
> As shown in HDFS-3385, the ClassCastException check in 
> INodeFile.getLastBlock() is useless since generic type information is only 
> available at compile time, not at run time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3394) Do not use generic in INodeFile.getLastBlock()

2012-05-10 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3394:
-

Attachment: h3394_20120510.patch

h3394_20120510.patch:
- removes generic from INodeFile.getLastBlock();
- changes some public/protected methods to package-private/private;
- rewrites some javadoc.
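
Since generic type information is erased at run time (the point the issue description below makes), a cast through a type parameter cannot fail inside the method itself; a small standalone demo:

{code}
import java.util.ArrayList;
import java.util.List;

public class ErasureDemo {
  @SuppressWarnings("unchecked")
  static <T> T last(List<?> list) {
    // The cast to T is erased to a plain Object cast at compile time,
    // so it can never throw ClassCastException inside this method.
    return (T) list.get(list.size() - 1);
  }

  public static void main(String[] args) {
    List<Object> list = new ArrayList<Object>();
    list.add("not a BlockInfo");
    // The CCE surfaces here, at the checkcast the compiler inserts for the
    // assignment -- not inside last(), which is why the check was useless.
    Integer i = ErasureDemo.<Integer>last(list);
    System.out.println(i);
  }
}
{code}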

> Do not use generic in INodeFile.getLastBlock()
> --
>
> Key: HDFS-3394
> URL: https://issues.apache.org/jira/browse/HDFS-3394
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
>Priority: Minor
> Attachments: h3394_20120510.patch
>
>
> As shown in HDFS-3385, the ClassCastException check in 
> INodeFile.getLastBlock() is useless since generic type information is only 
> available at compile time, not at run time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3163) TestHDFSCLI.testAll fails if the user name is not all lowercase

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272970#comment-13272970
 ] 

Hadoop QA commented on HDFS-3163:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526447/HDFS-3163.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestHFlush

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2416//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2416//console

This message is automatically generated.

> TestHDFSCLI.testAll fails if the user name is not all lowercase
> ---
>
> Key: HDFS-3163
> URL: https://issues.apache.org/jira/browse/HDFS-3163
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Trivial
> Attachments: HDFS-3163.patch
>
>
> In the test resource file testHDFSConf.xml, the test comparators expect user 
> name to be all lowercase. 
> If the user issuing the test has an uppercase in the username (e.g., Brandon 
> instead of brandon), many RegexpComparator tests will fail. The following is 
> one example:
> {noformat} 
> <comparator>
>   <type>RegexpComparator</type>
>   <expected-output>^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( )*/file1</expected-output>
> </comparator>
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3385) ClassCastException when trying to append a file

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272963#comment-13272963
 ] 

Hudson commented on HDFS-3385:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2244 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2244/])
HDFS-3385. The last block of INodeFileUnderConstruction is not necessarily 
a BlockInfoUnderConstruction, so do not cast it in 
FSNamesystem.recoverLeaseInternal(..). (Revision 1336976)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336976
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java
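
A hedged sketch of the shape of the fix described in the commit message above, using stand-in classes rather than the committed diff:

{code}
// Stand-ins for BlockInfo / BlockInfoUnderConstruction, so the sketch
// compiles on its own.
class Block {}
class BlockUC extends Block {}

public class GuardedCastSketch {
  static void recoverLease(Block last) {
    if (last instanceof BlockUC) {
      BlockUC uc = (BlockUC) last; // safe: run-time type checked first
      // ... inspect under-construction / recovery state here
    }
    // a complete last block simply skips the under-construction path
  }

  public static void main(String[] args) {
    recoverLease(new Block());   // no longer throws ClassCastException
    recoverLease(new BlockUC()); // takes the guarded branch
  }
}
{code}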


> ClassCastException when trying to append a file
> ---
>
> Key: HDFS-3385
> URL: https://issues.apache.org/jira/browse/HDFS-3385
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
> Environment: HDFS
>Reporter: Brahma Reddy Battula
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 2.0.0
>
> Attachments: h3385_20120508.patch, h3385_20120509.patch
>
>
> When I try to append a file I got 
> {noformat}
> 2012-05-08 18:13:40,506 WARN  util.KerberosName 
> (KerberosName.java:<clinit>(87)) - Kerberos krb5 configuration not found, 
> setting default realm to empty
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425)
> ...
>   at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150)
>   at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189)
>   at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1)
>   at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272957#comment-13272957
 ] 

Hudson commented on HDFS-3401:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2243 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2243/])
HDFS-3401. Cleanup DatanodeDescriptor creation in the tests. Contributed by 
Eli Collins (Revision 1336972)

 Result = ABORTED
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336972
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHost2NodesMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java


> Cleanup DatanodeDescriptor creation in the tests
> 
>
> Key: HDFS-3401
> URL: https://issues.apache.org/jira/browse/HDFS-3401
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, test
>Affects Versions: 2.0.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 2.0.0
>
> Attachments: hdfs-3401.txt
>
>
> Like HDFS-3230 but for DatanodeDescriptor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3385) ClassCastException when trying to append a file

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272950#comment-13272950
 ] 

Hudson commented on HDFS-3385:
--

Integrated in Hadoop-Common-trunk-Commit #2227 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2227/])
HDFS-3385. The last block of INodeFileUnderConstruction is not necessarily 
a BlockInfoUnderConstruction, so do not cast it in 
FSNamesystem.recoverLeaseInternal(..). (Revision 1336976)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336976
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java


> ClassCastException when trying to append a file
> ---
>
> Key: HDFS-3385
> URL: https://issues.apache.org/jira/browse/HDFS-3385
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
> Environment: HDFS
>Reporter: Brahma Reddy Battula
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 2.0.0
>
> Attachments: h3385_20120508.patch, h3385_20120509.patch
>
>
> When I try to append a file I got 
> {noformat}
> 2012-05-08 18:13:40,506 WARN  util.KerberosName 
> (KerberosName.java:<clinit>(87)) - Kerberos krb5 configuration not found, 
> setting default realm to empty
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425)
> ...
>   at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150)
>   at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189)
>   at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1)
>   at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3335) check for edit log corruption at the end of the log

2012-05-10 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272945#comment-13272945
 ] 

Colin Patrick McCabe commented on HDFS-3335:


* Yeah, your understanding of GarbageAfterTerminatorException#getOffset is 
correct.  I'll rename it to something clearer.

bq. Part of what is confusing me is this: does padding after OP_INVALID count 
as garbage or not?

No, padding is just zeros or 0xffs.  Garbage is something you wouldn't expect 
to be there, like more opcodes, random bytes, or something like that.

* I'll see if I can remove the unnecessary whitespace diffs...
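
A minimal sketch of that padding-vs-garbage distinction (the method name is illustrative, not the patch's API):

{code}
public class PaddingScanSketch {
  // Returns how many leading bytes of the tail (the bytes after OP_INVALID)
  // are padding; anything after that is "garbage" in the sense above.
  static int paddingLength(byte[] tailAfterOpInvalid) {
    int n = 0;
    while (n < tailAfterOpInvalid.length
        && (tailAfterOpInvalid[n] == 0x00
            || tailAfterOpInvalid[n] == (byte) 0xFF)) {
      n++;
    }
    return n;
  }

  public static void main(String[] args) {
    byte[] tail = {0x00, (byte) 0xFF, 0x00, 0x42};
    System.out.println(paddingLength(tail)); // 3 -- the 0x42 byte is garbage
  }
}
{code}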

> check for edit log corruption at the end of the log
> ---
>
> Key: HDFS-3335
> URL: https://issues.apache.org/jira/browse/HDFS-3335
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, 
> HDFS-3335-b1.003.patch, HDFS-3335-b1.004.patch, HDFS-3335.001.patch, 
> HDFS-3335.002.patch, HDFS-3335.003.patch, HDFS-3335.004.patch, 
> HDFS-3335.005.patch, HDFS-3335.006.patch, HDFS-3335.007.patch
>
>
> Even after encountering an OP_INVALID, we should check the end of the edit 
> log to make sure that it contains no more edits.
> This will catch things like rare race conditions or log corruptions that 
> would otherwise remain undetected.  They will go from being silent data loss 
> scenarios to being cases that we can detect and fix.
> Using recovery mode, we can choose to ignore the end of the log if necessary.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3385) ClassCastException when trying to append a file

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272946#comment-13272946
 ] 

Hudson commented on HDFS-3385:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2301 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2301/])
HDFS-3385. The last block of INodeFileUnderConstruction is not necessarily 
a BlockInfoUnderConstruction, so do not cast it in 
FSNamesystem.recoverLeaseInternal(..). (Revision 1336976)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336976
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestFileAppend.java


> ClassCastException when trying to append a file
> ---
>
> Key: HDFS-3385
> URL: https://issues.apache.org/jira/browse/HDFS-3385
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
> Environment: HDFS
>Reporter: Brahma Reddy Battula
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 2.0.0
>
> Attachments: h3385_20120508.patch, h3385_20120509.patch
>
>
> When I try to append a file I got 
> {noformat}
> 2012-05-08 18:13:40,506 WARN  util.KerberosName 
> (KerberosName.java:<clinit>(87)) - Kerberos krb5 configuration not found, 
> setting default realm to empty
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425)
> ...
>   at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150)
>   at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189)
>   at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1)
>   at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3031) HA: Error (failed to close file) when uploading large file + kill active NN + manual failover

2012-05-10 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3031:
--

Attachment: hdfs-3031.txt

Whoops, my previously uploaded patch was accidentally against the auto-HA 
branch. The new patch is against trunk (only a trivial difference in tests).

> HA: Error (failed to close file) when uploading large file + kill active NN + 
> manual failover
> -
>
> Key: HDFS-3031
> URL: https://issues.apache.org/jira/browse/HDFS-3031
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 0.24.0
>Reporter: Stephen Chu
>Assignee: Todd Lipcon
> Attachments: hdfs-3031.txt, hdfs-3031.txt, hdfs-3031.txt, 
> styx01_killNNfailover, styx01_uploadLargeFile
>
>
> I executed section 3.4 of Todd's HA test plan. 
> https://issues.apache.org/jira/browse/HDFS-1623
> 1. A large file upload is started.
> 2. While the file is being uploaded, the administrator kills the first NN and 
> performs a failover.
> 3. After the file finishes being uploaded, it is verified for correct length 
> and contents.
> For the test, I have a vm_template styx01:/home/schu/centos64-2-5.5.qcow2. 
> styx01 hosted the active NN and styx02 hosted the standby NN.
> In the log files I attached, you can see that on styx01 I began file upload.
> hadoop fs -put centos64-2.5.5.qcow2
> After waiting several seconds, I kill -9'd the active NN on styx01 and 
> manually failed over to the NN on styx02. I ran into exception below. (rest 
> of the stacktrace in the attached file styx01_uploadLargeFile)
> 12/02/29 14:12:52 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of this method invocation attempt.
> put: Failed on local exception: java.io.EOFException; Host Details : local 
> host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: 
> ""styx01.sf.cloudera.com"\
> :12020;
> 12/02/29 14:12:52 ERROR hdfs.DFSClient: Failed to close file 
> /user/schu/centos64-2-5.5.qcow2._COPYING_
> java.io.IOException: Failed on local exception: java.io.EOFException; Host 
> Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination 
> host is: ""styx01.\
> sf.cloudera.com":12020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
> at org.apache.hadoop.ipc.Client.call(Client.java:1145)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:188)
> at $Proxy9.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:302)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
> at $Proxy10.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1097)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:973)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:830)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:762)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3385) ClassCastException when trying to append a file

2012-05-10 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3385:
-

   Resolution: Fixed
Fix Version/s: 2.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks Suresh for the review.

I have committed this.

> ClassCastException when trying to append a file
> ---
>
> Key: HDFS-3385
> URL: https://issues.apache.org/jira/browse/HDFS-3385
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
> Environment: HDFS
>Reporter: Brahma Reddy Battula
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 2.0.0
>
> Attachments: h3385_20120508.patch, h3385_20120509.patch
>
>
> When I try to append a file I got 
> {noformat}
> 2012-05-08 18:13:40,506 WARN  util.KerberosName 
> (KerberosName.java:<clinit>(87)) - Kerberos krb5 configuration not found, 
> setting default realm to empty
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425)
> ...
>   at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150)
>   at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189)
>   at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1)
>   at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3385) ClassCastException when trying to append a file

2012-05-10 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272934#comment-13272934
 ] 

Suresh Srinivas commented on HDFS-3385:
---

+1 for the patch.

> ClassCastException when trying to append a file
> ---
>
> Key: HDFS-3385
> URL: https://issues.apache.org/jira/browse/HDFS-3385
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
> Environment: HDFS
>Reporter: Brahma Reddy Battula
>Assignee: Tsz Wo (Nicholas), SZE
> Attachments: h3385_20120508.patch, h3385_20120509.patch
>
>
> When I try to append a file I got 
> {noformat}
> 2012-05-08 18:13:40,506 WARN  util.KerberosName 
> (KerberosName.java:<clinit>(87)) - Kerberos krb5 configuration not found, 
> setting default realm to empty
> Exception in thread "main" java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo cannot be cast to 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1787)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1584)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:1824)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:425)
> ...
>   at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1150)
>   at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1189)
>   at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1177)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:221)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:1)
>   at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:981)
>   at 
> org.apache.hadoop.hdfs.server.datanode.DeleteMe.main(DeleteMe.java:26)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3031) HA: Error (failed to close file) when uploading large file + kill active NN + manual failover

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272933#comment-13272933
 ] 

Hadoop QA commented on HDFS-3031:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526455/hdfs-3031.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified test 
files.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2418//console

This message is automatically generated.

> HA: Error (failed to close file) when uploading large file + kill active NN + 
> manual failover
> -
>
> Key: HDFS-3031
> URL: https://issues.apache.org/jira/browse/HDFS-3031
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 0.24.0
>Reporter: Stephen Chu
>Assignee: Todd Lipcon
> Attachments: hdfs-3031.txt, hdfs-3031.txt, styx01_killNNfailover, 
> styx01_uploadLargeFile
>
>
> I executed section 3.4 of Todd's HA test plan. 
> https://issues.apache.org/jira/browse/HDFS-1623
> 1. A large file upload is started.
> 2. While the file is being uploaded, the administrator kills the first NN and 
> performs a failover.
> 3. After the file finishes being uploaded, it is verified for correct length 
> and contents.
> For the test, I have a vm_template styx01:/home/schu/centos64-2-5.5.qcow2. 
> styx01 hosted the active NN and styx02 hosted the standby NN.
> In the log files I attached, you can see that on styx01 I began file upload.
> hadoop fs -put centos64-2.5.5.qcow2
> After waiting several seconds, I kill -9'd the active NN on styx01 and 
> manually failed over to the NN on styx02. I ran into exception below. (rest 
> of the stacktrace in the attached file styx01_uploadLargeFile)
> 12/02/29 14:12:52 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of this method invocation attempt.
> put: Failed on local exception: java.io.EOFException; Host Details : local 
> host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: 
> ""styx01.sf.cloudera.com"\
> :12020;
> 12/02/29 14:12:52 ERROR hdfs.DFSClient: Failed to close file 
> /user/schu/centos64-2-5.5.qcow2._COPYING_
> java.io.IOException: Failed on local exception: java.io.EOFException; Host 
> Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination 
> host is: ""styx01.\
> sf.cloudera.com":12020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
> at org.apache.hadoop.ipc.Client.call(Client.java:1145)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:188)
> at $Proxy9.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:302)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
> at $Proxy10.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1097)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:973)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:830)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:762)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272931#comment-13272931
 ] 

Hudson commented on HDFS-3401:
--

Integrated in Hadoop-Common-trunk-Commit #2225 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2225/])
HDFS-3401. Cleanup DatanodeDescriptor creation in the tests. Contributed by 
Eli Collins (Revision 1336972)

 Result = SUCCESS
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336972
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHost2NodesMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java


> Cleanup DatanodeDescriptor creation in the tests
> 
>
> Key: HDFS-3401
> URL: https://issues.apache.org/jira/browse/HDFS-3401
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, test
>Affects Versions: 2.0.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 2.0.0
>
> Attachments: hdfs-3401.txt
>
>
> Like HDFS-3230 but for DatanodeDescriptor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3163) TestHDFSCLI.testAll fails if the user name is not all lowercase

2012-05-10 Thread Brandon Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272932#comment-13272932
 ] 

Brandon Li commented on HDFS-3163:
--

I tested the patch by running TestHDFSCLI. It passed with different users like 
admin, Brandon and test1.
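
For illustration, a broader character class admits mixed-case names; the committed fix may use a different class:

{code}
public class UsernameRegexSketch {
  public static void main(String[] args) {
    System.out.println("brandon".matches("[a-z]*"));       // true
    System.out.println("Brandon".matches("[a-z]*"));       // false -> test fails
    System.out.println("Brandon".matches("[a-zA-Z0-9]*")); // true
  }
}
{code}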

> TestHDFSCLI.testAll fails if the user name is not all lowercase
> ---
>
> Key: HDFS-3163
> URL: https://issues.apache.org/jira/browse/HDFS-3163
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Trivial
> Attachments: HDFS-3163.patch
>
>
> In the test resource file testHDFSConf.xml, the test comparators expect user 
> name to be all lowercase. 
> If the user issuing the test has an uppercase in the username (e.g., Brandon 
> instead of brandon), many RegexpComparator tests will fail. The following is 
> one example:
> {noformat} 
> <comparator>
>   <type>RegexpComparator</type>
>   <expected-output>^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( )*/file1</expected-output>
> </comparator>
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3335) check for edit log corruption at the end of the log

2012-05-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272929#comment-13272929
 ] 

Todd Lipcon commented on HDFS-3335:
---

In {{EditLogFileInputStream.nextOp}}, we should log a WARN message with the 
file name and the number of bytes skipped at the end of the file. That way, if 
there is an error replaying later, you might notice that you did in fact want 
to recover some of these edits, and the warning in the log will make it easier 
to find where they went.

Here, it would also be nice to detect how many of those bytes were just 
0x00/0xff padding vs data that potentially looks like transactions.



- Rename {{GarbageAfterTerminatorException.getOffset}} to something a little 
clearer -- right now it's not obvious that this is a relative offset/length 
after the OP_INVALID, versus an offset from the beginning of the file, etc. 
Perhaps {{getPaddingLengthAfterEofMarker}}? I'm still not entirely clear what 
this length represents... by my reading of the javadoc, it is:

{code}
<--- valid edits ---> < OP_INVALID > <-- N bytes of padding --> <-- non-padding data --> EOF
{code}
where {{N}} above is what you're talking about?

Maybe some ASCII art like the above in the javadoc would be helpful.

Part of what is confusing me is this: does padding after OP_INVALID count as 
garbage or not?



{code}
+  /** Testing hook */
+  void setEditLog(FSEditLog newLog) {
{code}

Can you add @VisibleForTesting and change to {{setEditLogForTesting}} so no one 
starts to use it in non-test code?



- Lots of spurious whitespace changes in TestNameNodeRecovery
- Can you add brief javadoc to the three implementations of Corruptor? eg "/** 
Truncate the last byte of the file */", "/** Add padding followed by some 
non-padding bytes to the end of the file */" and "/** Add only padding to the 
end of the file */"?

Otherwise really nice tests.
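
A sketch of the requested change, assuming Guava's annotation and an illustrative holder class (not the patch's actual code):

{code}
import com.google.common.annotations.VisibleForTesting;

class FSEditLog {} // stand-in for the real FSEditLog, so the sketch compiles

class EditLogHolderSketch {
  private FSEditLog editLog;

  @VisibleForTesting
  void setEditLogForTesting(FSEditLog newLog) { // renamed per the review
    this.editLog = newLog;
  }
}
{code}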


> check for edit log corruption at the end of the log
> ---
>
> Key: HDFS-3335
> URL: https://issues.apache.org/jira/browse/HDFS-3335
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, 
> HDFS-3335-b1.003.patch, HDFS-3335-b1.004.patch, HDFS-3335.001.patch, 
> HDFS-3335.002.patch, HDFS-3335.003.patch, HDFS-3335.004.patch, 
> HDFS-3335.005.patch, HDFS-3335.006.patch, HDFS-3335.007.patch
>
>
> Even after encountering an OP_INVALID, we should check the end of the edit 
> log to make sure that it contains no more edits.
> This will catch things like rare race conditions or log corruptions that 
> would otherwise remain undetected.  They will go from being silent data loss 
> scenarios to being cases that we can detect and fix.
> Using recovery mode, we can choose to ignore the end of the log if necessary.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272927#comment-13272927
 ] 

Hudson commented on HDFS-3401:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2300 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2300/])
HDFS-3401. Cleanup DatanodeDescriptor creation in the tests. Contributed by 
Eli Collins (Revision 1336972)

 Result = SUCCESS
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336972
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DFSTestUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSClientRetries.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestGetBlocks.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocolPB/TestPBHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestHost2NodesMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBPOfferService.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestInterDatanodeProtocol.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/net/TestNetworkTopology.java


> Cleanup DatanodeDescriptor creation in the tests
> 
>
> Key: HDFS-3401
> URL: https://issues.apache.org/jira/browse/HDFS-3401
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, test
>Affects Versions: 2.0.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 2.0.0
>
> Attachments: hdfs-3401.txt
>
>
> Like HDFS-3230 but for DatanodeDescriptor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3134) Harden edit log loader against malformed or malicious input

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272924#comment-13272924
 ] 

Hudson commented on HDFS-3134:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2242 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2242/])
HDFS-3134. harden edit log loader against malformed or malicious input. 
Contributed by Colin Patrick McCabe (Revision 1336943)

 Result = ABORTED
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336943
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenIdentifier.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java


> Harden edit log loader against malformed or malicious input
> ---
>
> Key: HDFS-3134
> URL: https://issues.apache.org/jira/browse/HDFS-3134
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.0.0
>
> Attachments: HDFS-3134.001.patch, HDFS-3134.002.patch, 
> HDFS-3134.003.patch, HDFS-3134.004.patch, HDFS-3134.005.patch, 
> HDFS-3134.006.patch, HDFS-3134.007.patch, HDFS-3134.009.patch
>
>
> Currently, the edit log loader does not handle bad or malicious input 
> sensibly.
> We can often cause OutOfMemory exceptions, null pointer exceptions, or other 
> unchecked exceptions to be thrown by feeding the edit log loader bad input.  
> In some environments, an out of memory error can cause the JVM process to be 
> terminated.
> It's clear that we want these exceptions to be thrown as IOException instead 
> of as unchecked exceptions.  We also want to avoid out of memory situations.
> The main task here is to put a sensible upper limit on the lengths of arrays 
> and strings we allocate on command.  The other task is to try to avoid 
> creating unchecked exceptions (by dereferencing potentially-NULL pointers, 
> for example).  Instead, we should verify ahead of time and give a more 
> sensible error message that reflects the problem with the input.
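
To make the description concrete, here is a minimal sketch of the 
validate-before-allocate pattern it calls for. The class name, helper, and the 
1 MB cap are illustrative assumptions, not the actual FSEditLogOp code:

{code}
import java.io.DataInput;
import java.io.IOException;

class BoundedRead {
  // Assumed cap for illustration; the committed limit may differ.
  private static final int MAX_STR_LEN = 1 << 20; // 1 MB

  static String readBoundedString(DataInput in) throws IOException {
    int len = in.readInt();
    // Validate the length before allocating, so malformed or malicious
    // input yields a clean IOException instead of an OutOfMemoryError.
    if (len < 0 || len > MAX_STR_LEN) {
      throw new IOException("Invalid string length: " + len);
    }
    byte[] buf = new byte[len];
    in.readFully(buf);
    return new String(buf, "UTF-8");
  }
}
{code}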

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests

2012-05-10 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3401:
--

Resolution: Fixed
Fix Version/s: 2.0.0
Target Version/s: (was: 2.0.0)
Hadoop Flags: Reviewed
Status: Resolved  (was: Patch Available)

Thanks for the review ATM. I've committed this and merged to branch-2.

> Cleanup DatanodeDescriptor creation in the tests
> 
>
> Key: HDFS-3401
> URL: https://issues.apache.org/jira/browse/HDFS-3401
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, test
>Affects Versions: 2.0.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Fix For: 2.0.0
>
> Attachments: hdfs-3401.txt
>
>
> Like HDFS-3230 but for DatanodeDescriptor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3388) GetJournalEditServlet should catch more exceptions, not just IOException

2012-05-10 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-3388:
-

Attachment: HDFS-3388.HDFS-3092.patch

The new patch addressed Nicholas' comments.

> GetJournalEditServlet should catch more exceptions, not just IOException
> 
>
> Key: HDFS-3388
> URL: https://issues.apache.org/jira/browse/HDFS-3388
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Reporter: Brandon Li
>Assignee: Brandon Li
> Attachments: HDFS-3388.HDFS-3092.patch, HDFS-3388.HDFS-3092.patch, 
> HDFS-3388.HDFS-3092.patch
>
>
> GetJournalEditServlet has the same problem as that of GetImageServlet 
> (HDFS-3330). It should be fixed in the same way. Also need to make 
> CheckpointFaultInjector visible for journal service tests.
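
As a rough illustration of the GetImageServlet-style fix referenced above 
(illustrative names; this is not the actual GetJournalEditServlet code), the 
handler body is wrapped so unchecked exceptions become an HTTP 500 instead of 
escaping into the servlet container:

{code}
import java.io.IOException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

abstract class SafeGetServletSketch {
  protected abstract void doGetChecked(HttpServletRequest req,
      HttpServletResponse resp) throws IOException;

  public void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws IOException {
    try {
      doGetChecked(req, resp);
    } catch (Exception e) { // broader than IOException, per this issue
      resp.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR,
          e.getMessage());
    }
  }
}
{code}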

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3335) check for edit log corruption at the end of the log

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272919#comment-13272919
 ] 

Hadoop QA commented on HDFS-3335:
-

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12526442/HDFS-3335-b1.004.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2417//console

This message is automatically generated.

> check for edit log corruption at the end of the log
> ---
>
> Key: HDFS-3335
> URL: https://issues.apache.org/jira/browse/HDFS-3335
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, 
> HDFS-3335-b1.003.patch, HDFS-3335-b1.004.patch, HDFS-3335.001.patch, 
> HDFS-3335.002.patch, HDFS-3335.003.patch, HDFS-3335.004.patch, 
> HDFS-3335.005.patch, HDFS-3335.006.patch, HDFS-3335.007.patch
>
>
> Even after encountering an OP_INVALID, we should check the end of the edit 
> log to make sure that it contains no more edits.
> This will catch things like rare race conditions or log corruptions that 
> would otherwise remain undetected.  They will go from being silent data loss 
> scenarios to being cases that we can detect and fix.
> Using recovery mode, we can choose to ignore the end of the log if necessary.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from

2012-05-10 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272916#comment-13272916
 ] 

Eli Collins commented on HDFS-3404:
---

The approach - have the NN determine the hostname of the checkpointer from the 
request rather than having it passed as a parameter - seems more sane to me.

- This change needs to be made to the 2NN as well, right? Or were you thinking 
just the SBN? 
- NetUtils#isIpAddress actually checks ip:port; it seems like we'll always have 
an IP here. Perhaps better to use InetAddresses.isInetAddress.
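
A minimal sketch of the approach under discussion, with hypothetical names 
rather than the actual GetImageServlet code: the NN derives the checkpointer's 
address from the servlet request instead of a caller-supplied parameter, and 
Guava's InetAddresses.isInetAddress checks a bare IP where NetUtils#isIpAddress 
expects ip:port:

{code}
import javax.servlet.http.HttpServletRequest;
import com.google.common.net.InetAddresses;

class InferRemoteAddrSketch {
  // Build the address to fetch the merged fsimage from, based on the
  // address the checkpointer actually connected from.
  static String inferFetchAddress(HttpServletRequest request, int httpPort) {
    String host = request.getRemoteHost();
    if (InetAddresses.isInetAddress(host)) {
      // Reverse DNS failed or was skipped; the raw IP is still a usable
      // fetch address, so just note it.
      System.err.println("Using raw IP for checkpointer: " + host);
    }
    return host + ":" + httpPort;
  }
}
{code}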

> Make putImage in GetImageServlet infer remote address to fetch from
> ---
>
> Key: HDFS-3404
> URL: https://issues.apache.org/jira/browse/HDFS-3404
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3404.patch
>
>
> As it stands, daemons which perform checkpointing must determine their own 
> address on which they can be reached, so that the NN which they checkpoint 
> against knows what address to fetch a merged fsimage from. This causes 
> problems if, for example, the daemon performing checkpointing binds to 
> 0.0.0.0, and thus can't be sure of what address the NN can reach it at.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3031) HA: Error (failed to close file) when uploading large file + kill active NN + manual failover

2012-05-10 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3031:
--

Attachment: hdfs-3031.txt

New rev fixes the issue with append(): the client doesn't send any previous 
block when the append starts exactly at a block boundary.

I attempted to make the client cleaner, but DFSOutputStream is a hairball. 
Without a substantial reworking of that, it was cleaner to do this on the 
server side.
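
A sketch of the boundary case described above, with illustrative names rather 
than the actual NameNode code: when an append starts exactly at a block 
boundary there is no partial last block, so a client that reports no previous 
block is behaving correctly and the server must tolerate it.

{code}
import java.io.IOException;

class AppendBoundarySketch {
  static boolean atBlockBoundary(long fileLength, long blockSize) {
    return fileLength % blockSize == 0;
  }

  // Server-side tolerance: a missing previous block is only an error
  // when the file does not end exactly on a block boundary.
  static void checkPreviousBlock(Object prevBlock, long fileLength,
      long blockSize) throws IOException {
    if (prevBlock == null && !atBlockBoundary(fileLength, blockSize)) {
      throw new IOException("Client omitted the partial previous block");
    }
    // Otherwise: allocate a fresh block and continue the pipeline.
  }
}
{code}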

> HA: Error (failed to close file) when uploading large file + kill active NN + 
> manual failover
> -
>
> Key: HDFS-3031
> URL: https://issues.apache.org/jira/browse/HDFS-3031
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Affects Versions: 0.24.0
>Reporter: Stephen Chu
>Assignee: Todd Lipcon
> Attachments: hdfs-3031.txt, hdfs-3031.txt, styx01_killNNfailover, 
> styx01_uploadLargeFile
>
>
> I executed section 3.4 of Todd's HA test plan. 
> https://issues.apache.org/jira/browse/HDFS-1623
> 1. A large file upload is started.
> 2. While the file is being uploaded, the administrator kills the first NN and 
> performs a failover.
> 3. After the file finishes being uploaded, it is verified for correct length 
> and contents.
> For the test, I have a vm_template styx01:/home/schu/centos64-2-5.5.qcow2. 
> styx01 hosted the active NN and styx02 hosted the standby NN.
> In the log files I attached, you can see that on styx01 I began file upload.
> hadoop fs -put centos64-2.5.5.qcow2
> After waiting several seconds, I kill -9'd the active NN on styx01 and 
> manually failed over to the NN on styx02. I ran into exception below. (rest 
> of the stacktrace in the attached file styx01_uploadLargeFile)
> 12/02/29 14:12:52 WARN retry.RetryInvocationHandler: A failover has occurred 
> since the start of this method invocation attempt.
> put: Failed on local exception: java.io.EOFException; Host Details : local 
> host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: 
> ""styx01.sf.cloudera.com"\
> :12020;
> 12/02/29 14:12:52 ERROR hdfs.DFSClient: Failed to close file 
> /user/schu/centos64-2-5.5.qcow2._COPYING_
> java.io.IOException: Failed on local exception: java.io.EOFException; Host 
> Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination 
> host is: ""styx01.\
> sf.cloudera.com":12020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
> at org.apache.hadoop.ipc.Client.call(Client.java:1145)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:188)
> at $Proxy9.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:302)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
> at $Proxy10.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1097)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:973)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at 
> org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:830)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:762)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272909#comment-13272909
 ] 

Hadoop QA commented on HDFS-3401:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526427/hdfs-3401.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 11 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2415//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2415//console

This message is automatically generated.

> Cleanup DatanodeDescriptor creation in the tests
> 
>
> Key: HDFS-3401
> URL: https://issues.apache.org/jira/browse/HDFS-3401
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, test
>Affects Versions: 2.0.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Attachments: hdfs-3401.txt
>
>
> Like HDFS-3230 but for DatanodeDescriptor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3134) Harden edit log loader against malformed or malicious input

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272902#comment-13272902
 ] 

Hudson commented on HDFS-3134:
--

Integrated in Hadoop-Common-trunk-Commit #2224 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2224/])
HDFS-3134. harden edit log loader against malformed or malicious input. 
Contributed by Colin Patrick McCabe (Revision 1336943)

 Result = SUCCESS
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336943
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenIdentifier.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java


> Harden edit log loader against malformed or malicious input
> ---
>
> Key: HDFS-3134
> URL: https://issues.apache.org/jira/browse/HDFS-3134
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.0.0
>
> Attachments: HDFS-3134.001.patch, HDFS-3134.002.patch, 
> HDFS-3134.003.patch, HDFS-3134.004.patch, HDFS-3134.005.patch, 
> HDFS-3134.006.patch, HDFS-3134.007.patch, HDFS-3134.009.patch
>
>
> Currently, the edit log loader does not handle bad or malicious input 
> sensibly.
> We can often cause OutOfMemory exceptions, null pointer exceptions, or other 
> unchecked exceptions to be thrown by feeding the edit log loader bad input.  
> In some environments, an out of memory error can cause the JVM process to be 
> terminated.
> It's clear that we want these exceptions to be thrown as IOException instead 
> of as unchecked exceptions.  We also want to avoid out of memory situations.
> The main task here is to put a sensible upper limit on the lengths of arrays 
> and strings we allocate on command.  The other task is to try to avoid 
> creating unchecked exceptions (by dereferencing potentially-NULL pointers, 
> for example).  Instead, we should verify ahead of time and give a more 
> sensible error message that reflects the problem with the input.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3388) GetJournalEditServlet should catch more exceptions, not just IOException

2012-05-10 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272896#comment-13272896
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-3388:
--

- Since GetJournalEditServletFaultInjector is an inner class, let's simply call 
it FaultInjector.

- GetJournalEditServletFaultInjector.getInstance() is not used (and it should 
be static if you want to use it).

- Change 
{code}
new String(path1.toString() + "/current")
{code}
to 
{code}
path1 + "/current"
{code}


> GetJournalEditServlet should catch more exceptions, not just IOException
> 
>
> Key: HDFS-3388
> URL: https://issues.apache.org/jira/browse/HDFS-3388
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Reporter: Brandon Li
>Assignee: Brandon Li
> Attachments: HDFS-3388.HDFS-3092.patch, HDFS-3388.HDFS-3092.patch
>
>
> GetJournalEditServlet has the same problem as that of GetImageServlet 
> (HDFS-3330). It should be fixed in the same way. Also need to make 
> CheckpointFaultInjector visible for journal service tests.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3134) Harden edit log loader against malformed or malicious input

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272895#comment-13272895
 ] 

Hudson commented on HDFS-3134:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2299 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2299/])
HDFS-3134. harden edit log loader against malformed or malicious input. 
Contributed by Colin Patrick McCabe (Revision 1336943)

 Result = SUCCESS
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336943
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenIdentifier.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogOp.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java


> Harden edit log loader against malformed or malicious input
> ---
>
> Key: HDFS-3134
> URL: https://issues.apache.org/jira/browse/HDFS-3134
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.0.0
>
> Attachments: HDFS-3134.001.patch, HDFS-3134.002.patch, 
> HDFS-3134.003.patch, HDFS-3134.004.patch, HDFS-3134.005.patch, 
> HDFS-3134.006.patch, HDFS-3134.007.patch, HDFS-3134.009.patch
>
>
> Currently, the edit log loader does not handle bad or malicious input 
> sensibly.
> We can often cause OutOfMemory exceptions, null pointer exceptions, or other 
> unchecked exceptions to be thrown by feeding the edit log loader bad input.  
> In some environments, an out of memory error can cause the JVM process to be 
> terminated.
> It's clear that we want these exceptions to be thrown as IOException instead 
> of as unchecked exceptions.  We also want to avoid out of memory situations.
> The main task here is to put a sensible upper limit on the lengths of arrays 
> and strings we allocate on command.  The other task is to try to avoid 
> creating unchecked exceptions (by dereferencing potentially-NULL pointers, 
> for example).  Instead, we should verify ahead of time and give a more 
> sensible error message that reflects the problem with the input.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-744) Support hsync in HDFS

2012-05-10 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HDFS-744:
---

Attachment: HDFS-744-trunk-v2.patch

Here's the patch that I tested against HBase.
(I posted the required HBase changes on the linked jira.)

HBase starts up, and I can flush and compact tables.
I verified via a debugger that the sync path is correctly triggered.

*Please* have a look. For users like us (Salesforce.com) this is an important 
data safety feature.
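
For context, a short usage sketch of the API this patch targets: hsync() on 
FSDataOutputStream (the Syncable interface). hflush() pushes data to all 
replica DataNodes; with this patch, hsync() should additionally have each 
DataNode fsync the data to its disk device. The path below is illustrative.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HsyncExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FSDataOutputStream out = fs.create(new Path("/tmp/wal"));
    out.write("edit-1".getBytes("UTF-8"));
    out.hflush(); // visible to readers, but not necessarily on disk
    out.hsync();  // with this patch: every replica has fsync'd the data
    out.close();
  }
}
{code}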


> Support hsync in HDFS
> -
>
> Key: HDFS-744
> URL: https://issues.apache.org/jira/browse/HDFS-744
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Hairong Kuang
> Attachments: HDFS-744-trunk-v2.patch, HDFS-744-trunk.patch, 
> hdfs-744-v2.txt, hdfs-744-v3.txt, hdfs-744.txt
>
>
> HDFS-731 implements hsync by default as hflush. As described in HADOOP-6313, 
> the real expected semantics should be "flushes out to all replicas and all 
> replicas have done posix fsync equivalent - ie the OS has flushed it to the 
> disk device (but the disk may have it in its cache)." This jira aims to 
> implement the expected behaviour.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3134) Harden edit log loader against malformed or malicious input

2012-05-10 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3134:
--

Target Version/s: (was: 2.0.0)
Fix Version/s: 2.0.0
Issue Type: Improvement  (was: Bug)
Hadoop Flags: Reviewed
Summary: Harden edit log loader against malformed or malicious 
input  (was: harden edit log loader against malformed or malicious input)

I've committed this and merged to branch-2, thanks Colin!

> Harden edit log loader against malformed or malicious input
> ---
>
> Key: HDFS-3134
> URL: https://issues.apache.org/jira/browse/HDFS-3134
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Fix For: 2.0.0
>
> Attachments: HDFS-3134.001.patch, HDFS-3134.002.patch, 
> HDFS-3134.003.patch, HDFS-3134.004.patch, HDFS-3134.005.patch, 
> HDFS-3134.006.patch, HDFS-3134.007.patch, HDFS-3134.009.patch
>
>
> Currently, the edit log loader does not handle bad or malicious input 
> sensibly.
> We can often cause OutOfMemory exceptions, null pointer exceptions, or other 
> unchecked exceptions to be thrown by feeding the edit log loader bad input.  
> In some environments, an out of memory error can cause the JVM process to be 
> terminated.
> It's clear that we want these exceptions to be thrown as IOException instead 
> of as unchecked exceptions.  We also want to avoid out of memory situations.
> The main task here is to put a sensible upper limit on the lengths of arrays 
> and strings we allocate on command.  The other task is to try to avoid 
> creating unchecked exceptions (by dereferencing potentially-NULL pointers, 
> for example).  Instead, we should verify ahead of time and give a more 
> sensible error message that reflects the problem with the input.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3134) harden edit log loader against malformed or malicious input

2012-05-10 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272879#comment-13272879
 ] 

Eli Collins commented on HDFS-3134:
---

+1 looks good

> harden edit log loader against malformed or malicious input
> ---
>
> Key: HDFS-3134
> URL: https://issues.apache.org/jira/browse/HDFS-3134
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-3134.001.patch, HDFS-3134.002.patch, 
> HDFS-3134.003.patch, HDFS-3134.004.patch, HDFS-3134.005.patch, 
> HDFS-3134.006.patch, HDFS-3134.007.patch, HDFS-3134.009.patch
>
>
> Currently, the edit log loader does not handle bad or malicious input 
> sensibly.
> We can often cause OutOfMemory exceptions, null pointer exceptions, or other 
> unchecked exceptions to be thrown by feeding the edit log loader bad input.  
> In some environments, an out of memory error can cause the JVM process to be 
> terminated.
> It's clear that we want these exceptions to be thrown as IOException instead 
> of as unchecked exceptions.  We also want to avoid out of memory situations.
> The main task here is to put a sensible upper limit on the lengths of arrays 
> and strings we allocate on command.  The other task is to try to avoid 
> creating unchecked exceptions (by dereferencing potentially-NULL pointers, 
> for example).  Instead, we should verify ahead of time and give a more 
> sensible error message that reflects the problem with the input.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3163) TestHDFSCLI.testAll fails if the user name is not all lowercase

2012-05-10 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-3163:
-

Assignee: Brandon Li
  Status: Patch Available  (was: Open)

> TestHDFSCLI.testAll fails if the user name is not all lowercase
> ---
>
> Key: HDFS-3163
> URL: https://issues.apache.org/jira/browse/HDFS-3163
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Trivial
> Attachments: HDFS-3163.patch
>
>
> In the test resource file testHDFSConf.xml, the test comparators expect the 
> user name to be all lowercase. 
> If the user issuing the test has an uppercase letter in the username (e.g., 
> Brandon instead of brandon), many RegexpComparator tests will fail. The 
> following is one example:
> {noformat} 
> <comparator>
>   <type>RegexpComparator</type>
>   <expected-output>^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( 
> )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( 
> )*/file1</expected-output>
> </comparator>
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3163) TestHDFSCLI.testAll fails if the user name is not all lowercase

2012-05-10 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-3163:
-

Attachment: HDFS-3163.patch

Changed testHDFSConf.xml to support user names with uppercase letters and numbers.
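
A plausible shape of that change, shown against the example in the description 
below; the [a-zA-Z0-9]* character class is an assumption, and the committed 
pattern may differ:

{noformat}
<expected-output>^-rw-r--r--( )*1( )*[a-zA-Z0-9]*( )*supergroup( )*0( )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( )*/file1</expected-output>
{noformat}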

> TestHDFSCLI.testAll fails if the user name is not all lowercase
> ---
>
> Key: HDFS-3163
> URL: https://issues.apache.org/jira/browse/HDFS-3163
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Brandon Li
>Priority: Trivial
> Attachments: HDFS-3163.patch
>
>
> In the test resource file testHDFSConf.xml, the test comparators expect the 
> user name to be all lowercase. 
> If the user issuing the test has an uppercase letter in the username (e.g., 
> Brandon instead of brandon), many RegexpComparator tests will fail. The 
> following is one example:
> {noformat} 
> <comparator>
>   <type>RegexpComparator</type>
>   <expected-output>^-rw-r--r--( )*1( )*[a-z]*( )*supergroup( )*0( 
> )*[0-9]{4,}-[0-9]{2,}-[0-9]{2,} [0-9]{2,}:[0-9]{2,}( 
> )*/file1</expected-output>
> </comparator>
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3335) check for edit log corruption at the end of the log

2012-05-10 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3335:
---

Attachment: HDFS-3335-b1.004.patch

For the branch-1 patch:

Ignore corruption after the sentinel as long as it occurs in the last 2 
megabytes of the log.

Add a test for this exception.
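
A minimal sketch of that tail check, with illustrative names rather than the 
branch-1 patch itself: non-zero bytes after the OP_INVALID sentinel are 
tolerated only if they fall within the final 2 MB of the file.

{code}
import java.io.IOException;
import java.io.RandomAccessFile;

class EditLogTailCheck {
  private static final long TOLERATED_TAIL = 2L * 1024 * 1024; // 2 MB

  static void checkTail(RandomAccessFile log, long sentinelOffset)
      throws IOException {
    long len = log.length();
    long toleratedFrom = Math.max(sentinelOffset, len - TOLERATED_TAIL);
    log.seek(sentinelOffset);
    for (long pos = sentinelOffset; pos < len; pos++) {
      int b = log.read();
      // A non-zero byte past the sentinel is corruption unless it falls
      // within the tolerated window at the end of the log.
      if (b != 0 && pos < toleratedFrom) {
        throw new IOException("Corruption at offset " + pos
            + " beyond the end of the edit log");
      }
    }
  }
}
{code}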

> check for edit log corruption at the end of the log
> ---
>
> Key: HDFS-3335
> URL: https://issues.apache.org/jira/browse/HDFS-3335
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, 
> HDFS-3335-b1.003.patch, HDFS-3335-b1.004.patch, HDFS-3335.001.patch, 
> HDFS-3335.002.patch, HDFS-3335.003.patch, HDFS-3335.004.patch, 
> HDFS-3335.005.patch, HDFS-3335.006.patch, HDFS-3335.007.patch
>
>
> Even after encountering an OP_INVALID, we should check the end of the edit 
> log to make sure that it contains no more edits.
> This will catch things like rare race conditions or log corruptions that 
> would otherwise remain undetected.  They will go from being silent data loss 
> scenarios to being cases that we can detect and fix.
> Using recovery mode, we can choose to ignore the end of the log if necessary.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3369) change variable names referring to inode in blockmanagement to more appropriate

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272858#comment-13272858
 ] 

Hudson commented on HDFS-3369:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2241 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2241/])
HDFS-3369. Rename {get|set|add}INode(..) methods in BlockManager and 
BlocksMap to {get|set|add}BlockCollection(..).  Contributed by John George 
(Revision 1336909)

 Result = ABORTED
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336909
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFileUnderConstruction.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeJspHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyRaid.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/test/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockPlacementPolicyRaid.java


> change variable names referring to inode in blockmanagement to more 
> appropriate
> ---
>
> Key: HDFS-3369
> URL: https://issues.apache.org/jira/browse/HDFS-3369
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0, 3.0.0
>Reporter: John George
>Assignee: John George
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HDFS-3369.patch
>
>
> We should rename BlocksMap.getINode(..) and, in addition, local variable 
> names such as fileInode to match 'block collection'.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests

2012-05-10 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272855#comment-13272855
 ] 

Aaron T. Myers commented on HDFS-3401:
--

Patch looks good to me. +1 pending Jenkins.

> Cleanup DatanodeDescriptor creation in the tests
> 
>
> Key: HDFS-3401
> URL: https://issues.apache.org/jira/browse/HDFS-3401
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, test
>Affects Versions: 2.0.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Attachments: hdfs-3401.txt
>
>
> Like HDFS-3230 but for DatanodeDescriptor.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from

2012-05-10 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3404:
-

Attachment: HDFS-3404.patch

Here's an initial patch to make sure folks are OK with the approach. I'm still 
mulling over how best to write tests for this, which is a tad difficult on a 
single-node machine.

I tested this manually by setting up an HA setup where each NN itself binds to 
0.0.0.0, but has actual addresses for the other NN. It worked as expected.

> Make putImage in GetImageServlet infer remote address to fetch from
> ---
>
> Key: HDFS-3404
> URL: https://issues.apache.org/jira/browse/HDFS-3404
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3404.patch
>
>
> As it stands, daemons which perform checkpointing must determine their own 
> address on which they can be reached, so that the NN which they checkpoint 
> against knows what address to fetch a merged fsimage from. This causes 
> problems if, for example, the daemon performing checkpointing binds to 
> 0.0.0.0, and thus can't be sure of what address the NN can reach it at.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3404) Make putImage in GetImageServlet infer remote address to fetch from

2012-05-10 Thread Aaron T. Myers (JIRA)
Aaron T. Myers created HDFS-3404:


 Summary: Make putImage in GetImageServlet infer remote address to 
fetch from
 Key: HDFS-3404
 URL: https://issues.apache.org/jira/browse/HDFS-3404
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers


As it stands, daemons which perform checkpointing must determine their own 
address on which they can be reached, so that the NN which they checkpoint 
against knows what address to fetch a merged fsimage from. This causes problems 
if, for example, the daemon performing checkpointing binds to 0.0.0.0, and thus 
can't be sure of what address the NN can reach it at.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3391) TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing

2012-05-10 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3391:
--

Summary: TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing  
(was: Failing tests in branch-2)

> TestPipelinesFailover#testLeaseRecoveryAfterFailover is failing
> ---
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Arun C Murthy
>Assignee: Todd Lipcon
>Priority: Critical
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< 
> FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec 
> <<< FAILURE!
> --

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3026) HA: Handle failure during HA state transition

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272847#comment-13272847
 ] 

Hadoop QA commented on HDFS-3026:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526418/HDFS-3026.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2414//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2414//console

This message is automatically generated.

> HA: Handle failure during HA state transition
> -
>
> Key: HDFS-3026
> URL: https://issues.apache.org/jira/browse/HDFS-3026
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, 
> HDFS-3026.patch, HDFS-3026.patch
>
>
> This JIRA is to address a TODO in NameNode about handling the possibility of 
> an incomplete HA state transition.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Moved] (HDFS-3403) SecondaryNamenode doesn't start up in secure cluster

2012-05-10 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony moved MAPREDUCE-4245 to HDFS-3403:
---

Component/s: security (was: security)
Fix Version/s: 0.22.1 (was: 0.22.1)
Affects Version/s: 0.22.0 (was: 0.22.0)
Key: HDFS-3403 (was: MAPREDUCE-4245)
Project: Hadoop HDFS (was: Hadoop Map/Reduce)

> SecondaryNamenode doesn't start up in secure cluster
> 
>
> Key: HDFS-3403
> URL: https://issues.apache.org/jira/browse/HDFS-3403
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: security
>Affects Versions: 0.22.0
>Reporter: Benoy Antony
>Assignee: Benoy Antony
>Priority: Minor
> Fix For: 0.22.1
>
> Attachments: incorrect-sn-principal.patch
>
>
> SN fails to start up due to an access control error. This is an authorization 
> issue, not an authentication issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3400) DNs should be able to start with jsvc even if security is disabled

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272843#comment-13272843
 ] 

Hadoop QA commented on HDFS-3400:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526406/HDFS-3400.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2413//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2413//console

This message is automatically generated.

> DNs should be able to start with jsvc even if security is disabled
> ---
>
> Key: HDFS-3400
> URL: https://issues.apache.org/jira/browse/HDFS-3400
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, scripts
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3400.patch
>
>
> Currently if one tries to start a DN with security disabled (via 
> hadoop.security.authentication = "simple" in the configs), but JSVC is 
> correctly configured, the DN will refuse to start.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3391) Failing tests in branch-2

2012-05-10 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272840#comment-13272840
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-3391:
--

It does fail in trunk as in [build 
#2397|https://builds.apache.org/job/PreCommit-HDFS-Build/2397//testReport/org.apache.hadoop.hdfs.server.namenode.ha/TestPipelinesFailover/testLeaseRecoveryAfterFailover/].
  The error is the same as the one Eli got.

> Failing tests in branch-2
> -
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Arun C Murthy
>Assignee: Todd Lipcon
>Priority: Critical
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< 
> FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec 
> <<< FAILURE!
> --

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3026) HA: Handle failure during HA state transition

2012-05-10 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272838#comment-13272838
 ] 

Eli Collins commented on HDFS-3026:
---

+1 looks great

> HA: Handle failure during HA state transition
> -
>
> Key: HDFS-3026
> URL: https://issues.apache.org/jira/browse/HDFS-3026
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, 
> HDFS-3026.patch, HDFS-3026.patch
>
>
> This JIRA is to address a TODO in NameNode about handling the possibility of 
> an incomplete HA state transition.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Moved] (HDFS-3402) Fix hdfs script for secure datanodes

2012-05-10 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony moved HADOOP-8376 to HDFS-3402:


Component/s: security (was: security)
Fix Version/s: 0.22.1 (was: 0.22.1)
Affects Version/s: 0.22.0 (was: 0.22.0)
Key: HDFS-3402 (was: HADOOP-8376)
Project: Hadoop HDFS (was: Hadoop Common)

> Fix hdfs script for secure datanodes
> 
>
> Key: HDFS-3402
> URL: https://issues.apache.org/jira/browse/HDFS-3402
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: security
>Affects Versions: 0.22.0
>Reporter: Benoy Antony
>Assignee: Benoy Antony
>Priority: Minor
> Fix For: 0.22.1
>
> Attachments: hdfs-jsvc.patch
>
>
> Starting a secure datanode gives the following error:
> 09/04/2012 12:09:30 2524 jsvc error: Invalid option -server
> 09/04/2012 12:09:30 2524 jsvc error: Cannot parse command line arguments

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3400) DNs should be able to start with jsvc even if security is disabled

2012-05-10 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272834#comment-13272834
 ] 

Eli Collins commented on HDFS-3400:
---

+1 looks good to me as well

> DNs should be able to start with jsvc even if security is disabled
> ---
>
> Key: HDFS-3400
> URL: https://issues.apache.org/jira/browse/HDFS-3400
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, scripts
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3400.patch
>
>
> Currently if one tries to start a DN with security disabled (via 
> hadoop.security.authentication = "simple" in the configs), but JSVC is 
> correctly configured, the DN will refuse to start.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3049) During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt

2012-05-10 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3049:
---

Attachment: HDFS-3049.005.against3335.patch

* patch against 3335

> During the normal loading NN startup process, fall back on a different 
> EditLog if we see one that is corrupt
> 
>
> Key: HDFS-3049
> URL: https://issues.apache.org/jira/browse/HDFS-3049
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-3049.001.patch, HDFS-3049.002.patch, 
> HDFS-3049.003.patch, HDFS-3049.005.against3335.patch
>
>
> During the NameNode startup process, we load an image, and then apply edit 
> logs to it until we believe that we have all the latest changes.  
> Unfortunately, if there is an I/O error while reading any of these files, in 
> most cases, we simply abort the startup process.  We should try harder to 
> locate a readable edit log and/or image file.
> *There are three main use cases for this feature:*
> 1. If the operating system does not honor fsync (usually due to a 
> misconfiguration), a file may end up in an inconsistent state.
> 2. In certain older releases where we did not use fallocate() or similar to 
> pre-reserve blocks, a disk full condition may cause a truncated log in one 
> edit directory.
> 3. There may be a bug in HDFS which results in some of the data directories 
> receiving corrupt data, but not all.  This is the least likely use case.
> *Proposed changes to normal NN startup*
> * We should try a different FSImage if we can't load the first one we try.
> * We should examine other FSEditLogs if we can't load the first one(s) we try.
> * We should fail if we can't find EditLogs that would bring us up to what we 
> believe is the latest transaction ID.
> Proposed changes to recovery mode NN startup:
> we should list out all the available storage directories and allow the 
> operator to select which one he wants to use.
> Something like this:
> {code}
> Multiple storage directories found.
> 1. /foo/bar
> edits__current__XYZ  size:213421345   md5:2345345
> image  size:213421345   md5:2345345
> 2. /foo/baz
> edits__current__XYZ  size:213421345   md5:2345345345
> image  size:213421345   md5:2345345
> Which one would you like to use? (1/2)
> {code}
> As usual in recovery mode, we want to be flexible about error handling.  In 
> this case, this means that we should NOT fail if we can't find EditLogs that 
> would bring us up to what we believe is the latest transaction ID.
> *Not addressed by this feature*
> This feature will not address the case where an attempt to access the 
> NameNode name directory or directories hangs because of an I/O error.  This 
> may happen, for example, when trying to load an image from a hard-mounted NFS 
> directory, when the NFS server has gone away.  Just as now, the operator will 
> have to notice this problem and take steps to correct it.
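
A minimal sketch of the fallback behavior proposed for normal startup 
(hypothetical names; the real logic would live in the FSImage loading path): 
try each candidate image in turn instead of aborting on the first I/O error.

{code}
import java.io.File;
import java.io.IOException;
import java.util.List;

class ImageFallbackSketch {
  interface ImageLoader {
    void load(File imageFile) throws IOException;
  }

  static void loadFirstReadable(List<File> candidates, ImageLoader loader)
      throws IOException {
    IOException last = null;
    for (File f : candidates) {
      try {
        loader.load(f);
        return; // loaded successfully
      } catch (IOException e) {
        last = e; // remember the failure and fall back to the next image
      }
    }
    throw new IOException("No readable FSImage found", last);
  }
}
{code}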

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-3391) Failing tests in branch-2

2012-05-10 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272832#comment-13272832
 ] 

Eli Collins commented on HDFS-3391:
---

Forgot to mention, I only see TestPipelinesFailover fail on branch-2-alpha, not 
trunk.

> Failing tests in branch-2
> -
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Arun C Murthy
>Assignee: Todd Lipcon
>Priority: Critical
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< 
> FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec 
> <<< FAILURE!
> --

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HDFS-3049) During the normal loading NN startup process, fall back on a different EditLog if we see one that is corrupt

2012-05-10 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-3049:
---

Status: Open  (was: Patch Available)

> During the normal loading NN startup process, fall back on a different 
> EditLog if we see one that is corrupt
> 
>
> Key: HDFS-3049
> URL: https://issues.apache.org/jira/browse/HDFS-3049
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-3049.001.patch, HDFS-3049.002.patch, 
> HDFS-3049.003.patch
>
>
> During the NameNode startup process, we load an image, and then apply edit 
> logs to it until we believe that we have all the latest changes.  
> Unfortunately, if there is an I/O error while reading any of these files, in 
> most cases, we simply abort the startup process.  We should try harder to 
> locate a readable edit log and/or image file.
> *There are three main use cases for this feature:*
> 1. If the operating system does not honor fsync (usually due to a 
> misconfiguration), a file may end up in an inconsistent state.
> 2. In certain older releases where we did not use fallocate() or similar to 
> pre-reserve blocks, a disk full condition may cause a truncated log in one 
> edit directory.
> 3. There may be a bug in HDFS which results in some of the data directories 
> receiving corrupt data, but not all.  This is the least likely use case.
> *Proposed changes to normal NN startup*
> * We should try a different FSImage if we can't load the first one we try.
> * We should examine other FSEditLogs if we can't load the first one(s) we try.
> * We should fail if we can't find EditLogs that would bring us up to what we 
> believe is the latest transaction ID.
> *Proposed changes to recovery mode NN startup*
> We should list out all the available storage directories and allow the 
> operator to select which one to use.
> Something like this:
> {code}
> Multiple storage directories found.
> 1. /foo/bar
> edits__current__XYZ  size:213421345   md5:2345345
> image  size:213421345   md5:2345345
> 2. /foo/baz
> edits__current__XYZ  size:213421345   md5:2345345345
> image  size:213421345   md5:2345345
> Which one would you like to use? (1/2)
> {code}
> As usual in recovery mode, we want to be flexible about error handling.  In 
> this case, this means that we should NOT fail if we can't find EditLogs that 
> would bring us up to what we believe is the latest transaction ID.
> *Not addressed by this feature*
> This feature will not address the case where an attempt to access the 
> NameNode name directory or directories hangs because of an I/O error.  This 
> may happen, for example, when trying to load an image from a hard-mounted NFS 
> directory, when the NFS server has gone away.  Just as now, the operator will 
> have to notice this problem and take steps to correct it.
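
To make the proposed fallback concrete, here is a minimal sketch of the loop 
during normal startup. EditLogFile, loadEdits() and the candidate ordering are 
hypothetical stand-ins for illustration, not the real FSEditLog API:

{code}
import java.io.IOException;
import java.util.List;

class EditLogFallbackSketch {
  // Hypothetical stand-in for one edit log file in a storage directory.
  interface EditLogFile {
    long loadEdits() throws IOException;  // applies ops, returns last txid read
  }

  // Try each candidate log in turn; fail only if none of them brings us
  // up to the transaction ID we believe is the latest.
  static void loadUpTo(long expectedLastTxId, List<EditLogFile> candidates)
      throws IOException {
    for (EditLogFile log : candidates) {
      try {
        if (log.loadEdits() >= expectedLastTxId) {
          return;  // caught up to the expected txid; done
        }
      } catch (IOException e) {
        // Corrupt or truncated log: fall back on the next storage
        // directory instead of aborting startup immediately.
      }
    }
    throw new IOException("No edit log reaches txid " + expectedLastTxId
        + " (recovery mode would warn here instead of failing)");
  }
}
{code}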





[jira] [Commented] (HDFS-3400) DNs should be able to start with jsvc even if security is disabled

2012-05-10 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272827#comment-13272827
 ] 

Jakob Homan commented on HDFS-3400:
---

+1

> DNs should be able to start with jsvc even if security is disabled
> ---
>
> Key: HDFS-3400
> URL: https://issues.apache.org/jira/browse/HDFS-3400
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, scripts
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3400.patch
>
>
> Currently if one tries to start a DN with security disabled (via 
> hadoop.security.authentication = "simple" in the configs), but JSVC is 
> correctly configured, the DN will refuse to start.





[jira] [Commented] (HDFS-3369) change variable names referring to inode in blockmanagement to more appropriate ones

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272825#comment-13272825
 ] 

Hudson commented on HDFS-3369:
--

Integrated in Hadoop-Common-trunk-Commit #2223 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2223/])
HDFS-3369. Rename {get|set|add}INode(..) methods in BlockManager and 
BlocksMap to {get|set|add}BlockCollection(..).  Contributed by John George 
(Revision 1336909)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336909
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFileUnderConstruction.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeJspHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyRaid.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/test/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockPlacementPolicyRaid.java


> change variable names referring to inode in blockmanagement to more 
> appropriate ones
> ---
>
> Key: HDFS-3369
> URL: https://issues.apache.org/jira/browse/HDFS-3369
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0, 3.0.0
>Reporter: John George
>Assignee: John George
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HDFS-3369.patch
>
>
> We should rename BlocksMap.getINode(..) and, in addition, the local variable 
> names such as fileInode to match 'block collection'
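
For illustration, the shape of the rename (names paraphrased from the commit 
message above, not the full diff):

{code}
// Before: block management code referred to file inodes directly.
//   INodeFile fileInode = blocksMap.getINode(block);
// After: it refers to the more abstract block collection.
//   BlockCollection bc = blocksMap.getBlockCollection(block);
{code}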





[jira] [Commented] (HDFS-3369) change variable names referring to inode in blockmanagement to more appropriate ones

2012-05-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272815#comment-13272815
 ] 

Hudson commented on HDFS-3369:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2298 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2298/])
HDFS-3369. Rename {get|set|add}INode(..) methods in BlockManager and 
BlocksMap to {get|set|add}BlockCollection(..).  Contributed by John George 
(Revision 1336909)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1336909
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlocksMap.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFile.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/INodeFileUnderConstruction.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeJspHelper.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyRaid.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/src/contrib/raid/src/test/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockPlacementPolicyRaid.java


> change variable names referring to inode in blockmanagement to more 
> appropriate ones
> ---
>
> Key: HDFS-3369
> URL: https://issues.apache.org/jira/browse/HDFS-3369
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0, 3.0.0
>Reporter: John George
>Assignee: John George
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HDFS-3369.patch
>
>
> We should rename BlocksMap.getINode(..) and, in addition, the local variable 
> names such as fileInode to match 'block collection'





[jira] [Commented] (HDFS-3335) check for edit log corruption at the end of the log

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272808#comment-13272808
 ] 

Hadoop QA commented on HDFS-3335:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526396/HDFS-3335.007.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified test 
files.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2411//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2411//console

This message is automatically generated.

> check for edit log corruption at the end of the log
> ---
>
> Key: HDFS-3335
> URL: https://issues.apache.org/jira/browse/HDFS-3335
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-3335-b1.001.patch, HDFS-3335-b1.002.patch, 
> HDFS-3335-b1.003.patch, HDFS-3335.001.patch, HDFS-3335.002.patch, 
> HDFS-3335.003.patch, HDFS-3335.004.patch, HDFS-3335.005.patch, 
> HDFS-3335.006.patch, HDFS-3335.007.patch
>
>
> Even after encountering an OP_INVALID, we should check the end of the edit 
> log to make sure that it contains no more edits.
> This will catch things like rare race conditions or log corruptions that 
> would otherwise remain undetected.  They will go from being silent data loss 
> scenarios to being cases that we can detect and fix.
> Using recovery mode, we can choose to ignore the end of the log if necessary.
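
A minimal sketch of the end-of-log check, assuming the convention that a 
cleanly finalized log is padded with the OP_INVALID byte after the last edit 
(the 0xff value and the method names here are assumptions for illustration):

{code}
import java.io.IOException;
import java.io.InputStream;

class EditLogTailCheckSketch {
  static final int OP_INVALID_BYTE = 0xff;  // assumed padding byte

  // After the last valid op, only padding should remain; any other byte
  // past that point indicates a truncated write or corruption.
  static boolean tailIsClean(InputStream in) throws IOException {
    int b;
    while ((b = in.read()) != -1) {
      if (b != OP_INVALID_BYTE) {
        return false;  // real data past the end of the log
      }
    }
    return true;
  }
}
{code}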





[jira] [Updated] (HDFS-3369) change variable names referring to inode in blockmanagement to more appropriate ones

2012-05-10 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3369:
-

  Resolution: Fixed
   Fix Version/s: 2.0.0
Target Version/s: 2.0.0, 3.0.0  (was: 3.0.0, 2.0.0)
  Status: Resolved  (was: Patch Available)

I have committed this.  Thanks, John!

> change variable names referring to inode in blockmanagement to more 
> appropriate ones
> ---
>
> Key: HDFS-3369
> URL: https://issues.apache.org/jira/browse/HDFS-3369
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0, 3.0.0
>Reporter: John George
>Assignee: John George
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HDFS-3369.patch
>
>
> We should rename BlocksMap.getINode(..) and, in addition, the local variable 
> names such as fileInode to match 'block collection'





[jira] [Updated] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests

2012-05-10 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3401:
--

Status: Patch Available  (was: Open)

> Cleanup DatanodeDescriptor creation in the tests
> 
>
> Key: HDFS-3401
> URL: https://issues.apache.org/jira/browse/HDFS-3401
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, test
>Affects Versions: 2.0.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Attachments: hdfs-3401.txt
>
>
> Like HDFS-3230 but for DatanodeDescriptor.





[jira] [Updated] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests

2012-05-10 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3401:
--

Attachment: hdfs-3401.txt

Patch attached.

> Cleanup DatanodeDescriptor creation in the tests
> 
>
> Key: HDFS-3401
> URL: https://issues.apache.org/jira/browse/HDFS-3401
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, test
>Affects Versions: 2.0.0
>Reporter: Eli Collins
>Assignee: Eli Collins
> Attachments: hdfs-3401.txt
>
>
> Like HDFS-3230 but for DatanodeDescriptor.





[jira] [Created] (HDFS-3401) Cleanup DatanodeDescriptor creation in the tests

2012-05-10 Thread Eli Collins (JIRA)
Eli Collins created HDFS-3401:
-

 Summary: Cleanup DatanodeDescriptor creation in the tests
 Key: HDFS-3401
 URL: https://issues.apache.org/jira/browse/HDFS-3401
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node, test
Affects Versions: 2.0.0
Reporter: Eli Collins
Assignee: Eli Collins


Like HDFS-3230 but for DatanodeDescriptor.





[jira] [Updated] (HDFS-3026) HA: Handle failure during HA state transition

2012-05-10 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3026:
-

Attachment: HDFS-3026.patch

Forgot to address the findbugs warning - just need to synchronize 
NameNode#setRuntimeForTesting.

> HA: Handle failure during HA state transition
> -
>
> Key: HDFS-3026
> URL: https://issues.apache.org/jira/browse/HDFS-3026
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, 
> HDFS-3026.patch, HDFS-3026.patch
>
>
> This JIRA is to address a TODO in NameNode about handling the possibility of 
> an incomplete HA state transition.





[jira] [Commented] (HDFS-3369) change variable names referring to inode in blockmanagement to more appropriate ones

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272741#comment-13272741
 ] 

Hadoop QA commented on HDFS-3369:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526180/HDFS-3369.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified test 
files.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2410//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2410//console

This message is automatically generated.

> change variable names referring to inode in blockmanagement to more 
> appropriate ones
> ---
>
> Key: HDFS-3369
> URL: https://issues.apache.org/jira/browse/HDFS-3369
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0, 3.0.0
>Reporter: John George
>Assignee: John George
>Priority: Minor
> Attachments: HDFS-3369.patch
>
>
> We should rename BlocksMap.getINode(..) and, in addition, the local variable 
> names such as fileInode to match 'block collection'





[jira] [Commented] (HDFS-3372) offlineEditsViewer should be able to read a binary edits file with recovery mode

2012-05-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272743#comment-13272743
 ] 

Hadoop QA commented on HDFS-3372:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12526388/HDFS-3372.002.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified test 
files.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/2409//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2409//console

This message is automatically generated.

> offlineEditsViewer should be able to read a binary edits file with recovery 
> mode
> 
>
> Key: HDFS-3372
> URL: https://issues.apache.org/jira/browse/HDFS-3372
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Attachments: HDFS-3372.001.patch, HDFS-3372.002.patch
>
>
> It would be nice if oev (the offline edits viewer) had a switch that allowed 
> us to read a binary edits file using recovery mode.  oev can be very useful 
> when working with corrupt or messed up edit log files, and this would make it 
> even more so.
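
A sketch of what a recovery-mode read loop in the viewer might look like; 
OpStream, readNextOp() and resyncToNextOp() are hypothetical names, not the 
actual oev interfaces:

{code}
import java.io.IOException;

class TolerantEditsViewerSketch {
  interface OpStream {
    Object readNextOp() throws IOException;    // null at end of log
    void resyncToNextOp() throws IOException;  // skip past unreadable bytes
  }

  // In normal mode a bad op aborts the run; in recovery mode we skip it
  // and keep visiting the ops we can still decode.
  static int visitOps(OpStream in, boolean recover) throws IOException {
    int visited = 0;
    while (true) {
      try {
        Object op = in.readNextOp();
        if (op == null) {
          return visited;
        }
        visited++;  // a real viewer would render the op here
      } catch (IOException e) {
        if (!recover) {
          throw e;  // normal mode: fail fast
        }
        in.resyncToNextOp();
      }
    }
  }
}
{code}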





[jira] [Updated] (HDFS-3026) HA: Handle failure during HA state transition

2012-05-10 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3026:
-

Attachment: HDFS-3026.patch

Thanks a lot for the review, Eli. Here's an updated patch.

Good idea re: trash emptier thread. I've done that in this patch.

As for the other exits in NameNode - all of those are exit codes from shell 
commands (e.g. format, bootstrapStandby, etc.) or from the static main 
function, none of which I think really benefits from calling this method. Good 
point about making the error message more generic, though. I've gone ahead and 
done that.
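
For readers following along, a rough sketch of the failure handling under 
discussion; the enum, enterState() and doImmediateShutdown() names are 
illustrative, not the actual patch:

{code}
class HaTransitionSketch {
  enum HAState { ACTIVE, STANDBY }

  // A transition that fails partway leaves the NN in an unknown state,
  // so the safest response is to terminate rather than keep serving.
  synchronized void transitionTo(HAState target) {
    try {
      enterState(target);  // may throw after doing partial work
    } catch (Throwable t) {
      doImmediateShutdown(t);
    }
  }

  void enterState(HAState target) { /* elided */ }

  void doImmediateShutdown(Throwable t) {
    System.err.println("Error during HA state transition, shutting down: " + t);
    Runtime.getRuntime().exit(1);
  }
}
{code}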

> HA: Handle failure during HA state transition
> -
>
> Key: HDFS-3026
> URL: https://issues.apache.org/jira/browse/HDFS-3026
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, name-node
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3026-HDFS-1623.patch, HDFS-3026.patch, 
> HDFS-3026.patch
>
>
> This JIRA is to address a TODO in NameNode about handling the possibility of 
> an incomplete HA state transition.





[jira] [Commented] (HDFS-3391) Failing tests in branch-2

2012-05-10 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272714#comment-13272714
 ] 

Todd Lipcon commented on HDFS-3391:
---

I'll investigate TestPipelinesFailover, since I wrote it.

> Failing tests in branch-2
> -
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Arun C Murthy
>Assignee: Todd Lipcon
>Priority: Critical
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< 
> FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec 
> <<< FAILURE!
> --





[jira] [Assigned] (HDFS-3391) Failing tests in branch-2

2012-05-10 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned HDFS-3391:
-

Assignee: Todd Lipcon

> Failing tests in branch-2
> -
>
> Key: HDFS-3391
> URL: https://issues.apache.org/jira/browse/HDFS-3391
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Arun C Murthy
>Assignee: Todd Lipcon
>Priority: Critical
>
> Running org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.208 sec <<< 
> FAILURE!
> --
> Running org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
> Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 81.195 sec 
> <<< FAILURE!
> --





[jira] [Updated] (HDFS-3368) Missing blocks due to bad DataNodes coming up and down.

2012-05-10 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-3368:
--

Attachment: blockDeletePolicy.patch

I ended up using 4 as the multiplier for heartbeatInterval. Looking at my busy 
but healthy cluster, there are always some nodes with a last heartbeat around 
10 seconds, so a multiplier of 4 should cover that.
If some nodes are permanently late with heartbeats, then this policy will 
eventually reduce the block count on those nodes, which will reduce their load 
and potentially help with the heartbeats.
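
In other words, something like the following (constant names are assumed for 
illustration; 3 seconds is the HDFS default heartbeat interval):

{code}
class StaleReplicaPolicySketch {
  static final long HEARTBEAT_INTERVAL_MS = 3000;  // HDFS default: 3s
  static final int MULTIPLIER = 4;                 // per the comment above

  // A node whose last heartbeat is older than 4 intervals (12s) is treated
  // as suspect, so its replica is preferred when deleting an excess copy;
  // nodes that are ~10s late on a busy but healthy cluster stay safe.
  static boolean preferForDeletion(long lastHeartbeatMs, long nowMs) {
    return nowMs - lastHeartbeatMs > MULTIPLIER * HEARTBEAT_INTERVAL_MS;
  }
}
{code}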

> Missing blocks due to bad DataNodes coming up and down.
> 
>
> Key: HDFS-3368
> URL: https://issues.apache.org/jira/browse/HDFS-3368
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.22.0, 1.0.0, 2.0.0, 3.0.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Attachments: blockDeletePolicy.patch, blockDeletePolicy.patch
>
>
> All replicas of a block can be removed if bad DataNodes come up and down 
> during cluster restart, resulting in data loss.





[jira] [Updated] (HDFS-3230) Cleanup DatanodeID creation in the tests

2012-05-10 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3230:
--

   Resolution: Fixed
Fix Version/s: 2.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks ATM. I've committed this and merged to branch-2.

> Cleanup DatanodeID creation in the tests
> 
>
> Key: HDFS-3230
> URL: https://issues.apache.org/jira/browse/HDFS-3230
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: test
>Reporter: Eli Collins
>Assignee: Eli Collins
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: hdfs-3230.txt, hdfs-3230.txt
>
>
> A lot of tests create dummy DatanodeIDs, often using bogus values when 
> creating the objects (eg a hostname in the IP field), which they can get 
> away with because the IDs aren't actually used.  Let's add a test utility 
> method for creating a DatanodeID for testing and use it throughout.
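
Roughly the kind of helper being proposed; the class, field and constructor 
shapes below are assumptions, not the real DatanodeID signature:

{code}
class DatanodeTestUtilSketch {
  // Stand-in for org.apache.hadoop.hdfs.protocol.DatanodeID.
  static class DatanodeID {
    final String ip, hostName, storageId;
    final int xferPort, infoPort, ipcPort;
    DatanodeID(String ip, String hostName, String storageId,
               int xferPort, int infoPort, int ipcPort) {
      this.ip = ip; this.hostName = hostName; this.storageId = storageId;
      this.xferPort = xferPort; this.infoPort = infoPort; this.ipcPort = ipcPort;
    }
  }

  // One well-formed dummy ID built in one place, instead of ad-hoc
  // constructor calls with bogus values scattered through the tests.
  static DatanodeID getLocalDatanodeID(int xferPort) {
    return new DatanodeID("127.0.0.1", "localhost", "fake-storage-id",
        xferPort, 0, 0);
  }
}
{code}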





[jira] [Updated] (HDFS-3400) DNs should be able to start with jsvc even if security is disabled

2012-05-10 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3400:
-

Status: Patch Available  (was: Open)

> DNs should be able to start with jsvc even if security is disabled
> ---
>
> Key: HDFS-3400
> URL: https://issues.apache.org/jira/browse/HDFS-3400
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, scripts
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3400.patch
>
>
> Currently if one tries to start a DN with security disabled (via 
> hadoop.security.authentication = "simple" in the configs), but JSVC is 
> correctly configured, the DN will refuse to start.





[jira] [Updated] (HDFS-3400) DNs should be able to start with jsvc even if security is disabled

2012-05-10 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3400:
-

Attachment: HDFS-3400.patch

Here's a patch which makes DNs start up even when JSVC is configured properly 
but security is disabled via the XML confs. The only conditions that will now 
prevent the DN from starting are:

* Security is enabled but the DN is not configured to use low ports.
* JSVC_HOME is configured, but $JSVC_HOME/jsvc is not executable.

No tests are included since security and root access must be available to test 
this. I tested it manually by:

# Starting a DN with security disabled, and all security-related env vars set.
# Starting a DN with security enabled, and all security-related env vars set.
# Starting a DN with security disabled, and none of the security-related env 
vars set.
# Starting a DN with security enabled, and none of the security-related env 
vars set.
# Starting a DN with security enabled, all of the security-related env vars 
set, but not configured with low ports.

The DN now starts properly in the first three cases. It does not start in the 
last two. This is the expected behavior after this patch.
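
The two remaining failure conditions boil down to a check like the following; 
the constants and method names here are assumed, and the real logic is split 
between the start scripts and the secure starter class:

{code}
class SecureStartCheckSketch {
  static final int MAX_PRIVILEGED_PORT = 1023;

  // With security disabled, having jsvc configured is no longer an error:
  // the DN simply starts. Only these two conditions still abort startup.
  static void checkCanStart(boolean securityEnabled, int xferPort,
                            int httpPort, boolean jsvcHomeSet,
                            boolean jsvcExecutable) {
    if (securityEnabled
        && (xferPort > MAX_PRIVILEGED_PORT || httpPort > MAX_PRIVILEGED_PORT)) {
      throw new RuntimeException(
          "Security is enabled but the DN is not configured to use low ports");
    }
    if (jsvcHomeSet && !jsvcExecutable) {
      throw new RuntimeException(
          "JSVC_HOME is set but $JSVC_HOME/jsvc is not executable");
    }
  }
}
{code}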

> DNs should be able to start with jsvc even if security is disabled
> ---
>
> Key: HDFS-3400
> URL: https://issues.apache.org/jira/browse/HDFS-3400
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, scripts
>Affects Versions: 2.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-3400.patch
>
>
> Currently if one tries to start a DN with security disabled (via 
> hadoop.security.authentication = "simple" in the configs), but JSVC is 
> correctly configured, the DN will refuse to start.




