[jira] [Commented] (HDFS-3880) Use Builder to get RPC server in HDFS

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446892#comment-13446892
 ] 

Hudson commented on HDFS-3880:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2697 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2697/])
HDFS-3880. Use Builder to build RPC server in HDFS. Contributed by Brandon 
Li. (Revision 1379917)

 Result = FAILURE
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379917
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/journalservice/JournalService.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/TestClientProtocolWithDelegationToken.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java


> Use Builder to get RPC server in HDFS
> -
>
> Key: HDFS-3880
> URL: https://issues.apache.org/jira/browse/HDFS-3880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, ha, name-node, security
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-3880.patch
>
>
> In HADOOP-8736, a Builder is introduced to replace all the getServer() 
> variants. This JIRA is the change in HDFS. 
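The builder introduced by HADOOP-8736 replaces a family of getServer() overloads with one fluent object. Below is a simplified stand-in sketch of that pattern in plain Java; the names loosely mirror the real RPC.Builder but this is NOT the actual Hadoop API, only an illustration of why a builder beats many overloads:

```java
// Simplified stand-in for the RPC.Builder pattern from HADOOP-8736.
// All names are illustrative, not the real Hadoop API.
public class RpcServerBuilderDemo {

    /** The product: a server config that previously needed many overloads. */
    public static final class Server {
        public final String bindAddress;
        public final int port;
        public final int numHandlers;
        public final boolean verbose;

        Server(String bindAddress, int port, int numHandlers, boolean verbose) {
            this.bindAddress = bindAddress;
            this.port = port;
            this.numHandlers = numHandlers;
            this.verbose = verbose;
        }

        public String describe() {
            return bindAddress + ":" + port + " handlers=" + numHandlers
                + " verbose=" + verbose;
        }
    }

    /** Fluent builder: callers set only what they need; defaults cover the rest. */
    public static final class Builder {
        private String bindAddress = "0.0.0.0";
        private int port = 0;
        private int numHandlers = 1;
        private boolean verbose = false;

        public Builder setBindAddress(String a) { bindAddress = a; return this; }
        public Builder setPort(int p) { port = p; return this; }
        public Builder setNumHandlers(int n) { numHandlers = n; return this; }
        public Builder setVerbose(boolean v) { verbose = v; return this; }

        public Server build() {
            return new Server(bindAddress, port, numHandlers, verbose);
        }
    }

    public static void main(String[] args) {
        Server s = new Builder()
            .setBindAddress("127.0.0.1").setPort(8020).setNumHandlers(10).build();
        assert s.describe().equals("127.0.0.1:8020 handlers=10 verbose=false");
        System.out.println(s.describe());
    }
}
```

Adding a new server option then means one new setter with a default, instead of yet another getServer() overload at every call site.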

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3880) Use Builder to get RPC server in HDFS

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446891#comment-13446891
 ] 

Hudson commented on HDFS-3880:
--

Integrated in Hadoop-Common-trunk-Commit #2672 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2672/])
HDFS-3880. Use Builder to build RPC server in HDFS. Contributed by Brandon 
Li. (Revision 1379917)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379917
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/journalservice/JournalService.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/TestClientProtocolWithDelegationToken.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java


> Use Builder to get RPC server in HDFS
> -
>
> Key: HDFS-3880
> URL: https://issues.apache.org/jira/browse/HDFS-3880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, ha, name-node, security
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-3880.patch
>
>
> In HADOOP-8736, a Builder is introduced to replace all the getServer() 
> variants. This JIRA is the change in HDFS. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3880) Use Builder to get RPC server in HDFS

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446890#comment-13446890
 ] 

Hudson commented on HDFS-3880:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2735 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2735/])
HDFS-3880. Use Builder to build RPC server in HDFS. Contributed by Brandon 
Li. (Revision 1379917)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379917
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/journalservice/JournalService.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/TestClientProtocolWithDelegationToken.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java


> Use Builder to get RPC server in HDFS
> -
>
> Key: HDFS-3880
> URL: https://issues.apache.org/jira/browse/HDFS-3880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, ha, name-node, security
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-3880.patch
>
>
> In HADOOP-8736, a Builder is introduced to replace all the getServer() 
> variants. This JIRA is the change in HDFS. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3880) Use Builder to get RPC server in HDFS

2012-09-01 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas updated HDFS-3880:
--

   Resolution: Fixed
Fix Version/s: 3.0.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed the change to trunk. Thank you Brandon.

> Use Builder to get RPC server in HDFS
> -
>
> Key: HDFS-3880
> URL: https://issues.apache.org/jira/browse/HDFS-3880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, ha, name-node, security
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-3880.patch
>
>
> In HADOOP-8736, a Builder is introduced to replace all the getServer() 
> variants. This JIRA is the change in HDFS. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3880) Use Builder to get RPC server in HDFS

2012-09-01 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446884#comment-13446884
 ] 

Suresh Srinivas commented on HDFS-3880:
---

+1 for the patch.

> Use Builder to get RPC server in HDFS
> -
>
> Key: HDFS-3880
> URL: https://issues.apache.org/jira/browse/HDFS-3880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, ha, name-node, security
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Minor
> Attachments: HDFS-3880.patch
>
>
> In HADOOP-8736, a Builder is introduced to replace all the getServer() 
> variants. This JIRA is the change in HDFS. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3886) Shutdown requests can possibly check for checkpoint issues (corrupted edits) and save a good namespace copy before closing down?

2012-09-01 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446883#comment-13446883
 ] 

Aaron T. Myers commented on HDFS-3886:
--

Interesting idea. Perhaps we could add a "clean shutdown" dfsadmin command, and 
then add an extra action to the init.d script which a cautious admin can choose 
to run? That way we preserve the shutdown behavior that Steve is concerned 
about, but give the admin an option to have guaranteed-good metadata? Just 
thinking out loud.

> Shutdown requests can possibly check for checkpoint issues (corrupted edits) 
> and save a good namespace copy before closing down?
> 
>
> Key: HDFS-3886
> URL: https://issues.apache.org/jira/browse/HDFS-3886
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Priority: Minor
>
> HDFS-3878 sort of gives me this idea. Aside from having a method to download 
> it to a different location, we can also lock up the namesystem (or deactivate 
> the client RPC server) and save the namespace before we complete the 
> shutdown.
> The init.d/shutdown scripts would have to work with this somehow though, to 
> not kill -9 it when in-process. Also, the new image may be stored in a 
> shutdown.chkpt directory, to not interfere in the regular dirs, but still 
> allow easier recovery.
> Obviously this will still not work if all directories are broken. So maybe we 
> could have some configs to tackle that as well?
> I haven't thought this through, so let me know what part is wrong to do :)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3880) Use Builder to get RPC server in HDFS

2012-09-01 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-3880:
-

Status: Patch Available  (was: Open)

> Use Builder to get RPC server in HDFS
> -
>
> Key: HDFS-3880
> URL: https://issues.apache.org/jira/browse/HDFS-3880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, ha, name-node, security
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Minor
> Attachments: HDFS-3880.patch
>
>
> In HADOOP-8736, a Builder is introduced to replace all the getServer() 
> variants. This JIRA is the change in HDFS. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3880) Use Builder to get RPC server in HDFS

2012-09-01 Thread Brandon Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li updated HDFS-3880:
-

Status: Open  (was: Patch Available)

> Use Builder to get RPC server in HDFS
> -
>
> Key: HDFS-3880
> URL: https://issues.apache.org/jira/browse/HDFS-3880
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: data-node, ha, name-node, security
>Affects Versions: 3.0.0
>Reporter: Brandon Li
>Assignee: Brandon Li
>Priority: Minor
> Attachments: HDFS-3880.patch
>
>
> In HADOOP-8736, a Builder is introduced to replace all the getServer() 
> variants. This JIRA is the change in HDFS. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3876) NN should not RPC to self to find trash defaults (causes deadlock)

2012-09-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446862#comment-13446862
 ] 

Hadoop QA commented on HDFS-3876:
-

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12543455/hdfs-3876.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3138//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3138//console

This message is automatically generated.

> NN should not RPC to self to find trash defaults (causes deadlock)
> --
>
> Key: HDFS-3876
> URL: https://issues.apache.org/jira/browse/HDFS-3876
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.2.0-alpha
>Reporter: Todd Lipcon
>Assignee: Eli Collins
>Priority: Blocker
> Attachments: hdfs-3876.txt, hdfs-3876.txt, hdfs-3876.txt
>
>
> When transitioning a SBN to active, I ran into the following situation:
> - the TrashPolicy first gets loaded by an IPC Server Handler thread. The 
> {{initialize}} function then tries to make an RPC to the same node to find 
> out the defaults.
> - This is happening inside the NN write lock (since it's part of the active 
> initialization). Hence, all of the other handler threads are already blocked 
> waiting to get the NN lock.
> - Since no handler threads are free, the RPC blocks forever and the NN never 
> enters active state.
> We need to have a general policy that the NN should never make RPCs to itself 
> for any reason, due to potential for deadlocks like this.
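The deadlock shape described above can be reproduced in miniature with a plain thread pool, without any Hadoop code: the lone "handler" thread issues a call back to its own fully occupied pool, so the inner call can never be scheduled. This is a hypothetical simulation, not NameNode code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Miniature reproduction of the HDFS-3876 deadlock shape: a handler thread
// submits a request back to its own exhausted handler pool and waits on it.
public class SelfRpcDeadlockDemo {
    public static boolean selfCallCompletes(long timeoutMs) throws Exception {
        ExecutorService handlers = Executors.newFixedThreadPool(1);
        try {
            Future<Boolean> outer = handlers.submit(() -> {
                // The only handler now makes an "RPC to self": another task
                // on the same pool. No free handler exists to serve it.
                Future<Boolean> inner = handlers.submit(() -> true);
                try {
                    return inner.get(timeoutMs, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    return false; // deadlock: the inner task was never scheduled
                }
            });
            return outer.get();
        } finally {
            handlers.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        assert !selfCallCompletes(200);
        System.out.println("self-call timed out as expected");
    }
}
```

With a real RPC there is no timeout escape hatch under the write lock, which is why the NN hangs rather than recovering.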

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3828) Block Scanner rescans blocks too frequently

2012-09-01 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446861#comment-13446861
 ] 

Eli Collins commented on HDFS-3828:
---

Agree this approach is best for now. Please file a jira for the proposed 
refactoring outlining the issues it addresses with the current approach (eg 
that we do unnecessary work if we finish the scan w/in the period).

- Per the findbugs warning I'd pull your new check in scanBlockPoolSlice out 
to a synchronized method (eg workRemainingInCurrentPeriod)
- DataBlockScanner#run should use SLEEP_PERIOD_MS (could use in 
getNextBPScanner as well, though it and waitForInit aren't part of the "period")
- In getTotalScans rather than throw IOE if given a bpid w/o a scanner I 
believe this should be an assert (we should always have a scanner for a block 
pool if we've enabled scanning, which we have if we're in DataBlockScanner)


> Block Scanner rescans blocks too frequently
> ---
>
> Key: HDFS-3828
> URL: https://issues.apache.org/jira/browse/HDFS-3828
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.23.0, 2.0.0-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hdfs-3828-1.txt, hdfs-3828-2.txt, hdfs3828.txt
>
>
> {{BlockPoolSliceScanner#scan}} calls cleanUp every time it's invoked from 
> {{DataBlockScanner#run}} via {{scanBlockPoolSlice}}.  But cleanUp 
> unconditionally roll()s the verificationLogs, so after two iterations we have 
> lost the first iteration of block verification times.  As a result a cluster 
> with just one block repeatedly rescans it every 10 seconds:
> {noformat}
> 2012-08-16 15:59:57,884 INFO  datanode.BlockPoolSliceScanner 
> (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for 
> BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915
> 2012-08-16 16:00:07,904 INFO  datanode.BlockPoolSliceScanner 
> (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for 
> BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915
> 2012-08-16 16:00:17,925 INFO  datanode.BlockPoolSliceScanner 
> (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for 
> BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915
> {noformat}
> To fix this, we need to avoid roll()ing the logs multiple times per period.
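The fix described above (roll at most once per period instead of on every scan) can be sketched as a timestamp guard. The class and method names below are hypothetical, not the actual BlockPoolSliceScanner code:

```java
// Hypothetical sketch of rolling a verification log at most once per scan
// period, rather than unconditionally on every scan() invocation.
public class PeriodicRollDemo {
    private final long periodMs;
    private long lastRollMs;
    private int rollCount = 0;

    public PeriodicRollDemo(long periodMs, long nowMs) {
        this.periodMs = periodMs;
        this.lastRollMs = nowMs;
    }

    /** Called on every scan; rolls only when a full period has elapsed. */
    public void maybeRoll(long nowMs) {
        if (nowMs - lastRollMs >= periodMs) {
            rollCount++;           // stand-in for verificationLog.roll()
            lastRollMs = nowMs;
        }
    }

    public int getRollCount() { return rollCount; }

    public static void main(String[] args) {
        PeriodicRollDemo d = new PeriodicRollDemo(1000, 0);
        for (long t = 0; t <= 2500; t += 100) {
            d.maybeRoll(t);        // 26 scan calls, but only two periods elapse
        }
        assert d.getRollCount() == 2;
        System.out.println("rolls=" + d.getRollCount());
    }
}
```

Without the guard, each scan would roll and the previous period's verification times would be lost, which is exactly the rescan loop shown in the log excerpt.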

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3876) NN should not RPC to self to find trash defaults (causes deadlock)

2012-09-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446858#comment-13446858
 ] 

Hadoop QA commented on HDFS-3876:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12543455/hdfs-3876.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified test 
files.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3137//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3137//console

This message is automatically generated.

> NN should not RPC to self to find trash defaults (causes deadlock)
> --
>
> Key: HDFS-3876
> URL: https://issues.apache.org/jira/browse/HDFS-3876
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.2.0-alpha
>Reporter: Todd Lipcon
>Assignee: Eli Collins
>Priority: Blocker
> Attachments: hdfs-3876.txt, hdfs-3876.txt, hdfs-3876.txt
>
>
> When transitioning a SBN to active, I ran into the following situation:
> - the TrashPolicy first gets loaded by an IPC Server Handler thread. The 
> {{initialize}} function then tries to make an RPC to the same node to find 
> out the defaults.
> - This is happening inside the NN write lock (since it's part of the active 
> initialization). Hence, all of the other handler threads are already blocked 
> waiting to get the NN lock.
> - Since no handler threads are free, the RPC blocks forever and the NN never 
> enters active state.
> We need to have a general policy that the NN should never make RPCs to itself 
> for any reason, due to potential for deadlocks like this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3870) QJM: add metrics to JournalNode

2012-09-01 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446854#comment-13446854
 ] 

Eli Collins commented on HDFS-3870:
---

+1 looks great. Sync and lag are IMO the most interesting.  Only other useful 
metrics I can think of are additions to the NN's JN metrics (NN -> Journal 
latency and failed Journal operations) though those aren't QJM specific.

> QJM: add metrics to JournalNode
> ---
>
> Key: HDFS-3870
> URL: https://issues.apache.org/jira/browse/HDFS-3870
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: QuorumJournalManager (HDFS-3077)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-3870.txt
>
>
> The JournalNode should expose some basic metrics through the usual interface. 
> In particular:
> - the writer epoch, accepted epoch,
> - the last written transaction ID and last committed txid (which may be newer 
> in case that it's in the process of catching up)
> - latency information for how long the syncs are taking
> Please feel free to suggest others that come to mind.
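The metrics listed above can be sketched as a small bean: epochs and txids as plain counters, and sync latency summarized by percentile. All names here are illustrative, not the real JournalNode metrics class:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the JournalNode metrics described in the issue:
// writer/accepted epochs, last written and committed txids, sync latency.
public class JournalMetricsDemo {
    long writerEpoch, acceptedEpoch, lastWrittenTxId, lastCommittedTxId;
    private final List<Long> syncLatenciesMs = new ArrayList<>();

    public void recordSync(long latencyMs) { syncLatenciesMs.add(latencyMs); }

    /** p in [0,100]; simple nearest-rank percentile over recorded syncs. */
    public long syncLatencyPercentile(int p) {
        List<Long> sorted = new ArrayList<>(syncLatenciesMs);
        Collections.sort(sorted);
        int idx = Math.max(0, (int) Math.ceil(p / 100.0 * sorted.size()) - 1);
        return sorted.get(idx);
    }

    public static void main(String[] args) {
        JournalMetricsDemo m = new JournalMetricsDemo();
        for (long l : new long[] {5, 7, 9, 11, 200}) m.recordSync(l);
        assert m.syncLatencyPercentile(50) == 9;
        assert m.syncLatencyPercentile(100) == 200;
        System.out.println("p50=" + m.syncLatencyPercentile(50));
    }
}
```

A percentile (rather than a mean) is the interesting view for sync latency, since a few slow fsyncs dominate quorum write tail latency.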

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3884) QJM: Journal format() should reset cached values

2012-09-01 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446853#comment-13446853
 ] 

Eli Collins commented on HDFS-3884:
---

+1 lgtm

> QJM: Journal format() should reset cached values
> 
>
> Key: HDFS-3884
> URL: https://issues.apache.org/jira/browse/HDFS-3884
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: QuorumJournalManager (HDFS-3077)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: QuorumJournalManager (HDFS-3077)
>
> Attachments: hdfs-3884.txt
>
>
> Simple bug in the JournalNode: it caches certain values (eg accepted epoch) 
> in memory, and the cached values aren't reset when the journal is formatted. 
> So, after a format, further calls to the same Journal will see the old value 
> for accepted epoch, writer epoch, etc, preventing the journal from being 
> re-used until the JN is restarted.
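The bug class described above (persistent state wiped but in-memory caches kept) can be illustrated in a few lines. Names are hypothetical, not the real Journal implementation:

```java
// Minimal illustration of the HDFS-3884 bug class: format() must reset
// cached values along with on-disk state, or callers see stale epochs.
public class FormatResetDemo {
    static final long INVALID_EPOCH = -1;
    private long acceptedEpoch = INVALID_EPOCH;

    public void setAcceptedEpoch(long e) { acceptedEpoch = e; }
    public long getAcceptedEpoch() { return acceptedEpoch; }

    /** Correct format(): wipe the cache together with persistent state. */
    public void format() {
        // ... on-disk journal state would be deleted here ...
        acceptedEpoch = INVALID_EPOCH; // the fix: reset cached values too
    }

    public static void main(String[] args) {
        FormatResetDemo j = new FormatResetDemo();
        j.setAcceptedEpoch(5);
        j.format();
        assert j.getAcceptedEpoch() == INVALID_EPOCH; // stale value is gone
        System.out.println("epoch reset on format");
    }
}
```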

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3863) QJM: track last "committed" txid

2012-09-01 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446852#comment-13446852
 ] 

Eli Collins commented on HDFS-3863:
---

Agree with you and Chao Shi, nice change to the protocol.

Consider making committedTxId and lastCommittedTxId non-optional?  Why not use 
INVALID_TXID rather than 0 as a default value in the file and protocol for 
tracking the committed txid?

> QJM: track last "committed" txid
> 
>
> Key: HDFS-3863
> URL: https://issues.apache.org/jira/browse/HDFS-3863
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha
>Affects Versions: QuorumJournalManager (HDFS-3077)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-3863-prelim.txt, hdfs-3863.txt
>
>
> Per some discussion with [~stepinto] 
> [here|https://issues.apache.org/jira/browse/HDFS-3077?focusedCommentId=13422579&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13422579],
>  we should keep track of the "last committed txid" on each JournalNode. Then 
> during any recovery operation, we can sanity-check that we aren't asked to 
> truncate a log to an earlier transaction.
> This is also a necessary step if we want to support reading from in-progress 
> segments in the future (since we should only allow reads up to the commit 
> point)
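The sanity check described above can be sketched as follows; the sentinel value and names are hypothetical stand-ins, not the actual JournalNode code:

```java
// Hypothetical sketch of tracking the last committed txid on a JournalNode
// and refusing a recovery that would truncate the log below the commit point.
public class CommittedTxidDemo {
    static final long INVALID_TXID = -1; // sentinel: nothing committed yet

    private long lastCommittedTxId = INVALID_TXID;

    public void commit(long txid) {
        if (txid > lastCommittedTxId) lastCommittedTxId = txid;
    }

    /** Recovery sanity check: never truncate below the commit point. */
    public void truncateTo(long txid) {
        if (lastCommittedTxId != INVALID_TXID && txid < lastCommittedTxId) {
            throw new IllegalArgumentException(
                "refusing to truncate to " + txid
                + " below committed txid " + lastCommittedTxId);
        }
        // ... perform the truncation ...
    }

    public static void main(String[] args) {
        CommittedTxidDemo jn = new CommittedTxidDemo();
        jn.truncateTo(0);      // nothing committed yet: allowed
        jn.commit(100);
        boolean rejected = false;
        try {
            jn.truncateTo(50); // below the commit point: must be refused
        } catch (IllegalArgumentException expected) {
            rejected = true;
        }
        assert rejected;
        System.out.println("truncate below commit point rejected");
    }
}
```

A distinct sentinel (rather than 0 as a default) keeps "nothing committed yet" distinguishable from a real transaction ID, which is the point of Eli's INVALID_TXID suggestion above.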

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3869) QJM: expose non-file journal manager details in web UI

2012-09-01 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446847#comment-13446847
 ] 

Eli Collins commented on HDFS-3869:
---

+1  looks great

I'd consider mentioning in a comment above journals that it is COW because, 
even though FSEdit* is synchronized, these uses may race with the Web UI (my 
understanding of why it is now a CopyOnWriteArrayList).
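The race that motivates the CopyOnWriteArrayList can be shown directly: a COW list gives every iterator a stable snapshot, so a concurrent add() during iteration is safe (an ArrayList would throw ConcurrentModificationException). The variable names below are illustrative only:

```java
import java.util.concurrent.CopyOnWriteArrayList;

// Why a journals list can be a CopyOnWriteArrayList: the web UI may iterate
// it while other code mutates it. COW iterators see a fixed snapshot.
public class CowIterationDemo {
    public static int snapshotSizeDuringAdd() {
        CopyOnWriteArrayList<String> journals = new CopyOnWriteArrayList<>();
        journals.add("jn1");
        journals.add("jn2");
        int seen = 0;
        for (String j : journals) {        // iterator snapshots [jn1, jn2]
            journals.add("added-" + j);    // mutation during iteration: safe
            seen++;
        }
        return seen; // the snapshot had 2 elements, regardless of the adds
    }

    public static void main(String[] args) {
        assert snapshotSizeDuringAdd() == 2;
        System.out.println("iterated a snapshot of 2 while the list grew");
    }
}
```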

> QJM: expose non-file journal manager details in web UI
> --
>
> Key: HDFS-3869
> URL: https://issues.apache.org/jira/browse/HDFS-3869
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: QuorumJournalManager (HDFS-3077)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: dir-failed.png, hdfs-3869.txt, hdfs-3869.txt, 
> lagging-jn.png, open-for-read.png, open-for-write.png
>
>
> Currently, the NN web UI only contains NN storage directories on local disk. 
> It should also include details about any non-file JournalManagers in use.
> This JIRA targets the QJM branch, but will be useful for BKJM as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3876) NN should not RPC to self to find trash defaults (causes deadlock)

2012-09-01 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3876:
--

Attachment: hdfs-3876.txt

> NN should not RPC to self to find trash defaults (causes deadlock)
> --
>
> Key: HDFS-3876
> URL: https://issues.apache.org/jira/browse/HDFS-3876
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.2.0-alpha
>Reporter: Todd Lipcon
>Assignee: Eli Collins
>Priority: Blocker
> Attachments: hdfs-3876.txt, hdfs-3876.txt, hdfs-3876.txt
>
>
> When transitioning a SBN to active, I ran into the following situation:
> - the TrashPolicy first gets loaded by an IPC Server Handler thread. The 
> {{initialize}} function then tries to make an RPC to the same node to find 
> out the defaults.
> - This is happening inside the NN write lock (since it's part of the active 
> initialization). Hence, all of the other handler threads are already blocked 
> waiting to get the NN lock.
> - Since no handler threads are free, the RPC blocks forever and the NN never 
> enters active state.
> We need to have a general policy that the NN should never make RPCs to itself 
> for any reason, due to potential for deadlocks like this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3876) NN should not RPC to self to find trash defaults (causes deadlock)

2012-09-01 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3876:
--

Attachment: (was: hdfs-3876.txt)

> NN should not RPC to self to find trash defaults (causes deadlock)
> --
>
> Key: HDFS-3876
> URL: https://issues.apache.org/jira/browse/HDFS-3876
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.2.0-alpha
>Reporter: Todd Lipcon
>Assignee: Eli Collins
>Priority: Blocker
> Attachments: hdfs-3876.txt, hdfs-3876.txt, hdfs-3876.txt
>
>
> When transitioning a SBN to active, I ran into the following situation:
> - the TrashPolicy first gets loaded by an IPC Server Handler thread. The 
> {{initialize}} function then tries to make an RPC to the same node to find 
> out the defaults.
> - This is happening inside the NN write lock (since it's part of the active 
> initialization). Hence, all of the other handler threads are already blocked 
> waiting to get the NN lock.
> - Since no handler threads are free, the RPC blocks forever and the NN never 
> enters active state.
> We need to have a general policy that the NN should never make RPCs to itself 
> for any reason, due to potential for deadlocks like this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3876) NN should not RPC to self to find trash defaults (causes deadlock)

2012-09-01 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-3876:
--

Attachment: hdfs-3876.txt

TestViewFsTrash failed because the test deletes "/" and we now get server 
defaults on the path (to get the trash configuration). This fails for viewfs 
for "/" because "/" is not associated with a file system, so we fail with a 
NotInMountpointException.

Updated patch: catch Exception rather than IOException when obtaining server 
defaults, so that we fail the delete when we fail to get server defaults 
(rather than potentially ignoring the server trash configuration on transient 
errors), and updated TestTrash to not fail the test when the delete fails due 
to obtaining server defaults (which is what should happen in the viewfs case).
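The error-handling change described in this patch note can be sketched as follows; the supplier and exception flow are hypothetical stand-ins for the real trash/delete path, not actual Hadoop code:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Sketch of the behavior described above: any failure to obtain server
// defaults fails the delete, instead of silently bypassing the server's
// trash configuration for non-IOException failures.
public class ServerDefaultsDemo {
    /** Returns true if the delete may proceed with trash config applied. */
    public static boolean deleteWithTrash(Callable<String> serverDefaults)
            throws IOException {
        String defaults;
        try {
            defaults = serverDefaults.call();
        } catch (Exception e) {
            // Catch Exception, not just IOException: failing to learn the
            // trash configuration must fail the delete, never skip trash.
            throw new IOException("could not get server defaults", e);
        }
        return defaults != null;
    }

    public static void main(String[] args) throws Exception {
        assert deleteWithTrash(() -> "trash.interval=60");
        boolean failed = false;
        try {
            deleteWithTrash(() -> {
                throw new IllegalStateException("not in a mount point");
            });
        } catch (IOException expected) {
            failed = true;
        }
        assert failed;
        System.out.println("non-IOE failure correctly fails the delete");
    }
}
```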

> NN should not RPC to self to find trash defaults (causes deadlock)
> --
>
> Key: HDFS-3876
> URL: https://issues.apache.org/jira/browse/HDFS-3876
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.2.0-alpha
>Reporter: Todd Lipcon
>Assignee: Eli Collins
>Priority: Blocker
> Attachments: hdfs-3876.txt, hdfs-3876.txt, hdfs-3876.txt
>
>
> When transitioning a SBN to active, I ran into the following situation:
> - the TrashPolicy first gets loaded by an IPC Server Handler thread. The 
> {{initialize}} function then tries to make an RPC to the same node to find 
> out the defaults.
> - This is happening inside the NN write lock (since it's part of the active 
> initialization). Hence, all of the other handler threads are already blocked 
> waiting to get the NN lock.
> - Since no handler threads are free, the RPC blocks forever and the NN never 
> enters active state.
> We need to have a general policy that the NN should never make RPCs to itself 
> for any reason, due to potential for deadlocks like this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-2580) NameNode#main(...) can make use of GenericOptionsParser.

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446788#comment-13446788
 ] 

Hudson commented on HDFS-2580:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2696 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2696/])
HDFS-2580. NameNode#main(...) can make use of GenericOptionsParser. 
Contributed by harsh. (harsh) (Revision 1379828)

 Result = FAILURE
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379828
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java


> NameNode#main(...) can make use of GenericOptionsParser.
> 
>
> Key: HDFS-2580
> URL: https://issues.apache.org/jira/browse/HDFS-2580
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-2580.patch
>
>
> DataNode supports passing generic opts when calling via {{hdfs datanode}}. 
> NameNode can support the same thing as well, but doesn't right now.
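As a rough illustration of what generic-option handling buys NameNode#main, here is a toy stand-in for Hadoop's GenericOptionsParser (all names below are hypothetical; the real parser also handles -fs, -conf, -libjars, etc.):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy version of GenericOptionsParser's -D handling, so a main() can accept
// "hdfs namenode -D key=value -format" style invocations.
public class ToyGenericOptions {
    final Map<String, String> conf = new HashMap<>();   // parsed -D overrides
    final List<String> remaining = new ArrayList<>();   // command-specific args

    ToyGenericOptions(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if ("-D".equals(args[i]) && i + 1 < args.length) {
                String[] kv = args[++i].split("=", 2);
                conf.put(kv[0], kv.length > 1 ? kv[1] : "");
            } else {
                remaining.add(args[i]);
            }
        }
    }

    public static void main(String[] args) {
        ToyGenericOptions p = new ToyGenericOptions(
            new String[] {"-D", "dfs.replication=2", "-format"});
        if (!"2".equals(p.conf.get("dfs.replication"))) throw new AssertionError();
        if (!p.remaining.equals(List.of("-format"))) throw new AssertionError();
        System.out.println("ok");
    }
}
```

In the real patch, NameNode#main would hand args to GenericOptionsParser, apply the parsed configuration, and pass only the remaining args on to its own startup-option handling.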



[jira] [Commented] (HDFS-2580) NameNode#main(...) can make use of GenericOptionsParser.

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446785#comment-13446785
 ] 

Hudson commented on HDFS-2580:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2734 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2734/])
HDFS-2580. NameNode#main(...) can make use of GenericOptionsParser. 
Contributed by harsh. (harsh) (Revision 1379828)

 Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379828
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java


> NameNode#main(...) can make use of GenericOptionsParser.
> 
>
> Key: HDFS-2580
> URL: https://issues.apache.org/jira/browse/HDFS-2580
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-2580.patch
>
>
> DataNode supports passing generic opts when calling via {{hdfs datanode}}. 
> NameNode can support the same thing as well, but doesn't right now.



[jira] [Commented] (HDFS-2580) NameNode#main(...) can make use of GenericOptionsParser.

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446784#comment-13446784
 ] 

Hudson commented on HDFS-2580:
--

Integrated in Hadoop-Common-trunk-Commit #2671 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2671/])
HDFS-2580. NameNode#main(...) can make use of GenericOptionsParser. 
Contributed by harsh. (harsh) (Revision 1379828)

 Result = SUCCESS
harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379828
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java


> NameNode#main(...) can make use of GenericOptionsParser.
> 
>
> Key: HDFS-2580
> URL: https://issues.apache.org/jira/browse/HDFS-2580
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-2580.patch
>
>
> DataNode supports passing generic opts when calling via {{hdfs datanode}}. 
> NameNode can support the same thing as well, but doesn't right now.



[jira] [Updated] (HDFS-2580) NameNode#main(...) can make use of GenericOptionsParser.

2012-09-01 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-2580:
--

  Resolution: Fixed
   Fix Version/s: 3.0.0
Target Version/s:   (was: 3.0.0)
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks Eli. Rebased and committed to trunk as r1379828.

> NameNode#main(...) can make use of GenericOptionsParser.
> 
>
> Key: HDFS-2580
> URL: https://issues.apache.org/jira/browse/HDFS-2580
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HDFS-2580.patch
>
>
> DataNode supports passing generic opts when calling via {{hdfs datanode}}. 
> NameNode can support the same thing as well, but doesn't right now.



[jira] [Commented] (HDFS-3854) Implement a fence method which should fence the BK shared storage.

2012-09-01 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446778#comment-13446778
 ] 

Uma Maheswara Rao G commented on HDFS-3854:
---

Per the discussion in HDFS-3862, we may disable the fencing option with single 
writer storage.

> Implement a fence method which should fence the BK shared storage.
> --
>
> Key: HDFS-3854
> URL: https://issues.apache.org/jira/browse/HDFS-3854
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Reporter: Uma Maheswara Rao G
>
> Currently, when a machine or the network is down, SSHFence cannot ensure that 
> the other node is completely down, so fencing fails and the failover does not 
> happen.
> [Internally we worked around this by returning true when the machine is not 
> reachable, since BKJM already has fencing.]
> It may be a good idea to implement a fence method that ensures the shared 
> storage is fenced properly and returns true.
> We can plug this new method into the ZKFC fence methods.
> The only pain point I can see is that we may have to put the BKJM jar in the 
> ZKFC lib to run this fence method.
> Thoughts?



[jira] [Commented] (HDFS-3862) QJM: don't require a fencer to be configured if shared storage has built-in single-writer semantics

2012-09-01 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446777#comment-13446777
 ] 

Uma Maheswara Rao G commented on HDFS-3862:
---

Todd, this seems reasonable to me. I also filed HDFS-3854 to handle this 
situation with a single writer.
But I thought we could simply provide a fence method that fences the writer; 
that way we guarantee no other NN can access the shared storage before we go 
for the state change.
In fact, if we are OK with leaving fencing at the writer level, that is even 
better.
Currently we simply have a dummy fence method that returns true, since BK 
already has fencing.

With the suggestion above of adding an API in JournalManager, wouldn't ZKFC 
have to create the JournalManager just to get this info?
How about simply adding one config parameter? 
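One possible shape for the JournalManager-level alternative being discussed here. The interface, method, and stub names below are hypothetical sketches, not the API that was actually committed:

```java
// Hypothetical sketch of letting a journal implementation advertise built-in
// single-writer semantics, so the HA code can skip the fencer requirement.
public class JournalFencingSketch {
    public interface JournalManager {
        /** True if the storage enforces single-writer itself (e.g. QJM, BKJM). */
        default boolean hasBuiltInFencing() { return false; }
    }

    /** Stand-in for a quorum-based journal that fences at the storage level. */
    public static class QuorumJournalManagerStub implements JournalManager {
        @Override public boolean hasBuiltInFencing() { return true; }
    }

    /** HA-side decision: only demand a configured fencer when the JM needs it. */
    public static boolean fencerRequired(JournalManager jm) {
        return !jm.hasBuiltInFencing();
    }

    public static void main(String[] args) {
        if (fencerRequired(new QuorumJournalManagerStub())) throw new AssertionError();
        if (!fencerRequired(new JournalManager() { })) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Uma's counterpoint above still applies to this shape: ZKFC would need to instantiate the JournalManager just to ask the question, which is why a plain config parameter is also on the table.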

> QJM: don't require a fencer to be configured if shared storage has built-in 
> single-writer semantics
> ---
>
> Key: HDFS-3862
> URL: https://issues.apache.org/jira/browse/HDFS-3862
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha
>Affects Versions: QuorumJournalManager (HDFS-3077)
>Reporter: Todd Lipcon
>
> Currently, NN HA requires that the administrator configure a fencing method 
> to ensure that only a single NameNode may write to the shared storage at a 
> time. Some shared edits storage implementations (like QJM) inherently enforce 
> single-writer semantics at the storage level, and thus the user should not be 
> forced to specify one.
> We should extend the JournalManager interface so that the HA code can operate 
> without a configured fencer if the JM has such built-in fencing.



Re: NativeS3FileSystem problem

2012-09-01 Thread Chris Collins
Any comment on this?
On Aug 28, 2012, at 11:43 PM, Chris Collins  wrote:

> I was attempting to use the NativeS3FileSystem outside of any MapReduce 
> tasks, for the simple task of creating a directory:
> 
> 
> FileSystem fs = FileSystem.get(uri, conf);
> Path currPath = new Path("/a/b/c");
>  fs.mkdirs(currPath);
> 
> ( I can provide full code if needed).
> 
> Anyway, the class Jets3tNativeFileSystemStore attempts to detect whether each 
> key part of the object path exists, expecting a 404 response if it does not:
> 
> public FileMetadata retrieveMetadata(String key) throws IOException {
> try {
>   S3Object object = s3Service.getObjectDetails(bucket, key);
>   return new FileMetadata(key, object.getContentLength(),
>   object.getLastModifiedDate().getTime());
> } catch (S3ServiceException e) {
>   // Following is brittle. Is there a better way?
>   if (e.getMessage().contains("ResponseCode=404")) {
> return null;
>   }
>   if (e.getCause() instanceof IOException) {
> throw (IOException) e.getCause();
>   }
>   throw new S3Exception(e);
> }
>   }
> 
> All versions of jets3t I have looked at that seem to have a compatible class 
> structure (i.e., don't blow up on AWSCredentials) actually return an exception 
> whose message contains "ResponseCode: 404".
> 
> I took a copy of the code in this directory and fixed the following to read:
> 
> public FileMetadata retrieveMetadata(String key) throws IOException {
> try {
>   S3Object object = s3Service.getObjectDetails(bucket, key);
>   return new FileMetadata(key, object.getContentLength(),
>   object.getLastModifiedDate().getTime());
> } catch (S3ServiceException e) {
>   // Following is brittle. Is there a better way?
>   if (e.getResponseCode() == 404) {
> return null;
>   }
>   if (e.getCause() instanceof IOException) {
> throw (IOException) e.getCause();
>   }
>   throw new S3Exception(e);
> }
>   }
> 
> which seems to fix the issue.  Am I missing something?  Also, this seems to 
> have been broken for a variety of Hadoop versions.  Does anyone actually use 
> this code path, and if so, is there a valid version combination that should 
> have worked for me?
> 
> Comments welcome.
> 
> Chris



[jira] [Commented] (HDFS-2434) TestNameNodeMetrics.testCorruptBlock fails intermittently

2012-09-01 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446737#comment-13446737
 ] 

Kihwal Lee commented on HDFS-2434:
--

The test case fails this way when the corrupt replica is fixed right away, 
before the namenode metrics are gathered. In one example, 
computeReplicationWorkForBlocks() ran within 10ms of the block corruption, the 
datanode heartbeated 380ms later, and the block corruption was completely 
resolved 13ms after that.

Since replication monitor and dn heartbeats are asynchronous, the current way 
of sleeping for 1 sec is not a reliable way to hit a moment between the two.
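A more reliable pattern is to poll for the expected state with a deadline rather than sleeping a fixed second. Hadoop's test utilities have a waitFor helper along these lines; the self-contained stand-in below illustrates the idea (names are illustrative):

```java
import java.util.function.BooleanSupplier;

// Poll-with-deadline sketch: retry the check at a short interval instead of
// gambling on a single 1-second sleep landing between two async events.
public class WaitForSketch {
    public static boolean waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (check.getAsBoolean()) return true;
            Thread.sleep(intervalMs);
        }
        return check.getAsBoolean(); // one last look at the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Condition becomes true after ~100ms, as the CorruptBlocks metric might.
        boolean seen = waitFor(() -> System.currentTimeMillis() - start > 100, 10, 2000);
        if (!seen) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Polling narrows the window problem Kihwal describes: the test no longer has to hit the short-lived moment between replication-monitor and heartbeat processing with a single sleep.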

> TestNameNodeMetrics.testCorruptBlock fails intermittently
> -
>
> Key: HDFS-2434
> URL: https://issues.apache.org/jira/browse/HDFS-2434
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Reporter: Uma Maheswara Rao G
>
> java.lang.AssertionError: Bad value for metric CorruptBlocks expected:<1> but 
> was:<0>
>   at org.junit.Assert.fail(Assert.java:91)
>   at org.junit.Assert.failNotEquals(Assert.java:645)
>   at org.junit.Assert.assertEquals(Assert.java:126)
>   at org.junit.Assert.assertEquals(Assert.java:470)
>   at 
> org.apache.hadoop.test.MetricsAsserts.assertGauge(MetricsAsserts.java:185)
>   at 
> org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics.__CLR3_0_2t8sh531i1k(TestNameNodeMetrics.java:175)
>   at 
> org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics.testCorruptBlock(TestNameNodeMetrics.java:164)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at junit.framework.TestCase.runTest(TestCase.java:168)
>   at junit.framework.TestCase.runBare(TestCase.java:134)



[jira] [Commented] (HDFS-3466) The SPNEGO filter for the NameNode should come out of the web keytab file

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446718#comment-13446718
 ] 

Hudson commented on HDFS-3466:
--

Integrated in Hadoop-Mapreduce-trunk #1183 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1183/])
HDFS-3466. Get HTTP kerberos principal from the web authentication keytab.
(omalley) (Revision 1379646)

 Result = SUCCESS
omalley : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379646
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeHttpServer.java


> The SPNEGO filter for the NameNode should come out of the web keytab file
> -
>
> Key: HDFS-3466
> URL: https://issues.apache.org/jira/browse/HDFS-3466
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node, security
>Affects Versions: 1.1.0, 2.0.0-alpha
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 1.1.0, 2.1.0-alpha
>
> Attachments: hdfs-3466-b1-2.patch, hdfs-3466-b1.patch, 
> hdfs-3466-trunk-2.patch, hdfs-3466-trunk-3.patch
>
>
> Currently, the spnego filter uses the DFS_NAMENODE_KEYTAB_FILE_KEY to find 
> the keytab. It should use the DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY to 
> do it.
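For reference, the web-authentication key named in the description corresponds to the following hdfs-site.xml properties; the keytab path and principal values below are placeholders, not defaults:

```xml
<!-- illustrative hdfs-site.xml fragment; values are placeholders -->
<property>
  <name>dfs.web.authentication.kerberos.keytab</name>
  <value>/etc/security/keytabs/spnego.service.keytab</value>
</property>
<property>
  <name>dfs.web.authentication.kerberos.principal</name>
  <value>HTTP/_HOST@EXAMPLE.COM</value>
</property>
```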



[jira] [Commented] (HDFS-3852) TestHftpDelegationToken is broken after HADOOP-8225

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446715#comment-13446715
 ] 

Hudson commented on HDFS-3852:
--

Integrated in Hadoop-Mapreduce-trunk #1183 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1183/])
HDFS-3852. TestHftpDelegationToken is broken after HADOOP-8225 (daryn) 
(Revision 1379623)

 Result = SUCCESS
daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379623
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpDelegationToken.java


> TestHftpDelegationToken is broken after HADOOP-8225
> ---
>
> Key: HDFS-3852
> URL: https://issues.apache.org/jira/browse/HDFS-3852
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, security
>Affects Versions: 0.23.3, 2.1.0-alpha
>Reporter: Aaron T. Myers
>Assignee: Daryn Sharp
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
> Attachments: HDFS-3852.patch
>
>
> It's been failing in all builds for the last 2 days or so. Git bisect 
> indicates that it's due to HADOOP-8225.



[jira] [Commented] (HDFS-3873) Hftp assumes security is disabled if token fetch fails

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446713#comment-13446713
 ] 

Hudson commented on HDFS-3873:
--

Integrated in Hadoop-Mapreduce-trunk #1183 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1183/])
HDFS-3873. Hftp assumes security is disabled if token fetch fails (daryn) 
(Revision 1379615)

 Result = SUCCESS
daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379615
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpDelegationToken.java


> Hftp assumes security is disabled if token fetch fails
> --
>
> Key: HDFS-3873
> URL: https://issues.apache.org/jira/browse/HDFS-3873
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
> Attachments: HDFS-3873.branch-23.patch, HDFS-3873.patch
>
>
> Hftp ignores all exceptions generated while trying to get a token, based on 
> the assumption that it means security is disabled.  Debugging problems is 
> excruciatingly difficult when security is enabled but something goes wrong.  
> Job submissions succeed, but tasks fail because the NN rejects the user as 
> unauthenticated.



[jira] [Commented] (HDFS-3871) Change NameNodeProxies to use HADOOP-8748

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446710#comment-13446710
 ] 

Hudson commented on HDFS-3871:
--

Integrated in Hadoop-Mapreduce-trunk #1183 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1183/])
HDFS-3871. Change NameNodeProxies to use RetryUtils.  Contributed by Arun C 
Murthy (Revision 1379743)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379743
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java


> Change NameNodeProxies to use HADOOP-8748
> -
>
> Key: HDFS-3871
> URL: https://issues.apache.org/jira/browse/HDFS-3871
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
>Priority: Minor
> Fix For: 1.2.0, 2.2.0-alpha
>
> Attachments: HDFS-3781_branch1.patch, HDFS-3781.patch
>
>
> Change NameNodeProxies to use util method introduced via HADOOP-8748.



[jira] [Commented] (HDFS-3833) TestDFSShell fails on Windows due to file concurrent read write

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446706#comment-13446706
 ] 

Hudson commented on HDFS-3833:
--

Integrated in Hadoop-Mapreduce-trunk #1183 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1183/])
HDFS-3833. TestDFSShell fails on windows due to concurrent file read/write. 
Contributed by Brandon Li (Revision 1379525)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379525
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java


> TestDFSShell fails on Windows due to file concurrent read write
> ---
>
> Key: HDFS-3833
> URL: https://issues.apache.org/jira/browse/HDFS-3833
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 1-win
>Reporter: Brandon Li
>Assignee: Brandon Li
> Fix For: 3.0.0, 1-win, 2.2.0-alpha
>
> Attachments: HDFS-3833.branch-1-win.patch, HDFS-3833.patch
>
>
> TestDFSShell sometimes fails due to the race between the write issued by the 
> test and blockscanner. Example stack trace:
> {noformat}
> Error Message
> c:\A\HM\build\test\data\dfs\data\data1\current\blk_-7735708801221347790 (The 
> requested operation cannot be performed on a file with a user-mapped section 
> open)
> Stacktrace
> java.io.FileNotFoundException: 
> c:\A\HM\build\test\data\dfs\data\data1\current\blk_-7735708801221347790 (The 
> requested operation cannot be performed on a file with a user-mapped section 
> open)
>   at java.io.FileOutputStream.open(Native Method)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:194)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:145)
>   at java.io.PrintWriter.<init>(PrintWriter.java:218)
>   at org.apache.hadoop.hdfs.TestDFSShell.corrupt(TestDFSShell.java:1133)
>   at org.apache.hadoop.hdfs.TestDFSShell.testGet(TestDFSShell.java:1231)
> {noformat}



[jira] [Commented] (HDFS-3466) The SPNEGO filter for the NameNode should come out of the web keytab file

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446690#comment-13446690
 ] 

Hudson commented on HDFS-3466:
--

Integrated in Hadoop-Hdfs-trunk #1152 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1152/])
HDFS-3466. Get HTTP kerberos principal from the web authentication keytab.
(omalley) (Revision 1379646)

 Result = SUCCESS
omalley : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379646
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeHttpServer.java


> The SPNEGO filter for the NameNode should come out of the web keytab file
> -
>
> Key: HDFS-3466
> URL: https://issues.apache.org/jira/browse/HDFS-3466
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node, security
>Affects Versions: 1.1.0, 2.0.0-alpha
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 1.1.0, 2.1.0-alpha
>
> Attachments: hdfs-3466-b1-2.patch, hdfs-3466-b1.patch, 
> hdfs-3466-trunk-2.patch, hdfs-3466-trunk-3.patch
>
>
> Currently, the spnego filter uses the DFS_NAMENODE_KEYTAB_FILE_KEY to find 
> the keytab. It should use the DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY to 
> do it.



[jira] [Commented] (HDFS-3852) TestHftpDelegationToken is broken after HADOOP-8225

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446687#comment-13446687
 ] 

Hudson commented on HDFS-3852:
--

Integrated in Hadoop-Hdfs-trunk #1152 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1152/])
HDFS-3852. TestHftpDelegationToken is broken after HADOOP-8225 (daryn) 
(Revision 1379623)

 Result = SUCCESS
daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379623
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpDelegationToken.java


> TestHftpDelegationToken is broken after HADOOP-8225
> ---
>
> Key: HDFS-3852
> URL: https://issues.apache.org/jira/browse/HDFS-3852
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, security
>Affects Versions: 0.23.3, 2.1.0-alpha
>Reporter: Aaron T. Myers
>Assignee: Daryn Sharp
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
> Attachments: HDFS-3852.patch
>
>
> It's been failing in all builds for the last 2 days or so. Git bisect 
> indicates that it's due to HADOOP-8225.



[jira] [Commented] (HDFS-3873) Hftp assumes security is disabled if token fetch fails

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446685#comment-13446685
 ] 

Hudson commented on HDFS-3873:
--

Integrated in Hadoop-Hdfs-trunk #1152 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1152/])
HDFS-3873. Hftp assumes security is disabled if token fetch fails (daryn) 
(Revision 1379615)

 Result = SUCCESS
daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379615
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpDelegationToken.java


> Hftp assumes security is disabled if token fetch fails
> --
>
> Key: HDFS-3873
> URL: https://issues.apache.org/jira/browse/HDFS-3873
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
> Attachments: HDFS-3873.branch-23.patch, HDFS-3873.patch
>
>
> Hftp ignores all exceptions generated while trying to get a token, based on 
> the assumption that it means security is disabled.  Debugging problems is 
> excruciatingly difficult when security is enabled but something goes wrong.  
> Job submissions succeed, but tasks fail because the NN rejects the user as 
> unauthenticated.



[jira] [Commented] (HDFS-3833) TestDFSShell fails on Windows due to file concurrent read write

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446682#comment-13446682
 ] 

Hudson commented on HDFS-3833:
--

Integrated in Hadoop-Hdfs-trunk #1152 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1152/])
HDFS-3833. TestDFSShell fails on windows due to concurrent file read/write. 
Contributed by Brandon Li (Revision 1379525)

 Result = SUCCESS
suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379525
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java


> TestDFSShell fails on Windows due to file concurrent read write
> ---
>
> Key: HDFS-3833
> URL: https://issues.apache.org/jira/browse/HDFS-3833
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0, 1-win
>Reporter: Brandon Li
>Assignee: Brandon Li
> Fix For: 3.0.0, 1-win, 2.2.0-alpha
>
> Attachments: HDFS-3833.branch-1-win.patch, HDFS-3833.patch
>
>
> TestDFSShell sometimes fails due to the race between the write issued by the 
> test and blockscanner. Example stack trace:
> {noformat}
> Error Message
> c:\A\HM\build\test\data\dfs\data\data1\current\blk_-7735708801221347790 (The 
> requested operation cannot be performed on a file with a user-mapped section 
> open)
> Stacktrace
> java.io.FileNotFoundException: 
> c:\A\HM\build\test\data\dfs\data\data1\current\blk_-7735708801221347790 (The 
> requested operation cannot be performed on a file with a user-mapped section 
> open)
>   at java.io.FileOutputStream.open(Native Method)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:194)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:145)
>   at java.io.PrintWriter.<init>(PrintWriter.java:218)
>   at org.apache.hadoop.hdfs.TestDFSShell.corrupt(TestDFSShell.java:1133)
>   at org.apache.hadoop.hdfs.TestDFSShell.testGet(TestDFSShell.java:1231)
> {noformat}



[jira] [Commented] (HDFS-3871) Change NameNodeProxies to use HADOOP-8748

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446676#comment-13446676
 ] 

Hudson commented on HDFS-3871:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2695 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2695/])
HDFS-3871. Change NameNodeProxies to use RetryUtils.  Contributed by Arun C 
Murthy (Revision 1379743)

 Result = FAILURE
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379743
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java


> Change NameNodeProxies to use HADOOP-8748
> -
>
> Key: HDFS-3871
> URL: https://issues.apache.org/jira/browse/HDFS-3871
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
>Priority: Minor
> Fix For: 1.2.0, 2.2.0-alpha
>
> Attachments: HDFS-3781_branch1.patch, HDFS-3781.patch
>
>
> Change NameNodeProxies to use util method introduced via HADOOP-8748.



[jira] [Commented] (HDFS-3873) Hftp assumes security is disabled if token fetch fails

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446669#comment-13446669
 ] 

Hudson commented on HDFS-3873:
--

Integrated in Hadoop-Hdfs-0.23-Build #361 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/361/])
HDFS-3873. Hftp assumes security is disabled if token fetch fails (daryn) 
(Revision 1379620)

 Result = UNSTABLE
daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379620
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpDelegationToken.java


> Hftp assumes security is disabled if token fetch fails
> --
>
> Key: HDFS-3873
> URL: https://issues.apache.org/jira/browse/HDFS-3873
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client
>Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
> Attachments: HDFS-3873.branch-23.patch, HDFS-3873.patch
>
>
> Hftp ignores all exceptions generated while trying to get a token, based on 
> the assumption that it means security is disabled.  Debugging problems is 
> excruciatingly difficult when security is enabled but something goes wrong.  
> Job submissions succeed, but tasks fail because the NN rejects the user as 
> unauthenticated.



[jira] [Commented] (HDFS-3852) TestHftpDelegationToken is broken after HADOOP-8225

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446672#comment-13446672
 ] 

Hudson commented on HDFS-3852:
--

Integrated in Hadoop-Hdfs-0.23-Build #361 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/361/])
svn merge -c 1379623 FIXES: HDFS-3852. TestHftpDelegationToken is broken 
after HADOOP-8225 (daryn) (Revision 1379627)

 Result = UNSTABLE
daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379627
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpDelegationToken.java


> TestHftpDelegationToken is broken after HADOOP-8225
> ---
>
> Key: HDFS-3852
> URL: https://issues.apache.org/jira/browse/HDFS-3852
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs client, security
>Affects Versions: 0.23.3, 2.1.0-alpha
>Reporter: Aaron T. Myers
>Assignee: Daryn Sharp
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
> Attachments: HDFS-3852.patch
>
>
> It's been failing in all builds for the last 2 days or so. Git bisect 
> indicates that it's due to HADOOP-8225.



[jira] [Commented] (HDFS-3886) Shutdown requests can possibly check for checkpoint issues (corrupted edits) and save a good namespace copy before closing down?

2012-09-01 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446659#comment-13446659
 ] 

Harsh J commented on HDFS-3886:
---

Thanks much, Steve. Perhaps instead the shutdown scripts could call 
saveNamespace before sending the signal? That way we would achieve roughly the 
same thing.

> Shutdown requests can possibly check for checkpoint issues (corrupted edits) 
> and save a good namespace copy before closing down?
> 
>
> Key: HDFS-3886
> URL: https://issues.apache.org/jira/browse/HDFS-3886
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Priority: Minor
>
> HDFS-3878 sort of gives me this idea. Aside from having a method to download 
> the image to a different location, we could also lock up the namesystem (or 
> deactivate the client RPC server) and save the namespace before completing 
> the shutdown.
> The init.d/shutdown scripts would have to cooperate with this somehow, so 
> they do not kill -9 the process while the save is in progress. Also, the new 
> image could be stored in a shutdown.chkpt directory, so it does not 
> interfere with the regular directories but still allows easier recovery.
> Obviously this still will not work if all directories are broken, so maybe 
> we could have some configs to tackle that as well?
> I haven't thought this through, so let me know what part is wrong to do :)



[jira] [Commented] (HDFS-3871) Change NameNodeProxies to use HADOOP-8748

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446656#comment-13446656
 ] 

Hudson commented on HDFS-3871:
--

Integrated in Hadoop-Common-trunk-Commit #2670 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2670/])
HDFS-3871. Change NameNodeProxies to use RetryUtils.  Contributed by Arun C 
Murthy (Revision 1379743)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379743
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java


> Change NameNodeProxies to use HADOOP-8748
> -
>
> Key: HDFS-3871
> URL: https://issues.apache.org/jira/browse/HDFS-3871
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
>Priority: Minor
> Fix For: 1.2.0, 2.2.0-alpha
>
> Attachments: HDFS-3781_branch1.patch, HDFS-3781.patch
>
>
> Change NameNodeProxies to use util method introduced via HADOOP-8748.



[jira] [Commented] (HDFS-3871) Change NameNodeProxies to use HADOOP-8748

2012-09-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446658#comment-13446658
 ] 

Hudson commented on HDFS-3871:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2733 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2733/])
HDFS-3871. Change NameNodeProxies to use RetryUtils.  Contributed by Arun C 
Murthy (Revision 1379743)

 Result = SUCCESS
szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379743
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java


> Change NameNodeProxies to use HADOOP-8748
> -
>
> Key: HDFS-3871
> URL: https://issues.apache.org/jira/browse/HDFS-3871
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
>Priority: Minor
> Fix For: 1.2.0, 2.2.0-alpha
>
> Attachments: HDFS-3781_branch1.patch, HDFS-3781.patch
>
>
> Change NameNodeProxies to use util method introduced via HADOOP-8748.



[jira] [Updated] (HDFS-3871) Change NameNodeProxies to use HADOOP-8748

2012-09-01 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated HDFS-3871:
-

   Resolution: Fixed
Fix Version/s: 2.2.0-alpha
   1.2.0
   Status: Resolved  (was: Patch Available)

No problem.

I have committed this.  Thanks, Arun!

> Change NameNodeProxies to use HADOOP-8748
> -
>
> Key: HDFS-3871
> URL: https://issues.apache.org/jira/browse/HDFS-3871
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs client
>Reporter: Arun C Murthy
>Assignee: Arun C Murthy
>Priority: Minor
> Fix For: 1.2.0, 2.2.0-alpha
>
> Attachments: HDFS-3781_branch1.patch, HDFS-3781.patch
>
>
> Change NameNodeProxies to use util method introduced via HADOOP-8748.



[jira] [Commented] (HDFS-3886) Shutdown requests can possibly check for checkpoint issues (corrupted edits) and save a good namespace copy before closing down?

2012-09-01 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446653#comment-13446653
 ] 

Steve Loughran commented on HDFS-3886:
--

Currently, a kill -2 is the signal sent from init.d to trigger a managed 
shutdown, but the shutdown needs to complete within a bounded period; 
otherwise, robust init.d / Linux HA scripts will escalate to a kill -9, 
because they need to shut down the system reliably.

Any change that stops service scripts from using timeout+escalation would be 
counterproductive from a service-management perspective.

Now, if there were another signal handler that triggered a lock-up and 
namespace save, that could be good, but that would lie outside init.d land.
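One way to reconcile the two constraints above (a best-effort save, but a 
bounded shutdown) is a JVM shutdown hook that abandons the save once a time 
budget expires, leaving the init.d timeout+escalation untouched. This is only 
a minimal sketch of the idea, not NameNode code; SAVE_BUDGET_MS and the 
saveNamespace() stand-in are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BoundedShutdownSave {
    // Hypothetical budget: must be well under the init.d escalation timeout.
    static final long SAVE_BUDGET_MS = 5_000;

    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            ExecutorService ex = Executors.newSingleThreadExecutor();
            ex.submit(() -> saveNamespace());  // stand-in for the real save
            ex.shutdown();
            try {
                // Wait only as long as the budget allows, then give up so
                // the managed shutdown still completes in bounded time.
                if (!ex.awaitTermination(SAVE_BUDGET_MS, TimeUnit.MILLISECONDS)) {
                    System.err.println("namespace save did not finish in time; proceeding");
                }
            } catch (InterruptedException ignored) {
                Thread.currentThread().interrupt();
            }
        }));
        System.out.println("exiting; shutdown hook will run");
    }

    static void saveNamespace() {
        // Placeholder for locking the namesystem and writing the image.
        System.out.println("saved namespace to shutdown.chkpt");
    }
}
```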

> Shutdown requests can possibly check for checkpoint issues (corrupted edits) 
> and save a good namespace copy before closing down?
> 
>
> Key: HDFS-3886
> URL: https://issues.apache.org/jira/browse/HDFS-3886
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Priority: Minor
>
> HDFS-3878 sort of gives me this idea. Aside from having a method to download 
> the image to a different location, we could also lock up the namesystem (or 
> deactivate the client RPC server) and save the namespace before completing 
> the shutdown.
> The init.d/shutdown scripts would have to cooperate with this somehow, so 
> they do not kill -9 the process while the save is in progress. Also, the new 
> image could be stored in a shutdown.chkpt directory, so it does not 
> interfere with the regular directories but still allows easier recovery.
> Obviously this still will not work if all directories are broken, so maybe 
> we could have some configs to tackle that as well?
> I haven't thought this through, so let me know what part is wrong to do :)



[jira] [Created] (HDFS-3886) Shutdown requests can possibly check for checkpoint issues (corrupted edits) and save a good namespace copy before closing down?

2012-09-01 Thread Harsh J (JIRA)
Harsh J created HDFS-3886:
-

 Summary: Shutdown requests can possibly check for checkpoint 
issues (corrupted edits) and save a good namespace copy before closing down?
 Key: HDFS-3886
 URL: https://issues.apache.org/jira/browse/HDFS-3886
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Priority: Minor


HDFS-3878 sort of gives me this idea. Aside from having a method to download 
the image to a different location, we could also lock up the namesystem (or 
deactivate the client RPC server) and save the namespace before completing the 
shutdown.

The init.d/shutdown scripts would have to cooperate with this somehow, so they 
do not kill -9 the process while the save is in progress. Also, the new image 
could be stored in a shutdown.chkpt directory, so it does not interfere with 
the regular directories but still allows easier recovery.

Obviously this still will not work if all directories are broken, so maybe we 
could have some configs to tackle that as well?

I haven't thought this through, so let me know what part is wrong to do :)



[jira] [Created] (HDFS-3885) QJM: optimize log sync when JN is lagging behind

2012-09-01 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-3885:
-

 Summary: QJM: optimize log sync when JN is lagging behind
 Key: HDFS-3885
 URL: https://issues.apache.org/jira/browse/HDFS-3885
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon


This is a potential optimization we can add to the JournalNode: when one of 
the nodes is lagging behind the others (e.g. because its local disk is slower 
or there was a network blip), it receives edits after they have been committed 
to a majority. It can tell this because the committed txid included in the 
request info is higher than the highest txid in the actual batch to be 
written. In this case, we know that this batch has already been fsynced to a 
quorum of nodes, so we can skip the fsync() on the laggy node, helping it 
catch back up.
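The proposed optimization boils down to one comparison on the JournalNode 
write path. A minimal sketch of that decision, with hypothetical names 
(canSkipFsync and the txid parameters), not the actual QJM code:

```java
public class LaggySyncSketch {
    /**
     * Decide whether a JournalNode may skip fsync for a batch of edits.
     * If the committed txid carried in the request is strictly higher than
     * the last txid in this batch, a quorum has already fsynced these edits,
     * so the lagging node can write without syncing and catch up faster.
     */
    static boolean canSkipFsync(long committedTxId, long lastTxIdInBatch) {
        return committedTxId > lastTxIdInBatch;
    }

    public static void main(String[] args) {
        // Lagging node: the quorum has committed up to txid 120,
        // but this batch only reaches txid 100, so fsync can be skipped.
        System.out.println(canSkipFsync(120, 100)); // prints true
        // Up-to-date node: the batch ends at the commit frontier, must fsync.
        System.out.println(canSkipFsync(100, 100)); // prints false
    }
}
```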



[jira] [Updated] (HDFS-3870) QJM: add metrics to JournalNode

2012-09-01 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3870:
--

Attachment: hdfs-3870.txt

> QJM: add metrics to JournalNode
> ---
>
> Key: HDFS-3870
> URL: https://issues.apache.org/jira/browse/HDFS-3870
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: QuorumJournalManager (HDFS-3077)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-3870.txt
>
>
> The JournalNode should expose some basic metrics through the usual interface. 
> In particular:
> - the writer epoch, accepted epoch,
> - the last written transaction ID and last committed txid (which may be newer 
> in case that it's in the process of catching up)
> - latency information for how long the syncs are taking
> Please feel free to suggest others that come to mind.
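To make the proposed counters concrete, here is a minimal sketch of the 
quantities involved, using plain Java fields rather than Hadoop's actual 
metrics API; all names are hypothetical:

```java
public class JournalMetricsSketch {
    // Plain fields standing in for a real metrics registry.
    long writerEpoch;
    long acceptedEpoch;
    long lastWrittenTxId;
    long lastCommittedTxId;   // may be newer while the JN is catching up
    long syncNanosTotal;
    long syncCount;

    // Record the duration of one fsync to derive latency information.
    void recordSync(long nanos) {
        syncNanosTotal += nanos;
        syncCount++;
    }

    // Average sync latency in microseconds (0 if nothing recorded yet).
    double avgSyncMicros() {
        return syncCount == 0 ? 0.0 : (syncNanosTotal / 1000.0) / syncCount;
    }

    public static void main(String[] args) {
        JournalMetricsSketch m = new JournalMetricsSketch();
        m.recordSync(2_000_000);  // a 2 ms sync
        m.recordSync(4_000_000);  // a 4 ms sync
        System.out.println(m.avgSyncMicros()); // prints 3000.0
    }
}
```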



[jira] [Updated] (HDFS-3884) QJM: Journal format() should reset cached values

2012-09-01 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3884:
--

Attachment: hdfs-3884.txt

> QJM: Journal format() should reset cached values
> 
>
> Key: HDFS-3884
> URL: https://issues.apache.org/jira/browse/HDFS-3884
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: QuorumJournalManager (HDFS-3077)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: QuorumJournalManager (HDFS-3077)
>
> Attachments: hdfs-3884.txt
>
>
> Simple bug in the JournalNode: it caches certain values (eg accepted epoch) 
> in memory, and the cached values aren't reset when the journal is formatted. 
> So, after a format, further calls to the same Journal will see the old value 
> for accepted epoch, writer epoch, etc, preventing the journal from being 
> re-used until the JN is restarted.



[jira] [Created] (HDFS-3884) QJM: Journal format() should reset cached values

2012-09-01 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-3884:
-

 Summary: QJM: Journal format() should reset cached values
 Key: HDFS-3884
 URL: https://issues.apache.org/jira/browse/HDFS-3884
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: QuorumJournalManager (HDFS-3077)
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: QuorumJournalManager (HDFS-3077)
 Attachments: hdfs-3884.txt

Simple bug in the JournalNode: it caches certain values (e.g. accepted epoch) 
in memory, and the cached values aren't reset when the journal is formatted. 
So, after a format, further calls to the same Journal will see the old values 
for accepted epoch, writer epoch, etc., preventing the journal from being 
re-used until the JN is restarted.
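The fix amounts to clearing the in-memory copies whenever the on-disk state is 
wiped. A minimal sketch with hypothetical names (acceptedEpoch, 
INVALID_EPOCH), not the actual JournalNode code:

```java
public class JournalCacheSketch {
    static final long INVALID_EPOCH = 0;  // hypothetical "unset" sentinel

    private long acceptedEpoch = INVALID_EPOCH;
    private long writerEpoch = INVALID_EPOCH;

    void newEpoch(long epoch) {
        acceptedEpoch = epoch;
    }

    /** The fix: reset every cached value when the on-disk state is wiped. */
    void format() {
        // ... wipe on-disk storage here, then clear the in-memory cache:
        acceptedEpoch = INVALID_EPOCH;
        writerEpoch = INVALID_EPOCH;
    }

    long getAcceptedEpoch() {
        return acceptedEpoch;
    }

    public static void main(String[] args) {
        JournalCacheSketch j = new JournalCacheSketch();
        j.newEpoch(5);
        j.format();
        // Without the reset, a fresh writer starting at epoch 1 would be
        // rejected, because the stale cached acceptedEpoch (5) would win.
        System.out.println(j.getAcceptedEpoch()); // prints 0
    }
}
```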
