[jira] [Commented] (HDFS-2742) HA: observed dataloss in replication stress test

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183911#comment-13183911
 ] 

Todd Lipcon commented on HDFS-2742:
---

bq. What is the implication of ignoring RBW altogether at the standby?

That's an idea I've thought a little about, but I think it has some 
implications for lease recovery. In fact, to fix the cases in HDFS-2691, I 
think we need to send RBW blockReceived messages to the SBN as soon as a 
pipeline is constructed.

I do like it, though, as at least a stop-gap for now while we work on a more 
thorough solution.

bq. If editlog has a finalized record, can we just ignore the RBW from the 
block report?

Possibly - I haven't thought through the whole Append state machine. I assumed 
that the code that marks a RBW replica as corrupt when received for a COMPLETED 
block is probably there for a good reason... so changing the behavior there 
might introduce some other bugs that could even hurt the non-HA case.

I'm going to keep working on this and see if I can come up with a simpler 
solution based on some of Suresh's ideas above.

> HA: observed dataloss in replication stress test
> 
>
> Key: HDFS-2742
> URL: https://issues.apache.org/jira/browse/HDFS-2742
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-2742.txt, log-colorized.txt
>
>
> The replication stress test case failed over the weekend since one of the 
> replicas went missing. Still diagnosing the issue, but it seems like the 
> chain of events was something like:
> - a block report was generated on one of the nodes while the block was being 
> written - thus the block report listed the block as RBW
> - when the standby replayed this queued message, it was replayed after the 
> file was marked complete. Thus it marked this replica as corrupt
> - it asked the DN holding the corrupt replica to delete it. And, I think, 
> removed it from the block map at this time.
> - That DN then did another block report before receiving the deletion. This 
> caused it to be re-added to the block map, since it was "FINALIZED" now.
> - Replication was lowered on the file, and it counted the above replica as 
> non-corrupt, and asked for the other replicas to be deleted.
> - All replicas were lost.
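As a rough illustration, the event ordering above can be condensed into a toy model. Every name in it (ReplicaRaceSketch, replay, blockMap) is hypothetical and does not mirror the real BlockManager code; it only shows how a stale RBW report replayed after file completion leaves one replica both counted as live and queued for deletion:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of the replica-state race in the description above.
// All names are hypothetical; this is not the real NameNode code.
public class ReplicaRaceSketch {
    enum State { RBW, FINALIZED }

    // Standby's view for a single block: datanode -> replica state.
    static final Map<String, State> blockMap = new HashMap<>();
    static final Set<String> pendingDeletions = new HashSet<>();
    static boolean fileComplete = false;

    // Replay one queued block report message at the standby.
    static void replay(String dn, State reported) {
        if (fileComplete && reported == State.RBW) {
            // Stale RBW report replayed after the file was marked
            // complete: the replica is treated as corrupt, dropped
            // from the map, and scheduled for deletion on the DN.
            blockMap.remove(dn);
            pendingDeletions.add(dn);
        } else {
            // A FINALIZED report simply (re-)adds the replica, even
            // if a deletion for it is still in flight.
            blockMap.put(dn, reported);
        }
    }

    // Runs the event ordering from the description; returns true if
    // dn3 ends up both counted as a live replica and queued for
    // deletion, the inconsistent state that enables the data loss.
    public static boolean raceReproduced() {
        blockMap.put("dn1", State.FINALIZED);
        blockMap.put("dn2", State.FINALIZED);
        fileComplete = true;
        replay("dn3", State.RBW);       // report generated mid-write
        replay("dn3", State.FINALIZED); // second report beats the deletion
        return blockMap.containsKey("dn3")
                && pendingDeletions.contains("dn3");
    }

    public static void main(String[] args) {
        System.out.println("race reproduced: " + raceReproduced());
    }
}
```

Once dn3 is wrongly counted as a live replica, lowering the file's replication can select dn1 and dn2 for removal, and the still-pending deletion then claims dn3 as well.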

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2766) HA: test for case where standby partially reads log and then performs checkpoint

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183902#comment-13183902
 ] 

Todd Lipcon commented on HDFS-2766:
---

+1 lgtm.

> HA: test for case where standby partially reads log and then performs 
> checkpoint
> 
>
> Key: HDFS-2766
> URL: https://issues.apache.org/jira/browse/HDFS-2766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Aaron T. Myers
> Attachments: HDFS-2766-HDFS-1623.patch, HDFS-2766-HDFS-1623.patch
>
>
> Here's a potential bug case that we don't currently test for:
> - SBN is reading a finalized edits file when NFS disappears halfway through 
> (or some intermittent error happens)
> - SBN performs a checkpoint and uploads it to the NN
> - NN receives a checkpoint that doesn't correspond to the end of any log 
> segment
> - Both NN and SBN should be able to restart at this point.





[jira] [Resolved] (HDFS-2775) HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently

2012-01-10 Thread Todd Lipcon (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved HDFS-2775.
---

   Resolution: Fixed
Fix Version/s: HA branch (HDFS-1623)
 Hadoop Flags: Reviewed

Committed to branch, thx.

> HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently
> ---
>
> Key: HDFS-2775
> URL: https://issues.apache.org/jira/browse/HDFS-2775
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: HA branch (HDFS-1623)
>
> Attachments: hdfs-2775.txt
>
>
> This test is failing periodically on this assertion:
> {code}
> assertEquals(12, nn0.getNamesystem().getFSImage().getStorage()
> .getMostRecentCheckpointTxId());
> {code}
> My guess is it's a test race. Investigating...





[jira] [Commented] (HDFS-2775) HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183900#comment-13183900
 ] 

Todd Lipcon commented on HDFS-2775:
---

bq. Should FSImage#getMostRecentCheckpointTxId perhaps be marked 
@VisibleForTesting?

Eh, I don't see any reason it shouldn't be used elsewhere in the code either. I 
generally try to only use that when you're exposing some piece of internal 
state that shouldn't normally be used from the main non-test code.

> HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently
> ---
>
> Key: HDFS-2775
> URL: https://issues.apache.org/jira/browse/HDFS-2775
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: HA branch (HDFS-1623)
>
> Attachments: hdfs-2775.txt
>
>
> This test is failing periodically on this assertion:
> {code}
> assertEquals(12, nn0.getNamesystem().getFSImage().getStorage()
> .getMostRecentCheckpointTxId());
> {code}
> My guess is it's a test race. Investigating...





[jira] [Commented] (HDFS-2738) FSEditLog.selectinputStreams is reading through in-progress streams even when non-in-progress are requested

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183899#comment-13183899
 ] 

Todd Lipcon commented on HDFS-2738:
---

+1, looks good to me. Thanks for making those changes.

> FSEditLog.selectinputStreams is reading through in-progress streams even when 
> non-in-progress are requested
> ---
>
> Key: HDFS-2738
> URL: https://issues.apache.org/jira/browse/HDFS-2738
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Aaron T. Myers
>Priority: Blocker
> Attachments: HDFS-2738-HDFS-1623.patch, HDFS-2738-HDFS-1623.patch, 
> HDFS-2738-HDFS-1623.patch
>
>
> The new code in HDFS-1580 is causing an issue with selectInputStreams in the 
> HA context. When the active is writing to the shared edits, 
> selectInputStreams is called on the standby. This ends up calling 
> {{journalSet.getInputStream}} but doesn't pass the {{inProgressOk=false}} 
> flag. So, {{getInputStream}} ends up reading and validating the in-progress 
> stream unnecessarily. Since the validation results are no longer properly 
> cached, {{findMaxTransaction}} also re-validates the in-progress stream, and 
> then breaks the corruption check in this code. The end result is a lot of 
> errors like:
> 2011-12-30 16:45:02,521 ERROR namenode.FileJournalManager 
> (FileJournalManager.java:getNumberOfTransactions(266)) - Gap in transactions, 
> max txnid is 579, 0 txns from 578
> 2011-12-30 16:45:02,521 INFO  ha.EditLogTailer (EditLogTailer.java:run(163)) 
> - Got error, will try again.
> java.io.IOException: No non-corrupt logs for txid 578
>   at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet.getInputStream(JournalSet.java:229)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1081)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:115)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$0(EditLogTailer.java:100)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:154)
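As a sketch of the intended contract, the toy code below (hypothetical Segment and select names, not the real JournalSet or FileJournalManager APIs) shows in-progress segments being filtered out before any validation when inProgressOk is false:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch of the inProgressOk contract described above;
// the real JournalSet/FileJournalManager APIs differ.
public class SegmentFilterSketch {
    static class Segment {
        final long firstTxId;
        final boolean inProgress;
        Segment(long firstTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.inProgress = inProgress;
        }
    }

    // Select segments starting at or after fromTxId. When
    // inProgressOk is false (a standby tailing finalized logs), the
    // open segment must be skipped before any validation happens,
    // which is exactly what the buggy call path failed to do.
    public static List<Segment> select(List<Segment> all, long fromTxId,
                                       boolean inProgressOk) {
        return all.stream()
                .filter(s -> inProgressOk || !s.inProgress)
                .filter(s -> s.firstTxId >= fromTxId)
                .collect(Collectors.toList());
    }

    // Small demo: two finalized segments plus the active NN's
    // in-progress segment starting at txid 578.
    public static int finalizedOnlyCount() {
        List<Segment> segs = Arrays.asList(
                new Segment(1, false),
                new Segment(100, false),
                new Segment(578, true));
        return select(segs, 100, false).size();
    }

    public static void main(String[] args) {
        // Only the finalized segment at txid 100 survives the filter.
        System.out.println(finalizedOnlyCount());
    }
}
```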





[jira] [Resolved] (HDFS-2773) HA: reading edit logs from an earlier version leaves blocks in under-construction state

2012-01-10 Thread Todd Lipcon (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved HDFS-2773.
---

   Resolution: Fixed
Fix Version/s: HA branch (HDFS-1623)
 Hadoop Flags: Reviewed

Committed to branch, thx for review.

> HA: reading edit logs from an earlier version leaves blocks in 
> under-construction state
> ---
>
> Key: HDFS-2773
> URL: https://issues.apache.org/jira/browse/HDFS-2773
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: HA branch (HDFS-1623)
>
> Attachments: hadoop-1.0-multiblock-file.tgz, hdfs-2773.txt
>
>
> In HDFS-2602, the code for applying OP_ADD and OP_CLOSE was changed a bit, 
> and the new code has the following problem: if an OP_CLOSE includes new 
> blocks (ie not previously seen in an OP_ADD) then those blocks will remain in 
> the "under construction" state rather than being marked "complete". This is 
> because {{updateBlocks}} always creates {{BlockInfoUnderConstruction}} 
> regardless of the opcode. This bug only affects the upgrade path, since in 
> trunk we always persist blocks with OP_ADDs before we call OP_CLOSE.
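The direction of the fix implied above can be sketched as follows. The types here are hypothetical stand-ins for BlockInfo and BlockInfoUnderConstruction, not the actual patch: the point is only that block construction should depend on the opcode instead of always producing an under-construction replica.

```java
// Hypothetical stand-ins for BlockInfo / BlockInfoUnderConstruction;
// this illustrates the opcode-dependent choice, not the actual patch.
public class UpdateBlocksSketch {
    public enum Op { OP_ADD, OP_CLOSE }

    public static class Block {
        public final boolean underConstruction;
        Block(boolean uc) { this.underConstruction = uc; }
    }

    // The bug: unconditionally creating an under-construction block.
    // The fix: an OP_CLOSE carries finalized blocks, so a new block
    // first seen there (e.g. in an edit log written by an earlier
    // Hadoop version) must be created as complete.
    public static Block newBlockFor(Op op) {
        return new Block(op != Op.OP_CLOSE);
    }

    public static void main(String[] args) {
        System.out.println(newBlockFor(Op.OP_ADD).underConstruction);   // true
        System.out.println(newBlockFor(Op.OP_CLOSE).underConstruction); // false
    }
}
```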





[jira] [Commented] (HDFS-2592) HA: Balancer support for HA namenodes

2012-01-10 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183895#comment-13183895
 ] 

Uma Maheswara Rao G commented on HDFS-2592:
---

Thanks a lot again, Todd. I will address all your comments in the next patch. 
In fact, I have already started the refactoring, mainly to avoid the 
duplication. I was waiting for initial feedback on the approach.


Thanks
Uma

> HA: Balancer support for HA namenodes
> -
>
> Key: HDFS-2592
> URL: https://issues.apache.org/jira/browse/HDFS-2592
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, ha
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2592.patch, HDFS-2592.patch
>
>
> The balancer currently interacts directly with namenode InetSocketAddresses 
> and makes its own IPC proxies. We need to integrate it with HA so that it 
> uses the same client failover infrastructure.





[jira] [Commented] (HDFS-2499) Fix RPC client creation bug from HDFS-2459

2012-01-10 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183896#comment-13183896
 ] 

Hudson commented on HDFS-2499:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #1544 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1544/])
Add HDFS-2499 to CHANGES.txt.

szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1229897
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Fix RPC client creation bug from HDFS-2459
> --
>
> Key: HDFS-2499
> URL: https://issues.apache.org/jira/browse/HDFS-2499
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.24.0
>
> Attachments: HDFS-2499.txt, HDFS-2499.txt
>
>
> HDFS-2459 incorrectly implemented the RPC getProxy for the JournalProtocol 
> client side. It sets retry policies and other policies that are not necessary





[jira] [Commented] (HDFS-2773) HA: reading edit logs from an earlier version leaves blocks in under-construction state

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183894#comment-13183894
 ] 

Todd Lipcon commented on HDFS-2773:
---

I added the following:
{code}
+  // OP_CLOSE should add finalized blocks. This code path
+  // is only executed when loading edits written by prior
+  // versions of Hadoop. Current versions always log
+  // OP_ADD operations as each block is allocated.
+  newBI = new BlockInfo(newBlock, file.getReplication());
{code}
Will commit momentarily.

> HA: reading edit logs from an earlier version leaves blocks in 
> under-construction state
> ---
>
> Key: HDFS-2773
> URL: https://issues.apache.org/jira/browse/HDFS-2773
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hadoop-1.0-multiblock-file.tgz, hdfs-2773.txt
>
>
> In HDFS-2602, the code for applying OP_ADD and OP_CLOSE was changed a bit, 
> and the new code has the following problem: if an OP_CLOSE includes new 
> blocks (ie not previously seen in an OP_ADD) then those blocks will remain in 
> the "under construction" state rather than being marked "complete". This is 
> because {{updateBlocks}} always creates {{BlockInfoUnderConstruction}} 
> regardless of the opcode. This bug only affects the upgrade path, since in 
> trunk we always persist blocks with OP_ADDs before we call OP_CLOSE.





[jira] [Resolved] (HDFS-2753) Standby namenode stuck in safemode during a failover

2012-01-10 Thread Todd Lipcon (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved HDFS-2753.
---

   Resolution: Fixed
Fix Version/s: HA branch (HDFS-1623)
 Hadoop Flags: Reviewed

> Standby namenode stuck in safemode during a failover
> 
>
> Key: HDFS-2753
> URL: https://issues.apache.org/jira/browse/HDFS-2753
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Hari Mankude
>Assignee: Hari Mankude
> Fix For: HA branch (HDFS-1623)
>
> Attachments: HDFS-2753.patch, hdfs-2753.txt, hdfs-2753.txt
>
>
> Write traffic is initiated from the client. A manual failover is done by 
> killing the NN and converting a different standby to active. The NN is 
> restarted as standby. The restarted standby stays in safemode forever. More 
> information is in the description.





[jira] [Updated] (HDFS-2527) Remove the use of Range header from webhdfs

2012-01-10 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-2527:
--

Target Version/s:   (was: 1.0.0)
   Fix Version/s: (was: 1.1.0)
  (was: 0.23.0)

This was included in branch-23.0 but not shipped as part of the 23.0 release.

> Remove the use of Range header from webhdfs
> ---
>
> Key: HDFS-2527
> URL: https://issues.apache.org/jira/browse/HDFS-2527
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.24.0, 0.23.1, 1.0.0
>
> Attachments: h2527_2001b_0.20s.patch, h2527_2002.patch, 
> h2527_2002_0.20s.patch
>
>






[jira] [Updated] (HDFS-2416) distcp with a webhdfs uri on a secure cluster fails

2012-01-10 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-2416:
--

Target Version/s:   (was: 1.0.0)
   Fix Version/s: (was: 1.1.0)
  (was: 0.23.0)
  0.23.1

This was included in branch-23.0 but not shipped as part of the 23.0 release.

> distcp with a webhdfs uri on a secure cluster fails
> ---
>
> Key: HDFS-2416
> URL: https://issues.apache.org/jira/browse/HDFS-2416
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 0.20.205.0
>Reporter: Arpit Gupta
>Assignee: Jitendra Nath Pandey
> Fix For: 0.24.0, 0.23.1, 1.0.0
>
> Attachments: HDFS-2416-branch-0.20-security.6.patch, 
> HDFS-2416-branch-0.20-security.7.patch, 
> HDFS-2416-branch-0.20-security.8.patch, HDFS-2416-branch-0.20-security.patch, 
> HDFS-2416-trunk.patch, HDFS-2416-trunk.patch, 
> HDFS-2419-branch-0.20-security.patch, HDFS-2419-branch-0.20-security.patch
>
>






[jira] [Updated] (HDFS-2539) Support doAs and GETHOMEDIRECTORY in webhdfs

2012-01-10 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-2539:
--

Fix Version/s: (was: 1.1.0)
   (was: 0.23.0)

This was included in branch-23.0 but not shipped as part of the 23.0 release.

> Support doAs and GETHOMEDIRECTORY in webhdfs
> 
>
> Key: HDFS-2539
> URL: https://issues.apache.org/jira/browse/HDFS-2539
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.24.0, 0.23.1, 1.0.0
>
> Attachments: h2539_2008.patch, h2539_2008_0.20s.patch, 
> h2539_2008_0.20s.patch, h2539_2009.patch, h2539_2009_0.20s.patch, 
> h2539_2009b.patch, h2539_2009b_0.20s.patch, h2539_2009c.patch, 
> h2539_2009c_0.20s.patch, h2539_2010.patch, 
> h2539_2010_0.20s.patch, h2539_2010b.patch, h2539_2010b_0.20s.patch
>
>






[jira] [Updated] (HDFS-2528) webhdfs rest call to a secure dn fails when a token is sent

2012-01-10 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-2528:
--

Target Version/s:   (was: 1.0.0)
   Fix Version/s: (was: 1.1.0)
  (was: 0.23.0)

This was included in branch-23.0 but not shipped as part of the 23.0 release.

> webhdfs rest call to a secure dn fails when a token is sent
> ---
>
> Key: HDFS-2528
> URL: https://issues.apache.org/jira/browse/HDFS-2528
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 0.20.205.0
>Reporter: Arpit Gupta
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.24.0, 0.23.1, 1.0.0
>
> Attachments: h2528_2001.patch, h2528_2001_0.20s.patch, 
> h2528_2001b.patch, h2528_2001b_0.20s.patch, h2528_2002.patch, 
> h2528_2002_0.20s.patch, h2528_2003.patch, h2528_2003_0.20s.patch, 
> h2528_2003_0.20s.patch
>
>
> curl -L -u : --negotiate -i 
> "http://NN:50070/webhdfs/v1/tmp/webhdfs_data/file_small_data.txt?op=OPEN";
> the following exception is thrown by the datanode when the redirect happens.
> {"RemoteException":{"exception":"IOException","javaClassName":"java.io.IOException","message":"Call
>  to  failed on local exception: java.io.IOException: 
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]"}}
> Interestingly, when using ./bin/hadoop with a webhdfs path, we are able to 
> cat or tail a file successfully.





[jira] [Updated] (HDFS-2540) Change WebHdfsFileSystem to two-step create/append

2012-01-10 Thread Eli Collins (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins updated HDFS-2540:
--

Target Version/s:   (was: 1.0.0)
   Fix Version/s: (was: 1.1.0)
  (was: 0.23.0)

This was included in branch-23.0 but not shipped as part of the 23.0 release.

> Change WebHdfsFileSystem to two-step create/append
> --
>
> Key: HDFS-2540
> URL: https://issues.apache.org/jira/browse/HDFS-2540
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Tsz Wo (Nicholas), SZE
> Fix For: 0.24.0, 0.23.1, 1.0.0
>
> Attachments: h2540_2007.patch, h2540_2007_0.20s.patch, 
> h2540_2008.patch, h2540_2008_0.20s.patch
>
>






[jira] [Assigned] (HDFS-2767) HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol

2012-01-10 Thread Uma Maheswara Rao G (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G reassigned HDFS-2767:
-

Assignee: Uma Maheswara Rao G  (was: Todd Lipcon)

> HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol
> ---
>
> Key: HDFS-2767
> URL: https://issues.apache.org/jira/browse/HDFS-2767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, hdfs client
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Blocker
> Attachments: HDFS-2767.patch, hdfs-2767-what-todd-had.txt
>
>
> Presently, ConfiguredFailoverProxyProvider supports only ClientProtocol. It 
> should also support NameNodeProtocol, because the Balancer uses 
> NameNodeProtocol for getting blocks.





[jira] [Updated] (HDFS-2753) Standby namenode stuck in safemode during a failover

2012-01-10 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-2753:
--

Attachment: hdfs-2753.txt

Adjusted the test case javadoc to explain the queueing more clearly. I also 
fixed a bad javadoc @link higher up in the test cases that I noticed while I 
was in there. Will commit this momentarily.

> Standby namenode stuck in safemode during a failover
> 
>
> Key: HDFS-2753
> URL: https://issues.apache.org/jira/browse/HDFS-2753
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Hari Mankude
>Assignee: Hari Mankude
> Attachments: HDFS-2753.patch, hdfs-2753.txt, hdfs-2753.txt
>
>
> Write traffic is initiated from the client. A manual failover is done by 
> killing the NN and converting a different standby to active. The NN is 
> restarted as standby. The restarted standby stays in safemode forever. More 
> information is in the description.





[jira] [Commented] (HDFS-2499) Fix RPC client creation bug from HDFS-2459

2012-01-10 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183883#comment-13183883
 ] 

Hudson commented on HDFS-2499:
--

Integrated in Hadoop-Common-trunk-Commit #1525 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1525/])
Add HDFS-2499 to CHANGES.txt.

szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1229897
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Fix RPC client creation bug from HDFS-2459
> --
>
> Key: HDFS-2499
> URL: https://issues.apache.org/jira/browse/HDFS-2499
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.24.0
>
> Attachments: HDFS-2499.txt, HDFS-2499.txt
>
>
> HDFS-2459 incorrectly implemented the RPC getProxy for the JournalProtocol 
> client side. It sets retry policies and other policies that are not necessary





[jira] [Commented] (HDFS-2499) Fix RPC client creation bug from HDFS-2459

2012-01-10 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183882#comment-13183882
 ] 

Hudson commented on HDFS-2499:
--

Integrated in Hadoop-Hdfs-trunk-Commit #1598 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1598/])
Add HDFS-2499 to CHANGES.txt.

szetszwo : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1229897
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> Fix RPC client creation bug from HDFS-2459
> --
>
> Key: HDFS-2499
> URL: https://issues.apache.org/jira/browse/HDFS-2499
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.24.0
>
> Attachments: HDFS-2499.txt, HDFS-2499.txt
>
>
> HDFS-2459 incorrectly implemented the RPC getProxy for the JournalProtocol 
> client side. It sets retry policies and other policies that are not necessary





[jira] [Updated] (HDFS-2737) HA: Automatically trigger log rolls periodically on the active NN

2012-01-10 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-2737:
--

Attachment: hdfs-2737-prelim.txt

Attaching a preliminary patch, since ATM wanted to do some cluster testing 
that depends on this. I still need to finish some work on this; I forget 
whether it's actually working or not :)

> HA: Automatically trigger log rolls periodically on the active NN
> -
>
> Key: HDFS-2737
> URL: https://issues.apache.org/jira/browse/HDFS-2737
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-2737-prelim.txt
>
>
> Currently, the edit log tailing process can only read finalized log segments. 
> So, if the active NN is not rolling its logs periodically, the SBN will lag a 
> lot. This also causes many datanode messages to be queued up in the 
> PendingDatanodeMessage structure.
> To combat this, the active NN needs to roll its logs periodically -- perhaps 
> based on a time threshold, or perhaps based on a number of transactions. I'm 
> not sure yet whether it's better to have the NN roll on its own or to have 
> the SBN ask the active NN to roll its logs.
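Either trigger could be expressed as a simple predicate. The names and thresholds below are hypothetical, not part of any patch on this issue; the sketch just shows a roll decision based on whichever of the two limits is hit first:

```java
// Illustrative sketch of a roll-trigger policy for the active NN
// (hypothetical names and thresholds, not the eventual patch).
public class RollPolicySketch {
    private final long rollIntervalMs;   // time-based threshold
    private final long rollTxThreshold;  // transaction-count threshold

    public RollPolicySketch(long rollIntervalMs, long rollTxThreshold) {
        this.rollIntervalMs = rollIntervalMs;
        this.rollTxThreshold = rollTxThreshold;
    }

    // Roll when either enough wall-clock time has passed since the
    // last roll or enough transactions have accumulated in the
    // in-progress segment, whichever comes first.
    public boolean shouldRoll(long msSinceLastRoll, long txSinceLastRoll) {
        return msSinceLastRoll >= rollIntervalMs
                || txSinceLastRoll >= rollTxThreshold;
    }

    public static void main(String[] args) {
        RollPolicySketch p = new RollPolicySketch(120_000L, 10_000L);
        System.out.println(p.shouldRoll(5_000L, 200L));    // false: neither limit hit
        System.out.println(p.shouldRoll(130_000L, 200L));  // true: time limit
        System.out.println(p.shouldRoll(5_000L, 10_000L)); // true: txn limit
    }
}
```

Whether the NN rolls on its own or the SBN requests the roll, the same predicate could drive the decision; only where it is evaluated changes.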





[jira] [Commented] (HDFS-2499) Fix RPC client creation bug from HDFS-2459

2012-01-10 Thread Tsz Wo (Nicholas), SZE (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183880#comment-13183880
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-2499:
--

Forgot to add an entry in CHANGES.txt?  Let me update it now.

> Fix RPC client creation bug from HDFS-2459
> --
>
> Key: HDFS-2499
> URL: https://issues.apache.org/jira/browse/HDFS-2499
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 0.23.0, 0.24.0
>Reporter: Suresh Srinivas
>Assignee: Suresh Srinivas
> Fix For: 0.24.0
>
> Attachments: HDFS-2499.txt, HDFS-2499.txt
>
>
> HDFS-2459 incorrectly implemented the RPC getProxy for the JournalProtocol 
> client side. It sets retry policies and other policies that are not necessary





[jira] [Commented] (HDFS-2592) HA: Balancer support for HA namenodes

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183876#comment-13183876
 ] 

Todd Lipcon commented on HDFS-2592:
---

This looks fairly reasonable. A few items:
- Is it possible to move that new code out of the NameNodeConnector constructor 
into a static method in DFSUtil or even DFSClient?
- Rather than duplicating the code to parse the maxFailoverAttempts, 
failoverBaseSleepMillis, etc, can we reuse some of the code that's in 
DFSClient? If we move the connection code into a static method in DFSClient, 
then we can instantiate a DFSClient.Conf and pull out the variables from there, 
for example.
- There are some over-long lines in the new test code.
- The new test is mostly duplicated code from TestBalancer. Is it possible to 
reuse more of it by refactoring into static methods, etc.?
- Similarly much of the setup code is duplicated from 
HAUtil.configureFailoverFs. Can you just call that function, then grab the conf 
from the resulting filesystem, or refactor that method so you can reuse the 
configuration generating code?
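The reuse suggestion above amounts to hoisting the failover-setting parsing into a single static helper. The sketch below is illustrative only: the class name is hypothetical, it uses a plain {{Map}} instead of a Hadoop {{Configuration}}, and the real keys and defaults belong in DFSConfigKeys / DFSClient.Conf.

```java
import java.util.Map;

// Illustrative sketch: parse the failover settings in one static helper so
// DFSClient and the Balancer's NameNodeConnector don't each duplicate it.
// Class name, key strings, and defaults here are simplified stand-ins.
public class FailoverSettings {
    final int maxFailoverAttempts;
    final long failoverBaseSleepMillis;

    private FailoverSettings(int attempts, long sleepMillis) {
        this.maxFailoverAttempts = attempts;
        this.failoverBaseSleepMillis = sleepMillis;
    }

    // Single parsing point; callers pass whatever conf view they have.
    static FailoverSettings parse(Map<String, String> conf) {
        int attempts = Integer.parseInt(
            conf.getOrDefault("dfs.client.failover.max.attempts", "15"));
        long sleep = Long.parseLong(
            conf.getOrDefault("dfs.client.failover.sleep.base.millis", "500"));
        return new FailoverSettings(attempts, sleep);
    }

    public static void main(String[] args) {
        FailoverSettings s = FailoverSettings.parse(
            Map.of("dfs.client.failover.max.attempts", "3"));
        System.out.println(s.maxFailoverAttempts + " " + s.failoverBaseSleepMillis);
        // prints: 3 500
    }
}
```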

> HA: Balancer support for HA namenodes
> -
>
> Key: HDFS-2592
> URL: https://issues.apache.org/jira/browse/HDFS-2592
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, ha
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2592.patch, HDFS-2592.patch
>
>
> The balancer currently interacts directly with namenode InetSocketAddresses 
> and makes its own IPC proxies. We need to integrate it with HA so that it 
> uses the same client failover infrastructure.





[jira] [Commented] (HDFS-2767) HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol

2012-01-10 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183872#comment-13183872
 ] 

Uma Maheswara Rao G commented on HDFS-2767:
---

Thanks a lot, Todd, for the comments.
I will check and update accordingly. Meanwhile, can you please take a look at 
the Balancer issue as well?


> HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol
> ---
>
> Key: HDFS-2767
> URL: https://issues.apache.org/jira/browse/HDFS-2767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, hdfs client
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Uma Maheswara Rao G
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: HDFS-2767.patch, hdfs-2767-what-todd-had.txt
>
>
> Presently, ConfiguredFailoverProxyProvider supports ClientProtocol.
> It should support NameNodeProtocol also, because the Balancer uses 
> NameNodeProtocol for getting blocks.





[jira] [Commented] (HDFS-2767) HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183867#comment-13183867
 ] 

Todd Lipcon commented on HDFS-2767:
---

Hi Uma. I had started working on this before you posted your patch, but it 
looks like we went in a similar direction. The only suggestion I have is to 
make the interface an argument of the constructor rather than calling a setter 
after it's instantiated. I'll upload what I have - do you think you could make 
that change in your patch?

Also, regarding this section:
{code}
+// TODO(HA): Need other way to create the proxy instance based on
+// protocol here.
+if (protocol != null && NamenodeProtocol.class.equals(protocol)) {
+  current.namenode = DFSUtil.createNamenodeWithNNProtocol(
+  current.address, conf);
+} else {
{code}
I think you can remove the TODO and change the {{else}} to an {{else if}} to 
check for ClientProtocol, with a final {{else}} clause that throws an 
AssertionError or IllegalStateException.

Lastly, I think we do need to wire the {{ugi}} parameter in to 
{{createNamenodeWithNNProtocol}} or else after a failover the user accessing 
HDFS might accidentally switch!
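Concretely, the suggested dispatch might look like the following. This is a sketch over simplified stand-in interfaces, not the real ConfiguredFailoverProxyProvider; the actual proxy-creation calls are only indicated in comments.

```java
// Sketch of the else-if dispatch suggested above. ClientProtocol and
// NamenodeProtocol are stand-in interfaces, and the anonymous
// implementations stand in for the real proxy-creation calls.
public class ProtocolDispatch {
    interface ClientProtocol {}
    interface NamenodeProtocol {}

    static Object createProxy(Class<?> protocol) {
        if (NamenodeProtocol.class.equals(protocol)) {
            // would be DFSUtil.createNamenodeWithNNProtocol(address, conf, ...)
            return new NamenodeProtocol() {};
        } else if (ClientProtocol.class.equals(protocol)) {
            // would be the existing ClientProtocol proxy creation
            return new ClientProtocol() {};
        } else {
            // final else: fail fast rather than silently mis-wiring a protocol
            throw new AssertionError("Unexpected protocol class: " + protocol);
        }
    }

    public static void main(String[] args) {
        System.out.println(createProxy(NamenodeProtocol.class) instanceof NamenodeProtocol);
        // prints: true
    }
}
```

The final {{else}} is the point of the suggestion: any protocol class the provider doesn't explicitly know about becomes an immediate error instead of falling through to the ClientProtocol path.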






[jira] [Updated] (HDFS-2767) HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol

2012-01-10 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-2767:
--

Attachment: hdfs-2767-what-todd-had.txt






[jira] [Commented] (HDFS-2739) SecondaryNameNode doesn't start up

2012-01-10 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183838#comment-13183838
 ] 

Hudson commented on HDFS-2739:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #1543 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1543/])
HDFS-2739. SecondaryNameNode doesn't start up.

jitendra : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1229877
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/NamenodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/NamenodeProtocolTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/NamenodeProtocol.proto


> SecondaryNameNode doesn't start up
> --
>
> Key: HDFS-2739
> URL: https://issues.apache.org/jira/browse/HDFS-2739
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0
>Reporter: Sho Shimauchi
>Assignee: Jitendra Nath Pandey
>Priority: Critical
> Attachments: HDFS-2739.trunk.patch
>
>
> Built a 0.24-SNAPSHOT tar from today, used a general config, started NN/DN, 
> but SNN won't come up with following error:
> {code}
> 11/12/31 12:13:14 ERROR namenode.SecondaryNameNode: Throwable Exception in 
> doCheckpoint
> java.lang.RuntimeException: java.lang.NoSuchFieldException: versionID
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:154)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invocation.<init>(WritableRpcEngine.java:112)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:226)
>   at $Proxy9.getTransationId(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.getTransactionID(NamenodeProtocolTranslatorPB.java:185)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.countUncheckpointedTxns(SecondaryNameNode.java:625)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.shouldCheckpointBasedOnCount(SecondaryNameNode.java:633)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:386)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:356)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.NoSuchFieldException: versionID
>   at java.lang.Class.getField(Class.java:1520)
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:150)
>   ... 9 more
> java.lang.RuntimeException: java.lang.NoSuchFieldException: versionID
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:154)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invocation.<init>(WritableRpcEngine.java:112)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:226)
>   at $Proxy9.getTransationId(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.getTransactionID(NamenodeProtocolTranslatorPB.java:185)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.countUncheckpointedTxns(SecondaryNameNode.java:625)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.shouldCheckpointBasedOnCount(SecondaryNameNode.java:633)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:386)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:356)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.NoSuchFieldException: versionID
>   at java.lang.Class.getField(Class.java:1520)
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:150)
>   ... 9 more
> 11/12/31 12:13:14 INFO namenode.SecondaryNameNode: SHUTDOWN_MSG: 
> /
> SHUTDOWN_MSG: Shutting down SecondaryNameNode at sho-mba.local/192.168.11.2
> /
> {code}
> full error log: http://pastebin.com/mSaVbS34





[jira] [Commented] (HDFS-2739) SecondaryNameNode doesn't start up

2012-01-10 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183825#comment-13183825
 ] 

Hudson commented on HDFS-2739:
--

Integrated in Hadoop-Common-trunk-Commit #1524 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1524/])
HDFS-2739. SecondaryNameNode doesn't start up.

jitendra : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1229877
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/NamenodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/NamenodeProtocolTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/NamenodeProtocol.proto







[jira] [Commented] (HDFS-2739) SecondaryNameNode doesn't start up

2012-01-10 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183824#comment-13183824
 ] 

Hudson commented on HDFS-2739:
--

Integrated in Hadoop-Hdfs-trunk-Commit #1597 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1597/])
HDFS-2739. SecondaryNameNode doesn't start up.

jitendra : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1229877
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/NamenodeProtocolServerSideTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/NamenodeProtocolTranslatorPB.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/NamenodeProtocol.proto







[jira] [Updated] (HDFS-2739) SecondaryNameNode doesn't start up

2012-01-10 Thread Jitendra Nath Pandey (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-2739:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed.






[jira] [Commented] (HDFS-2739) SecondaryNameNode doesn't start up

2012-01-10 Thread Jitendra Nath Pandey (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183818#comment-13183818
 ] 

Jitendra Nath Pandey commented on HDFS-2739:


The TestDistributedUpgrade failure is unrelated; it passes on my machine.
The findbugs, release audit, and javadoc warnings also seem to have existed in 
trunk for a while now.

I tested the patch manually, because the SecondaryNameNode has to run as a 
separate process to reproduce the problem.
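The failure mode in the trace above boils down to a reflective lookup of a static {{versionID}} field on the protocol interface, which throws {{NoSuchFieldException}} when the interface doesn't declare one. The snippet below is a self-contained illustration of that lookup with stand-in interfaces; it is not the actual Hadoop RPC code.

```java
// Minimal reproduction of the reflective versionID lookup that fails in the
// trace above. ProtocolWithVersion / ProtocolWithoutVersion are illustrative
// stand-ins, not Hadoop interfaces.
public class VersionIdLookup {
    interface ProtocolWithVersion {
        long versionID = 1L;  // interface fields are implicitly public static final
    }
    interface ProtocolWithoutVersion {
        // no versionID field, so the reflective lookup below fails
    }

    static long getProtocolVersion(Class<?> protocol) {
        try {
            return protocol.getField("versionID").getLong(null);
        } catch (NoSuchFieldException | IllegalAccessException e) {
            // same wrapping as seen in the stack trace
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(getProtocolVersion(ProtocolWithVersion.class));  // prints: 1
        try {
            getProtocolVersion(ProtocolWithoutVersion.class);
        } catch (RuntimeException e) {
            System.out.println(e.getCause().getClass().getSimpleName());
            // prints: NoSuchFieldException
        }
    }
}
```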






[jira] [Commented] (HDFS-2739) SecondaryNameNode doesn't start up

2012-01-10 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183806#comment-13183806
 ] 

Hadoop QA commented on HDFS-2739:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510134/HDFS-2739.trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated 21 warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

-1 release audit.  The applied patch generated 1 release audit warnings 
(more than the trunk's current 0 warnings).

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hdfs.server.common.TestDistributedUpgrade

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1771//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1771//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/1771//artifact/trunk/hadoop-hdfs-project/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1771//console

This message is automatically generated.

>   at org.apache.hadoop.ipc.RPC.getProto

[jira] [Commented] (HDFS-2767) HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol

2012-01-10 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183804#comment-13183804
 ] 

Uma Maheswara Rao G commented on HDFS-2767:
---

If you agree with this approach, I can file an issue in common for the 
FailoverProxyProvider interface method.
If you have another approach that works better, please suggest it and I can 
make the changes.

> HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol
> ---
>
> Key: HDFS-2767
> URL: https://issues.apache.org/jira/browse/HDFS-2767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, hdfs client
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Uma Maheswara Rao G
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: HDFS-2767.patch
>
>
> Presently, ConfiguredFailoverProxyProvider supports only ClientProtocol.
> It should also support NameNodeProtocol, because the Balancer uses 
> NameNodeProtocol for getting blocks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HDFS-2592) HA: Balancer support for HA namenodes

2012-01-10 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183800#comment-13183800
 ] 

Uma Maheswara Rao G commented on HDFS-2592:
---

This patch expects HDFS-2767 to apply first.

> HA: Balancer support for HA namenodes
> -
>
> Key: HDFS-2592
> URL: https://issues.apache.org/jira/browse/HDFS-2592
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, ha
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2592.patch, HDFS-2592.patch
>
>
> The balancer currently interacts directly with namenode InetSocketAddresses 
> and makes its own IPC proxies. We need to integrate it with HA so that it 
> uses the same client failover infrastructure.





[jira] [Commented] (HDFS-2772) HA: On transition to active, standby should not swallow ELIE

2012-01-10 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183799#comment-13183799
 ] 

Aaron T. Myers commented on HDFS-2772:
--

I'm still working on writing a test case for it, but I can confirm that the 
present code will not in fact allow the standby to "silently fail to load all 
the edits before becoming active"; that is, the current code has no 
correctness issue. Writing a test for this is a little annoying because of the 
way exceptions are propagated, but I'll post a patch for it shortly.

> HA: On transition to active, standby should not swallow ELIE
> 
>
> Key: HDFS-2772
> URL: https://issues.apache.org/jira/browse/HDFS-2772
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>
> EditLogTailer#doTailEdits currently catches, logs, and swallows 
> EditLogInputException. This is fine in the case when the standby is sitting 
> idly behind tailing logs. However, when the standby is transitioning to 
> active, swallowing this exception is incorrect, since it could cause the 
> standby to silently fail to load all the edits before becoming active.
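A minimal sketch of the behavior the description calls for, using hypothetical stand-in names rather than the real EditLogTailer API: the exception is swallowed while the standby is idly tailing, but rethrown when the tail runs as part of a transition to active.

```java
public class TailerSketch {
    static class EditLogInputException extends RuntimeException {}

    // Hypothetical stand-in for EditLogTailer#doTailEdits; the boolean flag
    // and method names are illustrative, not the real HDFS API.
    static void tailEdits(boolean transitioningToActive, Runnable readEdits) {
        try {
            readEdits.run();
        } catch (EditLogInputException e) {
            if (transitioningToActive) {
                // Propagate so the transition fails, rather than going
                // active with edits silently missing.
                throw e;
            }
            // Idle standby: log and retry on the next tail cycle.
            System.out.println("standby: tailing failed, will retry");
        }
    }

    public static void main(String[] args) {
        Runnable broken = () -> { throw new EditLogInputException(); };
        tailEdits(false, broken);        // swallowed while idle
        try {
            tailEdits(true, broken);     // rethrown during transition
        } catch (EditLogInputException e) {
            System.out.println("transition aborted");
        }
    }
}
```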





[jira] [Commented] (HDFS-2584) Out of the box, visiting /jmx on the NN gives a whole lot of errors in logs.

2012-01-10 Thread Chris Leroy (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183782#comment-13183782
 ] 

Chris Leroy commented on HDFS-2584:
---

I think the exception being thrown is not a problem. We're getting it because, 
in walking the JMX beans the way we do, we effectively try to read the usage 
threshold of a memory pool whose isUsageThresholdSupported is false. That 
throws UnsupportedOperationException, which we then log as an error. Isn't 
it reasonable to just catch the exception and not log when this happens?

Something like:

diff --git a/src/core/org/apache/hadoop/jmx/JMXJsonServlet.java 
b/src/core/org/apache/hadoop/jmx/JMXJsonServlet.java
index 2c8f797..e9d1f9e 100644
--- a/src/core/org/apache/hadoop/jmx/JMXJsonServlet.java
+++ b/src/core/org/apache/hadoop/jmx/JMXJsonServlet.java
@@ -34,6 +34,7 @@ import javax.management.MBeanServer;
 import javax.management.MalformedObjectNameException;
 import javax.management.ObjectName;
 import javax.management.ReflectionException;
+import javax.management.RuntimeMBeanException;
 import javax.management.openmbean.CompositeData;
 import javax.management.openmbean.CompositeType;
 import javax.management.openmbean.TabularData;
@@ -239,6 +240,15 @@ public class JMXJsonServlet extends HttpServlet {
   // and fall back on the class name
   LOG.error("getting attribute " + prs + " of " + oname
   + " threw an exception", e);
+   } catch (RuntimeMBeanException e) {
+   // The code inside the attribute getter threw an exception, so we
+   // skip outputting the attribute. We will log the exception in
+   // certain cases, but suppress the log message in others.
+   if (!(e.getCause() instanceof UnsupportedOperationException)) {
+   LOG.error("getting attribute " + attName + " of " + oname +
+ " threw an exception", e);
+   }
+   return;
 } catch (RuntimeException e) {
   // For some reason even with an MBeanException available to them
  // Runtime exceptions can still find their way through, so treat them
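For reference, the condition the servlet trips over can be reproduced with plain JDK APIs, independent of the patch above: this standalone sketch checks isUsageThresholdSupported before reading the threshold, which is exactly the check the JMX walk skips. The class name is illustrative.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class UsageThresholdProbe {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.isUsageThresholdSupported()) {
                System.out.println(pool.getName()
                        + ": threshold=" + pool.getUsageThreshold());
            } else {
                // Calling pool.getUsageThreshold() here would throw
                // UnsupportedOperationException; through the MBean server
                // it surfaces wrapped in a RuntimeMBeanException.
                System.out.println(pool.getName()
                        + ": usage threshold not supported");
            }
        }
    }
}
```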





> Out of the box, visiting /jmx on the NN gives a whole lot of errors in logs.
> 
>
> Key: HDFS-2584
> URL: https://issues.apache.org/jira/browse/HDFS-2584
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: name-node
>Affects Versions: 0.23.0
>Reporter: Harsh J
>Priority: Minor
>
> Logs that follow a {{/jmx}} servlet visit:
> {code}
> 11/11/22 12:09:52 ERROR jmx.JMXJsonServlet: getting attribute UsageThreshold 
> of java.lang:type=MemoryPool,name=Par Eden Space threw an exception
> javax.management.RuntimeMBeanException: 
> java.lang.UnsupportedOperationException: Usage threshold is not supported
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrow(DefaultMBeanServerInterceptor.java:856)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrowMaybeMBeanException(DefaultMBeanServerInterceptor.java:869)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:670)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
>   at 
> org.apache.hadoop.jmx.JMXJsonServlet.writeAttribute(JMXJsonServlet.java:314)
>   at 
> org.apache.hadoop.jmx.JMXJsonServlet.listBeans(JMXJsonServlet.java:292)
>   at org.apache.hadoop.jmx.JMXJsonServlet.doGet(JMXJsonServlet.java:192)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>   at 
> org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
>   at 
> org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at 
> org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:940)
>   at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>   at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>   at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>   at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>   at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>   at 
> org.mortbay.jetty.handler.ContextHandlerCollecti

[jira] [Commented] (HDFS-2592) HA: Balancer support for HA namenodes

2012-01-10 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183777#comment-13183777
 ] 

Uma Maheswara Rao G commented on HDFS-2592:
---

Hi Todd, thanks for the care on this issue. I actually started work on 
HDFS-2767 as part of this issue, and have just updated my work there. With 
that change, the Balancer can now work with failover.

{noformat}
2012-01-11 06:48:43,791 INFO  balancer.Balancer (Balancer.java:run(1390)) - p = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0]
Time Stamp  Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
2012-01-11 06:48:43,891 WARN  retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(105)) - Exception while invoking create of class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB after 0 fail over attempts. Trying to fail over immediately.
...
2012-01-11 06:48:58,857 WARN  retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(105)) - Exception while invoking getBlocks of class org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB after 0 fail over attempts. Trying to fail over immediately.
{noformat}
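The fail-over behavior in the log above can be sketched with a plain JDK dynamic proxy; the interface and helper below are illustrative stand-ins, not Hadoop's RetryInvocationHandler or FailoverProxyProvider. On an exception from one target, the call is retried against the next.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.List;

public class FailoverSketch {
    // Stand-in for the NameNodeProtocol method the Balancer needs.
    interface NamenodeProtocol { String getBlocks(); }

    // Try each delegate in order, failing over on any exception.
    @SuppressWarnings("unchecked")
    static <T> T failoverProxy(Class<T> iface, List<T> targets) {
        InvocationHandler h = (proxy, method, args) -> {
            Throwable last = null;
            for (T t : targets) {
                try {
                    return method.invoke(t, args);
                } catch (InvocationTargetException e) {
                    last = e.getCause();   // fail over to the next target
                }
            }
            throw last;
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(),
                new Class<?>[] { iface }, h);
    }

    public static void main(String[] args) {
        NamenodeProtocol standby = () -> {
            throw new IllegalStateException("standby");
        };
        NamenodeProtocol active = () -> "blocks-from-active";
        NamenodeProtocol p =
                failoverProxy(NamenodeProtocol.class, List.of(standby, active));
        System.out.println(p.getBlocks()); // prints blocks-from-active
    }
}
```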

> HA: Balancer support for HA namenodes
> -
>
> Key: HDFS-2592
> URL: https://issues.apache.org/jira/browse/HDFS-2592
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, ha
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2592.patch, HDFS-2592.patch
>
>
> The balancer currently interacts directly with namenode InetSocketAddresses 
> and makes its own IPC proxies. We need to integrate it with HA so that it 
> uses the same client failover infrastructure.





[jira] [Updated] (HDFS-2592) HA: Balancer support for HA namenodes

2012-01-10 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-2592:
--

Attachment: HDFS-2592.patch

> HA: Balancer support for HA namenodes
> -
>
> Key: HDFS-2592
> URL: https://issues.apache.org/jira/browse/HDFS-2592
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, ha
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2592.patch, HDFS-2592.patch
>
>
> The balancer currently interacts directly with namenode InetSocketAddresses 
> and makes its own IPC proxies. We need to integrate it with HA so that it 
> uses the same client failover infrastructure.





[jira] [Commented] (HDFS-2767) HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol

2012-01-10 Thread Uma Maheswara Rao G (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183772#comment-13183772
 ] 

Uma Maheswara Rao G commented on HDFS-2767:
---

Hi Todd, I worked on this as part of the Balancer issue.
I also included the common FailoverProxyProvider interface change in this one:
I added a setter method for setting the protocol and created the corresponding 
proxy instance based on that protocol.


Thanks
Uma

> HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol
> ---
>
> Key: HDFS-2767
> URL: https://issues.apache.org/jira/browse/HDFS-2767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, hdfs client
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Uma Maheswara Rao G
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: HDFS-2767.patch
>
>
> Presently, ConfiguredFailoverProxyProvider supports only ClientProtocol.
> It should also support NameNodeProtocol, because the Balancer uses 
> NameNodeProtocol for getting blocks.





[jira] [Updated] (HDFS-2739) SecondaryNameNode doesn't start up

2012-01-10 Thread Jitendra Nath Pandey (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-2739:
---

Assignee: Jitendra Nath Pandey  (was: Suresh Srinivas)
Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)

> SecondaryNameNode doesn't start up
> --
>
> Key: HDFS-2739
> URL: https://issues.apache.org/jira/browse/HDFS-2739
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0
>Reporter: Sho Shimauchi
>Assignee: Jitendra Nath Pandey
>Priority: Critical
> Attachments: HDFS-2739.trunk.patch
>
>
> Built a 0.24-SNAPSHOT tar from today, used a general config, started NN/DN, 
> but SNN won't come up with following error:
> {code}
> 11/12/31 12:13:14 ERROR namenode.SecondaryNameNode: Throwable Exception in 
> doCheckpoint
> java.lang.RuntimeException: java.lang.NoSuchFieldException: versionID
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:154)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invocation.(WritableRpcEngine.java:112)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:226)
>   at $Proxy9.getTransationId(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.getTransactionID(NamenodeProtocolTranslatorPB.java:185)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.countUncheckpointedTxns(SecondaryNameNode.java:625)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.shouldCheckpointBasedOnCount(SecondaryNameNode.java:633)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:386)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:356)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.NoSuchFieldException: versionID
>   at java.lang.Class.getField(Class.java:1520)
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:150)
>   ... 9 more
> java.lang.RuntimeException: java.lang.NoSuchFieldException: versionID
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:154)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invocation.(WritableRpcEngine.java:112)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:226)
>   at $Proxy9.getTransationId(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.getTransactionID(NamenodeProtocolTranslatorPB.java:185)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.countUncheckpointedTxns(SecondaryNameNode.java:625)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.shouldCheckpointBasedOnCount(SecondaryNameNode.java:633)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:386)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:356)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.NoSuchFieldException: versionID
>   at java.lang.Class.getField(Class.java:1520)
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:150)
>   ... 9 more
> 11/12/31 12:13:14 INFO namenode.SecondaryNameNode: SHUTDOWN_MSG: 
> /
> SHUTDOWN_MSG: Shutting down SecondaryNameNode at sho-mba.local/192.168.11.2
> /
> {code}
> full error log: http://pastebin.com/mSaVbS34





[jira] [Updated] (HDFS-2767) HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol

2012-01-10 Thread Uma Maheswara Rao G (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-2767:
--

Attachment: HDFS-2767.patch

> HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol
> ---
>
> Key: HDFS-2767
> URL: https://issues.apache.org/jira/browse/HDFS-2767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, hdfs client
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Uma Maheswara Rao G
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: HDFS-2767.patch
>
>
> Presently, ConfiguredFailoverProxyProvider supports only ClientProtocol.
> It should also support NameNodeProtocol, because the Balancer uses 
> NameNodeProtocol for getting blocks.





[jira] [Commented] (HDFS-2739) SecondaryNameNode doesn't start up

2012-01-10 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183758#comment-13183758
 ] 

Suresh Srinivas commented on HDFS-2739:
---

+1 for the patch.

> SecondaryNameNode doesn't start up
> --
>
> Key: HDFS-2739
> URL: https://issues.apache.org/jira/browse/HDFS-2739
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0
>Reporter: Sho Shimauchi
>Assignee: Suresh Srinivas
>Priority: Critical
> Attachments: HDFS-2739.trunk.patch
>
>
> Built a 0.24-SNAPSHOT tar from today, used a general config, started NN/DN, 
> but SNN won't come up with following error:
> {code}
> 11/12/31 12:13:14 ERROR namenode.SecondaryNameNode: Throwable Exception in 
> doCheckpoint
> java.lang.RuntimeException: java.lang.NoSuchFieldException: versionID
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:154)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invocation.(WritableRpcEngine.java:112)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:226)
>   at $Proxy9.getTransationId(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.getTransactionID(NamenodeProtocolTranslatorPB.java:185)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.countUncheckpointedTxns(SecondaryNameNode.java:625)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.shouldCheckpointBasedOnCount(SecondaryNameNode.java:633)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:386)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:356)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.NoSuchFieldException: versionID
>   at java.lang.Class.getField(Class.java:1520)
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:150)
>   ... 9 more
> java.lang.RuntimeException: java.lang.NoSuchFieldException: versionID
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:154)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invocation.(WritableRpcEngine.java:112)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:226)
>   at $Proxy9.getTransationId(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.getTransactionID(NamenodeProtocolTranslatorPB.java:185)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.countUncheckpointedTxns(SecondaryNameNode.java:625)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.shouldCheckpointBasedOnCount(SecondaryNameNode.java:633)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:386)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:356)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.NoSuchFieldException: versionID
>   at java.lang.Class.getField(Class.java:1520)
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:150)
>   ... 9 more
> 11/12/31 12:13:14 INFO namenode.SecondaryNameNode: SHUTDOWN_MSG: 
> /
> SHUTDOWN_MSG: Shutting down SecondaryNameNode at sho-mba.local/192.168.11.2
> /
> {code}
> full error log: http://pastebin.com/mSaVbS34





[jira] [Updated] (HDFS-2739) SecondaryNameNode doesn't start up

2012-01-10 Thread Jitendra Nath Pandey (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-2739:
---

Attachment: HDFS-2739.trunk.patch

The problem is that a WritableRpc proxy is being created inside the PB 
translator. We don't see this in unit tests because the RPC engines there are 
configured globally.

I have tested the patch on a single node installation.

Also fixed the typo pointed out by Harsh.
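As a standalone illustration of the failure mode (the interface names below are hypothetical stand-ins, not the real Hadoop classes): RPC.getProtocolVersion resolves the version via Class.getField("versionID"), which only succeeds when the protocol class actually declares that public field. A PB translator class declares no such field, hence the NoSuchFieldException in the stack trace above.

```java
public class VersionIdLookup {
    // A Writable-era protocol declares a public versionID constant.
    interface WritableProtocol { long versionID = 1L; }
    // A protobuf translator exposes no such field.
    interface PbTranslator { void getTransactionID(); }

    // Mirrors the reflective lookup that RPC.getProtocolVersion performs.
    static long protocolVersion(Class<?> protocol) {
        try {
            return protocol.getField("versionID").getLong(null);
        } catch (NoSuchFieldException | IllegalAccessException e) {
            // This is the wrapping seen in the stack trace above.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(protocolVersion(WritableProtocol.class)); // prints 1
        try {
            protocolVersion(PbTranslator.class);
        } catch (RuntimeException e) {
            System.out.println("caught: "
                    + e.getCause().getClass().getSimpleName());
        }
    }
}
```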

> SecondaryNameNode doesn't start up
> --
>
> Key: HDFS-2739
> URL: https://issues.apache.org/jira/browse/HDFS-2739
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0
>Reporter: Sho Shimauchi
>Assignee: Suresh Srinivas
>Priority: Critical
> Attachments: HDFS-2739.trunk.patch
>
>
> Built a 0.24-SNAPSHOT tar from today, used a general config, started NN/DN, 
> but SNN won't come up with following error:
> {code}
> 11/12/31 12:13:14 ERROR namenode.SecondaryNameNode: Throwable Exception in 
> doCheckpoint
> java.lang.RuntimeException: java.lang.NoSuchFieldException: versionID
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:154)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invocation.(WritableRpcEngine.java:112)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:226)
>   at $Proxy9.getTransationId(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.getTransactionID(NamenodeProtocolTranslatorPB.java:185)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.countUncheckpointedTxns(SecondaryNameNode.java:625)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.shouldCheckpointBasedOnCount(SecondaryNameNode.java:633)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:386)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:356)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.NoSuchFieldException: versionID
>   at java.lang.Class.getField(Class.java:1520)
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:150)
>   ... 9 more
> java.lang.RuntimeException: java.lang.NoSuchFieldException: versionID
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:154)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invocation.(WritableRpcEngine.java:112)
>   at 
> org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:226)
>   at $Proxy9.getTransationId(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolTranslatorPB.getTransactionID(NamenodeProtocolTranslatorPB.java:185)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.countUncheckpointedTxns(SecondaryNameNode.java:625)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.shouldCheckpointBasedOnCount(SecondaryNameNode.java:633)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:386)
>   at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:356)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.NoSuchFieldException: versionID
>   at java.lang.Class.getField(Class.java:1520)
>   at org.apache.hadoop.ipc.RPC.getProtocolVersion(RPC.java:150)
>   ... 9 more
> 11/12/31 12:13:14 INFO namenode.SecondaryNameNode: SHUTDOWN_MSG: 
> /
> SHUTDOWN_MSG: Shutting down SecondaryNameNode at sho-mba.local/192.168.11.2
> /
> {code}
> full error log: http://pastebin.com/mSaVbS34





[jira] [Commented] (HDFS-2753) Standby namenode stuck in safemode during a failover

2012-01-10 Thread Hari Mankude (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183747#comment-13183747
 ] 

Hari Mankude commented on HDFS-2753:


Sounds good.

> Standby namenode stuck in safemode during a failover
> 
>
> Key: HDFS-2753
> URL: https://issues.apache.org/jira/browse/HDFS-2753
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Hari Mankude
>Assignee: Hari Mankude
> Attachments: HDFS-2753.patch, hdfs-2753.txt
>
>
> Write traffic initiated from the client. Manual failover is done by killing 
> NN and converting a  different standby to active. NN is restarted as standby. 
> The restarted standby stays in safemode forever. More information in the 
> description.





[jira] [Commented] (HDFS-2753) Standby namenode stuck in safemode during a failover

2012-01-10 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183743#comment-13183743
 ] 

Aaron T. Myers commented on HDFS-2753:
--

Hari, does Todd's explanation address your concerns? I verified that the test 
fails consistently without the fix (ran it 10 times) and that it passes with 
the fix.

Todd, my only suggestion would be to add the above explanation about why this 
test case elicits the desired behavior as a comment, given that it's not at all 
obvious.

+1 once that issue is addressed.

> Standby namenode stuck in safemode during a failover
> 
>
> Key: HDFS-2753
> URL: https://issues.apache.org/jira/browse/HDFS-2753
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Hari Mankude
>Assignee: Hari Mankude
> Attachments: HDFS-2753.patch, hdfs-2753.txt
>
>
> Write traffic initiated from the client. Manual failover is done by killing 
> NN and converting a  different standby to active. NN is restarted as standby. 
> The restarted standby stays in safemode forever. More information in the 
> description.





[jira] [Commented] (HDFS-2773) HA: reading edit logs from an earlier version leaves blocks in under-construction state

2012-01-10 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183732#comment-13183732
 ] 

Aaron T. Myers commented on HDFS-2773:
--

Looking good, Todd. I verified that the test passes with this fix, and fails 
without it.

My only suggestion would be to add a comment in the OP_CLOSE case to mention 
that this code will only be reached when loading old edits log versions.

+1 once the above is addressed.

> HA: reading edit logs from an earlier version leaves blocks in 
> under-construction state
> ---
>
> Key: HDFS-2773
> URL: https://issues.apache.org/jira/browse/HDFS-2773
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hadoop-1.0-multiblock-file.tgz, hdfs-2773.txt
>
>
> In HDFS-2602, the code for applying OP_ADD and OP_CLOSE was changed a bit, 
> and the new code has the following problem: if an OP_CLOSE includes new 
> blocks (ie not previously seen in an OP_ADD) then those blocks will remain in 
> the "under construction" state rather than being marked "complete". This is 
> because {{updateBlocks}} always creates {{BlockInfoUnderConstruction}} 
> regardless of the opcode. This bug only affects the upgrade path, since in 
> trunk we always persist blocks with OP_ADDs before we call OP_CLOSE.
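For illustration, the fix direction implied by the description above could be sketched as a small decision based on the opcode. This is a toy model with made-up names (Op, ToyEditReplay, stateForNewBlock), not the real FSEditLogLoader/updateBlocks code:

```java
// Toy sketch: when replaying an OP_CLOSE, trailing blocks should be created
// in the "complete" state rather than unconditionally under construction.
// All names here are illustrative, not the real Hadoop classes.
enum Op { ADD, CLOSE }

class ToyEditReplay {
    enum BlockState { UNDER_CONSTRUCTION, COMPLETE }

    BlockState stateForNewBlock(Op op) {
        // OP_CLOSE means the file is finalized, so new blocks it carries
        // must be marked complete; OP_ADD blocks stay under construction.
        return (op == Op.CLOSE) ? BlockState.COMPLETE
                                : BlockState.UNDER_CONSTRUCTION;
    }
}
```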





[jira] [Commented] (HDFS-2771) Move Federation and WebHDFS documentation into HDFS project

2012-01-10 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183730#comment-13183730
 ] 

Suresh Srinivas commented on HDFS-2771:
---

Before this can happen, the HDFS docs should move to Maven Doxia APT.

> Move Federation and WebHDFS documentation into HDFS project
> ---
>
> Key: HDFS-2771
> URL: https://issues.apache.org/jira/browse/HDFS-2771
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: documentation
>Affects Versions: 0.23.0
>Reporter: Todd Lipcon
>
> For some strange reason, the WebHDFS and Federation documentation is 
> currently in the hadoop-yarn site. This is counter-intuitive. We should move 
> these documents to an hdfs site, or if we think that all documentation should 
> go on one site, it should go into the hadoop-common project somewhere.





[jira] [Updated] (HDFS-2738) FSEditLog.selectinputStreams is reading through in-progress streams even when non-in-progress are requested

2012-01-10 Thread Aaron T. Myers (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-2738:
-

Attachment: HDFS-2738-HDFS-1623.patch

Thanks a lot for the review, Todd. Here's an updated patch which should address 
your concerns.

And yes, I did manually verify that those exceptions as described no longer 
show up in the logs when the SBN is tailing an NN making continuous edits.

> FSEditLog.selectinputStreams is reading through in-progress streams even when 
> non-in-progress are requested
> ---
>
> Key: HDFS-2738
> URL: https://issues.apache.org/jira/browse/HDFS-2738
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Aaron T. Myers
>Priority: Blocker
> Attachments: HDFS-2738-HDFS-1623.patch, HDFS-2738-HDFS-1623.patch, 
> HDFS-2738-HDFS-1623.patch
>
>
> The new code in HDFS-1580 is causing an issue with selectInputStreams in the 
> HA context. When the active is writing to the shared edits, 
> selectInputStreams is called on the standby. This ends up calling 
> {{journalSet.getInputStream}} but doesn't pass the {{inProgressOk=false}} 
> flag. So, {{getInputStream}} ends up reading and validating the in-progress 
> stream unnecessarily. Since the validation results are no longer properly 
> cached, {{findMaxTransaction}} also re-validates the in-progress stream, and 
> then breaks the corruption check in this code. The end result is a lot of 
> errors like:
> 2011-12-30 16:45:02,521 ERROR namenode.FileJournalManager 
> (FileJournalManager.java:getNumberOfTransactions(266)) - Gap in transactions, 
> max txnid is 579, 0 txns from 578
> 2011-12-30 16:45:02,521 INFO  ha.EditLogTailer (EditLogTailer.java:run(163)) 
> - Got error, will try again.
> java.io.IOException: No non-corrupt logs for txid 578
>   at 
> org.apache.hadoop.hdfs.server.namenode.JournalSet.getInputStream(JournalSet.java:229)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1081)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:115)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$0(EditLogTailer.java:100)
>   at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:154)
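As a rough illustration of the flag-propagation bug described above, the following toy model shows why `inProgressOk=false` must be honored when selecting streams. The class and method names are simplified stand-ins, not the real JournalSet/FSEditLog signatures:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: when inProgressOk is false, in-progress segments must be
// skipped entirely, never read or validated. Reading them anyway is what
// broke the corruption check described in the report above.
class ToyJournalSet {
    static class Segment {
        final long firstTxId;
        final boolean inProgress;
        Segment(long firstTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.inProgress = inProgress;
        }
    }

    List<Segment> selectInputStreams(List<Segment> all, boolean inProgressOk) {
        List<Segment> out = new ArrayList<>();
        for (Segment s : all) {
            if (s.inProgress && !inProgressOk) {
                continue; // skip: do not validate the active writer's segment
            }
            out.add(s);
        }
        return out;
    }
}
```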





[jira] [Commented] (HDFS-2775) HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently

2012-01-10 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183702#comment-13183702
 ] 

Aaron T. Myers commented on HDFS-2775:
--

+1, the patch looks good to me.

Should {{FSImage#getMostRecentCheckpointTxId}} perhaps be marked 
{{@VisibleForTesting}}?

> HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently
> ---
>
> Key: HDFS-2775
> URL: https://issues.apache.org/jira/browse/HDFS-2775
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-2775.txt
>
>
> This test is failing periodically on this assertion:
> {code}
> assertEquals(12, nn0.getNamesystem().getFSImage().getStorage()
> .getMostRecentCheckpointTxId());
> {code}
> My guess is it's a test race. Investigating...





[jira] [Assigned] (HDFS-2776) Missing interface annotation on JournalSet

2012-01-10 Thread Brandon Li (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Li reassigned HDFS-2776:


Assignee: Brandon Li

> Missing interface annotation on JournalSet
> --
>
> Key: HDFS-2776
> URL: https://issues.apache.org/jira/browse/HDFS-2776
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: name-node
>Affects Versions: 0.24.0
>Reporter: Todd Lipcon
>Assignee: Brandon Li
>Priority: Trivial
>  Labels: newbie
>
> This public class is missing an annotation that it is for private usage.





[jira] [Updated] (HDFS-2766) HA: test for case where standby partially reads log and then performs checkpoint

2012-01-10 Thread Aaron T. Myers (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-2766:
-

Attachment: HDFS-2766-HDFS-1623.patch

Thanks a lot for the review, Todd. Here's an updated patch which adds the 
requested comment.

> HA: test for case where standby partially reads log and then performs 
> checkpoint
> 
>
> Key: HDFS-2766
> URL: https://issues.apache.org/jira/browse/HDFS-2766
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Aaron T. Myers
> Attachments: HDFS-2766-HDFS-1623.patch, HDFS-2766-HDFS-1623.patch
>
>
> Here's a potential bug case that we don't currently test for:
> - SBN is reading a finalized edits file when NFS disappears halfway through 
> (or some intermittent error happens)
> - SBN performs a checkpoint and uploads it to the NN
> - NN receives a checkpoint that doesn't correspond to the end of any log 
> segment
> - Both NN and SBN should be able to restart at this point.





[jira] [Commented] (HDFS-2742) HA: observed dataloss in replication stress test

2012-01-10 Thread Suresh Srinivas (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183658#comment-13183658
 ] 

Suresh Srinivas commented on HDFS-2742:
---

Todd, a couple of questions:
# What is the implication of ignoring RBW altogether at the standby?
# If editlog has a finalized record, can we just ignore the RBW from the block 
report?

> HA: observed dataloss in replication stress test
> 
>
> Key: HDFS-2742
> URL: https://issues.apache.org/jira/browse/HDFS-2742
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node, ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Blocker
> Attachments: hdfs-2742.txt, log-colorized.txt
>
>
> The replication stress test case failed over the weekend since one of the 
> replicas went missing. Still diagnosing the issue, but it seems like the 
> chain of events was something like:
> - a block report was generated on one of the nodes while the block was being 
> written - thus the block report listed the block as RBW
> - when the standby replayed this queued message, it was replayed after the 
> file was marked complete. Thus it marked this replica as corrupt
> - it asked the DN holding the corrupt replica to delete it. And, I think, 
> removed it from the block map at this time.
> - That DN then did another block report before receiving the deletion. This 
> caused it to be re-added to the block map, since it was "FINALIZED" now.
> - Replication was lowered on the file, and it counted the above replica as 
> non-corrupt, and asked for the other replicas to be deleted.
> - All replicas were lost.
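The state check at the heart of the chain of events above could be sketched as a toy predicate, assuming illustrative names (ToyReplicaCheck and its enums are not the real BlockManager API):

```java
// Toy model: a queued block report that still lists a replica as RBW,
// replayed after the edit log has marked the block COMPLETE, causes the
// replica to be flagged corrupt - spuriously, in the HA replay case.
class ToyReplicaCheck {
    enum ReplicaState { RBW, FINALIZED }
    enum BlockUCState { UNDER_CONSTRUCTION, COMPLETE }

    boolean isCorrupt(ReplicaState reported, BlockUCState stored) {
        // RBW reported against a COMPLETE block is treated as corruption.
        return reported == ReplicaState.RBW && stored == BlockUCState.COMPLETE;
    }
}
```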





[jira] [Created] (HDFS-2777) When copying a file out of HDFS, modifying it, and uploading it back into HDFS, the put fails due to a CRC mismatch

2012-01-10 Thread Kevin J. Price (Created) (JIRA)
When copying a file out of HDFS, modifying it, and uploading it back into HDFS, 
the put fails due to a CRC mismatch
---

 Key: HDFS-2777
 URL: https://issues.apache.org/jira/browse/HDFS-2777
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs client
Affects Versions: 0.23.1
 Environment: KR at Yahoo
Reporter: Kevin J. Price


Performing an hdfs -get on a file, modifying the file, and using hdfs -put to 
place the file back into HDFS results in a checksum error.  It seems that the 
problem is a .crc file being generated locally from the -get command even 
though the -crc option was NOT specified.





[jira] [Commented] (HDFS-2753) Standby namenode stuck in safemode during a failover

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183626#comment-13183626
 ] 

Todd Lipcon commented on HDFS-2753:
---

The test adds blocks while the SBN is down. This makes them get queued up in 
the block received list of that BPServiceActor.
When it restarts, the DN calls register(), followed by 
reportReceivedDeletedBlocks(), followed by blockReport(). So the received 
blocks always show up first.

If you comment out the fix, the test case reliably fails with the error you 
described (stuck in safemode).
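The ordering described above could be sketched as a toy actor that drains its queued "block received" messages before sending the full block report on reconnect. Class and method names are illustrative, not Hadoop's BPServiceActor API:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Toy model of the DN-side sequence: register(), then the pending
// received-blocks queue, then the full block report - so received blocks
// always reach the NameNode first after a restart.
class ToyBPServiceActor {
    private final Queue<String> pendingReceived = new ArrayDeque<>();
    final List<String> sentToNameNode = new ArrayList<>();

    void blockReceivedWhileDisconnected(String blockId) {
        pendingReceived.add(blockId); // queued while the SBN is down
    }

    void reconnect() {
        sentToNameNode.add("register");
        // reportReceivedDeletedBlocks() runs before blockReport()
        while (!pendingReceived.isEmpty()) {
            sentToNameNode.add("received:" + pendingReceived.poll());
        }
        sentToNameNode.add("blockReport");
    }
}
```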

> Standby namenode stuck in safemode during a failover
> 
>
> Key: HDFS-2753
> URL: https://issues.apache.org/jira/browse/HDFS-2753
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Hari Mankude
>Assignee: Hari Mankude
> Attachments: HDFS-2753.patch, hdfs-2753.txt
>
>
> Write traffic initiated from the client. Manual failover is done by killing 
> NN and converting a  different standby to active. NN is restarted as standby. 
> The restarted standby stays in safemode forever. More information in the 
> description.





[jira] [Commented] (HDFS-2753) Standby namenode stuck in safemode during a failover

2012-01-10 Thread Hari Mankude (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183612#comment-13183612
 ] 

Hari Mankude commented on HDFS-2753:


Todd,

Looking at the new test in the patch, I am not sure how it is going to bring 
out the race. The race happens when a blockReceived arrives before the first 
block report while the NN is in safemode, and I don't see the test 
guaranteeing this sequence of operations.

> Standby namenode stuck in safemode during a failover
> 
>
> Key: HDFS-2753
> URL: https://issues.apache.org/jira/browse/HDFS-2753
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Hari Mankude
>Assignee: Hari Mankude
> Attachments: HDFS-2753.patch, hdfs-2753.txt
>
>
> Write traffic initiated from the client. Manual failover is done by killing 
> NN and converting a  different standby to active. NN is restarted as standby. 
> The restarted standby stays in safemode forever. More information in the 
> description.





[jira] [Updated] (HDFS-2775) HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently

2012-01-10 Thread Todd Lipcon (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-2775:
--

Attachment: hdfs-2775.txt

Simple fix - makes the call for {{getMostRecentCheckpointTxId}} go through 
FSImage, where it can be synchronized against the saveNamespace call. I 
verified the fix by adding a sleep before setting {{mostRecentCheckpointTxId}} 
- it used to make the test fail reliably, but now the test passes regardless.
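The synchronization pattern of the fix could be sketched with a self-contained toy class (ToyFSImage is illustrative, not the real FSImage): both the save path and the getter synchronize on the same object, so a reader can never see the checkpoint on disk while the recorded txid is still stale.

```java
// Toy sketch: write the checkpoint and record its txid under one lock,
// and read the txid under the same lock.
class ToyFSImage {
    private long mostRecentCheckpointTxId = -1;
    private boolean checkpointOnDisk = false;

    synchronized void saveNamespace(long txId) {
        checkpointOnDisk = true;            // persist the checkpoint...
        mostRecentCheckpointTxId = txId;    // ...and record its txid atomically
    }

    synchronized boolean hasCheckpoint() {
        return checkpointOnDisk;
    }

    synchronized long getMostRecentCheckpointTxId() {
        return mostRecentCheckpointTxId;
    }
}
```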

> HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently
> ---
>
> Key: HDFS-2775
> URL: https://issues.apache.org/jira/browse/HDFS-2775
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-2775.txt
>
>
> This test is failing periodically on this assertion:
> {code}
> assertEquals(12, nn0.getNamesystem().getFSImage().getStorage()
> .getMostRecentCheckpointTxId());
> {code}
> My guess is it's a test race. Investigating...





[jira] [Created] (HDFS-2776) Missing interface annotation on JournalSet

2012-01-10 Thread Todd Lipcon (Created) (JIRA)
Missing interface annotation on JournalSet
--

 Key: HDFS-2776
 URL: https://issues.apache.org/jira/browse/HDFS-2776
 Project: Hadoop HDFS
  Issue Type: Task
  Components: name-node
Affects Versions: 0.24.0
Reporter: Todd Lipcon
Priority: Trivial


This public class is missing an annotation that it is for private usage.





[jira] [Commented] (HDFS-2775) HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183502#comment-13183502
 ] 

Todd Lipcon commented on HDFS-2775:
---

Yes, this is just a test race. The issue is that the checkpoint is saved to 
storage, and only after that is {{mostRecentCheckpointTxId}} updated. So the 
test can see the new checkpoint while the txid is still stale, and the assert 
fails. We should probably fix this with some simple synchronization - but it's 
only a test problem, not a code issue.

> HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently
> ---
>
> Key: HDFS-2775
> URL: https://issues.apache.org/jira/browse/HDFS-2775
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, test
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>
> This test is failing periodically on this assertion:
> {code}
> assertEquals(12, nn0.getNamesystem().getFSImage().getStorage()
> .getMostRecentCheckpointTxId());
> {code}
> My guess is it's a test race. Investigating...





[jira] [Created] (HDFS-2775) HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently

2012-01-10 Thread Todd Lipcon (Created) (JIRA)
HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently
---

 Key: HDFS-2775
 URL: https://issues.apache.org/jira/browse/HDFS-2775
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, test
Affects Versions: HA branch (HDFS-1623)
Reporter: Todd Lipcon
Assignee: Todd Lipcon


This test is failing periodically on this assertion:
{code}
assertEquals(12, nn0.getNamesystem().getFSImage().getStorage()
.getMostRecentCheckpointTxId());
{code}
My guess is it's a test race. Investigating...





[jira] [Commented] (HDFS-2592) HA: Balancer support for HA namenodes

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183486#comment-13183486
 ] 

Todd Lipcon commented on HDFS-2592:
---

Uma, do you mind if I take this over to finish up your patch? I was planning on 
working on HDFS-2767, which is closely related.

> HA: Balancer support for HA namenodes
> -
>
> Key: HDFS-2592
> URL: https://issues.apache.org/jira/browse/HDFS-2592
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: balancer, ha
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Uma Maheswara Rao G
> Attachments: HDFS-2592.patch
>
>
> The balancer currently interacts directly with namenode InetSocketAddresses 
> and makes its own IPC proxies. We need to integrate it with HA so that it 
> uses the same client failover infrastructure.





[jira] [Assigned] (HDFS-2767) HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol

2012-01-10 Thread Todd Lipcon (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned HDFS-2767:
-

Assignee: Todd Lipcon

> HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol
> ---
>
> Key: HDFS-2767
> URL: https://issues.apache.org/jira/browse/HDFS-2767
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, hdfs client
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Uma Maheswara Rao G
>Assignee: Todd Lipcon
>Priority: Blocker
>
> Presently, ConfiguredFailoverProxyProvider supports only ClientProtocol.
> It should also support NameNodeProtocol, because the Balancer uses 
> NameNodeProtocol for getting blocks.





[jira] [Commented] (HDFS-2740) Enable the trash feature by default

2012-01-10 Thread Harsh J (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183454#comment-13183454
 ] 

Harsh J commented on HDFS-2740:
---

[~eli] - The FsShell does log that the file was moved to trash rather than 
completely removed. If we can solve this with more info/doc efforts, I am up 
for doing that.

I do think a lot of users miss the trash feature until they run into a 
situation that makes them search for it.

Stuff we can document more explicitly about, to help:
- How do I disable Trash?
- How do I clear out Trash?
- How do I force-delete a file (skipping trash)?
- How do I tweak the checkpoint periods?

And maybe some dev documentation on trash policies, as I think that is now 
pluggable (evolving API)?
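The delete paths those FAQ items would cover could be illustrated with a toy policy class (ToyTrashPolicy is a made-up name, not the Hadoop Trash API): with trash enabled, a plain delete becomes a move into the trash directory, while a skip-trash delete removes the file outright.

```java
// Toy sketch: a zero interval disables trash; skipTrash bypasses it.
class ToyTrashPolicy {
    private final long trashIntervalMinutes; // 0 disables trash

    ToyTrashPolicy(long trashIntervalMinutes) {
        this.trashIntervalMinutes = trashIntervalMinutes;
    }

    String delete(String path, boolean skipTrash) {
        if (skipTrash || trashIntervalMinutes == 0) {
            return "deleted:" + path;       // permanent removal
        }
        return "moved-to-trash:" + path;    // recoverable for the interval
    }
}
```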

> Enable the trash feature by default
> ---
>
> Key: HDFS-2740
> URL: https://issues.apache.org/jira/browse/HDFS-2740
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: hdfs client, name-node
>Affects Versions: 0.23.0
>Reporter: Harsh J
>  Labels: newbie
> Attachments: hdfs-2740.patch, hdfs-2740.patch
>
>
> Currently trash is disabled out of box. I do not think it'd be of high 
> surprise to anyone (but surely a relief when *hit happens) to have trash 
> enabled by default, with the usually recommended periods of 1-day.
> Thoughts?





[jira] [Commented] (HDFS-2740) Enable the trash feature by default

2012-01-10 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183419#comment-13183419
 ] 

Eli Collins commented on HDFS-2740:
---

I'm not sold that we should ship with Trash enabled out of the box. Equally 
confusing are users who expect that deleting files actually frees up space, 
right?

> Enable the trash feature by default
> ---
>
> Key: HDFS-2740
> URL: https://issues.apache.org/jira/browse/HDFS-2740
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: hdfs client, name-node
>Affects Versions: 0.23.0
>Reporter: Harsh J
>  Labels: newbie
> Attachments: hdfs-2740.patch, hdfs-2740.patch
>
>
> Currently trash is disabled out of box. I do not think it'd be of high 
> surprise to anyone (but surely a relief when *hit happens) to have trash 
> enabled by default, with the usually recommended periods of 1-day.
> Thoughts?





[jira] [Commented] (HDFS-2740) Enable the trash feature by default

2012-01-10 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183330#comment-13183330
 ] 

Hadoop QA commented on HDFS-2740:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12510051/hdfs-2740.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1770//console

This message is automatically generated.

> Enable the trash feature by default
> ---
>
> Key: HDFS-2740
> URL: https://issues.apache.org/jira/browse/HDFS-2740
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: hdfs client, name-node
>Affects Versions: 0.23.0
>Reporter: Harsh J
>  Labels: newbie
> Attachments: hdfs-2740.patch, hdfs-2740.patch
>
>
> Currently trash is disabled out of box. I do not think it'd be of high 
> surprise to anyone (but surely a relief when *hit happens) to have trash 
> enabled by default, with the usually recommended periods of 1-day.
> Thoughts?





[jira] [Updated] (HDFS-2740) Enable the trash feature by default

2012-01-10 Thread T Meyarivan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Meyarivan updated HDFS-2740:
--

Attachment: hdfs-2740.patch

Includes changes to docs.

> Enable the trash feature by default
> ---
>
> Key: HDFS-2740
> URL: https://issues.apache.org/jira/browse/HDFS-2740
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: hdfs client, name-node
>Affects Versions: 0.23.0
>Reporter: Harsh J
>  Labels: newbie
> Attachments: hdfs-2740.patch, hdfs-2740.patch
>
>
> Currently trash is disabled out of box. I do not think it'd be of high 
> surprise to anyone (but surely a relief when *hit happens) to have trash 
> enabled by default, with the usually recommended periods of 1-day.
> Thoughts?





[jira] [Commented] (HDFS-2740) Enable the trash feature by default

2012-01-10 Thread Harsh J (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183262#comment-13183262
 ] 

Harsh J commented on HDFS-2740:
---

Meyarivan,

Your changeset is fine. The HDFS buildbot failed to apply it because it's under 
the hadoop-common directory, but I think the JIRA should stay here on HDFS, as 
it's relevant to this component.

Could you also update the docs regarding this behavior change, and incorporate 
[~daryn]'s and [~revans2]'s comments above into the docs?

The trash feature is currently documented at 
{{hadoop-hdfs-project/hadoop-hdfs/src/main/docs/src/documentation/content/xdocs/hdfs_design.xml}},
 but if you feel it fits better in the filesystem_shell guide itself, feel free 
to move it there.

> Enable the trash feature by default
> ---
>
> Key: HDFS-2740
> URL: https://issues.apache.org/jira/browse/HDFS-2740
> Project: Hadoop HDFS
>  Issue Type: Wish
>  Components: hdfs client, name-node
>Affects Versions: 0.23.0
>Reporter: Harsh J
>  Labels: newbie
> Attachments: hdfs-2740.patch
>
>
> Currently trash is disabled out of box. I do not think it'd be of high 
> surprise to anyone (but surely a relief when *hit happens) to have trash 
> enabled by default, with the usually recommended periods of 1-day.
> Thoughts?





[jira] [Commented] (HDFS-2724) NN web UI can throw NPE after startup, before standby state is entered

2012-01-10 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183227#comment-13183227
 ] 

Hudson commented on HDFS-2724:
--

Integrated in Hadoop-Hdfs-HAbranch-build #43 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/43/])
HDFS-2724. NN web UI can throw NPE after startup, before standby state is 
entered. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1229466
Files : 
* 
/hadoop/common/branches/HDFS-1623/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/HAServiceProtocol.java
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.jsp


> NN web UI can throw NPE after startup, before standby state is entered
> --
>
> Key: HDFS-2724
> URL: https://issues.apache.org/jira/browse/HDFS-2724
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Aaron T. Myers
>Assignee: Todd Lipcon
> Fix For: HA branch (HDFS-1623)
>
> Attachments: hdfs-2724.txt
>
>
> There's a brief period of time (a few seconds) after the NN web server has 
> been initialized, but before the NN's HA state is initialized. If 
> {{dfshealth.jsp}} is hit during this time, a {{NullPointerException}} will be 
> thrown.





[jira] [Commented] (HDFS-2762) TestCheckpoint is timing out

2012-01-10 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183226#comment-13183226
 ] 

Hudson commented on HDFS-2762:
--

Integrated in Hadoop-Hdfs-HAbranch-build #43 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/43/])
HDFS-2762. Fix TestCheckpoint timing out on HA branch. Contributed by Uma 
Maheswara Rao G.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1229464
Files : 
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt
* 
/hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java


> TestCheckpoint is timing out
> 
>
> Key: HDFS-2762
> URL: https://issues.apache.org/jira/browse/HDFS-2762
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Aaron T. Myers
>Assignee: Uma Maheswara Rao G
> Fix For: HA branch (HDFS-1623)
>
> Attachments: HDFS-2762.patch
>
>
> TestCheckpoint is timing out on the HA branch, and has been for a few days.
