[jira] [Commented] (HDFS-2803) Adding logging to LeaseRenewer for better lease expiration triage.

2012-01-17 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187981#comment-13187981 ] Todd Lipcon commented on HDFS-2803: --- I think DEBUG level is more appropriate for both.

[jira] [Commented] (HDFS-2681) Add ZK client for leader election

2012-01-16 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187168#comment-13187168 ] Todd Lipcon commented on HDFS-2681: --- bq. So if your TCP disconnect timeouts are not set

[jira] [Commented] (HDFS-2798) Append may race with datanode block scanner, causing replica to be incorrectly marked corrupt

2012-01-16 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187170#comment-13187170 ] Todd Lipcon commented on HDFS-2798: --- You can reproduce this fairly reliably by adding a

[jira] [Commented] (HDFS-2742) HA: observed dataloss in replication stress test

2012-01-16 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187207#comment-13187207 ] Todd Lipcon commented on HDFS-2742: --- I also ran the replication stress test for 10x as

[jira] [Commented] (HDFS-2747) HA: entering safe mode after starting SBN can NPE

2012-01-16 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187213#comment-13187213 ] Todd Lipcon commented on HDFS-2747: --- +1, looks good to me. I'll double-check the tests

[jira] [Commented] (HDFS-2772) HA: On transition to active, standby should not swallow ELIE

2012-01-16 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187246#comment-13187246 ] Todd Lipcon commented on HDFS-2772: --- +1, lgtm.

[jira] [Commented] (HDFS-2691) HA: Tests and fixes for pipeline targets and replica recovery

2012-01-16 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187256#comment-13187256 ] Todd Lipcon commented on HDFS-2691: --- In order to fix this, we need to get the {{targets}}

[jira] [Commented] (HDFS-2767) HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol

2012-01-16 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187302#comment-13187302 ] Todd Lipcon commented on HDFS-2767: --- {code} +} catch (RuntimeException rte) { +

[jira] [Commented] (HDFS-2767) HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol

2012-01-16 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187340#comment-13187340 ] Todd Lipcon commented on HDFS-2767: --- I meant that, in this patch, you can make

[jira] [Commented] (HDFS-2592) HA: Balancer support for HA namenodes

2012-01-16 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187411#comment-13187411 ] Todd Lipcon commented on HDFS-2592: --- Looks mostly good. One small nit - can you add some

[jira] [Commented] (HDFS-2795) HA: Standby NN takes a long time to recover from a dead DN starting up

2012-01-16 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187467#comment-13187467 ] Todd Lipcon commented on HDFS-2795: --- Woops, this broke one of the TestPersistBlocks tests

[jira] [Commented] (HDFS-2747) HA: entering safe mode after starting SBN can NPE

2012-01-15 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186659#comment-13186659 ] Todd Lipcon commented on HDFS-2747: --- - add braces around the body of the new if statement

[jira] [Commented] (HDFS-2681) Add ZK client for leader election

2012-01-15 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186672#comment-13186672 ] Todd Lipcon commented on HDFS-2681: --- - Can {{ActiveStandbyElector}} be made

[jira] [Commented] (HDFS-2794) HA: Active NN may purge edit log files before standby NN has a chance to read them

2012-01-15 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186739#comment-13186739 ] Todd Lipcon commented on HDFS-2794: --- Worth noting that this only happens if the admin

[jira] [Commented] (HDFS-2791) If block report races with closing of file, replica is incorrectly marked corrupt

2012-01-14 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186407#comment-13186407 ] Todd Lipcon commented on HDFS-2791: --- To start brainstorming a solution, here are a few

[jira] [Commented] (HDFS-2731) Autopopulate standby name dirs if they're empty

2012-01-13 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185751#comment-13185751 ] Todd Lipcon commented on HDFS-2731: --- bq. I am missing this: both Image and Edits should

[jira] [Commented] (HDFS-2747) HA: entering safe mode after starting SBN can NPE

2012-01-12 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185076#comment-13185076 ] Todd Lipcon commented on HDFS-2747: --- Ah, that makes sense. Good sleuthing. Planning to

[jira] [Commented] (HDFS-2592) HA: Balancer support for HA namenodes

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183486#comment-13183486 ] Todd Lipcon commented on HDFS-2592: --- Uma, do you mind if I take this over to finish up

[jira] [Commented] (HDFS-2775) HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183502#comment-13183502 ] Todd Lipcon commented on HDFS-2775: --- Yes, this is just a test race. The issue is that the

[jira] [Commented] (HDFS-2753) Standby namenode stuck in safemode during a failover

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183626#comment-13183626 ] Todd Lipcon commented on HDFS-2753: --- The test adds blocks while the SBN is down. This

[jira] [Commented] (HDFS-2767) HA: ConfiguredFailoverProxyProvider should support NameNodeProtocol

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183867#comment-13183867 ] Todd Lipcon commented on HDFS-2767: --- Hi Uma. I had started working on this before you

[jira] [Commented] (HDFS-2592) HA: Balancer support for HA namenodes

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183876#comment-13183876 ] Todd Lipcon commented on HDFS-2592: --- This looks fairly reasonable. A few items: - Is it

[jira] [Commented] (HDFS-2773) HA: reading edit logs from an earlier version leaves blocks in under-construction state

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183894#comment-13183894 ] Todd Lipcon commented on HDFS-2773: --- I added the following: {code} + // OP_CLOSE

[jira] [Commented] (HDFS-2738) FSEditLog.selectinputStreams is reading through in-progress streams even when non-in-progress are requested

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183899#comment-13183899 ] Todd Lipcon commented on HDFS-2738: --- +1, looks good to me. Thanks for making those

[jira] [Commented] (HDFS-2775) HA: TestStandbyCheckpoints.testBothNodesInStandbyState fails intermittently

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183900#comment-13183900 ] Todd Lipcon commented on HDFS-2775: --- bq. Should FSImage#getMostRecentCheckpointTxId

[jira] [Commented] (HDFS-2766) HA: test for case where standby partially reads log and then performs checkpoint

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183902#comment-13183902 ] Todd Lipcon commented on HDFS-2766: --- +1 lgtm.

[jira] [Commented] (HDFS-2742) HA: observed dataloss in replication stress test

2012-01-10 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183911#comment-13183911 ] Todd Lipcon commented on HDFS-2742: --- bq. What is the implication of ignoring RBW

[jira] [Commented] (HDFS-2753) Standby namenode stuck in safemode during a failover

2012-01-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182746#comment-13182746 ] Todd Lipcon commented on HDFS-2753: --- It seems like this should be easy to produce -- just

[jira] [Commented] (HDFS-2742) HA: observed dataloss in replication stress test

2012-01-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182787#comment-13182787 ] Todd Lipcon commented on HDFS-2742: --- This seems to have caused some issues in

[jira] [Commented] (HDFS-2742) HA: observed dataloss in replication stress test

2012-01-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182907#comment-13182907 ] Todd Lipcon commented on HDFS-2742: --- Looking into this has made me aware of a lurking can

[jira] [Commented] (HDFS-2742) HA: observed dataloss in replication stress test

2012-01-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182947#comment-13182947 ] Todd Lipcon commented on HDFS-2742: --- I'm looking into a simpler solution where we redo a

[jira] [Commented] (HDFS-2766) HA: test for case where standby partially reads log and then performs checkpoint

2012-01-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183047#comment-13183047 ] Todd Lipcon commented on HDFS-2766: --- Looks good. Can you add a javadoc to the new test

[jira] [Commented] (HDFS-2738) FSEditLog.selectinputStreams is reading through in-progress streams even when non-in-progress are requested

2012-01-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183050#comment-13183050 ] Todd Lipcon commented on HDFS-2738: --- {code} + throw new

[jira] [Commented] (HDFS-2752) HA: exit if multiple shared dirs are configured

2012-01-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183052#comment-13183052 ] Todd Lipcon commented on HDFS-2752: --- I agree with Eli - we don't currently use the

[jira] [Commented] (HDFS-2762) TestCheckpoint is timing out

2012-01-08 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182356#comment-13182356 ] Todd Lipcon commented on HDFS-2762: --- +1, will commit momentarily after checking that the

[jira] [Commented] (HDFS-2762) TestCheckpoint is timing out

2012-01-08 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182360#comment-13182360 ] Todd Lipcon commented on HDFS-2762: --- It looks like TestStandbyCheckpoints is broken with

[jira] [Commented] (HDFS-2770) Block reports may mark corrupt blocks pending deletion as non-corrupt

2012-01-08 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182377#comment-13182377 ] Todd Lipcon commented on HDFS-2770: --- I believe the issue may be with any place we check:

[jira] [Commented] (HDFS-2587) Add WebHDFS apt doc

2012-01-08 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182385#comment-13182385 ] Todd Lipcon commented on HDFS-2587: --- Can anyone explain why these docs are in the

[jira] [Commented] (HDFS-2756) Warm standby does not read the in_progress edit log

2012-01-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181517#comment-13181517 ] Todd Lipcon commented on HDFS-2756: --- The old log files are purged when checkpoints are

[jira] [Commented] (HDFS-2753) Standby namenode stuck in safemode during a failover

2012-01-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181521#comment-13181521 ] Todd Lipcon commented on HDFS-2753: --- This looks reasonable. Can you add a regression test

[jira] [Commented] (HDFS-2762) TestCheckpoint is timing out

2012-01-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181574#comment-13181574 ] Todd Lipcon commented on HDFS-2762: --- It seems like the {{testMultipleSecondaryNameNodes}}

[jira] [Commented] (HDFS-2709) HA: Appropriately handle error conditions in EditLogTailer

2012-01-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181575#comment-13181575 ] Todd Lipcon commented on HDFS-2709: --- +1, looks good to me. Will commit momentarily after

[jira] [Commented] (HDFS-2765) TestNameEditsConfigs is incorrectly swallowing IOE

2012-01-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181726#comment-13181726 ] Todd Lipcon commented on HDFS-2765: --- +1

[jira] [Commented] (HDFS-2737) HA: Automatically trigger log rolls periodically on the active NN

2012-01-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181727#comment-13181727 ] Todd Lipcon commented on HDFS-2737: --- bq. My preference is to read the editlog in progress

[jira] [Commented] (HDFS-2764) HA: TestBackupNode is failing

2012-01-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181786#comment-13181786 ] Todd Lipcon commented on HDFS-2764: --- Passes for me.. which revision are you testing from?

[jira] [Commented] (HDFS-2592) HA: Balancer support for HA namenodes

2012-01-05 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13180524#comment-13180524 ] Todd Lipcon commented on HDFS-2592: --- My hope is to propose a merge within the next week

[jira] [Commented] (HDFS-2709) HA: Appropriately handle error conditions in EditLogTailer

2012-01-05 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181031#comment-13181031 ] Todd Lipcon commented on HDFS-2709: --- I'm skeptical of the fix -- the question is _why_ we

[jira] [Commented] (HDFS-2756) Warm standby does not read the in_progress edit log

2012-01-05 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1318#comment-1318 ] Todd Lipcon commented on HDFS-2756: --- The FileJournalManager does support skipping to the

[jira] [Commented] (HDFS-2709) HA: Appropriately handle error conditions in EditLogTailer

2012-01-05 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181122#comment-13181122 ] Todd Lipcon commented on HDFS-2709: --- - Doesn't look like the interface audience

[jira] [Commented] (HDFS-2738) FSEditLog.selectinputStreams is reading through in-progress streams even when non-in-progress are requested

2012-01-05 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13181131#comment-13181131 ] Todd Lipcon commented on HDFS-2738: --- {code:title=BookKeeperJournalManager.java} + public

[jira] [Commented] (HDFS-2749) Wrong fsimage format while entering recovery mode

2012-01-04 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179614#comment-13179614 ] Todd Lipcon commented on HDFS-2749: --- Good find! I think this is probably the cause of

[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2012-01-04 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179850#comment-13179850 ] Todd Lipcon commented on HDFS-2291: --- bq. dfs.namenode.standby.checkpoints - perhaps

[jira] [Commented] (HDFS-2709) HA: Appropriately handle error conditions in EditLogTailer

2012-01-04 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179983#comment-13179983 ] Todd Lipcon commented on HDFS-2709: --- {code} + public void skipTransactions(long

[jira] [Commented] (HDFS-2709) HA: Appropriately handle error conditions in EditLogTailer

2012-01-04 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179989#comment-13179989 ] Todd Lipcon commented on HDFS-2709: --- Oh, I also had to add the following: {code} ---

[jira] [Commented] (HDFS-2751) Datanode drops OS cache behind reads even for short reads

2012-01-04 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179997#comment-13179997 ] Todd Lipcon commented on HDFS-2751: --- (thanks to JD Cryans for finding this one)

[jira] [Commented] (HDFS-2592) HA: Balancer support for HA namenodes

2012-01-04 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13180026#comment-13180026 ] Todd Lipcon commented on HDFS-2592: --- Hey Uma. Any progress on this? Would be nice to have

[jira] [Commented] (HDFS-2185) HA: ZK-based FailoverController

2012-01-04 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13180049#comment-13180049 ] Todd Lipcon commented on HDFS-2185: --- Sure, that makes sense. I'm a little skeptical that

[jira] [Commented] (HDFS-2709) HA: Appropriately handle error conditions in EditLogTailer

2012-01-04 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13180237#comment-13180237 ] Todd Lipcon commented on HDFS-2709: --- hrm... I was running the unit tests and it looks

[jira] [Commented] (HDFS-2743) Streamline usage of bookkeeper journal manager

2012-01-03 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179089#comment-13179089 ] Todd Lipcon commented on HDFS-2743: --- This doesn't look quite right. The

[jira] [Commented] (HDFS-2731) Autopopulate standby name dirs if they're empty

2011-12-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177732#comment-13177732 ] Todd Lipcon commented on HDFS-2731: --- We don't currently support using the 2NN with HA --

[jira] [Commented] (HDFS-2709) HA: Appropriately handle error conditions in EditLogTailer

2011-12-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177761#comment-13177761 ] Todd Lipcon commented on HDFS-2709: --- Aaron and I chatted offline about the above

[jira] [Commented] (HDFS-2736) HA: support separate SBN and 2NN?

2011-12-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177791#comment-13177791 ] Todd Lipcon commented on HDFS-2736: --- bq. if we fail over to the SBN does it continue to

[jira] [Commented] (HDFS-2736) HA: support 2NN with SBN

2011-12-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177858#comment-13177858 ] Todd Lipcon commented on HDFS-2736: --- Agreed. Currently we also basically support

[jira] [Commented] (HDFS-2737) HA: Automatically trigger log rolls periodically on the active NN

2011-12-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177866#comment-13177866 ] Todd Lipcon commented on HDFS-2737: --- A couple options here: *1) Add a thread to the NN

[jira] [Commented] (HDFS-2737) HA: Automatically trigger log rolls periodically on the active NN

2011-12-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177869#comment-13177869 ] Todd Lipcon commented on HDFS-2737: --- bq. Is it worth considering supporting tailing

[jira] [Commented] (HDFS-2692) HA: Bugs related to failover from/into safe-mode

2011-12-29 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177367#comment-13177367 ] Todd Lipcon commented on HDFS-2692: --- bq. In FSEditLogLoader#loadFSEdits, should we really

[jira] [Commented] (HDFS-2720) HA : TestStandbyIsHot is failing while copying in_use.lock file from NN1 nameSpaceDirs to NN2 nameSpaceDirs

2011-12-29 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177371#comment-13177371 ] Todd Lipcon commented on HDFS-2720: --- Small nits: {code} + // Now format 1st NN and

[jira] [Commented] (HDFS-2720) HA : TestStandbyIsHot is failing while copying in_use.lock file from NN1 nameSpaceDirs to NN2 nameSpaceDirs

2011-12-29 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177434#comment-13177434 ] Todd Lipcon commented on HDFS-2720: --- That would be a nice improvement... but I think it

[jira] [Commented] (HDFS-2732) Add support for the standby in the bin scripts

2011-12-29 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177479#comment-13177479 ] Todd Lipcon commented on HDFS-2732: --- For me, start-dfs.sh actually already works, since

[jira] [Commented] (HDFS-2731) Autopopulate standby name dirs if they're empty

2011-12-29 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177478#comment-13177478 ] Todd Lipcon commented on HDFS-2731: --- bq. as an optimization it could copy the logs from

[jira] [Commented] (HDFS-2731) Autopopulate standby name dirs if they're empty

2011-12-29 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177488#comment-13177488 ] Todd Lipcon commented on HDFS-2731: --- The primary shouldn't be removing any old images

[jira] [Commented] (HDFS-2709) HA: Appropriately handle error conditions in EditLogTailer

2011-12-29 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177540#comment-13177540 ] Todd Lipcon commented on HDFS-2709: --- A few thoughts on the overall approach: - Rather

[jira] [Commented] (HDFS-2709) HA: Appropriately handle error conditions in EditLogTailer

2011-12-27 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176232#comment-13176232 ] Todd Lipcon commented on HDFS-2709: --- What about the case where the edit log is large

[jira] [Commented] (HDFS-2720) HA : TestStandbyIsHot is failing while copying in_use.lock file from NN1 nameSpaceDirs to NN2 nameSpaceDirs

2011-12-23 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175532#comment-13175532 ] Todd Lipcon commented on HDFS-2720: --- For testing on actual clusters, I've done this by

[jira] [Commented] (HDFS-2623) HA: Add test case for hot standby capability

2011-12-22 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174942#comment-13174942 ] Todd Lipcon commented on HDFS-2623: --- Hi Uma. Yes, it passes here on Linux... my guess is

[jira] [Commented] (HDFS-2716) HA: Configuration needs to allow different dfs.http.addresses for each HA NN

2011-12-22 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175097#comment-13175097 ] Todd Lipcon commented on HDFS-2716: --- The {{getInfoServer}} function in {{DFSUtil}} only

[jira] [Commented] (HDFS-2718) Optimize OP_ADD in edits loading

2011-12-22 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175114#comment-13175114 ] Todd Lipcon commented on HDFS-2718: --- This is very similar to HDFS-2602 - we should at

[jira] [Commented] (HDFS-2185) HA: ZK-based FailoverController

2011-12-21 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174246#comment-13174246 ] Todd Lipcon commented on HDFS-2185: --- Yea, this is very similar to the leader election

[jira] [Commented] (HDFS-2185) HA: ZK-based FailoverController

2011-12-21 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174263#comment-13174263 ] Todd Lipcon commented on HDFS-2185: --- Twitter's also got a nice library of ZK stuff. But I

[jira] [Commented] (HDFS-2185) HA: ZK-based FailoverController

2011-12-21 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174338#comment-13174338 ] Todd Lipcon commented on HDFS-2185: --- Great, thanks for the link, Uma. I will be sure to

[jira] [Commented] (HDFS-2713) HA : An alternative approach to clients handling Namenode failover.

2011-12-21 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174373#comment-13174373 ] Todd Lipcon commented on HDFS-2713: --- IMO it seems preferable to enhance (or replace) the

[jira] [Commented] (HDFS-2699) Store data and checksums together in block file

2011-12-20 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13173421#comment-13173421 ] Todd Lipcon commented on HDFS-2699: --- We already basically inline them on the wire in 64K

[jira] [Commented] (HDFS-2693) Synchronization issues around state transition

2011-12-20 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13173766#comment-13173766 ] Todd Lipcon commented on HDFS-2693: --- bq. Can't we nuke the new code in

[jira] [Commented] (HDFS-2692) HA: Bugs related to failover from/into safe-mode

2011-12-20 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13173775#comment-13173775 ] Todd Lipcon commented on HDFS-2692: --- btw, this patch is on top of HDFS-2693 and

[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2011-12-20 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13173895#comment-13173895 ] Todd Lipcon commented on HDFS-2291: --- I plan to start working on this tomorrow. My

[jira] [Commented] (HDFS-2701) Cleanup FS* processIOError methods

2011-12-19 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172511#comment-13172511 ] Todd Lipcon commented on HDFS-2701: --- Ah, right. I was thinking about trunk wrt

[jira] [Commented] (HDFS-2702) A single failed name dir can cause the NN to exit

2011-12-19 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172513#comment-13172513 ] Todd Lipcon commented on HDFS-2702: --- Oh, right. duh :) Thanks, +1. A

[jira] [Commented] (HDFS-2678) HA: When a FailoverProxyProvider is used, DFSClient should not retry connection ten times before failing over

2011-12-19 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172597#comment-13172597 ] Todd Lipcon commented on HDFS-2678: --- +1 modulo one comment: should use

[jira] [Commented] (HDFS-2702) A single failed name dir can cause the NN to exit

2011-12-19 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172612#comment-13172612 ] Todd Lipcon commented on HDFS-2702: --- I think the addendum is correct. I wish we had a

[jira] [Commented] (HDFS-2162) Merge NameNode roles into NodeType.

2011-12-19 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172614#comment-13172614 ] Todd Lipcon commented on HDFS-2162: --- Suresh, are you still planning on working on this?

[jira] [Commented] (HDFS-2682) HA: When a FailoverProxyProvider is used, Client should not retry for 45 times(hard coded value) if it is timing out to connect to server.

2011-12-19 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172650#comment-13172650 ] Todd Lipcon commented on HDFS-2682: --- +1, will commit momentarily HA:

[jira] [Commented] (HDFS-1972) HA: Datanode fencing mechanism

2011-12-19 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172742#comment-13172742 ] Todd Lipcon commented on HDFS-1972: --- I think I will actually integrate HDFS-2603 into

[jira] [Commented] (HDFS-2191) Move datanodeMap from FSNamesystem to DatanodeManager

2011-12-19 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172780#comment-13172780 ] Todd Lipcon commented on HDFS-2191: --- Hi Nicholas. After this patch, the block invalidate

[jira] [Commented] (HDFS-2693) Synchronization issues around state transition

2011-12-19 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172886#comment-13172886 ] Todd Lipcon commented on HDFS-2693: --- Found a nasty bug in the current patch: in

[jira] [Commented] (HDFS-2699) Store data and checksums together in block file

2011-12-18 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171961#comment-13171961 ] Todd Lipcon commented on HDFS-2699: --- The idea of introducing the new format as a

[jira] [Commented] (HDFS-2699) Store data and checksums together in block file

2011-12-18 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172006#comment-13172006 ] Todd Lipcon commented on HDFS-2699: --- bq. Modifying any portion of that region will

[jira] [Commented] (HDFS-2701) Cleanup FS* processIOError methods

2011-12-18 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172023#comment-13172023 ] Todd Lipcon commented on HDFS-2701: --- in open(), if all of them fail to open, we'll have

[jira] [Commented] (HDFS-2703) removedStorageDirs is not updated everywhere we remove a storage dir

2011-12-18 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172024#comment-13172024 ] Todd Lipcon commented on HDFS-2703: --- +1 removedStorageDirs is not

[jira] [Commented] (HDFS-2702) A single failed name dir can cause the NN to exit

2011-12-18 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172025#comment-13172025 ] Todd Lipcon commented on HDFS-2702: --- - in {{fatalExit}}, can you change it to: {code}

[jira] [Commented] (HDFS-2679) Add interface to query current state to HAServiceProtocol

2011-12-18 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172032#comment-13172032 ] Todd Lipcon commented on HDFS-2679: --- +1. I'll commit this momentarily
