[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN

2012-02-15 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208299#comment-13208299 ] Todd Lipcon commented on HDFS-1623: --- Good point about upgrade failover. I think for the f

[jira] [Commented] (HDFS-2949) HA: Add check to active state transition to prevent operator-induced split brain

2012-02-14 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208232#comment-13208232 ] Todd Lipcon commented on HDFS-2949: --- Yep, this is not supposed to solve issues, just to p

[jira] [Commented] (HDFS-2949) HA: Add check to active state transition to prevent operator-induced split brain

2012-02-14 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208168#comment-13208168 ] Todd Lipcon commented on HDFS-2949: --- I think we should probably un-document the transitio

[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN

2012-02-14 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208152#comment-13208152 ] Todd Lipcon commented on HDFS-1623: --- Hi Mingjie, Yes, the first cut doesn't include auto

[jira] [Commented] (HDFS-2934) HA: Allow configs to be scoped to all NNs in the nameservice

2012-02-14 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208047#comment-13208047 ] Todd Lipcon commented on HDFS-2934: --- Ran all the HA tests with this change and they passe

[jira] [Commented] (HDFS-2948) HA: NN throws NPE during shutdown if it fails to startup

2012-02-14 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13207867#comment-13207867 ] Todd Lipcon commented on HDFS-2948: --- exception from the jenkins build: {code| java.lang.N

[jira] [Commented] (HDFS-2815) Namenode is not coming out of safemode when we perform ( NN crash + restart ) . Also FSCK report shows blocks missed.

2012-02-12 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206569#comment-13206569 ] Todd Lipcon commented on HDFS-2815: --- Oops, sorry, I phrased that poorly (I agree that HDF

[jira] [Commented] (HDFS-2506) Umbrella jira for tracking separation of wire protocol datatypes from the implementation types

2012-02-10 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205958#comment-13205958 ] Todd Lipcon commented on HDFS-2506: --- Can this be marked resolved? > Umbr

[jira] [Commented] (HDFS-2935) Shared edits dir property should be suffixed with nameservice and namenodeID

2012-02-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205273#comment-13205273 ] Todd Lipcon commented on HDFS-2935: --- Combined with HDFS-2934, one could do either.

[jira] [Commented] (HDFS-2934) HA: Allow configs to be scoped to all NNs in the nameservice

2012-02-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205271#comment-13205271 ] Todd Lipcon commented on HDFS-2934: --- The shared edits dir, for example, might be {{/filer

[jira] [Commented] (HDFS-2917) HA: haadmin should not work if run by regular user

2012-02-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204984#comment-13204984 ] Todd Lipcon commented on HDFS-2917: --- k, +1 with the typo fixed > HA: haa

[jira] [Commented] (HDFS-2920) HA: fix remaining TODO items

2012-02-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204945#comment-13204945 ] Todd Lipcon commented on HDFS-2920: --- I'm going to work on this, and also address a few ot

[jira] [Commented] (HDFS-2781) Add client protocol and DFSadmin for command to restore failed storage

2012-02-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204924#comment-13204924 ] Todd Lipcon commented on HDFS-2781: --- There's some interaction with fencing, here, though.

[jira] [Commented] (HDFS-2909) HA: Inaccessible shared edits dir not getting removed from FSImage storage dirs upon error

2012-02-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204765#comment-13204765 ] Todd Lipcon commented on HDFS-2909: --- {quote} Say everything is healthy and FSImage.rollEd

[jira] [Commented] (HDFS-2865) Standby namenode gets a "cannot lock storage" exception during startup

2012-02-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204764#comment-13204764 ] Todd Lipcon commented on HDFS-2865: --- Hi Hari. I'd like to close this as invalid, since it

[jira] [Commented] (HDFS-2866) Standby does not start up due to a gap in transaction id

2012-02-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204763#comment-13204763 ] Todd Lipcon commented on HDFS-2866: --- Hi Hari. Is this issue still valid? If not can we cl

[jira] [Commented] (HDFS-1623) High Availability Framework for HDFS NN

2012-02-09 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204755#comment-13204755 ] Todd Lipcon commented on HDFS-1623: --- Hey Konstantin. What specific performance tests woul

[jira] [Commented] (HDFS-2510) Add HA-related metrics

2012-02-08 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204308#comment-13204308 ] Todd Lipcon commented on HDFS-2510: --- Sorry, missed the comment above: {quote} Similarly,

[jira] [Commented] (HDFS-2917) HA: haadmin should not work if run by regular user

2012-02-08 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204306#comment-13204306 ] Todd Lipcon commented on HDFS-2917: --- I think consistency with the other administrative op

[jira] [Commented] (HDFS-2922) HA: close out operation categories

2012-02-08 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204289#comment-13204289 ] Todd Lipcon commented on HDFS-2922: --- +1 > HA: close out operation catego

[jira] [Commented] (HDFS-2579) Starting delegation token manager during safemode fails

2012-02-08 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204275#comment-13204275 ] Todd Lipcon commented on HDFS-2579: --- Hi Jitendra. I considered using tryLock with a timeo

[jira] [Commented] (HDFS-2922) HA: close out operation categories

2012-02-08 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204262#comment-13204262 ] Todd Lipcon commented on HDFS-2922: --- setBalancerBandwidth (a somewhat bizarre RPC) would

[jira] [Commented] (HDFS-2911) Gracefully handle OutOfMemoryErrors

2012-02-08 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204254#comment-13204254 ] Todd Lipcon commented on HDFS-2911: --- OK, you've convinced me :) > Gracef

[jira] [Commented] (HDFS-2911) Gracefully handle OutOfMemoryErrors

2012-02-08 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204026#comment-13204026 ] Todd Lipcon commented on HDFS-2911: --- We could simply recommend the following for the java

[jira] [Commented] (HDFS-2912) HA: Namenode not shutting down when shared edits dir is inaccessible

2012-02-07 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203217#comment-13203217 ] Todd Lipcon commented on HDFS-2912: --- I think the issue is this -- previously the abort lo

[jira] [Commented] (HDFS-2910) HA: Active NN reports Bad state: BETWEEN_LOG_SEGMENTS when shared edits dir is inaccessible during log roll

2012-02-07 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203171#comment-13203171 ] Todd Lipcon commented on HDFS-2910: --- In order to make the NN ride over a hiccup, it seems

[jira] [Commented] (HDFS-2912) HA: Namenode not shutting down when shared edits dir is inaccessible

2012-02-07 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203106#comment-13203106 ] Todd Lipcon commented on HDFS-2912: --- In log4j, LOG.fatal doesn't actually terminate the N

[jira] [Commented] (HDFS-2579) Starting delegation token manager during safemode fails

2012-02-07 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203006#comment-13203006 ] Todd Lipcon commented on HDFS-2579: --- We've found one bug during stress testing - there's

[jira] [Commented] (HDFS-2910) HA: Active NN reports Bad state: BETWEEN_LOG_SEGMENTS when shared edits dir is inaccessible during log roll

2012-02-07 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203000#comment-13203000 ] Todd Lipcon commented on HDFS-2910: --- We should just do a hard exit here -- upon restart o

[jira] [Commented] (HDFS-2913) HA: Need a way to shutdown the Name Node

2012-02-07 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202997#comment-13202997 ] Todd Lipcon commented on HDFS-2913: --- Currently it is meant to do a "fail fast" shutdown -

[jira] [Commented] (HDFS-2510) Add HA-related metrics

2012-02-07 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202950#comment-13202950 ] Todd Lipcon commented on HDFS-2510: --- {code} + public long getMillisSinceLastLoadedEdits(

[jira] [Commented] (HDFS-2911) Gracefully handle OutOfMemoryErrors

2012-02-07 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202860#comment-13202860 ] Todd Lipcon commented on HDFS-2911: --- One option we use in HBase is to set the JVM to kill

[jira] [Commented] (HDFS-2839) HA: Remove need for client configuration of nameservice ID

2012-02-07 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202821#comment-13202821 ] Todd Lipcon commented on HDFS-2839: --- One possibility we'd discussed last year is to use D

[jira] [Commented] (HDFS-2909) HA: Inaccessible shared edits dir not getting removed from FSImage storage dirs upon error

2012-02-07 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202675#comment-13202675 ] Todd Lipcon commented on HDFS-2909: --- It seems that, when rollEditLogs fails in the shared

[jira] [Commented] (HDFS-2905) Standby NN NPE when shared edits dir is deleted

2012-02-07 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202669#comment-13202669 ] Todd Lipcon commented on HDFS-2905: --- Can you add a unit test for this in TestFileJournalM

[jira] [Commented] (HDFS-2905) Standby NN NPE when shared edits dir is deleted

2012-02-07 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202578#comment-13202578 ] Todd Lipcon commented on HDFS-2905: --- Probably just need to replace currentDir.listFiles()

[jira] [Commented] (HDFS-2586) Add protobuf service and implementation for HAServiceProtocol

2012-02-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201959#comment-13201959 ] Todd Lipcon commented on HDFS-2586: --- Did you mean for the changes to the NodeType enum to

[jira] [Commented] (HDFS-2781) Add client protocol and DFSadmin for command to restore failed storage

2012-02-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201945#comment-13201945 ] Todd Lipcon commented on HDFS-2781: --- I think if you were continuously writing to the acti

[jira] [Commented] (HDFS-2794) HA: Active NN may purge edit log files before standby NN has a chance to read them

2012-02-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201819#comment-13201819 ] Todd Lipcon commented on HDFS-2794: --- I intend to commit this later this afternoon unless

[jira] [Commented] (HDFS-2794) HA: Active NN may purge edit log files before standby NN has a chance to read them

2012-02-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201612#comment-13201612 ] Todd Lipcon commented on HDFS-2794: --- I thought a bit about that, but it would require ano

[jira] [Commented] (HDFS-2781) Add client protocol and DFSadmin for command to restore failed storage

2012-02-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201591#comment-13201591 ] Todd Lipcon commented on HDFS-2781: --- Currently, if the shared edits goes away, we don't d

[jira] [Commented] (HDFS-2733) Document HA configuration and CLI

2012-02-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201587#comment-13201587 ] Todd Lipcon commented on HDFS-2733: --- +1, lgtm. > Document HA configurati

[jira] [Commented] (HDFS-2794) HA: Active NN may purge edit log files before standby NN has a chance to read them

2012-02-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201584#comment-13201584 ] Todd Lipcon commented on HDFS-2794: --- Changed the description to: {quote} The number of

[jira] [Commented] (HDFS-2782) HA: Support multiple shared edits dirs

2012-02-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201583#comment-13201583 ] Todd Lipcon commented on HDFS-2782: --- Been thinking about this a bit... I don't think it's

[jira] [Commented] (HDFS-2752) HA: exit if multiple shared dirs are configured

2012-02-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201568#comment-13201568 ] Todd Lipcon commented on HDFS-2752: --- The other issue is that, in order to support multipl

[jira] [Commented] (HDFS-2733) Document HA configuration and CLI

2012-02-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201513#comment-13201513 ] Todd Lipcon commented on HDFS-2733: --- - "In the event of an unplanned event" could be word

[jira] [Commented] (HDFS-2901) HA: Improvements for SBN web UI - not show under-replicated/missing blocks

2012-02-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201489#comment-13201489 ] Todd Lipcon commented on HDFS-2901: --- Hi Brandon. Since you already have HDFS-2830 assigne

[jira] [Commented] (HDFS-2819) Document new HA-related configs in hdfs-default.xml

2012-02-06 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201122#comment-13201122 ] Todd Lipcon commented on HDFS-2819: --- +1 > Document new HA-related config

[jira] [Commented] (HDFS-2894) HA: automatically determine the nameservice Id if only one nameservice is configured

2012-02-05 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201113#comment-13201113 ] Todd Lipcon commented on HDFS-2894: --- +1 > HA: automatically determine th

[jira] [Commented] (HDFS-2819) Document new HA-related configs in hdfs-default.xml

2012-02-05 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201097#comment-13201097 ] Todd Lipcon commented on HDFS-2819: --- {code} +often the logs are rolled. Note that fai

[jira] [Commented] (HDFS-2752) HA: exit if multiple shared dirs are configured

2012-02-05 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201091#comment-13201091 ] Todd Lipcon commented on HDFS-2752: --- Just one nit: you can remove the following empty jav

[jira] [Commented] (HDFS-2894) HA: automatically determine the nameservice Id if only one nameservice is configured

2012-02-05 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201089#comment-13201089 ] Todd Lipcon commented on HDFS-2894: --- - Did you test the non-HA federated config as well?

[jira] [Commented] (HDFS-2893) The start/stop scripts don't start/stop the 2NN when using the default configuration

2012-02-05 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201086#comment-13201086 ] Todd Lipcon commented on HDFS-2893: --- In the HA case, where we don't want to start any 2NN

[jira] [Commented] (HDFS-2794) HA: Active NN may purge edit log files before standby NN has a chance to read them

2012-02-05 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201082#comment-13201082 ] Todd Lipcon commented on HDFS-2794: --- Attached patch adds a new configuration, dfs.namenod

[jira] [Commented] (HDFS-2579) Starting delegation token manager during safemode fails

2012-02-04 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200657#comment-13200657 ] Todd Lipcon commented on HDFS-2579: --- Spent the afternoon working on this. Here's the diag

[jira] [Commented] (HDFS-2792) HA: Make fsck work

2012-02-03 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200304#comment-13200304 ] Todd Lipcon commented on HDFS-2792: --- +1 > HA: Make fsck work > -

[jira] [Commented] (HDFS-2890) HA: DFSUtil#getSuffixIDs should skip unset configurations

2012-02-03 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200296#comment-13200296 ] Todd Lipcon commented on HDFS-2890: --- lgtm, +1 > HA: DFSUtil#getSuffixIDs

[jira] [Commented] (HDFS-2808) HA: haadmin should use namenode ids

2012-02-03 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200295#comment-13200295 ] Todd Lipcon commented on HDFS-2808: --- typo: +// No namepsace ID was given and more

[jira] [Commented] (HDFS-2792) HA: Make fsck work

2012-02-03 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200288#comment-13200288 ] Todd Lipcon commented on HDFS-2792: --- You've still got a bunch of changes in DFSUtil?

[jira] [Commented] (HDFS-2890) HA: DFSUtil#getSuffixIDs should skip unset configurations

2012-02-03 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200287#comment-13200287 ] Todd Lipcon commented on HDFS-2890: --- Want to also include the other change you add, where

[jira] [Commented] (HDFS-2874) HA: edit log should log to shared dirs before local dirs

2012-02-03 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200281#comment-13200281 ] Todd Lipcon commented on HDFS-2874: --- I ran two manual tests using mdadm fault injection:

[jira] [Commented] (HDFS-2874) HA: edit log should log to shared dirs before local dirs

2012-02-03 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200235#comment-13200235 ] Todd Lipcon commented on HDFS-2874: --- It should restart from 101 since, when it starts up,

[jira] [Commented] (HDFS-2792) HA: Make fsck work

2012-02-03 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200231#comment-13200231 ] Todd Lipcon commented on HDFS-2792: --- - Can you move the changes in DFSUtil to another pat

[jira] [Commented] (HDFS-2874) HA: edit log should log to shared dirs before local dirs

2012-02-03 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200214#comment-13200214 ] Todd Lipcon commented on HDFS-2874: --- I ran all the tests which reference EditLog and they

[jira] [Commented] (HDFS-2379) 0.20: Allow block reports to proceed without holding FSDataset lock

2012-02-03 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200026#comment-13200026 ] Todd Lipcon commented on HDFS-2379: --- I'm currently pretty slammed with work on the HA bra

[jira] [Commented] (HDFS-2808) HA: haadmin should use namenode ids

2012-02-03 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199988#comment-13199988 ] Todd Lipcon commented on HDFS-2808: --- - I think we do have to push the idea of the "namese

[jira] [Commented] (HDFS-2718) Optimize OP_ADD in edits loading

2012-02-02 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199367#comment-13199367 ] Todd Lipcon commented on HDFS-2718: --- No major issues, but this is fairly critical code an

[jira] [Commented] (HDFS-2718) Optimize OP_ADD in edits loading

2012-02-02 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199309#comment-13199309 ] Todd Lipcon commented on HDFS-2718: --- Looking at what was committed to trunk, I found a co

[jira] [Commented] (HDFS-2877) If locking of a storage dir fails, it will remove the other NN's lock file on exit

2012-02-02 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199054#comment-13199054 ] Todd Lipcon commented on HDFS-2877: --- No, because on a local disk, if the process crashes,

[jira] [Commented] (HDFS-2874) HA: edit log should log to shared dirs before local dirs

2012-02-02 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13199031#comment-13199031 ] Todd Lipcon commented on HDFS-2874: --- bq. Adding any type of write ordering to edit log up

[jira] [Commented] (HDFS-2877) If locking of a storage dir fails, it will remove the other NN's lock file on exit

2012-02-02 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198983#comment-13198983 ] Todd Lipcon commented on HDFS-2877: --- Uma's got it -- the RandomAccessFile constructor wil

[jira] [Commented] (HDFS-2874) HA: edit log should log to shared dirs before local dirs

2012-02-01 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198564#comment-13198564 ] Todd Lipcon commented on HDFS-2874: --- Yes, we could certainly attach a priority level to e

[jira] [Commented] (HDFS-2865) Standby namenode gets a "cannot lock storage" exception during startup

2012-02-01 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198403#comment-13198403 ] Todd Lipcon commented on HDFS-2865: --- Oh, I meant to attach that to HDFS-2877... sorry, wi

[jira] [Commented] (HDFS-2876) The unit tests (src/test/unit) are not being compiled and are not runnable

2012-02-01 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198390#comment-13198390 ] Todd Lipcon commented on HDFS-2876: --- I vote we just move these "unit" tests back into the

[jira] [Commented] (HDFS-2859) LOCAL_ADDRESS_MATCHER.match has NPE when called from DFSUtil.getSuffixIDs when the host is incorrect

2012-02-01 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198193#comment-13198193 ] Todd Lipcon commented on HDFS-2859: --- bq. A general question I have is whether it is ok fo

[jira] [Commented] (HDFS-2861) HA: checkpointing should verify that the dfs.http.address has been configured to a non-loopback for peer NN

2012-01-31 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197605#comment-13197605 ] Todd Lipcon commented on HDFS-2861: --- I ran the tests before commit and realized this need

[jira] [Commented] (HDFS-2866) Standby does not start up due to a gap in transaction id

2012-01-31 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197499#comment-13197499 ] Todd Lipcon commented on HDFS-2866: --- Assuming /homes/hortonha is NFS-mounted (makes sense

[jira] [Commented] (HDFS-2865) Standby namenode gets a "cannot lock storage" exception during startup

2012-01-31 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197500#comment-13197500 ] Todd Lipcon commented on HDFS-2865: --- If you have both NNs pointing to the same name dir,

[jira] [Commented] (HDFS-2865) Standby namenode gets a "cannot lock storage" exception during startup

2012-01-31 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197482#comment-13197482 ] Todd Lipcon commented on HDFS-2865: --- Can you please include your configuration snippet fo

[jira] [Commented] (HDFS-2288) Replicas awaiting recovery should return a full visible length

2012-01-31 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197474#comment-13197474 ] Todd Lipcon commented on HDFS-2288: --- Sorry to have left this ticket idle for such a long

[jira] [Commented] (HDFS-2866) Standby does not start up due to a gap in transaction id

2012-01-31 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197446#comment-13197446 ] Todd Lipcon commented on HDFS-2866: --- One possibility I can imagine is that, if the NN wri

[jira] [Commented] (HDFS-2742) HA: observed dataloss in replication stress test

2012-01-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196735#comment-13196735 ] Todd Lipcon commented on HDFS-2742: --- Here's an explanation of why this is still important

[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc

2012-01-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196680#comment-13196680 ] Todd Lipcon commented on HDFS-2856: --- We do already wait for a BlockOpResponseProto before

[jira] [Commented] (HDFS-2824) HA: failover does not succeed if prior NN died just after creating an edit log segment

2012-01-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196490#comment-13196490 ] Todd Lipcon commented on HDFS-2824: --- +1, thanks for the explanation > HA

[jira] [Commented] (HDFS-2824) HA: failover does not succeed if prior NN died just after creating an edit log segment

2012-01-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196387#comment-13196387 ] Todd Lipcon commented on HDFS-2824: --- - Please add INFO or WARN level logs anywhere you de

[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc

2012-01-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196376#comment-13196376 ] Todd Lipcon commented on HDFS-2856: --- Or handshake at the beginning of a write -- we alrea

[jira] [Commented] (HDFS-2742) HA: observed dataloss in replication stress test

2012-01-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196301#comment-13196301 ] Todd Lipcon commented on HDFS-2742: --- Sanjay makes a good point above about this being les

[jira] [Commented] (HDFS-2691) HA: Tests and fixes for pipeline targets and replica recovery

2012-01-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196300#comment-13196300 ] Todd Lipcon commented on HDFS-2691: --- oops -- the above comment was meant for HDFS-2742. P

[jira] [Commented] (HDFS-2691) HA: Tests and fixes for pipeline targets and replica recovery

2012-01-30 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196280#comment-13196280 ] Todd Lipcon commented on HDFS-2691: --- Sanjay makes a good point above about this being les

[jira] [Commented] (HDFS-2779) HA: Add lease recovery handling to HA

2012-01-29 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195837#comment-13195837 ] Todd Lipcon commented on HDFS-2779: --- Hi Suresh, I think this is handled by HDFS-2691, rig

[jira] [Commented] (HDFS-2841) HA: HAAdmin does not work if security is enabled

2012-01-29 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195835#comment-13195835 ] Todd Lipcon commented on HDFS-2841: --- +1 > HA: HAAdmin does not work if s

[jira] [Commented] (HDFS-2759) Pre-allocate HDFS edit log files after writing version number

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195092#comment-13195092 ] Todd Lipcon commented on HDFS-2759: --- bq. The explanation, per the comment, is that syncin

[jira] [Commented] (HDFS-2791) If block report races with closing of file, replica is incorrectly marked corrupt

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195086#comment-13195086 ] Todd Lipcon commented on HDFS-2791: --- bq. I am coming to the conclusion that when a NN as

[jira] [Commented] (HDFS-2759) Pre-allocate HDFS edit log files after writing version number

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195074#comment-13195074 ] Todd Lipcon commented on HDFS-2759: --- This seems reasonable. I remember you ran a benchmar

[jira] [Commented] (HDFS-2844) HA: TestSafeMode#testNoExtensionIfNoBlocks is failing

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194871#comment-13194871 ] Todd Lipcon commented on HDFS-2844: --- Sorry, I mentioned this in HDFS-2742: https://issue

[jira] [Commented] (HDFS-2830) HA: Improvements for SBN web UI

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194866#comment-13194866 ] Todd Lipcon commented on HDFS-2830: --- HDFS-2845 points out that we should remove the "brow

[jira] [Commented] (HDFS-2847) NamenodeProtocol#getBlocks() should use DatanodeID as an argument instead of DatanodeInfo

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194863#comment-13194863 ] Todd Lipcon commented on HDFS-2847: --- +1 pending Jenkins results > Nameno

[jira] [Commented] (HDFS-2825) Add test hook to turn off the writer preferring its local DN

2012-01-27 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194857#comment-13194857 ] Todd Lipcon commented on HDFS-2825: --- Yes, for create/append > Add test h

[jira] [Commented] (HDFS-2825) Add test hook to turn off the writer preferring its local DN

2012-01-26 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194436#comment-13194436 ] Todd Lipcon commented on HDFS-2825: --- Maybe we should open a new JIRA to do this as a clie

[jira] [Commented] (HDFS-2839) Nameservice id in file uri could cause issues

2012-01-24 Thread Todd Lipcon (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192842#comment-13192842 ] Todd Lipcon commented on HDFS-2839: --- For #1 above, you could configure the "logical URI"
