[ https://issues.apache.org/jira/browse/HADOOP-8247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250294#comment-13250294 ]
Todd Lipcon commented on HADOOP-8247:
-------------------------------------

I also ran the manual tests again. Here's the usage output of HAAdmin:

{code}
Usage: DFSHAAdmin [-ns <nameserviceId>]
    [-transitionToActive [--forcemanual] <serviceId>]
    [-transitionToStandby [--forcemanual] <serviceId>]
    [-failover [--forcefence] [--forceactive] [--forcemanual] <serviceId> <serviceId>]
    [-getServiceState <serviceId>]
    [-checkHealth <serviceId>]
    [-help <command>]

--forceManual allows the manual failover commands to be used even when
automatic failover is enabled. This flag is DANGEROUS and should only be
used with expert guidance.
{code}

Here's what happens if I try to use a state-change command with auto-HA enabled:

{code}
$ ./bin/hdfs haadmin -transitionToActive nn1
Automatic failover is enabled for NameNode at todd-w510/127.0.0.1:8021
Refusing to manually manage HA state, since it may cause a split-brain
scenario or other incorrect state. If you are very sure you know what
you are doing, please specify the forcemanual flag.
$ echo $?
255
{code}

I also checked the other two state-changing ops ({{-transitionToStandby}} and {{-failover}}); both yielded the same error message.

- I verified that {{-getServiceState}} and {{-checkHealth}} continue to work.
- I verified that the {{-forcemanual}} flag works:
{code}
$ ./bin/hdfs haadmin -transitionToStandby -forcemanual nn1
12/04/09 16:12:38 WARN ha.HAAdmin: Proceeding with manual HA state management
even though automatic failover is enabled for NameNode at todd-w510/127.0.0.1:8021
{code}
(also for {{-transitionToActive}} and {{-failover}})
- Verified that {{start-dfs.sh}} starts the ZKFCs on both of my configured NNs when auto-HA is enabled, and that {{stop-dfs.sh}} stops them. Discovered trivial bug HDFS-3234 here.

----

Next, I modified my config to set the auto-failover flag to false.

- Verified that {{start-dfs.sh}} doesn't try to start ZKFCs.
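For reference, the auto-failover flag being toggled in these tests is {{dfs.ha.automatic-failover.enabled}}. A minimal {{hdfs-site.xml}} sketch; the nameservice id is taken from the ZKFC log in these tests, and the suffixed per-nameservice form is how I read the "scoped by nameservice" requirement, so treat the exact spelling as an assumption:

{code}
<!-- hdfs-site.xml (sketch): enable automatic failover globally ... -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<!-- ... or scoped to a single nameservice (assumed suffix form) -->
<property>
  <name>dfs.ha.automatic-failover.enabled.nameserviceId1</name>
  <value>true</value>
</property>
{code}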
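Since the refusal path exits with status 255 while success exits 0, wrapper scripts can branch on the return code. A minimal POSIX sh sketch; {{haadmin_cmd}} is a hypothetical stand-in for {{./bin/hdfs haadmin}} so the control flow can be shown without a running cluster:

```shell
#!/bin/sh
# Sketch only: haadmin_cmd mimics the exit codes observed above
# (255 when auto-HA blocks a manual state change, 0 on success).
# It is NOT the real tool; it just lets the branching logic run.
haadmin_cmd() {
  case "$*" in
    *forcemanual*) return 0 ;;   # operator explicitly overrode the guard
    *) echo "Refusing to manually manage HA state" >&2; return 255 ;;
  esac
}

# Wrap a manual transition and surface the auto-HA refusal distinctly.
manual_transition_to_active() {
  haadmin_cmd -transitionToActive "$1"
  rc=$?
  if [ "$rc" -eq 255 ]; then
    echo "auto-HA enabled for $1; rerun with -forcemanual if certain" >&2
  fi
  return "$rc"
}

manual_transition_to_active nn1 || echo "exit=$?"   # -> exit=255
```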
- Verified that if I try to start a ZKFC, it bails:
{code}
12/04/09 16:19:12 INFO tools.DFSZKFailoverController: Failover controller configured for NameNode nameserviceId1.nn2
12/04/09 16:19:12 FATAL ha.ZKFailoverController: Automatic failover is not enabled for NameNode at todd-w510/127.0.0.1:8022.
Please ensure that automatic failover is enabled in the configuration before running the ZK failover controller.
{code}
- Verified that the haadmin commands all function without the {{-forcemanual}} flag specified.

> Auto-HA: add a config to enable auto-HA, which disables manual FC
> -----------------------------------------------------------------
>
>                  Key: HADOOP-8247
>                  URL: https://issues.apache.org/jira/browse/HADOOP-8247
>              Project: Hadoop Common
>           Issue Type: Improvement
>           Components: auto-failover, ha
>     Affects Versions: Auto Failover (HDFS-3042)
>             Reporter: Todd Lipcon
>             Assignee: Todd Lipcon
>          Attachments: hadoop-8247.txt, hadoop-8247.txt, hadoop-8247.txt, hadoop-8247.txt
>
> Currently, if automatic failover is set up and running, and the user uses the "haadmin -failover" command, he or she can end up putting the system in an inconsistent state, where the state in ZK disagrees with the actual state of the world. To fix this, we should add a config flag which is used to enable auto-HA. When this flag is set, we should disallow use of the haadmin command to initiate failovers. We should refuse to run ZKFCs when the flag is not set. Of course, this flag should be scoped by nameservice.