[jira] [Updated] (AMBARI-12488) RU - Use haadmin failover command instead of killing ZKFC during upgrade/downgrade

Alejandro Fernandez (JIRA) Wed, 22 Jul 2015 13:55:19 -0700

     [ https://issues.apache.org/jira/browse/AMBARI-12488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Alejandro Fernandez updated AMBARI-12488:
-----------------------------------------
    Attachment: AMBARI-12488.branch-2.1.2.patch
                AMBARI-12488.branch-2.1.2.additional.patch

> RU - Use haadmin failover command instead of killing ZKFC during 
> upgrade/downgrade
> ----------------------------------------------------------------------------------
>
>                 Key: AMBARI-12488
>                 URL: https://issues.apache.org/jira/browse/AMBARI-12488
>             Project: Ambari
>          Issue Type: Story
>          Components: ambari-server
>    Affects Versions: 2.0.0
>            Reporter: Alejandro Fernandez
>            Assignee: Alejandro Fernandez
>              Labels: rolling_upgrade
>             Fix For: 2.1.2
>
>         Attachments: AMBARI-12488.branch-2.1.2.additional.patch, 
> AMBARI-12488.branch-2.1.2.patch, AMBARI-12488.branch-2.1.2.patch, 
> AMBARI-12488.patch
>
>
> Currently RU orchestration during upgrade/downgrade kills ZKFC on the active 
> NameNode to initiate a failover to standby. We should instead use the 
> failover command.
> E.g.,
> {code}
> su hdfs -c 'hdfs haadmin -failover nn1 nn2'
> {code}
> Where nn1 is the current namenode, if it is the active one, and nn2 is the 
> remaining namenode.
> This is safer than killing zkfc on the active namenode because this command 
> first tries to gracefully transition a NameNode to the Standby state. If this 
> fails, the fencing methods (as configured by dfs.ha.fencing.methods) will be 
> attempted until one succeeds. After this process the second NameNode will be 
> transitioned to the Active state. 
> It also avoids the long waits between killing ZKFC, failover kicking in 
> after a timeout, and the standby NN finally becoming active.
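
As a rough illustration of the description above, the failover could be issued only when the current NameNode reports itself as active. This is a minimal sketch, not the change in the attached patches: the function names are hypothetical, nn1 and nn2 are the example NameNode IDs from the description, and it assumes `hdfs haadmin -getServiceState` prints "active" or "standby".

```shell
#!/bin/sh
# Sketch only: function names are hypothetical and not part of Ambari.
# nn1/nn2 are the example NameNode IDs used in the issue description.

# Print the "<from> <to>" arguments for the failover command, or nothing
# if the current NameNode is not the active one (no failover needed).
failover_args() {
  current_id=$1; current_state=$2; other_id=$3
  if [ "$current_state" = "active" ]; then
    printf '%s %s\n' "$current_id" "$other_id"
  fi
}

# Ask HDFS for the state of the current NameNode and, if it is active,
# request a graceful failover to the other NameNode.
failover_if_active() {
  current_id=$1; other_id=$2
  state=$(su hdfs -c "hdfs haadmin -getServiceState $current_id") || return 1
  args=$(failover_args "$current_id" "$state" "$other_id")
  if [ -n "$args" ]; then
    su hdfs -c "hdfs haadmin -failover $args"
  fi
}
```

On the host of the NameNode about to be restarted, this would be called as `failover_if_active nn1 nn2`. When nn1 is already standby, nothing is executed.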



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
