[ https://issues.apache.org/jira/browse/AMBARI-12488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alejandro Fernandez updated AMBARI-12488:
-----------------------------------------
    Attachment: AMBARI-12488.branch-2.1.2.patch
                AMBARI-12488.branch-2.1.2.additional.patch

> RU - Use haadmin failover command instead of killing ZKFC during upgrade/downgrade
> ----------------------------------------------------------------------------------
>
>                 Key: AMBARI-12488
>                 URL: https://issues.apache.org/jira/browse/AMBARI-12488
>             Project: Ambari
>          Issue Type: Story
>          Components: ambari-server
>    Affects Versions: 2.0.0
>            Reporter: Alejandro Fernandez
>            Assignee: Alejandro Fernandez
>              Labels: rolling_upgrade
>             Fix For: 2.1.2
>
>         Attachments: AMBARI-12488.branch-2.1.2.additional.patch, AMBARI-12488.branch-2.1.2.patch, AMBARI-12488.branch-2.1.2.patch, AMBARI-12488.patch
>
>
> Currently, RU (rolling upgrade) orchestration during upgrade/downgrade kills ZKFC on the active NameNode to initiate a failover to the standby. We should instead use the failover command.
> E.g.,
> {code}
> su hdfs -c 'hdfs haadmin -failover nn1 nn2'
> {code}
> where nn1 is the current NameNode if it is the active one, and nn2 is the remaining NameNode.
> This is safer than killing ZKFC on the active NameNode because the command first tries to gracefully transition that NameNode to the Standby state. If this fails, the fencing methods (as configured by dfs.ha.fencing.methods) are attempted until one succeeds. After this process, the second NameNode is transitioned to the Active state.
> It also reduces the long waits between the ZKFC kill, the failover kicking in after a timeout, and the NN then becoming active.
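
The failover direction described in the issue (from whichever NameNode is currently active to the other one) could be worked out along the following lines. This is only a sketch under stated assumptions, not the logic in the attached patches: `failover_command` is a hypothetical helper name, and the states passed to it are assumed to be the strings "active" or "standby", as printed by `hdfs haadmin -getServiceState <id>`.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the AMBARI-12488 patches): print the
# haadmin failover command for a two-NameNode HA pair.
# Arguments: <id_a> <state_a> <id_b> <state_b>
# States are assumed to be "active" or "standby".
failover_command() {
    id_a=$1; state_a=$2; id_b=$3; state_b=$4
    if [ "$state_a" = "active" ] && [ "$state_b" = "standby" ]; then
        # First NameNode is active: fail over from it to the second one.
        echo "hdfs haadmin -failover $id_a $id_b"
    elif [ "$state_b" = "active" ] && [ "$state_a" = "standby" ]; then
        # Second NameNode is active: fail over in the other direction.
        echo "hdfs haadmin -failover $id_b $id_a"
    else
        # No clear active/standby pair: print nothing and signal failure
        # rather than guess a direction.
        return 1
    fi
}
```

On a cluster, the two states would come from `hdfs haadmin -getServiceState nn1` and `hdfs haadmin -getServiceState nn2`, and the printed command would then be run as the hdfs user, as in the example in the issue description. Returning a non-zero status when the pair is not one active plus one standby leaves the decision to the caller instead of issuing a failover blindly.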