-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30603/
-----------------------------------------------------------

Review request for Ambari, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, and 
Yurii Shylov.


Bugs: AMBARI-9467
    https://issues.apache.org/jira/browse/AMBARI-9467


Repository: ambari


Description
-------

UpgradeHelper appears to call the active NameNode first, but that host has 
become the standby NameNode by the time it is actually called; investigate why.

We will abide by the order in the runbook: upgrade the standby NameNode first, 
then the active NameNode, which causes a flip.
In rare cases, if a NameNode fails for any reason, ZKFC initiates a failover, 
so the active and standby roles may have flipped by the time the NameNode 
prepare step runs. This explains the rare behavior. The namenode_upgrade.py 
script works in both cases (active first or standby first).
There is another Jira to run the namenode_upgrade script as part of the 
Pre-Cluster group to make the backup, which should reduce the likelihood of a 
flip happening after the order has been calculated.
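
The ordering and the flip can be illustrated with a minimal sketch. This is 
not the code in UpgradeHelper or namenode_upgrade.py; the function names, the 
host names, and the state strings are all hypothetical. It only shows the idea 
of planning "standby first, then active" and re-reading the role when the step 
actually runs, since ZKFC may have failed over in between.

```python
# Hypothetical sketch; names are illustrative and not Ambari's actual API.

def order_namenodes(ha_states):
    """Return NameNode hosts ordered standby first, then active.

    ha_states maps host name -> "active" or "standby", as observed at the
    time the upgrade order is calculated.
    """
    rank = {"standby": 0, "active": 1}
    # Hosts in an unknown state sort last; sorted() is stable, so hosts
    # with the same rank keep their input order.
    return sorted(ha_states, key=lambda host: rank.get(ha_states[host], 2))


def role_at_execution(host, current_states):
    """Look up the role again when the step runs.

    The planned order can be stale if a failover happened after planning,
    so the step should act on the current role, not the planned one.
    """
    return current_states.get(host, "unknown")


# Order calculated while nn1 is active and nn2 is standby.
planned = order_namenodes({"nn1.example.com": "active",
                           "nn2.example.com": "standby"})

# Simulated failover before the prepare step: the roles have flipped.
after_flip = {"nn1.example.com": "standby", "nn2.example.com": "active"}
first_role = role_at_execution(planned[0], after_flip)
```

Here `planned` lists nn2 first, but after the simulated failover `first_role` 
is "active", which is the rare case described above.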


Diffs
-----

  ambari-server/src/main/java/org/apache/ambari/server/serveraction/upgrades/FinalizeUpgradeAction.java fceb44d 
  ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeHelper.java 0c6f68a 
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java db17109 
  ambari-server/src/main/resources/common-services/ZOOKEEPER/3.4.5.2.0/package/scripts/params.py 2484463 
  ambari-server/src/main/resources/common-services/ZOOKEEPER/3.4.5.2.0/package/scripts/service_check.py 338de32 
  ambari-server/src/main/resources/common-services/ZOOKEEPER/3.4.5.2.0/package/scripts/zookeeper_server.py a7ca335 

Diff: https://reviews.apache.org/r/30603/diff/


Testing
-------

Verified a Rolling Upgrade on a 3-node cluster with HDFS, ZooKeeper, and 
NameNode HA. The flip happens rarely, but Ambari must be robust enough to 
handle it.

Unit tests are in progress.


Thanks,

Alejandro Fernandez
