-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30603/
-----------------------------------------------------------
Review request for Ambari, Dmitro Lisnichenko, Jonathan Hurley, Nate Cole, and Yurii Shylov.


Bugs: AMBARI-9467
    https://issues.apache.org/jira/browse/AMBARI-9467


Repository: ambari


Description
-------

UpgradeHelper somehow calls the active NameNode first, but this ends up being the standby NameNode by the time it gets called; investigate why.

We will abide by the order in the runbook: upgrade the standby NameNode first, then the active NameNode, which then causes a flip. In rare cases, if a NameNode fails for whatever reason, ZKFC will initiate a failover, which explains why the order may sometimes be flipped by the time the NameNode prepare step happens. However, the namenode_upgrade.py script works in both cases (active first or standby first), so this explains the rare behavior.

There's another Jira to run the namenode_upgrade script as part of the Pre-Cluster group to make the backup, which should reduce the likelihood of a flip happening after the calculation is made.


Diffs
-----

  ambari-server/src/main/java/org/apache/ambari/server/serveraction/upgrades/FinalizeUpgradeAction.java fceb44d
  ambari-server/src/main/java/org/apache/ambari/server/state/UpgradeHelper.java 0c6f68a
  ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java db17109
  ambari-server/src/main/resources/common-services/ZOOKEEPER/3.4.5.2.0/package/scripts/params.py 2484463
  ambari-server/src/main/resources/common-services/ZOOKEEPER/3.4.5.2.0/package/scripts/service_check.py 338de32
  ambari-server/src/main/resources/common-services/ZOOKEEPER/3.4.5.2.0/package/scripts/zookeeper_server.py a7ca335

Diff: https://reviews.apache.org/r/30603/diff/


Testing
-------

Verified Rolling Upgrade on a 3-node cluster with HDFS, ZK, and NameNode HA. The flip happens rarely, but Ambari must be robust enough to handle it.

Unit tests are in progress.


Thanks,

Alejandro Fernandez
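
For readers who want the ordering issue in concrete terms, here is a minimal sketch of how a standby-first order computed at planning time can be stale by the time it runs. It is not Ambari code: the function name `order_namenodes_for_upgrade`, the host names `nn1` and `nn2`, and the plain dictionary of HA states are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def order_namenodes_for_upgrade(ha_states: Dict[str, str]) -> List[str]:
    """Return hosts with standby NameNodes first, then the active one.

    ha_states maps a (hypothetical) host name to "active" or "standby".
    """
    # Sort key 0 for standby and 1 for active puts standbys at the front.
    return sorted(
        ha_states,
        key=lambda host: 0 if ha_states[host] == "standby" else 1,
    )


# Order calculated when the upgrade plan is built.
planned = order_namenodes_for_upgrade({"nn1": "active", "nn2": "standby"})

# If a failover happens after the calculation, the roles swap but the
# plan does not, so the first host in the plan is now the active one.
after_failover = {"nn1": "standby", "nn2": "active"}
first_role_at_execution = after_failover[planned[0]]
```

In this sketch `planned` lists `nn2` before `nn1`, yet after the simulated failover `nn2` is the active NameNode when its turn comes. That is the rare flip described above, and it is why the per-host script has to work whichever role it finds.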