-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/40924/#review109002
-----------------------------------------------------------

Ship it!

Looks ok to me. I do think we should ask Sumit Mohanty or Sid Wagle to review
this as well, to make sure the Cluster changes around the desired
configuration API are correct.

Thanks.

- Robert Nettleton


On Dec. 3, 2015, 9:03 p.m., Sebastian Toader wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/40924/
> -----------------------------------------------------------
>
> (Updated Dec. 3, 2015, 9:03 p.m.)
>
>
> Review request for Ambari, Oliver Szabo, Robert Nettleton, and Sandor Magyari.
>
>
> Bugs: AMBARI-14188
>     https://issues.apache.org/jira/browse/AMBARI-14188
>
>
> Repository: ambari
>
>
> Description
> -------
>
> 1. Increased the interval for Cluster configuration request retries from 100
> ms to 1 sec in order to reduce the burden on the CPU caused by persistent
> failures.
>
>
> 2. When Ambari is (re)started, it checks whether there are any persisted
> cluster configuration requests that were not completed and replays those. It
> decides whether it has to create a cluster configuration request by looking
> at the latest version of the cluster configs. If no config type has
> tag=TOPOLOGY_RESOLVED, then it creates a cluster configuration request.
>
> When the cluster is provisioned using a Blueprint, config types will have two
> versions, one with tag=INITIAL and one with tag=TOPOLOGY_RESOLVED, the latter
> being the latest (active) version. Upgrading the cluster to a different
> HDP version then updates all config types, creating new versions with
> tag="version....". If Ambari is restarted at this stage, it looks at the
> active versions of the cluster configs. Since none of them has
> tag=TOPOLOGY_RESOLVED, it creates a cluster configuration request. A
> cluster configuration task is scheduled to handle the request. The logic that
> executes the task and tries to update the configuration types throws an
> exception saying that there is already a config type with
> tag=TOPOLOGY_RESOLVED, since this check looks at all versions, not only at
> the active one. This results in the retry mechanism for Cluster configuration
> retrying every 100 ms for 30 min, with the side effect of Ambari server
> being unresponsive.
>
> Changed the logic that determines if a cluster configuration request has to
> be replayed to look at all existing versions of config types and verify if
> there is at least one that went through the INITIAL -> TOPOLOGY_RESOLVED
> transition.
>
>
> Diffs
> -----
>
>   ambari-server/src/main/java/org/apache/ambari/server/state/Cluster.java 2afba7e
>   ambari-server/src/main/java/org/apache/ambari/server/state/DesiredConfig.java 0635284
>   ambari-server/src/main/java/org/apache/ambari/server/state/cluster/ClusterImpl.java 7ced845
>   ambari-server/src/main/java/org/apache/ambari/server/topology/AmbariContext.java 608e6ca
>   ambari-server/src/main/java/org/apache/ambari/server/topology/TopologyManager.java 9b6c9ad
>   ambari-server/src/test/java/org/apache/ambari/server/state/DesiredConfigTest.java 93e3f07
>   ambari-server/src/test/java/org/apache/ambari/server/topology/AmbariContextTest.java 254d3a3
>
> Diff: https://reviews.apache.org/r/40924/diff/
>
>
> Testing
> -------
>
> Manual testing:
>
> 1. Created HDP 2.2 cluster with Blueprint
> 2. Upgraded cluster to HDP 2.3.2.0
> 3. Restarted Ambari Server
> 4. Verified that ambari server is not erroring in a loop which was causing it
> to become unresponsive
>
> Unit test results:
>
> Results :
>
> Tests run: 3518, Failures: 0, Errors: 0, Skipped: 28
>
>
> Thanks,
>
> Sebastian Toader
>
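
For anyone reading the thread without the diff open, the replay check
described in item 2 of the quoted description can be sketched roughly as
below. This is only an illustration under stated assumptions, not the code in
the patch: the class name ConfigReplayCheck, both method names, and the
representation of a cluster's configs as a map from config type to its
version tags (oldest first) are all hypothetical. Only the tag names INITIAL
and TOPOLOGY_RESOLVED come from the description.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch; these are NOT Ambari's actual classes or method names.
final class ConfigReplayCheck {
    static final String INITIAL = "INITIAL";
    static final String TOPOLOGY_RESOLVED = "TOPOLOGY_RESOLVED";

    // Behaviour described as the problem: only the latest (active) tag of each
    // config type is inspected. tagsByType maps a config type to its version
    // tags, ordered oldest first (an assumption made for this sketch).
    static boolean needsReplayActiveOnly(Map<String, List<String>> tagsByType) {
        for (List<String> tags : tagsByType.values()) {
            if (!tags.isEmpty()
                    && TOPOLOGY_RESOLVED.equals(tags.get(tags.size() - 1))) {
                return false; // an active version is already resolved
            }
        }
        return true;
    }

    // Behaviour described as the fix: every existing version is inspected, and
    // no replay is needed once any config type has gone through the
    // INITIAL -> TOPOLOGY_RESOLVED transition.
    static boolean needsReplayAllVersions(Map<String, List<String>> tagsByType) {
        for (List<String> tags : tagsByType.values()) {
            int initial = tags.indexOf(INITIAL);
            int resolved = tags.indexOf(TOPOLOGY_RESOLVED);
            if (initial >= 0 && resolved > initial) {
                return false; // the transition already happened
            }
        }
        return true;
    }
}
```

With a config type whose tags are INITIAL, TOPOLOGY_RESOLVED and then a
post-upgrade tag (written as "version1" here purely as a placeholder), the
active-only check asks for a replay while the all-versions check does not.
That matches the restart-after-upgrade scenario in the description.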