[ 
https://issues.apache.org/jira/browse/AMBARI-22848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swapan Shridhar updated AMBARI-22848:
-------------------------------------
    Fix Version/s:     (was: 2.6.2)
                   2.7.0

> Blueprint database inconsistency should be caught by Ambari DB consistency 
> checker
> ----------------------------------------------------------------------------------
>
>                 Key: AMBARI-22848
>                 URL: https://issues.apache.org/jira/browse/AMBARI-22848
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.5.0
>            Reporter: Robert Nettleton
>            Assignee: Robert Nettleton
>            Priority: Critical
>             Fix For: 2.7.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> We've seen some Blueprint deployments fail after an upgrade (to Ambari 
> 2.5.2/2.6) causes older configuration to be reset.  
> 1. User deploys cluster via Blueprints with an older version of Ambari 
> (older than Ambari 2.5/2.6).
> 2. Cluster deployment fails, and either the user doesn't realize the 
> deployment has failed, or works through the manual configuration changes 
> required to get failed services up and running. 
> 3. Things run fine, sometimes for quite a while.
> 4. User upgrades ambari-server to Ambari 2.5 or Ambari 2.6.
> 5. Upon the restart of ambari-server, some services seem to be failing due 
> to invalid or old configuration.
> The root cause of this problem is that the Blueprints TopologyManager class 
> will attempt to "replay" any failed requests, which was originally 
> implemented to allow a Blueprints install to continue working even if 
> ambari-server is stopped and restarted.
> Since the original Blueprint deployment failed, the Ambari Server database is 
> in an inconsistent state, which causes the Blueprints TopologyManager to 
> attempt a replay of various configuration tasks. This ends up causing the 
> TopologyManager to send configuration updates from the Blueprint's 
> configuration sections, which by now may be quite out of date, as the cluster 
> may have changed over time while being administered.
> This in turn causes some services to fail, as older configuration may not 
> match the current environment.
>  
> The ambari-server upgrade mechanism should be modified to include integrity 
> checks on the Blueprint-related tables in the database. In particular, if a 
> Blueprint deployment is detected, at the very least the "clusterconfig" table 
> needs to be checked, to ensure that at least one configuration type's version 
> has a version_tag of "TOPOLOGY_RESOLVED". If no configuration versions are 
> found to have a tag of "TOPOLOGY_RESOLVED", then the ambari-server upgrade 
> should fail with the appropriate messages, to allow the user to make the 
> manual changes required in order to resolve the problem, usually by applying 
> a workaround.
> Having this check at the ambari-server upgrade time seems like the correct 
> way to move forward, as this will more quickly detect this problem, and will 
> keep users from accidentally moving forward with an upgrade that will corrupt 
> the cluster's configuration with older configuration items.
>  
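The check proposed in the description can be sketched as follows. This is a
minimal illustration only, not Ambari's actual implementation: the class
BlueprintConsistencyCheck, the ConfigRow type, the method names, and the sample
rows are all hypothetical, and the rows are held in memory instead of being
read from the real "clusterconfig" table. Only the table name, the version_tag
column, and the "TOPOLOGY_RESOLVED" value come from the issue text.

```java
import java.util.Arrays;
import java.util.List;

public class BlueprintConsistencyCheck {

    // Tag value named in the issue description.
    public static final String TOPOLOGY_RESOLVED = "TOPOLOGY_RESOLVED";

    /** Hypothetical in-memory stand-in for one row of the "clusterconfig" table. */
    public static final class ConfigRow {
        public final String configType;   // illustrative field, e.g. "hdfs-site"
        public final String versionTag;   // mirrors the version_tag column

        public ConfigRow(String configType, String versionTag) {
            this.configType = configType;
            this.versionTag = versionTag;
        }
    }

    /** Returns true if at least one row carries the TOPOLOGY_RESOLVED tag. */
    public static boolean hasTopologyResolvedConfig(List<ConfigRow> rows) {
        return rows.stream().anyMatch(r -> TOPOLOGY_RESOLVED.equals(r.versionTag));
    }

    /**
     * Fails the (simulated) upgrade when the cluster was deployed via Blueprint
     * but no configuration version was ever tagged TOPOLOGY_RESOLVED.
     * Clusters that were not deployed via Blueprint are not checked.
     */
    public static void checkBeforeUpgrade(boolean blueprintDeployment, List<ConfigRow> rows) {
        if (blueprintDeployment && !hasTopologyResolvedConfig(rows)) {
            throw new IllegalStateException(
                "Blueprint deployment detected, but no configuration has version_tag "
                + TOPOLOGY_RESOLVED + "; resolve the inconsistency before upgrading.");
        }
    }

    public static void main(String[] args) {
        // Sample rows are made up for illustration.
        List<ConfigRow> incomplete = Arrays.asList(
            new ConfigRow("hdfs-site", "INITIAL"),
            new ConfigRow("core-site", "INITIAL"));
        try {
            checkBeforeUpgrade(true, incomplete);
            System.out.println("Upgrade may proceed.");
        } catch (IllegalStateException e) {
            System.out.println("Upgrade blocked: " + e.getMessage());
        }
    }
}
```

Failing fast with an exception mirrors the intent stated above: the upgrade
stops with a message before any stale Blueprint configuration can be replayed,
which leaves the user room to apply a workaround first.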



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
