On 18.8.2017 13:18, Sofer Athlan-Guyot wrote:
Hi,

We may have missing packages when the user adds a new role to their
roles_data file and the base image comes from a previous version.

The workflow would be this one:
  - install newton
  - upgrade to ocata
  - add collectd to roles_data and redeploy the stack

For instance, if one adds
OS::TripleO::Services::Collectdservices::collectd in an ocata env coming
from an upgraded newton env, he/she won't have the necessary packages
(for instance collectd-disk).  The puppet manifest will fail as the
package is missing and puppet doesn't install packages.  The upgrade
task[1] is useless as the new role wasn't added during the upgrade but
after.

Right, but the package could be added during the upgrade. Ideally, the upgrade_tasks could/should bring the set of installed overcloud RPMs on par with the overcloud-full image of the respective release. So you'd always have the collectd RPMs installed, on both freshly deployed and upgraded envs, regardless of whether you actually use collectd. We already did some package installs/uninstalls as part of upgrades and updates, but probably didn't have 100% coverage.


I don't see any easy way to solve this.  Basically we need a way to keep
the base image in sync between releases without using the upgrade_tasks,
maybe in the tripleo-package one?

Given that released code is affected, we may treat it as a bug that requires a minor update, and in addition to upgrade_tasks, we can add all the necessary package installs into the minor update code (yum_update.sh) too. Again, this shouldn't depend on which services are actually enabled; it should just unconditionally sync with the latest content of the overcloud-full image of the respective release.

I guess the time consuming part will be preparing the envs that will allow comparing a fresh deploy vs. an upgraded one to get the `rpm -qa | sort` difference. Or we could try a shortcut and see what changes went into tripleo-puppet-elements in each release.
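
A minimal sketch of that comparison, assuming each listing was saved to a text file with one package per line. Listing names only (for example with `rpm -qa --qf '%{NAME}\n' | sort`) is my assumption here, so that version differences between the two envs don't show up as noise. The function name `missing_packages` and the file arguments are hypothetical, not part of any TripleO tooling.

```python
import sys
from typing import List


def missing_packages(fresh_listing: str, upgraded_listing: str) -> List[str]:
    """Return packages present in the fresh deploy listing but absent
    from the upgraded env listing, sorted alphabetically."""
    # One package per line; ignore blank lines and surrounding whitespace.
    fresh = {line.strip() for line in fresh_listing.splitlines() if line.strip()}
    upgraded = {line.strip() for line in upgraded_listing.splitlines() if line.strip()}
    # Set difference: what a fresh deploy has that the upgraded env lacks.
    return sorted(fresh - upgraded)


if __name__ == "__main__":
    # Hypothetical usage: python compare_rpms.py fresh.txt upgraded.txt
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as fresh_file, open(sys.argv[2]) as upgraded_file:
            for name in missing_packages(fresh_file.read(), upgraded_file.read()):
                print(name)
```

The resulting list would be a candidate set of package installs for upgrade_tasks and yum_update.sh; swapping the two arguments would show packages that only the upgraded env still carries.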


This shouldn't be a problem with containers, but everything before Pike
is affected.

Indeed. There will still be some basic baremetal host content management as long as we're not using Atomic, but the room for potential problems will be much smaller.

Jirka


Originally seen here [2]

[1] https://github.com/openstack/tripleo-heat-templates/blob/stable/ocata/puppet/services/metrics/collectd.yaml#L130..L134
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1455065



__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev