On 06/08/2017 10:16 AM, Emilien Macchi wrote:
On Thu, Jun 8, 2017 at 1:47 PM, Sofer Athlan-Guyot <sathl...@redhat.com> wrote:
Hi,

Alex Schultz <aschu...@redhat.com> writes:

On Wed, Jun 7, 2017 at 5:20 AM, Sofer Athlan-Guyot <sathl...@redhat.com> wrote:
Hi,

Emilien Macchi <emil...@redhat.com> writes:

On Wed, Jun 7, 2017 at 12:45 PM, Sofer Athlan-Guyot <sathl...@redhat.com> wrote:
Hi,

I don't think we have such a job in place.  Basically that would check
that re-running the "openstack deploy ..." command won't do anything.

I've had a look at openstack-infra/tripleo-ci.  Should I test it with
ovb/quickstart or tripleo.sh?  Both ways are fine by me, but I may be
lacking context about which one is more relevant.

We had such an error in the past[1], but I'm not sure it was
caught by an associated job.

WDYT?

It would be interesting to measure how much time it takes to run
it again.

Could you point out how such an experiment could be done?

If it's short, we could add it to all our scenarios + ovb
jobs.  If it's long, maybe we need an additional job, but it would
take more resources, so maybe we could run it in the periodic pipeline
(note that periodic jobs are not optimal since we could break
something quite easily).
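
Something along these lines could work for the timing experiment (just a
sketch; it assumes the original deploy command was saved to a script such
as ~/overcloud_deploy.sh, which is an assumed name/path):

  # Sketch: time a no-op re-run of the overcloud deploy
  # (~/overcloud_deploy.sh is an assumed location for the saved command).
  ts_start=$(date +%s)

  # Re-run the exact same deploy command used for the initial deployment;
  # with no template or parameter changes it should converge without work.
  bash ~/overcloud_deploy.sh

  echo "No-op re-deploy took $(( $(date +%s) - ts_start )) seconds"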

Just adding as context that the issue was already raised[1].  Besides
the time constraint, it was pointed out that we would also need to parse the
logs to find out if anything was restarted.  But that could be a second
step.  For parsing, this code was pointed out[2].
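
As a rough illustration of what that parsing step could look like (only a
sketch; the log location and the exact Puppet message strings would need
to be checked against what our jobs actually produce):

  # Sketch: after a second run, look for Puppet resources that changed or
  # triggered refreshes (the log path below is an assumption).
  log=/var/log/puppet/overcloud_deploy.log

  grep -E "Triggered 'refresh'|Scheduling refresh of|ensure changed" "$log" \
      | sort | uniq -c | sort -rn
  # Any output here points at resources that were not idempotent.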


There are a few things that would need to be enabled in order to reuse
some of this work.  We'll need to add the ability to generate a report
on the puppet run[0].  And then we'll need to be able to capture it[1]
somewhere so that we can use the parsing code on it.  From there,
just rerunning the installation would be a simple start for the
idempotency check.  In Fuel, we had hacked in a special flag[2] that
we used in testing to rerun a task immediately, in order to find when
a specific task was not idempotent, in addition to rerunning the
entire deployment.  For TripleO a similar concept would be to run the
steps twice, but that's usually not where the issues crop up for us.  So
rerunning the entire deployment would be better, as we tend to have
issues with configuration items conflicting between steps.
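
One cheap way to get that signal without full report parsing might be
Puppet's --detailed-exitcodes on a second apply of a step manifest; a
sketch, where the manifest path is purely hypothetical:

  # Sketch: re-apply a step manifest and use Puppet's detailed exit codes
  # to detect non-idempotent resources (the manifest path is hypothetical).
  manifest=/var/lib/heat-config/heat-config-puppet/step_config.pp

  sudo puppet apply --detailed-exitcodes "$manifest"
  rc=$?

  # --detailed-exitcodes: 0 = no changes, 2 = changes applied,
  # 4 = failures, 6 = changes + failures.  A clean second run should exit 0.
  if [ "$rc" -eq 2 ] || [ "$rc" -eq 6 ]; then
      echo "second run still changed resources (rc=$rc): not idempotent"
  fi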

Maybe we could go with something equivalent to:

  ts="$(date '+%F %T')"
  ... re-run deploy command ...

  sudo journalctl --since="${ts}" | egrep 'Stopping|Starting' \
      | grep -v 'user.*slice' > restarted.log
  wc -l restarted.log

This should be 0 on every overcloud node.

This is simpler to implement and should catch any unwanted service
restart.

WDYT?

It's smart, for services. It doesn't cover configuration file changes
and other resources managed by Puppet, like Keystone resources, etc.
But it's an excellent start to me.
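
To also cover config files and other Puppet-managed resources, we could
additionally look at Puppet's last run summary on each node after the
no-op re-run; a sketch (the state file path varies with the Puppet
version, so it is an assumption):

  # Sketch: check that Puppet reported zero changed resources on this node
  # after the second deploy (state file path is an assumption).
  summary=/var/lib/puppet/state/last_run_summary.yaml

  changed=$(sudo awk '/^  changed:/ {print $2; exit}' "$summary")
  echo "changed resources on $(hostname): ${changed:-unknown}"

  # Anything other than 0 means some resource (config file, Keystone
  # resource, service, ...) was modified by the re-run.
  [ "${changed:-1}" -eq 0 ] || echo "$(hostname): NOT idempotent"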

I just want to point out that the updates job is already doing this when it runs in every repo except tripleo-heat-templates (that's the only package we actually update in the updates job, every other project is a noop). I can also tell you how long it takes to redo a deployment with no changes: just under 2000 seconds, or around 33 minutes. At least that's the current average in tripleo-ci right now (although I see we just added around 100 seconds to the update time in the last day or two. *sigh*).



Thanks,
-Alex

[0] https://review.openstack.org/#/c/273740/4/mcagents/puppetd.rb@204
[1] https://review.openstack.org/#/c/273740/4/mcagents/puppetd.rb@102
[2] https://review.openstack.org/#/c/273737/

[1] http://lists.openstack.org/pipermail/openstack-dev/2017-March/114836.html
[2] https://review.openstack.org/#/c/279271/9/fuelweb_test/helpers/astute_log_parser.py@212


[1] https://bugs.launchpad.net/tripleo/+bug/1664650
