Excerpts from Michael Kerrin's message of 2014-07-17 07:54:26 -0700:
> On Thursday 26 June 2014 12:20:30 Clint Byrum wrote:
> > Excerpts from Macdonald-Wallace, Matthew's message of 2014-06-26 04:13:31 -0700:
> > > Hi all,
> > >
> > > I've been working more and more with TripleO recently and whilst it does
> > > seem to solve a number of problems well, I have found a couple of
> > > idiosyncrasies that I feel would be easy to address.
> > >
> > > My primary concern lies in the fact that os-refresh-config does not run on
> > > every boot/reboot of a system. Surely a reboot *is* a configuration
> > > change and therefore we should ensure that the box has come up in the
> > > expected state with the correct config?
> > >
> > > This is easily fixed through the addition of an "@reboot" entry in
> > > /etc/crontab to run o-r-c or (less easily) by re-designing o-r-c to run
> > > as a service.
> > >
> > > My secondary concern is that through not running os-refresh-config on a
> > > regular basis by default (i.e. every 15 minutes or something in the same
> > > style as chef/cfengine/puppet), we leave ourselves exposed to someone
> > > trying to make a "quick fix" to a production node and taking that node
> > > offline the next time it reboots because the config was still left as
> > > broken owing to a lack of updates to HEAT (I'm thinking a "quick change"
> > > to allow root access via SSH during a major incident that is then left
> > > unchanged for months because no-one updated HEAT).
> > >
> > > There are a number of options to fix this including Modifying
> > > os-collect-config to auto-run os-refresh-config on a regular basis or
> > > setting os-refresh-config to be its own service running via upstart or
> > > similar that triggers every 15 minutes
> > >
> > > I'm sure there are other solutions to these problems, however I know from
> > > experience that claiming this is solved through "education of users" or
> > > (more severely!) via HR is not a sensible approach to take as by the time
> > > you realise that your configuration has been changed for the last 24
> > > hours it's often too late!
> > So I see two problems highlighted above.
> >
> > 1) We don't re-assert ephemeral state set by o-r-c scripts. You're right,
> > and we've been talking about it for a while. The right thing to do is
> > have os-collect-config re-run its command on boot. I don't think a cron
> > job is the right way to go, we should just have a file in /var/run that
> > is placed there only on a successful run of the command. If that file
> > does not exist, then we run the command.
> >
> > I've just opened this bug in response:
> >
> > https://bugs.launchpad.net/os-collect-config/+bug/1334804
> >
>
> I have been looking into bug #1334804 and I have a review up to resolve it. I
> want to highlight something.
>
> Currently on a reboot we start all services via upstart (on debian anyways)
> and there have been quite a lot of issues around this - missing upstart
> scripts and timing issues. I don't know the issues on fedora.
>
> So with a fix to #1334804, on a reboot upstart will start all the services
> first (with potentially out-of-date configuration), then o-c-c will start
> o-r-c and will now configure all services and restart them or start them if
> upstart isn't configured properly.
>
> I would like to turn off all boot scripts for services we configure and leave
> all this to o-r-c. I think this will simplify things and put us in control of
> starting services. I believe that it will also narrow the gap between fedora
> and debian or debian and debian so what works on one should work on the other
> and make it easier for developers.

Agreed, and that is actually really simple. I hate to steal your thunder,
but this is the patch:

https://review.openstack.org/107772

>
> Having the ability to service nova-api stop|start|restart is very handy but
> this will be a manual thing and I intend to leave that there.
>
> What do people think and how best do I push this forward. I feel that this
> leads into the re-assert-system-state spec but mainly I think this is a
> bug and doesn't require a spec.
>
> I will be at the tripleo mid-cycle meetup next and willing to discuss this
> with anyone interested in this and put together the necessary bits to make
> this happen.

As I said, it is simple. :) I suggest testing the patch above and adding
anything I missed to it.

Systemd based systems will likely need something different. I'm still
burying my head in the sand and not learning systemd, but perhaps a
follow-up patch from somebody who understands it can make those systems do
the same thing.

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
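
For anyone who wants something concrete to poke at: the /var/run marker
idea quoted earlier in this thread (run the command unless a marker file
exists, and write the marker only after a successful run) could look
roughly like the sketch below. This is an illustration only. It is not the
code in os-collect-config, nor the change in the review linked above. The
function name, the marker file name and the demo command are all made up,
and it assumes /var/run is emptied on every boot, so that a missing marker
means "not run yet since boot".

```python
import os
import subprocess
import sys
import tempfile


def run_once_per_boot(command, marker_path):
    """Run `command` unless `marker_path` already exists.

    Hypothetical helper, not part of os-collect-config. The marker is
    written only after the command exits with status 0, so a failed run
    is retried on the next call.

    Returns True if the command was run, False if it was skipped.
    """
    if os.path.exists(marker_path):
        # Marker present: the command already succeeded since the
        # directory was last cleared (assumed to happen at boot).
        return False

    # check_call raises subprocess.CalledProcessError on a non-zero exit,
    # in which case the marker below is never written.
    subprocess.check_call(command)

    with open(marker_path, "w") as marker:
        marker.write("ok\n")
    return True


if __name__ == "__main__":
    # Demo in a throwaway directory standing in for /var/run.
    demo_dir = tempfile.mkdtemp()
    marker = os.path.join(demo_dir, "os-refresh-config.done")  # made-up name
    harmless_command = [sys.executable, "-c", "pass"]

    print(run_once_per_boot(harmless_command, marker))  # runs the command
    print(run_once_per_boot(harmless_command, marker))  # skipped, marker exists
```

In a real deployment the command would be whatever os-collect-config is
configured to run and the marker would live under /var/run. Writing the
marker only on success is the point of the design: a failed first run
leaves no marker behind, so the next attempt runs the command again.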