Ok, thanks for the in-depth explanation. My takeaway is that we need to file any rootwrap updates as exceptions for now (so releasenotes and grenade scripts).
 - Gus

On Mon, 27 Jun 2016 at 21:25 Sean Dague <s...@dague.net> wrote:

> On 06/26/2016 10:02 PM, Angus Lees wrote:
> > On Fri, 24 Jun 2016 at 20:48 Sean Dague <s...@dague.net> wrote:
> >
> > > On 06/24/2016 05:12 AM, Thierry Carrez wrote:
> > > > I'm adding Possibility (0): change Grenade so that rootwrap
> > > > filters from N+1 are put in place before you upgrade.
> > >
> > > If you do that as a general course what you are saying is that every
> > > installer and install process includes overwriting all of rootwrap
> > > before every upgrade. Keep in mind we do upstream upgrade as offline,
> > > which means that we've fully shut down the cloud. This would remove
> > > the testing requirement that rootwrap configs were even compatible
> > > between N and N+1. And if you think this is theoretical, you should
> > > see the patches I've gotten over the years to grenade because people
> > > didn't see an issue with that at all. :)
> > >
> > > I do get that people don't like the constraints we've self-imposed,
> > > but we've done that for very good reasons. The #1 complaint from
> > > operators, forever, has been the pain and danger of upgrading. That's
> > > why we are still trademarking new Juno clouds. When you upgrade
> > > Apache, you don't have to change your config files.
> >
> > In case it got lost, I'm 100% on board with making upgrades safe and
> > straightforward, and I understand that grenade is merely a tool to help
> > us test ourselves against our process and not an enemy to be worked
> > around. I'm an ops guy proud and true and hate you all for making
> > openstack hard to upgrade in the first place :P
> >
> > Rootwrap configs need to be updated in line with new rootwrap-using code
> > - that's just the way the rootwrap security mechanism works, since the
> > security "trust" flows from the root-installed rootwrap config files.
> >
> > I would like to clarify what our self-imposed upgrade rules are so that
> > I can design code within those constraints, and no-one is answering my
> > question so I'm just getting more confused as this thread progresses...
> >
> > ***
> > What are we trying to impose on ourselves for upgrades for the present
> > and near future (ie: while rootwrap is still a thing)?
> > ***
> >
> > A. Sean says above that we do "offline" upgrades, by which I _think_ he
> > means a host-by-host (or even global?) "turn everything (on the same
> > host/container) off, upgrade all files on disk for that host/container,
> > turn it all back on again". If this is the model, then we can trivially
> > update rootwrap files during the "upgrade" step, and I don't see any
> > reason why we need to discuss anything further - except how we implement
> > this in grenade.
> >
> > B. We need to support a mix of old + new code running on the same
> > host/container, running against the same config files (presumably
> > because we're updating service-by-service, or want to minimise the
> > service-unavailability during upgrades to literally just a process
> > restart). So we need to think about how and when we stage config vs
> > code updates, and make sure that any overlap is appropriately allowed
> > for (expand-contract, etc).
> >
> > C. We would like to just never upgrade rootwrap (or other config) files
> > ever again (implying a freeze in as_root command lines, effective ~a
> > year ago). Any config update is an exception dealt with through
> > case-by-case process and release notes.
> >
> > I feel like the grenade check currently implements (B) with a 6 month
> > lead time on config changes, but the "theory of upgrade" doc and our
> > verbal policy might actually be (C) (see this thread, eg), and Sean
> > above introduced the phrase "offline" which threw me completely into
> > thinking maybe we're aiming for (A).
> > You can see why I'm looking for clarification ;)
>
> Ok, there is theory of what we are striving for, and there is what is
> viable to test consistently.
>
> The thing we are shooting for is making the code Continuously
> Deployable. Which means the upgrade process should be "pip install -U
> $foo && $foo-manage db-sync" on the API surfaces and "pip install -U
> $foo; service restart" on everything else.
>
> Logic we can put into the python install process is common logic shared
> by all deployment tools, and we can encode it in there. So all
> installers just get it.
>
> The challenge is there is no facility for config file management in
> python native packaging. Which means that software which *depends* on
> config files for new or even working features now moves from the camp of
> CDable to manual upgrade needed. What you need to do is in releasenotes,
> not in code that's shipped with your software. Release notes are not
> scriptable.
>
> So, we've said, doing that has to be the exception and not the rule.
> It's also the same reasoning behind our deprecation phase for all config
> options. Things move from working (in N), to working with warnings (in
> N+1), to not working (in N+2). Which allows people to CD across this
> boundary, and do config file fixing in their Config Management tools
> *post* upgrade.
>
> Our testing, like all testing, is a trade-off for what we could do
> consistently, and feel confident of the results. That's grenade. We need
> to operate on an all-in-one node, because that's what we have. We're
> using system level installs, because > 50% of our user base does. This
> does mean all of everything is getting upgraded all at once in the
> normal pip install -U flow, because the moment you start replacing
> system level libraries, bets are kind of off for services that are still
> running.
>
> But, if we exploit every weakness of the testing to figure out exactly
> the minimum we need to make the testing pass, we stop trying to do the
> thing we set out to do: painless upgrades.
>
> The theory that rootwrap rules have to be inspected manually and
> adjusted by every deployer during upgrade seems... odd. It's like if you
> tried to upgrade firefox, and it wouldn't start until you adjusted your
> profile manually.
>
> So we are not aiming for A, we're actually aiming much higher. But
> testing, consistently, that much higher bar is a thing we can't easily
> do. So the structure of the testing for our offline upgrades, with the
> policy rules about what we should not change, is our check and balance
> for getting to properly seamless fully online upgrades.
>
> -Sean
>
> --
> Sean Dague
> http://dague.net
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
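
For anyone following along, the N / N+1 / N+2 deprecation phase Sean describes above could be sketched roughly like this. It is only an illustration under my own assumptions: `read_option`, the stage constants and the option names are hypothetical, and a plain dict stands in for a parsed config file. It is not how oslo.config or rootwrap actually implement this.

```python
import warnings

# Hypothetical stage labels mirroring the N / N+1 / N+2 wording in the thread.
WORKING = "N"
WARNING = "N+1"
REMOVED = "N+2"


def read_option(config, old_name, new_name, stage):
    """Return the value of an option being renamed from old_name to new_name.

    config is a plain dict standing in for a parsed config file (an
    assumption made for this sketch).
    """
    if new_name in config:
        # A config that already uses the new name is unaffected at any stage.
        return config[new_name]
    if old_name not in config:
        raise KeyError(new_name)
    if stage == WORKING:
        # Release N: the old name still works silently.
        return config[old_name]
    if stage == WARNING:
        # Release N+1: the old name still works, but the operator is warned.
        warnings.warn(
            f"option {old_name!r} is deprecated, use {new_name!r}",
            DeprecationWarning,
            stacklevel=2,
        )
        return config[old_name]
    # Release N+2: the old name is no longer honoured.
    raise KeyError(f"option {old_name!r} was removed, use {new_name!r}")
```

With something shaped like this, a config file written for N keeps working after the code moves to N+1, so the config fix can happen in config management after the upgrade, and only an unfixed config breaks at N+2.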