Re: [Openstack-operators] [openstack-dev] [upgrades][skip-level][leapfrog] - RFC - Skipping releases when upgrading
I've mentioned this elsewhere but am writing it here for posterity...

Making N to N+1 upgrades seamless and reliable is already challenging today, which is one of the reasons why people aren't upgrading in the first place. Making N to N+1 upgrades work as well as possible already puts a great strain on developers and resources; think about the testing and CI involved in making sure things really work.

My opinion is that if upgrades were made out to be a simple, easy and seamless operation, it wouldn't be that much of a problem to upgrade from N to N+3 by upgrading from release to release (three times) until you've caught up. But then, if upgrades are awesome, maybe operators won't be lagging 3 releases behind anymore.

David Moreau Simard
Senior Software Engineer | Openstack RDO
dmsimard = [irc, github, twitter]

On Thu, May 25, 2017 at 9:55 PM, Carter, Kevin wrote:
> Hello Stackers,
>
> As I'm sure many of you know there was a talk about doing "skip-level"[0] upgrades at the OpenStack Summit which quite a few folks were interested in. Today many of the interested parties got together and talked about doing more of this in a formalized capacity. Essentially we're looking for cloud upgrades with the possibility of skipping releases, ideally enabling an N+3 upgrade. In our opinion it would go a very long way to solving cloud consumer and deployer problems if folks didn't have to deal with an upgrade every six months. While we talked about various issues and some of the current approaches being kicked around we wanted to field our general chat to the rest of the community and request input from folks that may have already fought such a beast. If you've taken on an adventure like this how did you approach it? Did it work? Any known issues, gotchas, or things folks should be generally aware of?
>
> During our chat today we generally landed on an in-place upgrade with known API service downtime and little (at least as little as possible) data plane downtime. The process discussed was basically:
>
> a1. Create utility "thing-a-me" (container, venv, etc) which contains the required code to run a service through all of the required upgrades.
> a2. Stop service(s).
> a3. Run migration(s)/upgrade(s) for all releases using the utility "thing-a-me".
> a4. Repeat for all services.
>
> b1. Once all required migrations are complete run a deployment using the target release.
> b2. Ensure all services are restarted.
> b3. Ensure cloud is functional.
> b4. profit!
>
> Obviously, there's a lot of hand waving here but such a process is being developed by the OpenStack-Ansible project[1]. Currently, the OSA tooling will allow deployers to upgrade from Juno/Kilo to Newton using Ubuntu 14.04. While this has worked in the lab, it's early in development (YMMV). Also, the tooling is not very general purpose or portable outside of OSA but it could serve as a guide or just a general talking point. Are there other tools out there that solve for the multi-release upgrade? Are there any folks that might want to share their expertise? Maybe a process outline that worked? Best practices? Do folks believe tools are the right way to solve this or would comprehensive upgrade documentation be better for the general community?
>
> As most of the upgrade issues center around database migrations, we discussed some of the potential pitfalls at length. One approach was to roll-up all DB migrations into a single repository and run all upgrades for a given project in one step. Another was to simply have multiple python virtual environments and just run in-line migrations from a version specific venv (this is what the OSA tooling does). Does one way work better than the other? Any thoughts on how this could be better?
> Would having N+2/3 migrations addressable within the projects, even if they're not tested any longer, be helpful?
>
> It was our general thought that folks would be interested in having the ability to skip releases so we'd like to hear from the community to validate our thinking. Additionally, we'd like to get more minds together and see if folks are wanting to work on such an initiative, even if this turns into nothing more than a co-op/channel where we can "phone a friend". Would it be good to try and secure some PTG space to work on this? Should we try and get a working group going?
>
> If you've made it this far, please forgive my stream of consciousness. I'm trying to ask a lot of questions and distill long form conversation(s) into as little text as possible all without writing a novel. With that said, I hope this finds you well, I look forward to hearing from (and working with) you soon.
>
> [0] https://etherpad.openstack.org/p/BOS-forum-skip-level-upgrading
> [1] https://github.com/openstack/openstack-ansible-ops/tree/master/leap-upgrades
>
> --
>
> Kevin Carter
> IRC: Cloudnull
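For anyone trying to picture the a1-b3 outline quoted above, here is a minimal sketch of how the ordering might be expressed as a plan. It only builds a list of step descriptions and runs nothing. The function names, the step wording and the `/opt/leap/venvs/...` path layout are all hypothetical and are not taken from the OSA tooling; only the release ordering is a known fact.

```python
from typing import List

# Release order is factual; everything else in this sketch is illustrative.
RELEASES = ["juno", "kilo", "liberty", "mitaka", "newton"]


def releases_between(current: str, target: str) -> List[str]:
    """Return every release after `current`, up to and including `target`."""
    start, end = RELEASES.index(current), RELEASES.index(target)
    if start >= end:
        raise ValueError("target must be newer than current")
    return RELEASES[start + 1:end + 1]


def leap_plan(services: List[str], current: str, target: str) -> List[str]:
    """Build an ordered list of step descriptions following a1-b3."""
    hops = releases_between(current, target)
    steps: List[str] = []
    for service in services:                      # a4: repeat for all services
        steps.append(f"stop {service}")           # a2: stop the service
        for release in hops:                      # a3: one migration run per release
            # Hypothetical per-release venv path (the a1 "thing-a-me").
            steps.append(f"migrate {service} using /opt/leap/venvs/{service}-{release}")
    steps.append(f"deploy {target}")              # b1: deploy the target release
    steps.append("restart all services")          # b2
    steps.append("verify cloud is functional")    # b3
    return steps


if __name__ == "__main__":
    for step in leap_plan(["nova", "keystone"], "kilo", "newton"):
        print(step)
```

Keeping the plan as plain data makes it easy to review before anything is stopped, which seems useful given the known API downtime the outline accepts.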
Re: [Openstack-operators] [openstack-dev] [upgrades][skip-level][leapfrog] - RFC - Skipping releases when upgrading
Excerpts from Dan Smith's message of 2017-05-26 07:56:02 -0700:
> > As most of the upgrade issues center around database migrations, we discussed some of the potential pitfalls at length. One approach was to roll-up all DB migrations into a single repository and run all upgrades for a given project in one step. Another was to simply have multiple python virtual environments and just run in-line migrations from a version specific venv (this is what the OSA tooling does). Does one way work better than the other? Any thoughts on how this could be better?
>
> IMHO, and speaking from a Nova perspective, I think that maintaining a separate repo of migrations is a bad idea. We occasionally have to fix a migration to handle a case where someone is stuck and can't move past a certain revision due to some situation that was not originally understood. If you have a separate copy of our migrations, you wouldn't get those fixes. Nova hasn't compacted migrations in a while anyway, so there's not a whole lot of value there I think.
>
> The other thing to consider is that our _schema_ migrations often require _data_ migrations to complete before moving on. That means you really have to move to some milestone version of the schema, then move/transform data, and then move to the next milestone. Since we manage those according to releases, those are the milestones that are most likely to be successful if you're stepping through things.
>
> I do think that the idea of being able to generate a small utility container (using the broad sense of the word) from each release, and using those to step through N, N+1, N+2 to arrive at N+3 makes the most sense.
>
> Nova has offline tooling to push our data migrations (even though the command is intended to be runnable online).
> The concern I would have

It seems like it would be very useful to have a tool like this for all projects that support online data migrations, no matter if they are written in python, as DB triggers, or whatever.

> would be over how to push Keystone's migrations mechanically, since I believe they moved forward with their proposal to do data migrations in stored procedures with triggers. Presumably there is a need for something similar to nova's online-data-migrations command which will trip all the triggers and provide a green light for moving on?
>
> In the end, projects support N->N+1 today, so if you're just stepping through actual 1-version gaps, you should be able to do as many of those as you want and still be running "supported" transitions. There's a lot of value in that, IMHO.

+1

> --Dan

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
Re: [Openstack-operators] [openstack-dev] [upgrades][skip-level][leapfrog] - RFC - Skipping releases when upgrading
> As most of the upgrade issues center around database migrations, we discussed some of the potential pitfalls at length. One approach was to roll-up all DB migrations into a single repository and run all upgrades for a given project in one step. Another was to simply have multiple python virtual environments and just run in-line migrations from a version specific venv (this is what the OSA tooling does). Does one way work better than the other? Any thoughts on how this could be better?

IMHO, and speaking from a Nova perspective, I think that maintaining a separate repo of migrations is a bad idea. We occasionally have to fix a migration to handle a case where someone is stuck and can't move past a certain revision due to some situation that was not originally understood. If you have a separate copy of our migrations, you wouldn't get those fixes. Nova hasn't compacted migrations in a while anyway, so there's not a whole lot of value there I think.

The other thing to consider is that our _schema_ migrations often require _data_ migrations to complete before moving on. That means you really have to move to some milestone version of the schema, then move/transform data, and then move to the next milestone. Since we manage those according to releases, those are the milestones that are most likely to be successful if you're stepping through things.

I do think that the idea of being able to generate a small utility container (using the broad sense of the word) from each release, and using those to step through N, N+1, N+2 to arrive at N+3 makes the most sense.

Nova has offline tooling to push our data migrations (even though the command is intended to be runnable online). The concern I would have would be over how to push Keystone's migrations mechanically, since I believe they moved forward with their proposal to do data migrations in stored procedures with triggers.
Presumably there is a need for something similar to nova's online-data-migrations command which will trip all the triggers and provide a green light for moving on?

In the end, projects support N->N+1 today, so if you're just stepping through actual 1-version gaps, you should be able to do as many of those as you want and still be running "supported" transitions. There's a lot of value in that, IMHO.

--Dan
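To make the "schema milestone, then data, then next milestone" point concrete, here is a toy model of that gating. It is only a sketch under stated assumptions: the `Database` class, the function names and the pending-row counts are invented for illustration, and none of this is Nova's or Keystone's actual migration code. The batch loop stands in for re-running a data migration command until nothing is left, which is the "green light" for moving on.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Database:
    """Toy stand-in for a service database (hypothetical, not a real API)."""
    schema_release: str
    pending_data_migrations: int = 0
    log: List[str] = field(default_factory=list)


def schema_sync(db: Database, release: str, new_pending: int) -> None:
    """Move the schema to `release`; refuse if older data is not yet migrated."""
    if db.pending_data_migrations:
        raise RuntimeError(
            f"{db.pending_data_migrations} rows still need data migration "
            f"at {db.schema_release}; cannot move to {release}"
        )
    db.schema_release = release
    db.pending_data_migrations = new_pending  # rows this release must transform
    db.log.append(f"schema:{release}")


def run_data_migrations(db: Database, batch_size: int = 50) -> None:
    """Migrate rows in batches until none remain (the green light)."""
    while db.pending_data_migrations > 0:
        done = min(batch_size, db.pending_data_migrations)
        db.pending_data_migrations -= done
    db.log.append(f"data:{db.schema_release}")


def step_through(db: Database, hops: List[Tuple[str, int]]) -> None:
    """Walk N -> N+1 -> ... one supported transition at a time."""
    for release, pending in hops:
        schema_sync(db, release, pending)
        run_data_migrations(db)


if __name__ == "__main__":
    db = Database(schema_release="N")
    step_through(db, [("N+1", 120), ("N+2", 0), ("N+3", 30)])
    print(db.log)
```

The check in `schema_sync` is the part a rolled-up, single-step migration would lose: each hop stays an ordinary N to N+1 transition, and the next schema change cannot start while data from the previous milestone is still waiting.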