Re: [Openstack-operators] [openstack-dev] [upgrades][skip-level][leapfrog] - RFC - Skipping releases when upgrading

2017-05-26 Thread David Moreau Simard
I've mentioned this elsewhere but writing here for posterity...

Making N to N+1 upgrades seamless and reliable is already challenging
today, which is one of the reasons why people aren't upgrading in the
first place.
Making N to N+1 upgrades work as well as possible already puts a great
strain on developers and resources, think about the testing and CI
involved in making sure things really work.

My opinion is that if upgrades were made out to be a simple, easy and
seamless operation, it wouldn't be that much of a problem to upgrade
from N to N+3 by upgrading from release to release (three times) until
you've caught up.
But then, if upgrades are awesome, maybe operators won't be lagging 3
releases behind anymore.
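
To make the "release to release (three times)" idea concrete, here is a minimal sketch that expands a jump into its single-release hops. The release list is the real ordering of OpenStack release names around this time; the function name `upgrade_hops` is only illustrative and not part of any existing tool.

```python
# Ordered OpenStack release names around the time of this thread.
RELEASES = ["juno", "kilo", "liberty", "mitaka", "newton", "ocata", "pike"]


def upgrade_hops(current, target, releases=RELEASES):
    """Return the single-release (N -> N+1) hops from current to target."""
    start = releases.index(current)
    end = releases.index(target)
    if end < start:
        raise ValueError("target release is older than current release")
    # Each hop is one supported N -> N+1 upgrade.
    return [(releases[i], releases[i + 1]) for i in range(start, end)]


if __name__ == "__main__":
    for src, dst in upgrade_hops("kilo", "newton"):
        print(f"upgrade {src} -> {dst}")
```

For Kilo to Newton this yields three hops, which is the N+3 case discussed in this thread.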


David Moreau Simard
Senior Software Engineer | Openstack RDO

dmsimard = [irc, github, twitter]


On Thu, May 25, 2017 at 9:55 PM, Carter, Kevin wrote:
> Hello Stackers,
>
> As I'm sure many of you know there was a talk about doing "skip-level"[0]
> upgrades at the OpenStack Summit which quite a few folks were interested in.
> Today many of the interested parties got together and talked about doing
> more of this in a formalized capacity. Essentially we're looking for cloud
> upgrades with the possibility of skipping releases, ideally enabling an N+3
> upgrade. In our opinion it would go a very long way to solving cloud
> consumer and deployer problems if folks didn't have to deal with an upgrade
> every six months. While we talked about various issues and some of the
> current approaches being kicked around we wanted to field our general chat
> to the rest of the community and request input from folks that may have
> already fought such a beast. If you've taken on an adventure like this how
> did you approach it? Did it work? Any known issues, gotchas, or things folks
> should be generally aware of?
>
>
> During our chat today we generally landed on an in-place upgrade with known
> API service downtime and little (at least as little as possible) data plane
> downtime. The process discussed was basically:
> a1. Create utility "thing-a-me" (container, venv, etc) which contains the
> required code to run a service through all of the required upgrades.
> a2. Stop service(s).
> a3. Run migration(s)/upgrade(s) for all releases using the utility
> "thing-a-me".
> a4. Repeat for all services.
>
> b1. Once all required migrations are complete run a deployment using the
> target release.
> b2. Ensure all services are restarted.
> b3. Ensure cloud is functional.
> b4. profit!
>
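The a1-a4 / b1-b4 outline above can be sketched as an ordered plan. This is only an illustration of the sequencing, not the OpenStack-Ansible implementation; the function name `build_leap_plan` and the step wording are hypothetical, and nothing is actually executed.

```python
def build_leap_plan(services, releases):
    """Order the steps of the a1-a4 / b1-b4 outline for a multi-release leap.

    services: service names, e.g. ["keystone", "nova"]
    releases: releases to pass through, oldest first; the last is the target
    """
    plan = []
    # a4: the outer loop repeats a1-a3 for every service.
    for service in services:
        # a1: one utility environment (container, venv, ...) per service.
        plan.append(("a1", f"build utility environment for {service}"))
        # a2: stop the service before touching its database.
        plan.append(("a2", f"stop {service}"))
        # a3: run each release's migrations in order.
        for release in releases:
            plan.append(("a3", f"run {service} migrations for {release}"))
    # b1-b3: only after every service has been migrated.
    target = releases[-1]
    plan.append(("b1", f"deploy {target}"))
    plan.append(("b2", "restart all services"))
    plan.append(("b3", "verify the cloud is functional"))
    return plan


if __name__ == "__main__":
    for step, action in build_leap_plan(["keystone", "nova"],
                                        ["liberty", "mitaka", "newton"]):
        print(step, action)
```

Returning the plan as data rather than running it keeps the ordering reviewable before any service is stopped.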
> Obviously, there's a lot of hand waving here but such a process is being
> developed by the OpenStack-Ansible project[1]. Currently, the OSA tooling
> will allow deployers to upgrade from Juno/Kilo to Newton using Ubuntu 14.04.
> While this has worked in the lab, it's early in development (YMMV). Also,
> the tooling is not very general purpose or portable outside of OSA but it
> could serve as a guide or just a general talking point. Are there other
> tools out there that solve for the multi-release upgrade? Are there any
> folks that might want to share their expertise? Maybe a process outline that
> worked? Best practices? Do folks believe tools are the right way to solve
> this or would comprehensive upgrade documentation be better for the general
> community?
>
> As most of the upgrade issues center around database migrations, we
> discussed some of the potential pitfalls at length. One approach was to
> roll-up all DB migrations into a single repository and run all upgrades for
> a given project in one step. Another was to simply have multiple python
> virtual environments and just run in-line migrations from a version specific
> venv (this is what the OSA tooling does). Does one way work better than the
> other? Any thoughts on how this could be better? Would having N+2/3
> migrations addressable within the projects, even if they're not tested any
> longer, be helpful?
>
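A rough sketch of the "version specific venv" approach: build one migration command per release, each pointing at that release's own virtual environment. `nova-manage db sync` and `keystone-manage db_sync` are existing commands; the venv directory layout and the function name `venv_migration_commands` are assumptions made for illustration and do not reflect how the OSA tooling is laid out.

```python
from pathlib import PurePosixPath

# Existing schema-migration commands for two projects.
SYNC_COMMANDS = {
    "nova": ["nova-manage", "db", "sync"],
    "keystone": ["keystone-manage", "db_sync"],
}


def venv_migration_commands(service, releases, venv_root="/opt/leap/venvs"):
    """Build one argv per release, each using that release's own venv."""
    tool, *args = SYNC_COMMANDS[service]
    commands = []
    for release in releases:
        # Hypothetical layout: <venv_root>/<service>-<release>/bin/<tool>
        binary = PurePosixPath(venv_root) / f"{service}-{release}" / "bin" / tool
        commands.append([str(binary), *args])
    return commands


if __name__ == "__main__":
    for argv in venv_migration_commands("nova", ["liberty", "mitaka", "newton"]):
        print(" ".join(argv))
```

Each argv could then be handed to `subprocess.run(argv, check=True)` in order, so a failed migration stops the sequence before the next release's code runs.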
> It was our general thought that folks would be interested in having the
> ability to skip releases so we'd like to hear from the community to validate
> our thinking. Additionally, we'd like to get more minds together and see if
> folks are wanting to work on such an initiative, even if this turns into
> nothing more than a co-op/channel where we can "phone a friend". Would it be
> good to try and secure some PTG space to work on this? Should we try and
> get a working group going?
>
> If you've made it this far, please forgive my stream of consciousness. I'm
> trying to ask a lot of questions and distill long form conversation(s) into
> as little text as possible all without writing a novel. With that said, I
> hope this finds you well, I look forward to hearing from (and working with)
> you soon.
>
> [0] https://etherpad.openstack.org/p/BOS-forum-skip-level-upgrading
> [1] https://github.com/openstack/openstack-ansible-ops/tree/master/leap-upgrades
>
>
> --
>
> Kevin Carter
> IRC: Cloudnull
>
> 

Re: [Openstack-operators] [openstack-dev] [upgrades][skip-level][leapfrog] - RFC - Skipping releases when upgrading

2017-05-26 Thread Doug Hellmann
Excerpts from Dan Smith's message of 2017-05-26 07:56:02 -0700:
> > As most of the upgrade issues center around database migrations, we
> > discussed some of the potential pitfalls at length. One approach was to
> > roll-up all DB migrations into a single repository and run all upgrades
> > for a given project in one step. Another was to simply have multiple
> > python virtual environments and just run in-line migrations from a
> > version specific venv (this is what the OSA tooling does). Does one way
> > work better than the other? Any thoughts on how this could be better?
> 
> IMHO, and speaking from a Nova perspective, I think that maintaining a
> separate repo of migrations is a bad idea. We occasionally have to fix a
> migration to handle a case where someone is stuck and can't move past a
> certain revision due to some situation that was not originally
> understood. If you have a separate copy of our migrations, you wouldn't
> get those fixes. Nova hasn't compacted migrations in a while anyway, so
> there's not a whole lot of value there I think.
> 
> The other thing to consider is that our _schema_ migrations often
> require _data_ migrations to complete before moving on. That means you
> really have to move to some milestone version of the schema, then
> move/transform data, and then move to the next milestone. Since we
> manage those according to releases, those are the milestones that are
> most likely to be successful if you're stepping through things.
> 
> I do think that the idea of being able to generate a small utility
> container (using the broad sense of the word) from each release, and
> using those to step through N, N+1, N+2 to arrive at N+3 makes the most
> sense.
> 
> Nova has offline tooling to push our data migrations (even though the
> command is intended to be runnable online). The concern I would have

It seems like it would be very useful to have a tool like this for
all projects that support online data migrations, no matter if they
are written in python, as DB triggers, or whatever.

> would be over how to push Keystone's migrations mechanically, since I
> believe they moved forward with their proposal to do data migrations in
> stored procedures with triggers. Presumably there is a need for
> something similar to nova's online-data-migrations command which will
> trip all the triggers and provide a green light for moving on?
> 
> In the end, projects support N->N+1 today, so if you're just stepping
> through actual 1-version gaps, you should be able to do as many of those
> as you want and still be running "supported" transitions. There's a lot
> of value in that, IMHO.

+1

> 
> --Dan
> 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [upgrades][skip-level][leapfrog] - RFC - Skipping releases when upgrading

2017-05-26 Thread Dan Smith
> As most of the upgrade issues center around database migrations, we
> discussed some of the potential pitfalls at length. One approach was to
> roll-up all DB migrations into a single repository and run all upgrades
> for a given project in one step. Another was to simply have multiple
> python virtual environments and just run in-line migrations from a
> version specific venv (this is what the OSA tooling does). Does one way
> work better than the other? Any thoughts on how this could be better?

IMHO, and speaking from a Nova perspective, I think that maintaining a
separate repo of migrations is a bad idea. We occasionally have to fix a
migration to handle a case where someone is stuck and can't move past a
certain revision due to some situation that was not originally
understood. If you have a separate copy of our migrations, you wouldn't
get those fixes. Nova hasn't compacted migrations in a while anyway, so
there's not a whole lot of value there I think.

The other thing to consider is that our _schema_ migrations often
require _data_ migrations to complete before moving on. That means you
really have to move to some milestone version of the schema, then
move/transform data, and then move to the next milestone. Since we
manage those according to releases, those are the milestones that are
most likely to be successful if you're stepping through things.

I do think that the idea of being able to generate a small utility
container (using the broad sense of the word) from each release, and
using those to step through N, N+1, N+2 to arrive at N+3 makes the most
sense.
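
As a minimal sketch of that stepping, assuming each release offers a way to apply its schema and a way to migrate data in batches: the loop below moves to a release's schema, drains its data migrations, and only then moves to the next release. The callables `sync_schema` and `migrate_data_batch` are hypothetical stand-ins for project-specific tooling, not real APIs.

```python
def step_through(releases, sync_schema, migrate_data_batch):
    """For each release: apply schema, drain data migrations, then move on.

    sync_schema(release) applies that release's schema migrations.
    migrate_data_batch(release) migrates one batch and returns the number
    of rows it changed; 0 means that release's data migrations are done.
    Both callables are hypothetical stand-ins for project tooling.
    """
    log = []
    for release in releases:
        sync_schema(release)
        log.append((release, "schema"))
        batches = 0
        # Keep going until a batch reports that nothing was left to migrate.
        while migrate_data_batch(release) > 0:
            batches += 1
        log.append((release, f"data ({batches} batches)"))
    return log


if __name__ == "__main__":
    pending = {"liberty": 2, "mitaka": 0, "newton": 1}  # fake batches remaining

    def fake_batch(release):
        if pending[release] == 0:
            return 0
        pending[release] -= 1
        return 1

    for entry in step_through(["liberty", "mitaka", "newton"],
                              lambda release: None, fake_batch):
        print(entry)
```

The point of the ordering is that the next release's schema migration never runs while the previous release still has untransformed data.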

Nova has offline tooling to push our data migrations (even though the
command is intended to be runnable online). The concern I would have
would be over how to push Keystone's migrations mechanically, since I
believe they moved forward with their proposal to do data migrations in
stored procedures with triggers. Presumably there is a need for
something similar to nova's online-data-migrations command which will
trip all the triggers and provide a green light for moving on?

In the end, projects support N->N+1 today, so if you're just stepping
through actual 1-version gaps, you should be able to do as many of those
as you want and still be running "supported" transitions. There's a lot
of value in that, IMHO.

--Dan
