Re: [openstack-dev] [nova][puppet][tripleo][kolla][fuel] Let's talk nova cell v2 deployments and upgrades

Matt Riedemann Fri, 13 Jan 2017 12:11:56 -0800

On 1/13/2017 11:43 AM, Alex Schultz wrote:

Ahoy folks,

So we've been running into issues with the addition of the cell v2
setup as part of the requirements for different parts of nova. It was
recommended that we move the conversation to the ML to get a wider
audience. Basically, cell v2 has been working its way into being a
required thing for various parts of nova for the Ocata cycle. We've
hit several issues[0][1] because of this. I put a wider audience than
just nova because deployment tools need to understand how this works
as it impacts anyone installing or upgrading.

What is not clear is the expectation around how to install and
configure cell v2. When we hit the first bug in the upgrade, we
reached out in irc[2] around this and it seemed that there was little
to no documentation around how this stuff all works. There are
mentions of these new commands in the release notes[3] but they are not
clear on the specifics for both the upgrade process and also a fresh
install. We attempted to just run simple_cell_setup in puppet
(and tripleo downstream) because we assumed this would handle all the
things. It's become clear that this is not the case. The latest
bug[1] has shown that we do not have a proper understanding of what it
means to set up cell v2, so I'd like to use this email to start the
conversation as it impacts anyone either installing Ocata fresh or
attempting some sort of Newton -> Ocata upgrade.

Additionally after some conversations today on irc[4], it's also
become clear there is some disconnect around understanding between
nova folks and people who deploy as to how this thing would ideally
work. So, what I would like to know is how should people be
installing and configuring nova cell v2? Specifically, what are the
steps so that the deployment tools and operators can properly handle
these complexities? What are the assumptions being baked into
simple_cell_setup? It seems to assume computes should exist before
the simple cell setup, whereas traditionally computes are the last
thing to be set up (for new installs).

So, help?

Thanks,
-Alex

[0] https://bugs.launchpad.net/tripleo/+bug/1649341
[1] https://bugs.launchpad.net/nova/+bug/1656276
[2] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2016-12-12.log.html#t2016-12-12T17:38:56
[3] http://docs.openstack.org/releasenotes/nova/unreleased.html#id12
[4] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2017-01-13.log.html#t2017-01-13T14:11:37

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Thanks for starting this thread.

First, I want to apologize for the lack of communication and documentation around this. I know this is frustrating. We've been so heads down on getting the changes out and getting devstack/grenade working with this stuff that we haven't taken the time to document it outside of the release notes, which isn't adequate.

Without going into details, there was a major change in personnel this release for working on cells v2, so we've been doing some catch-up and that's part of why things are a bit scatter-brained.

Based on the discussion in IRC this morning, I took some time to try and capture some of the immediate issues/todos/questions in the cells v2 wiki [1].

Documenting this is going to be a priority. We should have something up for review in Nova by next week (like Monday), at least a draft.

I think it's also important to realize (for the nova team) that we've been thinking about cells v2 deployment from an upgrade perspective, which I think is why we had the simple_cell_setup command asserting that you needed computes already registered to map to hosts and cells. As noted, this is not going to work in a fresh install deployment where the control services are set up before the computes. We're working on addressing that in [2].

To compare with how simple_cell_setup works, the recently created 'nova-status upgrade check' command [3] is OK with there being no compute nodes yet (it does fail if you don't have any cell mappings though). It's OK with that because of the fresh install scenario. It doesn't fail but it does report that you need to remember to discover new hosts and map them once they are registered.
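The behavior described above can be sketched in a few lines of Python. This is only an illustration of the decision logic as stated in this email, not the code in nova/cmd/status.py: the names CheckCode, CheckResult and check_cells_v2 are hypothetical, and reporting the "no computes yet" case as a success with a note attached is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class CheckCode(Enum):
    # Hypothetical result levels, used only for this sketch.
    SUCCESS = 0
    FAILURE = 1


@dataclass
class CheckResult:
    code: CheckCode
    details: str = ""


def check_cells_v2(cell_mapping_count: int, compute_node_count: int) -> CheckResult:
    """Model the check described above; the counts are supplied by the caller."""
    if cell_mapping_count == 0:
        # No cell mappings at all: the check fails.
        return CheckResult(CheckCode.FAILURE, "No cell mappings found.")
    if compute_node_count == 0:
        # Fresh install: not a failure, but leave a reminder for the operator.
        return CheckResult(
            CheckCode.SUCCESS,
            "No compute nodes yet; remember to discover and map new hosts "
            "once they are registered.",
        )
    return CheckResult(CheckCode.SUCCESS)
```

Calling check_cells_v2(1, 0) models the fresh install case: the result is not a failure, but its details carry the reminder.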

So for whatever reason the existing commands and code were written under the assumption that you'd first create computes (or already have them) and then map those to a cell, and we need to adjust the tooling for the scenario where you want to create the cell first and map hosts later. I think today grenade and devstack are working the same way as far as setting up cells v2, but eventually I think we're going to want to make grenade do the more specific upgrade path where we expect a compute node to already exist and get mapped, and change the devstack path to use the fresh install scenario where there might not be a compute node when the cell is created, but then map it later after starting the nova-compute service. That will build on top of [2].
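To make the two orderings concrete, here is a toy in-memory model in Python. It does not call nova or nova-manage; the Deployment class, its methods and the require_computes flag are all hypothetical and exist only to show why a setup step that asserts on existing computes works for an upgrade but not for a fresh install.

```python
class Deployment:
    """Toy stand-in for a deployment; every name here is hypothetical."""

    def __init__(self):
        self.cell_created = False
        self.registered_hosts = set()
        self.mapped_hosts = set()

    def register_compute(self, host):
        # Models a nova-compute service starting up and registering itself.
        self.registered_hosts.add(host)

    def create_cell(self, require_computes):
        # require_computes=True models the assumption described above.
        if require_computes and not self.registered_hosts:
            raise RuntimeError("no compute hosts registered yet")
        self.cell_created = True

    def discover_hosts(self):
        # Map every registered host that is not mapped to the cell yet.
        if not self.cell_created:
            raise RuntimeError("create a cell before mapping hosts")
        newly_mapped = self.registered_hosts - self.mapped_hosts
        self.mapped_hosts |= newly_mapped
        return sorted(newly_mapped)


def upgrade_flow():
    # Upgrade: computes already exist, then the cell is created and hosts mapped.
    deployment = Deployment()
    deployment.register_compute("compute-1")
    deployment.create_cell(require_computes=True)
    deployment.discover_hosts()
    return deployment


def fresh_install_flow():
    # Fresh install: cell first, computes later, then a discovery pass.
    deployment = Deployment()
    deployment.create_cell(require_computes=False)
    deployment.register_compute("compute-1")
    deployment.discover_hosts()
    return deployment
```

Both flows end with the host mapped; the only difference is whether cell creation is allowed to run before any compute has registered.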

We've also kicked around the idea of the computes auto-registering with a cell when they are created, but we don't know which cell (if there are multiple) they'd register with, and we've wanted to avoid an up-call from the computes to the API to do that sort of thing. But that would take away the manual step of running the discover_hosts command to map the compute node to a cell.

The way things are working right now when getting instances at the API layer is we look for a cell mapping and if the instance isn't in a cell, we fall back to the main database (no cells configured). A warning gets logged when that happens. I think we can live with that for now in Ocata and then when we get into Pike start considering that an error but talk about auto-registration, i.e. maybe we auto-register with a single cell and it's only an error if there are multiple cells and we don't know which to register with. Or we mark a specific cell as the 'default' or 'staging' cell that all new compute nodes go into, and then provide a CLI or API to move hosts/instances to another cell if needed. Anyway, those are longer-term concerns/ideas right now.
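As a rough illustration of the two ideas above, the following Python sketch models the current lookup fallback and one possible auto-registration policy. None of this is nova code: database_for_instance, pick_cell_for_new_host, MAIN_DATABASE and the plain dict/list inputs are hypothetical, and the policy shown is just one reading of the options being discussed.

```python
import logging

LOG = logging.getLogger(__name__)

# Hypothetical marker for "the main database, no cells configured".
MAIN_DATABASE = "main"


def database_for_instance(instance_uuid, instance_mappings):
    """Return the cell holding the instance, falling back to the main database.

    instance_mappings is a plain dict of instance UUID -> cell name.
    """
    cell = instance_mappings.get(instance_uuid)
    if cell is None:
        # Current behavior as described: warn and fall back, do not fail.
        LOG.warning("Instance %s is not mapped to a cell; using the main "
                    "database", instance_uuid)
        return MAIN_DATABASE
    return cell


def pick_cell_for_new_host(cells, default_cell=None):
    """Choose a cell for a newly registered compute host.

    A designated default (or 'staging') cell wins; otherwise a single
    existing cell is unambiguous; anything else is an error.
    """
    if default_cell is not None:
        if default_cell not in cells:
            raise ValueError("default cell %r does not exist" % default_cell)
        return default_cell
    if len(cells) == 1:
        return cells[0]
    raise LookupError("expected exactly one cell or a designated default")
```

With a single cell the host lands there automatically; with several cells and no default the sketch raises, which mirrors the "only an error if there are multiple cells" idea.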

First priority right now is going to be fixing simple_cell_setup [2] and documentation, including not only the deployment steps, but also documenting the various commands themselves, including inputs/outputs, assumptions made and in what situations you can or should run them.


[1] https://wiki.openstack.org/wiki/Nova-Cells-v2
[2] https://review.openstack.org/#/c/420079/

[3] https://github.com/openstack/nova/blob/31774fa2285a29e7b711a90586d120aaab0b627d/nova/cmd/status.py#L120-L173


--

Thanks,

Matt Riedemann

