Hi folks,

I see there is significant interest in a Neutron upgrade strategy. I suggest we meet at the summit on Friday during the 'unplugged' track in a small group and come up with a plan for Mitaka and beyond. I'm starting to think that the work ahead of us is quite enormous, and some coordination is due. Maybe we'll need to form a dedicated subteam to track the effort in Mitaka.
I started an etherpad to track upgrade strategy discussions at:
https://etherpad.openstack.org/p/neutron-upgrade-strategy

Note that Artur already added the track to the unplugged etherpad:
https://etherpad.openstack.org/p/mitaka-neutron-unplugged-track

See you in Tokyo,
Ihar

> On 19 Oct 2015, at 11:03, Miguel Angel Ajo <mangel...@redhat.com> wrote:
>
> Rossella Sblendido wrote:
>> Hello Artur,
>>
>> thanks for starting this thread. See inline please.
>>
>> On 10/15/2015 05:23 PM, Ihar Hrachyshka wrote:
>>> Hi Artur,
>>>
>>> thanks a lot for caring about upgrades!
>>>
>>> There are a lot of good points below. As you noted, surprisingly, we seem
>>> to have rolling upgrades working for the RPC layer. Before we go into
>>> complicating the database workflow by doing the heavy lifting of the
>>> oslo.versionedobjects transition, I would like us to spend cycles on
>>> making sure rolling upgrades work not just by surprise, but are also
>>> covered by appropriate gating (read: grenade).
>>
>> +1 agreed that the first step is to have test coverage, then we can go on
>> improving the process :)
>>
>>>
>>> I also feel that upgrades are in many ways not only a technical issue,
>>> but a cultural one too. Reviewers should be aware of all the moving
>>> parts, and of how a seemingly innocent change can break the flow.
>>> That's why I plan to start a devref page specifically about upgrades,
>>> where we could lay out which scenarios we should support and which we
>>> should not (e.g. we have plenty of compatibility code in agents that
>>> handles the old-controller scenario, which should not be supported);
>>> how all the pieces interact and behave in transition; and what to look
>>> for during reviews. Hopefully, once such a page is up and read by folks,
>>> we will be able to have a more meaningful conversation about our upgrade
>>> strategy.
>>>
>>>> On 14 Oct 2015, at 20:10, Korzeniewski, Artur
>>>> <artur.korzeniew...@intel.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I would like to gather all upgrade activities in Neutron in one place,
>>>> in order to summarize the current status and future activities on
>>>> rolling upgrades in Mitaka.
>>>>
>>>
>>> If you think it's worth it, we can start a new etherpad page to gather
>>> upgrade ideas and things to do.
>>>
>>>>
>>>> 1. RPC versioning
>>>>
>>>> a. It is already implemented in Neutron.
>>>>
>>>> b. TODO: to have rolling upgrades, we have to implement RPC version
>>>> pinning in the config.
>>>>
>>>> i. I'm not a big fan of this solution, but we can work out a better
>>>> idea if needed.
>>>
>>> As Dan pointed out, and as I think Miguel was thinking about, we can
>>> have the pin defined by the agents in the cluster. Actually, we can
>>> have a per-agent pin.
>>
>> I am not a big fan either, mostly because pinning is a manual task.
>> Anyway, looking at the patch Dan linked,
>> https://review.openstack.org/#/c/233289/ ... if we remove the manual
>> step I can become a fan of this approach :)
>>
> Yes, the minimum implementation we could agree on initially was pinning.
> Direct requests for objects from agents to neutron-server include the
> requested version, so that's always OK; the complicated part is the
> notification of object changes via fanout.
>
> For that case, I'm thinking of including the supported object versions
> in agent status reports, so neutron-server can decide at runtime which
> versions to send (in some cases it may need to send several versions in
> parallel). I'm long overdue to upload the strategy to the rpc callbacks
> devref, but it will be along those lines.
>
>>>
>>>>
>>>> c. Possible unit/functional tests to catch RPC version
>>>> incompatibilities between RPC revisions.
>>>>
>>>> d. TODO: multi-node grenade job to have rolling upgrades covered in CI.
>>>
>>> That is not something we can cover at the unit or functional test level.
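The agent-reported versioning idea above could work roughly like this: agents advertise which object versions they understand in their status reports, and the server derives the set of serializations a fanout notification needs. A minimal sketch; the report format, field names, and version numbers here are all made up for illustration, not the real Neutron agent report:

```python
def versions_to_send(agent_reports):
    """Return the distinct object versions needed to reach every agent.

    agent_reports: list of dicts like
    {"host": "compute-1", "qos_policy_version": "1.0"}.
    Sending one payload per distinct version covers a mid-upgrade
    cluster without any manual pin in the config.
    """
    return sorted({r["qos_policy_version"] for r in agent_reports})

reports = [
    {"host": "compute-1", "qos_policy_version": "1.0"},  # not yet upgraded
    {"host": "compute-2", "qos_policy_version": "1.1"},
]
versions_to_send(reports)  # -> ["1.0", "1.1"]: two parallel payloads
```

Once every agent reports the newest version, the set collapses to a single entry and the parallel sends stop on their own, which removes the manual unpinning step Rossella objects to.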
>>>
>>> As you mentioned, we already have the grenade project, which is designed
>>> to test upgrades. To validate RPC compatibility on rolling upgrade we
>>> would need a so-called 'partial' job (where different components run
>>> different versions; in the case of Neutron, that would mean a new
>>> controller and old agents). Such a job is present in the nova gate and
>>> validates RPC compatibility.
>>>
>>> As far as I know, Russell Bryant was looking into introducing the job
>>> for neutron, but was blocked by the ongoing grenade refactoring to
>>> support partial upgrades 'the right way' (using multinode setups). I
>>> think we should check with the grenade folks on that; I have heard the
>>> start of Mitaka was the ETA for this work to complete.
>>>
>>>>
>>>> 2. Message content versioning – versioned objects
>>>>
>>>> a. TODO: implement oslo.versionedobjects in the Mitaka cycle. The
>>>> interesting entities to implement: network, subnet, port, security
>>>> groups…
>>>
>>> Though we haven't touched base Neutron resources in Liberty, we
>>> introduced an oslo.versionedobjects-based NeutronObject class during
>>> Liberty as part of the QoS effort. I plan to expand on that work during
>>> Mitaka.
> ++
>>>
>>> The existing code for QoS resources can be found at:
>>>
>>> https://github.com/openstack/neutron/tree/master/neutron/objects
>>>
>>>>
>>>> b. Will OVO have an impact on vendor plugins?
>>>
>>> It surely can have a significant impact, but hopefully the dict compat
>>> layer should make the transition smoother:
>>>
>>> https://github.com/openstack/neutron/blob/master/neutron/objects/base.py#L50
>
> Correct.
>>>
>>>>
>>>> c. Be strict about changes to versioned objects in code review: any
>>>> change in object structure should increment the minor
>>>> (backward-compatible) or major (breaking change) version.
>>>
>>> That's assuming we have a clear mapping of objects onto current RPC
>>> interfaces, which is not obvious.
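The review discipline in point 2.c can be illustrated with a toy object that mimics the oslo.versionedobjects pattern (this is a hypothetical sketch, not the actual NeutronObject code; the object name, fields, and version history are invented). Adding a field bumps the minor version, and a compatibility hook strips it when serializing for an older reader:

```python
class PortObject:
    # Version history (what a reviewer checks on every structural change):
    # 1.0: initial version
    # 1.1: added 'qos_policy_id' (backward-compatible -> minor bump)
    VERSION = "1.1"

    def __init__(self, **fields):
        self.fields = dict(fields)

    def obj_make_compatible(self, primitive, target_version):
        """Downgrade a serialized dict for a reader at target_version."""
        if target_version < (1, 1):
            # Old readers do not know about QoS; drop the new field.
            primitive.pop("qos_policy_id", None)
        return primitive

    def obj_to_primitive(self, target_version=None):
        primitive = dict(self.fields)
        if target_version is not None:
            ver = tuple(int(p) for p in target_version.split("."))
            primitive = self.obj_make_compatible(primitive, ver)
        return primitive

port = PortObject(id="p1", name="vm-port", qos_policy_id="qos-1")
port.obj_to_primitive("1.0")  # -> {'id': 'p1', 'name': 'vm-port'}
```

A removal or type change of an existing field could not be hidden this way, which is why it forces a major bump instead.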
>>> Another problem we would need to solve is core resource extensions
>>> (currently available in ml2 only), like qos or port_security, that
>>> modify resources based on controller configuration.
>>>
>>>>
>>>> d. Indirection API – messages in a newer format should be translated
>>>> to the older version by neutron-server.
>>>
>>> For QoS, we used a new object-agnostic subscriber mechanism to
>>> propagate changes applied to QoS objects to agents:
>>> http://docs.openstack.org/developer/neutron/devref/rpc_callbacks.html
>>>
>>> It is already expected to downgrade objects based on agent version
>>> (note it's not implemented yet, but will surely be ready during Mitaka):
>>>
>>> https://github.com/openstack/neutron/blob/master/neutron/api/rpc/handlers/resources_rpc.py#L142
> Yes, that's exactly what I was talking about above. It has object
> retrieval for agents, where they can specify a version, but
> subscription/notifications is the complicated part.
>
>
>>>
>>>>
>>>> 3. Database migration
>>>>
>>>> a. Online schema migration was done in the Liberty release; is there
>>>> any work left to do?
>>>
>>> Nothing specific, maybe a bug or two here and there.
>>>
>>>>
>>>> b. TODO: online data migration to be introduced in the Mitaka cycle.
>>>>
>>>> i. Online data migration can be done during normal operation on the
>>>> data.
>>>>
>>>> ii. There should also be a script to invoke the data migration in the
>>>> background.
>>>>
>>>> c. Currently the contract phase is doing the data migration. But since
>>>> the contract phase should be run offline, we should move the data
>>>> migration to a preceding step. Also, the contract phase should be
>>>> blocked if there is still relevant data in removed entities.
>>>
>>> Yes, we definitely need a stop mechanism first, then play with data
>>> migrations. I don't think we can consider data migration before we have
>>> a way to hide the bloody migration details behind abstract resources
>>> (read: versioned objects).
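The server-side downgrade that resources_rpc.py is expected to grow could look roughly like this: the agent states the version it understands, and the server, which runs the newest code, applies per-version downgrade steps to the serialized object before handing it back. A hedged sketch only; the resource, field names, and downgrade table are invented, and the real mechanism will live behind the versioned-objects API rather than on raw dicts:

```python
# For each (major, minor) version, how to turn a primitive of that
# version into the previous minor one. Here 1.1 added 'dscp_marking'.
_DOWNGRADERS = {
    (1, 1): lambda p: {k: v for k, v in p.items() if k != "dscp_marking"},
}

def downgrade(primitive, current=(1, 1), target=(1, 0)):
    """Walk the version chain downward, applying each downgrade step."""
    for minor in range(current[1], target[1], -1):
        step = _DOWNGRADERS.get((current[0], minor))
        if step is not None:
            primitive = step(primitive)
    return primitive

policy = {"id": "qos-1", "name": "gold", "dscp_marking": 26}
downgrade(policy)  # -> {'id': 'qos-1', 'name': 'gold'}
```

This is trivial for the pull path, where the requested version rides along with the request; the fanout path needs the agent-report bookkeeping discussed earlier to know which targets to produce.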
>>> Realistically, I would consider data migration too far off at the
>>> moment to list as a todo step. But we should definitely look forward
>>> to it.
>>>
>>>>
>>>> i. The contract phase can be executed online if all-new code is
>>>> running in the setup.
>>>
>>> I am not sure how that's possible. Do you think it's realistic to
>>> expect the controller to resolve all the checks the database usually
>>> does (constraints?) while the schema is not enforced?
>>>
>>>>
>>>> d. The other strategy is to never drop tables, alter names, or remove
>>>> columns from the DB – what's in, it's in. We should pay more attention
>>>> in code reviews, merge only additive changes, and avoid questionable
>>>> DB modifications.
>>>>
>>>
>>> I don't like that approach. It suggests there is no way back if we
>>> screw something up. Having a short contract phase which is offline
>>> seems like a reasonable approach to me. Anyway, it can be reconsidered
>>> after we have solved the elephant in the room (the data migration
>>> problem).
>>>
>>>> e. The Neutron server should be updated first, in order to do data
>>>> translation between the old format and the new schema. When doing
>>>> this, we can be sure that old data will not be inserted into old DB
>>>> structures.
>>>>
>>>
>>> To my taste, that ^ is the clearest way to go.
>
> Correct.
>>>
>>>>
>>>>
>>>> I have performed a manual Kilo to Liberty upgrade, both in an
>>>> operational manner and in code review of the RPC APIs. All is working
>>>> fine.
>>>>
>>>> We can have some discussion in the cross-project session [7], or we
>>>> can also review any issues with the Neutron upgrade in Friday's
>>>> unplugged session [8].
>>>
>>> I will be more than happy to sit down with folks interested in our
>>> upgrade story and write up a plan for Mitaka.
>>
>> I am interested too and I am based in Italy (same time zone, yuppie)
>>
>> cheers,
>>
>> Rossella
>
> Ping me for discussion, it's a topic I'm interested in too.
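The online data migration plus contract pre-check described in points 3.b and 3.c can be sketched in a few lines. This is an illustrative toy using sqlite and made-up table/column names, not Neutron's alembic machinery: a background script copies legacy data in small batches while the server keeps serving, and the offline contract step refuses to run while legacy data remains:

```python
import sqlite3

def migrate_batch(conn, batch=100):
    """Move up to `batch` rows of legacy data; return rows migrated."""
    rows = conn.execute(
        "SELECT id, old_name FROM ports "
        "WHERE old_name IS NOT NULL LIMIT ?", (batch,)).fetchall()
    for row_id, value in rows:
        conn.execute(
            "UPDATE ports SET new_name = ?, old_name = NULL WHERE id = ?",
            (value, row_id))
    conn.commit()
    return len(rows)

def run_online_migration(conn):
    # Small batches keep each transaction short, so normal API traffic
    # is not blocked while the migration runs in the background.
    while migrate_batch(conn):
        pass  # a real script would sleep/throttle between batches

def contract_precheck(conn):
    """Block the contract phase while legacy data remains (point 3.c)."""
    left = conn.execute(
        "SELECT COUNT(*) FROM ports "
        "WHERE old_name IS NOT NULL").fetchone()[0]
    if left:
        raise RuntimeError("%d rows still hold legacy data" % left)
```

Only after `contract_precheck` passes would the offline contract migration drop the `old_name` column, which keeps the destructive step short and verifiable.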
>>
>>>
>>> Please ping me on IRC (ihrachys), and we will think about how we can
>>> sync effectively and push the effort forward. (BTW I am located in the
>>> Czech Republic, so we should be in the same time zone.)
>>>
>>> Regards,
>>> Ihar
>>>
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev