Hi folks,

I see there is significant interest in neutron upgrade strategy. I suggest we 
meet at the summit on Friday during the ‘unplugged’ track in a small group and 
come up with a plan for Mitaka and beyond. I am starting to think that the work 
ahead of us is quite enormous, and some coordination is in order. Maybe we’ll 
need to form a dedicated subteam to track the effort in Mitaka.

I started an etherpad to track upgrade strategy discussions at:

https://etherpad.openstack.org/p/neutron-upgrade-strategy

Note that Artur already added the track to the unplugged etherpad:

https://etherpad.openstack.org/p/mitaka-neutron-unplugged-track

See you in Tokyo,
Ihar

> On 19 Oct 2015, at 11:03, Miguel Angel Ajo <mangel...@redhat.com> wrote:
> 
> Rossella Sblendido wrote:
>> Hello Artur,
>> 
>> thanks for starting this thread. See inline please.
>> 
>> On 10/15/2015 05:23 PM, Ihar Hrachyshka wrote:
>>> Hi Artur,
>>> 
>>> thanks a lot for caring about upgrades!
>>> 
>>> There are a lot of good points below. As you noted, surprisingly, we seem 
>>> to have rolling upgrades working for the RPC layer. Before we go into 
>>> complicating the database workflow with the oslo.versionedobjects 
>>> transition heavy lifting, I would like us to spend cycles on making sure 
>>> rolling upgrades work not just by accident, but are also covered by 
>>> appropriate gating (I mean grenade).
>> 
>> +1, agreed that the first step is to have test coverage; then we can go on 
>> improving the process :)
>> 
>>> 
>>> I also feel that upgrades are in lots of ways not only a technical issue, 
>>> but a cultural one too. Reviewers need to be aware of all the moving 
>>> parts, and of how a seemingly innocent change can break the flow. 
>>> That’s why I plan to start a devref page specifically about upgrades, 
>>> where we could lay out which scenarios we should support and which we 
>>> should not (e.g. we have plenty of compatibility code in agents to handle 
>>> the old-controller scenario, which should not be supported); how all the 
>>> pieces interact and behave during a transition; and what to look for 
>>> during reviews. Hopefully, once such a page is up and read by folks, we 
>>> will be able to have a more meaningful conversation about our upgrade 
>>> strategy.
>>> 
>>>> On 14 Oct 2015, at 20:10, Korzeniewski, Artur 
>>>> <artur.korzeniew...@intel.com> wrote:
>>>> 
>>>> Hi all,
>>>> 
>>>> I would like to gather all upgrade activities in Neutron in one place, in 
>>>> order to summarize the current status and future activities on rolling 
>>>> upgrades in Mitaka.
>>>> 
>>> 
>>> If you think it’s worth it, we can start up a new etherpad page to gather 
>>> upgrade ideas and things to do.
>>> 
>>>> 
>>>> 
>>>> 1.      RPC versioning
>>>> 
>>>> a.      It is already implemented in Neutron.
>>>> 
>>>> b.      TODO: To have rolling upgrades we have to implement RPC version 
>>>> pinning in the config.
>>>> 
>>>>         i.      I'm not a big fan of this solution, but we can work out a 
>>>> better idea if needed.
>>> 
>>> As Dan pointed out, and as I think Miguel was also considering, we can 
>>> have the pin defined by the agents in the cluster. Actually, we can have a 
>>> per-agent pin.
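>>> 
>>> Just to illustrate the conf-based pinning idea (a minimal sketch; the 
>>> option name and group are made up, not an actual neutron option), 
>>> oslo.messaging already supports capping outgoing RPC versions:
>>> 
>>>   from oslo_config import cfg
>>>   import oslo_messaging
>>> 
>>>   opts = [cfg.StrOpt('agent_rpc_pin',
>>>                      help='Cap agent-bound RPC messages at this version '
>>>                           'during a rolling upgrade, e.g. "1.4".')]
>>>   cfg.CONF.register_opts(opts, group='upgrade_levels')
>>> 
>>>   def get_agent_client(transport, topic):
>>>       target = oslo_messaging.Target(topic=topic, version='1.0')
>>>       # version_cap makes the client refuse to send anything newer than
>>>       # the pin, so not-yet-upgraded agents keep understanding us.
>>>       return oslo_messaging.RPCClient(
>>>           transport, target,
>>>           version_cap=cfg.CONF.upgrade_levels.agent_rpc_pin)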
>> 
>> I am not a big fan either, mostly because the pinning is a manual task. 
>> Anyway, looking at the patch Dan linked, 
>> https://review.openstack.org/#/c/233289/ ... if we remove the manual step I 
>> can become a fan of this approach :)
>> 
> Yes, the minimum implementation we could initially agree on was pinning. A 
> direct request for objects from agents to neutron-server includes the 
> requested version, so that's always OK; the complicated part is the 
> notification of object changes via fanout.
> 
> In that case, I am thinking of including the supported object versions in 
> the agent status reports, so the neutron server can decide at runtime which 
> versions to send (in some cases it may need to send several versions in 
> parallel). I'm long overdue to upload the strategy to the rpc callbacks 
> devref, but it will be along those lines.
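> 
> As a sketch of what the server side of that could look like (purely 
> illustrative, the helper name is made up and doesn't exist anywhere yet):
> 
>   def versions_to_publish(reported_versions):
>       """reported_versions: object versions ('1.0', '1.2', ...) collected
>       from the agent state reports for one resource type.
> 
>       Returns the distinct versions the server has to fan out in parallel
>       so that every agent gets a payload it understands."""
>       return sorted(set(reported_versions),
>                     key=lambda v: tuple(int(p) for p in v.split('.')))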
> 
>>> 
>>>> 
>>>> c.      Possible unit/functional tests to catch RPC version 
>>>> incompatibilities between RPC revisions.
>>>> 
>>>> d.      TODO: Multi-node Grenade job to have rolling upgrades covered in 
>>>> CI.
>>> 
>>> That is not something for the unit or functional test level.
>>> 
>>> As you mentioned, we already have the grenade project that is designed to 
>>> test upgrades. To validate RPC compatibility on a rolling upgrade we would 
>>> need a so-called ‘partial’ job (where different components run different 
>>> versions; in the case of neutron it would mean a new controller and old 
>>> agents). The job is present in the nova gate and validates RPC 
>>> compatibility.
>>> 
>>> As far as I know, Russell Bryant was looking into introducing the job for 
>>> neutron, but was blocked by the ongoing grenade refactoring to support 
>>> partial upgrades ‘the right way’ (using multinode setups). I think we 
>>> should check with the grenade folks on that matter; I have heard the start 
>>> of Mitaka was the ETA for this work to complete.
>>> 
>>>> 
>>>> 2.      Message content versioning – versioned objects
>>>> 
>>>> a.      TODO: implement oslo.versionedobjects in the Mitaka cycle. The 
>>>> interesting entities to be implemented: network, subnet, port, security 
>>>> groups…
>>> 
>>> Though we haven’t touched the base neutron resources in Liberty, we 
>>> introduced an oslo.versionedobjects based NeutronObject class during 
>>> Liberty as part of the QoS effort. I plan to expand on that work during 
>>> Mitaka.
> ++
>>> 
>>> The existing code for QoS resources can be found at:
>>> 
>>> https://github.com/openstack/neutron/tree/master/neutron/objects
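>>> 
>>> For those who haven’t looked at it, the general shape of such an object is 
>>> the standard oslo.versionedobjects one (this is just a generic sketch, not 
>>> the actual NeutronObject code behind the link above):
>>> 
>>>   from oslo_versionedobjects import base as obj_base
>>>   from oslo_versionedobjects import fields as obj_fields
>>> 
>>>   @obj_base.VersionedObjectRegistry.register
>>>   class Port(obj_base.VersionedObject):
>>>       # Version 1.0: initial version
>>>       VERSION = '1.0'
>>> 
>>>       fields = {
>>>           'id': obj_fields.UUIDField(),
>>>           'network_id': obj_fields.UUIDField(),
>>>           'mac_address': obj_fields.StringField(),
>>>           'admin_state_up': obj_fields.BooleanField(),
>>>       }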
>>> 
>>>> 
>>>> b.      Will OVO have an impact on vendor plugins?
>>> 
>>> It surely can have a significant impact, but hopefully the dict compat 
>>> layer should make the transition smoother:
>>> 
>>> https://github.com/openstack/neutron/blob/master/neutron/objects/base.py#L50
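>>> 
>>> The idea of that compat layer, very roughly (an illustration of the 
>>> approach only, not the code behind the link above), is to let plugin code 
>>> that still expects plain dicts keep reading object fields:
>>> 
>>>   class DictCompatMixin(object):
>>>       """Expose object fields through the old dict-style interface."""
>>> 
>>>       def __getitem__(self, key):
>>>           return getattr(self, key)
>>> 
>>>       def get(self, key, default=None):
>>>           try:
>>>               return self[key]
>>>           except AttributeError:
>>>               return default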
> 
> Correct.
>>> 
>>>> 
>>>> c.      Be strict about changes to versioned objects in code review: any 
>>>> change in object structure should increment the object's minor 
>>>> (backward-compatible) or major (breaking change) version.
>>> 
>>> That’s assuming we have a clear mapping of objects onto current RPC 
>>> interfaces, which is not obvious. Another problem we would need to solve is 
>>> core resource extensions (currently available in ml2 only), like qos or 
>>> port_security, that modify resources based on controller configuration.
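>>> 
>>> For the review rule in point c, the convention would be the usual 
>>> versioned-objects one, something like (field and class names made up):
>>> 
>>>   from oslo_versionedobjects import base as obj_base
>>>   from oslo_versionedobjects import fields as obj_fields
>>> 
>>>   @obj_base.VersionedObjectRegistry.register
>>>   class Port(obj_base.VersionedObject):
>>>       # Version 1.0: initial version
>>>       # Version 1.1: added 'dns_name' (backward compatible -> minor bump;
>>>       #              removing or renaming a field would mean a major bump)
>>>       VERSION = '1.1'
>>> 
>>>       fields = {
>>>           'id': obj_fields.UUIDField(),
>>>           'dns_name': obj_fields.StringField(nullable=True),
>>>       }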
>>> 
>>>> 
>>>> d.      Indirection API – a message in a newer format should be 
>>>> translated to an older version by the neutron server.
>>> 
>>> For QoS, we used a new object-agnostic subscriber mechanism to propagate 
>>> changes applied to QoS objects to the agents: 
>>> http://docs.openstack.org/developer/neutron/devref/rpc_callbacks.html
>>> 
>>> It is already expected to downgrade objects based on the agent version 
>>> (note it’s not implemented yet, but will surely be ready during Mitaka):
>>> 
>>> https://github.com/openstack/neutron/blob/master/neutron/api/rpc/handlers/resources_rpc.py#L142
> Yes, that's exactly what I was talking about above. It has object retrieval 
> for agents, where they can specify a version, but subscription/notifications 
> are the complicated part.
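> 
> For completeness, the downgrade itself is the easy bit, since 
> oslo.versionedobjects can serialize an object as an older version 
> (a sketch only, not the resources_rpc code):
> 
>   def prepare_payload(obj, agent_version):
>       # One current object can be fanned out to mixed-version agents by
>       # producing a primitive at the version each agent reported.
>       return obj.obj_to_primitive(target_version=agent_version)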
> 
> 
>>> 
>>>> 
>>>> 3.      Database migration
>>>> 
>>>> a.      Online schema migration was done in the Liberty release; is there 
>>>> any work left to do?
>>> 
>>> Nothing specific, maybe a bug or two here and there.
>>> 
>>>> 
>>>> b.      TODO: Online data migration to be introduced in the Mitaka cycle.
>>>> 
>>>>         i.      Online data migration can be done during normal operation 
>>>> on the data.
>>>> 
>>>>         ii.     There should also be a script to invoke the data migration 
>>>> in the background.
>>>> 
>>>> c.      Currently the contract phase is doing the data migration. But 
>>>> since the contract phase should be run offline, we should move the data 
>>>> migration to a preceding step. Also, the contract phase should be blocked 
>>>> if there is still relevant data in the entities being removed.
>>> 
>>> Yes, we definitely need a stop mechanism first, then play with data 
>>> migrations. I don’t think we can consider data migration before we have a 
>>> way to hide the bloody migration details behind abstract resources (read: 
>>> versioned objects). Realistically, I consider data migration too far off 
>>> at the moment to treat as a todo step. But we should definitely look 
>>> forward to it.
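>>> 
>>> When we do get there, the shape of an online data migration is usually a 
>>> small batched loop that can run while the services are up, something like 
>>> the following (table and column names are made up, and this is not how 
>>> neutron-db-manage does anything today):
>>> 
>>>   import sqlalchemy as sa
>>> 
>>>   def migrate_old_field(engine, batch_size=100):
>>>       meta = sa.MetaData()
>>>       ports = sa.Table('ports', meta, autoload_with=engine)
>>>       while True:
>>>           with engine.begin() as conn:
>>>               rows = conn.execute(
>>>                   sa.select(ports.c.id, ports.c.old_field)
>>>                   .where(ports.c.new_field.is_(None))
>>>                   .limit(batch_size)).fetchall()
>>>               if not rows:
>>>                   break
>>>               for row in rows:
>>>                   # Copy data in small batches; the contract migration can
>>>                   # drop old_field only once nothing is left to copy.
>>>                   conn.execute(ports.update()
>>>                                .where(ports.c.id == row.id)
>>>                                .values(new_field=row.old_field))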
>>> 
>>>> 
>>>>         i.      The contract phase can be executed online if all new code 
>>>> is running in the setup.
>>> 
>>> I am not sure how that’s possible. Do you think it’s realistic to expect 
>>> the controller to perform a lot of the checks that the db usually does 
>>> (constraints?) while the schema is not enforced?
>>> 
>>>> 
>>>> d.      The other strategy is to never drop tables, alter names, or 
>>>> remove columns from the DB – what’s in stays in. We should pay more 
>>>> attention in code reviews, merge only additive changes, and avoid 
>>>> questionable DB modifications.
>>>> 
>>> 
>>> I don’t like that approach. It suggests there is no way back if we screw 
>>> something up. Having a short contract phase which is offline seems to me 
>>> like a reasonable approach. Anyway, it can be reconsidered after we have 
>>> the elephant in the room solved (the data migration problem).
>>> 
>>>> e.      The Neutron server should be updated first, in order to translate 
>>>> data from the old format into the new schema. When doing this, we can be 
>>>> sure that old data will not be inserted into old DB structures.
>>>> 
>>> 
>>> To my taste, that’s ^ the clearest way to go.
> 
> Correct.
>>> 
>>>> 
>>>> 
>>>> I have performed the manual Kilo to Liberty upgrade, both operationally 
>>>> and by reviewing the code of the RPC APIs. All is working fine.
>>>> 
>>>> We can have some discussion in the cross-project session [7], or we can 
>>>> also review any issues with the Neutron upgrade in Friday’s unplugged 
>>>> session [8].
>>> 
>>> I will be more than happy to sit down with folks interested in our upgrade 
>>> story and write up a plan for Mitaka.
>> 
>> I am interested too and I am based in Italy (same time zone, yippee)
>> 
>> cheers,
>> 
>> Rossella
> 
> Ping me for discussion, it's a topic I'm interested in too.
>> 
>>> 
>>> Please ping me on irc (ihrachys), and we will think about how we can sync 
>>> effectively and push the effort forward (btw I am located in the Czech 
>>> Republic, so we should be in the same time zone).
>>> 
>>> Regards,
>>> Ihar
>>> 