Re: [openstack-dev] [Heat] RPC API versioning
On 06/08/15 08:20, Grasza, Grzegorz wrote:
>> -Original Message-
>> From: Zane Bitter [mailto:zbit...@redhat.com]
>> Sent: Thursday, 6 August, 2015 2:57
>> To: OpenStack Development Mailing List
>> Subject: [openstack-dev] [Heat] RPC API versioning
>>
>> We've been talking about this since before summit without much
>> consensus. I think a large part of the problem is that very few people
>> have deep knowledge of both Heat and Versioned Objects. However, I
>> think we are at a point where we should be able to settle on an
>> approach at least for the API<->engine RPC interface. I've been
>> talking to Dan Smith about the Nova team's plan for upgrades, which
>> goes something like this:
>>
>> * Specify a max RPC API version in the config file
>> * In the RPC client lib, add code to handle versions as far back as
>>   the previous release
>> * The operator rolls out the updated code, keeping the existing config
>>   file with the pin to the previous release's RPC version
>> * Once all services are upgraded, the operator rolls out a new config
>>   file shifting the pin
>> * The backwards compat code to handle release N-1 is removed in the
>>   N+1 release
>>
>> This is, I believe, sufficient to solve our entire problem.
>> Specifically, we have no need for an indirection API that rebroadcasts
>> messages that are too new (since that can't happen with pinning) and
>> no need for Versioned Objects in the RPC layer. (Versioned objects for
>> the DB are still critical, and we are very much better off for all the
>> hard work that Michal and others have put into them. Thanks!)
>
> What is the use of versioned objects outside of RPC?
> I've written some documentation for Oslo VO and helped in introducing
> them in Heat. As I understand it, the only use cases for VO are
> * to serialize objects to dicts with version information when they are
>   sent over RPC
> * handle version dependent code inside the objects (instead of
>   scattering it around the codebase)

This is what we currently have them for AIUI - so that we can do updates
to the DB schema without downtime. The VO abstracts the version-dependent
parts in a single place, so the rest of the code can continue to work
even with an old DB schema. This means that we can roll out new code and
only update the schema once it is all running.

> * provide an object oriented and transparent access to the resources
>   represented by the objects to services which don't have direct access
>   to that resource (via RPC) - the indirection API
>
> The last point was not yet discussed in Heat as far as I know, but the
> indirection API also contains an interface for backporting objects,
> which is something that is currently only used in Nova, and as you say,
> doesn't have a use when version pinning is in place.
>
>> The nature of Heat's RPC API is that it is effectively user-facing -
>> the heat-api process is essentially a thin proxy between ReST and RPC.
>> We already have a translation layer between the internal
>> representation(s) of objects and the user-facing representation, in
>> the form of heat.engine.api, and the RPC API is firmly on the
>> user-facing side. The requirements for the content of these messages
>> are actually much stricter than anything we need for RPC API
>> stability, since they need to remain compatible not just with heat-api
>> but with heatclient - and we have *zero* control over when that gets
>> upgraded. Despite that, we've managed quite nicely for ~3 years
>> without breaking changes afaik.
>>
>> Versioned Objects is a great way of retaining control when you need to
>> share internal data structures between processes. Fortunately the
>> architecture of Heat makes that unnecessary. That was a good design
>> decision. We are not going to reverse that design decision in order to
>> use Versioned Objects. (In the interest of making sure everyone uses
>> their time productively, perhaps I should clarify that to: "your patch
>> is subject to -2 on sight if it introduces internal engine data
>> structures to heat-api/heat-cfn-api".)
>>
>> Hopefully I've convinced you of the sufficiency of this plan for the
>> API<->engine interface specifically. If anyone disagrees, let them
>> speak now, &c.
>
> I don't understand - what is the distinction between internal and
> external data structures?

Internal data structures are just whatever it's most convenient for the
application to work with. They're subject to change/refactoring at any
time. External data structures are ones that have an explicit stability
guarantee, in this case because they're part of the user interface.

> From what I understand, versioned objects were introduced in Heat to
> represent objects which are sent over RPC between Heat services.

I believe you have been misinformed. All of the versioned objects that
have been created so far are representations of tables in the database.
However, we never send these over RPC: pre-VO it would obviously have
been dumb to couple the user interface to the representation in the
database, so we just didn
Re: [openstack-dev] [Heat] RPC API versioning
On 06/08/15 10:08, Dan Smith wrote:
>> This is, I believe, sufficient to solve our entire problem.
>> Specifically, we have no need for an indirection API that rebroadcasts
>> messages that are too new (since that can't happen with pinning) and
>> no need for Versioned Objects in the RPC layer. (Versioned objects for
>> the DB are still critical, and we are very much better off for all the
>> hard work that Michal and others have put into them. Thanks!)
>
> So all your calls have simple types for all the arguments? Meaning,
> everything looks like this:
>
>   do_thing(uuid, 'foo', 'bar', 123)

Mostly.

> and not:
>
>   do_thing(uuid, params, data, dict_of_stuff)
>
> ?

We do have that, but dict_of_stuff is always just data that was provided
to us by the user, verbatim. e.g. it's the contents of a template or an
environment file. We can't control what the user sends us, so pretending
to 'version' it is meaningless. We just pass it on without modification,
and the engine either handles it or raises an exception so we can provide
a 40x error to the user.

This isn't actually the interesting part though, because you're still
thinking of it backwards - in Heat (unlike Nova) the API has no access to
the DB, so it's not like dict_of_stuff could contain any internal data
structures because there _are_ no internal data structures. The
interesting part is the *response* containing complex types. However, the
same argument applies: the response just contains data that we are going
to pass straight back to the user verbatim (at least in the native API),
and comprises a mix of simple types and echoing data we originally
received from the user.

> If you have the latter, then just doing RPC versioning is a mirage.
> Nova has had basic RPC versioning forever, but we didn't get actual
> upgrade ability until we tightened the screws on what we're actually
> sending over the wire. Just versioning the signatures of the calls
> doesn't help you if you're sending complex data structures (such as our
> Instance) over the wire.
>
> If you think that the object facade is necessary for insulating you
> from DB changes, I feel pretty confident that you need it for the RPC
> side for the same reason.

This assumes Nova's architecture.

> Unless you're going to unpack everything from the object into primitive
> call arguments and ensure that nobody ever changes one.

This is effectively what we do, although as noted above it's actually the
response and not the arguments that we're talking about.

> If you pull things out of the DB and send them over the wire, then the
> DB schema affects your RPC API.

As I've been trying to explain, apparently unsuccessfully, we never ever
ever pull things out of the DB and send them over the wire. Ever. Never
have. Never will.

Here's an example of what we actually do:

http://git.openstack.org/cgit/openstack/heat/tree/heat/engine/api.py?h=stable%2Fkilo#n158

This is how we show a resource. The function takes a Resource object,
which in turn contains a VO with the DB representation of the resource.
We extract various attributes and perform various calculations with the
methods of the Resource object (all of which rely to some extent on data
obtained from the DB). Each bit of data becomes an entry in a dict - this
is actually the return value, but you could think of each item in the
dict as equivalent to an argument to a call if the RPC were initiated
from the opposite direction. The values are, for the most part, simple
types, and the few exceptions are either very basic, well-defined and
unchanging or they're just echoing data provided originally by the user.
The keys to the dict (~argument names) are well-defined in the
heat.rpc.api module. We never remove a key, because that would break
userspace. We never change the format of an item, because that would
break userspace. Sometimes we add a key, but we always implement heat-api
in such a way that it doesn't care whether the new key is present or not
(i.e. it passes the data directly to the user without looking, or uses
response.get(rpc_api.NEW_KEY, default) if it really needs to introspect
it).

>> The nature of Heat's RPC API is that it is effectively user-facing -
>> the heat-api process is essentially a thin proxy between ReST and RPC.
>> We already have a translation layer between the internal
>> representation(s) of objects and the user-facing representation, in
>> the form of heat.engine.api, and the RPC API is firmly on the
>> user-facing side. The requirements for the content of these messages
>> are actually much stricter than anything we need for RPC API
>> stability, since they need to remain compatible not just with heat-api
>> but with heatclient - and we have *zero* control over when that gets
>> upgraded. Despite that, we've managed quite nicely for ~3 years
>> without breaking changes afaik.
>
> I'm not sure how you evolve the internals without affecting the REST
> side if you don't have a translation layer. If you do, then the RPC API
> changes independently from
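
For illustration, the additive-key convention described in the reply above (never remove a key, never change an item's format, read new keys with a default) might look roughly like the following. This is only a sketch under assumptions: the key constants, the two format functions and api_view are all hypothetical stand-ins, not the real contents of heat.rpc.api or heat.engine.api.

```python
# Hypothetical key constants, standing in for the ones in heat.rpc.api.
RES_NAME = "resource_name"
RES_STATUS = "resource_status"
RES_NEW_KEY = "new_key"  # assumed to be added in a later release


def format_resource_old(resource):
    """Older engine: builds the response from simple values only."""
    return {RES_NAME: resource["name"], RES_STATUS: resource["status"]}


def format_resource_new(resource):
    """Newer engine: same keys and formats, plus one additional key."""
    response = format_resource_old(resource)
    response[RES_NEW_KEY] = resource.get("extra", "n/a")
    return response


def api_view(response):
    """API side: works whether or not the new key is present."""
    return {
        "name": response[RES_NAME],
        "status": response[RES_STATUS],
        "new": response.get(RES_NEW_KEY, "n/a"),
    }


resource = {"name": "server", "status": "COMPLETE", "extra": "x"}
old_view = api_view(format_resource_old(resource))
new_view = api_view(format_resource_new(resource))
```

Under these assumptions the same api_view code handles responses from either engine version, which is the property the message relies on during a rolling upgrade.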
Re: [openstack-dev] [Heat] RPC API versioning
> This is, I believe, sufficient to solve our entire problem.
> Specifically, we have no need for an indirection API that rebroadcasts
> messages that are too new (since that can't happen with pinning) and no
> need for Versioned Objects in the RPC layer. (Versioned objects for the
> DB are still critical, and we are very much better off for all the hard
> work that Michal and others have put into them. Thanks!)

So all your calls have simple types for all the arguments? Meaning,
everything looks like this:

  do_thing(uuid, 'foo', 'bar', 123)

and not:

  do_thing(uuid, params, data, dict_of_stuff)

?

If you have the latter, then just doing RPC versioning is a mirage. Nova
has had basic RPC versioning forever, but we didn't get actual upgrade
ability until we tightened the screws on what we're actually sending over
the wire. Just versioning the signatures of the calls doesn't help you if
you're sending complex data structures (such as our Instance) over the
wire.

If you think that the object facade is necessary for insulating you from
DB changes, I feel pretty confident that you need it for the RPC side for
the same reason. Unless you're going to unpack everything from the object
into primitive call arguments and ensure that nobody ever changes one.

If you pull things out of the DB and send them over the wire, then the DB
schema affects your RPC API.

> The nature of Heat's RPC API is that it is effectively user-facing - the
> heat-api process is essentially a thin proxy between ReST and RPC. We
> already have a translation layer between the internal representation(s)
> of objects and the user-facing representation, in the form of
> heat.engine.api, and the RPC API is firmly on the user-facing side. The
> requirements for the content of these messages are actually much
> stricter than anything we need for RPC API stability, since they need to
> remain compatible not just with heat-api but with heatclient - and we
> have *zero* control over when that gets upgraded. Despite that, we've
> managed quite nicely for ~3 years without breaking changes afaik.

I'm not sure how you evolve the internals without affecting the REST side
if you don't have a translation layer. If you do, then the RPC API
changes independently from the REST API.

Anyway, I don't really know anything about the internals of heat, and am
completely willing to believe that it's fundamentally different in some
way that makes it immune to the problems something like Nova has trying
to make this work. I'm not sure I'm convinced of that so far, but that's
fine :)

--Dan

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
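
To make the concern in the message above concrete, here is a small sketch of how a call's signature can stay the same while the data inside a dict argument changes shape. Everything in it is hypothetical (handle_resize_v1, the 'flavor' key and the payloads are invented for illustration and are not taken from Nova or Heat).

```python
def handle_resize_v1(uuid, params):
    """Old receiver: expects params['flavor'] to be a plain string."""
    return "resizing %s to %s" % (uuid, params["flavor"].upper())


# Both senders use the same call signature: (uuid, params).
old_payload = {"flavor": "small"}
new_payload = {"flavor": {"name": "small", "vcpus": 1}}  # shape changed


def try_call(payload):
    """Invoke the old receiver and report whether the payload was usable."""
    try:
        return handle_resize_v1("1234", payload)
    except AttributeError as exc:
        return "failed: %s" % type(exc).__name__


old_result = try_call(old_payload)
new_result = try_call(new_payload)
```

In this sketch a version number on the signature alone would not flag the second call as incompatible; the old receiver only discovers the problem when it touches the nested value.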
Re: [openstack-dev] [Heat] RPC API versioning
> -Original Message-
> From: Zane Bitter [mailto:zbit...@redhat.com]
> Sent: Thursday, 6 August, 2015 2:57
> To: OpenStack Development Mailing List
> Subject: [openstack-dev] [Heat] RPC API versioning
>
> We've been talking about this since before summit without much
> consensus. I think a large part of the problem is that very few people
> have deep knowledge of both Heat and Versioned Objects. However, I think
> we are at a point where we should be able to settle on an approach at
> least for the API<->engine RPC interface. I've been talking to Dan Smith
> about the Nova team's plan for upgrades, which goes something like this:
>
> * Specify a max RPC API version in the config file
> * In the RPC client lib, add code to handle versions as far back as the
>   previous release
> * The operator rolls out the updated code, keeping the existing config
>   file with the pin to the previous release's RPC version
> * Once all services are upgraded, the operator rolls out a new config
>   file shifting the pin
> * The backwards compat code to handle release N-1 is removed in the N+1
>   release
>
> This is, I believe, sufficient to solve our entire problem.
> Specifically, we have no need for an indirection API that rebroadcasts
> messages that are too new (since that can't happen with pinning) and no
> need for Versioned Objects in the RPC layer. (Versioned objects for the
> DB are still critical, and we are very much better off for all the hard
> work that Michal and others have put into them. Thanks!)

What is the use of versioned objects outside of RPC?
I've written some documentation for Oslo VO and helped in introducing
them in Heat. As I understand it, the only use cases for VO are
* to serialize objects to dicts with version information when they are
  sent over RPC
* handle version dependent code inside the objects (instead of scattering
  it around the codebase)
* provide an object oriented and transparent access to the resources
  represented by the objects to services which don't have direct access
  to that resource (via RPC) - the indirection API

The last point was not yet discussed in Heat as far as I know, but the
indirection API also contains an interface for backporting objects, which
is something that is currently only used in Nova, and as you say, doesn't
have a use when version pinning is in place.

> The nature of Heat's RPC API is that it is effectively user-facing - the
> heat-api process is essentially a thin proxy between ReST and RPC. We
> already have a translation layer between the internal representation(s)
> of objects and the user-facing representation, in the form of
> heat.engine.api, and the RPC API is firmly on the user-facing side. The
> requirements for the content of these messages are actually much
> stricter than anything we need for RPC API stability, since they need to
> remain compatible not just with heat-api but with heatclient - and we
> have *zero* control over when that gets upgraded. Despite that, we've
> managed quite nicely for ~3 years without breaking changes afaik.
>
> Versioned Objects is a great way of retaining control when you need to
> share internal data structures between processes. Fortunately the
> architecture of Heat makes that unnecessary. That was a good design
> decision. We are not going to reverse that design decision in order to
> use Versioned Objects. (In the interest of making sure everyone uses
> their time productively, perhaps I should clarify that to: "your patch
> is subject to -2 on sight if it introduces internal engine data
> structures to heat-api/heat-cfn-api".)
>
> Hopefully I've convinced you of the sufficiency of this plan for the
> API<->engine interface specifically. If anyone disagrees, let them speak
> now, &c.

I don't understand - what is the distinction between internal and
external data structures?

From what I understand, versioned objects were introduced in Heat to
represent objects which are sent over RPC between Heat services.

> I think there is still a case that could be made for a different
> approach to the RPC API for convergence, which is engine->engine and
> (probably) doesn't yet have a formal translation layer of the same kind.
> At a minimum, obviously, we should do the same stuff listed above
> (though I don't think we need to declare that interface stable until the
> first release where we enable convergence by default).

I agree this could be a good use case for VO.

> There's probably places where versioned objects could benefit us. For
> example, when we trigger a check on a resource we pass it a bundle of
> data containing all the attributes and IDs it might need from r
[openstack-dev] [Heat] RPC API versioning
We've been talking about this since before summit without much consensus.
I think a large part of the problem is that very few people have deep
knowledge of both Heat and Versioned Objects. However, I think we are at
a point where we should be able to settle on an approach at least for the
API<->engine RPC interface. I've been talking to Dan Smith about the Nova
team's plan for upgrades, which goes something like this:

* Specify a max RPC API version in the config file
* In the RPC client lib, add code to handle versions as far back as the
  previous release
* The operator rolls out the updated code, keeping the existing config
  file with the pin to the previous release's RPC version
* Once all services are upgraded, the operator rolls out a new config
  file shifting the pin
* The backwards compat code to handle release N-1 is removed in the N+1
  release

This is, I believe, sufficient to solve our entire problem. Specifically,
we have no need for an indirection API that rebroadcasts messages that
are too new (since that can't happen with pinning) and no need for
Versioned Objects in the RPC layer. (Versioned objects for the DB are
still critical, and we are very much better off for all the hard work
that Michal and others have put into them. Thanks!)

The nature of Heat's RPC API is that it is effectively user-facing - the
heat-api process is essentially a thin proxy between ReST and RPC. We
already have a translation layer between the internal representation(s)
of objects and the user-facing representation, in the form of
heat.engine.api, and the RPC API is firmly on the user-facing side. The
requirements for the content of these messages are actually much stricter
than anything we need for RPC API stability, since they need to remain
compatible not just with heat-api but with heatclient - and we have
*zero* control over when that gets upgraded. Despite that, we've managed
quite nicely for ~3 years without breaking changes afaik.
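
As a rough illustration of the pinning steps listed above, the client side might look something like this. It is a standalone sketch, not Heat or Nova code: EngineClient, fake_transport, the version numbers and the resolve_outputs argument are all hypothetical, and version_cap simply stands in for the "max RPC API version" config option. The can_send_version name is borrowed from the similar idea in oslo.messaging, but nothing here uses that library.

```python
def parse_version(version):
    """Turn 'major.minor' into a comparable tuple of ints."""
    major, minor = version.split(".")
    return int(major), int(minor)


class EngineClient:
    """Client-side wrapper that honours an operator-configured version pin."""

    # Newest message format this code knows how to send (assumed value).
    LATEST = "1.2"

    def __init__(self, transport, version_cap=None):
        # version_cap stands in for the "max RPC API version" config option.
        self.transport = transport
        self.version_cap = version_cap or self.LATEST

    def can_send_version(self, version):
        return parse_version(version) <= parse_version(self.version_cap)

    def show_stack(self, stack_id, resolve_outputs=True):
        if self.can_send_version("1.2"):
            # Current format: includes the (hypothetical) 1.2 argument.
            return self.transport("show_stack", "1.2",
                                  stack_id=stack_id,
                                  resolve_outputs=resolve_outputs)
        # Backwards-compat path for release N-1; removable in release N+1.
        return self.transport("show_stack", "1.1", stack_id=stack_id)


def fake_transport(method, version, **kwargs):
    """Stand-in for the message bus: returns what would have been sent."""
    return {"method": method, "version": version, "args": kwargs}


pinned = EngineClient(fake_transport, version_cap="1.1")  # during rollout
unpinned = EngineClient(fake_transport)                   # after the pin shifts
```

With the pin in place the new code only ever emits messages the old engine understands, which is why no retransmission of too-new messages is needed in this scheme.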
Versioned Objects is a great way of retaining control when you need to
share internal data structures between processes. Fortunately the
architecture of Heat makes that unnecessary. That was a good design
decision. We are not going to reverse that design decision in order to
use Versioned Objects. (In the interest of making sure everyone uses
their time productively, perhaps I should clarify that to: "your patch is
subject to -2 on sight if it introduces internal engine data structures
to heat-api/heat-cfn-api".)

Hopefully I've convinced you of the sufficiency of this plan for the
API<->engine interface specifically. If anyone disagrees, let them speak
now, &c.

I think there is still a case that could be made for a different approach
to the RPC API for convergence, which is engine->engine and (probably)
doesn't yet have a formal translation layer of the same kind. At a
minimum, obviously, we should do the same stuff listed above (though I
don't think we need to declare that interface stable until the first
release where we enable convergence by default).

There's probably places where versioned objects could benefit us. For
example, when we trigger a check on a resource we pass it a bundle of
data containing all the attributes and IDs it might need from resources
it depends on. It definitely makes sense to me that that bundle would be
a Versioned Object. (In fact, that data gets stored in the DB - as
SyncPoint in the prototype - so we wouldn't even need to create a new
object type. This seems like a clear win.)

What I do NOT want to do is to e.g. replace the resource_id of the
resource to check with a versioned object containing the DB
representation of a resource. In most cases that doesn't save a DB
lookup, it just moves it to the other end of the RPC call.
And as long as primary keys are unique, this can never create a
compatibility problem :)

Grzegorz mentioned in https://review.openstack.org/#/c/196670/ the
possibility of using the indirection API to handle changes to the message
format - I guess this is an alternative to the version pinning approach
mentioned by Dan? I'm currently leaning toward thinking that this is
unnecessary, but I'm not sure I understand it completely so I'd need to
see a patch to form an opinion. I'm wide open to persuasion on this part.
However I'd be opposed to this *replacing* (rather than supplementing)
the version pinning behaviour described above - the whole retransmission
of too-new messages thing sounds like it could have any number of
pathological corner cases and I'd prefer to avoid it.

cheers,
Zane.
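
One possible shape for the engine->engine check message discussed above is sketched here: the resource to check is identified only by its primary key, while the bundle of dependency data carries a version. This is an assumption-laden illustration; SyncData is a hypothetical plain-Python stand-in (it is neither the prototype's SyncPoint nor an oslo.versionedobjects class), and the 1.0/1.1 versions and field names are invented.

```python
class SyncData:
    """Hypothetical versioned bundle of data from dependency resources."""

    VERSION = "1.1"

    def __init__(self, attributes, resource_ids):
        self.attributes = attributes      # attribute values the check needs
        self.resource_ids = resource_ids  # IDs of the resources depended on

    def to_primitive(self, target_version=VERSION):
        data = {"attributes": self.attributes}
        if target_version != "1.0":
            # Assumed: resource_ids was added in 1.1, so omit it for 1.0 peers.
            data["resource_ids"] = self.resource_ids
        return {"version": target_version, "data": data}


def build_check_message(resource_id, sync_data,
                        target_version=SyncData.VERSION):
    # The resource itself travels as a bare primary key; the receiving
    # engine is expected to load it from the database on its own side.
    return {"resource_id": resource_id,
            "sync_data": sync_data.to_primitive(target_version)}


bundle = SyncData({"ip": "10.0.0.5"}, [7, 9])
current_msg = build_check_message(42, bundle)
pinned_msg = build_check_message(42, bundle, target_version="1.0")
```

The design point being illustrated is that only the data bundle needs version handling; the resource_id stays a simple value in every version of the message.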