Re: [openstack-dev] [Heat] RPC API versioning

2015-08-06 Thread Zane Bitter

On 06/08/15 08:20, Grasza, Grzegorz wrote:
>> -Original Message-
>> From: Zane Bitter [mailto:zbit...@redhat.com]
>> Sent: Thursday, 6 August, 2015 2:57
>> To: OpenStack Development Mailing List
>> Subject: [openstack-dev] [Heat] RPC API versioning
>>
>> We've been talking about this since before summit without much consensus.
>> I think a large part of the problem is that very few people have deep
>> knowledge of both Heat and Versioned Objects. However, I think we are at a
>> point where we should be able to settle on an approach at least for the
>> API<->engine RPC interface. I've been talking to Dan Smith about the Nova
>> team's plan for upgrades, which goes something like this:
>>
>> * Specify a max RPC API version in the config file
>> * In the RPC client lib, add code to handle versions as far back as the previous
>> release
>> * The operator rolls out the updated code, keeping the existing config file
>> with the pin to the previous release's RPC version
>> * Once all services are upgraded, the operator rolls out a new config file
>> shifting the pin
>> * The backwards compat code to handle release N-1 is removed in the N+1
>> release
>>
>> This is, I believe, sufficient to solve our entire problem.
>> Specifically, we have no need for an indirection API that rebroadcasts
>> messages that are too new (since that can't happen with pinning) and no
>> need for Versioned Objects in the RPC layer. (Versioned objects for the DB
>> are still critical, and we are very much better off for all the hard work that
>> Michal and others have put into them. Thanks!)
>
> What is the use of versioned objects outside of RPC?
> I've written some documentation for Oslo VO and helped in introducing them in
> Heat.
> As I understand it, the only use cases for VO are
> * to serialize objects to dicts with version information when they are sent
> over RPC
> * handle version dependent code inside the objects (instead of scattering it
> around the codebase)


This is what we currently have them for AIUI - so that we can do updates 
to the DB schema without downtime. The VO abstracts the 
version-dependent parts in a single place, so the rest of the code can 
continue to work even with an old DB schema. This means that we can roll 
out new code and only update the schema once it is all running.
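
As a rough illustration of that point, here is a minimal sketch in plain Python. It assumes a made-up object and a made-up 'tags' column; it is not one of Heat's real objects and does not use the oslo.versionedobjects API. The only thing it is meant to show is that the schema-dependent branch lives in one place.

```python
# Hypothetical sketch: the class, the column names and the version number are
# all invented for illustration.
class StackRecord:
    """Single place that knows how to read both the old and the new schema."""

    VERSION = "1.1"  # assumption: 1.1 is the version that added 'tags'

    def __init__(self, name, tags):
        self.name = name
        self.tags = tags

    @classmethod
    def from_db_row(cls, row):
        # Before the migration runs there is no 'tags' column, so fall back
        # to a default instead of making every caller check for it.
        return cls(name=row["name"], tags=row.get("tags", []))


def describe(record):
    # Caller code is identical whichever schema the row came from.
    return "%s [%s]" % (record.name, ", ".join(record.tags))
```

With this shape, new code can run against the old schema, and the migration can be applied later without touching describe() or any other caller.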



> * provide an object oriented and transparent access to the resources
> represented by the objects to services which don't have direct access to that
> resource (via RPC) - the indirection API
>
> The last point was not yet discussed in Heat as far as I know, but the
> indirection API also contains an interface for backporting objects, which is
> something that is currently only used in Nova, and as you say, doesn't have a
> use when version pinning is in place.
>
>> The nature of Heat's RPC API is that it is effectively user-facing - the
>> heat-api process is essentially a thin proxy between ReST and RPC. We already
>> have a translation layer between the internal representation(s) of objects and
>> the user-facing representation, in the form of heat.engine.api, and the RPC API
>> is firmly on the user-facing side. The requirements for the content of these
>> messages are actually much stricter than anything we need for RPC API
>> stability, since they need to remain compatible not just with heat-api but
>> with heatclient - and we have *zero* control over when that gets upgraded.
>> Despite that, we've managed quite nicely for ~3 years without breaking
>> changes afaik.
>>
>> Versioned Objects is a great way of retaining control when you need to share
>> internal data structures between processes. Fortunately the architecture of
>> Heat makes that unnecessary. That was a good design decision. We are not
>> going to reverse that design decision in order to use Versioned Objects. (In
>> the interest of making sure everyone uses their time productively, perhaps I
>> should clarify that to: "your patch is subject to -2 on sight if it introduces
>> internal engine data structures to heat-api/heat-cfn-api".)
>>
>> Hopefully I've convinced you of the sufficiency of this plan for the
>> API<->engine interface specifically. If anyone disagrees, let them speak now, &c.
>
> I don't understand - what is the distinction between internal and external data
> structures?


Internal data structures are just whatever it's most convenient for the 
application to work with. They're subject to change/refactoring at any time.


External data structures are ones that have an explicit stability 
guarantee, in this case because they're part of the user interface.



> From what I understand, versioned objects were introduced in Heat to represent
> objects which are sent over RPC between Heat services.


I believe you have been misinformed. All of the versioned objects that 
have been created so far are representations of tables in the database. 
However, we never send these over RPC: pre-VO it would obviously have 
been dumb to couple the user interface to the representation in the 
database, so we just didn

Re: [openstack-dev] [Heat] RPC API versioning

2015-08-06 Thread Zane Bitter

On 06/08/15 10:08, Dan Smith wrote:

>> This is, I believe, sufficient to solve our entire problem.
>> Specifically, we have no need for an indirection API that rebroadcasts
>> messages that are too new (since that can't happen with pinning) and no
>> need for Versioned Objects in the RPC layer. (Versioned objects for the
>> DB are still critical, and we are very much better off for all the hard
>> work that Michal and others have put into them. Thanks!)
>
> So all your calls have simple types for all the arguments? Meaning,
> everything looks like this:
>
>    do_thing(uuid, 'foo', 'bar', 123)


Mostly.


> and not:
>
>    do_thing(uuid, params, data, dict_of_stuff)
>
> ?


We do have that, but dict_of_stuff is always just data that was provided 
to us by the user, verbatim. e.g. it's the contents of a template or an 
environment file. We can't control what the user sends us, so pretending 
to 'version' it is meaningless. We just pass it on without modification, 
and the engine either handles it or raises an exception so we can 
provide a 40x error to the user.
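
To make that flow concrete, here is a minimal sketch under stated assumptions: the function names and the exception class are invented, and the RPC hop is collapsed into a direct function call so the example runs on its own.

```python
# Hypothetical sketch; none of these names are Heat's.
class InvalidTemplate(Exception):
    """Raised by the engine when user-supplied data cannot be used."""


def engine_validate(template):
    # Only the engine interprets the template's contents.
    if not isinstance(template, dict) or "resources" not in template:
        raise InvalidTemplate("template has no 'resources' section")
    return {"resource_count": len(template["resources"])}


def api_validate(template):
    # The API layer forwards the user's data untouched and only maps the
    # engine's exception onto an HTTP status code.
    try:
        return 200, engine_validate(template)
    except InvalidTemplate as exc:
        return 400, {"error": str(exc)}
```

The API side never needs to know what a valid template looks like, so there is nothing in dict_of_stuff for it to version.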


This isn't actually the interesting part though, because you're still 
thinking of it backwards - in Heat (unlike Nova) the API has no access 
to the DB, so it's not like dict_of_stuff could contain any internal 
data structures because there _are_ no internal data structures.


The interesting part is the *response* containing complex types. 
However, the same argument applies: the response just contains data that 
we are going to pass straight back to the user verbatim (at least in the 
native API), and comprises a mix of simple types and echoing data we 
originally received from the user.



> If you have the latter, then just doing RPC versioning is a mirage. Nova
> has had basic RPC versioning forever, but we didn't get actual upgrade
> ability until we tightened the screws on what we're actually sending
> over the wire. Just versioning the signatures of the calls doesn't help
> you if you're sending complex data structures (such as our Instance)
> over the wire.
>
> If you think that the object facade is necessary for insulating you from
> DB changes, I feel pretty confident that you need it for the RPC side
> for the same reason.


This assumes Nova's architecture.


> Unless you're going to unpack everything from the
> object into primitive call arguments and ensure that nobody ever changes
> one.


This is effectively what we do, although as noted above it's actually 
the response and not the arguments that we're talking about.



> If you pull things out of the DB and send them over the wire, then
> the DB schema affects your RPC API.


As I've been trying to explain, apparently unsuccessfully, we never ever 
ever pull things out of the DB and send them over the wire. Ever. Never 
have. Never will.


Here's an example of what we actually do:

http://git.openstack.org/cgit/openstack/heat/tree/heat/engine/api.py?h=stable%2Fkilo#n158

This is how we show a resource. The function takes a Resource object, 
which in turn contains a VO with the DB representation of the resource. 
We extract various attributes and perform various calculations with the 
methods of the Resource object (all of which rely to some extent on data 
obtained from the DB). Each bit of data becomes an entry in a dict - 
this is actually the return value, but you could think of it as 
equivalent to each item in the dict as being an argument to call if the 
RPC were initiated from the opposite direction. The values are, for the 
most part, simple types, and the few exceptions are either very basic, 
well-defined and unchanging or they're just echoing data provided 
originally by the user.
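
For anyone who doesn't want to follow the link, the shape of that function is roughly as follows. This is a simplified sketch, not the code at that URL: the class, the helper methods and the key names are stand-ins chosen for illustration.

```python
# Hypothetical stand-ins: these are not Heat's real classes or key names.
RES_NAME = "resource_name"
RES_STATE = "resource_state"
RES_REQUIRED_BY = "required_by"
RES_METADATA = "metadata"


class Resource:
    """Wraps a DB row; the row itself never crosses the RPC boundary."""

    def __init__(self, db_row, dependents, user_metadata):
        self._db_row = db_row
        self._dependents = dependents
        self._user_metadata = user_metadata

    @property
    def name(self):
        return self._db_row["name"]

    def state(self):
        # Computed from DB data rather than copied from a column.
        return "%s_%s" % (self._db_row["action"], self._db_row["status"])

    def required_by(self):
        return sorted(self._dependents)

    def metadata(self):
        return self._user_metadata


def format_resource(resource):
    """Build the response: one well-known key per value, simple types only."""
    return {
        RES_NAME: resource.name,
        RES_STATE: resource.state(),
        RES_REQUIRED_BY: resource.required_by(),  # basic list of strings
        RES_METADATA: resource.metadata(),  # echoed user data
    }
```

Renaming or adding a column in the row changes only the Resource methods; the dict that goes over the wire keeps the same keys.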


The keys to the dict (~argument names) are well-defined in the 
heat.rpc.api module. We never remove a key, because that would break 
userspace. We never change the format of an item, because that would 
break userspace. Sometimes we add a key, but we always implement 
heat-api in such a way that it doesn't care whether the new key is 
present or not (i.e. it passes the data directly to the user without 
looking, or uses response.get(rpc_api.NEW_KEY, default) if it really 
needs to introspect it).
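
The two cases on the heat-api side look something like the sketch below. NEW_KEY here is a made-up 'tags' key and the function names are invented; only the response.get(..., default) pattern is taken from the paragraph above.

```python
# Hypothetical sketch; the key and the function names are made up.
STACK_NAME = "stack_name"
NEW_KEY = "tags"  # assumption: a key added by a newer engine


def passthrough(rpc_response):
    # Common case: hand the engine's dict to the user without inspecting it,
    # so unknown or missing keys cannot break heat-api.
    return dict(rpc_response)


def summarize(rpc_response):
    # Rare case: the API has to look at the new key, so it supplies a default
    # for responses that come from an engine predating the key.
    tags = rpc_response.get(NEW_KEY, [])
    return "%s (%d tags)" % (rpc_response[STACK_NAME], len(tags))
```

Either way, a new heat-api talking to an old engine (or the reverse) sees a response it can handle.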



>> The nature of Heat's RPC API is that it is effectively user-facing - the
>> heat-api process is essentially a thin proxy between ReST and RPC. We
>> already have a translation layer between the internal representation(s)
>> of objects and the user-facing representation, in the form of
>> heat.engine.api, and the RPC API is firmly on the user-facing side. The
>> requirements for the content of these messages are actually much
>> stricter than anything we need for RPC API stability, since they need to
>> remain compatible not just with heat-api but with heatclient - and we
>> have *zero* control over when that gets upgraded. Despite that, we've
>> managed quite nicely for ~3 years without breaking changes afaik.
>
> I'm not sure how you evolve the internals without affecting the REST
> side if you don't have a translation layer. If you do, then the RPC API
> changes independently from

Re: [openstack-dev] [Heat] RPC API versioning

2015-08-06 Thread Dan Smith
> This is, I believe, sufficient to solve our entire problem.
> Specifically, we have no need for an indirection API that rebroadcasts
> messages that are too new (since that can't happen with pinning) and no
> need for Versioned Objects in the RPC layer. (Versioned objects for the
> DB are still critical, and we are very much better off for all the hard
> work that Michal and others have put into them. Thanks!)

So all your calls have simple types for all the arguments? Meaning,
everything looks like this:

  do_thing(uuid, 'foo', 'bar', 123)

and not:

  do_thing(uuid, params, data, dict_of_stuff)

?

If you have the latter, then just doing RPC versioning is a mirage. Nova
has had basic RPC versioning forever, but we didn't get actual upgrade
ability until we tightened the screws on what we're actually sending
over the wire. Just versioning the signatures of the calls doesn't help
you if you're sending complex data structures (such as our Instance)
over the wire.

If you think that the object facade is necessary for insulating you from
DB changes, I feel pretty confident that you need it for the RPC side
for the same reason. Unless you're going to unpack everything from the
object into primitive call arguments and ensure that nobody ever changes
one. If you pull things out of the DB and send them over the wire, then
the DB schema affects your RPC API.

> The nature of Heat's RPC API is that it is effectively user-facing - the
> heat-api process is essentially a thin proxy between ReST and RPC. We
> already have a translation layer between the internal representation(s)
> of objects and the user-facing representation, in the form of
> heat.engine.api, and the RPC API is firmly on the user-facing side. The
> requirements for the content of these messages are actually much
> stricter than anything we need for RPC API stability, since they need to
> remain compatible not just with heat-api but with heatclient - and we
> have *zero* control over when that gets upgraded. Despite that, we've
> managed quite nicely for ~3 years without breaking changes afaik.

I'm not sure how you evolve the internals without affecting the REST
side if you don't have a translation layer. If you do, then the RPC API
changes independently from the REST API.

Anyway, I don't really know anything about the internals of heat, and am
completely willing to believe that it's fundamentally different in some
way that makes it immune to the problems something like Nova has trying
to make this work. I'm not sure I'm convinced of that so far, but that's
fine :)

--Dan


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Heat] RPC API versioning

2015-08-06 Thread Grasza, Grzegorz


> -Original Message-
> From: Zane Bitter [mailto:zbit...@redhat.com]
> Sent: Thursday, 6 August, 2015 2:57
> To: OpenStack Development Mailing List
> Subject: [openstack-dev] [Heat] RPC API versioning
> 
> We've been talking about this since before summit without much consensus.
> I think a large part of the problem is that very few people have deep
> knowledge of both Heat and Versioned Objects. However, I think we are at a
> point where we should be able to settle on an approach at least for the
> API<->engine RPC interface. I've been talking to Dan Smith about the Nova team's
> plan for upgrades, which goes something like this:
> 
> * Specify a max RPC API version in the config file
> * In the RPC client lib, add code to handle versions as far back as the 
> previous
> release
> * The operator rolls out the updated code, keeping the existing config file
> with the pin to the previous release's RPC version
> * Once all services are upgraded, the operator rolls out a new config file
> shifting the pin
> * The backwards compat code to handle release N-1 is removed in the N+1
> release
> 
> This is, I believe, sufficient to solve our entire problem.
> Specifically, we have no need for an indirection API that rebroadcasts
> messages that are too new (since that can't happen with pinning) and no
> need for Versioned Objects in the RPC layer. (Versioned objects for the DB
> are still critical, and we are very much better off for all the hard work that
> Michal and others have put into them. Thanks!)

What is the use of versioned objects outside of RPC?
I've written some documentation for Oslo VO and helped in introducing them in 
Heat.
As I understand it, the only use cases for VO are
* to serialize objects to dicts with version information when they are sent 
over RPC
* handle version dependent code inside the objects (instead of scattering it 
around the codebase)
* provide an object oriented and transparent access to the resources 
represented by the objects to services which don't have direct access to that 
resource (via RPC) - the indirection API

The last point was not yet discussed in Heat as far as I know, but the 
indirection API also contains an interface for backporting objects, which is 
something that is currently only used in Nova, and as you say, doesn't have a 
use when version pinning is in place.

> 
> The nature of Heat's RPC API is that it is effectively user-facing - the 
> heat-api
> process is essentially a thin proxy between ReST and RPC. We already have a
> translation layer between the internal representation(s) of objects and the
> user-facing representation, in the form of heat.engine.api, and the RPC API is
> firmly on the user-facing side. The requirements for the content of these
> messages are actually much stricter than anything we need for RPC API
> stability, since they need to remain compatible not just with heat-api but
> with heatclient - and we have *zero* control over when that gets upgraded.
> Despite that, we've managed quite nicely for ~3 years without breaking
> changes afaik.
> 
> Versioned Objects is a great way of retaining control when you need to share
> internal data structures between processes. Fortunately the architecture of
> Heat makes that unnecessary. That was a good design decision. We are not
> going to reverse that design decision in order to use Versioned Objects. (In
> the interest of making sure everyone uses their time productively, perhaps I
> should clarify that to: "your patch is subject to -2 on sight if it introduces
> internal engine data structures to heat-api/heat-cfn-api".)
> 
> Hopefully I've convinced you of the sufficiency of this plan for the
> API<->engine interface specifically. If anyone disagrees, let them speak now, &c.

I don't understand - what is the distinction between internal and external data 
structures?

From what I understand, versioned objects were introduced in Heat to represent 
objects which are sent over RPC between Heat services.

> 
> I think there is still a case that could be made for a different approach to 
> the
> RPC API for convergence, which is engine->engine and
> (probably) doesn't yet have a formal translation layer of the same kind.
> At a minimum, obviously, we should do the same stuff listed above (though I
> don't think we need to declare that interface stable until the first release
> where we enable convergence by default).
> 

I agree this could be a good use case for VO.


> There's probably places where versioned objects could benefit us. For
> example, when we trigger a check on a resource we pass it a bundle of data
> containing all the attributes and IDs it might need from r

[openstack-dev] [Heat] RPC API versioning

2015-08-05 Thread Zane Bitter
We've been talking about this since before summit without much 
consensus. I think a large part of the problem is that very few people 
have deep knowledge of both Heat and Versioned Objects. However, I think 
we are at a point where we should be able to settle on an approach at 
least for the API<->engine RPC interface. I've been talking to Dan Smith 
about the Nova team's plan for upgrades, which goes something like this:


* Specify a max RPC API version in the config file
* In the RPC client lib, add code to handle versions as far back as the 
previous release
* The operator rolls out the updated code, keeping the existing config 
file with the pin to the previous release's RPC version
* Once all services are upgraded, the operator rolls out a new config 
file shifting the pin
* The backwards compat code to handle release N-1 is removed in the N+1 
release
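
The client-side part of this can be sketched in a few lines. oslo.messaging's RPCClient provides a version_cap argument and a can_send_version() check for this purpose; the sketch below re-implements the idea in plain Python so that it runs standalone. The class, the method, the extra argument and the version numbers are all hypothetical.

```python
# Hypothetical sketch of RPC version pinning; none of these names come from
# Heat or Nova.
def parse_version(version):
    """Turn '1.12' into (1, 12) so versions compare numerically."""
    major, minor = version.split(".")
    return int(major), int(minor)


class PinnedClient:
    """Client-side compat shim: never sends a message newer than the pin."""

    LATEST = "1.2"  # newest version this code knows how to send

    def __init__(self, send, version_cap=None):
        self._send = send  # callable(method, version, kwargs)
        # The cap would come from the config file; default to no pin.
        self.version_cap = version_cap or self.LATEST

    def can_send_version(self, version):
        return parse_version(version) <= parse_version(self.version_cap)

    def do_thing(self, thing_id, new_option=None):
        if self.can_send_version("1.2"):
            # Assumption: version 1.2 added the new_option argument.
            return self._send("do_thing", "1.2",
                              {"thing_id": thing_id, "new_option": new_option})
        # Compat path for release N-1: drop the argument old engines reject.
        return self._send("do_thing", "1.1", {"thing_id": thing_id})
```

During the rollout the operator leaves the cap at "1.1", so upgraded clients keep talking in a dialect that not-yet-upgraded engines understand. Once everything is upgraded the cap is lifted, and the "1.1" branch can be deleted in the following release.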


This is, I believe, sufficient to solve our entire problem. 
Specifically, we have no need for an indirection API that rebroadcasts 
messages that are too new (since that can't happen with pinning) and no 
need for Versioned Objects in the RPC layer. (Versioned objects for the 
DB are still critical, and we are very much better off for all the hard 
work that Michal and others have put into them. Thanks!)


The nature of Heat's RPC API is that it is effectively user-facing - the 
heat-api process is essentially a thin proxy between ReST and RPC. We 
already have a translation layer between the internal representation(s) 
of objects and the user-facing representation, in the form of 
heat.engine.api, and the RPC API is firmly on the user-facing side. The 
requirements for the content of these messages are actually much 
stricter than anything we need for RPC API stability, since they need to 
remain compatible not just with heat-api but with heatclient - and we 
have *zero* control over when that gets upgraded. Despite that, we've 
managed quite nicely for ~3 years without breaking changes afaik.


Versioned Objects is a great way of retaining control when you need to 
share internal data structures between processes. Fortunately the 
architecture of Heat makes that unnecessary. That was a good design 
decision. We are not going to reverse that design decision in order to 
use Versioned Objects. (In the interest of making sure everyone uses 
their time productively, perhaps I should clarify that to: "your patch 
is subject to -2 on sight if it introduces internal engine data 
structures to heat-api/heat-cfn-api".)


Hopefully I've convinced you of the sufficiency of this plan for the 
API<->engine interface specifically. If anyone disagrees, let them speak 
now, &c.



I think there is still a case that could be made for a different 
approach to the RPC API for convergence, which is engine->engine and 
(probably) doesn't yet have a formal translation layer of the same kind. 
At a minimum, obviously, we should do the same stuff listed above 
(though I don't think we need to declare that interface stable until the 
first release where we enable convergence by default).


There's probably places where versioned objects could benefit us. For 
example, when we trigger a check on a resource we pass it a bundle of 
data containing all the attributes and IDs it might need from resources 
it depends on. It definitely makes sense to me that that bundle would be 
a Versioned Object. (In fact, that data gets stored in the DB - as 
SyncPoint in the prototype - so we wouldn't even need to create a new 
object type. This seems like a clear win.)


What I do NOT want to do is to e.g. replace the resource_id of the 
resource to check with a versioned object containing the DB 
representation of a resource. In most cases that doesn't save a DB 
lookup, it just moves it to the other end of the RPC call. And as long 
as primary keys are unique, this can never create a compatibility problem :)


Grzegorz mentioned in https://review.openstack.org/#/c/196670/ the 
possibility of using the indirection API to handle changes to the 
message format - I guess this is an alternative to the version pinning 
approach mentioned by Dan? I'm currently leaning toward thinking that 
this is unnecessary, but I'm not sure I understand it completely so I'd 
need to see a patch to form an opinion. I'm wide open to persuasion on 
this part. However I'd be opposed to this *replacing* (rather than 
supplementing) the version pinning behaviour described above - the whole 
retransmission of too-new messages thing sounds like it could have any 
number of pathological corner cases and I'd prefer to avoid it.


cheers,
Zane.
