Re: [openstack-dev] [nova] Need feedback on spec for handling down cells in the API

2018-06-25 Thread Matt Riedemann

On 6/7/2018 9:02 AM, Matt Riedemann wrote:
> We have a nova spec [1] which is at the point that it needs some API
> user (and operator) feedback on what nova API should be doing when
> listing servers and there are down cells (unable to reach the cell DB or
> it times out).
>
> tl;dr: the spec proposes to return "shell" instances which have the
> server uuid and created_at fields set, and maybe some other fields we
> can set, but otherwise a bunch of fields in the server response would be
> set to UNKNOWN sentinel values. This would be unversioned, and therefore
> could wreak havoc on existing client side code that expects fields like
> 'config_drive' and 'updated' to be of a certain format.
>
> There are alternatives listed in the spec so please read this over and
> provide feedback since this is a pretty major UX change.
>
> Oh, and no pressure, but today is the spec freeze deadline for Rocky.
>
> [1] https://review.openstack.org/#/c/557369/


The options laid out right now are:

1. Without a new microversion, include 'shell' servers in the response 
when listing over down cells. These would have UNKNOWN values for the 
fields in the server object. gibi and I didn't like this because 
existing client code wouldn't know how to deal with these UNKNOWN shell 
instances - and not all of the server fields are simple strings, we have 
booleans, integers, dicts and lists, so what would those values be?


2. In a new microversion, return a new top-level parameter when listing 
servers which would include minimal details about servers that are in 
down cells (minimal like just the uuid). This was an alternative gibi 
and I had discussed because we didn't like the client-side impacts w/o a 
microversion or the full 'shell' servers in option 1. From an IRC 
conversation last week with mordred [1], dansmith and mordred don't care 
for the new top-level parameter since clients would have to merge that 
in to the full list of available servers. Plus, in the future, if we 
ever have some kind of caching mechanism in the API from which we can 
pull instance information if it's in a down cell, then the new top-level 
parameter becomes kind of pointless.


3. In a new microversion, include servers from down cells in the same 
top-level servers response parameter but for those in down cells, we'll 
just include minimal information (status=UNKNOWN and the uuid). Clients 
would opt-in to the new microversion when they know how to deal with 
what an instance in UNKNOWN status means. In the future, we could use a 
caching mechanism to fill in these details for instances in down cells.


#3 is kind of a compromise between options 1 and 2, and I'm OK with it 
(barring any hairy details).
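
To make the difference between options 2 and 3 concrete, here is a rough sketch of what a client might see and have to do in each case. The payloads are made up for illustration: the "down_cell_servers" key is a hypothetical name (the spec does not settle on one), and the server uuid is shown under the "id" key.

```python
# Hypothetical payloads for illustration only; "down_cell_servers" is an
# assumed key name, not something taken from the spec.
OPTION_2_RESPONSE = {
    "servers": [{"id": "uuid-a", "status": "ACTIVE"}],
    "down_cell_servers": [{"id": "uuid-b"}],
}

OPTION_3_RESPONSE = {
    "servers": [
        {"id": "uuid-a", "status": "ACTIVE"},
        {"id": "uuid-b", "status": "UNKNOWN"},
    ],
}


def merge_option_2(response):
    """Option 2: the client has to merge the second list in itself."""
    servers = list(response["servers"])
    for partial in response.get("down_cell_servers", []):
        servers.append({"id": partial["id"], "status": "UNKNOWN"})
    return servers


def split_option_3(response):
    """Option 3: one list; the client branches on the UNKNOWN status."""
    known = [s for s in response["servers"] if s["status"] != "UNKNOWN"]
    unknown = [s for s in response["servers"] if s["status"] == "UNKNOWN"]
    return known, unknown
```

With option 3 a client that opted in to the new microversion only needs the status check; with option 2 it also carries the merge step.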


In all cases, we won't include 'shell' servers in the response if the 
user is filtering (or paging?), because we can't evaluate the filters 
against instances in a down cell and so can't be honest about the 
results. We just have to treat the filters as if they don't apply to the 
instances in the down cell.
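
The filtering rule could look roughly like the sketch below. This is not nova code: list_servers, matches, and the paging flag are hypothetical names, and treating paging the same as filtering is an assumption, since that point is still an open question above.

```python
def matches(server, filters):
    """True when every filter key equals the server's value for that key."""
    return all(server.get(key) == value for key, value in filters.items())


def list_servers(reachable, down_cell_ids, filters=None, paging=False):
    """Return matching servers, adding minimal entries for down cells
    only when the request is unfiltered and unpaged (assumption)."""
    filters = filters or {}
    results = [s for s in reachable if matches(s, filters)]
    if not filters and not paging:
        # Nothing to evaluate, so the minimal entries can be listed honestly.
        results.extend({"id": i, "status": "UNKNOWN"} for i in down_cell_ids)
    return results


REACHABLE = [
    {"id": "uuid-a", "status": "ACTIVE"},
    {"id": "uuid-b", "status": "ERROR"},
]
unfiltered = list_servers(REACHABLE, ["uuid-c"])
filtered = list_servers(REACHABLE, ["uuid-c"], filters={"status": "ACTIVE"})
```

Here the unfiltered call includes the uuid-c entry with status UNKNOWN, while the filtered call leaves it out.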


If you have a server in a down cell, you can't delete it or really do 
anything with it because we literally can't pull the instance out of the 
cell database while the cell is down. You'd get a 500 or 503 in that case.


Regardless of microversion, we plan on omitting instances from down 
cells when listing. That is a backportable reliability bug fix [2], so we 
don't 500 the API when listing across 70 cells and 1 is down.
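
As a minimal sketch of that behavior (not the actual change in [2]): collect results cell by cell and skip any cell that fails instead of letting one failure abort the whole listing. CellDown, list_across_cells, and the per-cell fetch callables are hypothetical stand-ins for the real cell database access.

```python
class CellDown(Exception):
    """Hypothetical error for a cell DB that is unreachable or timed out."""


def list_across_cells(cells):
    """Gather servers from every reachable cell.

    `cells` maps a cell name to a zero-argument callable returning that
    cell's servers. Cells that raise CellDown are skipped and reported.
    """
    servers = []
    skipped = []
    for name, fetch in cells.items():
        try:
            servers.extend(fetch())
        except CellDown:
            skipped.append(name)
    return servers, skipped


def down_cell():
    raise CellDown("cell2")


CELLS = {
    "cell1": lambda: [{"id": "uuid-a"}],
    "cell2": down_cell,
    "cell3": lambda: [{"id": "uuid-b"}],
}
servers, skipped = list_across_cells(CELLS)
```

The caller still gets the servers from cell1 and cell3, and can log or report that cell2 was skipped.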


[1] 
http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2018-06-20.log.html#t2018-06-20T16:52:27

[2] https://review.openstack.org/#/c/575734/

--

Thanks,

Matt

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova] Need feedback on spec for handling down cells in the API

2018-06-07 Thread Matt Riedemann
We have a nova spec [1] which is at the point that it needs some API 
user (and operator) feedback on what nova API should be doing when 
listing servers and there are down cells (unable to reach the cell DB or 
it times out).


tl;dr: the spec proposes to return "shell" instances which have the 
server uuid and created_at fields set, and maybe some other fields we 
can set, but otherwise a bunch of fields in the server response would be 
set to UNKNOWN sentinel values. This would be unversioned, and therefore 
could wreak havoc on existing client side code that expects fields like 
'config_drive' and 'updated' to be of a certain format.
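
A small sketch of the kind of breakage meant here, assuming a client that parses 'updated' as an ISO 8601 timestamp. The parse_updated helper and the sample servers are hypothetical.

```python
from datetime import datetime


def parse_updated(server):
    """Typical client code: assumes 'updated' is always a timestamp."""
    return datetime.strptime(server["updated"], "%Y-%m-%dT%H:%M:%SZ")


normal = {"id": "uuid-a", "updated": "2018-06-07T14:02:00Z"}
shell = {"id": "uuid-b", "updated": "UNKNOWN"}  # sentinel from a down cell

parsed = parse_updated(normal)

try:
    parse_updated(shell)
    shell_parsed = True
except ValueError:
    # The sentinel is not a timestamp, so unmodified clients fail here.
    shell_parsed = False
```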


There are alternatives listed in the spec so please read this over and 
provide feedback since this is a pretty major UX change.


Oh, and no pressure, but today is the spec freeze deadline for Rocky.

[1] https://review.openstack.org/#/c/557369/

--

Thanks,

Matt
