Re: [openstack-dev] [nova] Belated nova newton midcycle recap (part 2)

2016-08-02 Thread Matt Riedemann

On 8/2/2016 12:25 PM, Jim Rollenhagen wrote:

On Mon, Aug 01, 2016 at 09:15:46PM -0500, Matt Riedemann wrote:




* Placement API for resource providers

Jay's personal goal for Newton is for the resource tracker to be writing
inventory and allocation data via the placement API. We want to get the data
written into the placement API in Newton so we can start using it in Ocata.

There are some spec amendments up for resource providers (at least one has
merged), and the initial placement API change merged today:

https://review.openstack.org/#/c/329149/

We talked about supporting dynamic resource classes for Ironic use cases,
which is a stretch goal for Nova in Newton. Jay has a spec for that here:

https://review.openstack.org/#/c/312696/

There is a lot more detail in the etherpad, and honestly Jay Pipes or Jim
Rollenhagen would be better placed to summarize what came out of this at the
midcycle and what's being worked on for dynamic resource classes right now.


I actually wrote a bit about this last week:
http://lists.openstack.org/pipermail/openstack-dev/2016-July/099922.html

I'm not sure it covers everything, but it's the important pieces I got
from it.

// jim


We talked about a separate placement API database but decided this should be
optional to avoid forcing yet another nova database on deployers in a couple
of releases. This would be available for deployers to use to avoid some
future upgrade pain when the placement service is split out from Nova, but
if not configured it will default to the API database for the placement API.
There are a bunch more details and discussion on that in this thread that
Chris Dent started after the midcycle:

http://lists.openstack.org/pipermail/openstack-dev/2016-July/100302.html



--

Thanks,

Matt Riedemann





Perfect, thanks! I totally missed that.

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova] Belated nova newton midcycle recap (part 2)

2016-08-02 Thread Brian Haley

On 08/01/2016 10:15 PM, Matt Riedemann wrote:

Starting from where I accidentally left off:





We also talked a bit about live migration with Neutron. There has been a fix up
for live migration + DVR since Mitaka:

https://review.openstack.org/#/c/275073

It's a bit of a hacky workaround, but the longer-term solution that we all want (
https://review.openstack.org/#/c/309416 ) is not going to be in Newton and will
need discussion at the Ocata summit in Barcelona (John Garbutt was going to work
with the Neutron team on preparing for the summit on that one). So we agreed to
go with Swami's DVR fix but it needs to be rebased (which still hasn't happened
since the midcycle).


I just pushed an update to the DVR live-migration patch, rebased to master, so
feel free to review again. Swami or I will answer any other comments as
quickly as possible.


Thanks,

-Brian



Re: [openstack-dev] [nova] Belated nova newton midcycle recap (part 2)

2016-08-02 Thread Jim Rollenhagen
On Mon, Aug 01, 2016 at 09:15:46PM -0500, Matt Riedemann wrote:
> 
> 
> 
> * Placement API for resource providers
> 
> Jay's personal goal for Newton is for the resource tracker to be writing
> inventory and allocation data via the placement API. We want to get the data
> written into the placement API in Newton so we can start using it in Ocata.
> 
> There are some spec amendments up for resource providers (at least one has
> merged), and the initial placement API change merged today:
> 
> https://review.openstack.org/#/c/329149/
> 
> We talked about supporting dynamic resource classes for Ironic use cases,
> which is a stretch goal for Nova in Newton. Jay has a spec for that here:
> 
> https://review.openstack.org/#/c/312696/
> 
> There is a lot more detail in the etherpad, and honestly Jay Pipes or Jim
> Rollenhagen would be better placed to summarize what came out of this at the
> midcycle and what's being worked on for dynamic resource classes right now.

I actually wrote a bit about this last week:
http://lists.openstack.org/pipermail/openstack-dev/2016-July/099922.html

I'm not sure it covers everything, but it's the important pieces I got
from it.

// jim

> We talked about a separate placement API database but decided this should be
> optional to avoid forcing yet another nova database on deployers in a couple
> of releases. This would be available for deployers to use to avoid some
> future upgrade pain when the placement service is split out from Nova, but
> if not configured it will default to the API database for the placement API.
> There are a bunch more details and discussion on that in this thread that
> Chris Dent started after the midcycle:
> 
> http://lists.openstack.org/pipermail/openstack-dev/2016-July/100302.html
> 
> 
> 
> -- 
> 
> Thanks,
> 
> Matt Riedemann


[openstack-dev] [nova] Belated nova newton midcycle recap (part 2)

2016-08-01 Thread Matt Riedemann

Starting from where I accidentally left off:

* Vendor metadata reboot

We agreed that we still wanted mikal to keep working on this so we can 
keep to a timeline for removing the deprecated dynamic vendordata classloader.


The API change was merged last week:

https://review.openstack.org/#/c/317739/

There are still some testing and documentation changes left.
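
For anyone who missed the reboot discussion: the new model has
nova-metadata POST instance context to externally operated REST services
and merge each JSON reply into vendor_data2.json. Here's a minimal
sketch of such a service, assuming the JSON-over-HTTP contract from the
spec (the payload fields and port are illustrative, not from the merged
code):

  # Hypothetical dynamic vendordata service: nova-metadata POSTs instance
  # context and merges the JSON reply into vendor_data2.json.
  import json
  from http.server import BaseHTTPRequestHandler, HTTPServer

  class VendordataHandler(BaseHTTPRequestHandler):
      def do_POST(self):
          length = int(self.headers.get('Content-Length', 0))
          context = json.loads(self.rfile.read(length))
          # Return whatever vendor-specific metadata this instance needs.
          reply = {'project': context.get('project-id'),
                   'backup-schedule': 'daily'}
          body = json.dumps(reply).encode('utf-8')
          self.send_response(200)
          self.send_header('Content-Type', 'application/json')
          self.send_header('Content-Length', str(len(body)))
          self.end_headers()
          self.wfile.write(body)

  if __name__ == '__main__':
      HTTPServer(('127.0.0.1', 8090), VendordataHandler).serve_forever()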

* Microversion API testing in Tempest

We talked about the current state of getting changes into Tempest for 
Nova microversions and how a bunch of changes were up for review at the 
same time to backfill some schema responses, like for 2.3 and 2.26.
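
To make the backfill concrete, this is roughly the shape of a
per-microversion response schema in Tempest; the module path follows
Tempest's api_schema conventions and the 2.26 tags addition, but treat
the details as a hedged sketch rather than the merged patches:

  # Sketch: v2.26 added instance tags to server responses, so the v2.1
  # schema is copied and extended for the new microversion.
  import copy

  from tempest.lib.api_schema.response.compute.v2_1 import servers as servers_21

  get_server = copy.deepcopy(servers_21.get_server)
  get_server['response_body']['properties']['server']['properties'][
      'tags'] = {'type': 'array', 'items': {'type': 'string'}}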


I already posted what I thought we had agreed on at the midcycle:

http://lists.openstack.org/pipermail/openstack-dev/2016-July/099860.html

But there is some disagreement about how I tried to write that up in the 
Tempest docs, so we're still trying to hash out this policy:


https://review.openstack.org/#/c/346092/

* Placement API for resource providers

Jay's personal goal for Newton is for the resource tracker to be writing 
inventory and allocation data via the placement API. We want to get the 
data written into the placement API in Newton so we can start using it 
in Ocata.
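
At the REST level that write path amounts to the resource tracker
creating a resource provider for each compute node and PUTting its
inventory. A hedged sketch against the API as specced (endpoint paths
match the resource-providers specs; tokens, UUIDs and numbers are
placeholders):

  # Sketch: a compute node reporting inventory to the placement API.
  import requests

  PLACEMENT = 'http://placement.example.com'
  HEADERS = {'X-Auth-Token': 'service-token'}
  RP_UUID = '4f7c6844-9b7b-4eb5-a296-6c9b77e623ba'  # compute node provider

  inventory = {
      'resource_provider_generation': 0,
      'inventories': {
          'VCPU': {'total': 48, 'allocation_ratio': 16.0, 'reserved': 0},
          'MEMORY_MB': {'total': 131072, 'allocation_ratio': 1.5},
          'DISK_GB': {'total': 2048, 'reserved': 100},
      },
  }
  resp = requests.put('%s/resource_providers/%s/inventories'
                      % (PLACEMENT, RP_UUID),
                      json=inventory, headers=HEADERS)
  resp.raise_for_status()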


There are some spec amendments up for resource providers (at least one 
has merged), and the initial placement API change merged today:


https://review.openstack.org/#/c/329149/

We talked about supporting dynamic resource classes for Ironic use cases, 
which is a stretch goal for Nova in Newton. Jay has a spec for that here:


https://review.openstack.org/#/c/312696/

There is a lot more detail in the etherpad, and honestly Jay Pipes or Jim 
Rollenhagen would be better placed to summarize what came out of this at the 
midcycle and what's being worked on for dynamic resource classes right now.


We talked about a separate placement API database but decided this 
should be optional to avoid forcing yet another nova database on 
deployers in a couple of releases. This would be available for deployers 
to use to avoid some future upgrade pain when the placement service is 
split out from Nova, but if not configured it will default to the API 
database for the placement API. There are a bunch more details and 
discussion on that in this thread that Chris Dent started after the 
midcycle:


http://lists.openstack.org/pipermail/openstack-dev/2016-July/100302.html
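
In configuration terms the idea is roughly the following; the section
and option names are my guess at the shape being discussed, not settled
config:

  # Hypothetical nova.conf sketch of the optional placement database.
  # If [placement_database]/connection is unset, the placement API
  # falls back to the API database.
  [placement_database]
  connection = mysql+pymysql://nova:secret@db.example.com/nova_placement

  [api_database]
  connection = mysql+pymysql://nova:secret@db.example.com/nova_api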

* nova/cinder interlock

A few of us called into the cinder midcycle hangout to talk through a 
few ongoing efforts between projects.


John Griffith has some POC code up that adds a new API to Cinder which 
consolidates the os-reserve, os-initialize_connection and os-attach 
APIs into a single API, along with changes to cinderclient and nova to 
use it. This is to close the gap on some long-standing race issues 
between nova and cinder volume attach operations, and it will feed into 
the volume multi-attach work since cinder will be storing the 
attachment information differently so we can detach properly. I need to 
fix up my devstack(-gate) change to test the entire stack, and John 
Griffith was going to write a spec for Cinder for the new API.
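
As a rough illustration of the consolidation (the three-step flow is
today's sequence via python-cinderclient; the single call at the end is
a hypothetical stand-in for the POC API, not anything merged):

  from cinderclient import client

  cinder = client.Client('2', 'user', 'password', 'demo',
                         'http://keystone.example.com:5000/v2.0')
  connector = {'initiator': 'iqn.1993-08.org.debian:01:deadbeef',
               'host': 'compute1'}
  volume_id = 'VOLUME_UUID'
  instance_uuid = 'INSTANCE_UUID'

  # Existing flow: three calls with races possible in between.
  cinder.volumes.reserve(volume_id)
  conn_info = cinder.volumes.initialize_connection(volume_id, connector)
  cinder.volumes.attach(volume_id, instance_uuid, '/dev/vdb')

  # POC direction (hypothetical name): one call that reserves, maps and
  # records a first-class attachment, which multi-attach needs.
  # attachment = cinder.attachments.create(volume_id, connector,
  #                                        instance_uuid)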


John Garbutt and I had a TODO to review Walter Boring's bug fix / 
cleanup nova-api change to stop doing state checking from the API and 
let Cinder handle that in os-reserve:


https://review.openstack.org/#/c/315789/

There is some other related work to that, but it gets a bit more 
complicated in the boot-from-volume and live migration cases. John 
Griffith was also going to check with some other Cinder storage 
backends like Ceph/FC to make sure these changes would be OK for them, 
and to check on live migration testing (PureStorage is working on 
multinode live migration test coverage for their third-party CI using 
iSCSI/Fibre Channel).


Matt Treinish also helped sort out some issues with getting a cinder 
multi-backend job in the gate that Scott D'Angelo has been working on. 
There is a series of changes to project-config, devstack and Tempest to 
get this testing working so we can test things like volume 
retype/migration and swap-volume in the gate with an LVM backend. The 
devstack change should just be a pass-through config to Tempest from the 
job rather than the existing approach.
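
For reference, the job wiring being discussed looks roughly like this
in devstack's local.conf; CINDER_ENABLED_BACKENDS is existing devstack
config, while passing the backend names through to Tempest is the part
still under review, so treat that option name as an assumption:

  [[local|localrc]]
  CINDER_ENABLED_BACKENDS=lvm:lvmdriver-1,lvm:lvmdriver-2

  [[test-config|$TEMPEST_CONFIG]]
  [volume]
  backend_names = lvmdriver-1,lvmdriver-2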


* nova/neutron interlock

Carl Baldwin was at the meetup so we mostly talked about routed networks 
and the deferred IP allocation change he's been working on:


https://review.openstack.org/#/c/299591/

This is a step toward using routed networks before we have the full 
proper scheduling in place with resource providers and the placement 
API. We talked through some issues with build failures and rescheduling, 
which right now will just reschedule until failure, but Carl has some 
changes to detect fixed IP allocation on the wrong host and fail, which 
Nova can then handle and abort the build rather than reschedule. We can 
work that in as a bug fix though once we get the initial change in to 
support deferred IP allocation.
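
For context, the deferred flow looks roughly like this from the API
side (the ip_allocation attribute comes from the routed networks work;
endpoints, tokens and UUIDs are placeholders):

  # Sketch: on a routed network a port starts with no fixed IP
  # ("deferred") and only gets one once nova binds it to a host, which
  # pins a segment. If the host can't reach any segment, binding fails
  # and nova should abort the build instead of rescheduling forever.
  import requests

  NEUTRON = 'http://neutron.example.com:9696/v2.0'
  HEADERS = {'X-Auth-Token': 'user-token'}

  port = requests.post('%s/ports' % NEUTRON, headers=HEADERS,
                       json={'port': {'network_id': 'ROUTED_NET_UUID'}}
                       ).json()['port']
  assert port['ip_allocation'] == 'deferred'  # no IP until host binding

  # ... later, after nova binds the port to a host ...
  port = requests.get('%s/ports/%s' % (NEUTRON, port['id']),
                      headers=HEADERS).json()['port']
  print(port['fixed_ips'])  # now populated from the host's segment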


We 

Re: [openstack-dev] [nova] Belated nova newton midcycle recap

2016-08-01 Thread Matt Riedemann

On 8/1/2016 4:19 PM, Matt Riedemann wrote:

It's a little late but I wanted to get a high level recap of the nova
newton midcycle written up for those that didn't make it.

First off, thanks again to Intel for hosting and especially to Cindy
Sirianni at Intel for making sure we were taken care of. We had about 40
people each day in a single room so it was a little cramped, but being
the champions we are, we survived.

The full etherpad is here:

https://etherpad.openstack.org/p/nova-newton-midcycle

I won't go into all of the details about every topic because (a) there
was a lot of discussion and a lot of topics and (b) I honestly didn't
catch everything, so I'm going to go over the highlights/decisions/todos
(in no particular order).

* cells v2 progress/status check

The aggregates and server group data migration changes are underway and
being reviewed. Migrating quotas to the API DB needs work though and
someone besides Mark Doffman (doffm) will probably need to pick that up.

Only scheduler failures live in cell0, so we talked about how
those fit into the server list response. We decided that servers without
a host will be sorted at the front of the list response, and servers
with a host will be sorted after that. This will need to be documented
behavior in the API and could be improved later with Searchlight. We
would like someone to be a point person for interlocking with the
Searchlight team and we thought Balazs Gibizer (gibi) would be a good
person for this.

Andrew Laski has a change up for migrating from non-cells to cells v2.
We want to force people to upgrade to cells v2 in Newton so that we can
land a breaking change in Ocata to block people that aren't on cells v2
yet. Doing this is contingent on grenade testing. Dan Smith has the TODO
to look at the grenade changes. We don't plan on grenade testing cells
v1 to cells v2. We'll need to get docs changes for upgrades for the
process of migrating to cells v2. Michael Still (mikal) said we needed
to open bugs against the docs team for this.

The goal for Newton with cells v2 is that an instance record will not be
created until we pick a cell and we'll use the BuildRequest until that
point, and listing/deleting instances during that window will still work
as normal. For listing instances, we will prepend BuildRequests to the
front of the list (unsorted). We'll also limit the sort_keys in the API,
at least to exclude fields on joined tables - that can be fixed as a
bug fix.

For RPC/DB context switching, the infrastructure is in place but we
probably won't use this in Newton. There is a problem with version caps
and sending a new object to an old cell. There are a few proposed
solutions and Dan Smith was looking at testing a solution for this, but
we'll most likely end up documenting it for upgrades.

* API policy in code

Claudiu Belu has a patch up for a nova-manage command to check what APIs
a given user can perform. This is a first step to eventually getting to
a discoverable policy CLI and it also provides a debug tool for
operators when API users get policy errors.

We also said that any command for determining the effective policy of a
deployment or checking duplicates should live in oslo.policy, not nova,
since other projects are looking for the same thing, like Ironic. Nova
wouldn't have a nova-manage command for this but would have an
entrypoint. We also need to prioritize anything that needs to get into
oslo.policy so we're not caught by the final non-client library release
the week of 8/22.

* API docs in tree

Things are slow but that's mostly OK; we'll continue working on this
past feature freeze since it's docs. And we'll probably schedule an
api-ref docs review sprint early in September after feature freeze hits.

* Proxy API deprecations

We talked quite a bit about how to land the proxy API deprecation and
network API changes in a single microversion, which actually happened
with 2.36 last week.

Most of the discussion was around how to handle the network API
deprecation since if you're using nova-network it's not a proxy. We
didn't want to special-case the network APIs though, and we wanted the
additional signaling mechanism that the network APIs, and nova-network,
are deprecated, so we ultimately decided to include nova-network and all
network APIs in the 2.36 microversion for deprecation. The sticky thing
is that today you can request <2.36 and the API still works. After
nova-network is deleted from code, that will no longer work. Yes this is
a backward incompatible change, but we wanted the further signaling of
the removal rather than just yank it outright when the time comes.

To ease some of the client experience, Dan Smith is working on a change in
python-novaclient to deprecate the network CLIs, and if requesting
microversion >= 2.36 we'll fall back to 2.35 (or the latest available that
still makes this work). So the network CLIs will be deprecated and emit
a warning but continue to work even though API users will not be 

[openstack-dev] [nova] Belated nova newton midcycle recap

2016-08-01 Thread Matt Riedemann
It's a little late but I wanted to get a high level recap of the nova 
newton midcycle written up for those that didn't make it.


First off, thanks again to Intel for hosting and especially to Cindy 
Sirianni at Intel for making sure we were taken care of. We had about 40 
people each day in a single room so it was a little cramped, but being 
the champions we are, we survived.


The full etherpad is here:

https://etherpad.openstack.org/p/nova-newton-midcycle

I won't go into all of the details about every topic because (a) there 
was a lot of discussion and a lot of topics and (b) I honestly didn't 
catch everything, so I'm going to go over the highlights/decisions/todos 
(in no particular order).


* cells v2 progress/status check

The aggregates and server group data migration changes are underway and 
being reviewed. Migrating quotas to the API DB needs work though and 
someone besides Mark Doffman (doffm) will probably need to pick that up.


Only scheduler failures live in cell0, so we talked about how 
those fit into the server list response. We decided that servers without 
a host will be sorted at the front of the list response, and servers 
with a host will be sorted after that. This will need to be documented 
behavior in the API and could be improved later with Searchlight. We 
would like someone to be a point person for interlocking with the 
Searchlight team and we thought Balazs Gibizer (gibi) would be a good 
person for this.


Andrew Laski has a change up for migrating from non-cells to cells v2. 
We want to force people to upgrade to cells v2 in Newton so that we can 
land a breaking change in Ocata to block people that aren't on cells v2 
yet. Doing this is contingent on grenade testing. Dan Smith has the TODO 
to look at the grenade changes. We don't plan on grenade testing cells 
v1 to cells v2. We'll need to get docs changes for upgrades for the 
process of migrating to cells v2. Michael Still (mikal) said we needed 
to open bugs against the docs team for this.


The goal for Newton with cells v2 is that an instance record will not be 
created until we pick a cell and we'll use the BuildRequest until that 
point, and listing/deleting instances during that window will still work 
as normal. For listing instances, we will prepend BuildRequests to the 
front of the list (unsorted). We'll also limit the sort_keys in the API, 
at least to exclude fields on joined tables - that can be fixed as a 
bug fix.
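
A tiny sketch of the agreed listing semantics (pure illustration, not
the actual compute API code):

  # BuildRequests (instances that don't have a cell yet) are prepended
  # unsorted; per-cell results honor the requested sort key.
  def list_servers(build_requests, cell_instances, sort_key='created_at'):
      ordered = sorted(cell_instances, key=lambda inst: inst[sort_key])
      return list(build_requests) + ordered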


For RPC/DB context switching, the infrastructure is in place but we 
probably won't use this in Newton. There is a problem with version caps 
and sending a new object to an old cell. There are a few proposed 
solutions and Dan Smith was looking at testing a solution for this, but 
we'll most likely end up documenting it for upgrades.


* API policy in code

Claudiu Belu has a patch up for a nova-manage command to check what APIs 
a given user can perform. This is a first step to eventually getting to 
a discoverable policy CLI and it also provides a debug tool for 
operators when API users get policy errors.


We also said that any command for determining the effective policy of a 
deployment or checking duplicates should live in oslo.policy, not nova, 
since other projects are looking for the same thing, like Ironic. Nova 
wouldn't have a nova-manage command for this but would have an 
entrypoint. We also need to prioritize anything that needs to get into 
oslo.policy so we're not caught by the final non-client library release 
the week of 8/22.
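
As a concrete primitive, the kind of check such a tool would loop over
per policy rule looks like this with oslo.policy (the rule name and
check string here are illustrative):

  from oslo_config import cfg
  from oslo_policy import policy

  enforcer = policy.Enforcer(cfg.CONF)
  enforcer.register_default(policy.RuleDefault(
      'os_compute_api:servers:create',
      'role:member and project_id:%(project_id)s'))

  creds = {'user_id': 'alice', 'project_id': 'demo', 'roles': ['member']}
  target = {'project_id': 'demo'}
  # True if this user context would be allowed to create a server.
  print(enforcer.enforce('os_compute_api:servers:create', target, creds))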


* API docs in tree

Things are slow but that's mostly OK; we'll continue working on this 
past feature freeze since it's docs. And we'll probably schedule an 
api-ref docs review sprint early in September after feature freeze hits.


* Proxy API deprecations

We talked quite a bit about how to land the proxy API deprecation and 
network API changes in a single microversion, which actually happened 
with 2.36 last week.


Most of the discussion was around how to handle the network API 
deprecation since if you're using nova-network it's not a proxy. We 
didn't want to special-case the network APIs though, and we wanted the 
additional signaling mechanism that the network APIs, and nova-network, 
are deprecated, so we ultimately decided to include nova-network and all 
network APIs in the 2.36 microversion for deprecation. The sticky thing 
is that today you can request <2.36 and the API still works. After 
nova-network is deleted from code, that will no longer work. Yes this is 
a backward incompatible change, but we wanted the further signaling of 
the removal rather than just yank it outright when the time comes.
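
In request terms the boundary looks like this (a sketch; the header is
the standard compute microversion header, and os-networks stands in for
the proxied network APIs):

  import requests

  url = 'http://nova.example.com/v2.1/os-networks'  # placeholder endpoint
  ok = requests.get(url, headers={'X-Auth-Token': 'token',
                                  'X-OpenStack-Nova-API-Version': '2.35'})
  gone = requests.get(url, headers={'X-Auth-Token': 'token',
                                    'X-OpenStack-Nova-API-Version': '2.36'})
  print(ok.status_code, gone.status_code)  # 200 vs. 404 once 2.36 lands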


To ease some of the client experience, Dan Smith is working on a change in 
python-novaclient to deprecate the network CLIs, and if requesting 
microversion >= 2.36 we'll fall back to 2.35 (or the latest available that 
still makes this work). So the network CLIs will be deprecated and emit 
a warning but continue to work even