On 4/14/2017 5:18 AM, Chris Dent wrote:

Here's the 19th placement and resource providers update. As usual,
if I've left anything out, please follow up with that information.

# What Matters Most

Discussion continues on the spec for claims being done during
scheduling. Getting this worked out and implemented is a high
priority. Links below.

# What's Changed

The routes and handlers for adding and manipulating traits in the
placement API have now merged. This opens the door for starting to
report traits for compute-nodes and other resource providers and
filtering based on those traits (note that the added code does not
support that filtering; what's been added is the interface to CRUD
traits).
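
A rough sketch of what the CRUD interface looks like from a client's
point of view. It assumes the routes are `/traits` and `/traits/{name}`
and that they are exposed from placement microversion 1.6; the endpoint
URL, token, and trait name are placeholders, and `trait_request` is a
hypothetical helper. The requests are only built here, not sent.

```python
from urllib.request import Request

# Placeholder values, not taken from any real deployment.
PLACEMENT = "http://placement.example.com"
HEADERS = {
    "OpenStack-API-Version": "placement 1.6",  # assumed microversion for traits
    "X-Auth-Token": "TOKEN",
}


def trait_request(method, name=None):
    """Build (but do not send) a request against the traits routes."""
    path = "/traits" if name is None else "/traits/" + name
    return Request(PLACEMENT + path, method=method, headers=HEADERS)


# Create, read, list and delete. Custom trait names are assumed to
# need the CUSTOM_ prefix.
create = trait_request("PUT", "CUSTOM_FAST_DISK")
show = trait_request("GET", "CUSTOM_FAST_DISK")
list_all = trait_request("GET")
delete = trait_request("DELETE", "CUSTOM_FAST_DISK")

for req in (create, show, list_all, delete):
    print(req.get_method(), req.full_url)
```

Passing any of these to `urllib.request.urlopen` would issue the call
against a real endpoint.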

The spec for associating user and project information with
allocations and for being able to view usages based on those
characteristics has been merged. We had to go a few rounds as we
were so excited about this idea we missed some critical bits:


http://specs.openstack.org/openstack/nova-specs/specs/pike/approved/placement-project-user.html


The placement-api-ref gate job that checks the docs is now linking
the output. Here's a sample (which may have expired by the time you
are reading this message):


http://docs-draft.openstack.org/98/456198/1/check/gate-placement-api-ref-nv/deee665//placement-api-ref/build/html/

Cool, this looks nice.



More about docs below.

# Help Wanted

Areas where volunteers are needed.

* General attention to bugs tagged placement:
     https://bugs.launchpad.net/nova/+bugs?field.tag=placement

* Helping to create API documentation for placement (see the Docs
     section below).

* Helping to create and evaluate functional tests of the resource
     tracker and the ways in which it and nova-scheduler use the
     reporting client. For some info see
     https://etherpad.openstack.org/p/nova-placement-functional
     and talk to edleafe.

* Performance testing. If you have access to some nodes, some basic
    benchmarking and profiling would be very useful. See the
    performance section below.

# Main Themes

## Traits

The main API is in place; there's one patch left for a new command
to sync the os-traits library into the database:

     https://review.openstack.org/#/c/450125/

There is a stack of changes to the os-traits library to add more traits
and also automate creating symbols associated with the trait
strings:

     https://review.openstack.org/#/c/448282/

## Ironic/Custom Resource Classes

There's a blueprint for "custom resource classes in flavors" that
describes the stuff that will actually make use of custom resource
classes:


https://blueprints.launchpad.net/nova/+spec/custom-resource-classes-in-flavors

Due to the OSIC thing, Jay Pipes is going to pick this up now.



The spec has merged, but the implementation has not yet started.

Over in Ironic some functional and integration tests have started:

     https://review.openstack.org/#/c/443628/

## Claims in the Scheduler

Progress has been made on the spec for claims in the scheduler:

     https://review.openstack.org/#/c/437424/

Continued eyes and brains required. The current state is that more
detail is desired on why some particular design choices are being
made.

As of today's scheduler subteam meeting, the current most important
decision to be made is if conductor does the retries when a claim
fails due to conflict, or if the scheduler should. Related to the need
for performance testing at scale, it would really help to gather some
data on both approaches for retries here.

Retrying in nova-conductor means going back through the scheduler to
pull all of the hosts fresh and filtering/weighing them again, which
would be more accurate but could be inefficient. Retrying in the
filter scheduler would be more efficient since we have the hosts in an
ordered list and don't need to refresh - but that could mean they are
stale now too.

Maybe we can stub some scale testing with fake compute nodes and the
fake compute driver for this. Having a 'test' in tree doesn't make a
lot of sense though as it does not really pass or fail, it's just
there to compare against alternatives. I was wondering if we could
re-use some of Yingxin's performance scale testing work that he
presented at the Newton summit [1]. Alex said Yingxin is working on
Ceph now, but the tooling should be on github somewhere.
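
To make the trade-off between the two retry locations concrete, here
is a toy simulation. It is not nova code: `Host`, `weigh`, `claim`,
the two retry functions and the scenario are all made up for
illustration, and a single integer stands in for a host's free
resources. It only counts how many filter/weigh passes and claim
attempts each approach needs when a competing claim lands on the
top-ranked host.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    free: int  # free units of a single made-up resource


def weigh(hosts, need):
    """Filter out hosts that cannot fit the request, most free first."""
    return sorted((h for h in hosts if h.free >= need), key=lambda h: -h.free)


def claim(host, need):
    """Try to claim; fails with a 'conflict' if the host filled up."""
    if host.free < need:
        return False
    host.free -= need
    return True


def retry_in_scheduler(hosts, need, interfere):
    """Weigh once, then walk the (possibly stale) ordered list."""
    stats = {"weigh_passes": 1, "claim_attempts": 0}
    for host in weigh(hosts, need):
        interfere()  # other activity between weighing and claiming
        stats["claim_attempts"] += 1
        if claim(host, need):
            return host.name, stats
    return None, stats


def retry_in_conductor(hosts, need, interfere, max_retries=3):
    """On conflict, go back and filter/weigh all hosts again."""
    stats = {"weigh_passes": 0, "claim_attempts": 0}
    for _ in range(max_retries):
        stats["weigh_passes"] += 1
        ordered = weigh(hosts, need)
        if not ordered:
            break
        interfere()
        stats["claim_attempts"] += 1
        if claim(ordered[0], need):
            return ordered[0].name, stats
    return None, stats


def make_scenario():
    hosts = [Host("a", 8), Host("b", 6), Host("c", 4)]
    fired = []

    def interfere():
        # Once only: a competing claim lands on "a" and space frees on "c".
        if not fired:
            hosts[0].free -= 6
            hosts[2].free += 4
            fired.append(True)

    return hosts, interfere


hosts, interfere = make_scenario()
print("scheduler:", retry_in_scheduler(hosts, 4, interfere))
hosts, interfere = make_scenario()
print("conductor:", retry_in_conductor(hosts, 4, interfere))
```

In this scenario the scheduler-side retry settles for host "b" from
its stale list after one weigh pass, while the conductor-side retry
pays for a second weigh pass and picks "c", which has become the best
fit in the meantime. Real numbers would need the fake-driver scale
testing described above.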


Thinking about this stuff has also revealed some places where it's
possible for allocations to become wrong or orphaned:

     https://bugs.launchpad.net/nova/+bug/1679750
     https://bugs.launchpad.net/nova/+bug/1661312

## Shared Resource Providers

https://blueprints.launchpad.net/nova/+spec/shared-resources-pike

Progress on this will continue once traits and claims have moved forward.

## Nested Resource Providers

On hold while attention is given to traits and claims. There's a
stack of code waiting until all of that settles:


https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/nested-resource-providers


## Docs

https://review.openstack.org/#/q/topic:cd/placement-api-ref

Several reviews are in progress for documenting the placement API.
This is likely going to take quite a few iterations as we work out
the patterns and tooling. But it's great to see the progress and
when looking at the draft rendered docs it makes placement feel like
a real thing™.

Find me (cdent) or Andrey (avolkov) if you want to help out or have
other questions.

## Performance

We're aware that there are some redundancies in the resource tracker
that we'd like to clean up


http://lists.openstack.org/pipermail/openstack-dev/2017-January/110953.html

but it's also the case that we've done no performance testing on the
placement service itself.

We ought to do some testing to make sure there aren't unexpected
performance drains.

# Other Code/Specs

* https://review.openstack.org/#/c/448791/
    Idempotent PUT for resource classes. This is something that was
    discovered while evaluating some resource tracker code.

    Once this merges a change to the report client can be made
    to use it.

* https://bugs.launchpad.net/nova/+bug/1632852
    Cache headers not produced by placement API. This was assigned to
    several different people over time, but I'm not sure if there is
    any active code.

* https://etherpad.openstack.org/p/placement-newton-leftovers
    There's still some lingering stuff on here, some of which is
    mentioned elsewhere in this message, but not all.

* https://review.openstack.org/#/c/456717/
    There's effort afoot over in devstack to use a combination of
    apache2, mod_proxy_uwsgi, and uwsgi itself to run the services.
    The review above is for the placement part of that. This allows
    the placement API to be managed by systemd, not occupy a port,
    and have some reasonable log handling.

* https://review.openstack.org/#/q/project:openstack/osc-placement
    Work has started on an osc-plugin that can provide a command
    line interface to the placement API.

# End

Thanks for reading.



__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[1] https://docs.google.com/presentation/d/1UG1HkEWyxPVMXseLwJ44ZDm-ek_MPc4M65H8EiwZnWs/edit?ts=571fcdd5#slide=id.g12d2cf15cd_2_90

--

Thanks,

Matt
