On 06/20/2017 09:51 AM, Alex Xu wrote:
2017-06-19 22:17 GMT+08:00 Jay Pipes <jaypi...@gmail.com>:
        * Scheduler then creates a list of N of these data structures,
        with the first being the data for the selected host, and the
        rest being data structures representing alternates consisting of
        the next hosts in the ranked list that are in the same cell as
        the selected host.

    Yes, this is the proposed solution for allowing retries within a cell.
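For concreteness, a minimal sketch of what one entry in that list might
carry. The field names here are illustrative only; the final schema is
still being worked out:

# Hypothetical shape of one entry in the list the scheduler returns.
# Field names are illustrative, not a final schema.
import collections

Selection = collections.namedtuple('Selection', [
    'service_host',        # hostname of the selected (or alternate) compute
    'nodename',            # hypervisor node name on that host
    'cell_uuid',           # cell the host lives in; same for every entry
    'allocation_request',  # placement allocation JSON to claim for this host
])

# The scheduler returns [selected, alternate_1, ..., alternate_N-1], where
# every alternate is a lower-ranked host from the same cell as the selected
# host, each carrying the allocation data needed to claim it later.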

Would it be possible to use traits to distinguish different cells? Then the retry could be done within the cell by querying placement directly with a trait that indicates the specific cell.

Those would be custom traits, generated from the cell name.

No, we're not going to use traits in this way, for a couple reasons:

1) Placement doesn't and shouldn't know about Nova's internals. Cells are internal structures of Nova. Users don't know about them, neither should placement.

2) Traits describe a resource provider. A cell ID doesn't describe a resource provider, just like an aggregate ID doesn't describe a resource provider.

        * Scheduler returns that list to conductor.
        * Conductor determines the cell of the selected host, and sends
        that list to the target cell.
        * Target cell tries to build the instance on the selected host.
        If it fails, it uses the allocation data in the data structure
        to unclaim the resources for the selected host, and tries to
        claim the resources for the next host in the list using its
        allocation data. It then tries to build the instance on the next
        host in the list of alternates. Only when all alternates fail
        does the build request fail.
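
For concreteness, a rough sketch of that retry loop as it might look in the
cell conductor. The claim/unclaim/build callables are placeholders for
whatever helpers the conductor and the placement report client end up
growing; none of these names are existing Nova APIs:

class BuildFailure(Exception):
    """Placeholder for whatever a failed build on a host raises."""


class NoValidHost(Exception):
    """Placeholder; Nova's real exception is nova.exception.NoValidHost."""


def build_with_alternates(instance_uuid, selections, claim, unclaim, build):
    """Try the selected host first, then each alternate in ranked order.

    `selections` is the scheduler's list: the selected host first, then
    alternates from the same cell. `claim(uuid, alloc)`, `unclaim(uuid)`
    and `build(uuid, host)` are supplied by the caller.
    """
    for index, sel in enumerate(selections):
        if index > 0:
            # The scheduler only claimed for the first host; alternates
            # must be claimed here, and the claim itself can fail if the
            # host filled up in the meantime.
            if not claim(instance_uuid, sel.allocation_request):
                continue
        try:
            build(instance_uuid, sel.service_host)
            return  # success; the allocation stays in place
        except BuildFailure:
            # Release this host's resources before moving on.
            unclaim(instance_uuid)
    # Only when the selected host and every alternate fail does the
    # build request itself fail.
    raise NoValidHost(instance_uuid)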

On the compute node, will we get rid of the allocation update in the periodic task "update_available_resource"? Otherwise, we will have a race between the claim in the nova-scheduler and that periodic task.

Yup, good point, and yes, we will be removing the call to PUT /allocations in the compute node resource tracker. Only DELETE /allocations/{instance_uuid} will be called if something goes terribly wrong on instance launch.
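
For reference, that cleanup is a single call against placement; a minimal
sketch, assuming an authenticated keystoneauth1 Session (real Nova would go
through the scheduler report client rather than hitting the API directly):

def delete_allocations(sess, instance_uuid):
    # sess is a keystoneauth1.session.Session with placement credentials.
    # DELETE /allocations/{consumer_uuid} drops the consumer's allocations
    # against every resource provider in one shot.
    resp = sess.delete('/allocations/%s' % instance_uuid,
                       endpoint_filter={'service_type': 'placement'})
    # 204 on success; 404 simply means there was nothing left to clean up.
    return resp.status_code in (204, 404)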

Best,
-jay
