Re: [openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers

Balázs Gibizer Tue, 29 May 2018 05:25:05 -0700

On Tue, May 29, 2018 at 1:47 PM, Sylvain Bauza <[email protected]> wrote:

On Tue, May 29, 2018 at 11:02 AM, Balázs Gibizer <[email protected]> wrote:



On Tue, May 29, 2018 at 9:38 AM, Sylvain Bauza <[email protected]> wrote:
>
>
> On Tue, May 29, 2018 at 3:08 AM, TETSURO NAKAMURA
> <[email protected]> wrote:
>
>> > In that situation, say for example with VGPU inventories, that would mean
>> > that the compute node would stop reporting inventories for its root RP, but
>> > would rather report inventories for at least one single child RP.
>> > In that model, do we reconcile the allocations that were already made
>> > against the "root RP" inventory?
>>
>> It would be nice to see Eric and Jay comment on this,
>> but if I'm not mistaken, when the virt driver stops reporting
>> inventories for its root RP, placement would try to delete that
>> inventory inside and raise InventoryInUse exception if any
>> allocations still exist on that resource.
>>
>> ```
>> update_from_provider_tree() (nova/compute/resource_tracker.py)
>>   + _set_inventory_for_provider() (nova/scheduler/client/report.py)
>>       + put() - PUT /resource_providers/<rp_uuid>/inventories with new inventories (scheduler/client/report.py)
>>           + set_inventories() (placement/handler/inventory.py)
>>               + _set_inventory() (placement/objects/resource_provider.py)
>>                   + _delete_inventory_from_provider() (placement/objects/resource_provider.py)
>>                       -> raise exception.InventoryInUse
>> ```
>>
>> So do we need some trick, such as deleting the VGPU allocations
>> before upgrading and setting them again on the newly created
>> child RP after upgrading?
>>
>

> I wonder if we should keep the existing inventory in the root RP, and
> somehow just reserve the remaining resources (so Placement wouldn't pass
> that root RP for queries, but would still have allocations). But
> then, where and how to do this? By the resource tracker?
>

AFAIK it is the virt driver that decides to model the VGPU resource at a
different place in the RP tree, so I think it is the responsibility of
the same virt driver to move any existing allocation from the old place
to the new place during this change.

Cheers,
gibi

Why not leave the allocation where it is and instead have the virt driver update the root RP by setting its reserved value to the total size?

That way, the virt driver wouldn't need to ask for an allocation but could rather continue to provide inventories...


Thoughts?

Keeping the old allocation at the old RP and adding a similarly sized reservation in the new RP feels hackish, as those are not really reserved GPUs but used GPUs, just from the old RP. If somebody sums up the total reported GPUs in this setup via the placement API, then she will get more GPUs in total than what is physically visible to the hypervisor, as the GPUs that are part of the old allocation are reported twice, in two different total values. Could we just report less GPU inventory to the new RP for as long as the old RP has GPU allocations?
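
To make the double counting concrete, here is a minimal sketch with made-up numbers. The host size, the used count, the provider labels, and the assumption that the old RP keeps an inventory exactly covering its existing allocations are all hypothetical; this is plain Python, not placement code.

```python
# Hypothetical host: 4 physical VGPUs, 2 of them already allocated
# against the old (root) RP before the upgrade.
PHYSICAL_VGPUS = 4
OLD_RP_USED = 2


def summed_total(inventories):
    """Sum the 'total' field over all providers, as an API consumer might."""
    return sum(inv["total"] for inv in inventories.values())


# Reservation approach: the new RP reports every GPU and marks the ones
# still used via the old RP as reserved (assumed layout, for illustration).
reservation_approach = {
    "old_root_rp": {"total": OLD_RP_USED, "reserved": 0},
    "new_child_rp": {"total": PHYSICAL_VGPUS, "reserved": OLD_RP_USED},
}

# Reduced-inventory approach: the new RP only reports the GPUs that are
# not tied up by allocations on the old RP.
reduced_inventory_approach = {
    "old_root_rp": {"total": OLD_RP_USED, "reserved": 0},
    "new_child_rp": {"total": PHYSICAL_VGPUS - OLD_RP_USED, "reserved": 0},
}

print(summed_total(reservation_approach))        # exceeds PHYSICAL_VGPUS
print(summed_total(reduced_inventory_approach))  # matches PHYSICAL_VGPUS
```

Under these assumptions the reservation approach counts the two used GPUs once on each provider, while the reduced-inventory approach keeps the summed total equal to what the hypervisor sees.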


Some alternatives from my jetlagged brain:

a) Implement a move inventory/allocation API in placement. Given a resource class, a source RP uuid and a destination RP uuid, placement moves the inventory and allocations of that resource class from the source RP to the destination RP. Then the virt driver can call this API to move the allocation. This has an impact on the fast forward upgrade, as it needs a running virt driver to do the allocation move.

b) For this I assume that live migrating an instance having a GPU allocation on the old RP will allocate a GPU for that instance from the new RP. In the virt driver, do not report GPUs to the new RP while there are allocations for such GPUs in the old RP. Let the deployer live migrate away the instances. When the virt driver detects that there are no more GPU allocations on the old RP, it can delete the inventory from the old RP and report it to the new RP.

c) For this I assume that there is no support for live migration of an instance having a GPU. If there is a GPU allocation in the old RP, then the virt driver does not report GPU inventory to the new RP; it just creates the new nested RPs. Provide a placement-manage command to do the inventory + allocation copy from the old RP to the new RP.
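
A rough sketch of the reporting decision in alternative (b), assuming a hypothetical helper `plan_vgpu_inventories()` that the virt driver would consult on each periodic update. The function name, its arguments, the provider labels and the returned dict layout are illustrative only and do not exist in nova.

```python
def plan_vgpu_inventories(physical_total, old_rp_used):
    """Decide which provider should report VGPU inventory (alternative b).

    physical_total: VGPUs the hypervisor actually exposes.
    old_rp_used: VGPUs still allocated against the old (root) RP.
    Returns a dict mapping a provider label to the total it should report;
    a total of 0 means "report no VGPU inventory on this provider".
    """
    if physical_total < 0 or old_rp_used < 0:
        raise ValueError("counts must be non-negative")
    if old_rp_used > 0:
        # Instances still hold GPUs on the old RP: keep its inventory so the
        # inventory update is not rejected, and keep the new RP empty.
        return {"old_root_rp": physical_total, "new_child_rp": 0}
    # The old RP is drained: drop its inventory and report on the new RP.
    return {"old_root_rp": 0, "new_child_rp": physical_total}


# While two instances still use GPUs via the old RP, nothing moves.
print(plan_vgpu_inventories(physical_total=4, old_rp_used=2))
# Once the deployer has migrated them away, the inventory switches over.
print(plan_vgpu_inventories(physical_total=4, old_rp_used=0))
```

The switch is all-or-nothing on purpose: at no point do both providers report the same GPUs, so the summed totals never exceed what is physically present.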


Cheers,
gibi


> -Sylvain
>


__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)

Unsubscribe:[email protected]?subject:unsubscribe

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


