[Yahoo-eng-team] [Bug 1939020] [NEW] nova-manage placement heal_allocations does not support instances with VGPU or Cyborg device profile request

Balazs Gibizer Thu, 05 Aug 2021 07:01:16 -0700

Public bug reported:

The nova-manage placement heal_allocations tool predates nested
allocation support in nova. It gained explicit support for nested
allocations only for nested resources that come from a port
resource request [1]. If the resource request comes from the flavor
extra_specs (e.g. resources:VGPU=1), the tool assumes that the
resource can be fulfilled from the root resource provider [2][3]. This is
wrong for VGPU resources, which are provided by nested
resource providers. Also, [3] does not resolve Cyborg device profiles to
request groups, so those requests are ignored as well.
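
The mismatch described above can be illustrated with a small, self-contained
sketch. This is a simplified model, not nova's code: the provider names, the
inventories, and the helper functions (resources_from_extra_specs, naive_heal,
misplaced) are all hypothetical and exist only to show what happens when every
flavor-requested resource is assumed to live on the root provider.

```python
# Hypothetical, simplified model of the problem; not nova's actual code.

def resources_from_extra_specs(extra_specs):
    """Collect 'resources:<CLASS>=<amount>' entries from flavor extra_specs."""
    prefix = "resources:"
    return {
        key[len(prefix):]: int(value)
        for key, value in extra_specs.items()
        if key.startswith(prefix)
    }


def naive_heal(root_rp, flavor_resources, extra_specs):
    """Place every requested resource on the root provider (the flawed assumption)."""
    requested = dict(flavor_resources)
    requested.update(resources_from_extra_specs(extra_specs))
    return {root_rp: requested}


def misplaced(allocations, inventories):
    """Return (provider, resource_class) pairs the provider has no inventory for."""
    return [
        (rp, rc)
        for rp, resources in allocations.items()
        for rc in resources
        if rc not in inventories.get(rp, {})
    ]


# Assumed provider tree: the root has no VGPU inventory, the nested provider does.
inventories = {
    "compute1": {"VCPU": 16, "MEMORY_MB": 65536, "DISK_GB": 500},
    "compute1_pgpu0": {"VGPU": 4},
}
flavor_resources = {"VCPU": 2, "MEMORY_MB": 4096, "DISK_GB": 20}
extra_specs = {"resources:VGPU": "1"}

healed = naive_heal("compute1", flavor_resources, extra_specs)
print(misplaced(healed, inventories))
```

In this model the VGPU request ends up on compute1, which has no VGPU
inventory, while the nested provider that actually owns the resource is never
considered.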


Because the --force flag allows recreating instance allocations even if the
tool does not detect a missing allocation, using the tool on VGPU and Cyborg
instances could result in losing otherwise correct resource
allocations. Losing such an allocation can lead to physical
resource over-allocation later.
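
The over-allocation risk can be sketched with simple bookkeeping. The model
below is hypothetical and does not use the placement API: the instance names,
the inventory size, and the assumption that a forced heal simply drops the
nested VGPU allocation are illustrative only.

```python
# Hypothetical bookkeeping model; not the placement API.

def can_schedule(allocations, total, amount=1):
    """True if the recorded allocations leave room for 'amount' more units."""
    return sum(allocations.values()) + amount <= total


def simulate_forced_heal(total_vgpu=1):
    """Return (recorded_allocations, physical_users) after a lost allocation."""
    allocations = {"instance-a": 1}   # correct allocation before healing
    physical_users = {"instance-a"}   # instance-a really uses the device

    # Assumption: the forced heal recreates the allocation without the
    # nested VGPU resource, so the record is lost while the usage remains.
    allocations.pop("instance-a")

    # The scheduler now sees free capacity and accepts another instance.
    if can_schedule(allocations, total_vgpu):
        allocations["instance-b"] = 1
        physical_users.add("instance-b")
    return allocations, physical_users


allocations, physical_users = simulate_forced_heal()
print(f"recorded: {sum(allocations.values())}, physical users: {len(physical_users)}")
```

With one unit of inventory, the recorded usage stays within capacity while two
instances share the physical device, which is the over-allocation the report
warns about.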

[1] https://github.com/openstack/nova/blob/2ffd9738602531e93495a1feca76bbb687c3e72c/nova/cmd/manage.py#L1814
[2] https://github.com/openstack/nova/blob/2ffd9738602531e93495a1feca76bbb687c3e72c/nova/cmd/manage.py#L1700
[3] https://github.com/openstack/nova/blob/2ffd9738602531e93495a1feca76bbb687c3e72c/nova/scheduler/utils.py#L607-L614

** Affects: nova
     Importance: Wishlist
     Assignee: Sylvain Bauza (sylvain-bauza)
         Status: Confirmed


** Tags: nova-manage

** Tags added: nova-manage

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1939020

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1939020/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
