[Yahoo-eng-team] [Bug 1828937] Re: Getting allocation candidates is slow with "placement microversion < 1.29" from rocky release
** Also affects: nova/rocky
   Importance: Undecided
   Status: New

** No longer affects: nova

** Description changed:

    Description
    ===========
  - In rocky cycle, 'GET /allocation_candidates' started to be aware of
  - nested providers from microversion 1.29.
  + In rocky cycle, 'GET /allocation_candidates' started to be aware of
  + nested providers from microversion 1.29. From microversion 1.29, it
  + can join allocations from resource providers in the same tree. To
  + keep the behavior of microversions before 1.29, it filters nested
  + providers [1].
  [...]
  - Actual (Ideal) result
  + Actual result
    =============

https://bugs.launchpad.net/bugs/1828937
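For context on why the pre-1.29 path is expensive: the filtering pass
("_exclude_nested_providers()") has to sweep every allocation in every
candidate against the set of nested providers, even when that set turns
out to be empty. A rough sketch of that shape, using dict stand-ins
rather than the actual placement objects:

```python
def exclude_nested(alloc_requests, summaries):
    """Sketch of the pre-1.29 pass: drop candidates touching nested RPs."""
    # A provider is nested when its root is some other provider.
    nested = {s["uuid"] for s in summaries if s["root_uuid"] != s["uuid"]}
    kept = [
        req for req in alloc_requests
        if not any(a["provider_uuid"] in nested for a in req["allocations"])
    ]
    pruned = [s for s in summaries if s["uuid"] not in nested]
    return kept, pruned
```

Even with zero nested providers, building and checking against `nested`
still walks every candidate, which is consistent with the profile in
the report below.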
[Yahoo-eng-team] [Bug 1828937] [NEW] Getting allocation candidates is slow with "placement microversion < 1.29" from rocky release
Public bug reported:

Description
===========
In the Rocky cycle, 'GET /allocation_candidates' started to be aware of
nested providers from microversion 1.29. From microversion 1.29, it can
join allocations from resource providers in the same tree. To keep the
behavior of microversions before 1.29, it filters out nested providers [1].

The function "_exclude_nested_providers()" is skipped on microversion
>= 1.29 but is heavy on microversion < 1.29. It is executed, and is
still heavy, even when there are no nested providers in the
environment.

[1] https://github.com/openstack/placement/blob/e69366675a2ee4532ae3039104b1a5ee8d775083/placement/handlers/allocation_candidate.py#L207-L238

Steps to reproduce
==================
* Create about 6000 resource providers with some inventory and
  aggregates (using placeload [2])
* Query "GET /allocation_candidates?resources=VCPU:1,DISK_GB:10,MEMORY_MB:256&member_of=${SOME_AGGREGATE}&required=${SOME_TRAIT}"
  with microversions 1.29 and 1.25

[2] https://github.com/cdent/placeload/tree/master/placeload

Expected (Ideal) result
=======================
* No performance difference between microversion 1.25 and 1.29

Actual result
=============
* __15.995s__ for microversion 1.25
* __5.541s__ for microversion 1.29

With the profiler enabled:
* __32.219s__ for microversion 1.25
  - Note that 24.1s (75%) is consumed in "_exclude_nested_providers()"
* __7.871s__ for microversion 1.29
  - Note that this is roughly 32.219s - 24.1s

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1828937
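For concreteness, the two timed queries differ only in the microversion
header; something along these lines, where the endpoint, token,
aggregate, and trait values are placeholders:

```
$ for mv in 1.25 1.29; do
    time curl -s -o /dev/null \
      -H "X-Auth-Token: $TOKEN" \
      -H "OpenStack-API-Version: placement $mv" \
      "$PLACEMENT/allocation_candidates?resources=VCPU:1,DISK_GB:10,MEMORY_MB:256&member_of=$SOME_AGGREGATE&required=$SOME_TRAIT"
  done
```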
[Yahoo-eng-team] [Bug 1817458] [NEW] duplicate allocation candidates with granular request
Public bug reported:

Description
===========
When we request a shared resource granularly, we can get duplicate
allocation candidates for the same resource provider.

How to reproduce
================
1. Set up
   1-1. Set up two compute nodes (cn1, cn2 with VCPU resources)
   1-2. Set up one shared storage provider (ss1 with DISK_GB resources)
        marked with "MISC_SHARES_VIA_AGGREGATE"
   1-3. Put all of them in one aggregate
2. Request only the DISK_GB resource with a granular request
   -> you will get duplicate allocation requests for the DISK_GB
      resource on ss1

(NOTE): non-granular requests don't produce such duplicate entries

** Affects: nova
   Importance: Undecided
   Status: New

** Tags: placement

** Tags added: placement

https://bugs.launchpad.net/bugs/1817458
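A granular request simply moves the resources into a numbered group
(granular syntax arrived in placement microversion 1.25); roughly, with
endpoint and token as placeholders:

```
$ curl -s -H "X-Auth-Token: $TOKEN" \
    -H "OpenStack-API-Version: placement 1.25" \
    "$PLACEMENT/allocation_candidates?resources1=DISK_GB:10"
```

With the setup above, `allocation_requests` comes back with the ss1
entry listed twice, while the equivalent non-granular query
(`resources=DISK_GB:10`) returns it once.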
[Yahoo-eng-team] [Bug 1812829] [NEW] `placement-status upgrade check` fails
Public bug reported:

Description
===========
https://review.openstack.org/#/c/631604/ added a new command to
placement: `placement-status upgrade check`. However, it fails locally,
logging that it finds no database connection.

$ placement-status upgrade check
Error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/oslo_upgradecheck/upgradecheck.py", line 189, in main
    return conf.command.action_fn()
  File "/usr/local/lib/python3.6/dist-packages/oslo_upgradecheck/upgradecheck.py", line 98, in check
    result = func(self)
  File "/opt/stack/placement/placement/cmd/status.py", line 60, in _check_incomplete_consumers
    missing_consumer_count = self._count_missing_consumers(self.ctxt)
  File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 1011, in wrapper
    with self._transaction_scope(context):
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 1061, in _transaction_scope
    context=context) as resource:
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 659, in _session
    bind=self.connection, mode=self.mode)
  File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 418, in _create_session
    self._start()
  File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 510, in _start
    engine_args, maker_args)
  File "/usr/local/lib/python3.6/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 532, in _setup_for_connection
    "No sql_connection parameter is established")
oslo_db.exception.CantStartEngineError: No sql_connection parameter is established

** Affects: nova
   Importance: High
   Assignee: Tetsuro Nakamura (tetsuro0907)
   Status: In Progress

https://bugs.launchpad.net/bugs/1812829
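The traceback shows the check opening a database session before any
engine has been configured from the loaded config. A guess at the
missing wiring, with the caveat that the module path and helper name
are my assumptions about placement's internals, not the merged fix:

```python
# Hypothetical sketch, not the actual fix: configure the placement
# database engine from the parsed config before the first check runs.
from placement import db_api  # assumed helper location

def run_upgrade_checks(config):
    # Without something like this, oslo.db's enginefacade raises
    # CantStartEngineError ("No sql_connection parameter is established")
    # the first time a check touches the database.
    db_api.configure(config)
    # ... then run _check_incomplete_consumers() and friends ...
```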
[Yahoo-eng-team] [Bug 1803925] Re: There is no interface for operators to migrate *all* the existing compute resource providers to be ready for nested providers
Abandoned https://review.openstack.org/#/c/619126/ in favor of
https://review.openstack.org/#/c/624943/, which is now committed.

** Changed in: nova
   Status: New => Won't Fix

** Changed in: nova
   Status: Won't Fix => Confirmed

** Changed in: nova
   Status: Confirmed => In Progress

https://bugs.launchpad.net/bugs/1803925
[Yahoo-eng-team] [Bug 1803925] [NEW] There is no interface for operators to migrate *all* the existing compute resource providers to be ready for nested providers
Public bug reported:

When the nested resource provider feature was added in Rocky, the
root_provider_uuid column, which should have a non-None value, was
created in the resource provider DB. For existing resource providers
created before Queens, we have an online data migration:

https://review.openstack.org/#/c/377138/62/nova/objects/resource_provider.py@917

But it is only done via listing/showing resource providers. We should
have an explicit migration script, something like "placement-manage db
online_data_migrations", to make sure all the resource providers are
ready for the nested provider feature, that is, that every
root_provider_uuid column has a non-None value.

This bug can be closed when the following tasks are done:

- Provide something like "placement-manage db online_data_migrations"
  so that in Stein we are sure every root_provider_uuid column has a
  non-None value.
- Clean up placement/objects/resource_provider.py, removing many TODOs
  like "Change this to an inner join when we are sure all
  root_provider_id values are NOT NULL".

NOTE: This report was created after fixing/closing
https://bugs.launchpad.net/nova/+bug/1799892 in a temporary way without
the explicit DB migration script.

** Affects: nova
   Importance: Undecided
   Status: New

** Tags: placement

** Tags added: placement

https://bugs.launchpad.net/bugs/1803925
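For illustration, the batch work such a migration needs is essentially
"set root to self where root is missing". A minimal SQLAlchemy sketch,
assuming the resource_providers table layout (nullable
root_provider_id, integer primary key id); not the actual
placement-manage code:

```python
import sqlalchemy as sa

def backfill_root_provider_ids(engine, batch_size=50):
    """Sketch of the online migration: root_provider_id = id where NULL."""
    meta = sa.MetaData()
    rps = sa.Table("resource_providers", meta, autoload_with=engine)
    with engine.begin() as conn:
        # Grab one batch of providers still missing a root ...
        rows = conn.execute(
            sa.select(rps.c.id)
            .where(rps.c.root_provider_id.is_(None))
            .limit(batch_size)).fetchall()
        ids = [row.id for row in rows]
        if ids:
            # ... and point each one's root at itself.
            conn.execute(
                rps.update()
                .where(rps.c.id.in_(ids))
                .values(root_provider_id=rps.c.id))
    return len(ids)  # 0 means the migration is complete
```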
[Yahoo-eng-team] [Bug 1792503] [NEW] allocation candidates "?member_of=" doesn't work with nested providers
Public bug reported:

"GET /allocation_candidates" now supports the "member_of" parameter.
With nested providers present, this should work with the following
constraints:

(a) With the "member_of" qparam, aggregates on the root should span the
    whole tree. If a root provider is in the aggregate specified by the
    "member_of" qparam, the resource providers under that root can be
    in allocation candidates even when the root itself is absent.

(b) Without the "member_of" qparam, a sharing resource provider should
    be shared with the whole tree. If a sharing provider is in the same
    aggregate as one resource provider (rpA), and "member_of" hasn't
    been specified in the qparam by the user, the sharing provider can
    be in allocation candidates with any of the resource providers in
    the same tree as rpA.

(c) With the "member_of" qparam, the range of the share of sharing
    resource providers should shrink to the resource providers "under
    the specified aggregates" in a tree. Here, whether an rp is "under
    the specified aggregates" is determined by the constraints of (a).
    Namely, not only rps that belong to the aggregates directly are
    "under the aggregates", but also rps whose root is under the
    aggregates are "under the aggregates".

As of the Stein PTG (Sep. 13th, 2018), this constraint is broken in
that when placement picks up allocation candidates, the aggregates of
nested providers are assumed to be the same as those of the root
providers. This means it ignores the aggregates of the nested provider
itself. This can result in missing allocation candidates when an
aggregate that is on a nested provider, but not on the root, is
specified in the `member_of` query parameter.

This bug is well described in a test case which will be submitted
shortly.

** Affects: nova
   Importance: Undecided
   Status: New

** Tags: placement

https://bugs.launchpad.net/bugs/1792503
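To make constraint (a) concrete, a hypothetical tree (all names made
up): the aggregate sits only on the root, yet the providers below it
should still be eligible:

```
        cn  (in agg1)
       /            \
  cn_numa0        cn_numa1
      |               |
 cn_numa0_pf0    cn_numa1_pf1
 [SRIOV_NET_VF]  [SRIOV_NET_VF]

GET /allocation_candidates?resources=SRIOV_NET_VF:1&member_of=<agg1 uuid>
```

Both PFs are expected candidates even though neither is a direct member
of agg1, because agg1 on the root cn spans the whole tree.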
[Yahoo-eng-team] [Bug 1785382] [NEW] GET /resource_providers/{uuid}/allocations doesn't get all the allocations
Public bug reported:

Description
===========
GET /resource_providers/{uuid}/allocations doesn't get all the
allocations.

Reproduce
=========
1. Set up 1 resource provider with some inventories
2. A user (userA) in a project (projectX) makes 1 consumer (Consumer1)
   allocate on the rp
3. The same user (userA) in the project (projectX) makes another
   consumer (Consumer2) allocate on the rp
4. Another user (userB) in the project (projectX) makes another
   consumer (Consumer3) allocate on the rp
5. An admin uses `GET /resource_providers/{rp_uuid}/allocations` to get
   the consumers with allocations

Expected
========
The admin gets 3 consumers in the response: Consumer1, 2 and 3.

Actual
======
The admin gets 2 consumers in the response: Consumer1 and 2.

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1785382
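For reference, `GET /resource_providers/{uuid}/allocations` keys the
response body by consumer UUID, so the expected result has three keys;
schematically (UUIDs, generation, and amounts are placeholders):

```
{
  "resource_provider_generation": 7,
  "allocations": {
    "<consumer1 uuid>": {"resources": {"VCPU": 1}},
    "<consumer2 uuid>": {"resources": {"VCPU": 1}},
    "<consumer3 uuid>": {"resources": {"VCPU": 1}}
  }
}
```

The actual result only contains the two consumers belonging to userA.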
[Yahoo-eng-team] [Bug 1779818] [NEW] child's root provider is not updated.
Public bug reported:

Description
===========
You can update a resource provider's (old root RP's) parent RP from
None to a specific existing RP (original root RP). But if the resource
provider (old root RP) has a child RP, the child RP's root RP is not
updated automatically to the new root RP.

Reproduction
============
1. There is already an RP:

```
* original_root
```

{
    "resource_providers": [
        {
            "uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",
            "parent_provider_uuid": null,
            "generation": 1,
            "root_provider_uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",
            "name": "original_root"
        }
    ]
}

2. Create a new RP and its child using POST /resource_providers:

```
* original_root

* old_root_rp
   |
   +-- child_rp
```

{
    "resource_providers": [
        {
            "uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",
            "parent_provider_uuid": null,
            "generation": 1,
            "root_provider_uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",
            "name": "original_root"
        },
        {
            "uuid": "6985934e-0d44-404e-9b59-92d33f89d9ef",
            "parent_provider_uuid": null,
            "generation": 0,
            "root_provider_uuid": "6985934e-0d44-404e-9b59-92d33f89d9ef",
            "name": "old_root_rp"
        },
        {
            "uuid": "b80b63c9-1923-42ac-8659-e32479c70eaf",
            "parent_provider_uuid": "6985934e-0d44-404e-9b59-92d33f89d9ef",
            "generation": 0,
            "root_provider_uuid": "6985934e-0d44-404e-9b59-92d33f89d9ef",
            "name": "child_rp"
        }
    ]
}

3. Update the old root RP's parent to the original root using
   PUT /resource_providers/6985934e-0d44-404e-9b59-92d33f89d9ef:

```
* original_root
   |
   +-- old_root_rp
        |
        +-- child_rp
```

{
    "resource_providers": [
        {
            "uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",
            "parent_provider_uuid": null,
            "generation": 1,
            "root_provider_uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",
            "name": "original_root"
        },
        {
            "uuid": "6985934e-0d44-404e-9b59-92d33f89d9ef",
            "parent_provider_uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",  # updated :)
            "generation": 0,
            "root_provider_uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",  # updated :)
            "name": "old_root_rp"
        },
        {
            "uuid": "b80b63c9-1923-42ac-8659-e32479c70eaf",
            "parent_provider_uuid": "6985934e-0d44-404e-9b59-92d33f89d9ef",  # not updated :(
            "generation": 0,
            "root_provider_uuid": "6985934e-0d44-404e-9b59-92d33f89d9ef",  # not updated :(
            "name": "child_rp"
        }
    ]
}

The old_root_rp's root provider uuid is updated, but the child_rp's
root provider uuid remains the old root RP's uuid.

Expected
========
The child_rp's root provider uuid is also updated to original_root's
uuid:

{
    "resource_providers": [
        {
            "uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",
            "parent_provider_uuid": null,
            "generation": 1,
            "root_provider_uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",
            "name": "original_root"
        },
        {
            "uuid": "6985934e-0d44-404e-9b59-92d33f89d9ef",
            "parent_provider_uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",
            "generation": 0,
            "root_provider_uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",
            "name": "old_root_rp"
        },
        {
            "uuid": "b80b63c9-1923-42ac-8659-e32479c70eaf",
            "parent_provider_uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",  # updated as well :)
            "generation": 0,
            "root_provider_uuid": "da9bd8c5-e376-4828-b8ed-080081f8e4ed",  # updated as well :)
            "name": "child_rp"
        }
    ]
}

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1779818
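A sketch of the direction a fix could take (a hypothetical in-memory
helper, not placement's actual code): when re-parenting a provider, the
new root has to be propagated through the whole subtree, not just set
on the re-parented provider itself:

```python
def reparent(rps, rp_uuid, new_parent_uuid):
    """Re-parent rp_uuid and propagate the new root to all descendants.

    rps: dict mapping uuid -> {"parent": uuid or None, "root": uuid}
    """
    rps[rp_uuid]["parent"] = new_parent_uuid
    new_root = rps[new_parent_uuid]["root"]
    # Walk the subtree rooted at rp_uuid and rewrite every root pointer.
    stack = [rp_uuid]
    while stack:
        current = stack.pop()
        rps[current]["root"] = new_root
        stack.extend(u for u, rp in rps.items() if rp["parent"] == current)
    return rps
```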
[Yahoo-eng-team] [Bug 1769853] Re: Local disk without enough capacity appears in allocation candidates
This was novel when I reported it, but the bug has since been fixed
during the granular-candidates work. Linked the related patches
manually just now.

Unit test submitted in https://review.openstack.org/#/c/566842/
Fixed in https://review.openstack.org/#/c/517757/

** Changed in: nova
   Status: Confirmed => Fix Released

https://bugs.launchpad.net/bugs/1769853
[Yahoo-eng-team] [Bug 1771707] [NEW] allocation candidates with nested providers have inappropriate candidates when traits specified
Public bug reported:

* We set up two compute nodes with NUMA node & PF nested providers,
  and only one PF on cn1 has the HW_NIC_OFFLOAD_GENEVE trait:

      compute node (cn1) [CPU:16, MEMORY_MB:32768]
             /                \
       cn1_numa0           cn1_numa1
           |                   |
     cn1_numa0_pf0       cn1_numa1_pf1 (trait=HW_NIC_OFFLOAD_GENEVE)
    [SRIOV_NET_VF:8]     [SRIOV_NET_VF:8]

      compute node (cn2) [CPU:16, MEMORY_MB:32768]
             /                \
       cn2_numa0           cn2_numa1
           |                   |
     cn2_numa0_pf0       cn2_numa1_pf1
    [SRIOV_NET_VF:8]     [SRIOV_NET_VF:8]

* Next, request with
  - resources={CPU: 2, MEMORY_MB: 256, SRIOV_NET_VF: 1}
  - required_traits=[HW_NIC_OFFLOAD_GENEVE]

* The expected result is to get an allocation request with only
  "cn1_numa1_pf1":

    [('cn1', fields.ResourceClass.VCPU, 2),
     ('cn1', fields.ResourceClass.MEMORY_MB, 256),
     ('cn1_numa1_pf1', fields.ResourceClass.SRIOV_NET_VF, 1)],

* But actually we also get an allocation request with "cn1_numa0_pf0"
  from the same tree, which lacks the trait:

    [('cn1', fields.ResourceClass.VCPU, 2),
     ('cn1', fields.ResourceClass.MEMORY_MB, 256),
     ('cn1_numa1_pf1', fields.ResourceClass.SRIOV_NET_VF, 1)],

    [('cn1', fields.ResourceClass.VCPU, 2),
     ('cn1', fields.ResourceClass.MEMORY_MB, 256),
     ('cn1_numa0_pf0', fields.ResourceClass.SRIOV_NET_VF, 1)],

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1771707
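Expressed as an API query, the reproduction is a single request
combining the resources with the required trait (`required=` needs
placement microversion 1.17 or later; endpoint and token are
placeholders):

```
$ curl -s -H "X-Auth-Token: $TOKEN" \
    -H "OpenStack-API-Version: placement 1.17" \
    "$PLACEMENT/allocation_candidates?resources=VCPU:2,MEMORY_MB:256,SRIOV_NET_VF:1&required=HW_NIC_OFFLOAD_GENEVE"
```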
[Yahoo-eng-team] [Bug 1769854] [NEW] Local disk without enough capacity appears in allocation candidates
Public bug reported:

How to reproduce
================
In placement,

1. Set up a compute node resource provider with inventories of
   - 24 VCPU
   - 2048 MEMORY_MB
   - 1600 DISK_GB
2. Set up a shared storage resource provider with the
   "MISC_SHARES_VIA_AGGREGATE" trait and
   - 2000 DISK_GB inventory
3. Put them both in the same aggregate
4. Get allocation candidates requesting
   - 1 VCPU
   - 64 MEMORY_MB
   - 1800 DISK_GB

Expected
========
Get one allocation request, in which the DISK_GB inventory is provided
by the shared storage.

Actual
======
Get two allocation requests, in one of which the DISK_GB inventory is
provided by the compute node (which only has 1600 DISK_GB).

** Affects: nova
   Importance: Undecided
   Status: New

** Tags: placement

** Tags added: place
** Tags removed: place
** Tags added: placement

https://bugs.launchpad.net/bugs/1769854
[Yahoo-eng-team] [Bug 1769853] [NEW] Local disk without enough capacity appears in allocation candidates
Public bug reported:

How to reproduce
================
In placement,

1. Set up a compute node resource provider with inventories of
   - 24 VCPU
   - 2048 MEMORY_MB
   - 1600 DISK_GB
2. Set up a shared storage resource provider with the
   "MISC_SHARES_VIA_AGGREGATE" trait and
   - 2000 DISK_GB inventory
3. Put them both in the same aggregate
4. Get allocation candidates requesting
   - 1 VCPU
   - 64 MEMORY_MB
   - 1800 DISK_GB

Expected
========
Get one allocation request, in which the DISK_GB inventory is provided
by the shared storage.

Actual
======
Get two allocation requests, in one of which the DISK_GB inventory is
provided by the compute node (which only has 1600 DISK_GB).

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1769853
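Schematically, the buggy response contains both candidates; the second
is the invalid one, because the compute node's 1600 DISK_GB cannot
satisfy the requested 1800 (UUIDs are placeholders and the exact
envelope varies by microversion):

```
"allocation_requests": [
  {"allocations": {
     "<cn uuid>": {"resources": {"VCPU": 1, "MEMORY_MB": 64}},
     "<ss uuid>": {"resources": {"DISK_GB": 1800}}}},
  {"allocations": {
     "<cn uuid>": {"resources": {"VCPU": 1, "MEMORY_MB": 64,
                                 "DISK_GB": 1800}}}}
]
```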
[Yahoo-eng-team] [Bug 1763907] [NEW] allocation candidates member_of gets all the shared providers
Public bug reported:

When the `member_of` parameter is present, only non-shared providers in
the specified aggregate are picked, but the non-shared provider brings
in shared providers from outside the specified aggregate.

For example, with the following setup:

```
      CN1 (VCPU)             CN2 (VCPU)
     / agg3    \ agg1       / agg1    \ agg2
SS3 (DISK_GB)   SS1 (DISK_GB)   SS2 (DISK_GB)
```

When you request allocation candidates in "agg3" using the `member_of`
parameter, the expected result is one allocation request, the
combination (CN1+SS3), but the actual result is two allocation
requests, (CN1+SS3) and (CN1+SS1).

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1763907
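The reproduction boils down to one query pinned to agg3 (`member_of`
for allocation candidates arrived in microversion 1.21; the amounts,
endpoint, token, and aggregate UUID are placeholders):

```
$ curl -s -H "X-Auth-Token: $TOKEN" \
    -H "OpenStack-API-Version: placement 1.21" \
    "$PLACEMENT/allocation_candidates?resources=VCPU:1,DISK_GB:100&member_of=<agg3 uuid>"
```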
[Yahoo-eng-team] [Bug 1760276] [NEW] "provider_summaries" doesn't include resources that are not requested
Public bug reported:

Description
===========
In the ``GET /allocation_candidates`` API, ``provider_summaries``
should show all the inventories for all the resource classes in all the
resource providers. However, ``provider_summaries`` doesn't contain
resources that aren't requested.

Steps to reproduce
==================
Here's one example: CN1 has inventory of VCPU, MEMORY_MB, and DISK_GB.
I make a request for only the VCPU resource.

Expected result
===============
In the API response,
* "allocation_requests" shows the allocation of the VCPU resource of CN1.
* "provider_summaries" shows the "resources" of VCPU, MEMORY_MB, and
  DISK_GB of CN1.

Actual result
=============
In the API response,
* "allocation_requests" shows the allocation of the VCPU resource of CN1.
* "provider_summaries" shows the "resources" of only VCPU of CN1.

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1760276
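Schematically, for the VCPU-only request against CN1, the expected and
actual bodies differ only in provider_summaries (the capacity/used
numbers are made-up placeholders):

```
Expected:
  "provider_summaries": {
    "<CN1 uuid>": {"resources": {
      "VCPU":      {"capacity": 16,    "used": 0},
      "MEMORY_MB": {"capacity": 32768, "used": 0},
      "DISK_GB":   {"capacity": 1600,  "used": 0}}}}

Actual:
  "provider_summaries": {
    "<CN1 uuid>": {"resources": {
      "VCPU": {"capacity": 16, "used": 0}}}}
```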
[Yahoo-eng-team] [Bug 1750701] [NEW] NUMATopologyFilter doesn't exclude Hyper-V when cpu pinning specified.
Public bug reported:

Description
===========
As described in [1], the Hyper-V driver supports NUMA placement
policies, but it doesn't support the CPU pinning policy [2]. So the
host should be excluded by NUMATopologyFilter if the end user tries to
build a VM with the CPU pinning policy.

[1] https://docs.openstack.org/nova/latest/admin/cpu-topologies.html#customizing-instance-numa-placement-policies
[2] https://docs.openstack.org/nova/latest/admin/cpu-topologies.html#customizing-instance-cpu-pinning-policies

Environment & Steps to reproduce
================================
1. Install OpenStack with the Hyper-V driver (with NUMATopologyFilter
   set in nova.conf)
2. Try to build a VM with "cpu_policy=dedicated"

Expected & Actual result
========================
Expected: "No valid host" error
Actual:

Logs & Configs
==============

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1750701
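For step 2, the pinning policy is requested through a flavor extra
spec, for example (flavor and image names are placeholders):

```
$ openstack flavor set pinned.small --property hw:cpu_policy=dedicated
$ openstack server create --flavor pinned.small --image <image> pinned-vm
```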
[Yahoo-eng-team] [Bug 1746674] [NEW] isolated cpu thread policy doesn't work with multi numa node
Public bug reported:

Description
===========
As shown by test_multi_nodes_isolate() in
https://github.com/openstack/nova/blob/master/nova/tests/unit/virt/test_hardware.py#L3006-L3024,
the numa_fit_instance_to_host() function returns None for
cpuset_reserved for cells with id >= 1:

    def test_multi_nodes_isolate(self):
        host_topo = self._host_topology()
        inst_topo = objects.InstanceNUMATopology(
            emulator_threads_policy=(
                fields.CPUEmulatorThreadsPolicy.ISOLATE),
            cells=[objects.InstanceNUMACell(
                       id=0, cpuset=set([0]), memory=2048,
                       cpu_policy=fields.CPUAllocationPolicy.DEDICATED),
                   objects.InstanceNUMACell(
                       id=1, cpuset=set([1]), memory=2048,
                       cpu_policy=fields.CPUAllocationPolicy.DEDICATED)])

        inst_topo = hw.numa_fit_instance_to_host(host_topo, inst_topo)

        self.assertEqual({0: 0}, inst_topo.cells[0].cpu_pinning)
        self.assertEqual(set([1]), inst_topo.cells[0].cpuset_reserved)
        self.assertEqual({1: 2}, inst_topo.cells[1].cpu_pinning)
        self.assertIsNone(inst_topo.cells[1].cpuset_reserved)

However, we are testing the libvirt driver with a non-None value in
https://github.com/openstack/nova/blob/master/nova/tests/unit/virt/libvirt/test_driver.py#L3052.
This causes errors when deploying VMs with multiple NUMA nodes with
`cpu_thread_policy=isolate`.

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1746674
[Yahoo-eng-team] [Bug 1746393] [NEW] 'cpu_thread_policy' impacts on emulator threads
Public bug reported:

In bug 1744965 (https://bugs.launchpad.net/nova/+bug/1744965), it is
reported that the way emulator_threads_policy allocates the extra CPU
resource for the emulator is not optimal. This report notes that the
bug also remains when `cpu_thread_policy=isolate`.

The instance I use for testing is a 3-vCPU VM with
`cpu_thread_policy=isolate`. Before enabling emulator_threads_policy, I
reserve 6 CPUs (actually 6 threads, since we enable hyper-threading) in
the nova config:

    vcpu_pin_set=8,10,12,32,34,36

Now when we enable emulator_threads_policy, instead of adding one more
thread to this vCPU pin list in the nova config, I end up adding two
more sibling threads (on the same core):

    vcpu_pin_set=8,10,12,16,32,34,36,40

So I end up using 2 more threads, but only one of them is used for the
emulator and the other thread is wasted.

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1746393
[Yahoo-eng-team] [Bug 1737449] [NEW] [libvirt] virt_type=qemu doesn't support NUMA related features
Public bug reported:

With the change https://review.openstack.org/#/c/465160, NUMA-related
features like CPU pinning, hugepages, and realtime are now explicitly
disabled when using the libvirt driver with `virt_type=qemu`, and
compute hosts with the libvirt/qemu driver are filtered out by
NUMATopologyFilter.

This is because qemu with the TCG (tiny code generator) backend doesn't
support CPU pinning, and currently nova uses CPU pinning implicitly
when NUMA-related features are specified in the image/flavor. However,
qemu with the TCG backend potentially has the capability to support
some NUMA-related features such as NUMA topology and hugepages. So we
should change the code to enable the libvirt/qemu driver to support
these NUMA features without CPU pinning.

Steps to reproduce
==================
- Deploy nova with virt_type=qemu and
  enabled_filters=NUMATopologyFilter in nova.conf.
- Set hw:numa_nodes=1 or hw:mem_page_size=large in the nova flavor.
- Boot an instance with this modified flavor.

Expected result
===============
The VM starts with a virtual NUMA topology of 1 NUMA node
(hw:numa_nodes=1) or with hugepage-backed memory
(hw:mem_page_size=large).

Actual result
=============
Nova reports a "no valid host" error because of NUMATopologyFilter.

Note: Remember to update at least these documents when the features are
enabled:
https://docs.openstack.org/nova/pike/admin/huge-pages.html
https://docs.openstack.org/nova/pike/admin/cpu-topologies.html

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1737449
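The flavor settings from the reproduction steps, as openstack CLI
commands (the flavor name is a placeholder):

```
$ openstack flavor set qemu.numa --property hw:numa_nodes=1
$ openstack flavor set qemu.numa --property hw:mem_page_size=large
```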
[Yahoo-eng-team] [Bug 1737450] [NEW] [libvirt] virt_type=xen doesn't support NUMA related features
Public bug reported:

With the change https://review.openstack.org/#/c/465160, NUMA-related
features like CPU pinning, hugepages, and realtime are now explicitly
disabled when using the libvirt driver with `virt_type=xen`, and
compute hosts with the libvirt/xen driver are filtered out by
NUMATopologyFilter.

We should test and make clear which of the NUMA-related features can be
supported by the libvirt/xen driver, and enable them along with
documentation.

Note: Remember to update at least these documents when the features are
enabled:
https://docs.openstack.org/nova/pike/admin/huge-pages.html
https://docs.openstack.org/nova/pike/admin/cpu-topologies.html

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1737450