On Fri, Nov 2, 2018 at 9:32 PM Matt Riedemann <mriede...@gmail.com> wrote:
>
> On 11/2/2018 2:22 PM, Eric Fried wrote:
> > Based on a (long) discussion yesterday [1] I have put up a patch [2]
> > whereby you can set [compute]resource_provider_association_refresh to
> > zero and the resource tracker will never* refresh the report client's
> > provider cache. Philosophically, we're removing the "healing" aspect
> > of the resource tracker's periodic and trusting that placement won't
> > diverge from whatever's in our cache. (If it does, it's because the
> > op hit the CLI, in which case they should SIGHUP - see below.)
> >
> > *except:
> > - When we initially create the compute node record and bootstrap its
> >   resource provider.
> > - When the virt driver's update_provider_tree makes changes,
> >   update_from_provider_tree reflects them in the cache as well as
> >   pushing them back to placement.
> > - If update_from_provider_tree fails, the cache is cleared and gets
> >   rebuilt on the next periodic.
> > - If you send SIGHUP to the compute process, the cache is cleared.
> >
> > This should dramatically reduce the number of calls to placement from
> > the compute service. Like, to nearly zero, unless something is
> > actually changing.
> >
> > Can I get some initial feedback as to whether this is worth polishing
> > up into something real? (It will probably need a bp/spec if so.)
> >
> > [1]
> > http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2018-11-01.log.html#t2018-11-01T17:32:03
> > [2] https://review.openstack.org/#/c/614886/
> >
> > ==========
> > Background
> > ==========
> > In the Queens release, our friends at CERN noticed a serious spike in
> > the number of requests to placement from compute nodes, even in a
> > stable-state cloud. Given that we were in the process of adding a ton
> > of infrastructure to support sharing and nested providers, this was
> > not unexpected. Roughly, what was previously:
> >
> > @periodic_task:
> >     GET /resource_providers/$compute_uuid
> >     GET /resource_providers/$compute_uuid/inventories
> >
> > became more like:
> >
> > @periodic_task:
> >     # In Queens/Rocky, this would still just return the compute RP
> >     GET /resource_providers?in_tree=$compute_uuid
> >     # In Queens/Rocky, this would return nothing
> >     GET /resource_providers?member_of=...&required=MISC_SHARES...
> >     for each provider returned above:  # i.e. just one in Q/R
> >         GET /resource_providers/$compute_uuid/inventories
> >         GET /resource_providers/$compute_uuid/traits
> >         GET /resource_providers/$compute_uuid/aggregates
> >
> > In a cloud the size of CERN's, the load wasn't acceptable. But at the
> > time, CERN worked around the problem by disabling refreshing
> > entirely. (The fact that this seems to have worked for them is an
> > encouraging sign for the proposed code change.)
> >
> > We're not actually making use of most of that information, but it
> > sets the stage for things that we're working on in Stein and beyond,
> > like multiple VGPU types, bandwidth resource providers, accelerators,
> > NUMA, etc., so removing/reducing the amount of information we look at
> > isn't really an option strategically.
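> >
> > Conceptually, the knob boils down to something like the following
> > (an illustrative sketch with made-up names, not the actual report
> > client code):
> >
> >     import time
> >
> >     def associations_stale(last_refreshed, refresh_interval):
> >         """Decide whether cached aggregates/traits/sharing providers
> >         should be re-fetched from placement for a provider.
> >         """
> >         if refresh_interval == 0:
> >             # Never consider the cache stale; trust it until SIGHUP
> >             # or an update_from_provider_tree failure clears it.
> >             return False
> >         return (time.time() - last_refreshed) > refresh_interval
> >
> > The interval would come from
> > [compute]resource_provider_association_refresh, so the existing
> > default (refresh roughly every 300 seconds) stays as-is unless you
> > opt in to 0.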
>
> A few random points from the long discussion that should probably be
> re-posed here for wider thought:
>
> * There was probably a lot of discussion about why we needed to do
> this caching and stuff in the compute in the first place. What has
> changed that we no longer need to aggressively refresh the cache on
> every periodic? I thought initially it came up because people really
> wanted the compute to be fully self-healing to any external changes,
> including hot plugging resources like disk on the host to
> automatically reflect those changes in inventory. Similarly, external
> user/service interactions with the placement API would then be
> automatically picked up by the next periodic run - is that no longer
> a desire, and/or how was the decision made previously that simply
> requiring a SIGHUP in that case wasn't sufficient/desirable?
>
> * I believe I made the point yesterday that we should probably not
> refresh by default, and let operators opt in to that behavior if they
> really need it, i.e. they are frequently making changes to the
> environment, potentially by some external service (I could see
> vCenter doing this to reflect changes from vCenter back into
> nova/placement), but I don't think that should be the assumed
> behavior for most deployments, and our defaults should reflect the
> "normal" use case.
>
> * I think I've noted a few times now that we don't actually use the
> provider aggregates information (yet) in the compute service. Nova
> host aggregate membership has been mirrored to placement since Rocky
> [1], but that happens in the API, not the compute. The only thing I
> can think of that relied on resource provider aggregate information
> in the compute is the shared storage providers concept, but that's
> not supported (yet) [2]. So do we need to keep retrieving aggregate
> information when nothing in compute uses it yet?
>
> * Similarly, why do we need to get traits on each periodic? The only
> in-tree virt driver I'm aware of that *reports* traits is the libvirt
> driver for CPU features [3]. Otherwise I think the idea behind
> getting the latest traits is so the virt driver doesn't overwrite any
> traits set externally on the compute node root resource provider. I
> think that still stands and is probably OK, even though we have
> generations now which should keep us from overwriting if we don't
> have the latest traits (see the sketch after the last point), but I
> wanted to bring it up since it's related to the "why do we need
> provider aggregates in the compute?" question.
>
> * Regardless of what we do, I think we should probably *at least*
> make that refresh associations config allow 0 to disable it so CERN
> (and others) can avoid the need to continually forward-port code to
> disable it.
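>
> To make the generation point concrete, here is a rough, hypothetical
> illustration (placeholder endpoint, uuid and token; not nova code) of
> what happens if a compute PUTs traits using a stale provider
> generation: placement answers 409 rather than letting the stale
> writer clobber newer data, so the worst case is a failed update and a
> cache refresh, not a silent overwrite.
>
>     import requests
>
>     PLACEMENT = 'http://placement.example.com'  # placeholder endpoint
>     RP_UUID = '11111111-2222-3333-4444-555555555555'  # placeholder provider
>     CACHED_GENERATION = 5  # generation we last cached for this provider
>
>     resp = requests.put(
>         '%s/resource_providers/%s/traits' % (PLACEMENT, RP_UUID),
>         json={'resource_provider_generation': CACHED_GENERATION,
>               'traits': ['HW_CPU_X86_AVX2', 'CUSTOM_FOO']},
>         headers={'OpenStack-API-Version': 'placement 1.6',
>                  'X-Auth-Token': 'placeholder-token'})
>
>     if resp.status_code == 409:
>         # Generation conflict: the provider changed since we cached it,
>         # so refresh (or clear) the cache before retrying instead of
>         # overwriting whatever was set externally.
>         pass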
>
> [1]
> https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/placement-mirror-host-aggregates.html
> [2] https://bugs.launchpad.net/nova/+bug/1784020
> [3]
> https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/report-cpu-features-as-traits.html
>
> --
>
> Thanks,
>
> Matt

--
Mohammed Naser — vexxhost
-----------------------------------------------------
D. 514-316-8872
D. 800-910-1726 ext. 200
E. mna...@vexxhost.com
W. http://vexxhost.com

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev