On Fri, Oct 18, 2019 at 11:38:37AM -0500, Michael Roth wrote:
> Quoting Dr. David Alan Gilbert (2019-10-18 04:43:52)
> > * Laurent Vivier (lviv...@redhat.com) wrote:
> > > On 18/10/2019 10:16, Dr. David Alan Gilbert wrote:
> > > > * Scott Cheloha (chel...@linux.vnet.ibm.com) wrote:
> > > >> savevm_state's SaveStateEntry TAILQ is a priority queue.  Priority
> > > >> sorting is maintained by searching from head to tail for a suitable
> > > >> insertion spot.  Insertion is thus an O(n) operation.
> > > >>
> > > >> If we instead keep track of the head of each priority's subqueue
> > > >> within that larger queue we can reduce this operation to O(1) time.
> > > >>
> > > >> savevm_state_handler_remove() becomes slightly more complex to
> > > >> accommodate these gains: we need to replace the head of a priority's
> > > >> subqueue when removing it.
> > > >>
> > > >> With O(1) insertion, booting VMs with many SaveStateEntry objects is
> > > >> more plausible.  For example, a ppc64 VM with maxmem=8T has 40000 such
> > > >> objects to insert.
> > > >
> > > > Separate from reviewing this patch, I'd like to understand why you've
> > > > got 40000 objects.  This feels very very wrong and is likely to cause
> > > > problems to random other bits of qemu as well.
> > >
> > > I think the 40000 objects are the "dr-connectors" that are used to plug
> > > peripherals (memory, pci card, cpus, ...).
> >
> > Yes, Scott confirmed that in the reply to the previous version.
> > IMHO nothing in qemu is designed to deal with that many devices/objects
> > - I'm sure that something other than the migration code is going to get
> > upset.
>
> The device/object management aspect seems to handle things *mostly* okay, at
> least ever since QOM child properties started being tracked by a hash table
> instead of a linked list. It's worth noting that that change (b604a854) was
> done to better handle IRQ pins for ARM guests with lots of CPUs. I think it is
> inevitable that certain machine types/configurations will call for large
> numbers of objects and I think it is fair to improve things to allow for this
> sort of scalability.
>
> But I agree it shouldn't be abused, and you're right that there are some
> problem areas that arise. Trying to outline them:
>
>  a) introspection commands like 'info qom-tree' become pretty unwieldy,
>     and with large enough numbers of objects might even break things (QMP
>     response size limits maybe?)
>  b) various related lists like reset handlers, vmstate/savevm handlers might
>     grow quite large
>
> I think we could work around a) with maybe flagging certain
> "internally-only" objects as 'hidden'. Introspection routines could then
> filter these out, and routines like qom-set/qom-get could report
> something similar to EACCES so they are never used/useful to management
> tools.
>
> In cases like b) we can optimize things where it makes sense like with
> Scott's patch here. In most cases these lists need to be walked one way
> or another, whether it's done internally by the object or through common
> interfaces provided by QEMU. It's really just the O(n^2) type handling
> where relying on common interfaces becomes drastically less efficient,
> but I think we should avoid implementing things in that way anyway, or
> improve them as needed.
> >
> > Is perhaps the structure wrong somewhere - should there be a single DRC
> > device that knows about all DRCs?
>
> That's an interesting proposition, I think it's worth exploring further,
> but from a high level:
>
>  - each SpaprDrc has migration state, and some sub-classes of SpaprDrc (e.g.
>    SpaprDrcPhysical) have additional migration state. These are sent
>    as-needed as separate VMState entries in the migration stream.
>    Moving to a single DRC means we're either sending them as a flat
>    array or a sparse list, which would put just as much load on the
>    migration code (at least, with Scott's changes in place). It would
>    also be difficult to do all this in a way which maintains migration
>    compatibility with older machine types.
>  - other aspects of modeling these as QOM objects, such as look-ups,
>    reset-handling, and memory allocations, wouldn't be dramatically
>    improved upon by handling it all internally within the object
>
> AFAICT the biggest issue with modeling the DRCs as individual objects
> is actually how we deal with introspection, and we should try to
> improve that. What do you think of the alternative suggestion above of
> marking certain objects as 'hidden' from various introspection
> interfaces?

So, that's not something I'd considered particularly in this context,
but it has bothered me in other contexts.  The fact that all the QOM
interfaces are freely user-inspectable, but are also used for a bunch
of interaction between qemu components means that we (arguably)
routinely make a bunch of stuff into user-visible API which we
probably don't really need or want to.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
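
For readers following the thread: the commit message quoted at the top
describes keeping a pointer to the head of each priority's subqueue so
that insertion no longer scans the whole list.  The following is only a
minimal sketch of that idea, not the patch itself.  It uses the BSD
<sys/queue.h> TAILQ macros rather than QEMU's QTAILQ, and every name in
it (struct entry, struct pri_queue, PRI_MAX, pq_insert, pq_remove) is
hypothetical.  It assumes higher priorities sort earlier and that
entries of equal priority keep insertion order.  Insertion cost is
bounded by the number of priority levels, not by the number of entries.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define PRI_MAX 3  /* hypothetical: highest valid priority value */

struct entry {
    int priority;               /* 0..PRI_MAX, higher sorts earlier */
    int id;
    TAILQ_ENTRY(entry) link;
};

TAILQ_HEAD(entry_list, entry);

struct pri_queue {
    struct entry_list entries;           /* one list, sorted by priority */
    struct entry *pri_head[PRI_MAX + 1]; /* first entry of each priority */
};

static void pq_init(struct pri_queue *q)
{
    TAILQ_INIT(&q->entries);
    for (int i = 0; i <= PRI_MAX; i++) {
        q->pri_head[i] = NULL;
    }
}

static void pq_insert(struct pri_queue *q, struct entry *e)
{
    struct entry *next_lower = NULL;

    assert(e->priority >= 0 && e->priority <= PRI_MAX);

    /* Find the head of the nearest non-empty lower-priority subqueue. */
    for (int i = e->priority - 1; i >= 0; i--) {
        if (q->pri_head[i] != NULL) {
            next_lower = q->pri_head[i];
            break;
        }
    }

    /* The new entry goes at the end of its own priority's subqueue. */
    if (next_lower != NULL) {
        TAILQ_INSERT_BEFORE(next_lower, e, link);
    } else {
        TAILQ_INSERT_TAIL(&q->entries, e, link);
    }

    if (q->pri_head[e->priority] == NULL) {
        q->pri_head[e->priority] = e;
    }
}

static void pq_remove(struct pri_queue *q, struct entry *e)
{
    /* If e heads its subqueue, promote its successor or clear the slot. */
    if (q->pri_head[e->priority] == e) {
        struct entry *next = TAILQ_NEXT(e, link);

        if (next != NULL && next->priority == e->priority) {
            q->pri_head[e->priority] = next;
        } else {
            q->pri_head[e->priority] = NULL;
        }
    }
    TAILQ_REMOVE(&q->entries, e, link);
}
```

The extra branch in pq_remove() corresponds to the "slightly more
complex" removal mentioned in the commit message: the cached head has
to be replaced when the entry being removed is that head.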