On Wed, 24 Feb 2016 21:51:19 +1100 David Gibson <da...@gibson.dropbear.id.au> wrote:
> On Wed, Feb 24, 2016 at 09:42:10AM +0100, Markus Armbruster wrote:
> > David Gibson <da...@gibson.dropbear.id.au> writes:
> >
> > > On Mon, Feb 22, 2016 at 10:05:54AM +0100, Markus Armbruster wrote:
> > >> David Gibson <da...@gibson.dropbear.id.au> writes:
> > >>
> > >> > On Fri, Feb 19, 2016 at 10:51:11AM +0100, Markus Armbruster wrote:
> > >> >> David Gibson <da...@gibson.dropbear.id.au> writes:
> > >> >>
> > >> >> > On Thu, Feb 18, 2016 at 11:37:39AM +0100, Igor Mammedov wrote:
> > >> >> >> On Thu, 18 Feb 2016 14:39:52 +1100
> > >> >> >> David Gibson <da...@gibson.dropbear.id.au> wrote:
> > >> >> >>
> > >> >> >> > On Tue, Feb 16, 2016 at 11:36:55AM +0100, Igor Mammedov wrote:
> > >> >> >> > > On Mon, 15 Feb 2016 20:43:41 +0100
> > >> >> >> > > Markus Armbruster <arm...@redhat.com> wrote:
> > >> >> >> > >
> > >> >> >> > > > Igor Mammedov <imamm...@redhat.com> writes:
> > >> >> >> > > >
> > >> >> >> > > > > it will allow mgmt to query present and possible to hotplug CPUs
> > >> >> >> > > > > it is required from a target platform that wish to support
> > >> >> >> > > > > command to set board specific MachineClass.possible_cpus() hook,
> > >> >> >> > > > > which will return a list of possible CPUs with options
> > >> >> >> > > > > that would be needed for hotplugging possible CPUs.
> > >> >> >> > > > >
> > >> >> >> > > > > For RFC there are:
> > >> >> >> > > > >    'arch_id': 'int' - mandatory unique CPU number,
> > >> >> >> > > > >                       for x86 it's APIC ID for ARM it's MPIDR
> > >> >> >> > > > >    'type': 'str' - CPU object type for usage with device_add
> > >> >> >> > > > >
> > >> >> >> > > > > and a set of optional fields that would allow mgmt tools
> > >> >> >> > > > > to know at what granularity and where a new CPU could be
> > >> >> >> > > > > hotplugged;
> > >> >> >> > > > > [node],[socket],[core],[thread]
> > >> >> >> > > > > Hopefully that should cover needs for CPU hotplug purposes for
> > >> >> >> > > > > major targets and we can extend structure in future adding
> > >> >> >> > > > > more fields if it will be needed.
> > >> >> >> > > > >
> > >> >> >> > > > > also for present CPUs there is a 'cpu_link' field which
> > >> >> >> > > > > would allow mgmt inspect whatever object/abstraction
> > >> >> >> > > > > the target platform considers as CPU object.
> > >> >> >> > > > >
> > >> >> >> > > > > For RFC purposes implements only for x86 target so far.
> > >> >> >> > > > >
> > >> >> >> > > >
> > >> >> >> > > > Adding ad hoc queries as we go won't scale.  Could this be
> > >> >> >> > > > solved by a generic introspection interface?
> > >> >> >> > > Do you mean generic QOM introspection?
> > >> >> >> > >
> > >> >> >> > > Using QOM we could have '/cpus' container and create QOM links
> > >> >> >> > > for existing (populated links) and possible (empty links) CPUs.
> > >> >> >> > > However in that case link's name will need have a special format
> > >> >> >> > > that will convey an information necessary for mgmt to hotplug
> > >> >> >> > > a CPU object, at least:
> > >> >> >> > >   - where: [node],[socket],[core],[thread] options
> > >> >> >> > >   - optionally what CPU object to use with device_add command
> > >> >> >> > >
> > >> >> >> >
> > >> >> >> > Hmm.. is it not enough to follow the link and get the topology
> > >> >> >> > information by examining the target?
> > >> >> >> One can't follow a link if it's an empty one, hence
> > >> >> >> CPU placement information should be provided somehow,
> > >> >> >> either:
> > >> >> >
> > >> >> > Ah, right, so the issue is determining the socket/core/thread
> > >> >> > addresses that cpus which aren't yet present will have.
> > >> >> >
> > >> >> >>  * by precreating cpu-package objects with properties that
> > >> >> >>    would describe it /could be inspected via QOM/
> > >> >> >
> > >> >> > So, we could do this, but I think the natural way would be to have
> > >> >> > the information for each potential thread in the package.  Just
> > >> >> > putting say "core number" in the package itself assumes more than
> > >> >> > I'd like about how packages sit in the hierarchy.  Plus, it means
> > >> >> > that management has a bunch of cases to deal with: package has all
> > >> >> > the information, package has just a core id, package has just a
> > >> >> > socket id, and so forth.
> > >> >> >
> > >> >> > It is a bit clunky that when the package is plugged, this
> > >> >> > information will have to sit parallel to the array of actual
> > >> >> > thread links.
> > >> >> >
> > >> >> > Markus or Andreas is there a natural way to present a list of (node,
> > >> >> > socket, core, thread) tuples in the package object?  Preferably
> > >> >> > without having to create a whole bunch of "potential thread" objects
> > >> >> > just for the purpose.
> > >> >>
> > >> >> I'm just a dabbler when it comes to QOM, but I can try.
> > >> >>
> > >> >> I view a concrete cpu-package device (subtype of the abstract
> > >> >> cpu-package device) as a composite device containing stuff like actual
> > >> >> cores.
> > >> >
> > >> > So.. the idea is it's a bit more abstract than that.  My intention is
> > >> > that the package lists - in some manner - each of the threads
> > >> > (i.e. vcpus) it contains / can contain.  Depending on the platform it
> > >> > *might* also have internal structure such as cores / sockets, but it
> > >> > doesn't have to.  Either way, the contained threads will be listed in
> > >> > a common way, as a flat array.
> > >> >
> > >> >> To create a composite device, you start with the outer shell, then
> > >> >> plug in components one by one.  Components can be nested arbitrarily
> > >> >> deep.
> > >> >>
> > >> >> Perhaps you can define the concrete cpu-package shell in a way that
> > >> >> lets you query what you need to know from a mere shell (no components
> > >> >> plugged).
> > >> >
> > >> > Right.. that's exactly what I'm suggesting, but I don't know enough
> > >> > about the presentation of basic data in QOM to know quite how to
> > >> > accomplish it.
> > >> >
> > >> >> >> or
> > >> >> >>  * via QMP/HMP command that would provide the same information
> > >> >> >>    only without need to precreate anything. The only difference
> > >> >> >>    is that it allows to use -device/device_add for new CPUs.
> > >> >> >
> > >> >> > I'd be ok with that option as well.  I'd be thinking it would be
> > >> >> > implemented via a class method on the package object which returns
> > >> >> > the addresses that its contained threads will have, whether or not
> > >> >> > they're present right now.  Does that make sense?
> > >> >>
> > >> >> If you model CPU packages as composite cpu-package devices, then you
> > >> >> should be able to plug and unplug these with device_add, unless
> > >> >> plugging them requires complex wiring that can't be done in qdev /
> > >> >> device_add, yet.
> > >> >
> > >> > There's a whole bunch of issues raised by allowing device_add of
> > >> > cpus.  Although they're certainly interesting and probably useful, I'd
> > >> > really like to punt on them for the time being, so we can get some
> > >> > sort of cpu hotplug working on Power (and s390 and others).
> > >>
> > >> If you make it a device, you can still set
> > >> cannot_instantiate_with_device_add_yet to disable -device / device_add
> > >> for now, and unset it later, when you're ready for it.
> > >
> > > Yes, that was the plan.
> > >
> > >> > The idea of the cpu packages is that - at least for now - the user
> > >> > can't control their contents apart from the single "present" bit.
> > >> > They already know what they can contain.
> > >>
> > >> Composite devices commonly do.  They're not general containers.
> > >>
> > >> The "present" bit sounds like you propose to "pre-plug" all the possible
> > >> CPU packages, and thus reduce CPU hot plug/unplug to enabling/disabling
> > >> pre-plugged CPU packages.
> > >
> > > Yes.
> >
> > I'm concerned this might suffer combinatorial explosion.
> >
> > qemu-system-x86_64 --cpu help shows more than two dozen CPUs.  They can
> > be configured in numerous arrangements of sockets, cores, threads.  Many
> > of these wouldn't be physically possible with older CPUs.  Guest
> > software might work even with physically impossible configurations, but
> > arranging virtual models of physical hardware in physically impossible
> > configurations invites trouble, and should best be avoided.
> >
> > I'm afraid I'm still in the guess-what-you-mean stage because I lack
> > concrete examples to go with the abstract description.  Can you
> > enumerate the pre-plugged CPU packages for a board of your choice to
> > give us a better idea of how your proposal would look like in practice?
> > Then describe briefly what a management application would need to know
> > about them, and what it would do with the knowledge?
> >
> > Perhaps a PC board would be the most useful, because PCs are probably
> > second to none in random complexity :)
>
> Well, it may be moot at this point, since Andreas has objected
> strongly to Bharata's draft for reasons I have yet to really figure
> out.
>
> But I think the answer below will clarify this.
>
> > >> What if a board can take different kinds of CPU packages?  Do we
> > >> pre-plug all combinations?  Then some combinations are non-sensical.
> > >> How would we reject them?
> > >
> > > I'm not trying to solve all cases with the present bit handling - just
> > > the currently common case of a machine with fixed maximum number of
> > > slots which are expected to contain identical processor units.
> > >
> > >> For instance, PC machines support a wide range of CPUs in various
> > >> arrangements, but you generally need to use a single kind of CPU, and
> > >> the kind of CPU restricts the possible arrangements.  How would you
> > >> model that?
> > >
> > > The idea is that the available slots are determined by the machine,
> > > possibly using machine or global options.  So for PC, -cpu and -smp
> > > would determine the number of slots and what can go into them.
> >
> > Do these CPU packages come with "soldered-in" CPUs?  Or do they provide
> > slots where CPUs can be plugged in?  From what I've read, I guess it's
> > the latter, together with a "thou shalt not plug in different CPUs"
> > commandment.  Correct?
>
> No, they do in fact come with "soldered in" CPUs.  Once the package is
> constructed it is either absent, or supplies exactly one set of cpu
> threads (and possibly other bits and pieces), there is no further
> configuration.
>
> So:
>       qemu-system-x86_64 -machine pc -cpu Haswell -smp 2,maxcpus=8
>
> Would give you 8 cpu packages.  2 would initially be present, the rest
> would be absent.  If you toggle an absent one to present, another
> single-thread Haswell would appear in the guest.
>
>       qemu-system-x86_64 -machine pc -cpu Haswell \
>               -smp 2,threads=2,cores=2,sockets=2,maxcpus=8
>

ok, now let's imagine that mgmt set 'present'=on for pkg 7 and that
needs to be migrated; how would the target QEMU be able to recreate
the state of the source QEMU instance?

> Would be basically the same (because thread granularity hotplug is
> allowed on x86).  2 present (pkg0, pkg1) and 6 (pkg2..pkg7) absent cpu
> packages.  If you toggled on pkg2, socket 0, core 1, thread 0 would
> appear.  If you toggled on pkg 7, socket 1, core 1, thread 1 would
> appear.
>
> In contrast, pseries only allows per-core hotplug, so:
>
>       qemu-system-ppc64 -machine pseries -cpu POWER8 \
>               -smp 16,threads=8,cores=2,sockets=1,maxcpus=16
>
> Would give you 2 cpu packages, 1 present, 1 absent.  Toggling on the
> second package would make a second POWER8 with 8 threads appear.
>
> Clearer?
>
> > If yes, then the CPU the board comes with would determine what you can
> > plug into the slots.
> >
> > Conversely, the CPU the board comes with helps determine the CPU
> > packages.
>
> Either, potentially.  The machine type code would determine what
> packages are constructed, and may use machine specific or global
> options to determine this.  Or it can (as now) declare that that's not
> a possible set of CPUs for this board.
>
> > >> > There are a bunch of potential use cases this doesn't address, but I
> > >> > think it *does* address a useful subset of currently interesting
> > >> > cases, without precluding more flexible extensions in future.
> > >> >
> > >> >> If that's the case, a general solution for "device needs complex
> > >> >> wiring" would be more useful than a one-off for CPU packages.
> > >> >>
> > >> >> [...]
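
For reference, the package-to-topology mapping in the quoted -smp
examples can be sketched roughly as below. This is only an illustration
and not QEMU code: the names Slot, package_slots and package_count are
hypothetical, and it assumes packages are numbered linearly with the
thread index varying fastest, then core, then socket, which is the
ordering the pkg2 and pkg7 examples imply.

```python
from typing import List, NamedTuple


class Slot(NamedTuple):
    """One vcpu position; hypothetical type used only for this sketch."""
    socket: int
    core: int
    thread: int


def package_count(maxcpus: int, threads_per_pkg: int) -> int:
    """Number of packages the machine would pre-create (assumed rule)."""
    return maxcpus // threads_per_pkg


def package_slots(pkg: int, cores: int, threads: int,
                  threads_per_pkg: int = 1) -> List[Slot]:
    """Slots supplied by package `pkg`, whether present or absent.

    threads_per_pkg is the hotplug granularity: 1 for the per-thread
    x86 example, `threads` for the per-core pseries example.
    """
    first = pkg * threads_per_pkg
    slots = []
    for cpu in range(first, first + threads_per_pkg):
        slots.append(Slot(socket=cpu // (cores * threads),
                          core=(cpu // threads) % cores,
                          thread=cpu % threads))
    return slots


# x86 example: -smp 2,threads=2,cores=2,sockets=2,maxcpus=8
x86_pkg2 = package_slots(2, cores=2, threads=2)
x86_pkg7 = package_slots(7, cores=2, threads=2)

# pseries example: threads=8,cores=2,sockets=1,maxcpus=16, per-core hotplug
pseries_pkgs = package_count(16, threads_per_pkg=8)
pseries_pkg1 = package_slots(1, cores=2, threads=8, threads_per_pkg=8)
```

Under these assumptions the slot set of every package follows from the
-smp arguments alone, so only the set of toggled-on package indexes
would differ between two instances started with the same options.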