On Tue, Feb 23, 2016 at 10:46:45AM +0100, Igor Mammedov wrote: > On Mon, 22 Feb 2016 13:54:32 +1100 > David Gibson <da...@gibson.dropbear.id.au> wrote: > > > On Fri, Feb 19, 2016 at 04:49:11PM +0100, Igor Mammedov wrote: > > > On Fri, 19 Feb 2016 15:38:48 +1100 > > > David Gibson <da...@gibson.dropbear.id.au> wrote: > > > > > > CCing thread a couple of libvirt guys. > > > > > > > On Thu, Feb 18, 2016 at 11:37:39AM +0100, Igor Mammedov wrote: > > > > > On Thu, 18 Feb 2016 14:39:52 +1100 > > > > > David Gibson <da...@gibson.dropbear.id.au> wrote: > > > > > > > > > > > On Tue, Feb 16, 2016 at 11:36:55AM +0100, Igor Mammedov wrote: > > > > > > > On Mon, 15 Feb 2016 20:43:41 +0100 > > > > > > > Markus Armbruster <arm...@redhat.com> wrote: > > > > > > > > > > > > > > > Igor Mammedov <imamm...@redhat.com> writes: > > > > > > > > > > > > > > > > > it will allow mgmt to query present and possible to hotplug > > > > > > > > > CPUs > > > > > > > > > it is required from a target platform that wish to support > > > > > > > > > command to set board specific MachineClass.possible_cpus() > > > > > > > > > hook, > > > > > > > > > which will return a list of possible CPUs with options > > > > > > > > > that would be needed for hotplugging possible CPUs. > > > > > > > > > > > > > > > > > > For RFC there are: > > > > > > > > > 'arch_id': 'int' - mandatory unique CPU number, > > > > > > > > > for x86 it's APIC ID for ARM it's MPIDR > > > > > > > > > 'type': 'str' - CPU object type for usage with device_add > > > > > > > > > > > > > > > > > > and a set of optional fields that would allows mgmt tools > > > > > > > > > to know at what granularity and where a new CPU could be > > > > > > > > > hotplugged; > > > > > > > > > [node],[socket],[core],[thread] > > > > > > > > > Hopefully that should cover needs for CPU hotplug porposes for > > > > > > > > > magor targets and we can extend structure in future adding > > > > > > > > > more fields if it will be needed. > > > > > > > > > > > > > > > > > > also for present CPUs there is a 'cpu_link' field which > > > > > > > > > would allow mgmt inspect whatever object/abstraction > > > > > > > > > the target platform considers as CPU object. > > > > > > > > > > > > > > > > > > For RFC purposes implements only for x86 target so far. > > > > > > > > > > > > > > > > > > > > > > > > > Adding ad hoc queries as we go won't scale. Could this be > > > > > > > > solved by a > > > > > > > > generic introspection interface? > > > > > > > Do you mean generic QOM introspection? > > > > > > > > > > > > > > Using QOM we could have '/cpus' container and create QOM links > > > > > > > for exiting (populated links) and possible (empty links) CPUs. > > > > > > > However in that case link's name will need have a special format > > > > > > > that will convey an information necessary for mgmt to hotplug > > > > > > > a CPU object, at least: > > > > > > > - where: [node],[socket],[core],[thread] options > > > > > > > - optionally what CPU object to use with device_add command > > > > > > > > > > > > > > > > > > > Hmm.. is it not enough to follow the link and get the topology > > > > > > information by examining the target? > > > > > One can't follow a link if it's an empty one, hence > > > > > CPU placement information should be provided somehow, > > > > > either: > > > > > > > > Ah, right, so the issue is determining the socket/core/thread > > > > addresses that cpus which aren't yet present will have. > > > > > > > > > * by precreating cpu-package objects with properties that > > > > > would describe it /could be inspected via OQM/ > > > > > > > > So, we could do this, but I think the natural way would be to have the > > > > information for each potential thread in the package. Just putting > > > > say "core number" in the package itself assumes more than I'd like > > > > about how packages sit in the heirarchy. Plus, it means that > > > > management has a bunch of cases to deal with: package has all the > > > > information, package has just a core id, package has just a socket id, > > > > and so forth. > > > > > > > > It is a but clunky that when the package is plugged, this information > > > > will have to sit parallel to the array of actual thread links. > > > > > > > > Markus or Andreas is there a natural way to present a list of (node, > > > > socket, core, thread) tuples in the package object? Preferably > > > > without having to create a whole bunch of "potential thread" objects > > > > just for the purpose. > > > I'm sorry but I couldn't parse above 2 paragraphs. The way I see > > > whatever placement info QEMU will provide to mgmt, mgmt will have > > > to deal with it in one way or another. > > > Perhaps rephrasing and adding some examples might help to explain > > > suggestion a bit better? > > > > Ok, so what I'm saying is that I think describing a location for the > > package itself could be problematic. For some cases it will be ok, > > but depending on exactly what the package represents on a particular > > platform there could be a lot of options for how to represent it. > > > > What I'm suggesting instead is that instead of giving a location for > > itself, the package lists the locations of all the threads it will > > contain when it is enabled/present/whatever. Those locations can be > > given as node/socket/core/thread tuples - which are properties that > > cpu threads already need to have, so we're not making the possible > > inadequacy of that information any worse than it already was. > > > > Examples.. so I'm not really sure how to write QOM objects, but I hope > > this is clear enough: > > > > On x86 > > .../cpu-package[0] (type 'acpi-thread') > > present = true > > location[0] = (node 0, socket 0, core 0, thread 0) > > thread[0] = <link to cpu thread object> > > .../cpu-package[1] (type 'acpi-thread') > > present = false > > location[0] = (node 0, socket 0, core 0, thread 1) > > > > On Power > > .../cpu-package[0] (type 'spapr-core') > > present = true > > location[0] = (node 0, socket 0, core 0, thread 0) > > location[1] = (node 0, socket 0, core 0, thread 1) > > ... > > location[7] = (node 0, socket 0, core 0, thread 7) > > thread[0] = <link...> > > ... > > thread[7] = >link...> > > .../cpu-package[1] (type 'spapr-core') > > present = false > > location[0] = (node 0, socket 0, core 0, thread 0) > > location[1] = (node 0, socket 0, core 0, thread 1) > > ... > > location[7] = (node 0, socket 0, core 0, thread 7) > > > > Does that make sense? > > > > > > > or > > > > > * via QMP/HMP command that would provide the same information > > > > > only without need to precreate anything. The only difference > > > > > is that it allows to use -device/device_add for new CPUs. > > > > > > > > I'd be ok with that option as well. I'd be thinking it would be > > > > implemented via a class method on the package object which returns the > > > > addresses that its contained threads will have, whether or not they're > > > > present right now. Does that make sense? > > > In this RFC it's MachineClass.possible_cpus method which is a bit more > > > flexible as it allows a board to describe possible CPU devices (whatever > > > they might be: sockets|cores|threads|some_chip_module) and their > > > properties > > > without forcing board to precreate cpu_package objects which should convey > > > the same info one way or another. > > > > Hmm.. so my RFC so far (at least the revised version based on > > Eduardo's comments) is that the cpu_package objects are always > > precreated. In future we might allow dynamic construction, but that > > will require a bunch more thinking to designt the right interfaces. > > > > > > > Considering that we would need to create HMP command so user could > > > > > inspect possible CPUs from monitor, it would need to do the same as > > > > > QMP command regardless of whether it's cpu-package objects or > > > > > just board calculated info a runtime. > > > > > > > > > > > In the design Eduardo and I have been discussing we're actually not > > > > > > planning to allow device_add to construct CPU packages - at least, > > > > > > not > > > > > > for the time being. The idea is that the machine type will > > > > > > construct > > > > > > enough packages for maxcpus, and management just toggles them on and > > > > > > off. > > > > > Another question is how it would work wrt migration? > > > > > > > > I'm assuming the "present" bits would be added to the migration > > > > stream; seems straightforward enough to me. Is there some > > > > consideration I'm missing? > > > It's hard to estimate how cpu-package objects might complicate > > > migration. It should not break migration for old machine types > > > and if possible it should work for backwards migration to older > > > QEMU versions (to be downstream friendly). > > > > So, the simple way to achieve that is to only instantiate the > > cpu-package objects on newer machine types. Older machine types will > > instatiate the cpu threads directly from the machine type in the old > > way, and (except for x86) won't allow cpu hotplug. > > > > I think that's a reasonable first approach. Later we can look at > > migrating a non-package setup to a package setup, if it looks like > > that will be useful. > > > > > If we go typical '-device/device_add whatever_cpu_device,foo_options_list' > > > route then it would allow us to replicate older device models without > > > issues (I don't expect any in x86 case) as it's what CPUs are now under > > > the hood. > > > This RFC doesn't force us to re-factor device models in order to use > > > hotplug (where CPU objects are already self-sufficient devices/hotplug > > > capable). > > > > > > It rather tries completely split interface aspect from how we are > > > internally model CPU hotplug, and tries to solve issue with > > > > > > -device/device_add for which we need to provide > > > 'what type to plug' and 'where to plug, which options to set to what' > > > > > > It's 1st level per you proposal, later we can do 2nd level on top of it > > > using cpu-packages(flip present property) to simplify mgmt's job > > > if it still would really needed (i.e. mgmt won't be able to cope with > > > -device, which it already has support for). > > > > Yeah.. so the thing is, in the short term I'm really more interested > > in the 2nd layer interface. It's something we can actually use, > > whereas the 1st layer interfaace still has a lot of potential > > complications. > What complications do you see from POWER point if view?
I don't relaly see any complications specific to Power. But the biggest issue, as far as I can tell is how do we advertise to the user / management layer what sorts of CPUs can be hotplugged - how many, what types are possible and so forth. The constraints here could in theory be pretty complex. > > This is why Eduardo suggested - and I agreed - that it's probably > > better to implement the "1st layer" as an internal structure/interface > > only, and implement the 2nd layer on top of that. When/if we need to > > we can revisit a user-accessible interface to the 1st layer. > We are going around QOM based CPU introspecting interface for > years now and that's exactly what 2nd layer is, just another > implementation. I've just lost hope in this approach. > > What I'm suggesting in this RFC is to forget controversial > QOM approach for now and use -device/device_add + QMP introspection, > i.e. completely split interface from how boards internally implement > CPU hotplug. I can see the appeal of that approach at this juncture. Hmm.. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature