Re: [Qemu-devel] [RFC 0/2] cpu-add compatibility for query-hotpluggable-cpus implementations

2016-07-21 Thread David Gibson
On Tue, Jul 19, 2016 at 09:58:59AM +0530, Bharata B Rao wrote:
> On Mon, Jul 18, 2016 at 06:20:35PM +0200, Igor Mammedov wrote:
> > On Mon, 18 Jul 2016 17:06:18 +0200
> > Peter Krempa  wrote:
> > 
> > > On Mon, Jul 18, 2016 at 19:19:18 +1000, David Gibson wrote:
> > > > I'm not entirely sure if this is a good idea, and if it is whether
> > > > this is a good approach to it.  But I'd like to discuss it and see if
> > > > anyone has better ideas.
> > > > 
> > > > As you may know we've hit a bunch of complications with cpu_index
> > > > which will impose some limitations with what we can do with the new
> > > > query-hotpluggable-cpus interface, and we've run out of time to
> > > > address these in qemu-2.7.
> > > >
> > > > At the same time we're hitting complications with the fact that the
> > > > new qemu interface requires a new libvirt interface to use properly,
> > > > and that has follow on effects further up the stack.  
> > > 
> > > The libvirt interface is basically now depending on adding a working
> > > implementation for qemu or a different hypervisor. APIs without
> > > implementation are not accepted upstream.
> > > 
> > > It looks like there are the following problems which make the above
> > > hard:
> > > 
> > > First of the problem is the missing link between the NUMA topology
> > > (currently confirured via 'cpu id' which is not linked in any way to the
> > > query-hotpluggable-cpus entries). This basically means that I'll have to
> > > re-implement the qemu numbering scheme and hope that it doesn't change
> > > until a better approach is added.
> > with current 'in order' plug/unplug limitation behavior is the same as
> > for cpu-add (wrt x86) so device_add could be used as direct replacement
> > of cpu-add in NUMA case.
> > 
> > Numa node to CPU in query-hotpluggable-cpus a missing part
> > but once numa mapping for hotplugged CPUs (which is broken now) is fixed
> > (fix https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00595.html)
> > I'll be ready to extend x86.query-hotpluggable-cpus with numa mapping
> > that -numa cpus=1,2,3... happened to configure.
> > (note: that device_add cpu,node=X that doesn't match whatever has been
> > configured with -numa cpus=... will rise error, as numa configuration
> > is static and fixed at VM creation time, meaning that "node" option
> > in query-hotpluggable-cpus is optional and only to inform users to
> > which node cpu belongs)
> > 
> > > Secondly from my understanding of the current state it's impossible to
> > > select an arbitrary cpu to hotplug but they need to happen 'in order' of
> > > the cpu id pointed out above (which is not accessible). The grand plan
> > > is to allow adding the cpus in any order. This makes the feature look
> > > like a proof of concept rather than something useful.
> 
> > having out-of-order plug/unplug would be nice but that wasn't
> > the grand plan. Main reason is to replace cpu-add with 'device_add cpu' and
> > on top of that provide support for 'device_del cpu' instead of adding cpu-del
> > command.
> > And as result of migration to device_add to avoid changing -smp to match
> > present cpus count on target and reuse the same interface as other devices.
> > 
> > We can still pick 'out of order' device_add cpu using migration_id patch
> > and revert in-order limit patch. It would work for x86,
> > but I think there were issues with SPAPR, that's why I'm in favor of
> > in-order limit approach.
> 
> Not that the migration_id patch doesn't work for sPAPR, but it was felt
> that having too many IDs (cpu_dt_id, arch_id, migration_id) is not
> good/idea/preferable and could cause confusion.

I was also concerned that adding another id would be yet another layer
of things we needed to maintain compatibility with in future.

> I am not clear as to why limiting the out-of-order hotplug is a show
> stopper for libvirt actually. Isn't that how it is for cpu-add currently ?
> 
> Regards,
> Bharata.
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [RFC 0/2] cpu-add compatibility for query-hotpluggable-cpus implementations

2016-07-19 Thread Peter Krempa
On Mon, Jul 18, 2016 at 18:20:35 +0200, Igor Mammedov wrote:
> On Mon, 18 Jul 2016 17:06:18 +0200
> Peter Krempa  wrote:
> > On Mon, Jul 18, 2016 at 19:19:18 +1000, David Gibson wrote:

[...]

> > First of the problem is the missing link between the NUMA topology
> > (currently confirured via 'cpu id' which is not linked in any way to the
> > query-hotpluggable-cpus entries). This basically means that I'll have to
> > re-implement the qemu numbering scheme and hope that it doesn't change
> > until a better approach is added.
> with current 'in order' plug/unplug limitation behavior is the same as
> for cpu-add (wrt x86) so device_add could be used as direct replacement
> of cpu-add in NUMA case.
> 
> Numa node to CPU in query-hotpluggable-cpus a missing part
> but once numa mapping for hotplugged CPUs (which is broken now) is fixed
> (fix https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00595.html)
> I'll be ready to extend x86.query-hotpluggable-cpus with numa mapping
> that -numa cpus=1,2,3... happened to configure.

So this is one instance where we need to know the relation between the
cpu 'number' and the entry in query-hotpluggable-cpus. (see below ...)

I'm aware, though, that there is a plan to handle the NUMA mapping
differently in an upcoming release, so that may eventually be solved.
> (note: that device_add cpu,node=X that doesn't match whatever has been
> configured with -numa cpus=... will rise error, as numa configuration
> is static and fixed at VM creation time, meaning that "node" option
> in query-hotpluggable-cpus is optional and only to inform users to
> which node cpu belongs)

That is okay in this regard; I'm not planning on modifying the
configuration once we start it. We do, however, need to allow keeping an
arbitrary configuration, as is possible now.

> > Secondly from my understanding of the current state it's impossible to
> > select an arbitrary cpu to hotplug but they need to happen 'in order' of
> > the cpu id pointed out above (which is not accessible). The grand plan
> > is to allow adding the cpus in any order. This makes the feature look
> > like a proof of concept rather than something useful.
> having out-of-order plug/unplug would be nice but that wasn't

Not only nice but it's really necessary for NUMA enabled guests so that
we can plug cpus into a given node. Otherwise NUMA guests can't take any
advantage of this since you can't control where to add vcpus.

> the grand plan. Main reason is to replace cpu-add with 'device_add cpu' and
> on top of that provide support for 'device_del cpu' instead of adding cpu-del
> command.

Unfortunately, the combination of the following:

- necessity to plug the vcpus in a certain order

- query-hotpluggable-cpus not reporting any ordering information

- order of entries in query-hotpluggable-cpus is arbitrary
  The documentation doesn't codify any ordering of the entries. This
  series also contains a patch that changes the order, thus the order
  information is unreliable.

makes the interface unusable as-is.

With the interface as it is, there is no reliable way to select the
correct entry to plug. We can only guess (basically reimplement qemu's
algorithm for numbering the cpus).

By codifying the order of entries (any order will do, as long as it does
not change afterwards) or by numbering the entries, we can at least
eliminate the guessing, and it would be possible to actually use this.
(This basically means that either the order or an index will in the end
encode the information that I've requested earlier [1])

As for the NUMA node numbering, we can still guess (reimplement qemu's
algorithm for the numbering) with the data scraped from the above
information (thus basically inferring the 'index' of the cpus [1]). This
can later be switched to the new interface once it is done.

The gist of the above is that by disallowing arbitrary order of hotplug
you basically need to tell libvirt the 'index' of the cpus either
directly or indirectly by inference from the order of entries in
query-hotpluggable-cpus and the 'vcpus-count' field.
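A rough sketch of the inference described above, for illustration only.
It is not libvirt or qemu code. It assumes the entries are returned in a
codified plug order and that cpu indexes start at 0 and are assigned
contiguously, which is exactly the guarantee that is missing today. Only
the 'vcpus-count' field name comes from the interface; the function name
and the sample data are made up.

```python
def infer_cpu_indexes(entries):
    """Return, per entry, the list of cpu indexes it would cover.

    Assumption (not guaranteed by qemu): 'entries' is in plug order and
    indexes are handed out contiguously starting from 0.
    """
    result = []
    next_index = 0
    for entry in entries:
        count = entry["vcpus-count"]
        result.append(list(range(next_index, next_index + count)))
        next_index += count
    return result


# Hypothetical output of query-hotpluggable-cpus, reduced to one field.
sample_entries = [
    {"vcpus-count": 2},
    {"vcpus-count": 2},
    {"vcpus-count": 2},
]

for position, indexes in enumerate(infer_cpu_indexes(sample_entries)):
    print(position, indexes)
```

If the order of entries changes between qemu versions, the inferred
indexes change with it, which is why the order would have to be codified.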

> And as result of migration to device_add to avoid changing -smp to match
> present cpus count on target and reuse the same interface as other devices.
> 
> We can still pick 'out of order' device_add cpu using migration_id patch
> and revert in-order limit patch. It would work for x86,
> but I think there were issues with SPAPR, that's why I'm in favor of
> in-order limit approach.

[...]

> > > To make this work, I need to broaden the semantics of cpu-add: it will
> > > a single entry from query-hotpluggable-cpus, which means it may add
> > > multiple vcpus, which the x86 implementation did not previously do.  
> > 
> > See my response to 2/2. If this requires to add -device for the
> > hotplugged entries when migrating it basically doesn't help at all.
> > 
> > > I'm not sure if the intended semantics of cpu-add were ever defined
> > > well enough to say if this is "wrong" or not.  
> > 
> > 

Re: [Qemu-devel] [RFC 0/2] cpu-add compatibility for query-hotpluggable-cpus implementations

2016-07-19 Thread Igor Mammedov
On Tue, 19 Jul 2016 09:58:59 +0530
Bharata B Rao  wrote:

> On Mon, Jul 18, 2016 at 06:20:35PM +0200, Igor Mammedov wrote:
> > On Mon, 18 Jul 2016 17:06:18 +0200
> > Peter Krempa  wrote:
> >   
> > > On Mon, Jul 18, 2016 at 19:19:18 +1000, David Gibson wrote:  
> > > > I'm not entirely sure if this is a good idea, and if it is whether
> > > > this is a good approach to it.  But I'd like to discuss it and see if
> > > > anyone has better ideas.
> > > > 
> > > > As you may know we've hit a bunch of complications with cpu_index
> > > > which will impose some limitations with what we can do with the new
> > > > query-hotpluggable-cpus interface, and we've run out of time to
> > > > address these in qemu-2.7.
> > > >
> > > > At the same time we're hitting complications with the fact that the
> > > > new qemu interface requires a new libvirt interface to use properly,
> > > > and that has follow on effects further up the stack.
> > > 
> > > The libvirt interface is basically now depending on adding a working
> > > implementation for qemu or a different hypervisor. APIs without
> > > implementation are not accepted upstream.
> > > 
> > > It looks like there are the following problems which make the above
> > > hard:
> > > 
> > > First of the problem is the missing link between the NUMA topology
> > > (currently confirured via 'cpu id' which is not linked in any way to the
> > > query-hotpluggable-cpus entries). This basically means that I'll have to
> > > re-implement the qemu numbering scheme and hope that it doesn't change
> > > until a better approach is added.  
> > with current 'in order' plug/unplug limitation behavior is the same as
> > for cpu-add (wrt x86) so device_add could be used as direct replacement
> > of cpu-add in NUMA case.
> > 
> > Numa node to CPU in query-hotpluggable-cpus a missing part
> > but once numa mapping for hotplugged CPUs (which is broken now) is fixed
> > (fix https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00595.html)
> > I'll be ready to extend x86.query-hotpluggable-cpus with numa mapping
> > that -numa cpus=1,2,3... happened to configure.
> > (note: that device_add cpu,node=X that doesn't match whatever has been
> > configured with -numa cpus=... will rise error, as numa configuration
> > is static and fixed at VM creation time, meaning that "node" option
> > in query-hotpluggable-cpus is optional and only to inform users to
> > which node cpu belongs)
> >   
> > > Secondly from my understanding of the current state it's impossible to
> > > select an arbitrary cpu to hotplug but they need to happen 'in order' of
> > > the cpu id pointed out above (which is not accessible). The grand plan
> > > is to allow adding the cpus in any order. This makes the feature look
> > > like a proof of concept rather than something useful.  
> 
> > having out-of-order plug/unplug would be nice but that wasn't
> > the grand plan. Main reason is to replace cpu-add with 'device_add cpu' and
> > > on top of that provide support for 'device_del cpu' instead of adding cpu-del
> > > command.
> > And as result of migration to device_add to avoid changing -smp to match
> > present cpus count on target and reuse the same interface as other devices.
> > 
> > We can still pick 'out of order' device_add cpu using migration_id patch
> > and revert in-order limit patch. It would work for x86,
> > but I think there were issues with SPAPR, that's why I'm in favor of
> > in-order limit approach.  
> 
> Not that the migration_id patch doesn't work for sPAPR, but it was felt
> that having too many IDs (cpu_dt_id, arch_id, migration_id) is not
> good/idea/preferable and could cause confusion.
migration_id is an internal thing and doesn't concern libvirt at all,
so it will be a QEMU-only matter that we can deal with later, either by
eliminating cpu_index and leaving only migration_id or by merging them
into one id after the cpu_index refactoring.

> I am not clear as to why limiting the out-of-order hotplug is a show
> stopper for libvirt actually. Isn't that how it is for cpu-add currently ?
It's not a show stopper, but as Eric pointed out there is a caveat.
If we ship a limited device_add, then we would need to extend the external
interface to report that out-of-order creation is available.

From that point of view it's better to go the migration_id route,
keeping the external API simple, if spapr is able to handle out-of-order
cpu creation and migration.


> 
> Regards,
> Bharata.
> 




Re: [Qemu-devel] [RFC 0/2] cpu-add compatibility for query-hotpluggable-cpus implementations

2016-07-18 Thread Bharata B Rao
On Mon, Jul 18, 2016 at 06:20:35PM +0200, Igor Mammedov wrote:
> On Mon, 18 Jul 2016 17:06:18 +0200
> Peter Krempa  wrote:
> 
> > On Mon, Jul 18, 2016 at 19:19:18 +1000, David Gibson wrote:
> > > I'm not entirely sure if this is a good idea, and if it is whether
> > > this is a good approach to it.  But I'd like to discuss it and see if
> > > anyone has better ideas.
> > > 
> > > As you may know we've hit a bunch of complications with cpu_index
> > > which will impose some limitations with what we can do with the new
> > > query-hotpluggable-cpus interface, and we've run out of time to
> > > address these in qemu-2.7.
> > >
> > > At the same time we're hitting complications with the fact that the
> > > new qemu interface requires a new libvirt interface to use properly,
> > > and that has follow on effects further up the stack.  
> > 
> > The libvirt interface is basically now depending on adding a working
> > implementation for qemu or a different hypervisor. APIs without
> > implementation are not accepted upstream.
> > 
> > It looks like there are the following problems which make the above
> > hard:
> > 
> > First of the problem is the missing link between the NUMA topology
> > (currently confirured via 'cpu id' which is not linked in any way to the
> > query-hotpluggable-cpus entries). This basically means that I'll have to
> > re-implement the qemu numbering scheme and hope that it doesn't change
> > until a better approach is added.
> with current 'in order' plug/unplug limitation behavior is the same as
> for cpu-add (wrt x86) so device_add could be used as direct replacement
> of cpu-add in NUMA case.
> 
> Numa node to CPU in query-hotpluggable-cpus a missing part
> but once numa mapping for hotplugged CPUs (which is broken now) is fixed
> (fix https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00595.html)
> I'll be ready to extend x86.query-hotpluggable-cpus with numa mapping
> that -numa cpus=1,2,3... happened to configure.
> (note: that device_add cpu,node=X that doesn't match whatever has been
> configured with -numa cpus=... will rise error, as numa configuration
> is static and fixed at VM creation time, meaning that "node" option
> in query-hotpluggable-cpus is optional and only to inform users to
> which node cpu belongs)
> 
> > Secondly from my understanding of the current state it's impossible to
> > select an arbitrary cpu to hotplug but they need to happen 'in order' of
> > the cpu id pointed out above (which is not accessible). The grand plan
> > is to allow adding the cpus in any order. This makes the feature look
> > like a proof of concept rather than something useful.

> having out-of-order plug/unplug would be nice but that wasn't
> the grand plan. Main reason is to replace cpu-add with 'device_add cpu' and
> on top of that provide support for 'device_del cpu' instead of adding cpu-del
> command.
> And as result of migration to device_add to avoid changing -smp to match
> present cpus count on target and reuse the same interface as other devices.
> 
> We can still pick 'out of order' device_add cpu using migration_id patch
> and revert in-order limit patch. It would work for x86,
> but I think there were issues with SPAPR, that's why I'm in favor of
> in-order limit approach.

It's not that the migration_id patch doesn't work for sPAPR, but it was felt
that having too many IDs (cpu_dt_id, arch_id, migration_id) is not a
good idea and could cause confusion.

I am not clear as to why limiting out-of-order hotplug is actually a show
stopper for libvirt. Isn't that how it is for cpu-add currently?

Regards,
Bharata.




Re: [Qemu-devel] [RFC 0/2] cpu-add compatibility for query-hotpluggable-cpus implementations

2016-07-18 Thread David Gibson
On Mon, Jul 18, 2016 at 05:06:18PM +0200, Peter Krempa wrote:
> On Mon, Jul 18, 2016 at 19:19:18 +1000, David Gibson wrote:
> > I'm not entirely sure if this is a good idea, and if it is whether
> > this is a good approach to it.  But I'd like to discuss it and see if
> > anyone has better ideas.
> > 
> > As you may know we've hit a bunch of complications with cpu_index
> > which will impose some limitations with what we can do with the new
> > query-hotpluggable-cpus interface, and we've run out of time to
> > address these in qemu-2.7.
> >
> > At the same time we're hitting complications with the fact that the
> > new qemu interface requires a new libvirt interface to use properly,
> > and that has follow on effects further up the stack.
> 
> The libvirt interface is basically now depending on adding a working
> implementation for qemu or a different hypervisor. APIs without
> implementation are not accepted upstream.
> 
> It looks like there are the following problems which make the above
> hard:
> 
> First of the problem is the missing link between the NUMA topology
> (currently confirured via 'cpu id' which is not linked in any way to the
> query-hotpluggable-cpus entries). This basically means that I'll have to
> re-implement the qemu numbering scheme and hope that it doesn't change
> until a better approach is added.

I have at least a start on how to fix this in mind, and it's the next
thing I'll work on.  However, it obviously won't be merged for qemu-2.7.

> Secondly from my understanding of the current state it's impossible to
> select an arbitrary cpu to hotplug but they need to happen 'in order' of
> the cpu id pointed out above (which is not accessible). The grand plan
> is to allow adding the cpus in any order. This makes the feature look
> like a proof of concept rather than something useful.

Alas, yes :(.  Again, I have a plan on this, but it's missed the 2.7
window.

> The two problems above make this feature hard to implement and hard to
> sell to libvirt's upstream.
> 
> > Together this means a bunch more delays to having usable CPU hotplug
> > on Power for downstream users, which is unfortunate.
> 
> I'm not in favor of adding upstream hacks for sake of downstream
> deadlines.

As a rule, I'm not either.  But if the hacks are small and isolated
enough, I think it can be reasonable.  Whether that's the case is what
I'm trying to assess here.

> > This is an attempt to get something limited working in a shorter time
> > frame, by implementing the old cpu-add interface in terms of the new
> > interface.  Obviously this can't fully exploit the new interface's
> > capabilities, but you can do basic in-order cpu hotplug without removal.
> 
> As a side note, cpu-add technically allows out of order usage. Libvirt
> didn't use it that way though.

Yes, I know.  I gather it will break migration though.  With this
patch out-of-order cpu-add will fail because of the test enforcing
in-order device_add.
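To make the in-order restriction concrete, here is a small sketch. It is
not the actual test in the patch. It assumes the hotpluggable entries can
be viewed as a list of slots in plug order, with no removal, and that
only the first empty slot may be filled. All names are hypothetical.

```python
def next_pluggable(present):
    """Index of the first empty slot, or None if every slot is filled."""
    for slot, is_present in enumerate(present):
        if not is_present:
            return slot
    return None


def plug(present, slot):
    """Fill 'slot' in place, rejecting anything but the next slot in order."""
    expected = next_pluggable(present)
    if expected is None:
        raise ValueError("all entries are already plugged")
    if slot != expected:
        raise ValueError(
            f"out-of-order plug: expected slot {expected}, got {slot}")
    present[slot] = True


# One entry present at boot, two still empty (made-up configuration).
slots = [True, False, False]
plug(slots, 1)        # accepted: slot 1 is the next one in order
print(slots)
```

Under that assumption a request for slot 2 before slot 1 is refused,
which is the behaviour out-of-order cpu-add would now run into.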

> > To make this work, I need to broaden the semantics of cpu-add: it will
> > a single entry from query-hotpluggable-cpus, which means it may add
> > multiple vcpus, which the x86 implementation did not previously do.
> 
> See my response to 2/2. If this requires to add -device for the
> hotplugged entries when migrating it basically doesn't help at all.

It doesn't.  But it does require a more complex calculation of how to
increase -smp.

> > I'm not sure if the intended semantics of cpu-add were ever defined
> > well enough to say if this is "wrong" or not.
> 
> For x86 I'll also need to experiment with the combined use of cpu-add
> and device_add interfaces. I plan to add a implementation which
> basically uses the old API in libvirt but calls the new APIs in qemu if
> they were used previously. (We still need to fall back to the old API
> for migration compatibility)

> > Because of this, I suspect libvirt will still need some work, but I'm
> > hoping it might be less that the full new API implementation.
> 
> Mostly as adding a single entry via the interface will result in
> multiple entries in query-cpus. Also libvirt's interface takes the
> target number of vcpus as argument so any increment that is not
> divisible by the thread count needs to be rejected.

Yes.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [RFC 0/2] cpu-add compatibility for query-hotpluggable-cpus implementations

2016-07-18 Thread Igor Mammedov
On Mon, 18 Jul 2016 17:06:18 +0200
Peter Krempa  wrote:

> On Mon, Jul 18, 2016 at 19:19:18 +1000, David Gibson wrote:
> > I'm not entirely sure if this is a good idea, and if it is whether
> > this is a good approach to it.  But I'd like to discuss it and see if
> > anyone has better ideas.
> > 
> > As you may know we've hit a bunch of complications with cpu_index
> > which will impose some limitations with what we can do with the new
> > query-hotpluggable-cpus interface, and we've run out of time to
> > address these in qemu-2.7.
> >
> > At the same time we're hitting complications with the fact that the
> > new qemu interface requires a new libvirt interface to use properly,
> > and that has follow on effects further up the stack.  
> 
> The libvirt interface is basically now depending on adding a working
> implementation for qemu or a different hypervisor. APIs without
> implementation are not accepted upstream.
> 
> It looks like there are the following problems which make the above
> hard:
> 
> First of the problem is the missing link between the NUMA topology
> (currently confirured via 'cpu id' which is not linked in any way to the
> query-hotpluggable-cpus entries). This basically means that I'll have to
> re-implement the qemu numbering scheme and hope that it doesn't change
> until a better approach is added.
With the current 'in order' plug/unplug limitation, the behavior is the same as
for cpu-add (wrt x86), so device_add could be used as a direct replacement
for cpu-add in the NUMA case.

The NUMA node to CPU mapping in query-hotpluggable-cpus is a missing part,
but once the numa mapping for hotplugged CPUs (which is broken now) is fixed
(fix https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg00595.html)
I'll be ready to extend x86.query-hotpluggable-cpus with the numa mapping
that -numa cpus=1,2,3... happened to configure.
(note: a device_add cpu,node=X that doesn't match whatever has been
configured with -numa cpus=... will raise an error, as the numa configuration
is static and fixed at VM creation time, meaning that the "node" option
in query-hotpluggable-cpus is optional and only informs users
which node a cpu belongs to)
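The note above can be sketched as follows. This is only an illustration
of the rule, not qemu code. It assumes the static -numa cpus=...
configuration is available as a mapping from cpu index to node. The
function and variable names are hypothetical.

```python
def resolve_node(cpu_index, requested_node, numa_map):
    """Return the node a cpu belongs to according to the static mapping.

    'requested_node' models an optional node=X on device_add: it may be
    omitted (None), but if given it must match the configured node.
    """
    configured = numa_map.get(cpu_index)
    if requested_node is not None and requested_node != configured:
        raise ValueError(
            f"cpu {cpu_index}: node={requested_node} does not match "
            f"configured node {configured}")
    return configured


# Made-up equivalent of: -numa node,cpus=0-1 -numa node,cpus=2-3
numa_map = {0: 0, 1: 0, 2: 1, 3: 1}

print(resolve_node(2, None, numa_map))   # node omitted: informational only
print(resolve_node(2, 1, numa_map))      # node given and matching
```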

> Secondly from my understanding of the current state it's impossible to
> select an arbitrary cpu to hotplug but they need to happen 'in order' of
> the cpu id pointed out above (which is not accessible). The grand plan
> is to allow adding the cpus in any order. This makes the feature look
> like a proof of concept rather than something useful.
Having out-of-order plug/unplug would be nice, but that wasn't
the grand plan. The main reason is to replace cpu-add with 'device_add cpu' and,
on top of that, to provide support for 'device_del cpu' instead of adding a
cpu-del command.
And, as a result of migrating to device_add, to avoid changing -smp to match
the present cpu count on the target and to reuse the same interface as other devices.

We can still pick 'out of order' device_add cpu by using the migration_id patch
and reverting the in-order limit patch. It would work for x86,
but I think there were issues with SPAPR; that's why I'm in favor of
the in-order limit approach.

> The two problems above make this feature hard to implement and hard to
> sell to libvirt's upstream.
> 
> > Together this means a bunch more delays to having usable CPU hotplug
> > on Power for downstream users, which is unfortunate.  
> 
> I'm not in favor of adding upstream hacks for sake of downstream
> deadlines.
> 
> > This is an attempt to get something limited working in a shorter time
> > frame, by implementing the old cpu-add interface in terms of the new
> > interface.  Obviously this can't fully exploit the new interface's
> > capabilities, but you can do basic in-order cpu hotplug without removal.  
> 
> As a side note, cpu-add technically allows out of order usage. Libvirt
> didn't use it that way though.
Out-of-order cpu-add breaks migration; that's why it hasn't been used.

> > To make this work, I need to broaden the semantics of cpu-add: it will
> > a single entry from query-hotpluggable-cpus, which means it may add
> > multiple vcpus, which the x86 implementation did not previously do.  
> 
> See my response to 2/2. If this requires to add -device for the
> hotplugged entries when migrating it basically doesn't help at all.
> 
> > I'm not sure if the intended semantics of cpu-add were ever defined
> > well enough to say if this is "wrong" or not.  
> 
> For x86 I'll also need to experiment with the combined use of cpu-add
> and device_add interfaces.
It should work, though I'd not recommend using them together, as cpu-add
will be obsoleted eventually.

>I plan to add a implementation which
> basically uses the old API in libvirt but calls the new APIs in qemu if
> they were used previously. 
(skip)

>(We still need to fall back to the old API for migration compatibility)
Why?

> 
> > Because of this, I suspect libvirt will still need some work, but I'm
> > hoping it might be less that the full new API implementation.  
> 
> Mostly as adding a 

Re: [Qemu-devel] [RFC 0/2] cpu-add compatibility for query-hotpluggable-cpus implementations

2016-07-18 Thread Peter Krempa
On Mon, Jul 18, 2016 at 19:19:18 +1000, David Gibson wrote:
> I'm not entirely sure if this is a good idea, and if it is whether
> this is a good approach to it.  But I'd like to discuss it and see if
> anyone has better ideas.
> 
> As you may know we've hit a bunch of complications with cpu_index
> which will impose some limitations with what we can do with the new
> query-hotpluggable-cpus interface, and we've run out of time to
> address these in qemu-2.7.
>
> At the same time we're hitting complications with the fact that the
> new qemu interface requires a new libvirt interface to use properly,
> and that has follow on effects further up the stack.

The libvirt interface is basically now depending on adding a working
implementation for qemu or a different hypervisor. APIs without
implementation are not accepted upstream.

It looks like there are the following problems which make the above
hard:

The first problem is the missing link between the NUMA topology
(currently configured via 'cpu id', which is not linked in any way to the
query-hotpluggable-cpus entries). This basically means that I'll have to
re-implement the qemu numbering scheme and hope that it doesn't change
until a better approach is added.

Secondly, from my understanding of the current state, it's impossible to
select an arbitrary cpu to hotplug; the hotplugs need to happen 'in order' of
the cpu id pointed out above (which is not accessible). The grand plan
is to allow adding the cpus in any order. This makes the feature look
like a proof of concept rather than something useful.

The two problems above make this feature hard to implement and hard to
sell to libvirt's upstream.

> Together this means a bunch more delays to having usable CPU hotplug
> on Power for downstream users, which is unfortunate.

I'm not in favor of adding upstream hacks for sake of downstream
deadlines.

> This is an attempt to get something limited working in a shorter time
> frame, by implementing the old cpu-add interface in terms of the new
> interface.  Obviously this can't fully exploit the new interface's
> capabilities, but you can do basic in-order cpu hotplug without removal.

As a side note, cpu-add technically allows out of order usage. Libvirt
didn't use it that way though.

> To make this work, I need to broaden the semantics of cpu-add: it will
> a single entry from query-hotpluggable-cpus, which means it may add
> multiple vcpus, which the x86 implementation did not previously do.

See my response to 2/2. If this requires adding -device for the
hotplugged entries when migrating, it basically doesn't help at all.

> I'm not sure if the intended semantics of cpu-add were ever defined
> well enough to say if this is "wrong" or not.

For x86 I'll also need to experiment with the combined use of the cpu-add
and device_add interfaces. I plan to add an implementation which
basically uses the old API in libvirt but calls the new APIs in qemu if
they were used previously. (We still need to fall back to the old API
for migration compatibility.)

> Because of this, I suspect libvirt will still need some work, but I'm
> hoping it might be less that the full new API implementation.

Mostly as adding a single entry via the interface will result in
multiple entries in query-cpus. Also libvirt's interface takes the
target number of vcpus as argument so any increment that is not
divisible by the thread count needs to be rejected.
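A minimal sketch of that check, for illustration only. This is not
libvirt code. It assumes every hotpluggable entry adds the same number of
vcpus (the thread count) and that removal is not supported. The function
name and parameters are hypothetical.

```python
def entries_to_plug(current_vcpus, target_vcpus, threads_per_entry):
    """Translate a target vcpu count into a number of entries to plug.

    Rejects targets that cannot be reached by adding whole entries.
    """
    increment = target_vcpus - current_vcpus
    if increment < 0:
        raise ValueError("vcpu removal is not supported in this sketch")
    if increment % threads_per_entry != 0:
        raise ValueError(
            f"increment of {increment} vcpus is not divisible by the "
            f"thread count {threads_per_entry}")
    return increment // threads_per_entry


# Made-up example: 8 threads per entry, going from 8 to 24 vcpus.
print(entries_to_plug(8, 24, 8))
```

A request to go from 8 to 12 vcpus with 8 threads per entry would be
rejected here, since it would require plugging half an entry.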

Peter