Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

2024-05-13 Thread Michael Galaxy via Devel
One thing to keep in mind here (despite me not having any hardware to 
test) is that one of the original goals of the RDMA implementation was 
not simply raw throughput, nor raw latency, but the lack of CPU 
utilization in kernel space due to the offload. While it is entirely 
possible that newer hardware w/ TCP might compete, the significant 
reductions in CPU usage in the TCP/IP stack were a big win at the time.

Just something to consider while you're doing the testing.

- Michael

On 5/9/24 03:58, Zheng Chuan wrote:

Hi Peter, Lei, Jinpu,

On 2024/5/8 0:28, Peter Xu wrote:

On Tue, May 07, 2024 at 01:50:43AM +0000, Gonglei (Arei) wrote:

Hello,


-Original Message-
From: Peter Xu [mailto:pet...@redhat.com]
Sent: Monday, May 6, 2024 11:18 PM
To: Gonglei (Arei) 
Cc: Daniel P. Berrangé ; Markus Armbruster
; Michael Galaxy ; Yu Zhang
; Zhijian Li (Fujitsu) ; Jinpu Wang
; Elmar Gerdes ;
qemu-de...@nongnu.org; Yuval Shaia ; Kevin Wolf
; Prasanna Kumar Kalever
; Cornelia Huck ;
Michael Roth ; Prasanna Kumar Kalever
; integrat...@gluster.org; Paolo Bonzini
; qemu-bl...@nongnu.org; devel@lists.libvirt.org;
Hanna Reitz ; Michael S. Tsirkin ;
Thomas Huth ; Eric Blake ; Song
Gao ; Marc-André Lureau
; Alex Bennée ;
Wainer dos Santos Moschetta ; Beraldo Leal
; Pannengyuan ;
Xiexiangyou 
Subject: Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

On Mon, May 06, 2024 at 02:06:28AM +0000, Gonglei (Arei) wrote:

Hi, Peter

Hey, Lei,

Happy to see you around again after years.


Haha, me too.


RDMA features high bandwidth, low latency (in a non-blocking lossless
network), and direct remote memory access that bypasses the CPU (as you
know, CPU resources are expensive for cloud vendors, which is one of
the reasons why we introduced offload cards), which TCP does not have.

It's another cost to use offload cards, vs. preparing more CPU resources?


A converged software and hardware offload architecture is the way to go for all
cloud vendors
(considering the comprehensive benefits in terms of performance, cost, security,
and innovation speed);
it's not just a matter of adding the resources of a DPU card.


RDMA is very useful in some scenarios where fast live migration is needed
(extremely short interruption and migration durations). To this end, we
have also developed RDMA support for multifd.

Will any of you upstream that work?  I'm curious how intrusive it would be
to add it to multifd; if it can keep to only 5 exported functions like rdma.h
does right now, that would be pretty nice.  We also want to make sure it works
with arbitrarily sized loads and buffers, e.g. vfio is considering adding IO
loads to multifd channels too.


In fact, we sent the patchset to the community in 2021. Please see:
https://lore.kernel.org/all/20210203185906.GT2950@work-vm/T/

Yes, I sent the patchset adding multifd support for RDMA migration, taking
over from my colleague, and I am also sorry for not keeping up this work at
the time for various reasons.
I also strongly agree with Lei that the RDMA protocol has some special
advantages over TCP in certain scenarios, and we do indeed use it in our
product.


I wasn't aware of that in the past, for sure.

Multifd has changed quite a bit in the last 9.0 release, so that may not apply
anymore.  One thing to mention: please look at Dan's comment on the possible
use of rsocket.h:

https://lore.kernel.org/all/zjjm6rcqs5eho...@redhat.com/

And Jinpu did help provide an initial test result over the library:

https://lore.kernel.org/qemu-devel/camgffek8wiknqmouyxcathgtiem2dwocf_w7t0vmcd-i30t...@mail.gmail.com/

It looks like we have a chance to apply that in QEMU.




One thing to note is that the question here is not only about a pure
performance comparison between RDMA and NICs.  It's about helping us make a
decision on whether to drop RDMA; IOW, even if RDMA performs well, the
community still has the right to drop it if nobody can actively work on and
maintain it.  It's just that if NICs can perform as well, that is more of a
reason to drop it, unless companies can help to provide good support and work
together.


We are happy to provide the necessary review and maintenance work for RDMA
if the community needs it.

CC'ing Chuan Zheng.

I'm not sure whether you and Jinpu's team would like to work together and
provide a final solution for rdma over multifd.  It could be much simpler
than the original 2021 proposal if the rsocket API works out.

Thanks,


That's good news, to see a socket abstraction for RDMA!
When I developed the series above, the biggest pain was that RDMA migration
had no QIOChannel abstraction and I needed to take a 'fake channel' for it,
which was awkward in the code implementation.
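
For readers unfamiliar with it, the rsocket API from librdmacm mirrors the
BSD socket calls (rsocket/rconnect/rsend/rrecv/rclose), which is what makes a
QIOChannel-style wrapper plausible in the first place. A minimal client
sketch, assuming librdmacm's <rdma/rsocket.h> is available and using a
placeholder peer address:

/* Minimal rsocket client sketch: connect to a peer and send one buffer.
 * Build with: cc rsocket_demo.c -lrdmacm
 * The address 192.168.0.2:4444 is a placeholder, not taken from the thread. */
#include <rdma/rsocket.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>

int main(void)
{
    struct sockaddr_in dst = { 0 };
    const char msg[] = "hello over RDMA";
    int fd;

    dst.sin_family = AF_INET;
    dst.sin_port = htons(4444);
    inet_pton(AF_INET, "192.168.0.2", &dst.sin_addr);

    /* Same shape as socket()/connect()/send(), but backed by RDMA CM. */
    fd = rsocket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0 || rconnect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
        perror("rsocket/rconnect");
        return 1;
    }
    if (rsend(fd, msg, sizeof(msg), 0) < 0)
        perror("rsend");
    rclose(fd);
    return 0;
}

If that abstraction holds up, a QIOChannel implementation could delegate to
these calls instead of the hand-rolled protocol in migration/rdma.c, which is
essentially what the thread proposes to evaluate.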

Re: [PATCH 1/2] hw/core: allow parameter=1 for SMP topology on any machine

2024-05-13 Thread Daniel P. Berrangé
On Mon, May 13, 2024 at 10:22:22PM +0800, Zhao Liu wrote:
> Cc Paolo for x86 topology part
> 
> Hi Daniel,
> 
> On Mon, May 13, 2024 at 01:33:57PM +0100, Daniel P. Berrangé wrote:
> > Date: Mon, 13 May 2024 13:33:57 +0100
> > From: "Daniel P. Berrangé" 
> > Subject: [PATCH 1/2] hw/core: allow parameter=1 for SMP topology on any
> >  machine
> > 
> > This effectively reverts
> > 
> >   commit 54c4ea8f3ae614054079395842128a856a73dbf9
> >   Author: Zhao Liu 
> >   Date:   Sat Mar 9 00:01:37 2024 +0800
> > 
> > hw/core/machine-smp: Deprecate unsupported "parameter=1" SMP 
> > configurations
> > 
> > but is not done as a 'git revert' since the part of the changes to the
> > file hw/core/machine-smp.c which adds 'has_XXX' checks remains desirable.
> > Furthermore, we have to tweak the subsequently added unit test to
> > account for the differing warning message.
> > 
> > The rationale for the original deprecation was:
> > 
> >   "Currently, it was allowed for users to specify the unsupported
> >topology parameter as "1". For example, x86 PC machine doesn't
> >support drawer/book/cluster topology levels, but user could specify
> >"-smp drawers=1,books=1,clusters=1".
> > 
> >This is meaningless and confusing, so that the support for this kind
> >of configurations is marked deprecated since 9.0."
> > 
> > There are varying POVs on the topic of 'unsupported' topology levels.
> > 
> > It is common to say that on a system without hyperthreading, there
> > is always 1 thread. Likewise, when new CPUs introduced the concept of
> > multiple 'dies', it was reasonable to say that all historical CPUs
> > before that implicitly had 1 'die'. Likewise for the more recently
> > introduced 'modules' and 'clusters' parameters. From this POV, it is
> > valid to set 'parameter=1' on the -smp command line for any machine;
> > only a value > 1 is strictly an error condition.
> 
> It has become more and more difficult for QEMU to maintain a general
> topology hierarchy; there are two recent examples:
>
> 1. As you mentioned, "module" vs. "cluster": one reason for introducing
> "module" is that it is difficult to define what "cluster" means for x86.
> The cluster in the device tree can be nested, so it can correspond to
> an x86 die, or it can correspond to an x86 module. Therefore, specifying
> "clusters=1" for x86 is ambiguous.
>
> 2. s390 introduces book and drawer, which are above the socket/package
> level, but for x86, the level above the package is named "cluster" (yeah,
> "cluster" again :-(). So if a user sets "books=1" or "drawers=1" for x86,
> it's meaningless. Similarly, "clusters=1" is also very confusing for an
> x86 machine.
>
> I think that only thread/core/socket are architecturally general; the
> other topology levels are hard to define across architectures, so
> allowing unsupported topology=1 is always confusing...
>
> Moreover, QEMU currently requires a clear topology containment
> relationship when defining a topology, and it will become increasingly
> difficult to define a generic topology containment relationship as new
> topology levels are introduced in the future...

I'm failing to see what real-world technical problems QEMU faces
with a parameter being set to '1' by a mgmt app, when QEMU itself
treats all omitted values as being '1' anyway.

If we're trying to faithfully model the real world, then restricting
the topology against machine types still looks inherently wrong.
The valid topology ought to be constrained based on the named CPU model;
e.g. it doesn't make sense to allow 'dies=4' with a Skylake CPU model,
only with an EPYC CPU model, especially if we want to model cache info in
a way that matches the real-world silicon better.

> > It doesn't cause any functional difficulty for QEMU, because internally
> > the QEMU code is itself assuming that all "unsupported" parameters
> > implicitly have a value of '1'.
> > 
> > At the libvirt level, we've allowed applications to set 'parameter=1'
> > when configuring a guest, and pass that through to QEMU.
> > 
> > Deprecating this creates extra difficulty because there's no info
> > exposed from QEMU about which machine types "support" which parameters.
> > Thus, libvirt can't know whether it is valid to pass 'parameter=1' for
> > a given machine type, or whether it will trigger deprecation messages.
> 
> I understand that libvirt is having trouble because there is no interface
> to expose which topology levels the current machine supports. As a
> workaround to eliminate the difficulties at the libvirt level, it's
> ok for me.
> 
> But I believe deprecating the unsupported topology is necessary, so do
> you think it's acceptable to include an interface to expose the supported
> topology if it's going to be deprecated again later?

As above, I think that restrictions based on machine type, while nice and
simple, are incorrect long term. If we did impose restrictions based on
CPU model, then we could trivially expose this information

Re: [PATCH 1/2] hw/core: allow parameter=1 for SMP topology on any machine

2024-05-13 Thread Zhao Liu
Cc Paolo for x86 topology part

Hi Daniel,

On Mon, May 13, 2024 at 01:33:57PM +0100, Daniel P. Berrangé wrote:
> Date: Mon, 13 May 2024 13:33:57 +0100
> From: "Daniel P. Berrangé" 
> Subject: [PATCH 1/2] hw/core: allow parameter=1 for SMP topology on any
>  machine
> 
> This effectively reverts
> 
>   commit 54c4ea8f3ae614054079395842128a856a73dbf9
>   Author: Zhao Liu 
>   Date:   Sat Mar 9 00:01:37 2024 +0800
> 
> hw/core/machine-smp: Deprecate unsupported "parameter=1" SMP 
> configurations
> 
> but is not done as a 'git revert' since the part of the changes to the
> file hw/core/machine-smp.c which adds 'has_XXX' checks remains desirable.
> Furthermore, we have to tweak the subsequently added unit test to
> account for the differing warning message.
> 
> The rationale for the original deprecation was:
> 
>   "Currently, it was allowed for users to specify the unsupported
>topology parameter as "1". For example, x86 PC machine doesn't
>support drawer/book/cluster topology levels, but user could specify
>"-smp drawers=1,books=1,clusters=1".
> 
>This is meaningless and confusing, so that the support for this kind
>of configurations is marked deprecated since 9.0."
> 
> There are varying POVs on the topic of 'unsupported' topology levels.
> 
> It is common to say that on a system without hyperthreading, there
> is always 1 thread. Likewise, when new CPUs introduced the concept of
> multiple 'dies', it was reasonable to say that all historical CPUs
> before that implicitly had 1 'die'. Likewise for the more recently
> introduced 'modules' and 'clusters' parameters. From this POV, it is
> valid to set 'parameter=1' on the -smp command line for any machine;
> only a value > 1 is strictly an error condition.

It has become more and more difficult for QEMU to maintain a general
topology hierarchy; there are two recent examples:

1. As you mentioned, "module" vs. "cluster": one reason for introducing
"module" is that it is difficult to define what "cluster" means for x86.
The cluster in the device tree can be nested, so it can correspond to
an x86 die, or it can correspond to an x86 module. Therefore, specifying
"clusters=1" for x86 is ambiguous.

2. s390 introduces book and drawer, which are above the socket/package
level, but for x86, the level above the package is named "cluster" (yeah,
"cluster" again :-(). So if a user sets "books=1" or "drawers=1" for x86,
it's meaningless. Similarly, "clusters=1" is also very confusing for an
x86 machine.

I think that only thread/core/socket are architecturally general; the
other topology levels are hard to define across architectures, so
allowing unsupported topology=1 is always confusing...

Moreover, QEMU currently requires a clear topology containment
relationship when defining a topology, and it will become increasingly
difficult to define a generic topology containment relationship as new
topology levels are introduced in the future...

> It doesn't cause any functional difficulty for QEMU, because internally
> the QEMU code is itself assuming that all "unsupported" parameters
> implicitly have a value of '1'.
> 
> At the libvirt level, we've allowed applications to set 'parameter=1'
> when configuring a guest, and pass that through to QEMU.
> 
> Deprecating this creates extra difficulty because there's no info
> exposed from QEMU about which machine types "support" which parameters.
> Thus, libvirt can't know whether it is valid to pass 'parameter=1' for
> a given machine type, or whether it will trigger deprecation messages.

I understand that libvirt is having trouble because there is no interface
to expose which topology levels the current machine supports. As a
workaround to eliminate the difficulties at the libvirt level, it's
ok for me.

But I believe deprecating the unsupported topology is necessary, so do
you think it's acceptable to include an interface to expose the supported
topology if it's going to be deprecated again later?

Regards,
Zhao


[PATCH v2] qemu_hotplug: Properly assign USB address to hotplugged usb-net device

2024-05-13 Thread Rayhan Faizel
Previously, the network device hotplug logic would only try to ensure CCW or
PCI addresses. With the recent support for the usb-net model, this patch
ensures USB addresses for usb-net network devices.

[Changes in v2]
- Additional information in commit message
- Relevant gitlab issue link (mostly already resolved but hotplugging is
mentioned in one of the comments)

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/14
Signed-off-by: Rayhan Faizel 
---
 src/qemu/qemu_hotplug.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index 774962b0df..3b39941780 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -1159,8 +1159,11 @@ qemuDomainAttachNetDevice(virQEMUDriver *driver,
 goto cleanup;
 }
 
-if (qemuDomainIsS390CCW(vm->def) &&
-net->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI) {
+if (net->model == VIR_DOMAIN_NET_MODEL_USB_NET) {
+if (virDomainUSBAddressEnsure(priv->usbaddrs, &net->info) < 0)
+goto cleanup;
+} else if (qemuDomainIsS390CCW(vm->def) &&
+   net->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI) {
 net->info.type = VIR_DOMAIN_DEVICE_ADDRESS_TYPE_CCW;
 if (!(ccwaddrs = virDomainCCWAddressSetCreateFromDomain(vm->def)))
 goto cleanup;
-- 
2.34.1
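
As a usage illustration only (not part of the patch): with this fix applied, a
management application can hotplug a usb-net interface through the plain
libvirt C API and rely on libvirt to assign the USB address. A sketch, where
the connection URI, domain name, network name and the 'usb-net' model spelling
are placeholders/assumptions:

/* Sketch: hotplug a usb-net interface into a running domain.
 * Names are placeholders; error handling is kept minimal. */
#include <libvirt/libvirt.h>
#include <stdio.h>

int main(void)
{
    const char *ifxml =
        "<interface type='network'>"
        "  <source network='default'/>"
        "  <model type='usb-net'/>"      /* assumed XML spelling of the model */
        "</interface>";
    virConnectPtr conn = virConnectOpen("qemu:///system");
    virDomainPtr dom = conn ? virDomainLookupByName(conn, "demo-vm") : NULL;

    if (dom && virDomainAttachDeviceFlags(dom, ifxml,
                                          VIR_DOMAIN_AFFECT_LIVE) == 0)
        printf("usb-net interface attached; USB address assigned by libvirt\n");

    if (dom)
        virDomainFree(dom);
    if (conn)
        virConnectClose(conn);
    return 0;
}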


Re: [PATCH] vmx: Do not require DVS Port ID

2024-05-13 Thread Ján Tomko

On a Monday in 2024, Martin Kletzander wrote:

It can be safely removed from the VMX; VMware will still boot the
machine, and once another ethernet device is added it is updated in the
VMX to zero.  So do not require it and default to zero too, since this
part of the XML is handled as best effort, as is even mentioned in our
documentation.

Signed-off-by: Martin Kletzander 
---
src/vmx/vmx.c|  2 +-
tests/vmx2xmldata/ethernet-vds-no-portid.vmx | 10 
tests/vmx2xmldata/ethernet-vds-no-portid.xml | 24 
3 files changed, 35 insertions(+), 1 deletion(-)
create mode 100644 tests/vmx2xmldata/ethernet-vds-no-portid.vmx
create mode 100644 tests/vmx2xmldata/ethernet-vds-no-portid.xml



Reviewed-by: Ján Tomko 

Jano




Re: [PATCH 0/2] hw/core: revert deprecation of 'parameter=1' for SMP topology

2024-05-13 Thread Ján Tomko

On a Monday in 2024, Daniel P. Berrangé wrote:

Since QEMU 9.0, users have been complaining that deprecation messages are
shown for every VM libvirt starts. This is due to the newly introduced
deprecation of 'parameter=1' for -smp. This series proposes reverting that;
see the 1st patch for further commentary.

Daniel P. Berrangé (2):
 hw/core: allow parameter=1 for CPU topology on any machine
 tests: add testing of parameter=1 for SMP topology

docs/about/deprecated.rst   | 14 ---
hw/core/machine-smp.c   | 82 -
tests/unit/test-smp-parse.c | 16 ++--
3 files changed, 38 insertions(+), 74 deletions(-)



Reviewed-by: Ján Tomko 

Jano




[PATCH] vmx: Do not require DVS Port ID

2024-05-13 Thread Martin Kletzander
It can be safely removed from the VMX; VMware will still boot the
machine, and once another ethernet device is added it is updated in the
VMX to zero.  So do not require it and default to zero too, since this
part of the XML is handled as best effort, as is even mentioned in our
documentation.

Signed-off-by: Martin Kletzander 
---
 src/vmx/vmx.c|  2 +-
 tests/vmx2xmldata/ethernet-vds-no-portid.vmx | 10 
 tests/vmx2xmldata/ethernet-vds-no-portid.xml | 24 
 3 files changed, 35 insertions(+), 1 deletion(-)
 create mode 100644 tests/vmx2xmldata/ethernet-vds-no-portid.vmx
 create mode 100644 tests/vmx2xmldata/ethernet-vds-no-portid.xml

diff --git a/src/vmx/vmx.c b/src/vmx/vmx.c
index f4182bc5184c..d90b41d2ad14 100644
--- a/src/vmx/vmx.c
+++ b/src/vmx/vmx.c
@@ -2901,7 +2901,7 @@ virVMXParseEthernet(virConf *conf, int controller, virDomainNetDef **def)
 portId_name,
 &(*def)->data.vds.port_id,
 0,
-false) < 0 ||
+true) < 0 ||
 virVMXGetConfigLong(conf,
 connectionId_name,
 &(*def)->data.vds.connection_id,
diff --git a/tests/vmx2xmldata/ethernet-vds-no-portid.vmx b/tests/vmx2xmldata/ethernet-vds-no-portid.vmx
new file mode 100644
index ..7761accb3abc
--- /dev/null
+++ b/tests/vmx2xmldata/ethernet-vds-no-portid.vmx
@@ -0,0 +1,10 @@
+config.version = "8"
+virtualHW.version = "4"
+ethernet0.present = "true"
+ethernet0.virtualDev = "e1000e"
+ethernet0.addressType = "vpx"
+ethernet0.generatedAddress = "00:50:56:87:65:43"
+ethernet0.dvs.switchId = "50 34 26 b2 94 e9 3b 16-1d 68 87 bf ff 4a 54 40"
+ethernet0.dvs.portgroupId = "dvportgroup-1285"
+ethernet0.dvs.connectionId = "408217997"
+displayName = "test"
diff --git a/tests/vmx2xmldata/ethernet-vds-no-portid.xml b/tests/vmx2xmldata/ethernet-vds-no-portid.xml
new file mode 100644
index ..60fd9c99feb9
--- /dev/null
+++ b/tests/vmx2xmldata/ethernet-vds-no-portid.xml
@@ -0,0 +1,24 @@
+
+  test
+  ----
+  32768
+  32768
+  1
+  
+hvm
+  
+  
+  destroy
+  restart
+  destroy
+  
+
+  
+  
+  
+
+
+  
+
+  
+
-- 
2.45.0


Re: [PATCH] qemu_hotplug: Properly assign USB address to hotplugged usb-net device

2024-05-13 Thread Michal Prívozník
On 5/11/24 21:29, Rayhan Faizel wrote:
> Signed-off-by: Rayhan Faizel 
> ---
>  src/qemu/qemu_hotplug.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)

Can you expand the commit message a bit? And if there's a gitlab issue
associated, please put a link too.

Michal


Re: [PATCH] NEWS: Announce virtio sound model support

2024-05-13 Thread Michal Prívozník
On 5/10/24 13:16, Rayhan Faizel wrote:
> Signed-off-by: Rayhan Faizel 
> ---
>  NEWS.rst | 6 ++
>  1 file changed, 6 insertions(+)
> 

Reviewed-by: Michal Privoznik 

Michal


[PATCH 2/2] tests: add testing of parameter=1 for SMP topology

2024-05-13 Thread Daniel P. Berrangé
Validate that it is possible to pass 'parameter=1' for any SMP topology
parameter, since unsupported parameters are implicitly considered to
always have a value of 1.

Signed-off-by: Daniel P. Berrangé 
---
 tests/unit/test-smp-parse.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/tests/unit/test-smp-parse.c b/tests/unit/test-smp-parse.c
index 56165e6644..56ce5128f1 100644
--- a/tests/unit/test-smp-parse.c
+++ b/tests/unit/test-smp-parse.c
@@ -330,6 +330,14 @@ static const struct SMPTestData data_generic_valid[] = {
 .config = SMP_CONFIG_GENERIC(T, 8, T, 2, T, 4, T, 2, T, 16),
 .expect_prefer_sockets = CPU_TOPOLOGY_GENERIC(8, 2, 4, 2, 16),
 .expect_prefer_cores   = CPU_TOPOLOGY_GENERIC(8, 2, 4, 2, 16),
+}, {
+/*
+ * Unsupported parameters are always allowed to be set to '1'
+ * config: -smp 8,books=1,drawers=1,sockets=2,modules=1,dies=1,cores=4,threads=2,maxcpus=8
+ * expect: cpus=8,sockets=2,cores=2,threads=2,maxcpus=8 */
+.config = SMP_CONFIG_WITH_FULL_TOPO(8, 1, 1, 2, 1, 1, 2, 2, 8),
+.expect_prefer_sockets = CPU_TOPOLOGY_GENERIC(8, 2, 2, 2, 8),
+.expect_prefer_cores   = CPU_TOPOLOGY_GENERIC(8, 2, 2, 2, 8),
 },
 };
 
-- 
2.43.0


[PATCH 1/2] hw/core: allow parameter=1 for SMP topology on any machine

2024-05-13 Thread Daniel P. Berrangé
This effectively reverts

  commit 54c4ea8f3ae614054079395842128a856a73dbf9
  Author: Zhao Liu 
  Date:   Sat Mar 9 00:01:37 2024 +0800

hw/core/machine-smp: Deprecate unsupported "parameter=1" SMP configurations

but is not done as a 'git revert' since the part of the changes to the
file hw/core/machine-smp.c which adds 'has_XXX' checks remains desirable.
Furthermore, we have to tweak the subsequently added unit test to
account for the differing warning message.

The rationale for the original deprecation was:

  "Currently, it was allowed for users to specify the unsupported
   topology parameter as "1". For example, x86 PC machine doesn't
   support drawer/book/cluster topology levels, but user could specify
   "-smp drawers=1,books=1,clusters=1".

   This is meaningless and confusing, so that the support for this kind
   of configurations is marked deprecated since 9.0."

There are varying POVs on the topic of 'unsupported' topology levels.

It is common to say that on a system without hyperthreading, there
is always 1 thread. Likewise, when new CPUs introduced the concept of
multiple 'dies', it was reasonable to say that all historical CPUs
before that implicitly had 1 'die'. Likewise for the more recently
introduced 'modules' and 'clusters' parameters. From this POV, it is
valid to set 'parameter=1' on the -smp command line for any machine;
only a value > 1 is strictly an error condition.

It doesn't cause any functional difficulty for QEMU, because internally
the QEMU code is itself assuming that all "unsupported" parameters
implicitly have a value of '1'.

At the libvirt level, we've allowed applications to set 'parameter=1'
when configuring a guest, and pass that through to QEMU.

Deprecating this creates extra difficulty because there's no info
exposed from QEMU about which machine types "support" which parameters.
Thus, libvirt can't know whether it is valid to pass 'parameter=1' for
a given machine type, or whether it will trigger deprecation messages.

Since there's no apparent functional benefit to deleting this deprecated
behaviour from QEMU, and it creates problems for consumers of QEMU,
remove this deprecation.

Signed-off-by: Daniel P. Berrangé 
---
 docs/about/deprecated.rst   | 14 ---
 hw/core/machine-smp.c   | 82 -
 tests/unit/test-smp-parse.c |  8 ++--
 3 files changed, 30 insertions(+), 74 deletions(-)

diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index e22acb17f2..5b551b12a6 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -47,20 +47,6 @@ as short-form boolean values, and passed to plugins as 
``arg_name=on``.
 However, short-form booleans are deprecated and full explicit ``arg_name=on``
 form is preferred.
 
-``-smp`` (Unsupported "parameter=1" SMP configurations) (since 9.0)
-'''
-
-Specified CPU topology parameters must be supported by the machine.
-
-In the SMP configuration, users should provide the CPU topology parameters that
-are supported by the target machine.
-
-However, historically it was allowed for users to specify the unsupported
-topology parameter as "1", which is meaningless. So support for this kind of
-configurations (e.g. -smp drawers=1,books=1,clusters=1 for x86 PC machine) is
-marked deprecated since 9.0, users have to ensure that all the topology members
-described with -smp are supported by the target machine.
-
 User-mode emulator command line arguments
 -
 
diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index 2b93fa99c9..eb43caca9b 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -119,75 +119,45 @@ void machine_parse_smp_config(MachineState *ms,
 
 /*
  * If not supported by the machine, a topology parameter must be
- * omitted.
+ * not be set to a value greater than 1.
  */
-if (!mc->smp_props.modules_supported && config->has_modules) {
-if (config->modules > 1) {
-error_setg(errp, "modules not supported by this "
-   "machine's CPU topology");
-return;
-} else {
-/* Here modules only equals 1 since we've checked zero case. */
-warn_report("Deprecated CPU topology (considered invalid): "
-"Unsupported modules parameter mustn't be "
-"specified as 1");
-}
+if (!mc->smp_props.modules_supported &&
+config->has_modules && config->modules > 1) {
+error_setg(errp,
+   "modules > 1 not supported by this machine's CPU topology");
+return;
 }
 modules = modules > 0 ? modules : 1;
 
-if (!mc->smp_props.clusters_supported && config->has_clusters) {
-if (config->clusters > 1) {
-error_setg(errp, "clusters not supported by this "
-   "machine's CPU topology");
-return;
-  
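
(The hunk above is cut off in the archive.) The behavioural rule the patch is
arguing for (topology levels the machine does not support simply count as 1,
and only a value greater than 1 for such a level is an error) can be
illustrated with a stand-alone sketch. This is a simplification for the
discussion, not the actual QEMU parser:

/* Simplified sketch of the -smp rule under discussion: unsupported or
 * omitted levels contribute a factor of 1; only a value > 1 for an
 * unsupported level is rejected.  Not the real machine_parse_smp_config(). */
#include <stdbool.h>
#include <stdio.h>

static unsigned level(unsigned value, bool supported, const char *name,
                      bool *err)
{
    if (!supported && value > 1) {
        fprintf(stderr, "%s > 1 not supported by this machine's CPU topology\n",
                name);
        *err = true;
    }
    return value ? value : 1;   /* 0 means "omitted": default to 1 */
}

int main(void)
{
    bool err = false;
    /* e.g. -smp 8,drawers=1,books=1,sockets=2,dies=1,clusters=1,cores=2,threads=2
     * on a machine that only supports sockets/cores/threads */
    unsigned maxcpus = level(1, false, "drawers",  &err) *
                       level(1, false, "books",    &err) *
                       level(2, true,  "sockets",  &err) *
                       level(1, false, "dies",     &err) *
                       level(1, false, "clusters", &err) *
                       level(2, true,  "cores",    &err) *
                       level(2, true,  "threads",  &err);

    printf("maxcpus = %u, error = %s\n", maxcpus, err ? "yes" : "no"); /* 8, no */
    return err ? 1 : 0;
}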

[PATCH 0/2] hw/core: revert deprecation of 'parameter=1' for SMP topology

2024-05-13 Thread Daniel P. Berrangé
Since QEMU 9.0, users have been complaining that deprecation messages are
shown for every VM libvirt starts. This is due to the newly introduced
deprecation of 'parameter=1' for -smp. This series proposes reverting that;
see the 1st patch for further commentary.

Daniel P. Berrangé (2):
  hw/core: allow parameter=1 for CPU topology on any machine
  tests: add testing of parameter=1 for SMP topology

 docs/about/deprecated.rst   | 14 ---
 hw/core/machine-smp.c   | 82 -
 tests/unit/test-smp-parse.c | 16 ++--
 3 files changed, 38 insertions(+), 74 deletions(-)

-- 
2.43.0


Re: [PATCH] docs: mention migrate-setmaxdowntime's impact on snapshot-create

2024-05-13 Thread Peter Krempa
On Mon, May 13, 2024 at 01:36:27 +0530, Abhiram Tilak wrote:
> The migrate-setmaxdowntime command sets the max allowed downtime during
> live migration, but since `snapshot-create` performs a QMP migration, it also
> affects the downtime during internal/external snapshot creation.
> 
> Resolves: https://gitlab.com/libvirt/libvirt/-/issues/593
> Signed-off-by: Abhiram Tilak 
> ---
> This issue was a minor change, yet it has remained without any updates for
> some time.
> 
>  docs/manpages/virsh.rst | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst
> index fa038e4547..18bdae53c9 100644
> --- a/docs/manpages/virsh.rst
> +++ b/docs/manpages/virsh.rst
> @@ -3717,7 +3717,8 @@ migrate-setmaxdowntime
> 
>  Set maximum tolerable downtime for a domain which is being live-migrated to
>  another host.  The *downtime* is a number of milliseconds the guest is allowed
> -to be down at the end of live migration.
> +to be down at the end of live migration. This is also used to set the max
> +downtime for creating a snapshot using ``snapshot-create``.

While this is true, the usefulness of setting a max downtime for snapshots
is questionable at best. Snapshots are mostly created in a paused state,
where this doesn't make sense at all since the VM is "down" the whole
time.

During "live" snapshots (which are not used very much as the memory
image will be unnecessarily large) this might apply, but generally local
storage is usually fast enough for this to not be a problem.

I'm not sure what this bit of documentation would provide to the user.


Re: [PATCH] "docs: formatsnapshot: add docs for snapshotDeleteInProgress"

2024-05-13 Thread Peter Krempa
On Sun, May 12, 2024 at 01:30:45 +0530, Abhiram Tilak wrote:
Adds documentation for the <snapshotDeleteInProgress> element to
the libvirt snapshot format XML reference. The <snapshotDeleteInProgress>
element, introduced in commit 565bcb5d79, ensures the consistency of qcow2
> images during snapshot deletion operations by marking disks in snapshot
> metadata as invalid until deletion is successfully completed.
> 
> The commit was merged but the related documentation was missing.

Well to be fair this is kind of an internal flag, but yes the
documentation should be there as it's in the public XML.


> 
> Resolves: https://gitlab.com/libvirt/libvirt/-/issues/609
> Signed-off-by: Abhiram Tilak 
> ---
>  docs/formatsnapshot.rst | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/docs/formatsnapshot.rst b/docs/formatsnapshot.rst
> index 570a988442..e3d0d47e00 100644
> --- a/docs/formatsnapshot.rst
> +++ b/docs/formatsnapshot.rst
> @@ -170,6 +170,13 @@ The top-level ``domainsnapshot`` element may contain the following elements:
>   sub-element can be used with same semantics as the identically named
>   subelement of the domain definition disk's driver.
> 
> +  ``snapshotDeleteInProgress``
> +
> + If the snapshot is marked for deletion, the disk is invalidated. The
> + sub-element ``snapshotDeleteInProgress`` marks the disk in the snapshot
> + as invalid until the snapshot delete is completed successfully.
> + :since:`Since 9.0`

Preferably also add a note that users should not add or remove this
element as it may break the snapshot or disk images.


Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

2024-05-13 Thread Jinpu Wang via Devel
Hi Peter, Hi Chuan,

On Thu, May 9, 2024 at 4:14 PM Peter Xu  wrote:
>
> On Thu, May 09, 2024 at 04:58:34PM +0800, Zheng Chuan via wrote:
> > That's good news, to see a socket abstraction for RDMA!
> > When I developed the series above, the biggest pain was that RDMA migration
> > had no QIOChannel abstraction and I needed to take a 'fake channel'
> > for it, which was awkward in the code implementation.
> > So, as far as I know, we can do this by:
> > i. first, evaluating whether rsocket is good enough to satisfy our
> > fundamental QIOChannel abstraction;
> > ii. if it works right, continuing to see whether it gives us the
> > opportunity to hide the details of the RDMA protocol inside rsocket by
> > removing most of the code in rdma.c and also some hacks in the main
> > migration process;
> > iii. implementing the advanced features like multi-fd and multi-uri for
> > RDMA migration.
> >
> > Since I am not familiar with rsocket, I need some time to look at it and
> > do some quick verification of RDMA migration based on rsocket.
> > But, yes, I am willing to be involved in this refactoring work and to see
> > if we can make this migration feature better :)
>
> Based on what we have now, it looks like we'd better halt the deprecation
> process a bit, so I think we shouldn't need to rush it at least in 9.1
> then, and we'll need to see how the refactoring goes.
>
> It'll be perfect if rsocket works, otherwise supporting multifd with little
> overhead / exported APIs would also be a good thing in general with
> whatever approach.  And obviously this is all based on the fact that we can get
> resources from companies to support this feature first.
>
> Note that so far nobody has yet compared RDMA vs. NIC performance, so if
> any of us can provide some test results, please do so.  Many people are
> saying RDMA is better, but I haven't yet seen any numbers comparing it with
> modern TCP networks.  I don't want to have old impressions floating around
> even if things might have changed.  When we have consolidated results, we
> should share them out and also reflect that in QEMU's migration docs when an
> RDMA documentation page is ready.
I also did some tests with a Mellanox ConnectX-6 100G RoCE NIC; the
results are mixed: with fewer than 3 streams native Ethernet is faster,
and with more than 3 streams rsocket performs better.

root@x4-right:~# iperf -c 1.1.1.16 -P 1

Client connecting to 1.1.1.16, TCP port 5001
TCP window size:  325 KByte (default)

[  3] local 1.1.1.15 port 44214 connected with 1.1.1.16 port 5001
[ ID] Interval   Transfer Bandwidth
[  3] 0.0000-10.0000 sec  52.9 GBytes  45.4 Gbits/sec
root@x4-right:~# iperf -c 1.1.1.16 -P 2
[  3] local 1.1.1.15 port 33118 connected with 1.1.1.16 port 5001
[  4] local 1.1.1.15 port 33130 connected with 1.1.1.16 port 5001

Client connecting to 1.1.1.16, TCP port 5001
TCP window size: 4.00 MByte (default)

[ ID] Interval   Transfer Bandwidth
[  3] 0.0000-10.0001 sec  45.0 GBytes  38.7 Gbits/sec
[  4] 0.0000-10.0000 sec  43.9 GBytes  37.7 Gbits/sec
[SUM] 0.0000-10.0000 sec  88.9 GBytes  76.4 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) =
0.172/0.189/0.205/0.172 ms (tot/err) = 2/0
root@x4-right:~# iperf -c 1.1.1.16 -P 4

Client connecting to 1.1.1.16, TCP port 5001
TCP window size:  325 KByte (default)

[  5] local 1.1.1.15 port 50748 connected with 1.1.1.16 port 5001
[  4] local 1.1.1.15 port 50734 connected with 1.1.1.16 port 5001
[  6] local 1.1.1.15 port 50764 connected with 1.1.1.16 port 5001
[  3] local 1.1.1.15 port 50730 connected with 1.1.1.16 port 5001
[ ID] Interval   Transfer Bandwidth
[  6] 0.0000-10.0000 sec  24.7 GBytes  21.2 Gbits/sec
[  3] 0.0000-10.0004 sec  23.6 GBytes  20.3 Gbits/sec
[  4] 0.0000-10.0000 sec  27.8 GBytes  23.9 Gbits/sec
[  5] 0.0000-10.0000 sec  28.0 GBytes  24.0 Gbits/sec
[SUM] 0.0000-10.0000 sec   104 GBytes  89.4 Gbits/sec
[ CT] final connect times (min/avg/max/stdev) =
0.104/0.156/0.204/0.124 ms (tot/err) = 4/0
root@x4-right:~# iperf -c 1.1.1.16 -P 8
[  4] local 1.1.1.15 port 55588 connected with 1.1.1.16 port 5001
[  5] local 1.1.1.15 port 55600 connected with 1.1.1.16 port 5001

Client connecting to 1.1.1.16, TCP port 5001
TCP window size:  325 KByte (default)

[ 10] local 1.1.1.15 port 55628 connected with 1.1.1.16 port 5001
[ 15] local 1.1.1.15 port 55648 connected with 1.1.1.16 port 5001
[  7] local 1.1.1.15 port 55620 connected with 1.1.1.16 port 5001
[  3] local 1.1.1.15 port 55584 connected with 1.1.1.16 port 5001
[ 14

[PATCH] docs: mention migrate-setmaxdowntime's impact on snapshot-create

2024-05-13 Thread Abhiram Tilak
The migrate-setmaxdowntime command sets the max allowed downtime during
live migration, but since `snapshot-create` performs a QMP migration, it also
affects the downtime during internal/external snapshot creation.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/593
Signed-off-by: Abhiram Tilak 
---
This issue was a minor change, yet it has remained without any updates for
some time.

 docs/manpages/virsh.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst
index fa038e4547..18bdae53c9 100644
--- a/docs/manpages/virsh.rst
+++ b/docs/manpages/virsh.rst
@@ -3717,7 +3717,8 @@ migrate-setmaxdowntime

 Set maximum tolerable downtime for a domain which is being live-migrated to
 another host.  The *downtime* is a number of milliseconds the guest is allowed
-to be down at the end of live migration.
+to be down at the end of live migration. This is also used to set the max
+downtime for creating a snapshot using ``snapshot-create``.


 migrate-setspeed
--
2.44.0
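
For completeness, the same tunable is reachable from the libvirt C API, so an
application that cares about snapshot downtime can set it just before creating
the snapshot. A sketch with placeholder names and a placeholder 100 ms value:

/* Sketch: cap guest downtime, then create a snapshot.
 * Domain name, snapshot XML and the 100 ms figure are placeholders. */
#include <libvirt/libvirt.h>
#include <stdio.h>

int main(void)
{
    virConnectPtr conn = virConnectOpen("qemu:///system");
    virDomainPtr dom = conn ? virDomainLookupByName(conn, "demo-vm") : NULL;
    const char *snapxml = "<domainsnapshot><name>snap1</name></domainsnapshot>";
    virDomainSnapshotPtr snap;

    if (!dom)
        return 1;

    /* Same knob as 'virsh migrate-setmaxdowntime': milliseconds of allowed
     * downtime at the end of the migration phase (and, per the documentation
     * change above, of snapshot creation as well). */
    if (virDomainMigrateSetMaxDowntime(dom, 100, 0) < 0)
        fprintf(stderr, "failed to set max downtime\n");

    snap = virDomainSnapshotCreateXML(dom, snapxml, 0);
    if (!snap)
        fprintf(stderr, "snapshot creation failed\n");
    else
        virDomainSnapshotFree(snap);

    virDomainFree(dom);
    virConnectClose(conn);
    return 0;
}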


[PATCH] "docs: formatsnapshot: add docs for snapshotDeleteInProgress"

2024-05-13 Thread Abhiram Tilak
Adds documentation for the <snapshotDeleteInProgress> element to
the libvirt snapshot format XML reference. The <snapshotDeleteInProgress>
element, introduced in commit 565bcb5d79, ensures the consistency of qcow2
images during snapshot deletion operations by marking disks in snapshot
metadata as invalid until deletion is successfully completed.

The commit was merged but the related documentation was missing.

Resolves: https://gitlab.com/libvirt/libvirt/-/issues/609
Signed-off-by: Abhiram Tilak 
---
 docs/formatsnapshot.rst | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/docs/formatsnapshot.rst b/docs/formatsnapshot.rst
index 570a988442..e3d0d47e00 100644
--- a/docs/formatsnapshot.rst
+++ b/docs/formatsnapshot.rst
@@ -170,6 +170,13 @@ The top-level ``domainsnapshot`` element may contain the following elements:
  sub-element can be used with same semantics as the identically named
  subelement of the domain definition disk's driver.

+  ``snapshotDeleteInProgress``
+
+ If the snapshot is marked for deletion, the disk is invalidated. The
+ sub-element ``snapshotDeleteInProgress`` marks the disk in the snapshot
+ as invalid until the snapshot delete is completed successfully.
+ :since:`Since 9.0`
+
 ``creationTime``

A readonly representation of the time this snapshot was created. The time is
--
2.44.0
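
To see where the element shows up, it is enough to dump the snapshot XML and
look for it; a sketch using the C API, where the domain and snapshot names are
placeholders. The element would only be present under an affected <disk> while
an interrupted deletion has left that image marked invalid:

/* Sketch: dump a snapshot's XML and check for <snapshotDeleteInProgress/>.
 * Domain and snapshot names are placeholders. */
#include <libvirt/libvirt.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    virConnectPtr conn = virConnectOpen("qemu:///system");
    virDomainPtr dom = conn ? virDomainLookupByName(conn, "demo-vm") : NULL;
    virDomainSnapshotPtr snap =
        dom ? virDomainSnapshotLookupByName(dom, "snap1", 0) : NULL;
    char *xml = snap ? virDomainSnapshotGetXMLDesc(snap, 0) : NULL;

    if (xml) {
        if (strstr(xml, "<snapshotDeleteInProgress"))
            printf("a disk in this snapshot is marked invalid "
                   "(interrupted deletion)\n");
        free(xml);
    }
    if (snap)
        virDomainSnapshotFree(snap);
    if (dom)
        virDomainFree(dom);
    if (conn)
        virConnectClose(conn);
    return 0;
}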


Re: [PATCH] network: add modify-or-add feature to net-update

2024-05-13 Thread atp exp
Ping, it's been a while since I put this out.

On Sat, 16 Mar 2024 at 01:51, Abhiram Tilak  wrote:

> The current way of updating a network configuration uses `virsh
> net-update` to add, delete or modify entries. But with such a mechanism
> one should know whether an entry with the current info already exists.
> Adding a modify-or-add option automatically performs either a modify or
> an add depending on the current state.
>
> Fixes: https://gitlab.com/libvirt/libvirt/-/issues/363
> Signed-off-by: Abhiram Tilak 
> ---
>  docs/manpages/virsh.rst   |   5 +-
>  include/libvirt/libvirt-network.h |   2 +
>  src/conf/network_conf.c   | 148 --
>  tools/virsh-network.c |   4 +-
>  4 files changed, 126 insertions(+), 33 deletions(-)
>
> diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst
> index 115b802c45..dc91ba895c 100644
> --- a/docs/manpages/virsh.rst
> +++ b/docs/manpages/virsh.rst
> @@ -5908,7 +5908,10 @@ changes optionally taking effect immediately,
> without needing to
>  destroy and re-start the network.
>
>  *command* is one of "add-first", "add-last", "add" (a synonym for
> -add-last), "delete", or "modify".
> +add-last), "delete", "modify", "modify-or-add" (modify + add-last),
> +"modify-or-add-first". The 'modify-or-add' commands perform modify or
> +add operation depending on the given state, and can be useful for
> +scripting.
>
>  *section* is one of "bridge", "domain", "ip", "ip-dhcp-host",
>  "ip-dhcp-range", "forward", "forward-interface", "forward-pf",
> diff --git a/include/libvirt/libvirt-network.h
> b/include/libvirt/libvirt-network.h
> index 58591be7ac..a6e132f407 100644
> --- a/include/libvirt/libvirt-network.h
> +++ b/include/libvirt/libvirt-network.h
> @@ -181,6 +181,8 @@ typedef enum {
>  VIR_NETWORK_UPDATE_COMMAND_DELETE= 2, /* delete an existing
> element (Since: 0.10.2) */
>  VIR_NETWORK_UPDATE_COMMAND_ADD_LAST  = 3, /* add an element at end of
> list (Since: 0.10.2) */
>  VIR_NETWORK_UPDATE_COMMAND_ADD_FIRST = 4, /* add an element at start
> of list (Since: 0.10.2) */
> +VIR_NETWORK_UPDATE_COMMAND_MODIFY_OR_ADD_LAST = 5, /* if exists
> modify or add an element at end of list (Since: 0.10.2) */
> +VIR_NETWORK_UPDATE_COMMAND_MODIFY_OR_ADD_FIRST = 6, /* if exists
> modify or add an element at start of list (Since: 0.10.2) */
>  # ifdef VIR_ENUM_SENTINELS
>  VIR_NETWORK_UPDATE_COMMAND_LAST /* (Since: 0.10.2) */
>  # endif
> diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c
> index cc92ed0b03..2835395385 100644
> --- a/src/conf/network_conf.c
> +++ b/src/conf/network_conf.c
> @@ -2721,6 +2721,9 @@ virNetworkDefUpdateIPDHCPHost(virNetworkDef *def,
>  virNetworkDHCPHostDef host = { 0 };
>  bool partialOkay = (command == VIR_NETWORK_UPDATE_COMMAND_DELETE);
>
> +/* added for modify-or-add feature */
> +bool modified = false;
> +
>  if (virNetworkDefUpdateCheckElementName(def, ctxt->node, "host") < 0)
>  goto cleanup;
>
> @@ -2826,7 +2829,34 @@ virNetworkDefUpdateIPDHCPHost(virNetworkDef *def,
>  virNetworkDHCPHostDefClear(&ipdef->hosts[i]);
>  VIR_DELETE_ELEMENT(ipdef->hosts, i, ipdef->nhosts);
>
> -} else {
> +} else if ((command == VIR_NETWORK_UPDATE_COMMAND_MODIFY_OR_ADD_LAST)
> ||
> +   (command ==
> VIR_NETWORK_UPDATE_COMMAND_MODIFY_OR_ADD_FIRST)) {
> +
> +/* find entries with matching name/address/ip */
> +for (i = 0; i < ipdef->nhosts; i++) {
> +if ((host.mac && ipdef->hosts[i].mac &&
> + !virMacAddrCompare(host.mac, ipdef->hosts[i].mac)) ||
> +(host.name &&
> + STREQ_NULLABLE(host.name, ipdef->hosts[i].name)) ||
> +(VIR_SOCKET_ADDR_VALID(&host.ip) &&
> + virSocketAddrEqual(&host.ip, &ipdef->hosts[i].ip))) {
> +
> +modified = true;
> +break;
> +}
> +}
> +
> +/* if element is found then modify, or else add to beginning/end
> of list */
> +if (modified) {
> +virNetworkDHCPHostDefClear(&ipdef->hosts[i]);
> +ipdef->hosts[i] = host;
> +memset(&host, 0, sizeof(host));
> +} else if (VIR_INSERT_ELEMENT(ipdef->hosts,
> +command ==
> VIR_NETWORK_UPDATE_COMMAND_MODIFY_OR_ADD_FIRST
> +? 0 : ipdef->nhosts,
> +ipdef->nhosts, host) < 0)
> +goto cleanup;
> +} else  {
>  virNetworkDefUpdateUnknownCommand(command);
>  goto cleanup;
>  }
> @@ -2885,7 +2915,9 @@ virNetworkDefUpdateIPDHCPRange(virNetworkDef *def,
>  }
>
>  if ((command == VIR_NETWORK_UPDATE_COMMAND_ADD_FIRST) ||
> -(command == VIR_NETWORK_UPDATE_COMMAND_ADD_LAST)) {
> +(command == VIR_NETWORK_UPDATE_COMMAND_ADD_LAST) ||
> +(command == VIR_NETWORK_UPDATE_COMMAND_MODIFY_OR_ADD_LAST) ||
> +(command == VIR_NETWORK_UPDATE_C
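
(The diff is truncated at this point in the archive.) As a usage sketch of what
the new command enables from the C API (the network name, MAC, host name and IP
are placeholders, and the MODIFY_OR_ADD enum value is the one introduced by this
patch):

/* Sketch: add-or-modify a DHCP host entry in a single call using the
 * command proposed by this patch.  Network and host details are placeholders. */
#include <libvirt/libvirt.h>
#include <stdio.h>

int main(void)
{
    virConnectPtr conn = virConnectOpen("qemu:///system");
    virNetworkPtr net = conn ? virNetworkLookupByName(conn, "default") : NULL;
    const char *hostxml =
        "<host mac='52:54:00:12:34:56' name='guest1' ip='192.168.122.10'/>";

    if (net &&
        virNetworkUpdate(net,
                         VIR_NETWORK_UPDATE_COMMAND_MODIFY_OR_ADD_LAST,
                         VIR_NETWORK_SECTION_IP_DHCP_HOST,
                         -1,                 /* parentIndex: first matching <ip> */
                         hostxml,
                         VIR_NETWORK_UPDATE_AFFECT_LIVE |
                         VIR_NETWORK_UPDATE_AFFECT_CONFIG) == 0)
        printf("DHCP host entry added or updated\n");

    if (net)
        virNetworkFree(net);
    if (conn)
        virConnectClose(conn);
    return 0;
}

The virsh counterpart is the new "modify-or-add" command value documented in
the virsh.rst hunk above.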