Re: Reducing vdpa migration downtime because of memory pin / maps

2023-04-09 Thread longpeng2--- via




On 2023/4/10 10:14, Jason Wang wrote:

On Wed, Apr 5, 2023 at 7:38 PM Eugenio Perez Martin  wrote:


Hi!

As mentioned in the last upstream virtio-networking meeting, one of
the factors that adds more downtime to migration is the handling of
the guest memory (pin, map, etc). At this moment this handling is
bound to the virtio life cycle (DRIVER_OK, RESET). In that sense, the
destination device waits until all the guest memory / state is
migrated to start pinning all the memory.

The proposal is to bind it to the char device life cycle (open vs
close), so all the guest memory can be pinned for all the guest / qemu
lifecycle.

This has two main problems:
* At this moment the reset semantics forces the vdpa device to unmap
all the memory. So this change needs a vhost vdpa feature flag.


Is this true? I didn't find any code that unmaps the memory in
vhost_vdpa_set_status().



It could depend on the vendor driver, for example, the vdpasim would do 
something like that.


vhost_vdpa_set_status->vdpa_reset->vdpasim_reset->vdpasim_do_reset->vhost_iotlb_reset
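
Roughly speaking (a simplified sketch, not the exact kernel source; only the
function names in the chain above come from the real code):

/*
 * Illustrative sketch: a status write of 0 (reset) travels down the chain
 * quoted above and ends up in the vdpa_sim driver's reset hook, which
 * clears the simulator's IOTLB and therefore drops every DMA mapping.
 */
static void vdpasim_do_reset(struct vdpasim *vdpasim)
{
    /* ... stop and reinitialize the virtqueues ... */
    vhost_iotlb_reset(vdpasim->iommu);  /* all guest mappings are gone */
    /* ... clear negotiated features and device status ... */
}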


Thanks


* This may increase the initialization time. Maybe we can delay it if
qemu is not the destination of a LM. Anyway I think this should be
done as an optimization on top.

Any ideas or comments in this regard?

Thanks!



.




Re: [PATCH v1 1/3] virtio-pci: submit msi route changes in batch

2023-02-28 Thread longpeng2--- via




On 2023/2/28 19:20, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:



On 2023/2/28 18:17, Michael S. Tsirkin wrote:

On Tue, Feb 28, 2023 at 05:39:35PM +0800, Longpeng(Mike) wrote:

From: Longpeng 

kvm_irqchip_commit_routes() is a time-intensive operation: each invocation
needs to scan and update all irqfds that are already assigned, so more
vectors mean more time to process them.


I think the real reason is it's the write side of RCU.



Yes, so we can reduce the number of invocations of it this way.

I'll send other optimizations in the next step, including irqbypass, 
kvm_irqfd, etc.




Iterating the irqfds list is also time-consuming: each commit walks all
existing irqfds, so the overall time complexity is O(n^2) without
this patch.
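
To illustrate (pseudo-code only; these helpers are placeholders rather than
the real QEMU API):

/*
 * Without batching: every vector triggers a commit, and each commit walks
 * all irqfds assigned so far, so the total work is 1 + 2 + ... + n, i.e.
 * O(n^2) for n vectors.
 */
for (v = 0; v < n_vectors; v++) {
    add_msi_route(v);      /* cheap bookkeeping */
    commit_routes();       /* walks the ~v irqfds already assigned */
}

/*
 * With this patch: one commit for the whole device, a single O(n) walk.
 */
begin_route_changes();
for (v = 0; v < n_vectors; v++) {
    add_msi_route(v);
}
commit_routes();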



For virtio-pci, we
can just submit once when enabling vectors of a virtio-pci device.

This can reduce the downtime when migrating a VM with vhost-vdpa 
devices.


Signed-off-by: Longpeng 
---
  hw/virtio/virtio-pci.c | 24 +---
  1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 247325c193..22e76e3902 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -49,6 +49,19 @@
   * configuration space */
  #define VIRTIO_PCI_CONFIG_SIZE(dev) 
VIRTIO_PCI_CONFIG_OFF(msix_enabled(dev))

+/* Protected by the BQL */
+static KVMRouteChange virtio_pci_route_change;
+
+static inline void virtio_pci_begin_route_changes(void)
+{
+    virtio_pci_route_change = 
kvm_irqchip_begin_route_changes(kvm_state);

+}
+
+static inline void virtio_pci_commit_route_changes(void)
+{
+    kvm_irqchip_commit_route_changes(&virtio_pci_route_change);
+}
+
  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
 VirtIOPCIProxy *dev);
  static void virtio_pci_reset(DeviceState *qdev);
@@ -790,12 +803,11 @@ static int 
kvm_virtio_pci_vq_vector_use(VirtIOPCIProxy *proxy,

  int ret;
  if (irqfd->users == 0) {
-    KVMRouteChange c = kvm_irqchip_begin_route_changes(kvm_state);
-    ret = kvm_irqchip_add_msi_route(&c, vector, &proxy->pci_dev);
+    ret = kvm_irqchip_add_msi_route(&virtio_pci_route_change, vector,
+    &proxy->pci_dev);
  if (ret < 0) {
  return ret;
  }
-    kvm_irqchip_commit_route_changes(&c);
  irqfd->virq = ret;
  }
  irqfd->users++;
@@ -903,12 +915,18 @@ static int 
kvm_virtio_pci_vector_vq_use(VirtIOPCIProxy *proxy, int nvqs)

  int ret = 0;
  VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+    virtio_pci_begin_route_changes();
+
  for (queue_no = 0; queue_no < nvqs; queue_no++) {
  if (!virtio_queue_get_num(vdev, queue_no)) {
+    virtio_pci_commit_route_changes();
  return -1;
  }
  ret = kvm_virtio_pci_vector_use_one(proxy, queue_no);
  }
+
+    virtio_pci_commit_route_changes();
+
  return ret;
  }
--
2.23.0


.




Re: [PATCH v1 1/3] virtio-pci: submit msi route changes in batch

2023-02-28 Thread longpeng2--- via




On 2023/2/28 18:18, Michael S. Tsirkin wrote:

On Tue, Feb 28, 2023 at 05:39:35PM +0800, Longpeng(Mike) wrote:

From: Longpeng 

kvm_irqchip_commit_routes() is a time-intensive operation: each invocation
needs to scan and update all irqfds that are already assigned, so more
vectors mean more time to process them. For virtio-pci, we
can just submit once when enabling vectors of a virtio-pci device.

This can reduce the downtime when migrating a VM with vhost-vdpa devices.


can in what sense?  does it or does it not? by how much?



I've replied in patch 3.


Signed-off-by: Longpeng 
---
  hw/virtio/virtio-pci.c | 24 +---
  1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 247325c193..22e76e3902 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -49,6 +49,19 @@
   * configuration space */
  #define VIRTIO_PCI_CONFIG_SIZE(dev) 
VIRTIO_PCI_CONFIG_OFF(msix_enabled(dev))
  
+/* Protected by the BQL */

+static KVMRouteChange virtio_pci_route_change;
+
+static inline void virtio_pci_begin_route_changes(void)
+{
+virtio_pci_route_change = kvm_irqchip_begin_route_changes(kvm_state);
+}
+
+static inline void virtio_pci_commit_route_changes(void)
+{
+kvm_irqchip_commit_route_changes(&virtio_pci_route_change);
+}
+
  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
 VirtIOPCIProxy *dev);
  static void virtio_pci_reset(DeviceState *qdev);
@@ -790,12 +803,11 @@ static int kvm_virtio_pci_vq_vector_use(VirtIOPCIProxy 
*proxy,
  int ret;
  
  if (irqfd->users == 0) {

-KVMRouteChange c = kvm_irqchip_begin_route_changes(kvm_state);
-ret = kvm_irqchip_add_msi_route(&c, vector, &proxy->pci_dev);
+ret = kvm_irqchip_add_msi_route(&virtio_pci_route_change, vector,
+&proxy->pci_dev);
  if (ret < 0) {
  return ret;
  }
-kvm_irqchip_commit_route_changes(&c);
  irqfd->virq = ret;
  }
  irqfd->users++;
@@ -903,12 +915,18 @@ static int kvm_virtio_pci_vector_vq_use(VirtIOPCIProxy 
*proxy, int nvqs)
  int ret = 0;
  VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
  
+virtio_pci_begin_route_changes();

+
  for (queue_no = 0; queue_no < nvqs; queue_no++) {
  if (!virtio_queue_get_num(vdev, queue_no)) {
+virtio_pci_commit_route_changes();
  return -1;
  }
  ret = kvm_virtio_pci_vector_use_one(proxy, queue_no);
  }
+
+virtio_pci_commit_route_changes();
+
  return ret;
  }
  
--

2.23.0


.




Re: [PATCH v1 1/3] virtio-pci: submit msi route changes in batch

2023-02-28 Thread longpeng2--- via




On 2023/2/28 18:17, Michael S. Tsirkin wrote:

On Tue, Feb 28, 2023 at 05:39:35PM +0800, Longpeng(Mike) wrote:

From: Longpeng 

kvm_irqchip_commit_routes() is a time-intensive operation: each invocation
needs to scan and update all irqfds that are already assigned, so more
vectors mean more time to process them.


I think the real reason is it's the write side of RCU.



Yes, so we can reduce the number of invocations of it this way.

I'll send other optimizations in the next step, including irqbypass, 
kvm_irqfd, etc.



For virtio-pci, we
can just submit once when enabling vectors of a virtio-pci device.

This can reduce the downtime when migrating a VM with vhost-vdpa devices.

Signed-off-by: Longpeng 
---
  hw/virtio/virtio-pci.c | 24 +---
  1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 247325c193..22e76e3902 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -49,6 +49,19 @@
   * configuration space */
  #define VIRTIO_PCI_CONFIG_SIZE(dev) 
VIRTIO_PCI_CONFIG_OFF(msix_enabled(dev))
  
+/* Protected by the BQL */

+static KVMRouteChange virtio_pci_route_change;
+
+static inline void virtio_pci_begin_route_changes(void)
+{
+virtio_pci_route_change = kvm_irqchip_begin_route_changes(kvm_state);
+}
+
+static inline void virtio_pci_commit_route_changes(void)
+{
+kvm_irqchip_commit_route_changes(&virtio_pci_route_change);
+}
+
  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,
 VirtIOPCIProxy *dev);
  static void virtio_pci_reset(DeviceState *qdev);
@@ -790,12 +803,11 @@ static int kvm_virtio_pci_vq_vector_use(VirtIOPCIProxy 
*proxy,
  int ret;
  
  if (irqfd->users == 0) {

-KVMRouteChange c = kvm_irqchip_begin_route_changes(kvm_state);
-ret = kvm_irqchip_add_msi_route(&c, vector, &proxy->pci_dev);
+ret = kvm_irqchip_add_msi_route(&virtio_pci_route_change, vector,
+&proxy->pci_dev);
  if (ret < 0) {
  return ret;
  }
-kvm_irqchip_commit_route_changes(&c);
  irqfd->virq = ret;
  }
  irqfd->users++;
@@ -903,12 +915,18 @@ static int kvm_virtio_pci_vector_vq_use(VirtIOPCIProxy 
*proxy, int nvqs)
  int ret = 0;
  VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
  
+virtio_pci_begin_route_changes();

+
  for (queue_no = 0; queue_no < nvqs; queue_no++) {
  if (!virtio_queue_get_num(vdev, queue_no)) {
+virtio_pci_commit_route_changes();
  return -1;
  }
  ret = kvm_virtio_pci_vector_use_one(proxy, queue_no);
  }
+
+virtio_pci_commit_route_changes();
+
  return ret;
  }
  
--

2.23.0


.




Re: [PATCH v1 3/3] virtio-pci: defer to commit kvm irq routing when enable msi/msix

2023-02-28 Thread longpeng2--- via




On 2023/2/28 18:40, Michael S. Tsirkin wrote:

On Tue, Feb 28, 2023 at 05:39:37PM +0800, Longpeng(Mike) wrote:

From: Longpeng 

All unmasked vectors will be setup in msix_set_vector_notifiers(), which
is a time-consuming operation because each vector need to be submit to
KVM once. It's even worse if the VM has several devices and each devices
has dozens of vectors.

We can defer and commit the vectors in batch, just like the commit dc580d51f7
("vfio: defer to commit kvm irq routing when enable msi/msix"),

The can reduce 80% of the time spending on virtio_pci_set_guest_notifiers().


cover letter also refers to 80%. what about patch 1 then? does it
contribute some of this gain?



Sorry, that's a clerical error; patch 3 actually reduces it by 50%.

In my test, patch 1 reduces it by 37% (160ms->101ms) and patch 3 by another
50% of the original (101ms->21ms, i.e. 80/160).



Signed-off-by: Longpeng 


In the age of language models there's no longer any excuse to post
agrammatical commit messages. Please just give your text to one of these
to correct.



Oh, I really envy you because I cannot use it in my workspace. Thank
you for your correction.



I prompted: "please correct grammar in the following text"
and got back:

All unmasked vectors will be set up in
msix_set_vector_notifiers(), which is a time-consuming operation because
each vector needs to be submitted to KVM once. It's even worse if the VM
has several devices and each device has dozens of vectors.

We can defer and commit the vectors in batches, just like the
commit dc580d51f7 ("vfio: defer to commit kvm irq routing when enabling
msi/msix").

This can reduce the time spent on virtio_pci_set_guest_notifiers() by 
80%.





---
  hw/virtio/virtio-pci.c | 113 -
  include/hw/virtio/virtio.h |   1 +
  2 files changed, 99 insertions(+), 15 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 5fd02b7cb8..13f9c31009 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -51,15 +51,22 @@
  
  /* Protected by the BQL */

  static KVMRouteChange virtio_pci_route_change;
+static unsigned virtio_pci_route_change_depth;
  
  static inline void virtio_pci_begin_route_changes(void)

  {
-virtio_pci_route_change = kvm_irqchip_begin_route_changes(kvm_state);
+if (!virtio_pci_route_change_depth) {
+virtio_pci_route_change = kvm_irqchip_begin_route_changes(kvm_state);
+}
+virtio_pci_route_change_depth++;
  }
  
  static inline void virtio_pci_commit_route_changes(void)

  {
-kvm_irqchip_commit_route_changes(&virtio_pci_route_change);
+virtio_pci_route_change_depth--;
+if (!virtio_pci_route_change_depth) {
+kvm_irqchip_commit_route_changes(&virtio_pci_route_change);
+}
  }
  
  static void virtio_pci_bus_new(VirtioBusState *bus, size_t bus_size,

@@ -976,6 +983,88 @@ static void 
kvm_virtio_pci_vector_config_release(VirtIOPCIProxy *proxy)
  kvm_virtio_pci_vector_release_one(proxy, VIRTIO_CONFIG_IRQ_IDX);
  }
  
+static int virtio_pci_vector_do_unmask(VirtIOPCIProxy *proxy,

+   unsigned int queue_no,
+   unsigned int vector,
+   EventNotifier *n)
+{
+VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
+int ret = 0;
+
+/*
+ * If guest supports masking, irqfd is already setup, unmask it.
+ * Otherwise, set it up now.
+ */
+if (vdev->use_guest_notifier_mask && k->guest_notifier_mask) {
+k->guest_notifier_mask(vdev, queue_no, false);
+/* Test after unmasking to avoid losing events. */
+if (k->guest_notifier_pending &&
+k->guest_notifier_pending(vdev, queue_no)) {
+event_notifier_set(n);
+}
+} else {
+ret = kvm_virtio_pci_irqfd_use(proxy, n, vector);
+}
+
+return ret;
+}
+
+static void virtio_pci_prepare_kvm_msi_virq_batch(VirtIOPCIProxy *proxy)
+{
+VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+
+assert(!vdev->defer_kvm_irq_routing);
+vdev->defer_kvm_irq_routing = true;
+virtio_pci_begin_route_changes();


move this out of here please - otherwise it's not clear each begin
is matched by commit.  in fact just open code this function.


+}
+
+static void virtio_pci_commit_kvm_msi_virq_batch(VirtIOPCIProxy *proxy)
+{
+VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
+PCIDevice *dev = &proxy->pci_dev;
+VirtQueue *vq;
+EventNotifier *n;
+int vector, index;
+int ret;
+
+assert(vdev->defer_kvm_irq_routing);
+virtio_pci_commit_route_changes();
+vdev->defer_kvm_irq_routing = false;
+
+if (!msix_enabled(dev)) {
+return;
+}
+
+/* Unmask all unmasked vectors */
+for (vector = 0; vector < dev->msix_entries_nr; vector++) {
+if (msix_is_masked(dev, vector)) {
+continue;
+ 

Re: [PATCH v3 2/3] vhost: configure all host notifiers in a single MR transaction

2022-12-28 Thread longpeng2--- via




On 2022/12/28 21:12, Philippe Mathieu-Daudé wrote:

On 27/12/22 18:54, Michael S. Tsirkin wrote:

On Tue, Dec 27, 2022 at 05:43:57PM +0100, Philippe Mathieu-Daudé wrote:

On 27/12/22 08:20, Longpeng(Mike) wrote:

From: Longpeng 

This allows the vhost device to batch the setup of all its host notifiers.
This significantly reduces the device start-up time, e.g. the time spent
on enabling notifiers drops from 376ms to 9.1ms for a VM with 64 vCPUs
and 3 vhost-vDPA generic devices (vdpa_sim_blk, 64 vqs per device)

Signed-off-by: Longpeng 
---
   hw/virtio/vhost.c | 24 
   1 file changed, 24 insertions(+)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 5994559da8..064d4abe5c 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1562,16 +1562,25 @@ int vhost_dev_enable_notifiers(struct 
vhost_dev *hdev, VirtIODevice *vdev)

   return r;
   }
+    /*
+ * Batch all the host notifiers in a single transaction to avoid
+ * quadratic time complexity in address_space_update_ioeventfds().
+ */
+    memory_region_transaction_begin();
+
   for (i = 0; i < hdev->nvqs; ++i) {
   r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 
hdev->vq_index + i,

    true);
   if (r < 0) {
   error_report("vhost VQ %d notifier binding failed: 
%d", i, -r);

+    memory_region_transaction_commit();
   vhost_dev_disable_notifiers(hdev, vdev);


Could we 'break' here, ...


   return r;
   }
   }
+    memory_region_transaction_commit();
+
   return 0;


... and return 'r' here?



won't this commit twice? seems ugly ...


Twice? I meant keep the begin/commit() around the for() to have
only *one* commit() call instead of 2:



There's a transaction in vhost_dev_disable_notifiers() too. We must
commit the outer transaction before invoking it; see the comment in that
function.



-- >8 --
+    memory_region_transaction_begin();
+
  for (i = 0; i < hdev->nvqs; ++i) {
  r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 
hdev->vq_index + i,

   true);
  if (r < 0) {
  error_report("vhost VQ %d notifier binding failed: %d", i, 
-r);

  vhost_dev_disable_notifiers(hdev, vdev);
-    return r;
+    break;
  }
  }

+    memory_region_transaction_commit();
+
-    return 0;
+    return r;
  }
---

Anyhow,
Reviewed-by: Philippe Mathieu-Daudé 
.




Re: [PATCH v11 5/5] docs: Add generic vhost-vdpa device documentation

2022-12-19 Thread longpeng2--- via




On 2022/12/20 14:15, Michael S. Tsirkin wrote:

On Tue, Dec 20, 2022 at 08:02:51AM +0800, Longpeng (Mike, Cloud Infrastructure 
Service Product Dept.) wrote:



On 2022/12/20 5:37, Michael S. Tsirkin wrote:

On Fri, Dec 16, 2022 at 11:33:49AM +0800, Jason Wang wrote:

On Thu, Dec 15, 2022 at 9:50 PM Longpeng(Mike)  wrote:


From: Longpeng 

Signed-off-by: Longpeng 
---
   .../devices/vhost-vdpa-generic-device.rst | 68 +++
   1 file changed, 68 insertions(+)
   create mode 100644 docs/system/devices/vhost-vdpa-generic-device.rst

diff --git a/docs/system/devices/vhost-vdpa-generic-device.rst 
b/docs/system/devices/vhost-vdpa-generic-device.rst
new file mode 100644
index 00..24c825ef1a
--- /dev/null
+++ b/docs/system/devices/vhost-vdpa-generic-device.rst
@@ -0,0 +1,68 @@
+
+=
+vhost-vDPA generic device
+=
+
+This document explains the usage of the vhost-vDPA generic device.
+
+
+Description
+---
+
+vDPA(virtio data path acceleration) device is a device that uses a datapath
+which complies with the virtio specifications with vendor specific control
+path.
+
+QEMU provides two types of vhost-vDPA devices to enable the vDPA device, one
+is type sensitive which means QEMU needs to know the actual device type
+(e.g. net, blk, scsi) and another is called "vhost-vDPA generic device" which
+is type insensitive.
+
+The vhost-vDPA generic device builds on the vhost-vdpa subsystem and virtio
+subsystem. It is quite small, but it can support any type of virtio device.
+
+
+Requirements
+
+Linux 5.18+
+iproute2/vdpa 5.12.0+
+
+
+Examples
+
+
+1. Prepare the vhost-vDPA backends, here is an example using vdpa_sim_blk
+   device:
+
+::
+  host# modprobe vhost_vdpa
+  host# modprobe vdpa_sim_blk


Nit: it's probably better to add driver binding steps here.


+  host# vdpa dev add mgmtdev vdpasim_blk name blk0
+  (...you can see the vhost-vDPA device under /dev directory now...)


And then the vhost char dev name could be fetched via

ls /sys/bus/vdpa/device/blk0/vhost-vdpa*

With the above changes.

Acked-by: Jason Wang 

Thanks



Sounds minor enough, I'll queue, pls fix with a patch on top.


OK, thanks. I'll send when I'm free.


Is this going to be in the next couple of weeks? then ok.


I think I can send it out this weekend.


We do want this addressed by let's say rc3.


+  host# ls -l /dev/vhost-vdpa-*
+  crw--- 1 root root 236, 0 Nov  2 00:49 /dev/vhost-vdpa-0
+
+Note:
+It needs some vendor-specific steps to provision the vDPA device if you're
+using real HW devices, such as loading the vendor-specific vDPA driver and
+binding the device to the driver.
+
+
+2. Start the virtual machine:
+
+Start QEMU with virtio-mmio bus:
+
+::
+  host# qemu-system  \
+  -M microvm -m 512 -smp 2 -kernel ... -initrd ...   \
+  -device vhost-vdpa-device,vhostdev=/dev/vhost-vdpa-0   \
+  ...
+
+
+Start QEMU with virtio-pci bus:
+
+::
+  host# qemu-system  \
+  -M pc -m 512 -smp 2\
+  -device vhost-vdpa-device-pci,vhostdev=/dev/vhost-vdpa-0   \
+  ...
--
2.23.0



.


.




Re: [PATCH v11 5/5] docs: Add generic vhost-vdpa device documentation

2022-12-19 Thread longpeng2--- via




On 2022/12/20 5:37, Michael S. Tsirkin wrote:

On Fri, Dec 16, 2022 at 11:33:49AM +0800, Jason Wang wrote:

On Thu, Dec 15, 2022 at 9:50 PM Longpeng(Mike)  wrote:


From: Longpeng 

Signed-off-by: Longpeng 
---
  .../devices/vhost-vdpa-generic-device.rst | 68 +++
  1 file changed, 68 insertions(+)
  create mode 100644 docs/system/devices/vhost-vdpa-generic-device.rst

diff --git a/docs/system/devices/vhost-vdpa-generic-device.rst 
b/docs/system/devices/vhost-vdpa-generic-device.rst
new file mode 100644
index 00..24c825ef1a
--- /dev/null
+++ b/docs/system/devices/vhost-vdpa-generic-device.rst
@@ -0,0 +1,68 @@
+
+=
+vhost-vDPA generic device
+=
+
+This document explains the usage of the vhost-vDPA generic device.
+
+
+Description
+---
+
+vDPA(virtio data path acceleration) device is a device that uses a datapath
+which complies with the virtio specifications with vendor specific control
+path.
+
+QEMU provides two types of vhost-vDPA devices to enable the vDPA device, one
+is type sensitive which means QEMU needs to know the actual device type
+(e.g. net, blk, scsi) and another is called "vhost-vDPA generic device" which
+is type insensitive.
+
+The vhost-vDPA generic device builds on the vhost-vdpa subsystem and virtio
+subsystem. It is quite small, but it can support any type of virtio device.
+
+
+Requirements
+
+Linux 5.18+
+iproute2/vdpa 5.12.0+
+
+
+Examples
+
+
+1. Prepare the vhost-vDPA backends, here is an example using vdpa_sim_blk
+   device:
+
+::
+  host# modprobe vhost_vdpa
+  host# modprobe vdpa_sim_blk


Nit: it's probably better to add driver binding steps here.


+  host# vdpa dev add mgmtdev vdpasim_blk name blk0
+  (...you can see the vhost-vDPA device under /dev directory now...)


And then the vhost char dev name could be fetched via

ls /sys/bus/vdpa/device/blk0/vhost-vdpa*

With the above changes.

Acked-by: Jason Wang 

Thanks



Sounds minor enough, I'll queue, pls fix with a patch on top.


OK, thanks. I'll send when I'm free.


+  host# ls -l /dev/vhost-vdpa-*
+  crw--- 1 root root 236, 0 Nov  2 00:49 /dev/vhost-vdpa-0
+
+Note:
+It needs some vendor-specific steps to provision the vDPA device if you're
+using real HW devices, such as loading the vendor-specific vDPA driver and
+binding the device to the driver.
+
+
+2. Start the virtual machine:
+
+Start QEMU with virtio-mmio bus:
+
+::
+  host# qemu-system  \
+  -M microvm -m 512 -smp 2 -kernel ... -initrd ...   \
+  -device vhost-vdpa-device,vhostdev=/dev/vhost-vdpa-0   \
+  ...
+
+
+Start QEMU with virtio-pci bus:
+
+::
+  host# qemu-system  \
+  -M pc -m 512 -smp 2\
+  -device vhost-vdpa-device-pci,vhostdev=/dev/vhost-vdpa-0   \
+  ...
--
2.23.0



.




Re: [PATCH v10 5/5] docs: Add generic vhost-vdpa device documentation

2022-12-13 Thread longpeng2--- via




On 2022/12/13 22:35, Stefano Garzarella wrote:

On Mon, Dec 05, 2022 at 04:49:43PM +0800, Longpeng(Mike) wrote:

From: Longpeng 

Signed-off-by: Longpeng 
---
.../devices/vhost-vdpa-generic-device.rst | 66 +++
1 file changed, 66 insertions(+)
create mode 100644 docs/system/devices/vhost-vdpa-generic-device.rst

diff --git a/docs/system/devices/vhost-vdpa-generic-device.rst 
b/docs/system/devices/vhost-vdpa-generic-device.rst

new file mode 100644
index 00..7d13359ea1
--- /dev/null
+++ b/docs/system/devices/vhost-vdpa-generic-device.rst
@@ -0,0 +1,66 @@
+
+=
+vhost-vDPA generic device
+=
+
+This document explains the usage of the vhost-vDPA generic device.
+
+Description
+---
+
+vDPA(virtio data path acceleration) device is a device that uses a 
datapath
+which complies with the virtio specifications with vendor specific 
control

+path.
+
+QEMU provides two types of vhost-vDPA devices to enable the vDPA 
device, one

+is type sensitive which means QEMU needs to know the actual device type
+(e.g. net, blk, scsi) and another is called "vhost-vDPA generic 
device" which

+is type insensitive.
+
+The vhost-vDPA generic device builds on the vhost-vdpa subsystem and 
virtio
+subsystem. It is quite small, but it can support any type of virtio 
device.

+


Maybe we can also add a minimum requirements section (e.g. we need at
least Linux v5.18 for VHOST_VDPA_GET_VQS_COUNT)



Ok.


+Examples
+
+
+1. Please make sure the modules listed below are installed:
+    vhost.ko
+    vhost_iotlb.ko
+    vdpa.ko
+    vhost_vdpa.ko
+
+
+2. Prepare the vhost-vDPA backends, here is an example using 
vdpa_sim_blk

+   device:
+
+::


Should we add also a `modprobe vhost-vdpa` step?


This is already described in Step 1.


+  host# modprobe vdpa_sim_blk
+  host# vdpa dev add mgmtdev vdpasim_blk name blk0
+  (...you can see the vhost-vDPA device under /dev directory now...)
+  host# ls -l /dev/vhost-vdpa-*
+  crw--- 1 root root 236, 0 Nov  2 00:49 /dev/vhost-vdpa-0
+
+Note:
+It needs some vendor-specific steps to provision the vDPA device if 
you're
+using real HW devices, such as installing the vendor-specific vDPA 
driver

+and binding the device to the driver.
+
+
+3. Start the virtual machine:
+
+Start QEMU with virtio-mmio bus:
+
+::
+  host# qemu-system  \
+  -M microvm -m 512 -smp 2 -kernel ... -initrd ...   \
+  -device vhost-vdpa-device,vhostdev=/dev/vhost-vdpa-0   \
+  ...
+
+
+Start QEMU with virtio-pci bus:
+
+::
+  host# qemu-system  \
+  -M pc -m 512 -smp 2    \
+  -device vhost-vdpa-device-pci,vhostdev=/dev/vhost-vdpa-0   \
+  ...
--
2.23.0



.




Re: [PATCH v2 1/2] vhost: configure all host notifiers in a single MR transaction

2022-12-06 Thread longpeng2--- via




On 2022/12/6 18:45, Philippe Mathieu-Daudé wrote:
On 6/12/22 11:28, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:



On 2022/12/6 17:07, Philippe Mathieu-Daudé wrote:

On 6/12/22 09:18, Longpeng(Mike) via wrote:

From: Longpeng 

This allows the vhost device to batch the setup of all its host notifiers.
This significantly reduces the device start-up time, e.g. the time spent
on enabling notifiers drops from 376ms to 9.1ms for a VM with 64 vCPUs
and 3 vhost-vDPA generic devices[1] (64 vqs per device)

[1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg921541.html

Signed-off-by: Longpeng 
---
  hw/virtio/vhost.c | 40 ++--
  1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 7fb008bc9e..16f8391d86 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1507,7 +1507,7 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
  int vhost_dev_enable_notifiers(struct vhost_dev *hdev, 
VirtIODevice *vdev)

  {
  BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
-    int i, r, e;
+    int i, n, r, e;
  /* We will pass the notifiers to the kernel, make sure that QEMU
   * doesn't interfere.
@@ -1518,6 +1518,12 @@ int vhost_dev_enable_notifiers(struct 
vhost_dev *hdev, VirtIODevice *vdev)

  goto fail;
  }
+    /*
+ * Batch all the host notifiers in a single transaction to avoid
+ * quadratic time complexity in address_space_update_ioeventfds().
+ */
+    memory_region_transaction_begin();
+
  for (i = 0; i < hdev->nvqs; ++i) {
  r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 
hdev->vq_index + i,

   true);
@@ -1527,8 +1533,12 @@ int vhost_dev_enable_notifiers(struct 
vhost_dev *hdev, VirtIODevice *vdev)

  }
  }
+    memory_region_transaction_commit();
+
  return 0;
  fail_vq:
+    /* save i for a second iteration after transaction is 
committed. */

+    n = i;
  while (--i >= 0) {
  e = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 
hdev->vq_index + i,

   false);
@@ -1536,8 +1546,18 @@ fail_vq:
  error_report("vhost VQ %d notifier cleanup error: %d", 
i, -r);

  }
  assert (e >= 0);
-    virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), 
hdev->vq_index + i);

  }
+
+    /*
+ * The transaction expects the ioeventfds to be open when it
+ * commits. Do it now, before the cleanup loop.
+ */
+    memory_region_transaction_commit();
+
+    while (--n >= 0) {
+    virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), 
hdev->vq_index + n);

+    }
+
  virtio_device_release_ioeventfd(vdev);
  fail:
  return r;


Similarly to patch #2, removing both goto statement in this function 
(as a preliminary patch) will 1/ simplify the code 2/ simplify 
reviewing your changes, resulting in something like:


int vhost_dev_enable_notifiers(struct vhost_dev *hdev,
    VirtIODevice *vdev)
{
 BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
 int i, r, e;

 /* We will pass the notifiers to the kernel, make sure that QEMU
  * doesn't interfere.
  */
 r = virtio_device_grab_ioeventfd(vdev);
 if (r < 0) {
 error_report("binding does not support host notifiers");
 return r;
 }

 memory_region_transaction_begin();

 for (i = 0; i < hdev->nvqs; ++i) {
 r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus),
  hdev->vq_index + i,
  true);
 if (r < 0) {
 error_report("vhost VQ %d notifier binding failed: %d",
  i, -r);
 while (--i >= 0) {
 e = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus),
  hdev->vq_index + i,
  false);
 if (e < 0) {
 error_report(
    "vhost VQ %d notifier cleanup error: 
%d",

  i, -r);
 }
 assert (e >= 0);
 virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus),
  hdev->vq_index + i);
 }
 virtio_device_release_ioeventfd(vdev);
 break;
 }
 }

 memory_region_transaction_commit();

 return r;
}

What do you think?

Maybe we can use vhost_dev_disable_notifiers to further simplify the 
error path ?


Good idea, but having the BusState resolved on each call seems a waste.
Eventually factor it out and pass as argument ...


And we must commit before invoking virtio_bus_cleanup_host_notifier.


... but with that info on top, finally your original patch is simpler.


Yes, I'll try in next version, thanks.


.




Re: [PATCH v2 1/2] vhost: configure all host notifiers in a single MR transaction

2022-12-06 Thread longpeng2--- via




On 2022/12/6 17:07, Philippe Mathieu-Daudé wrote:

On 6/12/22 09:18, Longpeng(Mike) via wrote:

From: Longpeng 

This allows the vhost device to batch the setup of all its host notifiers.

This significantly reduces the device start-up time, e.g. the time spent
on enabling notifiers drops from 376ms to 9.1ms for a VM with 64 vCPUs
and 3 vhost-vDPA generic devices[1] (64 vqs per device)

[1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg921541.html

Signed-off-by: Longpeng 
---
  hw/virtio/vhost.c | 40 ++--
  1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 7fb008bc9e..16f8391d86 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1507,7 +1507,7 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
  int vhost_dev_enable_notifiers(struct vhost_dev *hdev, VirtIODevice 
*vdev)

  {
  BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
-    int i, r, e;
+    int i, n, r, e;
  /* We will pass the notifiers to the kernel, make sure that QEMU
   * doesn't interfere.
@@ -1518,6 +1518,12 @@ int vhost_dev_enable_notifiers(struct vhost_dev 
*hdev, VirtIODevice *vdev)

  goto fail;
  }
+    /*
+ * Batch all the host notifiers in a single transaction to avoid
+ * quadratic time complexity in address_space_update_ioeventfds().
+ */
+    memory_region_transaction_begin();
+
  for (i = 0; i < hdev->nvqs; ++i) {
  r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 
hdev->vq_index + i,

   true);
@@ -1527,8 +1533,12 @@ int vhost_dev_enable_notifiers(struct vhost_dev 
*hdev, VirtIODevice *vdev)

  }
  }
+    memory_region_transaction_commit();
+
  return 0;
  fail_vq:
+    /* save i for a second iteration after transaction is committed. */
+    n = i;
  while (--i >= 0) {
  e = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), 
hdev->vq_index + i,

   false);
@@ -1536,8 +1546,18 @@ fail_vq:
  error_report("vhost VQ %d notifier cleanup error: %d", 
i, -r);

  }
  assert (e >= 0);
-    virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), 
hdev->vq_index + i);

  }
+
+    /*
+ * The transaction expects the ioeventfds to be open when it
+ * commits. Do it now, before the cleanup loop.
+ */
+    memory_region_transaction_commit();
+
+    while (--n >= 0) {
+    virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), 
hdev->vq_index + n);

+    }
+
  virtio_device_release_ioeventfd(vdev);
  fail:
  return r;


Similarly to patch #2, removing both goto statement in this function (as 
a preliminary patch) will 1/ simplify the code 2/ simplify reviewing 
your changes, resulting in something like:


int vhost_dev_enable_notifiers(struct vhost_dev *hdev,
    VirtIODevice *vdev)
{
     BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
     int i, r, e;

     /* We will pass the notifiers to the kernel, make sure that QEMU
  * doesn't interfere.
  */
     r = virtio_device_grab_ioeventfd(vdev);
     if (r < 0) {
     error_report("binding does not support host notifiers");
     return r;
     }

     memory_region_transaction_begin();

     for (i = 0; i < hdev->nvqs; ++i) {
     r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus),
  hdev->vq_index + i,
  true);
     if (r < 0) {
     error_report("vhost VQ %d notifier binding failed: %d",
  i, -r);
     while (--i >= 0) {
     e = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus),
  hdev->vq_index + i,
  false);
     if (e < 0) {
     error_report(
    "vhost VQ %d notifier cleanup error: %d",
  i, -r);
     }
     assert (e >= 0);
     virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus),
  hdev->vq_index + i);
     }
     virtio_device_release_ioeventfd(vdev);
     break;
     }
     }

     memory_region_transaction_commit();

     return r;
}

What do you think?

Maybe we can use vhost_dev_disable_notifiers to further simplify the 
error path ?


And we must commit before invoking virtio_bus_cleanup_host_notifier.


Regards,

Phil.
.




Re: [PATCH v2 2/2] vdpa: commit all host notifier MRs in a single MR transaction

2022-12-06 Thread longpeng2--- via




On 2022/12/6 16:30, Philippe Mathieu-Daudé wrote:

On 6/12/22 09:18, Longpeng(Mike) via wrote:

From: Longpeng 

This allows the vhost-vdpa device to batch the setup of all its MRs of
host notifiers.

This significantly reduces the device start-up time, e.g. the time spent
on setting up the host notifier MRs drops from 423ms to 32ms for a VM with
64 vCPUs and 3 vhost-vDPA generic devices[1] (64 vqs per device).

[1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg921541.html

Signed-off-by: Longpeng 
---
  hw/virtio/vhost-vdpa.c | 18 ++
  1 file changed, 18 insertions(+)



@@ -562,16 +571,25 @@ static void 
vhost_vdpa_host_notifiers_init(struct vhost_dev *dev)

  return;
  }
+    /*
+ * Pack all the changes to the memory regions in a single
+ * transaction to avoid a few updating of the address space
+ * topology.
+ */
+    memory_region_transaction_begin();
+
  for (i = dev->vq_index; i < dev->vq_index + dev->nvqs; i++) {
  if (vhost_vdpa_host_notifier_init(dev, i)) {
  goto err;


Could we simplify by replacing this goto statement with:

    vhost_vdpa_host_notifiers_uninit(dev, i - dev->vq_index);
    break;

?


Good suggestion! I'll do in v3, thanks.


  }
  }
+    memory_region_transaction_commit();
  return;
  err:
  vhost_vdpa_host_notifiers_uninit(dev, i - dev->vq_index);
+    memory_region_transaction_commit();
  return;
  }


.




Re: [PATCH] vhost: configure all host notifiers in a single MR transaction

2022-11-28 Thread longpeng2--- via




On 2022/11/21 12:01, Jason Wang wrote:

On Fri, Nov 18, 2022 at 10:49 PM Longpeng(Mike)  wrote:


From: Longpeng 

This allows the vhost device to batch the setup of all its host notifiers.
This significantly reduces the device start-up time, e.g. the vhost-vDPA
generic device [1] start time drops from 376ms to 9.1ms for a VM with
64 vCPUs and 3 vDPA devices (64 vqs per device).


Great, I think we need to do this for host_notifiers_mr as well. This
helps for the case when the notification area could be mapped directly
to guests.

Batching and committing once for the host_notifiers_mrs reduces the cost
from 423ms to 32ms, tested on vdpasim_blk (3 devices, 64 vqs per device)
with doorbell passthrough support.

I'll append a patch in the next version.
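
For reference, the rough shape of that follow-up (a minimal sketch only; the
patch actually posted later, "vdpa: commit all host notifier MRs in a single
MR transaction", also in this archive, is the real version):

static void vhost_vdpa_host_notifiers_init(struct vhost_dev *dev)
{
    int i;

    /* Batch all host notifier MRs in one transaction so the address
     * space topology is recomputed only once. */
    memory_region_transaction_begin();

    for (i = dev->vq_index; i < dev->vq_index + dev->nvqs; i++) {
        if (vhost_vdpa_host_notifier_init(dev, i)) {
            vhost_vdpa_host_notifiers_uninit(dev, i - dev->vq_index);
            break;
        }
    }

    memory_region_transaction_commit();
}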



[1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg921541.html

Signed-off-by: Longpeng 
---
  hw/virtio/vhost.c | 39 ++-
  1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index d1c4c20b8c..bf82d9b176 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1507,6 +1507,7 @@ void vhost_dev_cleanup(struct vhost_dev *hdev)
  int vhost_dev_enable_notifiers(struct vhost_dev *hdev, VirtIODevice *vdev)
  {
  BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
+int vq_init_count = 0;
  int i, r, e;

  /* We will pass the notifiers to the kernel, make sure that QEMU
@@ -1518,6 +1519,12 @@ int vhost_dev_enable_notifiers(struct vhost_dev *hdev, 
VirtIODevice *vdev)
  goto fail;
  }

+/*
+ * Batch all the host notifiers in a single transaction to avoid
+ * quadratic time complexity in address_space_update_ioeventfds().
+ */
+memory_region_transaction_begin();
+
  for (i = 0; i < hdev->nvqs; ++i) {
  r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + i,
   true);
@@ -1525,19 +1532,33 @@ int vhost_dev_enable_notifiers(struct vhost_dev *hdev, 
VirtIODevice *vdev)
  error_report("vhost VQ %d notifier binding failed: %d", i, -r);
  goto fail_vq;
  }
+
+vq_init_count++;
  }


Nit, the name needs some tweak, it's actually the number of the host
notifiers that is initialized. And we can count it out of the loop.


Ok, I will refer to virtio_device_start_ioeventfd_impl().



+memory_region_transaction_commit();
+
  return 0;
  fail_vq:
-while (--i >= 0) {
+for (i = 0; i < vq_init_count; i++) {


It looks to me there's no need for this change.

Others look good.

Thanks


  e = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + i,
   false);
  if (e < 0) {
  error_report("vhost VQ %d notifier cleanup error: %d", i, -r);
  }
  assert (e >= 0);
+}
+
+/*
+ * The transaction expects the ioeventfds to be open when it
+ * commits. Do it now, before the cleanup loop.
+ */
+memory_region_transaction_commit();
+
+for (i = 0; i < vq_init_count; i++) {
  virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + 
i);
  }
+
  virtio_device_release_ioeventfd(vdev);
  fail:
  return r;
@@ -1553,6 +1574,12 @@ void vhost_dev_disable_notifiers(struct vhost_dev *hdev, 
VirtIODevice *vdev)
  BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
  int i, r;

+/*
+ * Batch all the host notifiers in a single transaction to avoid
+ * quadratic time complexity in address_space_update_ioeventfds().
+ */
+memory_region_transaction_begin();
+
  for (i = 0; i < hdev->nvqs; ++i) {
  r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + i,
   false);
@@ -1560,8 +1587,18 @@ void vhost_dev_disable_notifiers(struct vhost_dev *hdev, 
VirtIODevice *vdev)
  error_report("vhost VQ %d notifier cleanup failed: %d", i, -r);
  }
  assert (r >= 0);
+}
+
+/*
+ * The transaction expects the ioeventfds to be open when it
+ * commits. Do it now, before the cleanup loop.
+ */
+memory_region_transaction_commit();
+
+for (i = 0; i < hdev->nvqs; ++i) {
  virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), hdev->vq_index + 
i);
  }
+
  virtio_device_release_ioeventfd(vdev);
  }

--
2.23.0



.




Re: [PATCH v8 5/5] docs: Add generic vhost-vdpa device documentation

2022-11-11 Thread longpeng2--- via




On 2022/11/8 16:42, Stefano Garzarella wrote:
On Tue, Nov 08, 2022 at 11:30:53AM +0800, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:



On 2022/11/8 10:42, Jason Wang wrote:
On Tue, Nov 8, 2022 at 8:42 AM Longpeng(Mike)  
wrote:


From: Longpeng 

Signed-off-by: Longpeng 
---
 docs/system/devices/vhost-vdpa-device.rst | 43 +++
 1 file changed, 43 insertions(+)
 create mode 100644 docs/system/devices/vhost-vdpa-device.rst

diff --git a/docs/system/devices/vhost-vdpa-device.rst 
b/docs/system/devices/vhost-vdpa-device.rst

new file mode 100644
index 00..b758c4fce6
--- /dev/null
+++ b/docs/system/devices/vhost-vdpa-device.rst


If the doc is for a general vhost-vDPA device, we'd better have a 
better name?




How about general-vhost-vdpa-device.rst?



I would leave vhost-vdpa as the prefix, how about 
vhost-vdpa-generic-device.rst?



Okay, will do in next version, thanks.


Thanks,
Stefano

.




Re: [PATCH v8 4/5] vdpa-dev: mark the device as unmigratable

2022-11-11 Thread longpeng2--- via




On 2022/11/8 16:46, Stefano Garzarella wrote:

On Tue, Nov 08, 2022 at 08:41:56AM +0800, Longpeng(Mike) wrote:

From: Longpeng 

The generic vDPA device doesn't support migration currently, so
mark it as unmigratable temporarily.

Signed-off-by: Longpeng 
---
hw/virtio/vdpa-dev.c | 1 +
1 file changed, 1 insertion(+)


Is there a particular reason why we don't squash this change in the 
second patch of this series where we add the hw/virtio/vdpa-dev.c file?



No, just want to make it noticeable.


Anyway, this change LGTM:

Reviewed-by: Stefano Garzarella 


Thanks.



diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
index 2885d06cbe..62d83d3423 100644
--- a/hw/virtio/vdpa-dev.c
+++ b/hw/virtio/vdpa-dev.c
@@ -327,6 +327,7 @@ static Property vhost_vdpa_device_properties[] = {

static const VMStateDescription vmstate_vhost_vdpa_device = {
    .name = "vhost-vdpa-device",
+    .unmigratable = 1,
    .minimum_version_id = 1,
    .version_id = 1,
    .fields = (VMStateField[]) {
--
2.23.0



.




Re: [PATCH v8 5/5] docs: Add generic vhost-vdpa device documentation

2022-11-07 Thread longpeng2--- via




On 2022/11/8 10:42, Jason Wang wrote:

On Tue, Nov 8, 2022 at 8:42 AM Longpeng(Mike)  wrote:


From: Longpeng 

Signed-off-by: Longpeng 
---
  docs/system/devices/vhost-vdpa-device.rst | 43 +++
  1 file changed, 43 insertions(+)
  create mode 100644 docs/system/devices/vhost-vdpa-device.rst

diff --git a/docs/system/devices/vhost-vdpa-device.rst 
b/docs/system/devices/vhost-vdpa-device.rst
new file mode 100644
index 00..b758c4fce6
--- /dev/null
+++ b/docs/system/devices/vhost-vdpa-device.rst


If the doc is for a general vhost-vDPA device, we'd better have a better name?



How about general-vhost-vdpa-device.rst?

On the other hand, this series focuses on the general vhost-vDPA device,
so the doc covers only that. It's OK if you want one doc that includes both,
but then I think it should move out of this series.




@@ -0,0 +1,43 @@
+
+=
+generic vhost-vdpa device
+=
+
+This document explains the usage of the generic vhost vdpa device.
+
+Description
+---
+
+vDPA(virtio data path acceleration) device is a device that uses a datapath
+which complies with the virtio specifications with vendor specific control
+path.
+
+QEMU provides two types of vhost-vdpa devices to enable the vDPA device, one
+is type sensitive which means QEMU needs to know the actual device type
+(e.g. net, blk, scsi) and another is called "generic vdpa device" which is
+type insensitive (likes vfio-pci).


Same as above: if this document is focused on the general vhost-vDPA
device, we'd better emphasize it. And I don't think mentioning vfio-pci
is a good idea here, since the two differ in a lot of places
(e.g. the general vhost-vdpa is not transport specific, as demonstrated
below).



Ok, got it.


Thanks


+
+Examples
+
+
+Prepare the vhost-vdpa backends first:
+
+::
+  host# ls -l /dev/vhost-vdpa-*
+  crw--- 1 root root 236, 0 Nov  2 00:49 /dev/vhost-vdpa-0
+
+Start QEMU with virtio-mmio bus:
+
+::
+  host# qemu-system  \
+  -M microvm -m 512 -smp 2 -kernel ... -initrd ...   \
+  -device vhost-vdpa-device,vhostdev=/dev/vhost-vdpa-0   \
+  ...
+
+Start QEMU with virtio-pci bus:
+
+::
+  host# qemu-system  \
+  -M pc -m 512 -smp 2\
+  -device vhost-vdpa-device-pci,vhostdev=/dev/vhost-vdpa-0   \
+  ...
--
2.23.0



.




Re: [PATCH v7 resend 0/4] add generic vDPA device support

2022-11-06 Thread longpeng2--- via




On 2022/11/6 21:47, Michael S. Tsirkin wrote:

On Sun, Nov 06, 2022 at 09:11:39PM +0800, Longpeng (Mike, Cloud Infrastructure 
Service Product Dept.) wrote:



On 2022/11/6 13:22, Michael S. Tsirkin wrote:

On Sun, Nov 06, 2022 at 08:17:07AM +0800, Longpeng (Mike, Cloud Infrastructure 
Service Product Dept.) wrote:



On 2022/11/6 0:43, Michael S. Tsirkin wrote:

On Sat, Nov 05, 2022 at 04:36:25PM +0800, Longpeng(Mike) wrote:

From: Longpeng 

Hi guys,

With the generic vDPA device, QEMU won't need to touch the device
types any more, just like vfio.


With this kind of passthrough migration is completely MIA right?
Better add a blocker...


Oh, I missed the "vdpa-dev: mark the device as unmigratable" since v4 and
I'll add it in the next version.

We'll support passthrough migration in the next step. We have already
written a demo that can migrate between some offloading cards.


Hmm ok. Backend disconnect can't work though, can it? State
is by necessity lost when backend crashes.
Yes, it can't.



And given this is there an advantage over VFIO?


I think the answer is the same as "why we need vDPA" if we compare it with
VFIO.


The answer is mostly because you can migrate and support backend
disconnect, no?


Migrating between different hardware is the primary consideration in our
requirements; supporting backend disconnect is a low priority.


I dislike non-orthogonal features though ...
And the advantage of keeping it out of process with qemu is
I presume security?



Yes, this is one of the reasons. The TCB of the generic vdpa device is
smaller than that of the existing vdpa device (which needs to use the
virtio-net/blk/scsi emulation code).


Besides, the generic vdpa device can support any type of virtio device, but
the existing vdpa device only supports virtio-net so far.


Though the existing vdpa device is more powerful and the generic vdpa 
device would miss some features, it can be an alternative for some users.






We can use the generic vDPA device as follow:
 -device vhost-vdpa-device-pci,vhostdev=/dev/vhost-vdpa-X
 Or
 -M microvm -m 512m -smp 2 -kernel ... -initrd ... -device \
 vhost-vdpa-device,vhostdev=/dev/vhost-vdpa-x



Changes v6 -> v7:
   (v6: https://mail.gnu.org/archive/html/qemu-devel/2022-05/msg02821.html)
   - rebase. [Jason]
   - add documentation . [Stefan]

Changes v5 -> v6:
 Patch 2:
   - Turn to the original approach in the RFC to initialize the
 virtio_pci_id_info array. [Michael]
  https://lore.kernel.org/all/20220105005900.860-2-longpe...@huawei.com/
 Patch 3:
   - Fix logical error of exception handler around the post_init.
 [Stefano]
   - Fix some coding style warnings. [Stefano]
 Patch 4:
   - Fix some coding style warnings. [Stefano]

Changes v4 -> v5:
 Patch 3:
   - remove vhostfd [Jason]
   - support virtio-mmio [Jason]

Changes v3 -> v4:
 v3: https://www.mail-archive.com/qemu-devel@nongnu.org/msg877015.html
 - reorganize the series [Stefano]
 - fix some typos [Stefano]
 - fix logical error in vhost_vdpa_device_realize [Stefano]

Changes v2 -> v3
 Patch 4 & 5:
   - only call vdpa ioctls in vdpa-dev.c [Stefano, Longpeng]
   - s/VQS_NUM/VQS_COUNT  [Stefano]
   - check both vdpa_dev_fd and vdpa_dev [Stefano]
 Patch 6:
   - move all steps into vhost_vdpa_device_unrealize. [Stefano]

Changes RFC -> v2
 Patch 1:
   - rename 'pdev_id' to 'trans_devid'  [Michael]
   - only use transitional device id for the devices
 listed in the spec  [Michael]
   - use macros to make the id_info table clearer  [Longpeng]
   - add some modern devices in the id_info table  [Longpeng]
 Patch 2:
   - remove the GET_VECTORS_NUM command  [Jason]
 Patch 4:
   - expose vdpa_dev_fd as a QOM preperty  [Stefan]
   - introduce vhost_vdpa_device_get_u32 as a common
 function to make the code clearer  [Stefan]
   - fix the misleading description of 'dc->desc'  [Stefano]
 Patch 5:
   - check returned number of virtqueues  [Stefan]
 Patch 6:
   - init s->num_queues  [Stefano]
   - free s->dev.vqs  [Stefano]


Longpeng (Mike) (4):
 virtio: get class_id and pci device id by the virtio id
 vdpa: add vdpa-dev support
 vdpa: add vdpa-dev-pci support
 docs: Add generic vhost-vdpa device documentation

docs/system/devices/vhost-vdpa-device.rst |  43 +++
hw/virtio/Kconfig |   5 +
hw/virtio/meson.build |   2 +
hw/virtio/vdpa-dev-pci.c  | 102 ++
hw/virtio/vdpa-dev.c  | 377 ++
hw/virtio/virtio-pci.c|  88 +
include/hw/virtio/vdpa-dev.h  |  43 +++
include/hw/virtio/virtio-pci.h|   5 +
8 files changed, 665 insertions(+)
create mode 100644 docs/system/devices/vhost-vdpa-device.rst
create mode 100644 hw/virtio/vdpa-dev-pci.c

Re: [PATCH v7 resend 0/4] add generic vDPA device support

2022-11-06 Thread longpeng2--- via




On 2022/11/6 13:22, Michael S. Tsirkin wrote:

On Sun, Nov 06, 2022 at 08:17:07AM +0800, Longpeng (Mike, Cloud Infrastructure 
Service Product Dept.) wrote:



On 2022/11/6 0:43, Michael S. Tsirkin wrote:

On Sat, Nov 05, 2022 at 04:36:25PM +0800, Longpeng(Mike) wrote:

From: Longpeng 

Hi guys,

With the generic vDPA device, QEMU won't need to touch the device
types any more, just like vfio.


With this kind of passthrough migration is completely MIA right?
Better add a blocker...


Oh, I missed the "vdpa-dev: mark the device as unmigratable" since v4 and
I'll add it in the next version.

We'll support passthrough migration in the next step. We have already
written a demo that can migrate between some offloading cards.


Hmm ok. Backend disconnect can't work though, can it? State
is by necessity lost when backend crashes.
Yes, it can't.



And given this is there an advantage over VFIO?


I think the answer is the same as "why we need vDPA" if we compare it with
VFIO.


The answer is mostly because you can migrate and support backend
disconnect, no?

Migrating between different hardware is the primary consideration in our
requirements; supporting backend disconnect is a low priority.





We can use the generic vDPA device as follow:
-device vhost-vdpa-device-pci,vhostdev=/dev/vhost-vdpa-X
Or
-M microvm -m 512m -smp 2 -kernel ... -initrd ... -device \
vhost-vdpa-device,vhostdev=/dev/vhost-vdpa-x



Changes v6 -> v7:
  (v6: https://mail.gnu.org/archive/html/qemu-devel/2022-05/msg02821.html)
  - rebase. [Jason]
  - add documentation . [Stefan]

Changes v5 -> v6:
Patch 2:
  - Turn to the original approach in the RFC to initialize the
virtio_pci_id_info array. [Michael]
  https://lore.kernel.org/all/20220105005900.860-2-longpe...@huawei.com/
Patch 3:
  - Fix logical error of exception handler around the post_init.
[Stefano]
  - Fix some coding style warnings. [Stefano]
Patch 4:
  - Fix some coding style warnings. [Stefano]

Changes v4 -> v5:
Patch 3:
  - remove vhostfd [Jason]
  - support virtio-mmio [Jason]

Changes v3 -> v4:
v3: https://www.mail-archive.com/qemu-devel@nongnu.org/msg877015.html
- reorganize the series [Stefano]
- fix some typos [Stefano]
- fix logical error in vhost_vdpa_device_realize [Stefano]

Changes v2 -> v3
Patch 4 & 5:
  - only call vdpa ioctls in vdpa-dev.c [Stefano, Longpeng]
  - s/VQS_NUM/VQS_COUNT  [Stefano]
  - check both vdpa_dev_fd and vdpa_dev [Stefano]
Patch 6:
  - move all steps into vhost_vdpa_device_unrealize. [Stefano]

Changes RFC -> v2
Patch 1:
  - rename 'pdev_id' to 'trans_devid'  [Michael]
  - only use transitional device id for the devices
listed in the spec  [Michael]
  - use macros to make the id_info table clearer  [Longpeng]
  - add some modern devices in the id_info table  [Longpeng]
Patch 2:
  - remove the GET_VECTORS_NUM command  [Jason]
Patch 4:
  - expose vdpa_dev_fd as a QOM preperty  [Stefan]
  - introduce vhost_vdpa_device_get_u32 as a common
function to make the code clearer  [Stefan]
  - fix the misleading description of 'dc->desc'  [Stefano]
Patch 5:
  - check returned number of virtqueues  [Stefan]
Patch 6:
  - init s->num_queues  [Stefano]
  - free s->dev.vqs  [Stefano]


Longpeng (Mike) (4):
virtio: get class_id and pci device id by the virtio id
vdpa: add vdpa-dev support
vdpa: add vdpa-dev-pci support
docs: Add generic vhost-vdpa device documentation

   docs/system/devices/vhost-vdpa-device.rst |  43 +++
   hw/virtio/Kconfig |   5 +
   hw/virtio/meson.build |   2 +
   hw/virtio/vdpa-dev-pci.c  | 102 ++
   hw/virtio/vdpa-dev.c  | 377 ++
   hw/virtio/virtio-pci.c|  88 +
   include/hw/virtio/vdpa-dev.h  |  43 +++
   include/hw/virtio/virtio-pci.h|   5 +
   8 files changed, 665 insertions(+)
   create mode 100644 docs/system/devices/vhost-vdpa-device.rst
   create mode 100644 hw/virtio/vdpa-dev-pci.c
   create mode 100644 hw/virtio/vdpa-dev.c
   create mode 100644 include/hw/virtio/vdpa-dev.h

--
2.23.0


.



.




Re: [PATCH v7 resend 0/4] add generic vDPA device support

2022-11-05 Thread longpeng2--- via




On 2022/11/6 0:43, Michael S. Tsirkin wrote:

On Sat, Nov 05, 2022 at 04:36:25PM +0800, Longpeng(Mike) wrote:

From: Longpeng 

Hi guys,

With the generic vDPA device, QEMU won't need to touch the device
types any more, just like vfio.


With this kind of passthrough migration is completely MIA right?
Better add a blocker...


Oh, I missed the "vdpa-dev: mark the device as unmigratable" since v4 
and I'll add it in the next version.


We'll support passthrough migration in the next step. We have already 
written a demo that can migrate between some offloading cards.



And given this is there an advantage over VFIO?


I think the answer is the same as "why we need vDPA" if we compare it 
with VFIO.





We can use the generic vDPA device as follow:
   -device vhost-vdpa-device-pci,vhostdev=/dev/vhost-vdpa-X
   Or
   -M microvm -m 512m -smp 2 -kernel ... -initrd ... -device \
   vhost-vdpa-device,vhostdev=/dev/vhost-vdpa-x



Changes v6 -> v7:
 (v6: https://mail.gnu.org/archive/html/qemu-devel/2022-05/msg02821.html)
 - rebase. [Jason]
 - add documentation . [Stefan]

Changes v5 -> v6:
   Patch 2:
 - Turn to the original approach in the RFC to initialize the
   virtio_pci_id_info array. [Michael]
  https://lore.kernel.org/all/20220105005900.860-2-longpe...@huawei.com/
   Patch 3:
 - Fix logical error of exception handler around the post_init.
   [Stefano]
 - Fix some coding style warnings. [Stefano]
   Patch 4:
 - Fix some coding style warnings. [Stefano]

Changes v4 -> v5:
   Patch 3:
 - remove vhostfd [Jason]
 - support virtio-mmio [Jason]

Changes v3 -> v4:
   v3: https://www.mail-archive.com/qemu-devel@nongnu.org/msg877015.html
   - reorganize the series [Stefano]
   - fix some typos [Stefano]
   - fix logical error in vhost_vdpa_device_realize [Stefano]

Changes v2 -> v3
   Patch 4 & 5:
 - only call vdpa ioctls in vdpa-dev.c [Stefano, Longpeng]
 - s/VQS_NUM/VQS_COUNT  [Stefano]
 - check both vdpa_dev_fd and vdpa_dev [Stefano]
   Patch 6:
 - move all steps into vhost_vdpa_device_unrealize. [Stefano]

Changes RFC -> v2
   Patch 1:
 - rename 'pdev_id' to 'trans_devid'  [Michael]
 - only use transitional device id for the devices
   listed in the spec  [Michael]
 - use macros to make the id_info table clearer  [Longpeng]
 - add some modern devices in the id_info table  [Longpeng]
   Patch 2:
 - remove the GET_VECTORS_NUM command  [Jason]
   Patch 4:
 - expose vdpa_dev_fd as a QOM preperty  [Stefan]
 - introduce vhost_vdpa_device_get_u32 as a common
   function to make the code clearer  [Stefan]
 - fix the misleading description of 'dc->desc'  [Stefano]
   Patch 5:
 - check returned number of virtqueues  [Stefan]
   Patch 6:
 - init s->num_queues  [Stefano]
 - free s->dev.vqs  [Stefano]


Longpeng (Mike) (4):
   virtio: get class_id and pci device id by the virtio id
   vdpa: add vdpa-dev support
   vdpa: add vdpa-dev-pci support
   docs: Add generic vhost-vdpa device documentation

  docs/system/devices/vhost-vdpa-device.rst |  43 +++
  hw/virtio/Kconfig |   5 +
  hw/virtio/meson.build |   2 +
  hw/virtio/vdpa-dev-pci.c  | 102 ++
  hw/virtio/vdpa-dev.c  | 377 ++
  hw/virtio/virtio-pci.c|  88 +
  include/hw/virtio/vdpa-dev.h  |  43 +++
  include/hw/virtio/virtio-pci.h|   5 +
  8 files changed, 665 insertions(+)
  create mode 100644 docs/system/devices/vhost-vdpa-device.rst
  create mode 100644 hw/virtio/vdpa-dev-pci.c
  create mode 100644 hw/virtio/vdpa-dev.c
  create mode 100644 include/hw/virtio/vdpa-dev.h

--
2.23.0


.




Re: [PATCH v6 resend 4/4] vdpa: add vdpa-dev-pci support

2022-06-08 Thread longpeng2--- via



On 2022/6/9 7:10, Michael S. Tsirkin wrote:

On Sat, May 14, 2022 at 12:11:07PM +0800, Longpeng(Mike) wrote:

From: Longpeng 

Supports vdpa-dev-pci, we can use the device as follow:

-device vhost-vdpa-device-pci,vhostdev=/dev/vhost-vdpa-X

Reviewed-by: Stefano Garzarella 
Signed-off-by: Longpeng 



Build fails:

FAILED: libqemu-aarch64-softmmu.fa.p/hw_virtio_vdpa-dev-pci.c.o
cc -m64 -mcx16 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. -Itarget/arm -I../target/arm -Iqapi -Itrace 
-Iui -Iui/shader -I/usr/include/pixman-1 -I/usr/include/capstone -I/usr/include/spice-server 
-I/usr/include/spice-1 -I/usr/include/cacard -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include 
-I/usr/include/sysprof-4 -I/usr/include/nss3 -I/usr/include/nspr4 -I/usr/include/PCSC 
-fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem 
/scm/qemu/linux-headers -isystem linux-headers -iquote . -iquote /scm/qemu -iquote 
/scm/qemu/include -iquote /scm/qemu/disas/libvixl -iquote /scm/qemu/tcg/i386 -pthread 
-U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE 
-Wstrict-prototypes -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes 
-fno-strict-aliasing -fno-common -fwrapv -Wold-style-declaration -Wold-style-definition 
-Wtype-limits -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body 
-Wnested-externs -Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 
-Wno-missing-include-dirs -Wno-shift-negative-value -Wno-psabi -fstack-protector-strong -fPIE 
-isystem../linux-headers -isystemlinux-headers -DNEED_CPU_H 
'-DCONFIG_TARGET="aarch64-softmmu-config-target.h"' 
'-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ 
libqemu-aarch64-softmmu.fa.p/hw_virtio_vdpa-dev-pci.c.o -MF 
libqemu-aarch64-softmmu.fa.p/hw_virtio_vdpa-dev-pci.c.o.d -o 
libqemu-aarch64-softmmu.fa.p/hw_virtio_vdpa-dev-pci.c.o -c ../hw/virtio/vdpa-dev-pci.c
../hw/virtio/vdpa-dev-pci.c:26:10: fatal error: virtio-pci.h: No such file or 
directory
26 | #include "virtio-pci.h"
   |  ^~
compilation terminated.



The following patch moved virtio-pci.h to the include/ directory:

e1b1f53 2022-05-16 04:38:40 -0400 hw/virtio: move virtio-pci.h into 
shared include space


I'll rebase this series soon.

Thanks.
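For reference, once rebased on top of that commit the fix should presumably be
just switching to the shared include path, along the lines of:

    -#include "virtio-pci.h"
    +#include "hw/virtio/virtio-pci.h"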




---
  hw/virtio/meson.build|   1 +
  hw/virtio/vdpa-dev-pci.c | 102 +++
  2 files changed, 103 insertions(+)
  create mode 100644 hw/virtio/vdpa-dev-pci.c

diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index 8f6f86db71..c2da69616f 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -50,6 +50,7 @@ virtio_pci_ss.add(when: 'CONFIG_VIRTIO_SERIAL', if_true: 
files('virtio-serial-pc
  virtio_pci_ss.add(when: 'CONFIG_VIRTIO_PMEM', if_true: 
files('virtio-pmem-pci.c'))
  virtio_pci_ss.add(when: 'CONFIG_VIRTIO_IOMMU', if_true: 
files('virtio-iommu-pci.c'))
  virtio_pci_ss.add(when: 'CONFIG_VIRTIO_MEM', if_true: 
files('virtio-mem-pci.c'))
+virtio_pci_ss.add(when: 'CONFIG_VHOST_VDPA_DEV', if_true: 
files('vdpa-dev-pci.c'))
  
  virtio_ss.add_all(when: 'CONFIG_VIRTIO_PCI', if_true: virtio_pci_ss)
  
diff --git a/hw/virtio/vdpa-dev-pci.c b/hw/virtio/vdpa-dev-pci.c

new file mode 100644
index 00..fde35dfc92
--- /dev/null
+++ b/hw/virtio/vdpa-dev-pci.c
@@ -0,0 +1,102 @@
+/*
+ * Vhost Vdpa Device PCI Bindings
+ *
+ * Copyright (c) Huawei Technologies Co., Ltd. 2022. All Rights Reserved.
+ *
+ * Authors:
+ *   Longpeng 
+ *
+ * Largely based on the "vhost-user-blk-pci.c" and "vhost-user-blk.c"
+ * implemented by:
+ *   Changpeng Liu 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include 
+#include 
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/vdpa-dev.h"
+#include "hw/pci/pci.h"
+#include "hw/qdev-properties.h"
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/module.h"
+#include "virtio-pci.h"
+#include "qom/object.h"
+
+
+typedef struct VhostVdpaDevicePCI VhostVdpaDevicePCI;
+
+#define TYPE_VHOST_VDPA_DEVICE_PCI "vhost-vdpa-device-pci-base"
+DECLARE_INSTANCE_CHECKER(VhostVdpaDevicePCI, VHOST_VDPA_DEVICE_PCI,
+ TYPE_VHOST_VDPA_DEVICE_PCI)
+
+struct VhostVdpaDevicePCI {
+VirtIOPCIProxy parent_obj;
+VhostVdpaDevice vdev;
+};
+
+static void vhost_vdpa_device_pci_instance_init(Object *obj)
+{
+VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(obj);
+
+virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
+TYPE_VHOST_VDPA_DEVICE);
+object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
+  "bootindex");
+}
+
+static Property vhost_vdpa_device_pci_properties[] = {
+DEFINE_PROP_END_OF_LIST(),
+};
+
+static int vhost_vdpa_device_pci_post_init(VhostVdpaDevice *v, Error **errp)
+{
+VhostVdpaDevicePCI 

Re: [PATCH v6 resend 0/4] add generic vDPA device support

2022-05-31 Thread longpeng2--- via




On 2022/5/31 16:50, Jason Wang wrote:

On Mon, May 30, 2022 at 12:16 PM Longpeng (Mike, Cloud Infrastructure
Service Product Dept.)  wrote:


Hi guys,

Ping...Any other comments?


Looks good to me.

Acked-by: Jason Wang 

(This probably requires a rebase since it doesn't apply cleanly on master).



Okay, I'll rebase in next version.
Thanks.


Thanks



On 2022/5/14 12:11, Longpeng(Mike) wrote:

From: Longpeng 

Hi guys,

With the generic vDPA device, QEMU won't need to touch the device
types any more, similar to vfio.

We can use the generic vDPA device as follows:
-device vhost-vdpa-device-pci,vhostdev=/dev/vhost-vdpa-X
Or
-M microvm -m 512m -smp 2 -kernel ... -initrd ... -device \
vhost-vdpa-device,vhostdev=/dev/vhost-vdpa-x

I've done some simple tests on Huawei's offloading card (net, 0.95).

Changes v5 -> v6:
Patch 2:
  - Turn to the original approach in the RFC to initialize the
virtio_pci_id_info array. [Michael]
 https://lore.kernel.org/all/20220105005900.860-2-longpe...@huawei.com/
Patch 3:
  - Fix logical error of exception handler around the post_init.
[Stefano]
  - Fix some coding style warnings. [Stefano]
Patch 4:
  - Fix some coding style warnings. [Stefano]

Changes v4 -> v5:
Patch 3:
  - remove vhostfd [Jason]
  - support virtio-mmio [Jason]

Changes v3 -> v4:
v3: https://www.mail-archive.com/qemu-devel@nongnu.org/msg877015.html
- reorganize the series [Stefano]
- fix some typos [Stefano]
- fix logical error in vhost_vdpa_device_realize [Stefano]

Changes v2 -> v3
Patch 4 & 5:
  - only call vdpa ioctls in vdpa-dev.c [Stefano, Longpeng]
  - s/VQS_NUM/VQS_COUNT  [Stefano]
  - check both vdpa_dev_fd and vdpa_dev [Stefano]
Patch 6:
  - move all steps into vhost_vdpa_device_unrealize. [Stefano]

Changes RFC -> v2
Patch 1:
  - rename 'pdev_id' to 'trans_devid'  [Michael]
  - only use transitional device id for the devices
listed in the spec  [Michael]
  - use macros to make the id_info table clearer  [Longpeng]
  - add some modern devices in the id_info table  [Longpeng]
Patch 2:
  - remove the GET_VECTORS_NUM command  [Jason]
Patch 4:
  - expose vdpa_dev_fd as a QOM property  [Stefan]
  - introduce vhost_vdpa_device_get_u32 as a common
function to make the code clearer  [Stefan]
  - fix the misleading description of 'dc->desc'  [Stefano]
Patch 5:
  - check returned number of virtqueues  [Stefan]
Patch 6:
  - init s->num_queues  [Stefano]
  - free s->dev.vqs  [Stefano]


Longpeng (Mike) (4):
linux-headers: Update headers to Linux 5.18-rc6
virtio: get class_id and pci device id by the virtio id
vdpa: add vdpa-dev support
vdpa: add vdpa-dev-pci support

   hw/virtio/Kconfig|   5 +
   hw/virtio/meson.build|   2 +
   hw/virtio/vdpa-dev-pci.c | 102 ++
   hw/virtio/vdpa-dev.c | 377 +++
   hw/virtio/virtio-pci.c   |  88 
   hw/virtio/virtio-pci.h   |   5 +
   include/hw/virtio/vdpa-dev.h |  43 
   linux-headers/linux/vhost.h  |   7 +
   8 files changed, 629 insertions(+)
   create mode 100644 hw/virtio/vdpa-dev-pci.c
   create mode 100644 hw/virtio/vdpa-dev.c
   create mode 100644 include/hw/virtio/vdpa-dev.h





.




Re: [PATCH v6 resend 0/4] add generic vDPA device support

2022-05-29 Thread longpeng2--- via

Hi guys,

Ping...Any other comments?

On 2022/5/14 12:11, Longpeng(Mike) wrote:

From: Longpeng 

Hi guys,

With the generic vDPA device, QEMU won't need to touch the device
types any more, similar to vfio.

We can use the generic vDPA device as follows:
   -device vhost-vdpa-device-pci,vhostdev=/dev/vhost-vdpa-X
   Or
   -M microvm -m 512m -smp 2 -kernel ... -initrd ... -device \
   vhost-vdpa-device,vhostdev=/dev/vhost-vdpa-x

I've done some simple tests on Huawei's offloading card (net, 0.95).

Changes v5 -> v6:
   Patch 2:
 - Turn to the original approach in the RFC to initialize the
   virtio_pci_id_info array. [Michael]
  https://lore.kernel.org/all/20220105005900.860-2-longpe...@huawei.com/
   Patch 3:
 - Fix logical error of exception handler around the post_init.
   [Stefano]
 - Fix some coding style warnings. [Stefano]
   Patch 4:
 - Fix some coding style warnings. [Stefano]

Changes v4 -> v5:
   Patch 3:
 - remove vhostfd [Jason]
 - support virtio-mmio [Jason]

Changes v3 -> v4:
   v3: https://www.mail-archive.com/qemu-devel@nongnu.org/msg877015.html
   - reorganize the series [Stefano]
   - fix some typos [Stefano]
   - fix logical error in vhost_vdpa_device_realize [Stefano]

Changes v2 -> v3
   Patch 4 & 5:
 - only call vdpa ioctls in vdpa-dev.c [Stefano, Longpeng]
 - s/VQS_NUM/VQS_COUNT  [Stefano]
 - check both vdpa_dev_fd and vdpa_dev [Stefano]
   Patch 6:
 - move all steps into vhost_vdpa_device_unrealize. [Stefano]

Changes RFC -> v2
   Patch 1:
 - rename 'pdev_id' to 'trans_devid'  [Michael]
 - only use transitional device id for the devices
   listed in the spec  [Michael]
 - use macros to make the id_info table clearer  [Longpeng]
 - add some modern devices in the id_info table  [Longpeng]
   Patch 2:
 - remove the GET_VECTORS_NUM command  [Jason]
   Patch 4:
 - expose vdpa_dev_fd as a QOM property  [Stefan]
 - introduce vhost_vdpa_device_get_u32 as a common
   function to make the code clearer  [Stefan]
 - fix the misleading description of 'dc->desc'  [Stefano]
   Patch 5:
 - check returned number of virtqueues  [Stefan]
   Patch 6:
 - init s->num_queues  [Stefano]
 - free s->dev.vqs  [Stefano]


Longpeng (Mike) (4):
   linux-headers: Update headers to Linux 5.18-rc6
   virtio: get class_id and pci device id by the virtio id
   vdpa: add vdpa-dev support
   vdpa: add vdpa-dev-pci support

  hw/virtio/Kconfig|   5 +
  hw/virtio/meson.build|   2 +
  hw/virtio/vdpa-dev-pci.c | 102 ++
  hw/virtio/vdpa-dev.c | 377 +++
  hw/virtio/virtio-pci.c   |  88 
  hw/virtio/virtio-pci.h   |   5 +
  include/hw/virtio/vdpa-dev.h |  43 
  linux-headers/linux/vhost.h  |   7 +
  8 files changed, 629 insertions(+)
  create mode 100644 hw/virtio/vdpa-dev-pci.c
  create mode 100644 hw/virtio/vdpa-dev.c
  create mode 100644 include/hw/virtio/vdpa-dev.h





Re: [PATCH v5 2/4] virtio: get class_id and pci device id by the virtio id

2022-05-12 Thread longpeng2--- via




On 2022/5/12 14:56, Michael S. Tsirkin wrote:

On Thu, May 12, 2022 at 02:21:01PM +0800, Longpeng(Mike) wrote:

From: Longpeng 

Add helpers to get the "Transitional PCI Device ID" and "class_id"
of the device specified by the "Virtio Device ID".

These helpers will be used to build the generic vDPA device later.

Signed-off-by: Longpeng 



Except, the IDs in the current header are broken. I just fixed them
and they will hopefully be OK in the next version.



Do you mean this patch ?

```
virtio: fix virtio transitional ids
This commit fixes the transitional PCI device ID.

Fixes: d61914ea6ada ("virtio: update virtio id table, add transitional ids")
Signed-off-by: Shunsuke Mie 
Link: https://lore.kernel.org/r/20220510102723.87666-1-...@igel.co.jp
Signed-off-by: Michael S. Tsirkin 
```

Luckily, we use PCI_DEVICE_ID_VIRTIO_xxx instead of VIRTIO_TRANS_ID_xxx 
here.




---
  hw/virtio/virtio-pci.c | 77 ++
  hw/virtio/virtio-pci.h |  5 +++
  2 files changed, 82 insertions(+)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 7cf1231c1c..fdfa205cee 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -19,6 +19,7 @@
  
  #include "exec/memop.h"

  #include "standard-headers/linux/virtio_pci.h"
+#include "standard-headers/linux/virtio_ids.h"
  #include "hw/boards.h"
  #include "hw/virtio/virtio.h"
  #include "migration/qemu-file-types.h"
@@ -212,6 +213,79 @@ static int virtio_pci_load_queue(DeviceState *d, int n, 
QEMUFile *f)
  return 0;
  }
  
+typedef struct VirtIOPCIIDInfo {

+/* virtio id */
+uint16_t vdev_id;
+/* pci device id for the transitional device */
+uint16_t trans_devid;
+uint16_t class_id;
+} VirtIOPCIIDInfo;
+
+#define VIRTIO_TRANS_DEV_ID_INFO(name, class)   \
+{   \
+.vdev_id = VIRTIO_ID_##name,\
+.trans_devid = PCI_DEVICE_ID_VIRTIO_##name, \
+.class_id = class,  \
+}
+
+#define VIRTIO_MODERN_DEV_ID_NFO(name, class)   \
+{   \
+.vdev_id = VIRTIO_ID_##name,\
+.class_id = class,  \
+}
+


No, I think I liked the original approach in the RFC better, even though
it duplicates a tiny bit of code.
This trick does not save a lot of typing and obscures the ID
use in case it's wrong (as was the case just recently).


OK, will do in v6.

Thanks.
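For context, the explicit-initializer style from the RFC (which v6 returns to)
spells out each entry without the helper macros, roughly like this (an
illustrative sketch, not the exact v6 hunk):

    static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
        {
            /* non-transitional device: no trans_devid */
            .vdev_id = VIRTIO_ID_CRYPTO,
            .class_id = PCI_CLASS_OTHERS,
        },
        {
            /* transitional device */
            .vdev_id = VIRTIO_ID_NET,
            .trans_devid = PCI_DEVICE_ID_VIRTIO_NET,
            .class_id = PCI_CLASS_NETWORK_ETHERNET,
        },
        /* ... the remaining entries written out the same way ... */
    };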




+static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
+/* Non-transitional devices */
+VIRTIO_MODERN_DEV_ID_NFO(CRYPTO,PCI_CLASS_OTHERS),
+VIRTIO_MODERN_DEV_ID_NFO(FS,PCI_CLASS_STORAGE_OTHER),
+/* Transitional devices */
+VIRTIO_TRANS_DEV_ID_INFO(NET,   PCI_CLASS_NETWORK_ETHERNET),
+VIRTIO_TRANS_DEV_ID_INFO(BLOCK, PCI_CLASS_STORAGE_SCSI),
+VIRTIO_TRANS_DEV_ID_INFO(CONSOLE,   PCI_CLASS_COMMUNICATION_OTHER),
+VIRTIO_TRANS_DEV_ID_INFO(SCSI,  PCI_CLASS_STORAGE_SCSI),
+VIRTIO_TRANS_DEV_ID_INFO(9P,PCI_BASE_CLASS_NETWORK),
+VIRTIO_TRANS_DEV_ID_INFO(BALLOON,   PCI_CLASS_OTHERS),
+VIRTIO_TRANS_DEV_ID_INFO(RNG,   PCI_CLASS_OTHERS),
+};
+
+static const VirtIOPCIIDInfo *virtio_pci_get_id_info(uint16_t vdev_id)
+{
+const VirtIOPCIIDInfo *info = NULL;
+int i;
+
+for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
+if (virtio_pci_id_info[i].vdev_id == vdev_id) {
+info = &virtio_pci_id_info[i];
+break;
+}
+}
+
+if (!info) {
+/* The device id is invalid or not added to the id_info yet. */
+error_report("Invalid virtio device(id %u)", vdev_id);
+abort();
+}
+
+return info;
+}
+
+/*
+ * Get the Transitional Device ID for the specific device, return
+ * zero if the device is non-transitional.
+ */
+uint16_t virtio_pci_get_trans_devid(uint16_t device_id)
+{
+return virtio_pci_get_id_info(device_id)->trans_devid;
+}
+
+/*
+ * Get the Class ID for the specific device.
+ */
+uint16_t virtio_pci_get_class_id(uint16_t device_id)
+{
+return virtio_pci_get_id_info(device_id)->class_id;
+}
+
  static bool virtio_pci_ioeventfd_enabled(DeviceState *d)
  {
  VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
@@ -1675,6 +1749,9 @@ static void virtio_pci_device_plugged(DeviceState *d, 
Error **errp)
   * is set to PCI_SUBVENDOR_ID_REDHAT_QUMRANET by default.
   */
  pci_set_word(config + PCI_SUBSYSTEM_ID, virtio_bus_get_vdev_id(bus));
+if (proxy->trans_devid) {
+pci_config_set_device_id(config, proxy->trans_devid);
+}
  } else {
  /* pure virtio-1.0 */
  pci_set_word(config + PCI_VENDOR_ID,
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 2446dcd9ae..f08665cd1b 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -146,6 +146,8 @@ struct VirtIOPCIProxy {
  bool disable_modern;
  bool 

Re: [PATCH v5 3/4] vdpa: add vdpa-dev support

2022-05-12 Thread longpeng2--- via




On 2022/5/12 22:36, Stefano Garzarella wrote:

On Thu, May 12, 2022 at 02:21:02PM +0800, Longpeng(Mike) wrote:

From: Longpeng 

Supports vdpa-dev; we can use the device directly:

-M microvm -m 512m -smp 2 -kernel ... -initrd ... -device \
vhost-vdpa-device,vhostdev=/dev/vhost-vdpa-x

Signed-off-by: Longpeng 
---
hw/virtio/Kconfig    |   5 +
hw/virtio/meson.build    |   1 +
hw/virtio/vdpa-dev.c | 377 +++
include/hw/virtio/vdpa-dev.h |  43 
4 files changed, 426 insertions(+)
create mode 100644 hw/virtio/vdpa-dev.c
create mode 100644 include/hw/virtio/vdpa-dev.h

diff --git a/hw/virtio/Kconfig b/hw/virtio/Kconfig
index c144d42f9b..724eb58a32 100644
--- a/hw/virtio/Kconfig
+++ b/hw/virtio/Kconfig
@@ -68,3 +68,8 @@ config VHOST_USER_RNG
    bool
    default y
    depends on VIRTIO && VHOST_USER
+
+config VHOST_VDPA_DEV
+    bool
+    default y
+    depends on VIRTIO && VHOST_VDPA && LINUX
diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index 67dc77e00f..8f6f86db71 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -29,6 +29,7 @@ virtio_ss.add(when: 'CONFIG_VHOST_USER_I2C', 
if_true: files('vhost-user-i2c.c'))
virtio_ss.add(when: ['CONFIG_VIRTIO_PCI', 'CONFIG_VHOST_USER_I2C'], 
if_true: files('vhost-user-i2c-pci.c'))
virtio_ss.add(when: 'CONFIG_VHOST_USER_RNG', if_true: 
files('vhost-user-rng.c'))
virtio_ss.add(when: ['CONFIG_VHOST_USER_RNG', 'CONFIG_VIRTIO_PCI'], 
if_true: files('vhost-user-rng-pci.c'))
+virtio_ss.add(when: 'CONFIG_VHOST_VDPA_DEV', if_true: 
files('vdpa-dev.c'))


virtio_pci_ss = ss.source_set()
virtio_pci_ss.add(when: 'CONFIG_VHOST_VSOCK', if_true: 
files('vhost-vsock-pci.c'))

diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
new file mode 100644
index 00..56597c881a
--- /dev/null
+++ b/hw/virtio/vdpa-dev.c
@@ -0,0 +1,377 @@
+/*
+ * Vhost Vdpa Device
+ *
+ * Copyright (c) Huawei Technologies Co., Ltd. 2022. All Rights 
Reserved.

+ *
+ * Authors:
+ *   Longpeng 
+ *
+ * Largely based on the "vhost-user-blk-pci.c" and "vhost-user-blk.c" 
implemented by:

+ *   Changpeng Liu 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 
or later.

+ * See the COPYING.LIB file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include 
+#include 
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/cutils.h"
+#include "hw/qdev-core.h"
+#include "hw/qdev-properties.h"
+#include "hw/qdev-properties-system.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+#include "hw/virtio/vdpa-dev.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/runstate.h"
+
+static void
+vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
+{
+    /* Nothing to do */
+}
+
+static uint32_t
+vhost_vdpa_device_get_u32(int fd, unsigned long int cmd, Error **errp)
+{
+    uint32_t val = (uint32_t)-1;
+
+    if (ioctl(fd, cmd, &val) < 0) {
+    error_setg(errp, "vhost-vdpa-device: cmd 0x%lx failed: %s",
+   cmd, strerror(errno));
+    }
+
+    return val;
+}
+
+static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VhostVdpaDevice *v = VHOST_VDPA_DEVICE(vdev);
+    uint16_t max_queue_size;
+    struct vhost_virtqueue *vqs;
+    int i, ret;
+
+    if (!v->vhostdev) {
+    error_setg(errp, "vhost-vdpa-device: vhostdev are missing");
+    return;
+    }
+
+    v->vhostfd = qemu_open(v->vhostdev, O_RDWR, errp);
+    if (*errp) {
+    return;
+    }
+    v->vdpa.device_fd = v->vhostfd;
+
+    v->vdev_id = vhost_vdpa_device_get_u32(v->vhostfd,
+   VHOST_VDPA_GET_DEVICE_ID, 
errp);

+    if (*errp) {
+    goto out;
+    }
+
+    max_queue_size = vhost_vdpa_device_get_u32(v->vhostfd,
+   
VHOST_VDPA_GET_VRING_NUM, errp);

+    if (*errp) {
+    goto out;
+    }
+
+    if (v->queue_size > max_queue_size) {
+    error_setg(errp, "vhost-vdpa-device: invalid queue_size: %u 
(max:%u)",

+   v->queue_size, max_queue_size);
+    goto out;
+    } else if (!v->queue_size) {
+    v->queue_size = max_queue_size;
+    }
+
+    v->num_queues = vhost_vdpa_device_get_u32(v->vhostfd,
+  
VHOST_VDPA_GET_VQS_COUNT, errp);

+    if (*errp) {
+    goto out;
+    }
+
+    if (!v->num_queues || v->num_queues > VIRTIO_QUEUE_MAX) {
+    error_setg(errp, "invalid number of virtqueues: %u (max:%u)",
+   v->num_queues, VIRTIO_QUEUE_MAX);
+    goto out;
+    }
+
+    v->dev.nvqs = v->num_queues;
+    vqs = g_new0(struct vhost_virtqueue, v->dev.nvqs);
+    v->dev.vqs = vqs;
+    v->dev.vq_index = 0;
+    v->dev.vq_index_end = v->dev.nvqs;
+    v->dev.backend_features = 0;
+    v->started = false;
+
+    ret = vhost_dev_init(&v->dev, &v->vdpa, 

Re: [PATCH v4 3/4] vdpa: add vdpa-dev support

2022-05-11 Thread longpeng2--- via




On 2022/5/11 16:55, Jason Wang wrote:

On Wed, May 11, 2022 at 2:10 PM Longpeng (Mike, Cloud Infrastructure
Service Product Dept.)  wrote:




On 2022/5/11 10:56, Jason Wang wrote:

On Tue, May 10, 2022 at 8:59 PM Longpeng(Mike)  wrote:


From: Longpeng 

Supports vdpa-dev.

Signed-off-by: Longpeng 
---
   hw/virtio/Kconfig|   5 +
   hw/virtio/meson.build|   1 +
   hw/virtio/vdpa-dev.c | 385 +++
   include/hw/virtio/vdpa-dev.h |  43 
   4 files changed, 434 insertions(+)
   create mode 100644 hw/virtio/vdpa-dev.c
   create mode 100644 include/hw/virtio/vdpa-dev.h

diff --git a/hw/virtio/Kconfig b/hw/virtio/Kconfig
index c144d42f9b..2723283382 100644
--- a/hw/virtio/Kconfig
+++ b/hw/virtio/Kconfig
@@ -68,3 +68,8 @@ config VHOST_USER_RNG
   bool
   default y
   depends on VIRTIO && VHOST_USER
+
+config VHOST_VDPA_DEV
+bool
+default y if VIRTIO_PCI


Do we have the plan to add VIRTIO_MMIO support?


There is currently no plan. Maybe we can try to support it in the next step.


That would be better, allowing MMIO on top of vhost-vDPA is considered
to be one of the important advantages.


OK, I've done some simple tests, and it works fine.
I'll send v5 later.
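For reference, the MMIO flavour uses the plain (non-PCI) device on a
memory-mapped machine model; the cover letter exercises it like this:

    -M microvm -m 512m -smp 2 -kernel ... -initrd ... -device \
    vhost-vdpa-device,vhostdev=/dev/vhost-vdpa-x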




+depends on VIRTIO && VHOST_VDPA && LINUX
diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index 67dc77e00f..8f6f86db71 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -29,6 +29,7 @@ virtio_ss.add(when: 'CONFIG_VHOST_USER_I2C', if_true: 
files('vhost-user-i2c.c'))
   virtio_ss.add(when: ['CONFIG_VIRTIO_PCI', 'CONFIG_VHOST_USER_I2C'], if_true: 
files('vhost-user-i2c-pci.c'))
   virtio_ss.add(when: 'CONFIG_VHOST_USER_RNG', if_true: 
files('vhost-user-rng.c'))
   virtio_ss.add(when: ['CONFIG_VHOST_USER_RNG', 'CONFIG_VIRTIO_PCI'], if_true: 
files('vhost-user-rng-pci.c'))
+virtio_ss.add(when: 'CONFIG_VHOST_VDPA_DEV', if_true: files('vdpa-dev.c'))

   virtio_pci_ss = ss.source_set()
   virtio_pci_ss.add(when: 'CONFIG_VHOST_VSOCK', if_true: 
files('vhost-vsock-pci.c'))
diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
new file mode 100644
index 00..543b5b4b81
--- /dev/null
+++ b/hw/virtio/vdpa-dev.c
@@ -0,0 +1,385 @@
+/*
+ * Vhost Vdpa Device
+ *
+ * Copyright (c) Huawei Technologies Co., Ltd. 2022. All Rights Reserved.
+ *
+ * Authors:
+ *   Longpeng 
+ *
+ * Largely based on the "vhost-user-blk-pci.c" and "vhost-user-blk.c" 
implemented by:
+ *   Changpeng Liu 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include 
+#include 
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/cutils.h"
+#include "hw/qdev-core.h"
+#include "hw/qdev-properties.h"
+#include "hw/qdev-properties-system.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+#include "hw/virtio/vdpa-dev.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/runstate.h"
+
+static void
+vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
+{
+/* Nothing to do */
+}
+
+static uint32_t
+vhost_vdpa_device_get_u32(int fd, unsigned long int cmd, Error **errp)
+{
+uint32_t val = (uint32_t)-1;
+
+if (ioctl(fd, cmd, &val) < 0) {
+error_setg(errp, "vhost-vdpa-device: cmd 0x%lx failed: %s",
+   cmd, strerror(errno));
+}
+
+return val;
+}
+
+static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+VhostVdpaDevice *v = VHOST_VDPA_DEVICE(vdev);
+uint16_t max_queue_size;
+struct vhost_virtqueue *vqs;
+int i, ret;
+
+if (!v->vhostdev && v->vhostfd == -1) {
+error_setg(errp, "both vhostdev and vhostfd are missing");
+return;
+}
+
+if (v->vhostdev && v->vhostfd != -1) {
+error_setg(errp, "both vhostdev and vhostfd are set");
+return;
+}
+
+if (v->vhostfd == -1) {
+v->vhostfd = qemu_open(v->vhostdev, O_RDWR, errp);
+if (*errp) {


Is this better to set error messages for all the possible failures
during realization?


I think the error messages are already set in *errp, right?


Oh right.




+return;
+}
+}
+v->vdpa.device_fd = v->vhostfd;
+
+v->vdev_id = vhost_vdpa_device_get_u32(v->vhostfd,
+   VHOST_VDPA_GET_DEVICE_ID, errp);
+if (*errp) {
+goto out;
+}
+
+max_queue_size = vhost_vdpa_device_get_u32(v->vhostfd,
+   VHOST_VDPA_GET_VRING_NUM, errp);
+if (*errp) {
+goto out;
+}
+
+if (v->queue_size > max_queue_size) {
+error_setg(errp, "vhost-vdpa-device: invalid queue_size: %u (max:%u)",
+   v->queue_size, max_queue_size);
+goto out;
+} else if (!v->queue_size) {
+

Re: [PATCH v4 3/4] vdpa: add vdpa-dev support

2022-05-11 Thread longpeng2--- via




On 2022/5/11 10:56, Jason Wang wrote:

On Tue, May 10, 2022 at 8:59 PM Longpeng(Mike)  wrote:


From: Longpeng 

Supports vdpa-dev.

Signed-off-by: Longpeng 
---
  hw/virtio/Kconfig|   5 +
  hw/virtio/meson.build|   1 +
  hw/virtio/vdpa-dev.c | 385 +++
  include/hw/virtio/vdpa-dev.h |  43 
  4 files changed, 434 insertions(+)
  create mode 100644 hw/virtio/vdpa-dev.c
  create mode 100644 include/hw/virtio/vdpa-dev.h

diff --git a/hw/virtio/Kconfig b/hw/virtio/Kconfig
index c144d42f9b..2723283382 100644
--- a/hw/virtio/Kconfig
+++ b/hw/virtio/Kconfig
@@ -68,3 +68,8 @@ config VHOST_USER_RNG
  bool
  default y
  depends on VIRTIO && VHOST_USER
+
+config VHOST_VDPA_DEV
+bool
+default y if VIRTIO_PCI


Do we have the plan to add VIRTIO_MMIO support?


There is currently no plan. Maybe we can try to support it in the next step.


+depends on VIRTIO && VHOST_VDPA && LINUX
diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index 67dc77e00f..8f6f86db71 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -29,6 +29,7 @@ virtio_ss.add(when: 'CONFIG_VHOST_USER_I2C', if_true: 
files('vhost-user-i2c.c'))
  virtio_ss.add(when: ['CONFIG_VIRTIO_PCI', 'CONFIG_VHOST_USER_I2C'], if_true: 
files('vhost-user-i2c-pci.c'))
  virtio_ss.add(when: 'CONFIG_VHOST_USER_RNG', if_true: 
files('vhost-user-rng.c'))
  virtio_ss.add(when: ['CONFIG_VHOST_USER_RNG', 'CONFIG_VIRTIO_PCI'], if_true: 
files('vhost-user-rng-pci.c'))
+virtio_ss.add(when: 'CONFIG_VHOST_VDPA_DEV', if_true: files('vdpa-dev.c'))

  virtio_pci_ss = ss.source_set()
  virtio_pci_ss.add(when: 'CONFIG_VHOST_VSOCK', if_true: 
files('vhost-vsock-pci.c'))
diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
new file mode 100644
index 00..543b5b4b81
--- /dev/null
+++ b/hw/virtio/vdpa-dev.c
@@ -0,0 +1,385 @@
+/*
+ * Vhost Vdpa Device
+ *
+ * Copyright (c) Huawei Technologies Co., Ltd. 2022. All Rights Reserved.
+ *
+ * Authors:
+ *   Longpeng 
+ *
+ * Largely based on the "vhost-user-blk-pci.c" and "vhost-user-blk.c" 
implemented by:
+ *   Changpeng Liu 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include 
+#include 
+#include "qapi/error.h"
+#include "qemu/error-report.h"
+#include "qemu/cutils.h"
+#include "hw/qdev-core.h"
+#include "hw/qdev-properties.h"
+#include "hw/qdev-properties-system.h"
+#include "hw/virtio/vhost.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-access.h"
+#include "hw/virtio/vdpa-dev.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/runstate.h"
+
+static void
+vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
+{
+/* Nothing to do */
+}
+
+static uint32_t
+vhost_vdpa_device_get_u32(int fd, unsigned long int cmd, Error **errp)
+{
+uint32_t val = (uint32_t)-1;
+
+if (ioctl(fd, cmd, &val) < 0) {
+error_setg(errp, "vhost-vdpa-device: cmd 0x%lx failed: %s",
+   cmd, strerror(errno));
+}
+
+return val;
+}
+
+static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+VhostVdpaDevice *v = VHOST_VDPA_DEVICE(vdev);
+uint16_t max_queue_size;
+struct vhost_virtqueue *vqs;
+int i, ret;
+
+if (!v->vhostdev && v->vhostfd == -1) {
+error_setg(errp, "both vhostdev and vhostfd are missing");
+return;
+}
+
+if (v->vhostdev && v->vhostfd != -1) {
+error_setg(errp, "both vhostdev and vhostfd are set");
+return;
+}
+
+if (v->vhostfd == -1) {
+v->vhostfd = qemu_open(v->vhostdev, O_RDWR, errp);
+if (*errp) {


Is this better to set error messages for all the possible failures
during realization?


I think the error messages are already set in *errp, right?


+return;
+}
+}
+v->vdpa.device_fd = v->vhostfd;
+
+v->vdev_id = vhost_vdpa_device_get_u32(v->vhostfd,
+   VHOST_VDPA_GET_DEVICE_ID, errp);
+if (*errp) {
+goto out;
+}
+
+max_queue_size = vhost_vdpa_device_get_u32(v->vhostfd,
+   VHOST_VDPA_GET_VRING_NUM, errp);
+if (*errp) {
+goto out;
+}
+
+if (v->queue_size > max_queue_size) {
+error_setg(errp, "vhost-vdpa-device: invalid queue_size: %u (max:%u)",
+   v->queue_size, max_queue_size);
+goto out;
+} else if (!v->queue_size) {
+v->queue_size = max_queue_size;
+}
+
+v->num_queues = vhost_vdpa_device_get_u32(v->vhostfd,
+  VHOST_VDPA_GET_VQS_COUNT, errp);
+if (*errp) {
+goto out;
+}
+
+if (!v->num_queues || v->num_queues > VIRTIO_QUEUE_MAX) {
+error_setg(errp, "invalid number of virtqueues: %u (max:%u)",
+   

Re: Fio regression caused by f9fc8932b11f3bcf2a2626f567cb6fdd36a33a94

2022-05-05 Thread longpeng2--- via

Hi Stefan,

On 2022/5/5 18:09, Stefan Hajnoczi wrote:

On Tue, May 03, 2022 at 09:43:15AM +0200, Lukáš Doktor wrote:

Hello Mike, Paolo, others,

in my perf pipeline I noticed a regression bisected to the 
f9fc8932b11f3bcf2a2626f567cb6fdd36a33a94 - "thread-posix: remove the posix semaphore 
support" commit and I'd like to ask you to verify it might have caused that and 
eventually consider fixing it. The regression is visible, reproducible and clearly 
bisectable to this commit with the following 2 scenarios:

I can't parse the commit message for
f9fc8932b11f3bcf2a2626f567cb6fdd36a33a94, so it's not 100% clear to me
why it was necessary to remove sem_*() calls.


We can find the previous discussion here:

[1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg870174.html

[2] https://www.mail-archive.com/qemu-devel@nongnu.org/msg870409.html


Because sem_timedwait() only supports absolute time, it would be affected
if the system time changes. Another reason to remove sem_*() is to make
the code much neater.



util/thread-pool.c uses qemu_sem_*() to notify worker threads when work
becomes available. It makes sense that this operation is
performance-critical and that's why the benchmark regressed.

Maybe thread-pool.c can use qemu_cond_*() instead of qemu_sem_*(). That
avoids the extra mutex (we already have pool->lock) and counter (we
already have pool->request_list)?
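For illustration, here is a self-contained sketch of that idea, with plain
pthreads standing in for qemu_mutex_* and qemu_cond_*; the names are made up
and this is not the actual util/thread-pool.c code. The existing lock doubles
as the condvar mutex, and the pending count plays the role of
pool->request_list, so no extra mutex or counter is needed:

    /* build: cc sketch.c -lpthread */
    #include <pthread.h>
    #include <stdbool.h>
    #include <stdio.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;  /* plays pool->lock */
    static pthread_cond_t request_cond = PTHREAD_COND_INITIALIZER;
    static int pending;                               /* plays pool->request_list */
    static bool stopping;

    static void *worker(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&lock);
        while (!stopping) {
            while (pending == 0 && !stopping) {
                /* sleeps on the existing lock, no extra mutex or counter */
                pthread_cond_wait(&request_cond, &lock);
            }
            if (pending > 0) {
                pending--;
                pthread_mutex_unlock(&lock);
                /* process one request outside the lock */
                pthread_mutex_lock(&lock);
            }
        }
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    static void submit_one(void)
    {
        pthread_mutex_lock(&lock);
        pending++;
        pthread_cond_signal(&request_cond);           /* replaces qemu_sem_post() */
        pthread_mutex_unlock(&lock);
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);
        submit_one();
        pthread_mutex_lock(&lock);
        stopping = true;
        pthread_cond_broadcast(&request_cond);
        pthread_mutex_unlock(&lock);
        pthread_join(t, NULL);
        printf("worker exited\n");
        return 0;
    }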


1. fio write 4KiB using the nbd ioengine on localhost
2. fio read 4KiB using #cpu jobs and iodepth=8 on a rotational disk using qcow2 
image and default virt-install


but smaller regressions can be seen under other scenarios as well since this 
commit. You can find the report from bisections here:

https://ldoktor.github.io/tmp/RedHat-virtlab506/v7.0.0/RedHat-virtlab506-f9fc8932b11f3bcf2a2626f567cb6fdd36a33a94-RHEL-8.4.0-20210503.1-1.html
https://ldoktor.github.io/tmp/RedHat-virtlab506/v7.0.0/RedHat-virtlab506-f9fc8932b11f3bcf2a2626f567cb6fdd36a33a94-RHEL-8.4.0-20210503.1-2.html

Regards,
Lukáš









RE: [PATCH v3 00/10] add generic vDPA device support

2022-03-28 Thread longpeng2--- via



> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Monday, March 28, 2022 11:12 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ;
> qemu-devel@nongnu.org
> Subject: Re: [PATCH v3 00/10] add generic vDPA device support
> 
> On Sat, Mar 19, 2022 at 03:20:02PM +0800, Longpeng(Mike) wrote:
> >From: Longpeng 
> >
> >Hi guys,
> >
> >With the generic vDPA device, QEMU won't need to touch the device
> >types any more, such like vfio.
> >
> >We can use the generic vDPA device as follow:
> >  -device vhost-vdpa-device-pci,vdpa-dev=/dev/vhost-vdpa-X
> >
> >I've done some simple tests on Huawei's offloading card (net, 0.95)
> >and vdpa_sim_blk (1.0);
> 
> The whole seems almost okay, I left you some comments, but for the next
> version maybe it's better to reorganize the series differently.
> 
> Instead of adding vdpa-dev and vdpa-dev-pci incrementally, with stub
> functions and untestable parts, I would make a patch that adds vdpa-dev
> in full and then one that adds vdpa-dev-pci.

Ok, will do in next version.

> The first 2 patches on the other hand are okay, maybe patch 2 won't be
> needed since I think we will sync the headers after the 7.0 release, but
> is better to keep it for now to simplify the review.
> 

Yes, that's also my original intent. Thanks.

> Thanks,
> Stefano




RE: [PATCH v3 05/10] vdpa-dev: implement the realize interface

2022-03-28 Thread longpeng2--- via



> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Monday, March 28, 2022 11:00 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: Stefan Hajnoczi ; Michael Tsirkin ;
> Cornelia Huck ; Paolo Bonzini ;
> Gonglei (Arei) ; Yechuan ;
> Huangzhichao ; qemu devel list 
> 
> Subject: Re: [PATCH v3 05/10] vdpa-dev: implement the realize interface
> 
> On Sat, Mar 19, 2022 at 03:20:07PM +0800, Longpeng(Mike) wrote:
> >From: Longpeng 
> >

[snap]

> > static void vhost_vdpa_device_unrealize(DeviceState *dev)
> >@@ -66,6 +197,7 @@ static void vhost_vdpa_device_set_status(VirtIODevice 
> >*vdev,
> uint8_t status)
> > static Property vhost_vdpa_device_properties[] = {
> > DEFINE_PROP_STRING("vdpa-dev", VhostVdpaDevice, vdpa_dev),
> > DEFINE_PROP_INT32("vdpa-dev-fd", VhostVdpaDevice, vdpa_dev_fd, -1),
> 
> Other vhost devices use the `vhostfd` property, maybe we should use the
> same name.
> 
> If we go for this change, then maybe we also need to change `vdpa-dev`
> to `vhostpath` or something like that
> 

Makes sense.

I prefer to use 'vhostdev' for the file path, just like the usage in Netdev:
  -netdev type=vhost-vdpa,vhostdev=/dev/vhost-vdpa-0,id=vhostvdpa1
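Assuming the rename lands that way, the property definition and command line
would presumably end up looking like this (illustrative):

    DEFINE_PROP_STRING("vhostdev", VhostVdpaDevice, vhostdev),

    -device vhost-vdpa-device-pci,vhostdev=/dev/vhost-vdpa-0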

> Thanks,
> Stefano
> 
> >+DEFINE_PROP_UINT16("queue-size", VhostVdpaDevice, queue_size, 0),
> > DEFINE_PROP_END_OF_LIST(),
> > };
> >
> >diff --git a/include/hw/virtio/vdpa-dev.h b/include/hw/virtio/vdpa-dev.h
> >index 476bda0873..cf11abd0f7 100644
> >--- a/include/hw/virtio/vdpa-dev.h
> >+++ b/include/hw/virtio/vdpa-dev.h
> >@@ -28,6 +28,16 @@ struct VhostVdpaDevice {
> > char *vdpa_dev;
> > int vdpa_dev_fd;
> > int32_t bootindex;
> >+uint32_t vdev_id;
> >+uint32_t num_queues;
> >+struct vhost_dev dev;
> >+struct vhost_vdpa vdpa;
> >+VirtQueue **virtqs;
> >+uint8_t *config;
> >+int config_size;
> >+uint16_t queue_size;
> >+bool started;
> >+int (*post_init)(VhostVdpaDevice *v, Error **errp);
> > };
> >
> > #endif
> >--
> >2.23.0
> >




RE: [PATCH v3 05/10] vdpa-dev: implement the realize interface

2022-03-28 Thread longpeng2--- via



> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Monday, March 28, 2022 11:00 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: Stefan Hajnoczi ; Michael Tsirkin ;
> Cornelia Huck ; Paolo Bonzini ;
> Gonglei (Arei) ; Yechuan ;
> Huangzhichao ; qemu devel list 
> 
> Subject: Re: [PATCH v3 05/10] vdpa-dev: implement the realize interface
> 
> On Sat, Mar 19, 2022 at 03:20:07PM +0800, Longpeng(Mike) wrote:
> >From: Longpeng 
> >
> >Implements the .realize interface.
> >
> >Signed-off-by: Longpeng 
> >---
> > hw/virtio/vdpa-dev-pci.c |  18 -
> > hw/virtio/vdpa-dev.c | 132 +++
> > include/hw/virtio/vdpa-dev.h |  10 +++
> > 3 files changed, 159 insertions(+), 1 deletion(-)
> >
> >diff --git a/hw/virtio/vdpa-dev-pci.c b/hw/virtio/vdpa-dev-pci.c
> >index 9eb590ce8c..31bd17353a 100644
> >--- a/hw/virtio/vdpa-dev-pci.c
> >+++ b/hw/virtio/vdpa-dev-pci.c
> >@@ -51,10 +51,26 @@ static Property vhost_vdpa_device_pci_properties[] = {
> > DEFINE_PROP_END_OF_LIST(),
> > };
> >
> >+static int vhost_vdpa_device_pci_post_init(VhostVdpaDevice *v, Error **errp)
> >+{
> >+VhostVdpaDevicePCI *dev = container_of(v, VhostVdpaDevicePCI, vdev);
> >+VirtIOPCIProxy *vpci_dev = &dev->parent_obj;
> >+
> >+vpci_dev->class_code = virtio_pci_get_class_id(v->vdev_id);
> >+vpci_dev->trans_devid = virtio_pci_get_trans_devid(v->vdev_id);
> >+/* one for config vector */
> >+vpci_dev->nvectors = v->num_queues + 1;
> >+
> >+return 0;
> >+}
> >+
> > static void
> > vhost_vdpa_device_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
> > {
> >-return;
> >+VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(vpci_dev);
> >+
> >+dev->vdev.post_init = vhost_vdpa_device_pci_post_init;
> >+qdev_realize(DEVICE(&dev->vdev), BUS(&vpci_dev->bus), errp);
> > }
> >
> > static void vhost_vdpa_device_pci_class_init(ObjectClass *klass, void *data)
> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> >index 993cbc7d11..4defe6c33d 100644
> >--- a/hw/virtio/vdpa-dev.c
> >+++ b/hw/virtio/vdpa-dev.c
> >@@ -29,9 +29,140 @@
> > #include "sysemu/sysemu.h"
> > #include "sysemu/runstate.h"
> >
> >+static void
> >+vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> >+{
> >+/* Nothing to do */
> >+}
> >+
> >+static uint32_t
> >+vhost_vdpa_device_get_u32(int fd, unsigned long int cmd, Error **errp)
> >+{
> >+uint32_t val = (uint32_t)-1;
> >+
> >+if (ioctl(fd, cmd, &val) < 0) {
> >+error_setg(errp, "vhost-vdpa-device: cmd 0x%lx failed: %s",
> >+   cmd, strerror(errno));
> >+}
> >+
> >+return val;
> >+}
> >+
> > static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> > {
> >+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >+VhostVdpaDevice *v = VHOST_VDPA_DEVICE(vdev);
> >+uint32_t max_queue_size;
> >+struct vhost_virtqueue *vqs;
> >+int i, ret;
> >+
> >+if (!v->vdpa_dev || v->vdpa_dev_fd == -1) {
> ^
> Should we use && operator here?
> 
> I can't start QEMU, how did you test this series?
> 

Oh! I changed to the && operator in my local environment when I tested this series
but forgot to include it.

I'll fix it in the next version.
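(I.e. the check presumably becomes:

    if (!v->vdpa_dev && v->vdpa_dev_fd == -1) {
        error_setg(errp, "both vdpa-dev and vdpa-dev-fd are missing");
        return;
    }

so the error only triggers when neither option was given.)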

> $ ./qemu-system-x86_64 -m 512M -smp 2 -M q35,accel=kvm \
>...
>-device vhost-vdpa-device-pci,vdpa-dev=/dev/vhost-vdpa-0
> qemu-system-x86_64: -device
> vhost-vdpa-device-pci,vdpa-dev=/dev/vhost-vdpa-0: both vpda-dev and
> vpda-dev-fd are missing
> 
> >+error_setg(errp, "both vpda-dev and vpda-dev-fd are missing");
> 
> Typo s/vpda/vdpa
> 

OK.

> >+return;
> >+}
> >+
> >+if (v->vdpa_dev && v->vdpa_dev_fd != -1) {
> >+error_setg(errp, "both vpda-dev and vpda-dev-fd are set");
> 
> Ditto
> 

OK.

> >+return;
> >+}
> >+
> >+if (v->vdpa_dev_fd == -1) {
> >+v->vdpa_dev_fd = qemu_open(v->vdpa_dev, O_RDWR, errp);
> >+if (*errp) {
> >+return;
> >+}
> >+}
> >+v->vdpa.device_fd = v->vdpa_dev_fd;
> >+
> >+v->vdev_id = vhost_vdpa_device_get_u32(v->vdpa_dev_fd,
> >+   VHOST_VDPA_GET_DEVICE_ID, errp);
> >+if (*errp) {
> >+goto out;
> >+}
> >+
> >+max_queue_size = vhost_vdpa_device_get_u32(v->vdpa_dev_fd,
> >+   VHOST_VDPA_GET_VRING_NUM, 
> >errp);
> >+if (*errp) {
> >+goto out;
> >+}
> >+
> >+if (v->queue_size > max_queue_size) {
> >+error_setg(errp, "vhost-vdpa-device: invalid queue_size: %u
> (max:%u)",
> >+   v->queue_size, max_queue_size);
> >+goto out;
> >+} else if (!v->queue_size) {
> >+v->queue_size = max_queue_size;
> >+}
> >+
> >+v->num_queues = vhost_vdpa_device_get_u32(v->vdpa_dev_fd,
> >+  VHOST_VDPA_GET_VQS_COUNT, 
> >errp);
> >+if (*errp) {
> >+  

RE: [PATCH v2 05/10] vdpa-dev: implement the realize interface

2022-03-08 Thread longpeng2--- via



> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Tuesday, March 8, 2022 4:41 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ;
> qemu-devel@nongnu.org
> Subject: Re: [PATCH v2 05/10] vdpa-dev: implement the realize interface
> 
> On Tue, Mar 08, 2022 at 03:19:55AM +, Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.) wrote:
> >
> >
> >> -Original Message-
> >> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> >> Sent: Monday, March 7, 2022 8:14 PM
> >> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> >> 
> >> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> >> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> >> ; Huangzhichao ;
> >> qemu-devel@nongnu.org
> >> Subject: Re: [PATCH v2 05/10] vdpa-dev: implement the realize interface
> >>
> >> On Mon, Mar 07, 2022 at 11:13:02AM +, Longpeng (Mike, Cloud 
> >> Infrastructure
> >> Service Product Dept.) wrote:
> >> >
> >> >
> >> >> -Original Message-
> >> >> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> >> >> Sent: Monday, March 7, 2022 4:24 PM
> >> >> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> >> >> 
> >> >> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> >> >> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> >> >> ; Huangzhichao ;
> >> >> qemu-devel@nongnu.org
> >> >> Subject: Re: [PATCH v2 05/10] vdpa-dev: implement the realize interface
> >> >>
> >> >> On Sat, Mar 05, 2022 at 07:07:54AM +, Longpeng (Mike, Cloud
> Infrastructure
> >> >> Service Product Dept.) wrote:
> >> >> >
> >> >> >
> >> >> >> -Original Message-
> >> >> >> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> >> >> >> Sent: Wednesday, January 19, 2022 7:31 PM
> >> >> >> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> >> >> >> 
> >> >> >> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> >> >> >> pbonz...@redhat.com; Gonglei (Arei) ; 
> >> >> >> Yechuan
> >> >> >> ; Huangzhichao ;
> >> >> >> qemu-devel@nongnu.org
> >> >> >> Subject: Re: [PATCH v2 05/10] vdpa-dev: implement the realize 
> >> >> >> interface
> >> >> >>
> >> >> >> On Mon, Jan 17, 2022 at 08:43:26PM +0800, Longpeng(Mike) via wrote:
> >> >> >> >From: Longpeng 
> >> >> >> >
> >> >> >> >Implements the .realize interface.
> >> >> >> >
> >> >> >> >Signed-off-by: Longpeng 
> >> >> >> >---
> >> >> >> > hw/virtio/vdpa-dev.c | 101
> +++
> >> >> >> > include/hw/virtio/vdpa-dev.h |   8 +++
> >> >> >> > 2 files changed, 109 insertions(+)
> >> >> >> >
> >> >> >> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> >> >> >> >index b103768f33..bd28cf7a15 100644
> >> >> >> >--- a/hw/virtio/vdpa-dev.c
> >> >> >> >+++ b/hw/virtio/vdpa-dev.c
> >> >> >> >@@ -27,9 +27,109 @@ uint32_t vhost_vdpa_device_get_u32(int fd, 
> >> >> >> >unsigned
> >> long
> >> >> >> int cmd, Error **errp)
> >> >> >> > return val;
> >> >> >> > }
> >> >> >> >
> >> >> >> >+static void
> >> >> >> >+vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue
> >> *vq)
> >> >> >> >+{
> >> >> >> >+/* Nothing to do */
> >> >> >> >+}
> >> >> >> >+
> >> >> >> > static void vhost_vdpa_device_realize(DeviceState *dev, Error 
> >> >> >> > **errp)
> >> >> >> > {
> >> >> >> >+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >> >> >> >+VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> >> >> >> >+uint32_t vdev_id, max_queue_size;
> >> >> >> >+struct vhost_virtqueue *vqs;
> >> >> >> >+int i, ret;
> >> >> >> >+
> >> >> >> >+if (s->vdpa_dev_fd == -1) {
> >> >> >> >+s->vdpa_dev_fd = qemu_open(s->vdpa_dev, O_RDWR, errp);
> >> >> >>
> >> >> >> So, here we are re-opening the `vdpa_dev` again (without checking if
> it
> >> >> >> is NULL).
> >> >> >>
> >> >> >> And we re-do the same ioctls already done in
> >> >> >> vhost_vdpa_device_pci_realize(), so I think we should do them in a
> >> >> >> single place, and that place should be here.
> >> >> >>
> >> >> >> So, what about doing all the ioctls here, setting appropriate fields
> in
> >> >> >> VhostVdpaDevice, then using that fields in
> >> >> >> vhost_vdpa_device_pci_realize() after qdev_realize() to set
> >> >> >> `class_code`, `trans_devid`, and `nvectors`?
> >> >> >>
> >> >> >
> >> >> >vhost_vdpa_device_pci_realize()
> >> >> >  qdev_realize()
> >> >> >virtio_device_realize()
> >> >> >  vhost_vdpa_device_realize()
> >> >> >  virtio_bus_device_plugged()
> >> >> >virtio_pci_device_plugged()
> >> >> >
> >> >> >These three fields would be used in virtio_pci_device_plugged(), so 
> >> >> >it's
> >> too
> >> >> >late to set them after qdev_realize().  And they belong to 
> >> >> >VirtIOPCIProxy,
> >> so
> >> >> >we cannot set them in vhost_vdpa_device_realize() which is
> >> >> >transport 

RE: [PATCH v2 05/10] vdpa-dev: implement the realize interface

2022-03-07 Thread longpeng2--- via



> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Monday, March 7, 2022 8:14 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ;
> qemu-devel@nongnu.org
> Subject: Re: [PATCH v2 05/10] vdpa-dev: implement the realize interface
> 
> On Mon, Mar 07, 2022 at 11:13:02AM +, Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.) wrote:
> >
> >
> >> -Original Message-
> >> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> >> Sent: Monday, March 7, 2022 4:24 PM
> >> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> >> 
> >> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> >> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> >> ; Huangzhichao ;
> >> qemu-devel@nongnu.org
> >> Subject: Re: [PATCH v2 05/10] vdpa-dev: implement the realize interface
> >>
> >> On Sat, Mar 05, 2022 at 07:07:54AM +, Longpeng (Mike, Cloud 
> >> Infrastructure
> >> Service Product Dept.) wrote:
> >> >
> >> >
> >> >> -Original Message-
> >> >> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> >> >> Sent: Wednesday, January 19, 2022 7:31 PM
> >> >> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> >> >> 
> >> >> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> >> >> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> >> >> ; Huangzhichao ;
> >> >> qemu-devel@nongnu.org
> >> >> Subject: Re: [PATCH v2 05/10] vdpa-dev: implement the realize interface
> >> >>
> >> >> On Mon, Jan 17, 2022 at 08:43:26PM +0800, Longpeng(Mike) via wrote:
> >> >> >From: Longpeng 
> >> >> >
> >> >> >Implements the .realize interface.
> >> >> >
> >> >> >Signed-off-by: Longpeng 
> >> >> >---
> >> >> > hw/virtio/vdpa-dev.c | 101 +++
> >> >> > include/hw/virtio/vdpa-dev.h |   8 +++
> >> >> > 2 files changed, 109 insertions(+)
> >> >> >
> >> >> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> >> >> >index b103768f33..bd28cf7a15 100644
> >> >> >--- a/hw/virtio/vdpa-dev.c
> >> >> >+++ b/hw/virtio/vdpa-dev.c
> >> >> >@@ -27,9 +27,109 @@ uint32_t vhost_vdpa_device_get_u32(int fd, unsigned
> long
> >> >> int cmd, Error **errp)
> >> >> > return val;
> >> >> > }
> >> >> >
> >> >> >+static void
> >> >> >+vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue
> *vq)
> >> >> >+{
> >> >> >+/* Nothing to do */
> >> >> >+}
> >> >> >+
> >> >> > static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> >> >> > {
> >> >> >+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >> >> >+VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> >> >> >+uint32_t vdev_id, max_queue_size;
> >> >> >+struct vhost_virtqueue *vqs;
> >> >> >+int i, ret;
> >> >> >+
> >> >> >+if (s->vdpa_dev_fd == -1) {
> >> >> >+s->vdpa_dev_fd = qemu_open(s->vdpa_dev, O_RDWR, errp);
> >> >>
> >> >> So, here we are re-opening the `vdpa_dev` again (without checking if it
> >> >> is NULL).
> >> >>
> >> >> And we re-do the same ioctls already done in
> >> >> vhost_vdpa_device_pci_realize(), so I think we should do them in a
> >> >> single place, and that place should be here.
> >> >>
> >> >> So, what about doing all the ioctls here, setting appropriate fields in
> >> >> VhostVdpaDevice, then using that fields in
> >> >> vhost_vdpa_device_pci_realize() after qdev_realize() to set
> >> >> `class_code`, `trans_devid`, and `nvectors`?
> >> >>
> >> >
> >> >vhost_vdpa_device_pci_realize()
> >> >  qdev_realize()
> >> >virtio_device_realize()
> >> >  vhost_vdpa_device_realize()
> >> >  virtio_bus_device_plugged()
> >> >virtio_pci_device_plugged()
> >> >
> >> >These three fields would be used in virtio_pci_device_plugged(), so it's
> too
> >> >late to set them after qdev_realize().  And they belong to VirtIOPCIProxy,
> so
> >> >we cannot set them in vhost_vdpa_device_realize() which is transport layer
> >> >independent.
> >>
> >> Maybe I expressed myself wrong, I was saying to open the file and make
> >> ioctls in vhost_vdpa_device_realize(). Save the values we use on both
> >> sides in VhostVdpaDevice (e.g. num_queues, queue_size) and use these
> >> saved values in virtio_pci_device_plugged() without re-opening the file
> >> again.
> >>
> >
> >This means we need to access VhostVdpaDevice in virtio_pci_device_plugged()?
> 
> Yep, or implement some functions to get those values.
> 

I prefer not to modify the VIRTIO or the VIRTIO_PCI core too much.
How about the following proposal?

struct VhostVdpaDevice {
...
void (*post_init)(VhostVdpaDevice *vdpa_dev);
...
}

vhost_vdpa_device_pci_post_init(VhostVdpaDevice *vdpa_dev)
{
...
vpci_dev->class_code = virtio_pci_get_class_id(vdpa_dev->vdev_id);
vpci_dev->trans_devid = virtio_pci_get_trans_devid(vdpa_dev->vdev_id);
vpci_dev->nvectors = 

RE: [PATCH v2 05/10] vdpa-dev: implement the realize interface

2022-03-07 Thread longpeng2--- via



> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Monday, March 7, 2022 4:24 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ;
> qemu-devel@nongnu.org
> Subject: Re: [PATCH v2 05/10] vdpa-dev: implement the realize interface
> 
> On Sat, Mar 05, 2022 at 07:07:54AM +, Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.) wrote:
> >
> >
> >> -Original Message-
> >> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> >> Sent: Wednesday, January 19, 2022 7:31 PM
> >> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> >> 
> >> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> >> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> >> ; Huangzhichao ;
> >> qemu-devel@nongnu.org
> >> Subject: Re: [PATCH v2 05/10] vdpa-dev: implement the realize interface
> >>
> >> On Mon, Jan 17, 2022 at 08:43:26PM +0800, Longpeng(Mike) via wrote:
> >> >From: Longpeng 
> >> >
> >> >Implements the .realize interface.
> >> >
> >> >Signed-off-by: Longpeng 
> >> >---
> >> > hw/virtio/vdpa-dev.c | 101 +++
> >> > include/hw/virtio/vdpa-dev.h |   8 +++
> >> > 2 files changed, 109 insertions(+)
> >> >
> >> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> >> >index b103768f33..bd28cf7a15 100644
> >> >--- a/hw/virtio/vdpa-dev.c
> >> >+++ b/hw/virtio/vdpa-dev.c
> >> >@@ -27,9 +27,109 @@ uint32_t vhost_vdpa_device_get_u32(int fd, unsigned 
> >> >long
> >> int cmd, Error **errp)
> >> > return val;
> >> > }
> >> >
> >> >+static void
> >> >+vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> >> >+{
> >> >+/* Nothing to do */
> >> >+}
> >> >+
> >> > static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> >> > {
> >> >+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >> >+VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> >> >+uint32_t vdev_id, max_queue_size;
> >> >+struct vhost_virtqueue *vqs;
> >> >+int i, ret;
> >> >+
> >> >+if (s->vdpa_dev_fd == -1) {
> >> >+s->vdpa_dev_fd = qemu_open(s->vdpa_dev, O_RDWR, errp);
> >>
> >> So, here we are re-opening the `vdpa_dev` again (without checking if it
> >> is NULL).
> >>
> >> And we re-do the same ioctls already done in
> >> vhost_vdpa_device_pci_realize(), so I think we should do them in a
> >> single place, and that place should be here.
> >>
> >> So, what about doing all the ioctls here, setting appropriate fields in
> >> VhostVdpaDevice, then using that fields in
> >> vhost_vdpa_device_pci_realize() after qdev_realize() to set
> >> `class_code`, `trans_devid`, and `nvectors`?
> >>
> >
> >vhost_vdpa_device_pci_realize()
> >  qdev_realize()
> >virtio_device_realize()
> >  vhost_vdpa_device_realize()
> >  virtio_bus_device_plugged()
> >virtio_pci_device_plugged()
> >
> >These three fields would be used in virtio_pci_device_plugged(), so it's too
> >late to set them after qdev_realize().  And they belong to VirtIOPCIProxy, so
> >we cannot set them in vhost_vdpa_device_realize() which is transport layer
> >independent.
> 
> Maybe I expressed myself wrong, I was saying to open the file and make
> ioctls in vhost_vdpa_device_realize(). Save the values we use on both
> sides in VhostVdpaDevice (e.g. num_queues, queue_size) and use these
> saved values in virtio_pci_device_plugged() without re-opening the file
> again.
> 

This means we need to access VhostVdpaDevice in virtio_pci_device_plugged()?

> Can't we set `class_code`, `trans_devid`, and `nvectors` after calling
> qdev_realize()?
> 
> Thanks,
> Stefano




RE: [PATCH v2 04/10] vdpa-dev: implement the instance_init/class_init interface

2022-03-07 Thread longpeng2--- via



> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Monday, March 7, 2022 5:13 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ;
> qemu-devel@nongnu.org
> Subject: Re: [PATCH v2 04/10] vdpa-dev: implement the instance_init/class_init
> interface
> 
> On Mon, Mar 07, 2022 at 08:55:35AM +, Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.) wrote:
> >
> >
> >> -Original Message-
> >> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> >> Sent: Monday, March 7, 2022 4:10 PM
> >> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> >> 
> >> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> >> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> >> ; Huangzhichao ;
> >> qemu-devel@nongnu.org
> >> Subject: Re: [PATCH v2 04/10] vdpa-dev: implement the
> instance_init/class_init
> >> interface
> >>
> >> Hi Longpeng,
> >>
> >> On Sat, Mar 05, 2022 at 06:06:42AM +, Longpeng (Mike, Cloud 
> >> Infrastructure
> >> Service Product Dept.) wrote:
> >> >Hi Stefano,
> >> >
> >> >> -Original Message-
> >> >> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> >> >> Sent: Wednesday, January 19, 2022 7:24 PM
> >> >> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> >> >> 
> >> >> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> >> >> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> >> >> ; Huangzhichao ;
> >> >> qemu-devel@nongnu.org
> >> >> Subject: Re: [PATCH v2 04/10] vdpa-dev: implement the
> >> instance_init/class_init
> >> >> interface
> >> >>
> >> >> On Mon, Jan 17, 2022 at 08:43:25PM +0800, Longpeng(Mike) via wrote:
> >> >> >From: Longpeng 
> >> >> >
> >> >> >Implements the .instance_init and the .class_init interface.
> >> >> >
> >> >> >Signed-off-by: Longpeng 
> >> >> >---
> >> >> > hw/virtio/vdpa-dev-pci.c | 52 ++-
> >> >> > hw/virtio/vdpa-dev.c | 81 +++-
> >> >> > include/hw/virtio/vdpa-dev.h |  5 +++
> >> >> > 3 files changed, 134 insertions(+), 4 deletions(-)
> >> >> >
> >> >> >diff --git a/hw/virtio/vdpa-dev-pci.c b/hw/virtio/vdpa-dev-pci.c
> >> >> >index a5a7b528a9..257538dbdd 100644
> >> >> >--- a/hw/virtio/vdpa-dev-pci.c
> >> >> >+++ b/hw/virtio/vdpa-dev-pci.c
> >> >> >@@ -25,12 +25,60 @@ struct VhostVdpaDevicePCI {
> >> >> >
> >> >> > static void vhost_vdpa_device_pci_instance_init(Object *obj)
> >> >> > {
> >> >> >-return;
> >> >> >+VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(obj);
> >> >> >+
> >> >> >+virtio_instance_init_common(obj, >vdev, sizeof(dev->vdev),
> >> >> >+TYPE_VHOST_VDPA_DEVICE);
> >> >> >+object_property_add_alias(obj, "bootindex", OBJECT(>vdev),
> >> >> >+  "bootindex");
> >> >> >+}
> >> >> >+
> >> >> >+static Property vhost_vdpa_device_pci_properties[] = {
> >> >> >+DEFINE_PROP_END_OF_LIST(),
> >> >> >+};
> >> >> >+
> >> >> >+static void
> >> >> >+vhost_vdpa_device_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
> >> >> >+{
> >> >> >+VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(vpci_dev);
> >> >> >+DeviceState *vdev = DEVICE(>vdev);
> >> >> >+uint32_t vdev_id;
> >> >> >+uint32_t num_queues;
> >> >> >+int fd;
> >> >> >+
> >> >> >+fd = qemu_open(dev->vdev.vdpa_dev, O_RDWR, errp);
> >> >>
> >> >> We should use `vdpa_dev_fd` if the user set it, and I think we should
> >> >> also check that `vdpa_dev` is not null.
> >> >>
> >> >
> >> >The user can set both 'vdpa_dev_fd' and 'vdpa_dev' now, but how
> >> >to make sure the 'vdpa_dev_fd' is really a FD of the 'vdpa_dev' ?
> >> >Maybe we should remove 'vdpa_dev_fd' from
> >> >'vhost_vdpa_device_properties',
> >> >so the user can only set 'vdpa_dev'.
> >>
> >> This is the same problem that would happen if the user passed a path any
> >> file or device (e.g. /dev/null). I believe that on the first operation
> >> on it (e.g. an ioctl) we would get an error and exit.
> >>
> >
> >Yes, but how about the 'vdpa_dev_fd' refers to /dev/vhost-vdpa-0 but
> >the 'vdpa_dev' refers to /dev/vhost-vdpa-1 ? Should we need to consider
> >this case ?
> 
> I think we can do as we already do in vhost-scsi and vhost-vsock. If fd
> is set we use that otherwise we use path.
> 
> If we want we can also print an error and exit if both are set, since
> IMHO should be set only one of the two.
> 

Makes sense, I'll try. Thanks.
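A minimal sketch of that fd-or-path handling in realize, using the field names
of this series (illustrative; not the exact code of the next revision):

    if (!v->vdpa_dev && v->vdpa_dev_fd == -1) {
        error_setg(errp, "both vdpa-dev and vdpa-dev-fd are missing");
        return;
    }
    if (v->vdpa_dev && v->vdpa_dev_fd != -1) {
        error_setg(errp, "both vdpa-dev and vdpa-dev-fd are set");
        return;
    }
    if (v->vdpa_dev_fd == -1) {
        v->vdpa_dev_fd = qemu_open(v->vdpa_dev, O_RDWR, errp);
        if (*errp) {
            return;
        }
    }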

> Thanks,
> Stefano




RE: [PATCH v2 04/10] vdpa-dev: implement the instance_init/class_init interface

2022-03-07 Thread longpeng2--- via



> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Monday, March 7, 2022 4:10 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ;
> qemu-devel@nongnu.org
> Subject: Re: [PATCH v2 04/10] vdpa-dev: implement the instance_init/class_init
> interface
> 
> Hi Longpeng,
> 
> On Sat, Mar 05, 2022 at 06:06:42AM +, Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.) wrote:
> >Hi Stefano,
> >
> >> -Original Message-
> >> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> >> Sent: Wednesday, January 19, 2022 7:24 PM
> >> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> >> 
> >> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> >> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> >> ; Huangzhichao ;
> >> qemu-devel@nongnu.org
> >> Subject: Re: [PATCH v2 04/10] vdpa-dev: implement the
> instance_init/class_init
> >> interface
> >>
> >> On Mon, Jan 17, 2022 at 08:43:25PM +0800, Longpeng(Mike) via wrote:
> >> >From: Longpeng 
> >> >
> >> >Implements the .instance_init and the .class_init interface.
> >> >
> >> >Signed-off-by: Longpeng 
> >> >---
> >> > hw/virtio/vdpa-dev-pci.c | 52 ++-
> >> > hw/virtio/vdpa-dev.c | 81 +++-
> >> > include/hw/virtio/vdpa-dev.h |  5 +++
> >> > 3 files changed, 134 insertions(+), 4 deletions(-)
> >> >
> >> >diff --git a/hw/virtio/vdpa-dev-pci.c b/hw/virtio/vdpa-dev-pci.c
> >> >index a5a7b528a9..257538dbdd 100644
> >> >--- a/hw/virtio/vdpa-dev-pci.c
> >> >+++ b/hw/virtio/vdpa-dev-pci.c
> >> >@@ -25,12 +25,60 @@ struct VhostVdpaDevicePCI {
> >> >
> >> > static void vhost_vdpa_device_pci_instance_init(Object *obj)
> >> > {
> >> >-return;
> >> >+VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(obj);
> >> >+
> >> >+virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
> >> >+TYPE_VHOST_VDPA_DEVICE);
> >> >+object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
> >> >+  "bootindex");
> >> >+}
> >> >+
> >> >+static Property vhost_vdpa_device_pci_properties[] = {
> >> >+DEFINE_PROP_END_OF_LIST(),
> >> >+};
> >> >+
> >> >+static void
> >> >+vhost_vdpa_device_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
> >> >+{
> >> >+VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(vpci_dev);
> >> >+DeviceState *vdev = DEVICE(&dev->vdev);
> >> >+uint32_t vdev_id;
> >> >+uint32_t num_queues;
> >> >+int fd;
> >> >+
> >> >+fd = qemu_open(dev->vdev.vdpa_dev, O_RDWR, errp);
> >>
> >> We should use `vdpa_dev_fd` if the user set it, and I think we should
> >> also check that `vdpa_dev` is not null.
> >>
> >
> >The user can set both 'vdpa_dev_fd' and 'vdpa_dev' now, but how
> >to make sure the 'vdpa_dev_fd' is really a FD of the 'vdpa_dev' ?
> >Maybe we should remove 'vdpa_dev_fd' from
> >'vhost_vdpa_device_properties',
> >so the user can only set 'vdpa_dev'.
> 
> This is the same problem that would happen if the user passed a path any
> file or device (e.g. /dev/null). I believe that on the first operation
> on it (e.g. an ioctl) we would get an error and exit.
> 

Yes, but what if 'vdpa_dev_fd' refers to /dev/vhost-vdpa-0 while
'vdpa_dev' refers to /dev/vhost-vdpa-1? Do we need to consider
this case?

> I think we should allow to specify an fd (as we already do for other
> vhost devices), because it's a common use case when qemu is launched
> from higher layers (e.g. libvirt).
> 
> >
> >> >+if (*errp) {
> >> >+return;
> >> >+}
> >> >+
> >> >+vdev_id = vhost_vdpa_device_get_u32(fd, VHOST_VDPA_GET_DEVICE_ID,
> errp);
> >> >+if (*errp) {
> >> >+qemu_close(fd);
> >> >+return;
> >> >+}
> >> >+
> >> >+num_queues = vhost_vdpa_device_get_u32(fd, VHOST_VDPA_GET_VQS_NUM,
> >> errp);
> >>   ^
> >> The build fails here, I think this should be VHOST_VDPA_GET_VQS_COUNT
> >>
> >
> >Yes, I sent a wrong version, I'll send v3 later.
> 
> Thanks,
> Stefano




RE: [PATCH v2 06/10] vdpa-dev: implement the unrealize interface

2022-03-04 Thread longpeng2--- via



> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Wednesday, January 19, 2022 7:36 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ;
> qemu-devel@nongnu.org
> Subject: Re: [PATCH v2 06/10] vdpa-dev: implement the unrealize interface
> 
> On Mon, Jan 17, 2022 at 08:43:27PM +0800, Longpeng(Mike) via wrote:
> >From: Longpeng 
> >
> >Implements the .unrealize interface.
> >
> >Signed-off-by: Longpeng 
> >---
> > hw/virtio/vdpa-dev.c | 24 +++-
> > 1 file changed, 23 insertions(+), 1 deletion(-)
> >
> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> >index bd28cf7a15..e5691d02bb 100644
> >--- a/hw/virtio/vdpa-dev.c
> >+++ b/hw/virtio/vdpa-dev.c
> >@@ -132,9 +132,31 @@ out:
> > s->vdpa_dev_fd = -1;
> > }
> >
> >+static void vhost_vdpa_vdev_unrealize(VhostVdpaDevice *s)
> >+{
> >+VirtIODevice *vdev = VIRTIO_DEVICE(s);
> >+int i;
> >+
> >+for (i = 0; i < s->num_queues; i++) {
> >+virtio_delete_queue(s->virtqs[i]);
> >+}
> >+g_free(s->virtqs);
> >+virtio_cleanup(vdev);
> >+
> >+g_free(s->config);
> 
> Is there a particular reason for these steps in a separate function?
> 
> >+}
> >+
> > static void vhost_vdpa_device_unrealize(DeviceState *dev)
> > {
> >-return;
> >+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >+VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> >+
> >+virtio_set_status(vdev, 0);
> >+vhost_vdpa_vdev_unrealize(s);
> >+g_free(s->dev.vqs);
> >+vhost_dev_cleanup(&s->dev);
> >+qemu_close(s->vdpa_dev_fd);
> >+s->vdpa_dev_fd = -1;
> > }
> 
> Maybe we can have all steps (in the reverse order of
> vhost_vdpa_device_realize) in vhost_vdpa_device_unrealize().
> 

Makes sense. Will do.
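
A minimal sketch of that shape, tearing things down in the reverse order of
vhost_vdpa_device_realize() (field names follow the patch above; this is
illustrative, not the final code):

static void vhost_vdpa_device_unrealize(DeviceState *dev)
{
    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
    VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
    int i;

    virtio_set_status(vdev, 0);

    for (i = 0; i < s->num_queues; i++) {
        virtio_delete_queue(s->virtqs[i]);
    }
    g_free(s->virtqs);
    virtio_cleanup(vdev);
    g_free(s->config);

    g_free(s->dev.vqs);
    vhost_dev_cleanup(&s->dev);
    qemu_close(s->vdpa_dev_fd);
    s->vdpa_dev_fd = -1;
}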

> Stefano




RE: [PATCH v2 05/10] vdpa-dev: implement the realize interface

2022-03-04 Thread longpeng2--- via



> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Wednesday, January 19, 2022 7:31 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ;
> qemu-devel@nongnu.org
> Subject: Re: [PATCH v2 05/10] vdpa-dev: implement the realize interface
> 
> On Mon, Jan 17, 2022 at 08:43:26PM +0800, Longpeng(Mike) via wrote:
> >From: Longpeng 
> >
> >Implements the .realize interface.
> >
> >Signed-off-by: Longpeng 
> >---
> > hw/virtio/vdpa-dev.c | 101 +++
> > include/hw/virtio/vdpa-dev.h |   8 +++
> > 2 files changed, 109 insertions(+)
> >
> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> >index b103768f33..bd28cf7a15 100644
> >--- a/hw/virtio/vdpa-dev.c
> >+++ b/hw/virtio/vdpa-dev.c
> >@@ -27,9 +27,109 @@ uint32_t vhost_vdpa_device_get_u32(int fd, unsigned long
> int cmd, Error **errp)
> > return val;
> > }
> >
> >+static void
> >+vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> >+{
> >+/* Nothing to do */
> >+}
> >+
> > static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> > {
> >+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >+VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> >+uint32_t vdev_id, max_queue_size;
> >+struct vhost_virtqueue *vqs;
> >+int i, ret;
> >+
> >+if (s->vdpa_dev_fd == -1) {
> >+s->vdpa_dev_fd = qemu_open(s->vdpa_dev, O_RDWR, errp);
> 
> So, here we are re-opening the `vdpa_dev` again (without checking if it
> is NULL).
> 
> And we re-do the same ioctls already done in
> vhost_vdpa_device_pci_realize(), so I think we should do them in a
> single place, and that place should be here.
> 
> So, what about doing all the ioctls here, setting appropriate fields in
> VhostVdpaDevice, then using that fields in
> vhost_vdpa_device_pci_realize() after qdev_realize() to set
> `class_code`, `trans_devid`, and `nvectors`?
> 

vhost_vdpa_device_pci_realize()
  qdev_realize()
virtio_device_realize()
  vhost_vdpa_device_realize()
  virtio_bus_device_plugged()
virtio_pci_device_plugged()

These three fields are used in virtio_pci_device_plugged(), so it is too late
to set them after qdev_realize(). And they belong to VirtIOPCIProxy, so we
cannot set them in vhost_vdpa_device_realize(), which is transport-layer
independent.
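
So the ordering constraint in the PCI proxy stays roughly as in the hunk
above (sketch only, to make the dependency explicit):

    /* must be set before qdev_realize(): virtio_pci_device_plugged()
     * reads these VirtIOPCIProxy fields while plugging the device */
    vpci_dev->class_code = virtio_pci_get_class_id(vdev_id);
    vpci_dev->trans_devid = virtio_pci_get_trans_devid(vdev_id);
    vpci_dev->nvectors = num_queues + 1;  /* one for config irq, one per vq */
    qdev_realize(vdev, BUS(&vpci_dev->bus), errp);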

> >+if (*errp) {
> >+return;
> >+}
> >+}
> >+s->vdpa.device_fd = s->vdpa_dev_fd;
> >+
> >+max_queue_size = vhost_vdpa_device_get_u32(s->vdpa_dev_fd,
> >+   VHOST_VDPA_GET_VRING_NUM, 
> >errp);
> >+if (*errp) {
> >+goto out;
> >+}
> >+
> >+if (s->queue_size > max_queue_size) {
> >+error_setg(errp, "vhost-vdpa-device: invalid queue_size: %d
> (max:%d)",
> >+   s->queue_size, max_queue_size);
> >+goto out;
> >+} else if (!s->queue_size) {
> >+s->queue_size = max_queue_size;
> >+}
> >+
> >+s->num_queues = vhost_vdpa_device_get_u32(s->vdpa_dev_fd,
> >+  VHOST_VDPA_GET_VQS_NUM, errp);
>  ^
> VHOST_VDPA_GET_VQS_COUNT
> 

OK, thanks :)

> >+if (*errp) {
> >+goto out;
> >+}
> >+
> >+if (!s->num_queues || s->num_queues > VIRTIO_QUEUE_MAX) {
> >+error_setg(errp, "invalid number of virtqueues: %u (max:%u)",
> >+   s->num_queues, VIRTIO_QUEUE_MAX);
> >+goto out;
> >+}
> >+
> >+s->dev.nvqs = s->num_queues;
> >+vqs = g_new0(struct vhost_virtqueue, s->dev.nvqs);
> >+s->dev.vqs = vqs;
> >+s->dev.vq_index = 0;
> >+s->dev.vq_index_end = s->dev.nvqs;
> >+s->dev.backend_features = 0;
> >+s->started = false;
> >+
> >+ret = vhost_dev_init(&s->dev, &s->vdpa, VHOST_BACKEND_TYPE_VDPA, 0,
> NULL);
> >+if (ret < 0) {
> >+error_setg(errp, "vhost-vdpa-device: vhost initialization
> failed: %s",
> >+   strerror(-ret));
> >+goto free_vqs;
> >+}
> >+
> >+vdev_id = vhost_vdpa_device_get_u32(s->vdpa_dev_fd,
> >+VHOST_VDPA_GET_DEVICE_ID, errp);
> >+if (ret < 0) {
> >+error_setg(errp, "vhost-vdpa-device: vhost get device id failed: 
> >%s",
> >+   strerror(-ret));
> >+goto vhost_cleanup;
> >+}
> >+
> >+s->config_size = vhost_vdpa_device_get_u32(s->vdpa_dev_fd,
> >+   VHOST_VDPA_GET_CONFIG_SIZE,
> errp);
> >+if (*errp) {
> >+goto vhost_cleanup;
> >+}
> >+s->config = g_malloc0(s->config_size);
> >+
> >+ret = vhost_dev_get_config(&s->dev, s->config, s->config_size, NULL);
> >+if (ret < 0) {
> >+error_setg(errp, "vhost-vdpa-device: get config failed");
> >+

RE: [PATCH v2 04/10] vdpa-dev: implement the instance_init/class_init interface

2022-03-04 Thread longpeng2--- via
Hi Stefano,

> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Wednesday, January 19, 2022 7:24 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: stefa...@redhat.com; m...@redhat.com; coh...@redhat.com;
> pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ;
> qemu-devel@nongnu.org
> Subject: Re: [PATCH v2 04/10] vdpa-dev: implement the instance_init/class_init
> interface
> 
> On Mon, Jan 17, 2022 at 08:43:25PM +0800, Longpeng(Mike) via wrote:
> >From: Longpeng 
> >
> >Implements the .instance_init and the .class_init interface.
> >
> >Signed-off-by: Longpeng 
> >---
> > hw/virtio/vdpa-dev-pci.c | 52 ++-
> > hw/virtio/vdpa-dev.c | 81 +++-
> > include/hw/virtio/vdpa-dev.h |  5 +++
> > 3 files changed, 134 insertions(+), 4 deletions(-)
> >
> >diff --git a/hw/virtio/vdpa-dev-pci.c b/hw/virtio/vdpa-dev-pci.c
> >index a5a7b528a9..257538dbdd 100644
> >--- a/hw/virtio/vdpa-dev-pci.c
> >+++ b/hw/virtio/vdpa-dev-pci.c
> >@@ -25,12 +25,60 @@ struct VhostVdpaDevicePCI {
> >
> > static void vhost_vdpa_device_pci_instance_init(Object *obj)
> > {
> >-return;
> >+VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(obj);
> >+
> >+virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
> >+TYPE_VHOST_VDPA_DEVICE);
> >+object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
> >+  "bootindex");
> >+}
> >+
> >+static Property vhost_vdpa_device_pci_properties[] = {
> >+DEFINE_PROP_END_OF_LIST(),
> >+};
> >+
> >+static void
> >+vhost_vdpa_device_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
> >+{
> >+VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(vpci_dev);
> >+DeviceState *vdev = DEVICE(&dev->vdev);
> >+uint32_t vdev_id;
> >+uint32_t num_queues;
> >+int fd;
> >+
> >+fd = qemu_open(dev->vdev.vdpa_dev, O_RDWR, errp);
> 
> We should use `vdpa_dev_fd` if the user set it, and I think we should
> also check that `vdpa_dev` is not null.
> 

The user can set both 'vdpa_dev_fd' and 'vdpa_dev' now, but how can we make
sure that 'vdpa_dev_fd' really is an FD of 'vdpa_dev'?
Maybe we should remove 'vdpa_dev_fd' from 'vhost_vdpa_device_properties',
so the user can only set 'vdpa_dev'.

> >+if (*errp) {
> >+return;
> >+}
> >+
> >+vdev_id = vhost_vdpa_device_get_u32(fd, VHOST_VDPA_GET_DEVICE_ID, errp);
> >+if (*errp) {
> >+qemu_close(fd);
> >+return;
> >+}
> >+
> >+num_queues = vhost_vdpa_device_get_u32(fd, VHOST_VDPA_GET_VQS_NUM,
> errp);
>   ^
> The build fails here, I think this should be VHOST_VDPA_GET_VQS_COUNT
> 

Yes, I sent the wrong version; I'll send v3 later.

> >+if (*errp) {
> >+qemu_close(fd);
> >+return;
> >+}
> >+
> >+dev->vdev.vdpa_dev_fd = fd;
> >+vpci_dev->class_code = virtio_pci_get_class_id(vdev_id);
> >+vpci_dev->trans_devid = virtio_pci_get_trans_devid(vdev_id);
> >+/* one for config interrupt, one per vq */
> >+vpci_dev->nvectors = num_queues + 1;
> >+qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
> > }
> >
> > static void vhost_vdpa_device_pci_class_init(ObjectClass *klass, void
> > *data)
> > {
> >-return;
> >+DeviceClass *dc = DEVICE_CLASS(klass);
> >+VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
> >+
> >+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> >+device_class_set_props(dc, vhost_vdpa_device_pci_properties);
> >+k->realize = vhost_vdpa_device_pci_realize;
> > }
> >
> > static const VirtioPCIDeviceTypeInfo vhost_vdpa_device_pci_info = {
> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> >index f4f92b90b0..b103768f33 100644
> >--- a/hw/virtio/vdpa-dev.c
> >+++ b/hw/virtio/vdpa-dev.c
> >@@ -15,16 +15,93 @@
> > #include "sysemu/sysemu.h"
> > #include "sysemu/runstate.h"
> >
> >-static void vhost_vdpa_device_class_init(ObjectClass *klass, void *data)
> >+uint32_t vhost_vdpa_device_get_u32(int fd, unsigned long int cmd, Error
> **errp)
> >+{
> >+uint32_t val = (uint32_t)-1;
> >+
> >+if (ioctl(fd, cmd, &val) < 0) {
> >+error_setg(errp, "vhost-vdpa-device: cmd 0x%lx failed: %s",
> >+   cmd, strerror(errno));
> >+}
> >+
> >+return val;
> >+}
> >+
> >+static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> > {
> > return;
> > }
> >
> >-static void vhost_vdpa_device_instance_init(Object *obj)
> >+static void vhost_vdpa_device_unrealize(DeviceState *dev)
> >+{
> >+return;
> >+}
> >+
> >+static void
> >+vhost_vdpa_device_get_config(VirtIODevice *vdev, uint8_t *config)
> >+{
> >+return;
> >+}
> >+
> >+static void
> >+vhost_vdpa_device_set_config(VirtIODevice *vdev, const uint8_t *config)
> >+{
> >+return;
> >+}
> >+
> >+static uint64_t vhost_vdpa_device_get_features(VirtIODevice *vdev,
> >+   

RE: [PATCH] thread-posix: optimize qemu_sem_timedwait with zero timeout

2022-02-23 Thread longpeng2--- via



> -Original Message-
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo 
> Bonzini
> Sent: Wednesday, February 23, 2022 4:36 PM
> To: qemu-devel@nongnu.org
> Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Subject: [PATCH] thread-posix: optimize qemu_sem_timedwait with zero timeout
> 
> In this case there is no need to call pthread_cond_timedwait; the
> function is just a trywait and waiting on the condition variable would
> always time out.
> 
> Based-on: <20220222090507.2028-1-longpe...@huawei.com>
> Cc: "Longpeng(Mike)" 
> Signed-off-by: Paolo Bonzini 
> ---
>  util/qemu-thread-posix.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
> index f2ce47d36b..89c23f1d64 100644
> --- a/util/qemu-thread-posix.c
> +++ b/util/qemu-thread-posix.c
> @@ -282,8 +282,12 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms)
>  compute_abs_deadline(&ts, ms);
>  qemu_mutex_lock(&sem->mutex);
>  while (sem->count == 0) {
> -rc = qemu_cond_timedwait_ts(&sem->cond, &sem->mutex, &ts,
> -__FILE__, __LINE__);
> +if (ms == 0) {
> +rc = false;
> +} else {
> +rc = qemu_cond_timedwait_ts(&sem->cond, &sem->mutex, &ts,
> +__FILE__, __LINE__);
> +}
>  if (!rc) { /* timeout */
>  break;
>  }
> --
> 2.34.1

Reviewed-by: Longpeng 

Thanks.
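
As a usage note, with this change a zero timeout behaves like a non-blocking
trywait, so a hypothetical caller can poll without sleeping (assuming the
usual 0-on-success return convention):

    /* illustrative only */
    if (qemu_sem_timedwait(&sem, 0) == 0) {
        /* a unit was available and has been consumed */
    } else {
        /* would have blocked; do something else instead */
    }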



RE: [RFC 0/2] qemu-sem-posix: use monotonic clock instead

2022-02-21 Thread longpeng2--- via


> -Original Message-
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo 
> Bonzini
> Sent: Monday, February 21, 2022 7:31 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> ; berra...@redhat.com; m...@redhat.com
> Cc: qemu-devel@nongnu.org; Gonglei (Arei) 
> Subject: Re: [RFC 0/2] qemu-sem-posix: use monotonic clock instead
> 
> On 2/21/22 10:56, Longpeng(Mike) via wrote:
> > The qemu_sem_timedwait() uses system time as default, it would be affected
> by
> > changes to the system time. In the real scenario, the time that goes faster
> or
> > slower is a common case and the NTP service could help us to sync time
> > periodically.
> >
> > This patchset uses monotonic clock instead of the realtime clock, this could
> > make sure we would not be affected by the system time anymore.
> 
> This looks good, I don't think there are cases where a more optimized
> semaphore is necessary (if there were, we could introduce a futex
> fallback on Linux).
> 
> However, pthread_condattr_t need not be in the struct.  The attributes
> can be allocated on the stack, because they do not have to remain alive
> after pthread_cond_init.
> 

OK, will do in the next version, thanks!
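
A small sketch of that (stack-allocated attributes, monotonic clock; error
handling omitted, and this is illustrative rather than the final patch):

#include <pthread.h>
#include <time.h>

static void cond_init_monotonic(pthread_cond_t *cond)
{
    pthread_condattr_t attr;            /* only needed during init */

    pthread_condattr_init(&attr);
    pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
    pthread_cond_init(cond, &attr);
    pthread_condattr_destroy(&attr);
}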

> Thanks,
> 
> Paolo
> 
> > Longpeng (Mike) (2):
> >sem-posix: remove the posix semaphore support
> >sem-posix: use monotonic clock instead
> >
> >   include/qemu/thread-posix.h |  5 +--
> >   meson.build | 12 ++-
> >   util/qemu-thread-posix.c| 82
> +++--
> >   3 files changed, 39 insertions(+), 60 deletions(-)
> >



RE: [RFC 1/2] sem-posix: remove the posix semaphore support

2022-02-21 Thread longpeng2--- via
Hi Daniel,

> -Original Message-
> From: Daniel P. Berrangé [mailto:berra...@redhat.com]
> Sent: Monday, February 21, 2022 7:12 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: pbonz...@redhat.com; m...@redhat.com; qemu-devel@nongnu.org; Gonglei 
> (Arei)
> 
> Subject: Re: [RFC 1/2] sem-posix: remove the posix semaphore support
> 
> On Mon, Feb 21, 2022 at 05:56:16PM +0800, Longpeng(Mike) via wrote:
> > POSIX specifies an absolute time for sem_timedwait(), it would be
> > affected if the system time is changing, but there is not a relative
> > time or monotonic clock version of sem_timedwait, so we cannot gain
> > from POSIX semaphore anymore.
> 
> It doesn't appear in any man pages on my systems, but there is a
> new-ish API  sem_clockwait() that accepts a choice of clock as a
> parameter.
> 
> This is apparently a proposed POSIX standard API introduced in
> glibc 2.30, along with several others:
> 

But the API is only supported in glibc.

https://www.gnu.org/software/gnulib/manual/html_node/sem_005fclockwait.html

> https://sourceware.org/legacy-ml/libc-announce/2019/msg1.html
> 
> [quote]
> * Add new POSIX-proposed pthread_cond_clockwait, pthread_mutex_clocklock,
>   pthread_rwlock_clockrdlock, pthread_rwlock_clockwrlock and sem_clockwait
>   functions.  These behave similarly to their "timed" equivalents, but also
>   accept a clockid_t parameter to determine which clock their timeout should
>   be measured against.  All functions allow waiting against CLOCK_MONOTONIC
>   and CLOCK_REALTIME.  The decision of which clock to be used is made at the
>   time of the wait (unlike with pthread_condattr_setclock, which requires
>   the clock choice at initialization time).
> [/quote]
> 

It seems pthread_condattr_setclock() can meet our requirement here; it's OK
for us to choose the clock at initialization time.

> > diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
> > index b792e6e..5466608 100644
> > --- a/include/qemu/thread-posix.h
> > +++ b/include/qemu/thread-posix.h
> > @@ -27,13 +27,9 @@ struct QemuCond {
> >  };
> >
> >  struct QemuSemaphore {
> > -#ifndef CONFIG_SEM_TIMEDWAIT
> >  pthread_mutex_t lock;
> >  pthread_cond_t cond;
> >  unsigned int count;
> > -#else
> > -sem_t sem;
> > -#endif
> >  bool initialized;
> >  };
> 
> As a point of history, the original  code only used sem_t. The pthreads
> based fallback was introduced by Paolo in
> 
>   commit c166cb72f1676855816340666c3b618beef4b976
>   Author: Paolo Bonzini 
>   Date:   Fri Nov 2 15:43:21 2012 +0100
> 
> semaphore: implement fallback counting semaphores with mutex+condvar
> 
> OpenBSD and Darwin do not have sem_timedwait.  Implement a fallback
> for them.
> 
> Signed-off-by: Paolo Bonzini 
> Signed-off-by: Anthony Liguori 
> 
> I'm going to assume this fallback impl is less efficient than the
> native sem_t impl as the reason for leaving the original impl, or
> maybe Paolo just want to risk accidental bugs by removing the
> existing usage ?
> 

Paolo has replied; it seems this change is acceptable, so I'll continue
working on this solution. Thanks :)

> 
> Regards,
> Daniel
> --
> |: https://berrange.com  -o-
> https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-
> https://www.instagram.com/dberrange :|



RE: [RFC 2/2] sem-posix: use monotonic clock instead

2022-02-21 Thread longpeng2--- via


> -Original Message-
> From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo 
> Bonzini
> Sent: Monday, February 21, 2022 7:42 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> ; berra...@redhat.com; m...@redhat.com
> Cc: qemu-devel@nongnu.org; Gonglei (Arei) 
> Subject: Re: [RFC 2/2] sem-posix: use monotonic clock instead
> 
> On 2/21/22 10:56, Longpeng(Mike) via wrote:
> > +long now_nsec;
> > +#ifdef CONFIG_PTHREAD_CONDATTR_SETCLOCK
> > +struct timespec now;
> > +clock_gettime(CLOCK_MONOTONIC, &now);
> > +now_sec = now.tv_sec;
> > +now_nsec = now.tv_nsec;
> > +#else
> >   struct timeval tv;
> >   gettimeofday(&tv, NULL);
> > -ts->tv_nsec = tv.tv_usec * 1000 + (ms % 1000) * 1000000;
> > -ts->tv_sec = tv.tv_sec + ms / 1000;
> > +now_sec = tv.tv_sec;
> > +now_nsec = tv.tv_usec * 1000;
> > +#endif
> > +
> 
> Perhaps this might minimize the amount of conditional code, too:
> 
> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
> index 903fa33965..4743d7b714 100644
> --- a/util/qemu-thread-posix.c
> +++ b/util/qemu-thread-posix.c
> @@ -40,10 +40,14 @@ static void error_exit(int err, const char *msg)
> 
>   static void compute_abs_deadline(struct timespec *ts, int ms)
>   {
> -struct timeval tv;
> -gettimeofday(&tv, NULL);
> -ts->tv_nsec = tv.tv_usec * 1000 + (ms % 1000) * 1000000;
> -ts->tv_sec = tv.tv_sec + ms / 1000;
> +#ifdef CONFIG_PTHREAD_CONDATTR_SETCLOCK
> +clock_gettime(CLOCK_MONOTONIC, ts);
> +#else
> +clock_gettime(CLOCK_REALTIME, ts);
> +#endif
> +
> +ts->tv_nsec += (ms % 1000) * 1000000;
> +ts->tv_sec += ms / 1000;
>   if (ts->tv_nsec >= 1000000000) {
>   ts->tv_sec++;
>   ts->tv_nsec -= 1000000000;
> 
> 
> Finally, the conditional variables initialization qemu_cond_init must
> also use pthread_condattr_setclock.
> 

Makes sense! I'll optimize the code in the next version. Thanks.

> Paolo


RE: [RFC 05/10] vdpa-dev: implement the realize interface

2022-01-17 Thread longpeng2--- via



> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Thursday, January 6, 2022 7:34 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: m...@redhat.com; jasow...@redhat.com; sgarz...@redhat.com;
> coh...@redhat.com; pbonz...@redhat.com; Gonglei (Arei)
> ; Yechuan ; Huangzhichao
> ; qemu-devel@nongnu.org
> Subject: Re: [RFC 05/10] vdpa-dev: implement the realize interface
> 
> On Thu, Jan 06, 2022 at 03:02:37AM +, Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.) wrote:
> >
> >
> > > -Original Message-
> > > From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> > > Sent: Wednesday, January 5, 2022 6:18 PM
> > > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > 
> > > Cc: m...@redhat.com; jasow...@redhat.com; sgarz...@redhat.com;
> > > coh...@redhat.com; pbonz...@redhat.com; Gonglei (Arei)
> > > ; Yechuan ; Huangzhichao
> > > ; qemu-devel@nongnu.org
> > > Subject: Re: [RFC 05/10] vdpa-dev: implement the realize interface
> > >
> > > On Wed, Jan 05, 2022 at 08:58:55AM +0800, Longpeng(Mike) wrote:
> > > > From: Longpeng 
> > > >
> > > > Implements the .realize interface.
> > > >
> > > > Signed-off-by: Longpeng 
> > > > ---
> > > >  hw/virtio/vdpa-dev.c | 114 +++
> > > >  include/hw/virtio/vdpa-dev.h |   8 +++
> > > >  2 files changed, 122 insertions(+)
> > > >
> > > > diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> > > > index 790117fb3b..2d534d837a 100644
> > > > --- a/hw/virtio/vdpa-dev.c
> > > > +++ b/hw/virtio/vdpa-dev.c
> > > > @@ -15,9 +15,122 @@
> > > >  #include "sysemu/sysemu.h"
> > > >  #include "sysemu/runstate.h"
> > > >
> > > > +static void
> > > > +vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue
> *vq)
> > > > +{
> > > > +/* Nothing to do */
> > > > +}
> > > > +
> > > > +static int vdpa_dev_get_info_by_fd(int fd, uint64_t cmd, Error **errp)
> > >
> > > This looks similar to the helper function in a previous patch but this
> > > time the return value type is int instead of uint32_t. Please make the
> > > types consistent.
> > >
> >
> > OK.
> >
> > > > +{
> > > > +int val;
> > > > +
> > > > +if (ioctl(fd, cmd, &val) < 0) {
> > > > +error_setg(errp, "vhost-vdpa-device: cmd 0x%lx failed: %s",
> > > > +   cmd, strerror(errno));
> > > > +return -1;
> > > > +}
> > > > +
> > > > +return val;
> > > > +}
> > > > +
> > > > +static inline int vdpa_dev_get_queue_size(int fd, Error **errp)
> > > > +{
> > > > +return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VRING_NUM, errp);
> > > > +}
> > > > +
> > > > +static inline int vdpa_dev_get_vqs_num(int fd, Error **errp)
> > > > +{
> > > > +return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VQS_NUM, errp);
> > > > +}
> > > > +
> > > > +static inline int vdpa_dev_get_config_size(int fd, Error **errp)
> > > > +{
> > > > +return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_CONFIG_SIZE,
> errp);
> > > > +}
> > > > +
> > > >  static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> > > >  {
> > > > +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > > > +VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> > > > +uint32_t device_id;
> > > > +int max_queue_size;
> > > > +int fd;
> > > > +int i, ret;
> > > > +
> > > > +fd = qemu_open(s->vdpa_dev, O_RDWR, errp);
> > > > +if (fd == -1) {
> > > > +return;
> > > > +}
> > > > +s->vdpa.device_fd = fd;
> > >
> > > This is the field I suggest exposing as a QOM property so it can be set
> > > from the proxy object (e.g. when the PCI proxy opens the vdpa device
> > > before our .realize() function is called).
> > >
> >
> > OK.
> >
> > > > +
> > > > +max_queue_size = vdpa_dev_get_queue_size(fd, errp);
> > > > +if (*errp) {
> > > > +goto out;
> > > > +}
> > > > +
> > > > +if (s->queue_size > max_queue_size) {
> > > > +error_setg(errp, "vhost-vdpa-device: invalid queue_size: %d
> > > (max:%d)",
> > > > +   s->queue_size, max_queue_size);
> > > > +goto out;
> > > > +} else if (!s->queue_size) {
> > > > +s->queue_size = max_queue_size;
> > > > +}
> > > > +
> > > > +ret = vdpa_dev_get_vqs_num(fd, errp);
> > > > +if (*errp) {
> > > > +goto out;
> > > > +}
> > > > +
> > > > +s->dev.nvqs = ret;
> > >
> > > There is no input validation because we trust the kernel vDPA return
> > > values. That seems okay for now but if there is a vhost-user version of
> > > this in the future then input validation will be necessary to achieve
> > > isolation between QEMU and the vhost-user processes. I suggest including
> > > input validation code right away because it's harder to audit the code
> > > and fix missing input validation later on.
> > >
> >
> > Make sense!
> >
> > Should we only need to validate the upper boundary (e.g. < VIRTIO_QUEUE_MAX)?
> Careful, ret is currently an int so negative 

RE: [RFC 01/10] virtio: get class_id and pci device id by the virtio id

2022-01-09 Thread longpeng2--- via



> -Original Message-
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Monday, January 10, 2022 1:43 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: stefa...@redhat.com; jasow...@redhat.com; sgarz...@redhat.com;
> coh...@redhat.com; pbonz...@redhat.com; Gonglei (Arei)
> ; Yechuan ; Huangzhichao
> ; qemu-devel@nongnu.org
> Subject: Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio
> id
> 
> On Wed, Jan 05, 2022 at 08:58:51AM +0800, Longpeng(Mike) wrote:
> > From: Longpeng 
> >
> > Add helpers to get the "Transitional PCI Device ID" and "class_id" of the
> > deivce which is specificed by the "Virtio Device ID".
> 
> ton of typos here.
> 

Will fix them all in V2.

> > These helpers will be used to build the generic vDPA device later.
> >
> > Signed-off-by: Longpeng 
> > ---
> >  hw/virtio/virtio-pci.c | 93 ++
> >  hw/virtio/virtio-pci.h |  4 ++
> >  2 files changed, 97 insertions(+)
> >
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index 750aa47ec1..843085c4ea 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -19,6 +19,7 @@
> >
> >  #include "exec/memop.h"
> >  #include "standard-headers/linux/virtio_pci.h"
> > +#include "standard-headers/linux/virtio_ids.h"
> >  #include "hw/boards.h"
> >  #include "hw/virtio/virtio.h"
> >  #include "migration/qemu-file-types.h"
> > @@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, int n,
> QEMUFile *f)
> >  return 0;
> >  }
> >
> > +typedef struct VirtIOPCIIDInfo {
> > +uint16_t vdev_id; /* virtio id */
> > +uint16_t pdev_id; /* pci device id */
> > +uint16_t class_id;
> > +} VirtIOPCIIDInfo;
> 
> 
> if this is transitional as comment says make it explicit
> in the names and comments.
> 

OK.

> > +
> > +static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
> > +{
> > +.vdev_id = VIRTIO_ID_NET,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
> > +.class_id = PCI_CLASS_NETWORK_ETHERNET,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_BLOCK,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
> > +.class_id = PCI_CLASS_STORAGE_SCSI,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_CONSOLE,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
> > +.class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_SCSI,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
> > +.class_id = PCI_CLASS_STORAGE_SCSI,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_9P,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
> > +.class_id = PCI_BASE_CLASS_NETWORK,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_VSOCK,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
> > +.class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_IOMMU,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_MEM,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_MEM,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_PMEM,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_PMEM,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_RNG,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_RNG,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_BALLOON,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_BALLOON,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +};
> > +
> 
> 
> this is the list from the spec:
> 
> 
> So this is the list from the spec:
> 
> 0x1000 network card
> 0x1001 block device
> 0x1002 memory ballooning (traditional)
> 0x1003 console
> 0x1004 SCSI host
> 0x1005 entropy source
> 0x1009 9P transport
> 

Why are the following device IDs introduced? They are non-transitional
devices.

#define PCI_DEVICE_ID_VIRTIO_VSOCK   0x1012
#define PCI_DEVICE_ID_VIRTIO_PMEM0x1013
#define PCI_DEVICE_ID_VIRTIO_IOMMU   0x1014
#define PCI_DEVICE_ID_VIRTIO_MEM 0x1015
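
For the non-transitional IDs no table is needed at all: the virtio spec fixes
the modern PCI device ID as 0x1040 plus the virtio device ID (e.g. vsock, id
19, becomes 0x1053). A sketch of that rule, for reference:

    static uint16_t virtio_pci_modern_device_id(uint16_t virtio_id)
    {
        /* only transitional devices need a lookup table */
        return 0x1040 + virtio_id;
    }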

> 
> I'd drop all the rest, use the algorithm for non transitional.
> And when class is other I'd just not include it in the array,
> make this the default.
> 
> 
> 
> > +static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
> > +{
> > +VirtIOPCIIDInfo info = {};
> > +int i;
> > +
> > +for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
> > +if (virtio_pci_id_info[i].vdev_id == vdev_id) {
> > +info = virtio_pci_id_info[i];
> > +break;
> > +}
> > +}
> > +
> > +return info;
> > +}
> > +
> > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id)
> > +{
> > +return virtio_pci_get_id_info(device_id).pdev_id;
> > +}
> > +
> > +uint16_t 

RE: [RFC 01/10] virtio: get class_id and pci device id by the virtio id

2022-01-09 Thread longpeng2--- via


> -Original Message-
> From: Jason Wang [mailto:jasow...@redhat.com]
> Sent: Wednesday, January 5, 2022 2:15 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: Stefan Hajnoczi ; mst ; Stefano
> Garzarella ; Cornelia Huck ; pbonzini
> ; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ; qemu-devel
> 
> Subject: Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio
> id
> 
> On Wed, Jan 5, 2022 at 1:48 PM Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.)  wrote:
> >
> >
> >
> > > -Original Message-
> > > From: Jason Wang [mailto:jasow...@redhat.com]
> > > Sent: Wednesday, January 5, 2022 12:38 PM
> > > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > 
> > > Cc: Stefan Hajnoczi ; mst ; Stefano
> > > Garzarella ; Cornelia Huck ;
> pbonzini
> > > ; Gonglei (Arei) ; Yechuan
> > > ; Huangzhichao ; qemu-devel
> > > 
> > > Subject: Re: [RFC 01/10] virtio: get class_id and pci device id by the 
> > > virtio
> > > id
> > >
> > > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike)  
> > > wrote:
> > > >
> > > > From: Longpeng 
> > > >
> > > > Add helpers to get the "Transitional PCI Device ID" and "class_id" of 
> > > > the
> > > > deivce which is specificed by the "Virtio Device ID".
> > > >
> > > > These helpers will be used to build the generic vDPA device later.
> > > >
> > > > Signed-off-by: Longpeng 
> > > > ---
> > > >  hw/virtio/virtio-pci.c | 93 ++
> > > >  hw/virtio/virtio-pci.h |  4 ++
> > > >  2 files changed, 97 insertions(+)
> > > >
> > > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > > > index 750aa47ec1..843085c4ea 100644
> > > > --- a/hw/virtio/virtio-pci.c
> > > > +++ b/hw/virtio/virtio-pci.c
> > > > @@ -19,6 +19,7 @@
> > > >
> > > >  #include "exec/memop.h"
> > > >  #include "standard-headers/linux/virtio_pci.h"
> > > > +#include "standard-headers/linux/virtio_ids.h"
> > > >  #include "hw/boards.h"
> > > >  #include "hw/virtio/virtio.h"
> > > >  #include "migration/qemu-file-types.h"
> > > > @@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, 
> > > > int
> n,
> > > QEMUFile *f)
> > > >  return 0;
> > > >  }
> > > >
> > > > +typedef struct VirtIOPCIIDInfo {
> > > > +uint16_t vdev_id; /* virtio id */
> > > > +uint16_t pdev_id; /* pci device id */
> > > > +uint16_t class_id;
> > > > +} VirtIOPCIIDInfo;
> > > > +
> > > > +static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
> > > > +{
> > >
> > > Any way to get rid of this array? E.g using the algorithm that is used
> > > by the kernel virtio driver.
> > >
> >
> > For device id, we can use the algorithm if we no need to support
> > Transitional id. But how to get the class id ?
> 
> Right, I miss this. So the current code should be fine.
> 

Maybe the following way would be better? It can save about 40 lines.

#define VIRTIO_PCI_ID_INFO(name, class)   \
{VIRTIO_ID_##name, PCI_DEVICE_ID_VIRTIO_##name, class}

static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
VIRTIO_PCI_ID_INFO(NET, PCI_CLASS_NETWORK_ETHERNET),
VIRTIO_PCI_ID_INFO(BLOCK,   PCI_CLASS_STORAGE_SCSI),
VIRTIO_PCI_ID_INFO(SCSI,PCI_CLASS_STORAGE_SCSI),
VIRTIO_PCI_ID_INFO(CONSOLE, PCI_CLASS_COMMUNICATION_OTHER),
VIRTIO_PCI_ID_INFO(VSOCK,   PCI_CLASS_COMMUNICATION_OTHER),
VIRTIO_PCI_ID_INFO(IOMMU,   PCI_CLASS_OTHERS),
VIRTIO_PCI_ID_INFO(MEM, PCI_CLASS_OTHERS),
VIRTIO_PCI_ID_INFO(PMEM,PCI_CLASS_OTHERS),
VIRTIO_PCI_ID_INFO(RNG, PCI_CLASS_OTHERS),
VIRTIO_PCI_ID_INFO(BALLOON, PCI_CLASS_OTHERS),
VIRTIO_PCI_ID_INFO(9P,  PCI_BASE_CLASS_NETWORK),
};


> Thanks
> 
> >
> > > Thanks
> > >
> > > > +.vdev_id = VIRTIO_ID_NET,
> > > > +.pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
> > > > +.class_id = PCI_CLASS_NETWORK_ETHERNET,
> > > > +},
> > > > +{
> > > > +.vdev_id = VIRTIO_ID_BLOCK,
> > > > +.pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
> > > > +.class_id = PCI_CLASS_STORAGE_SCSI,
> > > > +},
> > > > +{
> > > > +.vdev_id = VIRTIO_ID_CONSOLE,
> > > > +.pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
> > > > +.class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > > > +},
> > > > +{
> > > > +.vdev_id = VIRTIO_ID_SCSI,
> > > > +.pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
> > > > +.class_id = PCI_CLASS_STORAGE_SCSI,
> > > > +},
> > > > +{
> > > > +.vdev_id = VIRTIO_ID_9P,
> > > > +.pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
> > > > +.class_id = PCI_BASE_CLASS_NETWORK,
> > > > +},
> > > > +{
> > > > +.vdev_id = VIRTIO_ID_VSOCK,
> > > > +.pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
> > > > +.class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > > > +},
> > > > +{
> > > > +.vdev_id = VIRTIO_ID_IOMMU,
> > > > +.pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
> > > > +.class_id = PCI_CLASS_OTHERS,
> > > > +},
> > > > + 

RE: [RFC 02/10] vhost: add 3 commands for vhost-vdpa

2022-01-06 Thread longpeng2--- via


> -Original Message-
> From: Jason Wang [mailto:jasow...@redhat.com]
> Sent: Thursday, January 6, 2022 10:34 AM
> To: Michael S. Tsirkin 
> Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> ; Stefan Hajnoczi ; Stefano
> Garzarella ; Cornelia Huck ; pbonzini
> ; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ; qemu-devel
> 
> Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
> 
> On Wed, Jan 5, 2022 at 8:26 PM Michael S. Tsirkin  wrote:
> >
> > On Wed, Jan 05, 2022 at 05:09:07PM +0800, Jason Wang wrote:
> > > On Wed, Jan 5, 2022 at 4:37 PM Longpeng (Mike, Cloud Infrastructure
> > > Service Product Dept.)  wrote:
> > > >
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: Jason Wang [mailto:jasow...@redhat.com]
> > > > > Sent: Wednesday, January 5, 2022 3:54 PM
> > > > > To: Michael S. Tsirkin 
> > > > > Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > > > ; Stefan Hajnoczi ; Stefano
> > > > > Garzarella ; Cornelia Huck ;
> pbonzini
> > > > > ; Gonglei (Arei) ; 
> > > > > Yechuan
> > > > > ; Huangzhichao ;
> qemu-devel
> > > > > 
> > > > > Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
> > > > >
> > > > > On Wed, Jan 5, 2022 at 3:02 PM Michael S. Tsirkin  
> > > > > wrote:
> > > > > >
> > > > > > On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
> > > > > > > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike) 
> > > > > > > 
> wrote:
> > > > > > > >
> > > > > > > > From: Longpeng 
> > > > > > > >
> > > > > > > > To support generic vdpa deivce, we need add the following 
> > > > > > > > ioctls:
> > > > > > > > - GET_VECTORS_NUM: the count of vectors that supported
> > > > > > >
> > > > > > > Does this mean MSI vectors? If yes, it looks like a layer 
> > > > > > > violation:
> > > > > > > vhost is transport independent.
> > > > > >
> > > > > > Well *guest* needs to know how many vectors device supports.
> > > > > > I don't think there's a way around that. Do you?
> > > > >
> > > > > We have VHOST_SET_VRING/CONFIG_CALL which is per vq. I think we can
> > > > > simply assume #vqs + 1?
> > > > >
> > > > > > Otherwise guests will at best be suboptimal.
> > > > > >
> > > > > > >  And it reveals device implementation
> > > > > > > details which block (cross vendor) migration.
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > > Not necessarily, userspace can hide this from guest if it
> > > > > > wants to, just validate.
> > > > >
> > > > > If we can hide it at vhost/uAPI level, it would be even better?
> > > > >
> > > >
> > > > Not only MSI vectors, but also queue-size, #vqs, etc.
> > >
> > > MSI is PCI specific, we have non PCI vDPA parent e.g VDUSE/simulator/mlx5
> > >
> > > And it's something that is not guaranteed to be not changed. E.g some
> > > drivers may choose to allocate MSI during set_status() which can fail
> > > for various reasons.
> > >
> > > >
> > > > Maybe the vhost level could expose the hardware's real capabilities
> > > > and let the userspace (QEMU) do the hiding? The userspace know how
> > > > to process them.
> > >
> > > #MSI vectors is much more easier to be mediated than queue-size and #vqs.
> > >
> > > For interrupts, we've already had VHOST_SET_X_KICK, we can keep
> > > allocating eventfd based on #MSI vectors to make it work with any
> > > number of MSI vectors that the virtual device had.
> >
> > Right but if hardware does not support so many then what?
> > Just fail?
> 
> Or just trigger the callback of vqs that shares the vector.
> 

Then we should disable posted interrupts (PI) if we need to share a vector in this case?

> > Having a query API would make things somewhat cleaner imho.
> 
> I may miss something,  even if we know #vectors, we still don't know
> the associated virtqueues for a dedicated vector?
> 
> >
> > > For queue-size, it's Ok to have a new uAPI but it's not a must, Qemu
> > > can simply fail if SET_VRING_NUM fail.
> > >
> > > For #vqs, it's OK to have a new uAPI since the emulated virtio-pci
> > > device requires knowledge the #vqs in the config space. (still not a
> > > must, we can enumerate #vqs per device type)
> > >
> > > For the config size, it's OK but not a must, technically we can simply
> > > relay what guest write to vhost-vdpa. It's just because current Qemu
> > > require to have it during virtio device initialization.
> > >
> > > Thanks
> >
> >
> > I agree but these ok things make for a cleaner API I think.
> 
> Right.
> 
> Thanks
> 
> >
> > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > >
> > > > > > > > - GET_CONFIG_SIZE: the size of the virtio config space
> > > > > > > > - GET_VQS_NUM: the count of virtqueues that exported
> > > > > > > >
> > > > > > > > Signed-off-by: Longpeng 
> > > > > > > > ---
> > > > > > > >  linux-headers/linux/vhost.h | 10 ++
> > > > > > > >  1 file changed, 10 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/linux-headers/linux/vhost.h
> b/linux-headers/linux/vhost.h
> > > > > > > > index c998860d7b..c5edd75d15 100644
> > 
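
For illustration, the three additions named in the commit message would be
plain read-only ioctls along these lines (the request numbers below are
placeholders, not the values from the real header):

#define VHOST_VDPA_GET_VECTORS_NUM _IOR(VHOST_VIRTIO, 0x79, __u32)
#define VHOST_VDPA_GET_CONFIG_SIZE _IOR(VHOST_VIRTIO, 0x7a, __u32)
#define VHOST_VDPA_GET_VQS_NUM     _IOR(VHOST_VIRTIO, 0x7b, __u32)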

RE: [RFC 06/10] vdpa-dev: implement the unrealize interface

2022-01-05 Thread longpeng2--- via



> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Wednesday, January 5, 2022 7:16 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: stefa...@redhat.com; m...@redhat.com; jasow...@redhat.com;
> coh...@redhat.com; pbonz...@redhat.com; Gonglei (Arei)
> ; Yechuan ; Huangzhichao
> ; qemu-devel@nongnu.org
> Subject: Re: [RFC 06/10] vdpa-dev: implement the unrealize interface
> 
> On Wed, Jan 05, 2022 at 08:58:56AM +0800, Longpeng(Mike) wrote:
> >From: Longpeng 
> >
> >Implements the .unrealize interface.
> >
> >Signed-off-by: Longpeng 
> >---
> > hw/virtio/vdpa-dev.c | 22 +-
> > 1 file changed, 21 insertions(+), 1 deletion(-)
> >
> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> >index 2d534d837a..4e4dd3d201 100644
> >--- a/hw/virtio/vdpa-dev.c
> >+++ b/hw/virtio/vdpa-dev.c
> >@@ -133,9 +133,29 @@ out:
> > close(fd);
> > }
> >
> >+static void vhost_vdpa_vdev_unrealize(VhostVdpaDevice *s)
> >+{
> >+VirtIODevice *vdev = VIRTIO_DEVICE(s);
> >+int i;
> >+
> >+for (i = 0; i < s->num_queues; i++) {
>^
> `s->num_queues` seems uninitialized to me, maybe we could just remove
> the num_queues field from VhostVdpaDevice, and use `s->dev.nvqs` as in
> vhost_vdpa_device_realize().
> 

Good catch, I'll fix the bug.

But I think we should keep the num_queues field; we'll need it if we support
migration in the next step, right?
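
Something along these lines in realize() should be enough to keep the field
valid for that later use (sketch):

    s->num_queues = s->dev.nvqs;   /* remember the vq count for unrealize */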

> >+virtio_delete_queue(s->virtqs[i]);
> >+}
> >+g_free(s->virtqs);
> >+virtio_cleanup(vdev);
> >+
> >+g_free(s->config);
> >+}
> >+
> > static void vhost_vdpa_device_unrealize(DeviceState *dev)
> > {
> >-return;
> >+VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> >+VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> >+
> >+virtio_set_status(vdev, 0);
> >+vhost_dev_cleanup(&s->dev);
> 
> If we will use `s->dev.nvqs` in vhost_vdpa_vdev_unrealize(), we should
> call vhost_dev_cleanup() after it, just before close() as we already do
> in the error path of vhost_vdpa_device_realize().
> 

I'll try to fix the above bug first if you agree that we should keep the
num_queues field.

I just realized that I forgot to free s->dev.vqs here...

> >+vhost_vdpa_vdev_unrealize(s);
> >+close(s->vdpa.device_fd);
> > }
> >
> > static void
> >--
> >2.23.0
> >




RE: [RFC 05/10] vdpa-dev: implement the realize interface

2022-01-05 Thread longpeng2--- via



> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Wednesday, January 5, 2022 6:18 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: m...@redhat.com; jasow...@redhat.com; sgarz...@redhat.com;
> coh...@redhat.com; pbonz...@redhat.com; Gonglei (Arei)
> ; Yechuan ; Huangzhichao
> ; qemu-devel@nongnu.org
> Subject: Re: [RFC 05/10] vdpa-dev: implement the realize interface
> 
> On Wed, Jan 05, 2022 at 08:58:55AM +0800, Longpeng(Mike) wrote:
> > From: Longpeng 
> >
> > Implements the .realize interface.
> >
> > Signed-off-by: Longpeng 
> > ---
> >  hw/virtio/vdpa-dev.c | 114 +++
> >  include/hw/virtio/vdpa-dev.h |   8 +++
> >  2 files changed, 122 insertions(+)
> >
> > diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> > index 790117fb3b..2d534d837a 100644
> > --- a/hw/virtio/vdpa-dev.c
> > +++ b/hw/virtio/vdpa-dev.c
> > @@ -15,9 +15,122 @@
> >  #include "sysemu/sysemu.h"
> >  #include "sysemu/runstate.h"
> >
> > +static void
> > +vhost_vdpa_device_dummy_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> > +{
> > +/* Nothing to do */
> > +}
> > +
> > +static int vdpa_dev_get_info_by_fd(int fd, uint64_t cmd, Error **errp)
> 
> This looks similar to the helper function in a previous patch but this
> time the return value type is int instead of uint32_t. Please make the
> types consistent.
> 

OK.

> > +{
> > +int val;
> > +
> > +if (ioctl(fd, cmd, &val) < 0) {
> > +error_setg(errp, "vhost-vdpa-device: cmd 0x%lx failed: %s",
> > +   cmd, strerror(errno));
> > +return -1;
> > +}
> > +
> > +return val;
> > +}
> > +
> > +static inline int vdpa_dev_get_queue_size(int fd, Error **errp)
> > +{
> > +return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VRING_NUM, errp);
> > +}
> > +
> > +static inline int vdpa_dev_get_vqs_num(int fd, Error **errp)
> > +{
> > +return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_VQS_NUM, errp);
> > +}
> > +
> > +static inline int vdpa_dev_get_config_size(int fd, Error **errp)
> > +{
> > +return vdpa_dev_get_info_by_fd(fd, VHOST_VDPA_GET_CONFIG_SIZE, errp);
> > +}
> > +
> >  static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> >  {
> > +VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> > +VhostVdpaDevice *s = VHOST_VDPA_DEVICE(vdev);
> > +uint32_t device_id;
> > +int max_queue_size;
> > +int fd;
> > +int i, ret;
> > +
> > +fd = qemu_open(s->vdpa_dev, O_RDWR, errp);
> > +if (fd == -1) {
> > +return;
> > +}
> > +s->vdpa.device_fd = fd;
> 
> This is the field I suggest exposing as a QOM property so it can be set
> from the proxy object (e.g. when the PCI proxy opens the vdpa device
> before our .realize() function is called).
> 

OK.

> > +
> > +max_queue_size = vdpa_dev_get_queue_size(fd, errp);
> > +if (*errp) {
> > +goto out;
> > +}
> > +
> > +if (s->queue_size > max_queue_size) {
> > +error_setg(errp, "vhost-vdpa-device: invalid queue_size: %d
> (max:%d)",
> > +   s->queue_size, max_queue_size);
> > +goto out;
> > +} else if (!s->queue_size) {
> > +s->queue_size = max_queue_size;
> > +}
> > +
> > +ret = vdpa_dev_get_vqs_num(fd, errp);
> > +if (*errp) {
> > +goto out;
> > +}
> > +
> > +s->dev.nvqs = ret;
> 
> There is no input validation because we trust the kernel vDPA return
> values. That seems okay for now but if there is a vhost-user version of
> this in the future then input validation will be necessary to achieve
> isolation between QEMU and the vhost-user processes. I suggest including
> input validation code right away because it's harder to audit the code
> and fix missing input validation later on.
> 

Makes sense!

Should we only need to validate the upper boundary (e.g. < VIRTIO_QUEUE_MAX)?

> > +s->dev.vqs = g_new0(struct vhost_virtqueue, s->dev.nvqs);
> > +s->dev.vq_index = 0;
> > +s->dev.vq_index_end = s->dev.nvqs;
> > +s->dev.backend_features = 0;
> > +s->started = false;
> > +
> > +ret = vhost_dev_init(&s->dev, &s->vdpa, VHOST_BACKEND_TYPE_VDPA, 0,
> NULL);
> > +if (ret < 0) {
> > +error_setg(errp, "vhost-vdpa-device: vhost initialization
> failed: %s",
> > +   strerror(-ret));
> > +goto out;
> > +}
> > +
> > +ret = s->dev.vhost_ops->vhost_get_device_id(&s->dev, &device_id);
> 
> The vhost_*() API abstracts the ioctl calls but this source file and the
> PCI proxy have ioctl calls. I wonder if it's possible to move the ioctls
> calls into the vhost_*() API? That would be cleaner and also make it
> easier to add vhost-user vDPA support in the future.

We need these ioctl calls because we need to invoke them before the vhost-dev
object is initialized.



RE: [RFC 04/10] vdpa-dev: implement the instance_init/class_init interface

2022-01-05 Thread longpeng2--- via



> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Wednesday, January 5, 2022 7:29 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: stefa...@redhat.com; m...@redhat.com; jasow...@redhat.com;
> coh...@redhat.com; pbonz...@redhat.com; Gonglei (Arei)
> ; Yechuan ; Huangzhichao
> ; qemu-devel@nongnu.org
> Subject: Re: [RFC 04/10] vdpa-dev: implement the instance_init/class_init
> interface
> 
> On Wed, Jan 05, 2022 at 08:58:54AM +0800, Longpeng(Mike) wrote:
> >From: Longpeng 
> >
> >Implements the .instance_init and the .class_init interface.
> >
> >Signed-off-by: Longpeng 
> >---
> > hw/virtio/vdpa-dev-pci.c | 80 +++-
> > hw/virtio/vdpa-dev.c | 68 +-
> > include/hw/virtio/vdpa-dev.h |  2 +
> > 3 files changed, 146 insertions(+), 4 deletions(-)
> >
> >diff --git a/hw/virtio/vdpa-dev-pci.c b/hw/virtio/vdpa-dev-pci.c
> >index a5a7b528a9..0af54a26d4 100644
> >--- a/hw/virtio/vdpa-dev-pci.c
> >+++ b/hw/virtio/vdpa-dev-pci.c
> >@@ -23,14 +23,90 @@ struct VhostVdpaDevicePCI {
> > VhostVdpaDevice vdev;
> > };
> >
> >+static uint32_t
> >+vdpa_dev_pci_get_info(const char *name, uint64_t cmd, Error **errp)
> >+{
> >+int device_fd;
> >+uint32_t val;
> >+int ret;
> >+
> >+device_fd = qemu_open(name, O_RDWR, errp);
> >+if (device_fd == -1) {
> >+return (uint32_t)-1;
> >+}
> >+
> >+ret = ioctl(device_fd, cmd, &val);
> >+if (ret < 0) {
> >+error_setg(errp, "vhost-vdpa-device-pci: cmd 0x%lx failed: %s",
> >+   cmd, strerror(errno));
> >+goto out;
> >+}
> >+
> >+out:
> >+close(device_fd);
> >+return val;
> >+}
> >+
> >+static inline uint32_t
> >+vdpa_dev_pci_get_devid(VhostVdpaDevicePCI *dev, Error **errp)
> >+{
> >+return vdpa_dev_pci_get_info(dev->vdev.vdpa_dev,
> >+ VHOST_VDPA_GET_DEVICE_ID, errp);
> >+}
> >+
> >+static inline uint32_t
> >+vdpa_dev_pci_get_vectors_num(VhostVdpaDevicePCI *dev, Error **errp)
> >+{
> >+return vdpa_dev_pci_get_info(dev->vdev.vdpa_dev,
> >+ VHOST_VDPA_GET_VECTORS_NUM, errp);
> >+}
> >+
> > static void vhost_vdpa_device_pci_instance_init(Object *obj)
> > {
> >-return;
> >+VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(obj);
> >+
> >+virtio_instance_init_common(obj, &dev->vdev, sizeof(dev->vdev),
> >+TYPE_VHOST_VDPA_DEVICE);
> >+object_property_add_alias(obj, "bootindex", OBJECT(&dev->vdev),
> >+  "bootindex");
> >+}
> >+
> >+static Property vhost_vdpa_device_pci_properties[] = {
> >+DEFINE_PROP_END_OF_LIST(),
> >+};
> >+
> >+static void
> >+vhost_vdpa_device_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
> >+{
> >+VhostVdpaDevicePCI *dev = VHOST_VDPA_DEVICE_PCI(vpci_dev);
> >+DeviceState *vdev = DEVICE(&dev->vdev);
> >+uint32_t devid;
> >+uint32_t vectors;
> >+
> >+devid = vdpa_dev_pci_get_devid(dev, errp);
> >+if (*errp) {
> >+return;
> >+}
> >+
> >+vectors = vdpa_dev_pci_get_vectors_num(dev, errp);
> >+if (*errp) {
> >+return;
> >+}
> >+
> >+vpci_dev->class_code = virtio_pci_get_class_id(devid);
> >+vpci_dev->pdev_id = virtio_pci_get_pci_devid(devid);
> >+vpci_dev->nvectors = vectors;
> >+qdev_realize(vdev, BUS(&vpci_dev->bus), errp);
> > }
> >
> > static void vhost_vdpa_device_pci_class_init(ObjectClass *klass, void *data)
> > {
> >-return;
> >+DeviceClass *dc = DEVICE_CLASS(klass);
> >+VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
> >+
> >+set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> >+device_class_set_props(dc, vhost_vdpa_device_pci_properties);
> >+k->realize = vhost_vdpa_device_pci_realize;
> > }
> >
> > static const VirtioPCIDeviceTypeInfo vhost_vdpa_device_pci_info = {
> >diff --git a/hw/virtio/vdpa-dev.c b/hw/virtio/vdpa-dev.c
> >index f4f92b90b0..790117fb3b 100644
> >--- a/hw/virtio/vdpa-dev.c
> >+++ b/hw/virtio/vdpa-dev.c
> >@@ -15,16 +15,80 @@
> > #include "sysemu/sysemu.h"
> > #include "sysemu/runstate.h"
> >
> >-static void vhost_vdpa_device_class_init(ObjectClass *klass, void *data)
> >+static void vhost_vdpa_device_realize(DeviceState *dev, Error **errp)
> > {
> > return;
> > }
> >
> >-static void vhost_vdpa_device_instance_init(Object *obj)
> >+static void vhost_vdpa_device_unrealize(DeviceState *dev)
> >+{
> >+return;
> >+}
> >+
> >+static void
> >+vhost_vdpa_device_get_config(VirtIODevice *vdev, uint8_t *config)
> >+{
> >+return;
> >+}
> >+
> >+static void
> >+vhost_vdpa_device_set_config(VirtIODevice *vdev, const uint8_t *config)
> > {
> > return;
> > }
> >
> >+static uint64_t vhost_vdpa_device_get_features(VirtIODevice *vdev,
> >+   uint64_t features,
> >+   Error **errp)
> >+{
> >+return (uint64_t)-1;
> >+}
> 

RE: [RFC 04/10] vdpa-dev: implement the instance_init/class_init interface

2022-01-05 Thread longpeng2--- via



> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Wednesday, January 5, 2022 6:01 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: m...@redhat.com; jasow...@redhat.com; sgarz...@redhat.com;
> coh...@redhat.com; pbonz...@redhat.com; Gonglei (Arei)
> ; Yechuan ; Huangzhichao
> ; qemu-devel@nongnu.org
> Subject: Re: [RFC 04/10] vdpa-dev: implement the instance_init/class_init
> interface
> 
> On Wed, Jan 05, 2022 at 08:58:54AM +0800, Longpeng(Mike) wrote:
> > From: Longpeng 
> >
> > Implements the .instance_init and the .class_init interface.
> >
> > Signed-off-by: Longpeng 
> > ---
> >  hw/virtio/vdpa-dev-pci.c | 80 +++-
> >  hw/virtio/vdpa-dev.c | 68 +-
> >  include/hw/virtio/vdpa-dev.h |  2 +
> >  3 files changed, 146 insertions(+), 4 deletions(-)
> >
> > diff --git a/hw/virtio/vdpa-dev-pci.c b/hw/virtio/vdpa-dev-pci.c
> > index a5a7b528a9..0af54a26d4 100644
> > --- a/hw/virtio/vdpa-dev-pci.c
> > +++ b/hw/virtio/vdpa-dev-pci.c
> > @@ -23,14 +23,90 @@ struct VhostVdpaDevicePCI {
> >  VhostVdpaDevice vdev;
> >  };
> >
> > +static uint32_t
> > +vdpa_dev_pci_get_info(const char *name, uint64_t cmd, Error **errp)
> 
> vdpa_dev_pci_get_u32() might be a clearer name.
> 

OK.

> > +{
> > +int device_fd;
> > +uint32_t val;
> > +int ret;
> > +
> > +device_fd = qemu_open(name, O_RDWR, errp);
> > +if (device_fd == -1) {
> > +return (uint32_t)-1;
> > +}
> > +
> > +ret = ioctl(device_fd, cmd, &val);
> > +if (ret < 0) {
> > +error_setg(errp, "vhost-vdpa-device-pci: cmd 0x%lx failed: %s",
> > +   cmd, strerror(errno));
> > +goto out;
> > +}
> > +
> > +out:
> > +close(device_fd);
> 
> Race conditions are possible if the device node is replaced between
> calls. Why not open the file once and reuse the fd across ioctl calls?
> 
> Both VhostVdpaDevicePCI and VhostVdpaDevice need the fd but
> VhostVdpaDevicePCI needs it first. I suggest passing ownership of the fd
> to the VhostVdpaDevice. One way of doing this is using QOM properties so
> that VhostVdpaDevice can use the given fd instead of reopening the file.
> (If fd is -1 then VhostVdpaDevice can try to open the file. That is
> necessary when VhostVdpaDevice is used directly with virtio-mmio because
> there is no proxy object.)

Adding the fd field into the VhostVdpaDevice looks fine! I'll do it in V2.
Thanks.
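
A rough sketch of the property side of that (names and types here are
assumptions; -1 as the default means "no fd passed, open vdpa_dev by path
in realize()"):

static Property vhost_vdpa_device_properties[] = {
    DEFINE_PROP_STRING("vdpa-dev", VhostVdpaDevice, vdpa_dev),
    DEFINE_PROP_INT32("vdpa-dev-fd", VhostVdpaDevice, vdpa_dev_fd, -1),
    DEFINE_PROP_END_OF_LIST(),
};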





RE: [RFC 01/10] virtio: get class_id and pci device id by the virtio id

2022-01-05 Thread longpeng2--- via



> -Original Message-
> From: Cornelia Huck [mailto:coh...@redhat.com]
> Sent: Wednesday, January 5, 2022 6:46 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> ; stefa...@redhat.com; m...@redhat.com;
> jasow...@redhat.com; sgarz...@redhat.com
> Cc: pbonz...@redhat.com; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ;
> qemu-devel@nongnu.org; Longpeng (Mike, Cloud Infrastructure Service Product
> Dept.) 
> Subject: Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio
> id
> 
> On Wed, Jan 05 2022, "Longpeng(Mike)"  wrote:
> 
> > From: Longpeng 
> >
> > Add helpers to get the "Transitional PCI Device ID" and "class_id" of the
> > deivce which is specificed by the "Virtio Device ID".
> >
> > These helpers will be used to build the generic vDPA device later.
> >
> > Signed-off-by: Longpeng 
> > ---
> >  hw/virtio/virtio-pci.c | 93 ++
> >  hw/virtio/virtio-pci.h |  4 ++
> >  2 files changed, 97 insertions(+)
> >
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index 750aa47ec1..843085c4ea 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -19,6 +19,7 @@
> >
> >  #include "exec/memop.h"
> >  #include "standard-headers/linux/virtio_pci.h"
> > +#include "standard-headers/linux/virtio_ids.h"
> >  #include "hw/boards.h"
> >  #include "hw/virtio/virtio.h"
> >  #include "migration/qemu-file-types.h"
> > @@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, int n,
> QEMUFile *f)
> >  return 0;
> >  }
> >
> > +typedef struct VirtIOPCIIDInfo {
> > +uint16_t vdev_id; /* virtio id */
> > +uint16_t pdev_id; /* pci device id */
> > +uint16_t class_id;
> > +} VirtIOPCIIDInfo;
> > +
> > +static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
> > +{
> > +.vdev_id = VIRTIO_ID_NET,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
> > +.class_id = PCI_CLASS_NETWORK_ETHERNET,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_BLOCK,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
> > +.class_id = PCI_CLASS_STORAGE_SCSI,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_CONSOLE,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
> > +.class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_SCSI,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
> > +.class_id = PCI_CLASS_STORAGE_SCSI,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_9P,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
> > +.class_id = PCI_BASE_CLASS_NETWORK,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_VSOCK,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
> > +.class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_IOMMU,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_MEM,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_MEM,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_PMEM,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_PMEM,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_RNG,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_RNG,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_BALLOON,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_BALLOON,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +};
> > +
> > +static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
> > +{
> > +VirtIOPCIIDInfo info = {};
> > +int i;
> > +
> > +for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
> > +if (virtio_pci_id_info[i].vdev_id == vdev_id) {
> > +info = virtio_pci_id_info[i];
> > +break;
> > +}
> > +}
> > +
> > +return info;
> > +}
> > +
> > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id)
> > +{
> > +return virtio_pci_get_id_info(device_id).pdev_id;
> > +}
> > +
> > +uint16_t virtio_pci_get_class_id(uint16_t device_id)
> > +{
> > +return virtio_pci_get_id_info(device_id).class_id;
> > +}
> 
> What happens if these functions are called for a device_id that is not
> in the array, e.g. if we forgot to add a new id to the array?
> 

It would return pdev_id=0 or class_id=0 in that case. A virtio device with
pdev_id=0 is undefined, and class_id=0 is also treated as undefined
(PCI_CLASS_NOT_DEFINED), so the caller should check the returned value.
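
For example, on the caller side (illustration only, using the helpers added
by this patch):

    uint16_t pdev_id = virtio_pci_get_pci_devid(vdev_id);

    if (!pdev_id) {
        /* unknown virtio id: refuse to build the generic device */
    }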

> Can the array be generated in some way?

For PCI Device ID:
  - If we need to support Transitional VIRTIO devices, there's no algorithm
that can map a VIRTIO ID to a PCI Device ID.
  - If we only need to support 1.0+ VIRTIO devices, then we can calculate the
PCI Device ID based on the VIRTIO ID.

For Class ID, seems no way :(
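
Just to illustrate the 1.0+ case: the virtio spec defines the modern
(non-transitional) PCI Device ID as 0x1040 plus the VIRTIO ID, so that part
could be computed instead of tabulated (the function name below is only an
illustration):

/* Only valid for non-transitional (1.0+) devices; transitional PCI Device
 * IDs and the PCI class code still have to come from a lookup table. */
static uint16_t virtio_pci_modern_devid(uint16_t vdev_id)
{
    return 0x1040 + vdev_id;
}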




RE: [RFC 03/10] vdpa: add the infrastructure of vdpa-dev

2022-01-05 Thread longpeng2--- via



> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Wednesday, January 5, 2022 5:49 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: m...@redhat.com; jasow...@redhat.com; sgarz...@redhat.com;
> coh...@redhat.com; pbonz...@redhat.com; Gonglei (Arei)
> ; Yechuan ; Huangzhichao
> ; qemu-devel@nongnu.org
> Subject: Re: [RFC 03/10] vdpa: add the infrastructure of vdpa-dev
> 
> On Wed, Jan 05, 2022 at 08:58:53AM +0800, Longpeng(Mike) wrote:
> > +static const VirtioPCIDeviceTypeInfo vhost_vdpa_device_pci_info = {
> > +.base_name   = TYPE_VHOST_VDPA_DEVICE_PCI,
> > +.generic_name= "vhost-vdpa-device-pci",
> > +.transitional_name   = "vhost-vdpa-device-pci-transitional",
> > +.non_transitional_name   = "vhost-vdpa-device-pci-non-transitional",
> 
> Does vDPA support Transitional VIRTIO devices?
> 
> I expected this device to support Modern devices only.
> 

There's already a 0.95 vdpa driver (Alibaba ENI) in the kernel source, and
supporting 0.95 devices is necessary for some older guest OSes.

I'm OK if other guys also approve of supporting 1.0+ devices only :)

> Stefan



RE: [RFC 02/10] vhost: add 3 commands for vhost-vdpa

2022-01-05 Thread longpeng2--- via


> -Original Message-
> From: Jason Wang [mailto:jasow...@redhat.com]
> Sent: Wednesday, January 5, 2022 3:54 PM
> To: Michael S. Tsirkin 
> Cc: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> ; Stefan Hajnoczi ; Stefano
> Garzarella ; Cornelia Huck ; pbonzini
> ; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ; qemu-devel
> 
> Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
> 
> On Wed, Jan 5, 2022 at 3:02 PM Michael S. Tsirkin  wrote:
> >
> > On Wed, Jan 05, 2022 at 12:35:53PM +0800, Jason Wang wrote:
> > > On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike)  
> > > wrote:
> > > >
> > > > From: Longpeng 
> > > >
> > > > To support generic vdpa deivce, we need add the following ioctls:
> > > > - GET_VECTORS_NUM: the count of vectors that supported
> > >
> > > Does this mean MSI vectors? If yes, it looks like a layer violation:
> > > vhost is transport independent.
> >
> > Well *guest* needs to know how many vectors device supports.
> > I don't think there's a way around that. Do you?
> 
> We have VHOST_SET_VRING/CONFIG_CALL which is per vq. I think we can
> simply assume #vqs + 1?
> 
> > Otherwise guests will at best be suboptimal.
> >
> > >  And it reveals device implementation
> > > details which block (cross vendor) migration.
> > >
> > > Thanks
> >
> > Not necessarily, userspace can hide this from guest if it
> > wants to, just validate.
> 
> If we can hide it at vhost/uAPI level, it would be even better?
> 

Not only MSI vectors, but also queue-size, #vqs, etc.

Maybe the vhost level could expose the hardware's real capabilities
and let the userspace (QEMU) do the hiding? The userspace knows how
to process them.
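
For example, something along these lines on the QEMU side (just a sketch
using the ioctls proposed in this series; the clamping policy is only an
illustration, not a final decision):

/* Sketch: read the hardware-reported limits and let QEMU decide what the
 * guest sees, e.g. never expose more vectors than #vqs + 1 (config). */
static int vdpa_dev_probe_vectors(int fd, Error **errp)
{
    int vqs, vectors;

    if (ioctl(fd, VHOST_VDPA_GET_VQS_NUM, &vqs) < 0 ||
        ioctl(fd, VHOST_VDPA_GET_VECTORS_NUM, &vectors) < 0) {
        error_setg_errno(errp, errno, "failed to query vdpa device limits");
        return -1;
    }

    return MIN(vectors, vqs + 1);
}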

> Thanks
> 
> >
> >
> > > > - GET_CONFIG_SIZE: the size of the virtio config space
> > > > - GET_VQS_NUM: the count of virtqueues that exported
> > > >
> > > > Signed-off-by: Longpeng 
> > > > ---
> > > >  linux-headers/linux/vhost.h | 10 ++
> > > >  1 file changed, 10 insertions(+)
> > > >
> > > > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
> > > > index c998860d7b..c5edd75d15 100644
> > > > --- a/linux-headers/linux/vhost.h
> > > > +++ b/linux-headers/linux/vhost.h
> > > > @@ -150,4 +150,14 @@
> > > >  /* Get the valid iova range */
> > > >  #define VHOST_VDPA_GET_IOVA_RANGE  _IOR(VHOST_VIRTIO, 0x78, \
> > > >  struct 
> > > > vhost_vdpa_iova_range)
> > > > +
> > > > +/* Get the number of vectors */
> > > > +#define VHOST_VDPA_GET_VECTORS_NUM _IOR(VHOST_VIRTIO, 0x79, int)
> > > > +
> > > > +/* Get the virtio config size */
> > > > +#define VHOST_VDPA_GET_CONFIG_SIZE _IOR(VHOST_VIRTIO, 0x80, int)
> > > > +
> > > > +/* Get the number of virtqueues */
> > > > +#define VHOST_VDPA_GET_VQS_NUM _IOR(VHOST_VIRTIO, 0x81, int)
> > > > +
> > > >  #endif
> > > > --
> > > > 2.23.0
> > > >
> >



RE: [RFC 02/10] vhost: add 3 commands for vhost-vdpa

2022-01-04 Thread longpeng2--- via


> -Original Message-
> From: Jason Wang [mailto:jasow...@redhat.com]
> Sent: Wednesday, January 5, 2022 12:36 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: Stefan Hajnoczi ; mst ; Stefano
> Garzarella ; Cornelia Huck ; pbonzini
> ; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ; qemu-devel
> 
> Subject: Re: [RFC 02/10] vhost: add 3 commands for vhost-vdpa
> 
> On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike)  wrote:
> >
> > From: Longpeng 
> >
> > To support generic vdpa deivce, we need add the following ioctls:
> > - GET_VECTORS_NUM: the count of vectors that supported
> 
> Does this mean MSI vectors? If yes, it looks like a layer violation:
> vhost is transport independent.  And it reveals device implementation
> details which block (cross vendor) migration.
> 

Can we set the VirtIOPCIProxy.nvectors to "the count of virtqueues + 1 
(config)" ?

> Thanks
> 
> > - GET_CONFIG_SIZE: the size of the virtio config space
> > - GET_VQS_NUM: the count of virtqueues that exported
> >
> > Signed-off-by: Longpeng 
> > ---
> >  linux-headers/linux/vhost.h | 10 ++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
> > index c998860d7b..c5edd75d15 100644
> > --- a/linux-headers/linux/vhost.h
> > +++ b/linux-headers/linux/vhost.h
> > @@ -150,4 +150,14 @@
> >  /* Get the valid iova range */
> >  #define VHOST_VDPA_GET_IOVA_RANGE  _IOR(VHOST_VIRTIO, 0x78, \
> >  struct vhost_vdpa_iova_range)
> > +
> > +/* Get the number of vectors */
> > +#define VHOST_VDPA_GET_VECTORS_NUM _IOR(VHOST_VIRTIO, 0x79, int)
> > +
> > +/* Get the virtio config size */
> > +#define VHOST_VDPA_GET_CONFIG_SIZE _IOR(VHOST_VIRTIO, 0x80, int)
> > +
> > +/* Get the number of virtqueues */
> > +#define VHOST_VDPA_GET_VQS_NUM _IOR(VHOST_VIRTIO, 0x81, int)
> > +
> >  #endif
> > --
> > 2.23.0
> >



RE: [RFC 01/10] virtio: get class_id and pci device id by the virtio id

2022-01-04 Thread longpeng2--- via


> -Original Message-
> From: Jason Wang [mailto:jasow...@redhat.com]
> Sent: Wednesday, January 5, 2022 12:38 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: Stefan Hajnoczi ; mst ; Stefano
> Garzarella ; Cornelia Huck ; pbonzini
> ; Gonglei (Arei) ; Yechuan
> ; Huangzhichao ; qemu-devel
> 
> Subject: Re: [RFC 01/10] virtio: get class_id and pci device id by the virtio
> id
> 
> On Wed, Jan 5, 2022 at 8:59 AM Longpeng(Mike)  wrote:
> >
> > From: Longpeng 
> >
> > Add helpers to get the "Transitional PCI Device ID" and "class_id" of the
> > deivce which is specificed by the "Virtio Device ID".
> >
> > These helpers will be used to build the generic vDPA device later.
> >
> > Signed-off-by: Longpeng 
> > ---
> >  hw/virtio/virtio-pci.c | 93 ++
> >  hw/virtio/virtio-pci.h |  4 ++
> >  2 files changed, 97 insertions(+)
> >
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index 750aa47ec1..843085c4ea 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -19,6 +19,7 @@
> >
> >  #include "exec/memop.h"
> >  #include "standard-headers/linux/virtio_pci.h"
> > +#include "standard-headers/linux/virtio_ids.h"
> >  #include "hw/boards.h"
> >  #include "hw/virtio/virtio.h"
> >  #include "migration/qemu-file-types.h"
> > @@ -213,6 +214,95 @@ static int virtio_pci_load_queue(DeviceState *d, int n,
> QEMUFile *f)
> >  return 0;
> >  }
> >
> > +typedef struct VirtIOPCIIDInfo {
> > +uint16_t vdev_id; /* virtio id */
> > +uint16_t pdev_id; /* pci device id */
> > +uint16_t class_id;
> > +} VirtIOPCIIDInfo;
> > +
> > +static const VirtIOPCIIDInfo virtio_pci_id_info[] = {
> > +{
> 
> Any way to get rid of this array? E.g using the algorithm that is used
> by the kernel virtio driver.
> 

For the device ID, we can use the algorithm if we don't need to support
transitional IDs. But how do we get the class ID?

> Thanks
> 
> > +.vdev_id = VIRTIO_ID_NET,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_NET,
> > +.class_id = PCI_CLASS_NETWORK_ETHERNET,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_BLOCK,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_BLOCK,
> > +.class_id = PCI_CLASS_STORAGE_SCSI,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_CONSOLE,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_CONSOLE,
> > +.class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_SCSI,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_SCSI,
> > +.class_id = PCI_CLASS_STORAGE_SCSI,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_9P,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_9P,
> > +.class_id = PCI_BASE_CLASS_NETWORK,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_VSOCK,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_VSOCK,
> > +.class_id = PCI_CLASS_COMMUNICATION_OTHER,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_IOMMU,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_IOMMU,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_MEM,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_MEM,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_PMEM,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_PMEM,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_RNG,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_RNG,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +{
> > +.vdev_id = VIRTIO_ID_BALLOON,
> > +.pdev_id = PCI_DEVICE_ID_VIRTIO_BALLOON,
> > +.class_id = PCI_CLASS_OTHERS,
> > +},
> > +};
> > +
> > +static VirtIOPCIIDInfo virtio_pci_get_id_info(uint16_t vdev_id)
> > +{
> > +VirtIOPCIIDInfo info = {};
> > +int i;
> > +
> > +for (i = 0; i < ARRAY_SIZE(virtio_pci_id_info); i++) {
> > +if (virtio_pci_id_info[i].vdev_id == vdev_id) {
> > +info = virtio_pci_id_info[i];
> > +break;
> > +}
> > +}
> > +
> > +return info;
> > +}
> > +
> > +uint16_t virtio_pci_get_pci_devid(uint16_t device_id)
> > +{
> > +return virtio_pci_get_id_info(device_id).pdev_id;
> > +}
> > +
> > +uint16_t virtio_pci_get_class_id(uint16_t device_id)
> > +{
> > +return virtio_pci_get_id_info(device_id).class_id;
> > +}
> > +
> >  static bool virtio_pci_ioeventfd_enabled(DeviceState *d)
> >  {
> >  VirtIOPCIProxy *proxy = to_virtio_pci_proxy(d);
> > @@ -1674,6 +1764,9 @@ static void virtio_pci_device_plugged(DeviceState *d,
> Error **errp)
> >   * is set to PCI_SUBVENDOR_ID_REDHAT_QUMRANET by default.
> >   */
> >  pci_set_word(config + PCI_SUBSYSTEM_ID,
> virtio_bus_get_vdev_id(bus));
> > +if (proxy->pdev_id) {
> > +pci_config_set_device_id(config, proxy->pdev_id);
> > +}
> >  } else {
> >  /* pure virtio-1.0 */
> 

RE: [PATCH 0/2] kvm/msi: do explicit commit when adding msi routes

2022-01-03 Thread longpeng2--- via
Hi guys,

Ping...

> -Original Message-
> From: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> Sent: Saturday, December 11, 2021 10:27 PM
> To: pbonz...@redhat.com; alex.william...@redhat.com; m...@redhat.com;
> mtosa...@redhat.com
> Cc: k...@vger.kernel.org; qemu-devel@nongnu.org; Gonglei (Arei)
> ; Longpeng (Mike, Cloud Infrastructure Service
> Product Dept.) 
> Subject: [PATCH 0/2] kvm/msi: do explicit commit when adding msi routes
> 
> From: Longpeng 
> 
> This patchset moves the call to kvm_irqchip_commit_routes() out of
> kvm_irqchip_add_msi_route(). An optimization of vfio migration [1]
> depends on this changes.
> 
> [1] https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg00968.html
> 
> Longpeng (Mike) (2):
>   kvm-irqchip: introduce new API to support route change
>   kvm/msi: do explicit commit when adding msi routes
> 
>  accel/kvm/kvm-all.c|  7 ---
>  accel/stubs/kvm-stub.c |  2 +-
>  hw/misc/ivshmem.c  |  5 -
>  hw/vfio/pci.c  |  5 -
>  hw/virtio/virtio-pci.c |  4 +++-
>  include/sysemu/kvm.h   | 23 +--
>  target/i386/kvm/kvm.c  |  4 +++-
>  7 files changed, 40 insertions(+), 10 deletions(-)
> 
> --
> 2.23.0




RE: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support

2021-12-20 Thread longpeng2--- via



> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Monday, December 20, 2021 4:11 PM
> To: Jason Wang 
> Cc: Michael S. Tsirkin ; Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.) ; pa...@nvidia.com;
> xieyon...@bytedance.com; sgarz...@redhat.com; Yechuan ;
> Gonglei (Arei) ; qemu-devel@nongnu.org; Dr. David
> Alan Gilbert 
> Subject: Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support
> 
> On Mon, Dec 20, 2021 at 10:48:09AM +0800, Jason Wang wrote:
> > On Fri, Dec 17, 2021 at 4:35 PM Stefan Hajnoczi  wrote:
> > >
> > > On Fri, Dec 17, 2021 at 12:26:53PM +0800, Jason Wang wrote:
> > >
> > > Dave: You created the VIRTIO vmstate infrastructure in QEMU. Please see
> > > the bottom of this email about moving to a standard VIRTIO device
> > > save/load format defined by the VIRTIO spec in the future.
> > >
> > > > On Thu, Dec 16, 2021 at 5:10 PM Stefan Hajnoczi  
> > > > wrote:
> > > > >
> > > > > On Thu, Dec 16, 2021 at 11:01:40AM +0800, Jason Wang wrote:
> > > > > > On Wed, Dec 15, 2021 at 6:07 PM Stefan Hajnoczi 
> > > > > > 
> wrote:
> > > > > > >
> > > > > > > On Wed, Dec 15, 2021 at 11:18:05AM +0800, Jason Wang wrote:
> > > > > > > > On Tue, Dec 14, 2021 at 9:11 PM Stefan Hajnoczi 
> > > > > > > > 
> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, Dec 14, 2021 at 10:22:53AM +0800, Jason Wang wrote:
> > > > > > > > > > On Mon, Dec 13, 2021 at 11:14 PM Stefan Hajnoczi
>  wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Dec 13, 2021 at 10:47:00AM +0800, Jason Wang 
> > > > > > > > > > > wrote:
> > > > > > > > > > > > On Sun, Dec 12, 2021 at 5:30 PM Michael S. Tsirkin 
> > > > > > > > > > > > 
> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sat, Dec 11, 2021 at 03:00:27AM +, Longpeng 
> > > > > > > > > > > > > (Mike,
> Cloud Infrastructure Service Product Dept.) wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > -Original Message-
> > > > > > > > > > > > > > > From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> > > > > > > > > > > > > > > Sent: Thursday, December 9, 2021 5:17 PM
> > > > > > > > > > > > > > > To: Longpeng (Mike, Cloud Infrastructure Service
> Product Dept.)
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Cc: jasow...@redhat.com; m...@redhat.com;
> pa...@nvidia.com;
> > > > > > > > > > > > > > > xieyon...@bytedance.com; sgarz...@redhat.com; 
> > > > > > > > > > > > > > > Yechuan
> ;
> > > > > > > > > > > > > > > Gonglei (Arei) ;
> qemu-devel@nongnu.org
> > > > > > > > > > > > > > > Subject: Re: [RFC] vhost-vdpa-net: add 
> > > > > > > > > > > > > > > vhost-vdpa-net
> host device support
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Dec 08, 2021 at 01:20:10PM +0800, 
> > > > > > > > > > > > > > > Longpeng(Mike)
> wrote:
> > > > > > > > > > > > > > > > From: Longpeng 
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi guys,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > This patch introduces vhost-vdpa-net device, 
> > > > > > > > > > > > > > > > which
> is inspired
> > > > > > > > > > > > > > > > by vhost-user-blk and the proposal of 
> > > > > > > > > > > > > > > > vhost-vdpa-blk
> device [1].
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I've tested this patch on Huawei's offload card:
> > > > > > > > > > > > > > > > ./x86_64-softmmu/qemu-system-x86_64 \
> > > > > > > > > > > > > > > > -device
> vhost-vdpa-net-pci,vdpa-dev=/dev/vhost-vdpa-0
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > For virtio hardware offloading, the most 
> > > > > > > > > > > > > > > > important
> requirement for us
> > > > > > > > > > > > > > > > is to support live migration between offloading
> cards from different
> > > > > > > > > > > > > > > > vendors, the combination of netdev and 
> > > > > > > > > > > > > > > > virtio-net
> seems too heavy, we
> > > > > > > > > > > > > > > > prefer a lightweight way.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Maybe we could support both in the future ? Such
> as:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > * Lightweight
> > > > > > > > > > > > > > > >  Net: vhost-vdpa-net
> > > > > > > > > > > > > > > >  Storage: vhost-vdpa-blk
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > * Heavy but more powerful
> > > > > > > > > > > > > > > >  Net: netdev + virtio-net + vhost-vdpa
> > > > > > > > > > > > > > > >  Storage: bdrv + virtio-blk + vhost-vdpa
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > [1]
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg797569.html
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Stefano presented a plan for vdpa-blk at KVM Forum
> 2021:
> > > > > > > > > > > > > > >
> https://kvmforum2021.sched.com/event/ke3a/vdpa-blk-unified-hardware-and-sof
> > > > > > > > > > > > > > > 

RE: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support

2021-12-13 Thread longpeng2--- via



> -Original Message-
> From: Qemu-devel [mailto:qemu-devel-bounces+longpeng2=huawei@nongnu.org]
> On Behalf Of Stefan Hajnoczi
> Sent: Monday, December 13, 2021 11:16 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: m...@redhat.com; jasow...@redhat.com; qemu-devel@nongnu.org; Yechuan
> ; xieyon...@bytedance.com; Gonglei (Arei)
> ; pa...@nvidia.com; Stefano Garzarella
> 
> Subject: Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support
> 
> On Sat, Dec 11, 2021 at 04:11:04AM +, Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.) wrote:
> >
> >
> > > -Original Message-
> > > From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> > > Sent: Thursday, December 9, 2021 11:55 PM
> > > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > 
> > > Cc: Stefan Hajnoczi ; jasow...@redhat.com;
> m...@redhat.com;
> > > pa...@nvidia.com; xieyon...@bytedance.com; Yechuan ;
> > > Gonglei (Arei) ; qemu-devel@nongnu.org
> > > Subject: Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support
> > >
> > > On Thu, Dec 09, 2021 at 09:16:58AM +, Stefan Hajnoczi wrote:
> > > >On Wed, Dec 08, 2021 at 01:20:10PM +0800, Longpeng(Mike) wrote:
> > > >> From: Longpeng 
> > > >>
> > > >> Hi guys,
> > > >>
> > > >> This patch introduces vhost-vdpa-net device, which is inspired
> > > >> by vhost-user-blk and the proposal of vhost-vdpa-blk device [1].
> > > >>
> > > >> I've tested this patch on Huawei's offload card:
> > > >> ./x86_64-softmmu/qemu-system-x86_64 \
> > > >> -device vhost-vdpa-net-pci,vdpa-dev=/dev/vhost-vdpa-0
> > > >>
> > > >> For virtio hardware offloading, the most important requirement for us
> > > >> is to support live migration between offloading cards from different
> > > >> vendors, the combination of netdev and virtio-net seems too heavy, we
> > > >> prefer a lightweight way.
> > > >>
> > > >> Maybe we could support both in the future ? Such as:
> > > >>
> > > >> * Lightweight
> > > >>  Net: vhost-vdpa-net
> > > >>  Storage: vhost-vdpa-blk
> > > >>
> > > >> * Heavy but more powerful
> > > >>  Net: netdev + virtio-net + vhost-vdpa
> > > >>  Storage: bdrv + virtio-blk + vhost-vdpa
> > > >>
> > > >> [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg797569.html
> > > >
> > > >Stefano presented a plan for vdpa-blk at KVM Forum 2021:
> > > >https://kvmforum2021.sched.com/event/ke3a/vdpa-blk-unified-hardware-an
> d-so
> > > ftware-offload-for-virtio-blk-stefano-garzarella-red-hat
> > > >
> > > >It's closer to today's virtio-net + vhost-net approach than the
> > > >vhost-vdpa-blk device you have mentioned. The idea is to treat vDPA as
> > > >an offload feature rather than a completely separate code path that
> > > >needs to be maintained and tested. That way QEMU's block layer features
> > > >and live migration work with vDPA devices and re-use the virtio-blk
> > > >code. The key functionality that has not been implemented yet is a "fast
> > > >path" mechanism that allows the QEMU virtio-blk device's virtqueue to be
> > > >offloaded to vDPA.
> > > >
> > > >The unified vdpa-blk architecture should deliver the same performance
> > > >as the vhost-vdpa-blk device you mentioned but with more features, so I
> > > >wonder what aspects of the vhost-vdpa-blk idea are important to you?
> > > >
> > > >QEMU already has vhost-user-blk, which takes a similar approach as the
> > > >vhost-vdpa-blk device you are proposing. I'm not against the
> > > >vhost-vdpa-blk approach in priciple, but would like to understand your
> > > >requirements and see if there is a way to collaborate on one vdpa-blk
> > > >implementation instead of dividing our efforts between two.
> > >
> > > Waiting for the aspects that Stefan asked, I add some details about the
> > > plan for vdpa-blk.
> > >
> > > Currently I'm working on the in-kernel software device. In the next
> > > months I hope to start working on the QEMU part. Anyway that part could
> > > go in parallel with the in-kernel device, so if you are interested we
> > > can collaborate.
> > >
> >
> > The work on QEMU part means supporting the vdpa in BlockDriver and 
> > virtio-blk?
> >
> > In fact, I wanted to support the vdpa in QEMU block layer before I sent this
> > RFC, because the net part uses netdev + virtio-net while the storage part 
> > uses
> > vhost-vdpa-blk (from Yongji) looks like a strange combination.
> >
> > But I found enable vdpa in QEMU block layer would take more time and some
> > features (e.g. snapshot, IO throttling) from the QEMU block layer are not 
> > needed
> > in our hardware offloading case, so I turned to develop the 
> > "vhost-vdpa-net",
> > maybe the combination of vhost-vdpa-net and vhost-vdpa-blk is congruous.
> >
> > > Having only the unified vdpa-blk architecture would allow us to simplify
> > > the management layers and avoid duplicate code, but it takes more time
> > > to develop compared to vhost-vdpa-blk. So if vdpa-blk support in QEMU is
> > > 

RE: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support

2021-12-13 Thread longpeng2--- via


> -Original Message-
> From: Jason Wang [mailto:jasow...@redhat.com]
> Sent: Monday, December 13, 2021 11:23 AM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: mst ; Parav Pandit ; Yongji Xie
> ; Stefan Hajnoczi ; Stefano
> Garzarella ; Yechuan ; Gonglei (Arei)
> ; qemu-devel 
> Subject: Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support
> 
> On Sat, Dec 11, 2021 at 1:23 PM Longpeng (Mike, Cloud Infrastructure
> Service Product Dept.)  wrote:
> >
> >
> >
> > > -Original Message-
> > > From: Jason Wang [mailto:jasow...@redhat.com]
> > > Sent: Wednesday, December 8, 2021 2:27 PM
> > > To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> > > 
> > > Cc: mst ; Parav Pandit ; Yongji Xie
> > > ; Stefan Hajnoczi ; Stefano
> > > Garzarella ; Yechuan ; Gonglei
> (Arei)
> > > ; qemu-devel 
> > > Subject: Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support
> > >
> > > On Wed, Dec 8, 2021 at 1:20 PM Longpeng(Mike)  
> > > wrote:
> > > >
> > > > From: Longpeng 
> > > >
> > > > Hi guys,
> > > >
> > > > This patch introduces vhost-vdpa-net device, which is inspired
> > > > by vhost-user-blk and the proposal of vhost-vdpa-blk device [1].
> > > >
> > > > I've tested this patch on Huawei's offload card:
> > > > ./x86_64-softmmu/qemu-system-x86_64 \
> > > > -device vhost-vdpa-net-pci,vdpa-dev=/dev/vhost-vdpa-0
> > > >
> > > > For virtio hardware offloading, the most important requirement for us
> > > > is to support live migration between offloading cards from different
> > > > vendors, the combination of netdev and virtio-net seems too heavy, we
> > > > prefer a lightweight way.
> > >
> > > Could you elaborate more on this? It's mainly the control path when
> > > using with netdev, and it provides a lot of other benefits:
> > >
> > > - decouple the transport specific stuff out of the vhost abstraction,
> > > mmio device is supported with 0 line of code
> > > - migration compatibility, reuse the migration stream that is already
> > > supported by Qemu virtio-net, this will allow migration among
> > > different vhost backends.
> > > - software mediation facility, not all the virtqueues are assigned to
> > > guests directly. One example is the virtio-net cvq, qemu may want to
> > > intercept and record the device state for migration. Reusing the
> > > current virtio-net codes simplifies a lot of codes.
> > > - transparent failover (in the future), the nic model can choose to
> > > switch between vhost backends etc.
> > >
> >
> > We want to use the vdpa framework instead of the vfio-pci framework in
> > the virtio hardware offloading case, so maybe some of the benefits above
> > are not needed in our case. But we need to migrate between different
> > hardware, so I am not sure whether this approach would be harmful to the
> > requirement.
> 
> It should not, but it needs to build the migration facility for the
> net from the ground. And if we want to have a general migration
> solution instead of a vendor specific one, it may duplicate some logic
> of existing virtio-net implementation. The CVQ migration is an
> example, we don't provide a dedicated migration facility in the spec.
> So a more general way for live migration currently is using the shadow
> virtqueue which is what Eugenio is doing. So thanks to the design
> where we tried to do all the work in the vhost layer, this might not
> be a problem for this approach. But talking about the CVQ migration,
> things will be interesting. Qemu needs to decode the cvq commands in
> the middle thus it can record the device state. For having a general
> migration solution, vhost-vdpa-pci needs to do this as well.
> Virtio-net has the full CVQ logic so it's much easier, for
> vhost-vdpa-pci, it needs to duplicate them all in its own logic.
> 

OK, thanks for your patient explanation. We will follow the progress of the
live migration work.

> >
> > > >
> > > > Maybe we could support both in the future ?
> > >
> > > For the net, we need to figure out the advantages of this approach
> > > first. Note that we didn't have vhost-user-net-pci or vhost-pci in the
> > > past.
> > >
> >
> > Why didn't support vhost-user-net-pci in history ? Because its control
> > path is much more complex than the block ?
> 
> I don't know, it may be simply because no one tries to do that.
> 
> >
> > > For the block, I will leave Stefan and Stefano to comment.
> > >
> > > > Such as:
> > > >
> > > > * Lightweight
> > > >  Net: vhost-vdpa-net
> > > >  Storage: vhost-vdpa-blk
> > > >
> > > > * Heavy but more powerful
> > > >  Net: netdev + virtio-net + vhost-vdpa
> > > >  Storage: bdrv + virtio-blk + vhost-vdpa
> > > >
> > > > [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg797569.html
> > > >
> > > > Signed-off-by: Longpeng(Mike) 
> > > > ---
> > > >  hw/net/meson.build |   1 +
> > > >  hw/net/vhost-vdpa-net.c| 338
> > > +
> > > >  hw/virtio/Kconfig

RE: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support

2021-12-10 Thread longpeng2--- via


> -Original Message-
> From: Jason Wang [mailto:jasow...@redhat.com]
> Sent: Wednesday, December 8, 2021 2:27 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: mst ; Parav Pandit ; Yongji Xie
> ; Stefan Hajnoczi ; Stefano
> Garzarella ; Yechuan ; Gonglei (Arei)
> ; qemu-devel 
> Subject: Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support
> 
> On Wed, Dec 8, 2021 at 1:20 PM Longpeng(Mike)  wrote:
> >
> > From: Longpeng 
> >
> > Hi guys,
> >
> > This patch introduces vhost-vdpa-net device, which is inspired
> > by vhost-user-blk and the proposal of vhost-vdpa-blk device [1].
> >
> > I've tested this patch on Huawei's offload card:
> > ./x86_64-softmmu/qemu-system-x86_64 \
> > -device vhost-vdpa-net-pci,vdpa-dev=/dev/vhost-vdpa-0
> >
> > For virtio hardware offloading, the most important requirement for us
> > is to support live migration between offloading cards from different
> > vendors, the combination of netdev and virtio-net seems too heavy, we
> > prefer a lightweight way.
> 
> Could you elaborate more on this? It's mainly the control path when
> using with netdev, and it provides a lot of other benefits:
> 
> - decouple the transport specific stuff out of the vhost abstraction,
> mmio device is supported with 0 line of code
> - migration compatibility, reuse the migration stream that is already
> supported by Qemu virtio-net, this will allow migration among
> different vhost backends.
> - software mediation facility, not all the virtqueues are assigned to
> guests directly. One example is the virtio-net cvq, qemu may want to
> intercept and record the device state for migration. Reusing the
> current virtio-net codes simplifies a lot of codes.
> - transparent failover (in the future), the nic model can choose to
> switch between vhost backends etc.
> 

We want to use the vdpa framework instead of the vfio-pci framework in 
the virtio hardware offloading case, so maybe some of the benefits above
are not needed in our case. But we need to migrate between different
hardware, so I am not sure whether this approach would conflict with that
requirement.

> >
> > Maybe we could support both in the future ?
> 
> For the net, we need to figure out the advantages of this approach
> first. Note that we didn't have vhost-user-net-pci or vhost-pci in the
> past.
> 

Why wasn't vhost-user-net-pci supported in the past? Is it because its
control path is much more complex than the block one?

> For the block, I will leave Stefan and Stefano to comment.
> 
> > Such as:
> >
> > * Lightweight
> >  Net: vhost-vdpa-net
> >  Storage: vhost-vdpa-blk
> >
> > * Heavy but more powerful
> >  Net: netdev + virtio-net + vhost-vdpa
> >  Storage: bdrv + virtio-blk + vhost-vdpa
> >
> > [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg797569.html
> >
> > Signed-off-by: Longpeng(Mike) 
> > ---
> >  hw/net/meson.build |   1 +
> >  hw/net/vhost-vdpa-net.c| 338
> +
> >  hw/virtio/Kconfig  |   5 +
> >  hw/virtio/meson.build  |   1 +
> >  hw/virtio/vhost-vdpa-net-pci.c | 118 +
> 
> I'd expect there's no device type specific code in this approach and
> any kind of vDPA devices could be used with a general pci device.
> 
> Any reason for having net specific types here?
> 

No, it is just because there was already a proposal for vhost-vdpa-blk, so I
developed vhost-vdpa-net correspondingly.

I pretty much agree with your suggestion. If feasible, then just like
vfio-pci, we wouldn't need to maintain device-type-specific code in QEMU;
what's more, it would be possible to support live migration between
different virtio hardware.

> >  include/hw/virtio/vhost-vdpa-net.h |  31 
> >  include/net/vhost-vdpa.h   |   2 +
> >  net/vhost-vdpa.c   |   2 +-
> >  8 files changed, 497 insertions(+), 1 deletion(-)
> >  create mode 100644 hw/net/vhost-vdpa-net.c
> >  create mode 100644 hw/virtio/vhost-vdpa-net-pci.c
> >  create mode 100644 include/hw/virtio/vhost-vdpa-net.h
> >
> > diff --git a/hw/net/meson.build b/hw/net/meson.build
> > index bdf71f1..139ebc4 100644
> > --- a/hw/net/meson.build
> > +++ b/hw/net/meson.build
> > @@ -44,6 +44,7 @@ specific_ss.add(when: 'CONFIG_XILINX_ETHLITE', if_true:
> files('xilinx_ethlite.c'
> >
> >  softmmu_ss.add(when: 'CONFIG_VIRTIO_NET', if_true: files('net_rx_pkt.c'))
> >  specific_ss.add(when: 'CONFIG_VIRTIO_NET', if_true: files('virtio-net.c'))
> > +specific_ss.add(when: 'CONFIG_VHOST_VDPA_NET', if_true:
> files('vhost-vdpa-net.c'))
> >
> >  softmmu_ss.add(when: ['CONFIG_VIRTIO_NET', 'CONFIG_VHOST_NET'], if_true:
> files('vhost_net.c'), if_false: files('vhost_net-stub.c'))
> >  softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost_net-stub.c'))
> > diff --git a/hw/net/vhost-vdpa-net.c b/hw/net/vhost-vdpa-net.c
> > new file mode 100644
> > index 000..48b99f9
> > --- /dev/null
> > +++ b/hw/net/vhost-vdpa-net.c
> > @@ -0,0 +1,338 @@
> > 

RE: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support

2021-12-10 Thread longpeng2--- via



> -Original Message-
> From: Stefano Garzarella [mailto:sgarz...@redhat.com]
> Sent: Thursday, December 9, 2021 11:55 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: Stefan Hajnoczi ; jasow...@redhat.com; 
> m...@redhat.com;
> pa...@nvidia.com; xieyon...@bytedance.com; Yechuan ;
> Gonglei (Arei) ; qemu-devel@nongnu.org
> Subject: Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support
> 
> On Thu, Dec 09, 2021 at 09:16:58AM +, Stefan Hajnoczi wrote:
> >On Wed, Dec 08, 2021 at 01:20:10PM +0800, Longpeng(Mike) wrote:
> >> From: Longpeng 
> >>
> >> Hi guys,
> >>
> >> This patch introduces vhost-vdpa-net device, which is inspired
> >> by vhost-user-blk and the proposal of vhost-vdpa-blk device [1].
> >>
> >> I've tested this patch on Huawei's offload card:
> >> ./x86_64-softmmu/qemu-system-x86_64 \
> >> -device vhost-vdpa-net-pci,vdpa-dev=/dev/vhost-vdpa-0
> >>
> >> For virtio hardware offloading, the most important requirement for us
> >> is to support live migration between offloading cards from different
> >> vendors, the combination of netdev and virtio-net seems too heavy, we
> >> prefer a lightweight way.
> >>
> >> Maybe we could support both in the future ? Such as:
> >>
> >> * Lightweight
> >>  Net: vhost-vdpa-net
> >>  Storage: vhost-vdpa-blk
> >>
> >> * Heavy but more powerful
> >>  Net: netdev + virtio-net + vhost-vdpa
> >>  Storage: bdrv + virtio-blk + vhost-vdpa
> >>
> >> [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg797569.html
> >
> >Stefano presented a plan for vdpa-blk at KVM Forum 2021:
> >https://kvmforum2021.sched.com/event/ke3a/vdpa-blk-unified-hardware-and-so
> ftware-offload-for-virtio-blk-stefano-garzarella-red-hat
> >
> >It's closer to today's virtio-net + vhost-net approach than the
> >vhost-vdpa-blk device you have mentioned. The idea is to treat vDPA as
> >an offload feature rather than a completely separate code path that
> >needs to be maintained and tested. That way QEMU's block layer features
> >and live migration work with vDPA devices and re-use the virtio-blk
> >code. The key functionality that has not been implemented yet is a "fast
> >path" mechanism that allows the QEMU virtio-blk device's virtqueue to be
> >offloaded to vDPA.
> >
> >The unified vdpa-blk architecture should deliver the same performance
> >as the vhost-vdpa-blk device you mentioned but with more features, so I
> >wonder what aspects of the vhost-vdpa-blk idea are important to you?
> >
> >QEMU already has vhost-user-blk, which takes a similar approach as the
> >vhost-vdpa-blk device you are proposing. I'm not against the
> >vhost-vdpa-blk approach in priciple, but would like to understand your
> >requirements and see if there is a way to collaborate on one vdpa-blk
> >implementation instead of dividing our efforts between two.
> 
> Waiting for the aspects that Stefan asked, I add some details about the
> plan for vdpa-blk.
> 
> Currently I'm working on the in-kernel software device. In the next
> months I hope to start working on the QEMU part. Anyway that part could
> go in parallel with the in-kernel device, so if you are interested we
> can collaborate.
> 

Does the work on the QEMU part mean supporting vdpa in BlockDriver and
virtio-blk?

In fact, I wanted to support vdpa in the QEMU block layer before I sent this
RFC, because having the net part use netdev + virtio-net while the storage
part uses vhost-vdpa-blk (from Yongji) looks like a strange combination.

But I found that enabling vdpa in the QEMU block layer would take more time,
and some features (e.g. snapshot, IO throttling) from the QEMU block layer
are not needed in our hardware offloading case, so I turned to developing
"vhost-vdpa-net"; maybe the combination of vhost-vdpa-net and vhost-vdpa-blk
is a more consistent pairing.

> Having only the unified vdpa-blk architecture would allow us to simplify
> the management layers and avoid duplicate code, but it takes more time
> to develop compared to vhost-vdpa-blk. So if vdpa-blk support in QEMU is
> urgent, I could understand the need to add vhost-vdpa-blk now.
> 

I prefer a way that can quickly support vdpa devices (not only net and
storage, but also other device types) in the hardware offloading case. It
may be less general, but it could be an alternative for some users.

> Let me know if you want more details about the unified vdpa-blk
> architecture.
> 
> Thanks,
> Stefano




RE: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support

2021-12-10 Thread longpeng2--- via



> -Original Message-
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com]
> Sent: Thursday, December 9, 2021 5:17 PM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: jasow...@redhat.com; m...@redhat.com; pa...@nvidia.com;
> xieyon...@bytedance.com; sgarz...@redhat.com; Yechuan ;
> Gonglei (Arei) ; qemu-devel@nongnu.org
> Subject: Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support
> 
> On Wed, Dec 08, 2021 at 01:20:10PM +0800, Longpeng(Mike) wrote:
> > From: Longpeng 
> >
> > Hi guys,
> >
> > This patch introduces vhost-vdpa-net device, which is inspired
> > by vhost-user-blk and the proposal of vhost-vdpa-blk device [1].
> >
> > I've tested this patch on Huawei's offload card:
> > ./x86_64-softmmu/qemu-system-x86_64 \
> > -device vhost-vdpa-net-pci,vdpa-dev=/dev/vhost-vdpa-0
> >
> > For virtio hardware offloading, the most important requirement for us
> > is to support live migration between offloading cards from different
> > vendors, the combination of netdev and virtio-net seems too heavy, we
> > prefer a lightweight way.
> >
> > Maybe we could support both in the future ? Such as:
> >
> > * Lightweight
> >  Net: vhost-vdpa-net
> >  Storage: vhost-vdpa-blk
> >
> > * Heavy but more powerful
> >  Net: netdev + virtio-net + vhost-vdpa
> >  Storage: bdrv + virtio-blk + vhost-vdpa
> >
> > [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg797569.html
> 
> Stefano presented a plan for vdpa-blk at KVM Forum 2021:
> https://kvmforum2021.sched.com/event/ke3a/vdpa-blk-unified-hardware-and-sof
> tware-offload-for-virtio-blk-stefano-garzarella-red-hat
> 
> It's closer to today's virtio-net + vhost-net approach than the
> vhost-vdpa-blk device you have mentioned. The idea is to treat vDPA as
> an offload feature rather than a completely separate code path that
> needs to be maintained and tested. That way QEMU's block layer features
> and live migration work with vDPA devices and re-use the virtio-blk
> code. The key functionality that has not been implemented yet is a "fast
> path" mechanism that allows the QEMU virtio-blk device's virtqueue to be
> offloaded to vDPA.
> 
> The unified vdpa-blk architecture should deliver the same performance
> as the vhost-vdpa-blk device you mentioned but with more features, so I
> wonder what aspects of the vhost-vdpa-blk idea are important to you?
> 
> QEMU already has vhost-user-blk, which takes a similar approach as the
> vhost-vdpa-blk device you are proposing. I'm not against the
> vhost-vdpa-blk approach in priciple, but would like to understand your
> requirements and see if there is a way to collaborate on one vdpa-blk
> implementation instead of dividing our efforts between two.
> 

We prefer a simple way in the virtio hardware offloading case: it could
reduce our maintenance workload, since we would no longer need to maintain
virtio-net, netdev, virtio-blk, bdrv and so on. If we need to support other
vdpa devices (such as virtio-crypto or virtio-fs) in the future, would we
also have to maintain the corresponding device emulation code?

For the virtio hardware offloading case, we usually use the vfio-pci
framework; it saves a lot of maintenance work in QEMU because we don't need
to touch the device types. Inspired by Jason, what we really prefer is a
generic "vhost-vdpa-pci/mmio" device used instead of vfio-pci. It could
provide the same performance as vfio-pci, but it would be *possible* to
support live migration between offloading cards from different vendors.
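
To make it concrete, the kind of invocation we have in mind (the device and
property names below are placeholders in the style of the RFC command line,
not a final interface):

  ./x86_64-softmmu/qemu-system-x86_64 \
  -device vhost-vdpa-device-pci,vdpa-dev=/dev/vhost-vdpa-0

i.e. a single generic device type for any vdpa char device, rather than
vfio-pci passthrough or a per-type vhost-vdpa-net / vhost-vdpa-blk device.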

> Stefan



RE: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support

2021-12-10 Thread longpeng2--- via



> -Original Message-
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Thursday, December 9, 2021 3:05 AM
> To: Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
> 
> Cc: jasow...@redhat.com; pa...@nvidia.com; xieyon...@bytedance.com;
> stefa...@redhat.com; sgarz...@redhat.com; Yechuan ;
> Gonglei (Arei) ; qemu-devel@nongnu.org
> Subject: Re: [RFC] vhost-vdpa-net: add vhost-vdpa-net host device support
> 
> On Wed, Dec 08, 2021 at 01:20:10PM +0800, Longpeng(Mike) wrote:
> > From: Longpeng 
> >
> > Hi guys,
> >
> > This patch introduces vhost-vdpa-net device, which is inspired
> > by vhost-user-blk and the proposal of vhost-vdpa-blk device [1].
> >
> > I've tested this patch on Huawei's offload card:
> > ./x86_64-softmmu/qemu-system-x86_64 \
> > -device vhost-vdpa-net-pci,vdpa-dev=/dev/vhost-vdpa-0
> >
> > For virtio hardware offloading, the most important requirement for us
> > is to support live migration between offloading cards from different
> > vendors, the combination of netdev and virtio-net seems too heavy, we
> > prefer a lightweight way.
> 
> Did not look at the patch in depth yet.
> Is this already supported with this patch? Or is that just the plan?
> 

With this patch, the data plane works; I've tested it on Huawei's offloading
card. But live migration is not supported yet.

> > Maybe we could support both in the future ? Such as:
> >
> > * Lightweight
> >  Net: vhost-vdpa-net
> >  Storage: vhost-vdpa-blk
> >
> > * Heavy but more powerful
> >  Net: netdev + virtio-net + vhost-vdpa
> >  Storage: bdrv + virtio-blk + vhost-vdpa
> 
> I'd like to better understand what is in and out of scope for
> this device. Which features would be "more powerful" and belong
> in virtio-net, and which in vhost-vdpa-net?
> 

There's no doubt that the combination of netdev + virtio-net + vhost-vdpa
could provide lots of benefits (such as those Jason listed in his comments);
it's more generic.

However, vhost-vdpa-net only aims at the virtio hardware offloading case:
besides data plane passthrough, migration between offloading cards from
different vendors is our goal. Some features (e.g. transparent failover,
migration between different types of vhost backends) may not be used in
such a specific case.

> > [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg797569.html
> >
> > Signed-off-by: Longpeng(Mike) 
> > ---
> >  hw/net/meson.build |   1 +
> >  hw/net/vhost-vdpa-net.c| 338
> +
> >  hw/virtio/Kconfig  |   5 +
> >  hw/virtio/meson.build  |   1 +
> >  hw/virtio/vhost-vdpa-net-pci.c | 118 +
> >  include/hw/virtio/vhost-vdpa-net.h |  31 
> >  include/net/vhost-vdpa.h   |   2 +
> >  net/vhost-vdpa.c   |   2 +-
> >  8 files changed, 497 insertions(+), 1 deletion(-)
> >  create mode 100644 hw/net/vhost-vdpa-net.c
> >  create mode 100644 hw/virtio/vhost-vdpa-net-pci.c
> >  create mode 100644 include/hw/virtio/vhost-vdpa-net.h
> >
> > diff --git a/hw/net/meson.build b/hw/net/meson.build
> > index bdf71f1..139ebc4 100644
> > --- a/hw/net/meson.build
> > +++ b/hw/net/meson.build
> > @@ -44,6 +44,7 @@ specific_ss.add(when: 'CONFIG_XILINX_ETHLITE', if_true:
> files('xilinx_ethlite.c'
> >
> >  softmmu_ss.add(when: 'CONFIG_VIRTIO_NET', if_true: files('net_rx_pkt.c'))
> >  specific_ss.add(when: 'CONFIG_VIRTIO_NET', if_true: files('virtio-net.c'))
> > +specific_ss.add(when: 'CONFIG_VHOST_VDPA_NET', if_true:
> files('vhost-vdpa-net.c'))
> >
> >  softmmu_ss.add(when: ['CONFIG_VIRTIO_NET', 'CONFIG_VHOST_NET'], if_true:
> files('vhost_net.c'), if_false: files('vhost_net-stub.c'))
> >  softmmu_ss.add(when: 'CONFIG_ALL', if_true: files('vhost_net-stub.c'))
> > diff --git a/hw/net/vhost-vdpa-net.c b/hw/net/vhost-vdpa-net.c
> > new file mode 100644
> > index 000..48b99f9
> > --- /dev/null
> > +++ b/hw/net/vhost-vdpa-net.c
> > @@ -0,0 +1,338 @@
> > +#include "qemu/osdep.h"
> > +#include "qapi/error.h"
> > +#include "qemu/error-report.h"
> > +#include "qemu/cutils.h"
> > +#include "hw/qdev-core.h"
> > +#include "hw/qdev-properties.h"
> > +#include "hw/qdev-properties-system.h"
> > +#include "hw/virtio/vhost.h"
> > +#include "hw/virtio/vhost-vdpa-net.h"
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/virtio/virtio-bus.h"
> > +#include "hw/virtio/virtio-access.h"
> > +#include "sysemu/sysemu.h"
> > +#include "sysemu/runstate.h"
> > +#include "net/vhost-vdpa.h"
> > +
> > +static void vhost_vdpa_net_get_config(VirtIODevice *vdev, uint8_t *config)
> > +{
> > +VHostVdpaNet *s = VHOST_VDPA_NET(vdev);
> > +
> > +memcpy(config, >netcfg, sizeof(struct virtio_net_config));
> > +}
> > +
> > +static void vhost_vdpa_net_set_config(VirtIODevice *vdev, const uint8_t
> *config)
> > +{
> > +VHostVdpaNet *s = VHOST_VDPA_NET(vdev);
> > +struct virtio_net_config *netcfg = (struct virtio_net_config *)config;
> > +int ret;
> > +
> > +