Re: [PATCH 22/23] i2c: virtio: Remove #ifdef guards for PM related functions
On 05-07-23, 22:45, Paul Cercueil wrote: > Use the new PM macros for the suspend and resume functions to be > automatically dropped by the compiler when CONFIG_PM or > CONFIG_PM_SLEEP are disabled, without having to use #ifdef guards. > > This has the advantage of always compiling these functions in, > independently of any Kconfig option. Thanks to that, bugs and other > regressions are subsequently easier to catch. > > Signed-off-by: Paul Cercueil > > --- > Cc: Conghui Chen > Cc: Viresh Kumar > Cc: virtualization@lists.linux-foundation.org > --- > drivers/i2c/busses/i2c-virtio.c | 8 ++-- > 1 file changed, 2 insertions(+), 6 deletions(-) > > diff --git a/drivers/i2c/busses/i2c-virtio.c b/drivers/i2c/busses/i2c-virtio.c > index 4b9536f50800..c60ae531ba57 100644 > --- a/drivers/i2c/busses/i2c-virtio.c > +++ b/drivers/i2c/busses/i2c-virtio.c > @@ -243,7 +243,6 @@ static struct virtio_device_id id_table[] = { > }; > MODULE_DEVICE_TABLE(virtio, id_table); > > -#ifdef CONFIG_PM_SLEEP > static int virtio_i2c_freeze(struct virtio_device *vdev) > { > virtio_i2c_del_vqs(vdev); > @@ -254,7 +253,6 @@ static int virtio_i2c_restore(struct virtio_device *vdev) > { > return virtio_i2c_setup_vqs(vdev->priv); > } > -#endif > > static const unsigned int features[] = { > VIRTIO_I2C_F_ZERO_LENGTH_REQUEST, > @@ -269,10 +267,8 @@ static struct virtio_driver virtio_i2c_driver = { > .driver = { > .name = "i2c_virtio", > }, > -#ifdef CONFIG_PM_SLEEP > - .freeze = virtio_i2c_freeze, > - .restore = virtio_i2c_restore, > -#endif > + .freeze = pm_sleep_ptr(virtio_i2c_freeze), > + .restore = pm_sleep_ptr(virtio_i2c_restore), > }; > module_virtio_driver(virtio_i2c_driver); > Acked-by: Viresh Kumar -- viresh ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 3/3] vduse: Temporarily disable control queue features
On Wed, Jul 5, 2023 at 6:04 PM Maxime Coquelin wrote: > > Virtio-net driver control queue implementation is not safe > when used with VDUSE. If the VDUSE application does not > reply to control queue messages, it currently ends up > hanging the kernel thread sending this command. > > Some work is on-going to make the control queue > implementation robust with VDUSE. Until it is completed, > let's filter out control virtqueue and features that depend > on it by keeping only features known to be supported. > > Signed-off-by: Maxime Coquelin Acked-by: Jason Wang Thanks > --- > drivers/vdpa/vdpa_user/vduse_dev.c | 36 ++ > 1 file changed, 36 insertions(+) > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c > b/drivers/vdpa/vdpa_user/vduse_dev.c > index 1271c9796517..7345071db0a8 100644 > --- a/drivers/vdpa/vdpa_user/vduse_dev.c > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c > @@ -46,6 +46,30 @@ > > #define IRQ_UNBOUND -1 > > +#define VDUSE_NET_VALID_FEATURES_MASK \ > + (BIT_ULL(VIRTIO_NET_F_CSUM) | \ > +BIT_ULL(VIRTIO_NET_F_GUEST_CSUM) | \ > +BIT_ULL(VIRTIO_NET_F_MTU) |\ > +BIT_ULL(VIRTIO_NET_F_MAC) |\ > +BIT_ULL(VIRTIO_NET_F_GUEST_TSO4) | \ > +BIT_ULL(VIRTIO_NET_F_GUEST_TSO6) | \ > +BIT_ULL(VIRTIO_NET_F_GUEST_ECN) | \ > +BIT_ULL(VIRTIO_NET_F_GUEST_UFO) | \ > +BIT_ULL(VIRTIO_NET_F_HOST_TSO4) | \ > +BIT_ULL(VIRTIO_NET_F_HOST_TSO6) | \ > +BIT_ULL(VIRTIO_NET_F_HOST_ECN) | \ > +BIT_ULL(VIRTIO_NET_F_HOST_UFO) | \ > +BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) | \ > +BIT_ULL(VIRTIO_NET_F_STATUS) | \ > +BIT_ULL(VIRTIO_NET_F_HOST_USO) | \ > +BIT_ULL(VIRTIO_F_ANY_LAYOUT) | \ > +BIT_ULL(VIRTIO_RING_F_INDIRECT_DESC) | \ > +BIT_ULL(VIRTIO_F_EVENT_IDX) | \ > +BIT_ULL(VIRTIO_F_VERSION_1) | \ > +BIT_ULL(VIRTIO_F_IOMMU_PLATFORM) | \ > +BIT_ULL(VIRTIO_F_RING_PACKED) |\ > +BIT_ULL(VIRTIO_F_IN_ORDER)) > + > struct vduse_virtqueue { > u16 index; > u16 num_max; > @@ -1778,6 +1802,16 @@ static struct attribute *vduse_dev_attrs[] = { > > ATTRIBUTE_GROUPS(vduse_dev); > > +static void 
vduse_dev_features_filter(struct vduse_dev_config *config) > +{ > + /* > +* Temporarily filter out virtio-net's control virtqueue and features > +* that depend on it while CVQ is being made more robust for VDUSE. > +*/ > + if (config->device_id == VIRTIO_ID_NET) > + config->features &= VDUSE_NET_VALID_FEATURES_MASK; > +} > + > static int vduse_create_dev(struct vduse_dev_config *config, > void *config_buf, u64 api_version) > { > @@ -1793,6 +1827,8 @@ static int vduse_create_dev(struct vduse_dev_config > *config, > if (!dev) > goto err; > > + vduse_dev_features_filter(config); > + > dev->api_version = api_version; > dev->device_features = config->features; > dev->device_id = config->device_id; > -- > 2.41.0
Re: [PATCH] vdpa: reject F_ENABLE_AFTER_DRIVER_OK if backend does not support it
On Wed, Jul 5, 2023 at 4:42 PM Michael S. Tsirkin wrote: > > On Wed, Jul 05, 2023 at 03:55:23PM +0800, Jason Wang wrote: > > On Tue, Jul 4, 2023 at 6:38 PM Michael S. Tsirkin wrote: > > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote: > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin > > > > wrote: > > > > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote: > > > > > > With the current code it is accepted as long as userland send it. > > > > > > > > > > > > Although userland should not set a feature flag that has not been > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code > > > > > > will not > > > > > > complain for it. > > > > > > > > > > > > Since there is no specific reason for any parent to reject that > > > > > > backend > > > > > > feature bit when it has been proposed, let's control it at vdpa > > > > > > frontend > > > > > > level. Future patches may move this control to the parent driver. > > > > > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature") > > > > > > Signed-off-by: Eugenio Pérez > > > > > > > > > > Please do send v3. And again, I don't want to send "after driver ok" > > > > > hack > > > > > upstream at all, I merged it in next just to give it some testing. > > > > > We want RING_ACCESS_AFTER_KICK or some such. > > > > > > > > > > > > > Current devices do not support that semantic. > > > > > > Which devices specifically access the ring after DRIVER_OK but before > > > a kick? > > > > Vhost-net is one example at least. It polls a socket as well, so it > > starts to access the ring immediately after DRIVER_OK. > > > > Thanks > > > For sure but that is not vdpa. Somehow via vp_vdpa, which is what I usually use to test vDPA patches. The problem is that it's very hard to audit all vDPA devices now if the behaviour is not defined in the spec before they are invented. 
Thanks > > > > > > > > My plan was to convert > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I > > > > was not explicit enough. > > > > > > > > The only solution I can see to that is to trap & emulate in the vdpa > > > > (parent?) driver, as talked in virtio-comment. But that complicates > > > > the architecture: > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK > > > > * Store vq enable state separately, at > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw > > > > * Store the doorbell state separately, but do not configure it to the > > > > device directly. > > > > > > > > But how to recover if the device cannot configure them at kick time, > > > > for example? > > > > > > > > Maybe we can just fail if the parent driver does not support enabling > > > > the vq after DRIVER_OK? That way no new feature flag is needed. > > > > > > > > Thanks! > > > > > > > > > > > > > > > --- > > > > > > Sent with Fixes: tag pointing to > > > > > > git.kernel.org/pub/scm/linux/kernel/git/mst > > > > > > commit. Please let me know if I should send a v3 of [1] instead. 
> > > > > > > > > > > > [1] > > > > > > https://lore.kernel.org/lkml/20230609121244-mutt-send-email-...@kernel.org/T/ > > > > > > --- > > > > > > drivers/vhost/vdpa.c | 7 +-- > > > > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c > > > > > > index e1abf29fed5b..a7e554352351 100644 > > > > > > --- a/drivers/vhost/vdpa.c > > > > > > +++ b/drivers/vhost/vdpa.c > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct > > > > > > file *filep, > > > > > > { > > > > > > struct vhost_vdpa *v = filep->private_data; > > > > > > struct vhost_dev *d = >vdev; > > > > > > + const struct vdpa_config_ops *ops = v->vdpa->config; > > > > > > void __user *argp = (void __user *)arg; > > > > > > u64 __user *featurep = argp; > > > > > > - u64 features; > > > > > > + u64 features, parent_features = 0; > > > > > > long r = 0; > > > > > > > > > > > > if (cmd == VHOST_SET_BACKEND_FEATURES) { > > > > > > if (copy_from_user(, featurep, > > > > > > sizeof(features))) > > > > > > return -EFAULT; > > > > > > + if (ops->get_backend_features) > > > > > > + parent_features = > > > > > > ops->get_backend_features(v->vdpa); > > > > > > if (features & ~(VHOST_VDPA_BACKEND_FEATURES | > > > > > >BIT_ULL(VHOST_BACKEND_F_SUSPEND) | > > > > > >BIT_ULL(VHOST_BACKEND_F_RESUME) | > > > > > > - > > > > > > BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK))) > > > > > > + parent_features)) > > > > > > return -EOPNOTSUPP; > > > > > > if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && > > > > > >
Re: [PATCH] vdpa: reject F_ENABLE_AFTER_DRIVER_OK if backend does not support it
On Wed, Jul 5, 2023 at 4:41 PM Michael S. Tsirkin wrote: > > On Wed, Jul 05, 2023 at 03:49:58PM +0800, Jason Wang wrote: > > On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin wrote: > > > > > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote: > > > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin > > > > wrote: > > > > > > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote: > > > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin > > > > > > wrote: > > > > > > > > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote: > > > > > > > > With the current code it is accepted as long as userland send > > > > > > > > it. > > > > > > > > > > > > > > > > Although userland should not set a feature flag that has not > > > > > > > > been > > > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code > > > > > > > > will not > > > > > > > > complain for it. > > > > > > > > > > > > > > > > Since there is no specific reason for any parent to reject that > > > > > > > > backend > > > > > > > > feature bit when it has been proposed, let's control it at vdpa > > > > > > > > frontend > > > > > > > > level. Future patches may move this control to the parent > > > > > > > > driver. > > > > > > > > > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept > > > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature") > > > > > > > > Signed-off-by: Eugenio Pérez > > > > > > > > > > > > > > Please do send v3. And again, I don't want to send "after driver > > > > > > > ok" hack > > > > > > > upstream at all, I merged it in next just to give it some testing. > > > > > > > We want RING_ACCESS_AFTER_KICK or some such. > > > > > > > > > > > > > > > > > > > Current devices do not support that semantic. > > > > > > > > > > Which devices specifically access the ring after DRIVER_OK but before > > > > > a kick? 
> > > > > > > > > > > > > Previous versions of the QEMU LM series did a spurious kick to start > > > > traffic at the LM destination [1]. When it was proposed, that spurious > > > > kick was removed from the series because to check for descriptors > > > > after driver_ok, even without a kick, was considered work of the > > > > parent driver. > > > > > > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw > > > > will read the ring before the kick actually. I can ask. > > > > > > > > Thanks! > > > > > > > > [1] > > > > https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html > > > > > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no? > > > > My understanding is [1] assuming ACCESS_AFTER_KICK. This seems > > sub-optimal than assuming ENABLE_AFTER_DRIVER_OK. > > > > But this reminds me one thing, as the thread is going too long, I > > wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is > > supported? > > > > Thanks > > I don't see what does one have to do with another ... > > I think with RING_RESET we had another solution, enable rings > mapping them to a zero page, then reset and re-enable later. As discussed before, this seems to have some problems: 1) The behaviour is not clarified in the document 2) zero is a valid IOVA Thanks > > > > > > > > > > > > > > > > My plan was to convert > > > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I > > > > > > was not explicit enough. > > > > > > > > > > > > The only solution I can see to that is to trap & emulate in the vdpa > > > > > > (parent?) driver, as talked in virtio-comment. 
But that complicates > > > > > > the architecture: > > > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK > > > > > > * Store vq enable state separately, at > > > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw > > > > > > * Store the doorbell state separately, but do not configure it to > > > > > > the > > > > > > device directly. > > > > > > > > > > > > But how to recover if the device cannot configure them at kick time, > > > > > > for example? > > > > > > > > > > > > Maybe we can just fail if the parent driver does not support > > > > > > enabling > > > > > > the vq after DRIVER_OK? That way no new feature flag is needed. > > > > > > > > > > > > Thanks! > > > > > > > > > > > > > > > > > > > > > --- > > > > > > > > Sent with Fixes: tag pointing to > > > > > > > > git.kernel.org/pub/scm/linux/kernel/git/mst > > > > > > > > commit. Please let me know if I should send a v3 of [1] instead. > > > > > > > > > > > > > > > > [1] > > > > > > > > https://lore.kernel.org/lkml/20230609121244-mutt-send-email-...@kernel.org/T/ > > > > > > > > --- > > > > > > > > drivers/vhost/vdpa.c | 7 +-- > > > > > > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c > > > > > > > > index e1abf29fed5b..a7e554352351 100644 > > > > > > > > --- a/drivers/vhost/vdpa.c > > > > > > > > +++
Re: [PATCH v2 00/24] use vmalloc_array and vcalloc
Julia, > The functions vmalloc_array and vcalloc were introduced in > > commit a8749a35c399 ("mm: vmalloc: introduce array allocation functions") > > but are not used much yet. This series introduces uses of > these functions, to protect against multiplication overflows. Applied #7 and #24 to 6.5/scsi-staging, thanks! -- Martin K. Petersen Oracle Linux Engineering ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net-next 2/5] virtio_net: Add page_pool support to improve performance
On Wed, Jul 5, 2023 at 2:05 PM Jason Wang wrote: > > On Wed, Jul 5, 2023 at 1:41 PM Liang Chen wrote: > > > > On Fri, Jun 9, 2023 at 10:57 AM Liang Chen > > wrote: > > > > > > On Thu, Jun 8, 2023 at 8:38 AM Jason Wang wrote: > > > > > > > > On Thu, Jun 8, 2023 at 4:17 AM Michael S. Tsirkin > > > > wrote: > > > > > > > > > > On Wed, Jun 07, 2023 at 05:08:59PM +0800, Liang Chen wrote: > > > > > > On Tue, May 30, 2023 at 9:19 AM Liang Chen > > > > > > wrote: > > > > > > > > > > > > > > On Mon, May 29, 2023 at 5:55 PM Michael S. Tsirkin > > > > > > > wrote: > > > > > > > > > > > > > > > > On Mon, May 29, 2023 at 03:27:56PM +0800, Liang Chen wrote: > > > > > > > > > On Sun, May 28, 2023 at 2:20 PM Michael S. Tsirkin > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > On Fri, May 26, 2023 at 01:46:18PM +0800, Liang Chen wrote: > > > > > > > > > > > The implementation at the moment uses one page per packet > > > > > > > > > > > in both the > > > > > > > > > > > normal and XDP path. In addition, introducing a module > > > > > > > > > > > parameter to enable > > > > > > > > > > > or disable the usage of page pool (disabled by default). > > > > > > > > > > > > > > > > > > > > > > In single-core vm testing environments, it gives a modest > > > > > > > > > > > performance gain > > > > > > > > > > > in the normal path. > > > > > > > > > > > Upstream codebase: 47.5 Gbits/sec > > > > > > > > > > > Upstream codebase + page_pool support: 50.2 Gbits/sec > > > > > > > > > > > > > > > > > > > > > > In multi-core vm testing environments, The most > > > > > > > > > > > significant performance > > > > > > > > > > > gain is observed in XDP cpumap: > > > > > > > > > > > Upstream codebase: 1.38 Gbits/sec > > > > > > > > > > > Upstream codebase + page_pool support: 9.74 Gbits/sec > > > > > > > > > > > > > > > > > > > > > > With this foundation, we can further integrate page pool > > > > > > > > > > > fragmentation and > > > > > > > > > > > DMA map/unmap support. 
> > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Liang Chen > > > > > > > > > > > > > > > > > > > > Why off by default? > > > > > > > > > > I am guessing it sometimes has performance costs too? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > What happens if we use page pool for big mode too? > > > > > > > > > > The less modes we have the better... > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Sure, now I believe it makes sense to enable it by default. > > > > > > > > > When the > > > > > > > > > packet size is very small, it reduces the likelihood of skb > > > > > > > > > coalescing. But such cases are rare. > > > > > > > > > > > > > > > > small packets are rare? These workloads are easy to create > > > > > > > > actually. > > > > > > > > Pls try and include benchmark with small packet size. > > > > > > > > > > > > > > > > > > > > > > Sure, Thanks! > > > > > > > > > > > > Before going ahead and posting v2 patch, I would like to hear more > > > > > > advice for the cases of small packets. I have done more performance > > > > > > benchmark with small packets since then. 
Here is a list of iperf > > > > > > output, > > > > > > > > > > > > With PP and PP fragmenting: > > > > > > 256K: [ 5] 505.00-510.00 sec 1.34 GBytes 2.31 Gbits/sec0 > > > > > > 144 KBytes > > > > > > 1K: [ 5] 30.00-35.00 sec 4.63 GBytes 7.95 Gbits/sec0 > > > > > > 223 KBytes > > > > > > 2K: [ 5] 65.00-70.00 sec 8.33 GBytes 14.3 Gbits/sec0 > > > > > > 324 KBytes > > > > > > 4K: [ 5] 30.00-35.00 sec 13.3 GBytes 22.8 Gbits/sec0 > > > > > > 1.08 MBytes > > > > > > 8K: [ 5] 50.00-55.00 sec 18.9 GBytes 32.4 Gbits/sec0 > > > > > > 744 KBytes > > > > > > 16K: [ 5] 25.00-30.00 sec 24.6 GBytes 42.3 Gbits/sec0 > > > > > > 963 KBytes > > > > > > 32K: [ 5] 45.00-50.00 sec 29.8 GBytes 51.2 Gbits/sec0 > > > > > > 1.25 MBytes > > > > > > 64K: [ 5] 35.00-40.00 sec 34.0 GBytes 58.4 Gbits/sec0 > > > > > > 1.70 MBytes > > > > > > 128K: [ 5] 45.00-50.00 sec 36.7 GBytes 63.1 Gbits/sec0 > > > > > > 4.26 MBytes > > > > > > 256K: [ 5] 30.00-35.00 sec 40.0 GBytes 68.8 Gbits/sec0 > > > > > > 3.20 MBytes > > > > > > > > Note that virtio-net driver is lacking things like BQL and others, so > > > > it might suffer from buffer bloat for TCP performance. Would you mind > > > > to measure with e.g using testpmd on the vhost to see the rx PPS? > > > > > > > > > > No problem. Before we proceed to measure with testpmd, could you > > > please take a look at the PPS measurements we obtained previously and > > > see if they are sufficient? Though we will only utilize page pool for > > > xdp on v2. > > > > > > netperf -H 192.168.124.197 -p -t UDP_STREAM -l 0 -- -m $((1)) > > > > > > with page pool: > > > 1. > > > Average:IFACE rxpck/s txpck/srxkB/stxkB/s > > > rxcmp/s txcmp/s
Re: [PATCH] vdpa: reject F_ENABLE_AFTER_DRIVER_OK if backend does not support it
On 7/5/23 11:27 AM, Eugenio Perez Martin wrote: > On Wed, Jul 5, 2023 at 9:50 AM Jason Wang wrote: On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin wrote: On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote: On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin wrote: On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote: On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin wrote: On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote: With the current code it is accepted as long as userland send it. Although userland should not set a feature flag that has not been offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not complain for it. Since there is no specific reason for any parent to reject that backend feature bit when it has been proposed, let's control it at vdpa frontend level. Future patches may move this control to the parent driver. Fixes: 967800d2d52e ("vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature") Signed-off-by: Eugenio Pérez Please do send v3. And again, I don't want to send "after driver ok" hack upstream at all, I merged it in next just to give it some testing. We want RING_ACCESS_AFTER_KICK or some such. Current devices do not support that semantic. Which devices specifically access the ring after DRIVER_OK but before a kick? The PDS vdpa device can deal with a call to .set_vq_ready after DRIVER_OK is set. And I'm told that our VQ activity should start without a kick. Our vdpa device FW doesn't currently have support for VIRTIO_F_RING_RESET, but I believe it could be added without too much trouble. sln Previous versions of the QEMU LM series did a spurious kick to start traffic at the LM destination [1]. When it was proposed, that spurious kick was removed from the series because to check for descriptors after driver_ok, even without a kick, was considered work of the parent driver. 
I'm ok to go back to this spurious kick, but I'm not sure if the hw will read the ring before the kick actually. I can ask. Thanks! [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no? My understanding is [1] assuming ACCESS_AFTER_KICK. This seems sub-optimal than assuming ENABLE_AFTER_DRIVER_OK. But this reminds me one thing, as the thread is going too long, I wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is supported? The problem with that is that the device needs to support all RING_RESET, like to be able to change vq address etc after DRIVER_OK. Not all HW support it. We just need the subset of having the dataplane freezed until all CVQ commands have been consumed. I'm sure current vDPA code already supports it in some devices, like MLX and PSD. Thanks! Thanks My plan was to convert it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I was not explicit enough. The only solution I can see to that is to trap & emulate in the vdpa (parent?) driver, as talked in virtio-comment. But that complicates the architecture: * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK * Store vq enable state separately, at vdpa->config->set_vq_ready(true), but not transmit that enable to hw * Store the doorbell state separately, but do not configure it to the device directly. But how to recover if the device cannot configure them at kick time, for example? Maybe we can just fail if the parent driver does not support enabling the vq after DRIVER_OK? That way no new feature flag is needed. Thanks! --- Sent with Fixes: tag pointing to git.kernel.org/pub/scm/linux/kernel/git/mst commit. Please let me know if I should send a v3 of [1] instead. 
[1] https://lore.kernel.org/lkml/20230609121244-mutt-send-email-...@kernel.org/T/ --- drivers/vhost/vdpa.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index e1abf29fed5b..a7e554352351 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep, { struct vhost_vdpa *v = filep->private_data; struct vhost_dev *d = >vdev; + const struct vdpa_config_ops *ops = v->vdpa->config; void __user *argp = (void __user *)arg; u64 __user *featurep = argp; - u64 features; + u64 features, parent_features = 0; long r = 0; if (cmd == VHOST_SET_BACKEND_FEATURES) { if (copy_from_user(, featurep, sizeof(features))) return -EFAULT; + if (ops->get_backend_features) + parent_features = ops->get_backend_features(v->vdpa); if (features & ~(VHOST_VDPA_BACKEND_FEATURES | BIT_ULL(VHOST_BACKEND_F_SUSPEND) | BIT_ULL(VHOST_BACKEND_F_RESUME) | -
[PATCH] vduse: Use proper spinlock for IRQ injection
The IRQ injection work used spin_lock_irq() to protect the scheduling of the softirq, but spin_lock_bh() should be used. With spin_lock_irq(), we noticed delay of more than 6 seconds between the time a NAPI polling work is scheduled and the time it is executed. Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Cc: xieyon...@bytedance.com Suggested-by: Jason Wang Signed-off-by: Maxime Coquelin --- drivers/vdpa/vdpa_user/vduse_dev.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c index dc38ed21319d..df7869537ef1 100644 --- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -935,10 +935,10 @@ static void vduse_dev_irq_inject(struct work_struct *work) { struct vduse_dev *dev = container_of(work, struct vduse_dev, inject); - spin_lock_irq(>irq_lock); + spin_lock_bh(>irq_lock); if (dev->config_cb.callback) dev->config_cb.callback(dev->config_cb.private); - spin_unlock_irq(>irq_lock); + spin_unlock_bh(>irq_lock); } static void vduse_vq_irq_inject(struct work_struct *work) { struct vduse_virtqueue *vq = container_of(work, struct vduse_virtqueue, inject); - spin_lock_irq(>irq_lock); + spin_lock_bh(>irq_lock); if (vq->ready && vq->cb.callback) vq->cb.callback(vq->cb.private); - spin_unlock_irq(>irq_lock); + spin_unlock_bh(>irq_lock); } static bool vduse_vq_signal_irqfd(struct vduse_virtqueue *vq) -- 2.41.0
[PATCH v3 3/3] vduse: Temporarily disable control queue features
Virtio-net driver control queue implementation is not safe when used with VDUSE. If the VDUSE application does not reply to control queue messages, it currently ends up hanging the kernel thread sending this command. Some work is on-going to make the control queue implementation robust with VDUSE. Until it is completed, let's filter out control virtqueue and features that depend on it by keeping only features known to be supported. Signed-off-by: Maxime Coquelin --- drivers/vdpa/vdpa_user/vduse_dev.c | 36 ++ 1 file changed, 36 insertions(+) diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c index 1271c9796517..7345071db0a8 100644 --- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -46,6 +46,30 @@ #define IRQ_UNBOUND -1 +#define VDUSE_NET_VALID_FEATURES_MASK \ + (BIT_ULL(VIRTIO_NET_F_CSUM) | \ +BIT_ULL(VIRTIO_NET_F_GUEST_CSUM) | \ +BIT_ULL(VIRTIO_NET_F_MTU) |\ +BIT_ULL(VIRTIO_NET_F_MAC) |\ +BIT_ULL(VIRTIO_NET_F_GUEST_TSO4) | \ +BIT_ULL(VIRTIO_NET_F_GUEST_TSO6) | \ +BIT_ULL(VIRTIO_NET_F_GUEST_ECN) | \ +BIT_ULL(VIRTIO_NET_F_GUEST_UFO) | \ +BIT_ULL(VIRTIO_NET_F_HOST_TSO4) | \ +BIT_ULL(VIRTIO_NET_F_HOST_TSO6) | \ +BIT_ULL(VIRTIO_NET_F_HOST_ECN) | \ +BIT_ULL(VIRTIO_NET_F_HOST_UFO) | \ +BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) | \ +BIT_ULL(VIRTIO_NET_F_STATUS) | \ +BIT_ULL(VIRTIO_NET_F_HOST_USO) | \ +BIT_ULL(VIRTIO_F_ANY_LAYOUT) | \ +BIT_ULL(VIRTIO_RING_F_INDIRECT_DESC) | \ +BIT_ULL(VIRTIO_F_EVENT_IDX) | \ +BIT_ULL(VIRTIO_F_VERSION_1) | \ +BIT_ULL(VIRTIO_F_IOMMU_PLATFORM) | \ +BIT_ULL(VIRTIO_F_RING_PACKED) |\ +BIT_ULL(VIRTIO_F_IN_ORDER)) + struct vduse_virtqueue { u16 index; u16 num_max; @@ -1778,6 +1802,16 @@ static struct attribute *vduse_dev_attrs[] = { ATTRIBUTE_GROUPS(vduse_dev); +static void vduse_dev_features_filter(struct vduse_dev_config *config) +{ + /* +* Temporarily filter out virtio-net's control virtqueue and features +* that depend on it while CVQ is being made more robust for VDUSE. 
+*/ + if (config->device_id == VIRTIO_ID_NET) + config->features &= VDUSE_NET_VALID_FEATURES_MASK; +} + static int vduse_create_dev(struct vduse_dev_config *config, void *config_buf, u64 api_version) { @@ -1793,6 +1827,8 @@ static int vduse_create_dev(struct vduse_dev_config *config, if (!dev) goto err; + vduse_dev_features_filter(config); + dev->api_version = api_version; dev->device_features = config->features; dev->device_id = config->device_id; -- 2.41.0
[PATCH v3 1/3] vduse: validate block features only with block devices
This patch is preliminary work to enable network device type support to VDUSE. As VIRTIO_BLK_F_CONFIG_WCE shares the same value as VIRTIO_NET_F_HOST_TSO4, we need to restrict its check to Virtio-blk device type. Acked-by: Jason Wang Reviewed-by: Xie Yongji Signed-off-by: Maxime Coquelin --- drivers/vdpa/vdpa_user/vduse_dev.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c index dc38ed21319d..ff9fdd6783fe 100644 --- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -1662,13 +1662,14 @@ static bool device_is_allowed(u32 device_id) return false; } -static bool features_is_valid(u64 features) +static bool features_is_valid(struct vduse_dev_config *config) { - if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM))) + if (!(config->features & (1ULL << VIRTIO_F_ACCESS_PLATFORM))) return false; /* Now we only support read-only configuration space */ - if (features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)) + if ((config->device_id == VIRTIO_ID_BLOCK) && + (config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE))) return false; return true; } @@ -1695,7 +1696,7 @@ static bool vduse_validate_config(struct vduse_dev_config *config) if (!device_is_allowed(config->device_id)) return false; - if (!features_is_valid(config->features)) + if (!features_is_valid(config)) return false; return true; -- 2.41.0
[PATCH v3 0/3] vduse: add support for networking devices
This small series enables virtio-net device type in VDUSE. With it, basic operations have been tested, both with virtio-vdpa and vhost-vdpa, using the DPDK Vhost library series adding VDUSE support with split rings layout (merged in DPDK v23.07-rc1). Control queue support (and so multiqueue) has also been tested, but requires a kernel series from Jason Wang relaxing control queue polling [1] to function reliably, so until Jason's rework is completed, a patch is added to disable CVQ and features that depend on it (tested also with DPDK v23.07-rc1). [1]: https://lore.kernel.org/lkml/cacgkmetgrxn3ppwsdo4oosnssljfembez0wvjgrr3whu+qa...@mail.gmail.com/T/ v2 -> v3 changes: = - Use allow list instead of deny list (Michael) v1 -> v2 changes: = - Add a patch to disable CVQ (Michael) RFC -> v1 changes: == - Fail device init if it does not support VERSION_1 (Jason) Maxime Coquelin (3): vduse: validate block features only with block devices vduse: enable Virtio-net device type vduse: Temporarily disable control queue features drivers/vdpa/vdpa_user/vduse_dev.c | 51 +++--- 1 file changed, 47 insertions(+), 4 deletions(-) -- 2.41.0
[PATCH v3 2/3] vduse: enable Virtio-net device type
This patch adds Virtio-net device type to the supported devices types. Initialization fails if the device does not support VIRTIO_F_VERSION_1 feature, in order to guarantee the configuration space is read-only. Acked-by: Jason Wang Reviewed-by: Xie Yongji Signed-off-by: Maxime Coquelin --- drivers/vdpa/vdpa_user/vduse_dev.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c index ff9fdd6783fe..1271c9796517 100644 --- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -142,6 +142,7 @@ static struct workqueue_struct *vduse_irq_bound_wq; static u32 allowed_device_id[] = { VIRTIO_ID_BLOCK, + VIRTIO_ID_NET, }; static inline struct vduse_dev *vdpa_to_vduse(struct vdpa_device *vdpa) @@ -1672,6 +1673,10 @@ static bool features_is_valid(struct vduse_dev_config *config) (config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE))) return false; + if ((config->device_id == VIRTIO_ID_NET) && + !(config->features & (1ULL << VIRTIO_F_VERSION_1))) + return false; + return true; } @@ -2027,6 +2032,7 @@ static const struct vdpa_mgmtdev_ops vdpa_dev_mgmtdev_ops = { static struct virtio_device_id id_table[] = { { VIRTIO_ID_BLOCK, VIRTIO_DEV_ANY_ID }, + { VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID }, { 0 }, }; -- 2.41.0 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] vdpa: reject F_ENABLE_AFTER_DRIVER_OK if backend does not support it
On Wed, Jul 05, 2023 at 03:55:23PM +0800, Jason Wang wrote: > On Tue, Jul 4, 2023 at 6:38 PM Michael S. Tsirkin wrote: > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote: > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin wrote: > > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote: > > > > > With the current code it is accepted as long as userland send it. > > > > > > > > > > Although userland should not set a feature flag that has not been > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will > > > > > not > > > > > complain for it. > > > > > > > > > > Since there is no specific reason for any parent to reject that > > > > > backend > > > > > feature bit when it has been proposed, let's control it at vdpa > > > > > frontend > > > > > level. Future patches may move this control to the parent driver. > > > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature") > > > > > Signed-off-by: Eugenio Pérez > > > > > > > > Please do send v3. And again, I don't want to send "after driver ok" > > > > hack > > > > upstream at all, I merged it in next just to give it some testing. > > > > We want RING_ACCESS_AFTER_KICK or some such. > > > > > > > > > > Current devices do not support that semantic. > > > > Which devices specifically access the ring after DRIVER_OK but before > > a kick? > > Vhost-net is one example at last. It polls a socket as well, so it > starts to access the ring immediately after DRIVER_OK. > > Thanks For sure but that is not vdpa. > > > > > My plan was to convert > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I > > > was not explicit enough. > > > > > > The only solution I can see to that is to trap & emulate in the vdpa > > > (parent?) driver, as talked in virtio-comment. 
But that complicates > > > the architecture: > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK > > > * Store vq enable state separately, at > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw > > > * Store the doorbell state separately, but do not configure it to the > > > device directly. > > > > > > But how to recover if the device cannot configure them at kick time, > > > for example? > > > > > > Maybe we can just fail if the parent driver does not support enabling > > > the vq after DRIVER_OK? That way no new feature flag is needed. > > > > > > Thanks! > > > > > > > > > > > > --- > > > > > Sent with Fixes: tag pointing to > > > > > git.kernel.org/pub/scm/linux/kernel/git/mst > > > > > commit. Please let me know if I should send a v3 of [1] instead. > > > > > > > > > > [1] > > > > > https://lore.kernel.org/lkml/20230609121244-mutt-send-email-...@kernel.org/T/ > > > > > --- > > > > > drivers/vhost/vdpa.c | 7 +-- > > > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c > > > > > index e1abf29fed5b..a7e554352351 100644 > > > > > --- a/drivers/vhost/vdpa.c > > > > > +++ b/drivers/vhost/vdpa.c > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct > > > > > file *filep, > > > > > { > > > > > struct vhost_vdpa *v = filep->private_data; > > > > > struct vhost_dev *d = >vdev; > > > > > + const struct vdpa_config_ops *ops = v->vdpa->config; > > > > > void __user *argp = (void __user *)arg; > > > > > u64 __user *featurep = argp; > > > > > - u64 features; > > > > > + u64 features, parent_features = 0; > > > > > long r = 0; > > > > > > > > > > if (cmd == VHOST_SET_BACKEND_FEATURES) { > > > > > if (copy_from_user(, featurep, > > > > > sizeof(features))) > > > > > return -EFAULT; > > > > > + if (ops->get_backend_features) > > > > > + parent_features = > > > > > ops->get_backend_features(v->vdpa); > > > > > if (features & ~(VHOST_VDPA_BACKEND_FEATURES | 
> > > > >BIT_ULL(VHOST_BACKEND_F_SUSPEND) | > > > > >BIT_ULL(VHOST_BACKEND_F_RESUME) | > > > > > - > > > > > BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK))) > > > > > + parent_features)) > > > > > return -EOPNOTSUPP; > > > > > if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && > > > > >!vhost_vdpa_can_suspend(v)) > > > > > -- > > > > > 2.39.3 > > > > > > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] vdpa: reject F_ENABLE_AFTER_DRIVER_OK if backend does not support it
On Wed, Jul 05, 2023 at 03:49:58PM +0800, Jason Wang wrote: > On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin wrote: > > > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote: > > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin > > > wrote: > > > > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote: > > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin > > > > > wrote: > > > > > > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote: > > > > > > > With the current code it is accepted as long as userland send it. > > > > > > > > > > > > > > Although userland should not set a feature flag that has not been > > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code > > > > > > > will not > > > > > > > complain for it. > > > > > > > > > > > > > > Since there is no specific reason for any parent to reject that > > > > > > > backend > > > > > > > feature bit when it has been proposed, let's control it at vdpa > > > > > > > frontend > > > > > > > level. Future patches may move this control to the parent driver. > > > > > > > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept > > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature") > > > > > > > Signed-off-by: Eugenio Pérez > > > > > > > > > > > > Please do send v3. And again, I don't want to send "after driver > > > > > > ok" hack > > > > > > upstream at all, I merged it in next just to give it some testing. > > > > > > We want RING_ACCESS_AFTER_KICK or some such. > > > > > > > > > > > > > > > > Current devices do not support that semantic. > > > > > > > > Which devices specifically access the ring after DRIVER_OK but before > > > > a kick? > > > > > > > > > > Previous versions of the QEMU LM series did a spurious kick to start > > > traffic at the LM destination [1]. 
When it was proposed, that spurious > > > kick was removed from the series because to check for descriptors > > > after driver_ok, even without a kick, was considered work of the > > > parent driver. > > > > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw > > > will read the ring before the kick actually. I can ask. > > > > > > Thanks! > > > > > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html > > > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no? > > My understanding is [1] assuming ACCESS_AFTER_KICK. This seems > sub-optimal than assuming ENABLE_AFTER_DRIVER_OK. > > But this reminds me one thing, as the thread is going too long, I > wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is > supported? > > Thanks I don't see what does one have to do with another ... I think with RING_RESET we had another solution, enable rings mapping them to a zero page, then reset and re-enable later. > > > > > > > > > > > My plan was to convert > > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I > > > > > was not explicit enough. > > > > > > > > > > The only solution I can see to that is to trap & emulate in the vdpa > > > > > (parent?) driver, as talked in virtio-comment. But that complicates > > > > > the architecture: > > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK > > > > > * Store vq enable state separately, at > > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw > > > > > * Store the doorbell state separately, but do not configure it to the > > > > > device directly. > > > > > > > > > > But how to recover if the device cannot configure them at kick time, > > > > > for example? > > > > > > > > > > Maybe we can just fail if the parent driver does not support enabling > > > > > the vq after DRIVER_OK? That way no new feature flag is needed. > > > > > > > > > > Thanks! 
> > > > > > > > > > > > > > > > > > --- > > > > > > > Sent with Fixes: tag pointing to > > > > > > > git.kernel.org/pub/scm/linux/kernel/git/mst > > > > > > > commit. Please let me know if I should send a v3 of [1] instead. > > > > > > > > > > > > > > [1] > > > > > > > https://lore.kernel.org/lkml/20230609121244-mutt-send-email-...@kernel.org/T/ > > > > > > > --- > > > > > > > drivers/vhost/vdpa.c | 7 +-- > > > > > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c > > > > > > > index e1abf29fed5b..a7e554352351 100644 > > > > > > > --- a/drivers/vhost/vdpa.c > > > > > > > +++ b/drivers/vhost/vdpa.c > > > > > > > @@ -681,18 +681,21 @@ static long > > > > > > > vhost_vdpa_unlocked_ioctl(struct file *filep, > > > > > > > { > > > > > > > struct vhost_vdpa *v = filep->private_data; > > > > > > > struct vhost_dev *d = >vdev; > > > > > > > + const struct vdpa_config_ops *ops = v->vdpa->config; > > > > > > > void __user *argp = (void __user *)arg; > > > > > > > u64 __user *featurep = argp; > > > > > > > - u64 features; > > > > > > > + u64 features,
Re: [PATCH] vdpa: reject F_ENABLE_AFTER_DRIVER_OK if backend does not support it
On Tue, Jul 4, 2023 at 6:38 PM Michael S. Tsirkin wrote: > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote: > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin wrote: > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote: > > > > With the current code it is accepted as long as userland send it. > > > > > > > > Although userland should not set a feature flag that has not been > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code will not > > > > complain for it. > > > > > > > > Since there is no specific reason for any parent to reject that backend > > > > feature bit when it has been proposed, let's control it at vdpa frontend > > > > level. Future patches may move this control to the parent driver. > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature") > > > > Signed-off-by: Eugenio Pérez > > > > > > Please do send v3. And again, I don't want to send "after driver ok" hack > > > upstream at all, I merged it in next just to give it some testing. > > > We want RING_ACCESS_AFTER_KICK or some such. > > > > > > > Current devices do not support that semantic. > > Which devices specifically access the ring after DRIVER_OK but before > a kick? Vhost-net is one example at last. It polls a socket as well, so it starts to access the ring immediately after DRIVER_OK. Thanks > > > My plan was to convert > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I > > was not explicit enough. > > > > The only solution I can see to that is to trap & emulate in the vdpa > > (parent?) driver, as talked in virtio-comment. But that complicates > > the architecture: > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK > > * Store vq enable state separately, at > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw > > * Store the doorbell state separately, but do not configure it to the > > device directly. 
> > > > But how to recover if the device cannot configure them at kick time, > > for example? > > > > Maybe we can just fail if the parent driver does not support enabling > > the vq after DRIVER_OK? That way no new feature flag is needed. > > > > Thanks! > > > > > > > > > --- > > > > Sent with Fixes: tag pointing to > > > > git.kernel.org/pub/scm/linux/kernel/git/mst > > > > commit. Please let me know if I should send a v3 of [1] instead. > > > > > > > > [1] > > > > https://lore.kernel.org/lkml/20230609121244-mutt-send-email-...@kernel.org/T/ > > > > --- > > > > drivers/vhost/vdpa.c | 7 +-- > > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c > > > > index e1abf29fed5b..a7e554352351 100644 > > > > --- a/drivers/vhost/vdpa.c > > > > +++ b/drivers/vhost/vdpa.c > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct file > > > > *filep, > > > > { > > > > struct vhost_vdpa *v = filep->private_data; > > > > struct vhost_dev *d = >vdev; > > > > + const struct vdpa_config_ops *ops = v->vdpa->config; > > > > void __user *argp = (void __user *)arg; > > > > u64 __user *featurep = argp; > > > > - u64 features; > > > > + u64 features, parent_features = 0; > > > > long r = 0; > > > > > > > > if (cmd == VHOST_SET_BACKEND_FEATURES) { > > > > if (copy_from_user(, featurep, sizeof(features))) > > > > return -EFAULT; > > > > + if (ops->get_backend_features) > > > > + parent_features = > > > > ops->get_backend_features(v->vdpa); > > > > if (features & ~(VHOST_VDPA_BACKEND_FEATURES | > > > >BIT_ULL(VHOST_BACKEND_F_SUSPEND) | > > > >BIT_ULL(VHOST_BACKEND_F_RESUME) | > > > > - > > > > BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK))) > > > > + parent_features)) > > > > return -EOPNOTSUPP; > > > > if ((features & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && > > > >!vhost_vdpa_can_suspend(v)) > > > > -- > > > > 2.39.3 > > > > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org 
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] vdpa: reject F_ENABLE_AFTER_DRIVER_OK if backend does not support it
On Tue, Jul 4, 2023 at 11:45 PM Michael S. Tsirkin wrote: > > On Tue, Jul 04, 2023 at 01:36:11PM +0200, Eugenio Perez Martin wrote: > > On Tue, Jul 4, 2023 at 12:38 PM Michael S. Tsirkin wrote: > > > > > > On Tue, Jul 04, 2023 at 12:25:32PM +0200, Eugenio Perez Martin wrote: > > > > On Mon, Jul 3, 2023 at 4:52 PM Michael S. Tsirkin > > > > wrote: > > > > > > > > > > On Mon, Jul 03, 2023 at 04:22:18PM +0200, Eugenio Pérez wrote: > > > > > > With the current code it is accepted as long as userland send it. > > > > > > > > > > > > Although userland should not set a feature flag that has not been > > > > > > offered to it with VHOST_GET_BACKEND_FEATURES, the current code > > > > > > will not > > > > > > complain for it. > > > > > > > > > > > > Since there is no specific reason for any parent to reject that > > > > > > backend > > > > > > feature bit when it has been proposed, let's control it at vdpa > > > > > > frontend > > > > > > level. Future patches may move this control to the parent driver. > > > > > > > > > > > > Fixes: 967800d2d52e ("vdpa: accept > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature") > > > > > > Signed-off-by: Eugenio Pérez > > > > > > > > > > Please do send v3. And again, I don't want to send "after driver ok" > > > > > hack > > > > > upstream at all, I merged it in next just to give it some testing. > > > > > We want RING_ACCESS_AFTER_KICK or some such. > > > > > > > > > > > > > Current devices do not support that semantic. > > > > > > Which devices specifically access the ring after DRIVER_OK but before > > > a kick? > > > > > > > Previous versions of the QEMU LM series did a spurious kick to start > > traffic at the LM destination [1]. When it was proposed, that spurious > > kick was removed from the series because to check for descriptors > > after driver_ok, even without a kick, was considered work of the > > parent driver. 
> > > > I'm ok to go back to this spurious kick, but I'm not sure if the hw > > will read the ring before the kick actually. I can ask. > > > > Thanks! > > > > [1] https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg02775.html > > Let's find out. We need to check for ENABLE_AFTER_DRIVER_OK too, no? My understanding is [1] assuming ACCESS_AFTER_KICK. This seems sub-optimal than assuming ENABLE_AFTER_DRIVER_OK. But this reminds me one thing, as the thread is going too long, I wonder if we simply assume ENABLE_AFTER_DRIVER_OK if RING_RESET is supported? Thanks > > > > > > > My plan was to convert > > > > it in vp_vdpa if needed, and reuse the current vdpa ops. Sorry if I > > > > was not explicit enough. > > > > > > > > The only solution I can see to that is to trap & emulate in the vdpa > > > > (parent?) driver, as talked in virtio-comment. But that complicates > > > > the architecture: > > > > * Offer VHOST_BACKEND_F_RING_ACCESS_AFTER_KICK > > > > * Store vq enable state separately, at > > > > vdpa->config->set_vq_ready(true), but not transmit that enable to hw > > > > * Store the doorbell state separately, but do not configure it to the > > > > device directly. > > > > > > > > But how to recover if the device cannot configure them at kick time, > > > > for example? > > > > > > > > Maybe we can just fail if the parent driver does not support enabling > > > > the vq after DRIVER_OK? That way no new feature flag is needed. > > > > > > > > Thanks! > > > > > > > > > > > > > > > --- > > > > > > Sent with Fixes: tag pointing to > > > > > > git.kernel.org/pub/scm/linux/kernel/git/mst > > > > > > commit. Please let me know if I should send a v3 of [1] instead. 
> > > > > > > > > > > > [1] > > > > > > https://lore.kernel.org/lkml/20230609121244-mutt-send-email-...@kernel.org/T/ > > > > > > --- > > > > > > drivers/vhost/vdpa.c | 7 +-- > > > > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c > > > > > > index e1abf29fed5b..a7e554352351 100644 > > > > > > --- a/drivers/vhost/vdpa.c > > > > > > +++ b/drivers/vhost/vdpa.c > > > > > > @@ -681,18 +681,21 @@ static long vhost_vdpa_unlocked_ioctl(struct > > > > > > file *filep, > > > > > > { > > > > > > struct vhost_vdpa *v = filep->private_data; > > > > > > struct vhost_dev *d = >vdev; > > > > > > + const struct vdpa_config_ops *ops = v->vdpa->config; > > > > > > void __user *argp = (void __user *)arg; > > > > > > u64 __user *featurep = argp; > > > > > > - u64 features; > > > > > > + u64 features, parent_features = 0; > > > > > > long r = 0; > > > > > > > > > > > > if (cmd == VHOST_SET_BACKEND_FEATURES) { > > > > > > if (copy_from_user(, featurep, > > > > > > sizeof(features))) > > > > > > return -EFAULT; > > > > > > + if (ops->get_backend_features) > > > > > > + parent_features = > > > > > > ops->get_backend_features(v->vdpa); > > > > > > if (features &
Re: [PATCH] mlx5_vdpa: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK
On Wed, Jul 5, 2023 at 2:16 PM Michael S. Tsirkin wrote: > > On Wed, Jul 05, 2023 at 01:47:44PM +0800, Jason Wang wrote: > > On Wed, Jul 5, 2023 at 1:31 PM Michael S. Tsirkin wrote: > > > > > > On Wed, Jul 05, 2023 at 01:11:37PM +0800, Jason Wang wrote: > > > > On Tue, Jul 4, 2023 at 6:16 PM Michael S. Tsirkin > > > > wrote: > > > > > > > > > > On Mon, Jul 03, 2023 at 05:26:02PM -0700, Si-Wei Liu wrote: > > > > > > > > > > > > > > > > > > On 7/3/2023 8:46 AM, Michael S. Tsirkin wrote: > > > > > > > On Mon, Jul 03, 2023 at 04:25:14PM +0200, Eugenio Pérez wrote: > > > > > > > > Offer this backend feature as mlx5 is compatible with it. It > > > > > > > > allows it > > > > > > > > to do live migration with CVQ, dynamically switching between > > > > > > > > passthrough > > > > > > > > and shadow virtqueue. > > > > > > > > > > > > > > > > Signed-off-by: Eugenio Pérez > > > > > > > Same comment. > > > > > > to which? > > > > > > > > > > > > -Siwei > > > > > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is too narrow a use-case to > > > > > commit to it > > > > > as a kernel/userspace ABI: what if one wants to start rings in some > > > > > other specific order? > > > > > > > > Just enable a queue by writing e.g 1 to queue_enable in a specific > > > > order? > > > > > > > > > But then at driver ok time we don't know how many queues are there. > > > > There should be a device specific interface for this, for example, > > num_queue_paris. So the device knows at most how many queues there > > are. Or anything I miss? > > That's a device limitations. Does not tell the device how much is used. I think I miss something, how kick differs from queue_enable in this way? > > > > > > > > > As was discussed on list, a better promise is not to access ring > > > > > until the 1st kick. vdpa can then do a kick when it wants > > > > > the device to start accessing rings. 
> > > > > > > > Rethink about the ACCESS_AFTER_KICK, it sounds functional equivalent > > > > to allow queue_enable after DRIVER_OK, but it seems to have > > > > distanvages: > > > > > > > > A busy polling software device may disable notifications and never > > > > respond to register to any kick notifiers. ACCESS_AFTER_KICK will > > > > introduce overhead to those implementations. > > > > > > > > Thanks > > > > > > It's just the 1st kick, then you can disable. No? > > > > Yes, but: > > > > 1) adding hooks for queue_enable > > 2) adding new codes to register event notifier and toggle the notifier > > > > 1) seems much easier? Or for most devices, it already behaves like this. > > > > Thanks > > Well libvhostuser checks enabled queues at DRIVER_OK, does it not? Probably, but I meant: 1) This behaviour has been supported by some device (e.g MLX) 2) This is the current behaviour of Qemu for vhost-net devices: static void virtio_net_queue_enable(VirtIODevice *vdev, uint32_t queue_index) { VirtIONet *n = VIRTIO_NET(vdev); NetClientState *nc; int r; if (get_vhost_net(nc->peer) && nc->peer->info->type == NET_CLIENT_DRIVER_TAP) { r = vhost_net_virtqueue_restart(vdev, nc, queue_index); if (r < 0) { error_report("unable to restart vhost net virtqueue: %d, " "when resetting the queue", queue_index); } } } Thanks > > > > > > > > > > > > > > > > > > > > > > > > --- > > > > > > > > drivers/vdpa/mlx5/net/mlx5_vnet.c | 7 +++ > > > > > > > > 1 file changed, 7 insertions(+) > > > > > > > > > > > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > > > > > > b/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > > > > > > index 9138ef2fb2c8..5f309a16b9dc 100644 > > > > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > > > > > > @@ -7,6 +7,7 @@ > > > > > > > > #include > > > > > > > > #include > > > > > > > > #include > > > > > > > > +#include > > > > > > > > #include > > > > > > > > #include > > > > > > > > #include > > > > > > 
> > @@ -2499,6 +2500,11 @@ static void > > > > > > > > unregister_link_notifier(struct mlx5_vdpa_net *ndev) > > > > > > > > flush_workqueue(ndev->mvdev.wq); > > > > > > > > } > > > > > > > > +static u64 mlx5_vdpa_get_backend_features(const struct > > > > > > > > vdpa_device *vdpa) > > > > > > > > +{ > > > > > > > > + return BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK); > > > > > > > > +} > > > > > > > > + > > > > > > > > static int mlx5_vdpa_set_driver_features(struct vdpa_device > > > > > > > > *vdev, u64 features) > > > > > > > > { > > > > > > > > struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev); > > > > > > > > @@ -3140,6 +3146,7 @@ static const struct vdpa_config_ops > > > > > > > > mlx5_vdpa_ops = { > > > > > > > > .get_vq_align = mlx5_vdpa_get_vq_align, > > > > > > > > .get_vq_group = mlx5_vdpa_get_vq_group, > > > > > > > > .get_device_features = mlx5_vdpa_get_device_features, > > > >
Re: [PATCH] mlx5_vdpa: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK
On Wed, Jul 05, 2023 at 01:47:44PM +0800, Jason Wang wrote: > On Wed, Jul 5, 2023 at 1:31 PM Michael S. Tsirkin wrote: > > > > On Wed, Jul 05, 2023 at 01:11:37PM +0800, Jason Wang wrote: > > > On Tue, Jul 4, 2023 at 6:16 PM Michael S. Tsirkin wrote: > > > > > > > > On Mon, Jul 03, 2023 at 05:26:02PM -0700, Si-Wei Liu wrote: > > > > > > > > > > > > > > > On 7/3/2023 8:46 AM, Michael S. Tsirkin wrote: > > > > > > On Mon, Jul 03, 2023 at 04:25:14PM +0200, Eugenio Pérez wrote: > > > > > > > Offer this backend feature as mlx5 is compatible with it. It > > > > > > > allows it > > > > > > > to do live migration with CVQ, dynamically switching between > > > > > > > passthrough > > > > > > > and shadow virtqueue. > > > > > > > > > > > > > > Signed-off-by: Eugenio Pérez > > > > > > Same comment. > > > > > to which? > > > > > > > > > > -Siwei > > > > > > > > VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is too narrow a use-case to > > > > commit to it > > > > as a kernel/userspace ABI: what if one wants to start rings in some > > > > other specific order? > > > > > > Just enable a queue by writing e.g 1 to queue_enable in a specific order? > > > > > > But then at driver ok time we don't know how many queues are there. > > There should be a device specific interface for this, for example, > num_queue_paris. So the device knows at most how many queues there > are. Or anything I miss? That's a device limitations. Does not tell the device how much is used. > > > > > > As was discussed on list, a better promise is not to access ring > > > > until the 1st kick. vdpa can then do a kick when it wants > > > > the device to start accessing rings. > > > > > > Rethink about the ACCESS_AFTER_KICK, it sounds functional equivalent > > > to allow queue_enable after DRIVER_OK, but it seems to have > > > distanvages: > > > > > > A busy polling software device may disable notifications and never > > > respond to register to any kick notifiers. 
ACCESS_AFTER_KICK will > > > introduce overhead to those implementations. > > > > > > Thanks > > > > It's just the 1st kick, then you can disable. No? > > Yes, but: > > 1) adding hooks for queue_enable > 2) adding new codes to register event notifier and toggle the notifier > > 1) seems much easier? Or for most devices, it already behaves like this. > > Thanks Well libvhostuser checks enabled queues at DRIVER_OK, does it not? > > > > > > > > > > > > > > > > > > > --- > > > > > > > drivers/vdpa/mlx5/net/mlx5_vnet.c | 7 +++ > > > > > > > 1 file changed, 7 insertions(+) > > > > > > > > > > > > > > diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > > > > > b/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > > > > > index 9138ef2fb2c8..5f309a16b9dc 100644 > > > > > > > --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > > > > > +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c > > > > > > > @@ -7,6 +7,7 @@ > > > > > > > #include > > > > > > > #include > > > > > > > #include > > > > > > > +#include > > > > > > > #include > > > > > > > #include > > > > > > > #include > > > > > > > @@ -2499,6 +2500,11 @@ static void > > > > > > > unregister_link_notifier(struct mlx5_vdpa_net *ndev) > > > > > > > flush_workqueue(ndev->mvdev.wq); > > > > > > > } > > > > > > > +static u64 mlx5_vdpa_get_backend_features(const struct > > > > > > > vdpa_device *vdpa) > > > > > > > +{ > > > > > > > + return BIT_ULL(VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK); > > > > > > > +} > > > > > > > + > > > > > > > static int mlx5_vdpa_set_driver_features(struct vdpa_device > > > > > > > *vdev, u64 features) > > > > > > > { > > > > > > > struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev); > > > > > > > @@ -3140,6 +3146,7 @@ static const struct vdpa_config_ops > > > > > > > mlx5_vdpa_ops = { > > > > > > > .get_vq_align = mlx5_vdpa_get_vq_align, > > > > > > > .get_vq_group = mlx5_vdpa_get_vq_group, > > > > > > > .get_device_features = mlx5_vdpa_get_device_features, > > > > > > > + .get_backend_features = mlx5_vdpa_get_backend_features, > > > 
> > > > .set_driver_features = mlx5_vdpa_set_driver_features, > > > > > > > .get_driver_features = mlx5_vdpa_get_driver_features, > > > > > > > .set_config_cb = mlx5_vdpa_set_config_cb, > > > > > > > -- > > > > > > > 2.39.3 > > > > > > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net-next 2/5] virtio_net: Add page_pool support to improve performance
On Wed, Jul 5, 2023 at 1:41 PM Liang Chen wrote: > > On Fri, Jun 9, 2023 at 10:57 AM Liang Chen wrote: > > > > On Thu, Jun 8, 2023 at 8:38 AM Jason Wang wrote: > > > > > > On Thu, Jun 8, 2023 at 4:17 AM Michael S. Tsirkin wrote: > > > > > > > > On Wed, Jun 07, 2023 at 05:08:59PM +0800, Liang Chen wrote: > > > > > On Tue, May 30, 2023 at 9:19 AM Liang Chen > > > > > wrote: > > > > > > > > > > > > On Mon, May 29, 2023 at 5:55 PM Michael S. Tsirkin > > > > > > wrote: > > > > > > > > > > > > > > On Mon, May 29, 2023 at 03:27:56PM +0800, Liang Chen wrote: > > > > > > > > On Sun, May 28, 2023 at 2:20 PM Michael S. Tsirkin > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > On Fri, May 26, 2023 at 01:46:18PM +0800, Liang Chen wrote: > > > > > > > > > > The implementation at the moment uses one page per packet > > > > > > > > > > in both the > > > > > > > > > > normal and XDP path. In addition, introducing a module > > > > > > > > > > parameter to enable > > > > > > > > > > or disable the usage of page pool (disabled by default). > > > > > > > > > > > > > > > > > > > > In single-core vm testing environments, it gives a modest > > > > > > > > > > performance gain > > > > > > > > > > in the normal path. > > > > > > > > > > Upstream codebase: 47.5 Gbits/sec > > > > > > > > > > Upstream codebase + page_pool support: 50.2 Gbits/sec > > > > > > > > > > > > > > > > > > > > In multi-core vm testing environments, The most significant > > > > > > > > > > performance > > > > > > > > > > gain is observed in XDP cpumap: > > > > > > > > > > Upstream codebase: 1.38 Gbits/sec > > > > > > > > > > Upstream codebase + page_pool support: 9.74 Gbits/sec > > > > > > > > > > > > > > > > > > > > With this foundation, we can further integrate page pool > > > > > > > > > > fragmentation and > > > > > > > > > > DMA map/unmap support. > > > > > > > > > > > > > > > > > > > > Signed-off-by: Liang Chen > > > > > > > > > > > > > > > > > > Why off by default? 
> > > > > > > > > I am guessing it sometimes has performance costs too? > > > > > > > > > > > > > > > > > > > > > > > > > > > What happens if we use page pool for big mode too? > > > > > > > > > The less modes we have the better... > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Sure, now I believe it makes sense to enable it by default. > > > > > > > > When the > > > > > > > > packet size is very small, it reduces the likelihood of skb > > > > > > > > coalescing. But such cases are rare. > > > > > > > > > > > > > > small packets are rare? These workloads are easy to create > > > > > > > actually. > > > > > > > Pls try and include benchmark with small packet size. > > > > > > > > > > > > > > > > > > > Sure, Thanks! > > > > > > > > > > Before going ahead and posting v2 patch, I would like to hear more > > > > > advice for the cases of small packets. I have done more performance > > > > > benchmark with small packets since then. Here is a list of iperf > > > > > output, > > > > > > > > > > With PP and PP fragmenting: > > > > > 256K: [ 5] 505.00-510.00 sec 1.34 GBytes 2.31 Gbits/sec0 > > > > > 144 KBytes > > > > > 1K: [ 5] 30.00-35.00 sec 4.63 GBytes 7.95 Gbits/sec0 > > > > > 223 KBytes > > > > > 2K: [ 5] 65.00-70.00 sec 8.33 GBytes 14.3 Gbits/sec0 > > > > > 324 KBytes > > > > > 4K: [ 5] 30.00-35.00 sec 13.3 GBytes 22.8 Gbits/sec0 > > > > > 1.08 MBytes > > > > > 8K: [ 5] 50.00-55.00 sec 18.9 GBytes 32.4 Gbits/sec0 > > > > > 744 KBytes > > > > > 16K: [ 5] 25.00-30.00 sec 24.6 GBytes 42.3 Gbits/sec0 > > > > > 963 KBytes > > > > > 32K: [ 5] 45.00-50.00 sec 29.8 GBytes 51.2 Gbits/sec0 > > > > > 1.25 MBytes > > > > > 64K: [ 5] 35.00-40.00 sec 34.0 GBytes 58.4 Gbits/sec0 > > > > > 1.70 MBytes > > > > > 128K: [ 5] 45.00-50.00 sec 36.7 GBytes 63.1 Gbits/sec0 > > > > > 4.26 MBytes > > > > > 256K: [ 5] 30.00-35.00 sec 40.0 GBytes 68.8 Gbits/sec0 > > > > > 3.20 MBytes > > > > > > Note that virtio-net driver is lacking things like BQL and others, so > > > it 
might suffer from buffer bloat for TCP performance. Would you mind > > > to measure with e.g using testpmd on the vhost to see the rx PPS? > > > > > > > No problem. Before we proceed to measure with testpmd, could you > > please take a look at the PPS measurements we obtained previously and > > see if they are sufficient? Though we will only utilize page pool for > > xdp on v2. > > > > netperf -H 192.168.124.197 -p -t UDP_STREAM -l 0 -- -m $((1)) > > > > with page pool: > > 1. > > Average:IFACE rxpck/s txpck/srxkB/stxkB/s > > rxcmp/s txcmp/s rxmcst/s %ifutil > > Average: enp8s0 655092.27 0.35 27508.77 0.03 > > 0.00 0.00 0.00 0.00 > > 2. > > Average:IFACE rxpck/s txpck/srxkB/stxkB/s > > rxcmp/s txcmp/s rxmcst/s %ifutil > > Average: enp8s0 654749.87 0.63