Re: [PATCH v4 4/4] vduse: Add LSM hooks to check Virtio device type

2023-11-03 Thread Maxime Coquelin




On 11/2/23 19:59, Michael S. Tsirkin wrote:

On Thu, Nov 02, 2023 at 06:56:59PM +0100, Maxime Coquelin wrote:



On 10/24/23 17:30, Casey Schaufler wrote:

On 10/24/2023 2:49 AM, Maxime Coquelin wrote:



On 10/23/23 17:13, Casey Schaufler wrote:

On 10/23/2023 12:28 AM, Maxime Coquelin wrote:



On 10/21/23 00:20, Casey Schaufler wrote:

On 10/20/2023 8:58 AM, Maxime Coquelin wrote:

This patch introduces LSM hooks for devices creation,
destruction and opening operations, checking the
application is allowed to perform these operations for
the Virtio device type.


Why do you think that there needs to be a special LSM check for virtio
devices? What can't existing device attributes be used?


Michael asked for a way for SELinux to allow/prevent the creation of
some types of devices [0].

A device is created using ioctl() on VDUSE control chardev. Its type is
specified via a field in the structure passed in argument.

I didn't see other way than adding dedicated LSM hooks to achieve this,
but it is possible that their is a better way to do it?


At the very least the hook should be made more general, and I'd have to
see a proposal before commenting on that. security_dev_destroy(dev)
might
be a better approach. If there's reason to control destruction of vduse
devices it's reasonable to assume that there are other devices with the
same or similar properties.


VDUSE is different from other devices as the device is actually
implemented by the user-space application, so this is very specific in
my opinion.


This is hardly unique. If you're implementing the device
in user-space you may well be able to implement the desired
controls there.





Since SELinux is your target use case, can you explain why you can't
create SELinux policy to enforce the restrictions you're after? I
believe
(but can be proven wrong, of course) that SELinux has mechanism for
dealing
with controls on ioctls.



I am not aware of such mechanism to deal with ioctl(), if you have a
pointer that would be welcome.


security/selinux/hooks.c


We might be able to extend selinux_file_ioctl(), but that will only
covers the ioctl for the control file, this patch also adds hook for the
device file opening that would need dedicated hook as the device type
information is stored in the device's private data.

Michael, before going further, I would be interested in your feedback.
Was this patch what you had in mind when requesting for a way to
allow/deny devices types for a given application?

Regards,
Maxime



Yes, this is more or less what I had in mind.


Great.

Do you think we need to cover both ioctl() on the control file and
open() on the device file, or only ioctl() is enough?

If the former, we will need VDUSE-specific hooks. I may be able to
improve my patch to have a single hook instead of 3 by passing the type
of operation as an extra argument (create/destroy/open).

If the latter, we may be able to extend the generic ioctl hook.

Personally, I think it would make sense to also ensure a given
application can only open existing VDUSE devices it supports. For
example, openvswitch should only be allowed to open networking VDUSE
devices.

Thanks,
Maxime







Thanks,
Maxime





Thanks,
Maxime

[0]:
https://lore.kernel.org/all/20230829130430-mutt-send-email-...@kernel.org/












___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4 4/4] vduse: Add LSM hooks to check Virtio device type

2023-11-02 Thread Maxime Coquelin




On 10/24/23 17:30, Casey Schaufler wrote:

On 10/24/2023 2:49 AM, Maxime Coquelin wrote:



On 10/23/23 17:13, Casey Schaufler wrote:

On 10/23/2023 12:28 AM, Maxime Coquelin wrote:



On 10/21/23 00:20, Casey Schaufler wrote:

On 10/20/2023 8:58 AM, Maxime Coquelin wrote:

This patch introduces LSM hooks for devices creation,
destruction and opening operations, checking the
application is allowed to perform these operations for
the Virtio device type.


Why do you think that there needs to be a special LSM check for virtio
devices? What can't existing device attributes be used?


Michael asked for a way for SELinux to allow/prevent the creation of
some types of devices [0].

A device is created using ioctl() on VDUSE control chardev. Its type is
specified via a field in the structure passed in argument.

I didn't see other way than adding dedicated LSM hooks to achieve this,
but it is possible that their is a better way to do it?


At the very least the hook should be made more general, and I'd have to
see a proposal before commenting on that. security_dev_destroy(dev)
might
be a better approach. If there's reason to control destruction of vduse
devices it's reasonable to assume that there are other devices with the
same or similar properties.


VDUSE is different from other devices as the device is actually
implemented by the user-space application, so this is very specific in
my opinion.


This is hardly unique. If you're implementing the device
in user-space you may well be able to implement the desired
controls there.





Since SELinux is your target use case, can you explain why you can't
create SELinux policy to enforce the restrictions you're after? I
believe
(but can be proven wrong, of course) that SELinux has mechanism for
dealing
with controls on ioctls.



I am not aware of such mechanism to deal with ioctl(), if you have a
pointer that would be welcome.


security/selinux/hooks.c


We might be able to extend selinux_file_ioctl(), but that will only
covers the ioctl for the control file, this patch also adds hook for the
device file opening that would need dedicated hook as the device type
information is stored in the device's private data.

Michael, before going further, I would be interested in your feedback.
Was this patch what you had in mind when requesting for a way to
allow/deny devices types for a given application?

Regards,
Maxime





Thanks,
Maxime





Thanks,
Maxime

[0]:
https://lore.kernel.org/all/20230829130430-mutt-send-email-...@kernel.org/










___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4 4/4] vduse: Add LSM hooks to check Virtio device type

2023-10-24 Thread Maxime Coquelin




On 10/23/23 17:13, Casey Schaufler wrote:

On 10/23/2023 12:28 AM, Maxime Coquelin wrote:



On 10/21/23 00:20, Casey Schaufler wrote:

On 10/20/2023 8:58 AM, Maxime Coquelin wrote:

This patch introduces LSM hooks for devices creation,
destruction and opening operations, checking the
application is allowed to perform these operations for
the Virtio device type.


Why do you think that there needs to be a special LSM check for virtio
devices? What can't existing device attributes be used?


Michael asked for a way for SELinux to allow/prevent the creation of
some types of devices [0].

A device is created using ioctl() on VDUSE control chardev. Its type is
specified via a field in the structure passed in argument.

I didn't see other way than adding dedicated LSM hooks to achieve this,
but it is possible that their is a better way to do it?


At the very least the hook should be made more general, and I'd have to
see a proposal before commenting on that. security_dev_destroy(dev) might
be a better approach. If there's reason to control destruction of vduse
devices it's reasonable to assume that there are other devices with the
same or similar properties.


VDUSE is different from other devices as the device is actually
implemented by the user-space application, so this is very specific in
my opinion.



Since SELinux is your target use case, can you explain why you can't
create SELinux policy to enforce the restrictions you're after? I believe
(but can be proven wrong, of course) that SELinux has mechanism for dealing
with controls on ioctls.



I am not aware of such mechanism to deal with ioctl(), if you have a 
pointer that would be welcome.


Thanks,
Maxime





Thanks,
Maxime

[0]:
https://lore.kernel.org/all/20230829130430-mutt-send-email-...@kernel.org/





___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4 3/4] vduse: Temporarily disable control queue features

2023-10-23 Thread Maxime Coquelin



On 10/23/23 05:08, Jason Wang wrote:

On Fri, Oct 20, 2023 at 11:58 PM Maxime Coquelin
 wrote:


Virtio-net driver control queue implementation is not safe
when used with VDUSE. If the VDUSE application does not
reply to control queue messages, it currently ends up
hanging the kernel thread sending this command.

Some work is on-going to make the control queue
implementation robust with VDUSE. Until it is completed,
let's disable control virtqueue and features that depend on
it.

Signed-off-by: Maxime Coquelin 


I wonder if it's better to do this with patch 2 or before patch 2 to
unbreak the bisection?


I think it would be better to keep it in a dedicated patch to ease the
revert later when your work will have been accepted, so before patch 2.

Thanks,
Maxime


Thanks


---
  drivers/vdpa/vdpa_user/vduse_dev.c | 37 ++
  1 file changed, 37 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 73ad3b7efd8e..0243dee9cf0e 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -28,6 +28,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 

  #include "iova_domain.h"
@@ -46,6 +47,30 @@

  #define IRQ_UNBOUND -1

+#define VDUSE_NET_VALID_FEATURES_MASK   \
+   (BIT_ULL(VIRTIO_NET_F_CSUM) |   \
+BIT_ULL(VIRTIO_NET_F_GUEST_CSUM) | \
+BIT_ULL(VIRTIO_NET_F_MTU) |\
+BIT_ULL(VIRTIO_NET_F_MAC) |\
+BIT_ULL(VIRTIO_NET_F_GUEST_TSO4) | \
+BIT_ULL(VIRTIO_NET_F_GUEST_TSO6) | \
+BIT_ULL(VIRTIO_NET_F_GUEST_ECN) |  \
+BIT_ULL(VIRTIO_NET_F_GUEST_UFO) |  \
+BIT_ULL(VIRTIO_NET_F_HOST_TSO4) |  \
+BIT_ULL(VIRTIO_NET_F_HOST_TSO6) |  \
+BIT_ULL(VIRTIO_NET_F_HOST_ECN) |   \
+BIT_ULL(VIRTIO_NET_F_HOST_UFO) |   \
+BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) |  \
+BIT_ULL(VIRTIO_NET_F_STATUS) | \
+BIT_ULL(VIRTIO_NET_F_HOST_USO) |   \
+BIT_ULL(VIRTIO_F_ANY_LAYOUT) | \
+BIT_ULL(VIRTIO_RING_F_INDIRECT_DESC) | \
+BIT_ULL(VIRTIO_RING_F_EVENT_IDX) |  \
+BIT_ULL(VIRTIO_F_VERSION_1) |  \
+BIT_ULL(VIRTIO_F_ACCESS_PLATFORM) | \
+BIT_ULL(VIRTIO_F_RING_PACKED) |\
+BIT_ULL(VIRTIO_F_IN_ORDER))
+
  struct vduse_virtqueue {
 u16 index;
 u16 num_max;
@@ -1778,6 +1803,16 @@ static struct attribute *vduse_dev_attrs[] = {

  ATTRIBUTE_GROUPS(vduse_dev);

+static void vduse_dev_features_filter(struct vduse_dev_config *config)
+{
+   /*
+* Temporarily filter out virtio-net's control virtqueue and features
+* that depend on it while CVQ is being made more robust for VDUSE.
+*/
+   if (config->device_id == VIRTIO_ID_NET)
+   config->features &= VDUSE_NET_VALID_FEATURES_MASK;
+}
+
  static int vduse_create_dev(struct vduse_dev_config *config,
 void *config_buf, u64 api_version)
  {
@@ -1793,6 +1828,8 @@ static int vduse_create_dev(struct vduse_dev_config 
*config,
 if (!dev)
 goto err;

+   vduse_dev_features_filter(config);
+
 dev->api_version = api_version;
 dev->device_features = config->features;
 dev->device_id = config->device_id;
--
2.41.0





___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v4 1/4] vduse: validate block features only with block devices

2023-10-23 Thread Maxime Coquelin




On 10/21/23 00:07, Casey Schaufler wrote:

On 10/20/2023 8:58 AM, Maxime Coquelin wrote:

This patch is preliminary work to enable network device
type support to VDUSE.

As VIRTIO_BLK_F_CONFIG_WCE shares the same value as
VIRTIO_NET_F_HOST_TSO4, we need to restrict its check
to Virtio-blk device type.

Acked-by: Jason Wang 
Reviewed-by: Xie Yongji 
Signed-off-by: Maxime Coquelin 
---
  drivers/vdpa/vdpa_user/vduse_dev.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index df7869537ef1..5b3879976b3d 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1662,13 +1662,14 @@ static bool device_is_allowed(u32 device_id)
return false;
  }
  
-static bool features_is_valid(u64 features)

+static bool features_is_valid(struct vduse_dev_config *config)


This should either be features_are_valid() or feature_is_valid().
Correct pluralization is important in the English language.


Indeed, I will change to features_are_valid() in next revision.

Thanks,
Maxime


  {
-   if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
+   if (!(config->features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
return false;
  
  	/* Now we only support read-only configuration space */

-   if (features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE))
+   if ((config->device_id == VIRTIO_ID_BLOCK) &&
+   (config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)))
return false;
  
  	return true;

@@ -1695,7 +1696,7 @@ static bool vduse_validate_config(struct vduse_dev_config 
*config)
if (!device_is_allowed(config->device_id))
return false;
  
-	if (!features_is_valid(config->features))

+   if (!features_is_valid(config))
return false;
  
  	return true;




___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4 4/4] vduse: Add LSM hooks to check Virtio device type

2023-10-23 Thread Maxime Coquelin




On 10/21/23 00:20, Casey Schaufler wrote:

On 10/20/2023 8:58 AM, Maxime Coquelin wrote:

This patch introduces LSM hooks for devices creation,
destruction and opening operations, checking the
application is allowed to perform these operations for
the Virtio device type.


Why do you think that there needs to be a special LSM check for virtio
devices? What can't existing device attributes be used?


Michael asked for a way for SELinux to allow/prevent the creation of
some types of devices [0].

A device is created using ioctl() on VDUSE control chardev. Its type is
specified via a field in the structure passed in argument.

I didn't see other way than adding dedicated LSM hooks to achieve this,
but it is possible that their is a better way to do it?

Thanks,
Maxime

[0]: 
https://lore.kernel.org/all/20230829130430-mutt-send-email-...@kernel.org/


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v4 1/4] vduse: validate block features only with block devices

2023-10-20 Thread Maxime Coquelin
This patch is preliminary work to enable network device
type support to VDUSE.

As VIRTIO_BLK_F_CONFIG_WCE shares the same value as
VIRTIO_NET_F_HOST_TSO4, we need to restrict its check
to Virtio-blk device type.

Acked-by: Jason Wang 
Reviewed-by: Xie Yongji 
Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index df7869537ef1..5b3879976b3d 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1662,13 +1662,14 @@ static bool device_is_allowed(u32 device_id)
return false;
 }
 
-static bool features_is_valid(u64 features)
+static bool features_is_valid(struct vduse_dev_config *config)
 {
-   if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
+   if (!(config->features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
return false;
 
/* Now we only support read-only configuration space */
-   if (features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE))
+   if ((config->device_id == VIRTIO_ID_BLOCK) &&
+   (config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)))
return false;
 
return true;
@@ -1695,7 +1696,7 @@ static bool vduse_validate_config(struct vduse_dev_config 
*config)
if (!device_is_allowed(config->device_id))
return false;
 
-   if (!features_is_valid(config->features))
+   if (!features_is_valid(config))
return false;
 
return true;
-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v4 2/4] vduse: enable Virtio-net device type

2023-10-20 Thread Maxime Coquelin
This patch adds Virtio-net device type to the supported
devices types. Initialization fails if the device does
not support VIRTIO_F_VERSION_1 feature, in order to
guarantee the configuration space is read-only.

Acked-by: Jason Wang 
Reviewed-by: Xie Yongji 
Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 5b3879976b3d..73ad3b7efd8e 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -142,6 +142,7 @@ static struct workqueue_struct *vduse_irq_bound_wq;
 
 static u32 allowed_device_id[] = {
VIRTIO_ID_BLOCK,
+   VIRTIO_ID_NET,
 };
 
 static inline struct vduse_dev *vdpa_to_vduse(struct vdpa_device *vdpa)
@@ -1672,6 +1673,10 @@ static bool features_is_valid(struct vduse_dev_config 
*config)
(config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)))
return false;
 
+   if ((config->device_id == VIRTIO_ID_NET) &&
+   !(config->features & (1ULL << VIRTIO_F_VERSION_1)))
+   return false;
+
return true;
 }
 
@@ -2027,6 +2032,7 @@ static const struct vdpa_mgmtdev_ops vdpa_dev_mgmtdev_ops 
= {
 
 static struct virtio_device_id id_table[] = {
{ VIRTIO_ID_BLOCK, VIRTIO_DEV_ANY_ID },
+   { VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
{ 0 },
 };
 
-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v4 0/4] vduse: add support for networking devices

2023-10-20 Thread Maxime Coquelin
This small series enables virtio-net device type in VDUSE.
With it, basic operation have been tested, both with
virtio-vdpa and vhost-vdpa using DPDK Vhost library series
adding VDUSE support using split rings layout (merged in
DPDK v23.07-rc1).

Control queue support (and so multiqueue) has also been
tested, but requires a Kernel series from Jason Wang
relaxing control queue polling [1] to function reliably,
so while Jason rework is done, a patch is added to disable
CVQ and features that depend on it (tested also with DPDK
v23.07-rc1).

In this v4, LSM hooks are added to allow/deny application
to create/destroy/open devices based on their type (Net,
Block).

[1]: 
https://lore.kernel.org/lkml/cacgkmetgrxn3ppwsdo4oosnssljfembez0wvjgrr3whu+qa...@mail.gmail.com/T/

v3->v4 changes:
===
- Add LSM hooks (Michael)
- Rebase

v2 -> v3 changes:
=
- Use allow list instead of deny list (Michael)

v1 -> v2 changes:
=
- Add a patch to disable CVQ (Michael)

RFC -> v1 changes:
==
- Fail device init if it does not support VERSION_1 (Jason)

Maxime Coquelin (4):
  vduse: validate block features only with block devices
  vduse: enable Virtio-net device type
  vduse: Temporarily disable control queue features
  vduse: Add LSM hooks to check Virtio device type

 drivers/vdpa/vdpa_user/vduse_dev.c  | 64 +++--
 include/linux/lsm_hook_defs.h   |  4 ++
 include/linux/security.h| 15 +++
 security/security.c | 42 +++
 security/selinux/hooks.c| 55 +
 security/selinux/include/classmap.h |  2 +
 6 files changed, 178 insertions(+), 4 deletions(-)

-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v4 4/4] vduse: Add LSM hooks to check Virtio device type

2023-10-20 Thread Maxime Coquelin
This patch introduces LSM hooks for devices creation,
destruction and opening operations, checking the
application is allowed to perform these operations for
the Virtio device type.

Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c  | 12 +++
 include/linux/lsm_hook_defs.h   |  4 +++
 include/linux/security.h| 15 
 security/security.c | 42 ++
 security/selinux/hooks.c| 55 +
 security/selinux/include/classmap.h |  2 ++
 6 files changed, 130 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 0243dee9cf0e..ca64eac11ddb 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -8,6 +8,7 @@
  *
  */
 
+#include "linux/security.h"
 #include 
 #include 
 #include 
@@ -1443,6 +1444,10 @@ static int vduse_dev_open(struct inode *inode, struct 
file *file)
if (dev->connected)
goto unlock;
 
+   ret = -EPERM;
+   if (security_vduse_dev_open(dev->device_id))
+   goto unlock;
+
ret = 0;
dev->connected = true;
file->private_data = dev;
@@ -1655,6 +1660,9 @@ static int vduse_destroy_dev(char *name)
if (!dev)
return -EINVAL;
 
+   if (security_vduse_dev_destroy(dev->device_id))
+   return -EPERM;
+
mutex_lock(>lock);
if (dev->vdev || dev->connected) {
mutex_unlock(>lock);
@@ -1819,6 +1827,10 @@ static int vduse_create_dev(struct vduse_dev_config 
*config,
int ret;
struct vduse_dev *dev;
 
+   ret = -EPERM;
+   if (security_vduse_dev_create(config->device_id))
+   goto err;
+
ret = -EEXIST;
if (vduse_find_dev(config->name))
goto err;
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index ac962c4cb44b..0b3999ab3264 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -419,3 +419,7 @@ LSM_HOOK(int, 0, uring_override_creds, const struct cred 
*new)
 LSM_HOOK(int, 0, uring_sqpoll, void)
 LSM_HOOK(int, 0, uring_cmd, struct io_uring_cmd *ioucmd)
 #endif /* CONFIG_IO_URING */
+
+LSM_HOOK(int, 0, vduse_dev_create, u32 device_id)
+LSM_HOOK(int, 0, vduse_dev_destroy, u32 device_id)
+LSM_HOOK(int, 0, vduse_dev_open, u32 device_id)
diff --git a/include/linux/security.h b/include/linux/security.h
index 5f16eecde00b..a650c500f841 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -484,6 +484,9 @@ int security_inode_notifysecctx(struct inode *inode, void 
*ctx, u32 ctxlen);
 int security_inode_setsecctx(struct dentry *dentry, void *ctx, u32 ctxlen);
 int security_inode_getsecctx(struct inode *inode, void **ctx, u32 *ctxlen);
 int security_locked_down(enum lockdown_reason what);
+int security_vduse_dev_create(u32 device_id);
+int security_vduse_dev_destroy(u32 device_id);
+int security_vduse_dev_open(u32 device_id);
 #else /* CONFIG_SECURITY */
 
 static inline int call_blocking_lsm_notifier(enum lsm_event event, void *data)
@@ -1395,6 +1398,18 @@ static inline int security_locked_down(enum 
lockdown_reason what)
 {
return 0;
 }
+static inline int security_vduse_dev_create(u32 device_id)
+{
+   return 0;
+}
+static inline int security_vduse_dev_destroy(u32 device_id)
+{
+   return 0;
+}
+static inline int security_vduse_dev_open(u32 device_id)
+{
+   return 0;
+}
 #endif /* CONFIG_SECURITY */
 
 #if defined(CONFIG_SECURITY) && defined(CONFIG_WATCH_QUEUE)
diff --git a/security/security.c b/security/security.c
index 23b129d482a7..8d7d4d2eca0b 100644
--- a/security/security.c
+++ b/security/security.c
@@ -5337,3 +5337,45 @@ int security_uring_cmd(struct io_uring_cmd *ioucmd)
return call_int_hook(uring_cmd, 0, ioucmd);
 }
 #endif /* CONFIG_IO_URING */
+
+/**
+ * security_vduse_dev_create() - Check if a VDUSE device type creation is 
allowed
+ * @device_id: the Virtio device ID
+ *
+ * Check whether the Virtio device creation is allowed
+ *
+ * Return: Returns 0 if permission is granted.
+ */
+int security_vduse_dev_create(u32 device_id)
+{
+   return call_int_hook(vduse_dev_create, 0, device_id);
+}
+EXPORT_SYMBOL(security_vduse_dev_create);
+
+/**
+ * security_vduse_dev_destroy() - Check if a VDUSE device type destruction is 
allowed
+ * @device_id: the Virtio device ID
+ *
+ * Check whether the Virtio device destruction is allowed
+ *
+ * Return: Returns 0 if permission is granted.
+ */
+int security_vduse_dev_destroy(u32 device_id)
+{
+   return call_int_hook(vduse_dev_destroy, 0, device_id);
+}
+EXPORT_SYMBOL(security_vduse_dev_destroy);
+
+/**
+ * security_vduse_dev_open() - Check if a VDUSE device type opening is allowed
+ * @device_id: the Virtio device ID
+ *
+ * Check whether the Virtio device opening is allowed
+ *
+ * Return: Returns 0 if permissio

[PATCH v4 3/4] vduse: Temporarily disable control queue features

2023-10-20 Thread Maxime Coquelin
Virtio-net driver control queue implementation is not safe
when used with VDUSE. If the VDUSE application does not
reply to control queue messages, it currently ends up
hanging the kernel thread sending this command.

Some work is on-going to make the control queue
implementation robust with VDUSE. Until it is completed,
let's disable control virtqueue and features that depend on
it.

Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 37 ++
 1 file changed, 37 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 73ad3b7efd8e..0243dee9cf0e 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "iova_domain.h"
@@ -46,6 +47,30 @@
 
 #define IRQ_UNBOUND -1
 
+#define VDUSE_NET_VALID_FEATURES_MASK   \
+   (BIT_ULL(VIRTIO_NET_F_CSUM) |   \
+BIT_ULL(VIRTIO_NET_F_GUEST_CSUM) | \
+BIT_ULL(VIRTIO_NET_F_MTU) |\
+BIT_ULL(VIRTIO_NET_F_MAC) |\
+BIT_ULL(VIRTIO_NET_F_GUEST_TSO4) | \
+BIT_ULL(VIRTIO_NET_F_GUEST_TSO6) | \
+BIT_ULL(VIRTIO_NET_F_GUEST_ECN) |  \
+BIT_ULL(VIRTIO_NET_F_GUEST_UFO) |  \
+BIT_ULL(VIRTIO_NET_F_HOST_TSO4) |  \
+BIT_ULL(VIRTIO_NET_F_HOST_TSO6) |  \
+BIT_ULL(VIRTIO_NET_F_HOST_ECN) |   \
+BIT_ULL(VIRTIO_NET_F_HOST_UFO) |   \
+BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) |  \
+BIT_ULL(VIRTIO_NET_F_STATUS) | \
+BIT_ULL(VIRTIO_NET_F_HOST_USO) |   \
+BIT_ULL(VIRTIO_F_ANY_LAYOUT) | \
+BIT_ULL(VIRTIO_RING_F_INDIRECT_DESC) | \
+BIT_ULL(VIRTIO_RING_F_EVENT_IDX) |  \
+BIT_ULL(VIRTIO_F_VERSION_1) |  \
+BIT_ULL(VIRTIO_F_ACCESS_PLATFORM) | \
+BIT_ULL(VIRTIO_F_RING_PACKED) |\
+BIT_ULL(VIRTIO_F_IN_ORDER))
+
 struct vduse_virtqueue {
u16 index;
u16 num_max;
@@ -1778,6 +1803,16 @@ static struct attribute *vduse_dev_attrs[] = {
 
 ATTRIBUTE_GROUPS(vduse_dev);
 
+static void vduse_dev_features_filter(struct vduse_dev_config *config)
+{
+   /*
+* Temporarily filter out virtio-net's control virtqueue and features
+* that depend on it while CVQ is being made more robust for VDUSE.
+*/
+   if (config->device_id == VIRTIO_ID_NET)
+   config->features &= VDUSE_NET_VALID_FEATURES_MASK;
+}
+
 static int vduse_create_dev(struct vduse_dev_config *config,
void *config_buf, u64 api_version)
 {
@@ -1793,6 +1828,8 @@ static int vduse_create_dev(struct vduse_dev_config 
*config,
if (!dev)
goto err;
 
+   vduse_dev_features_filter(config);
+
dev->api_version = api_version;
dev->device_features = config->features;
dev->device_id = config->device_id;
-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC v2 3/4] vduse: update the vq_info in ioctl

2023-09-29 Thread Maxime Coquelin



On 9/12/23 09:39, Jason Wang wrote:

On Tue, Sep 12, 2023 at 11:00 AM Cindy Lu  wrote:


In VDUSE_VQ_GET_INFO, the driver will sync the last_avail_idx
with reconnect info, After mapping the reconnect pages to userspace
The userspace App will update the reconnect_time in
struct vhost_reconnect_vring, If this is not 0 then it means this
vq is reconnected and will update the last_avail_idx

Signed-off-by: Cindy Lu 
---
  drivers/vdpa/vdpa_user/vduse_dev.c | 13 +
  include/uapi/linux/vduse.h |  6 ++
  2 files changed, 19 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 2c69f4004a6e..680b23dbdde2 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1221,6 +1221,8 @@ static long vduse_dev_ioctl(struct file *file, unsigned 
int cmd,
 struct vduse_vq_info vq_info;
 struct vduse_virtqueue *vq;
 u32 index;
+   struct vdpa_reconnect_info *area;
+   struct vhost_reconnect_vring *vq_reconnect;

 ret = -EFAULT;
 if (copy_from_user(_info, argp, sizeof(vq_info)))
@@ -1252,6 +1254,17 @@ static long vduse_dev_ioctl(struct file *file, unsigned 
int cmd,

 vq_info.ready = vq->ready;

+   area = >reconnect_info;
+
+   vq_reconnect = (struct vhost_reconnect_vring *)area->vaddr;
+   /*check if the vq is reconnect, if yes then update the 
last_avail_idx*/
+   if ((vq_reconnect->last_avail_idx !=
+vq_info.split.avail_index) &&
+   (vq_reconnect->reconnect_time != 0)) {
+   vq_info.split.avail_index =
+   vq_reconnect->last_avail_idx;
+   }
+
 ret = -EFAULT;
 if (copy_to_user(argp, _info, sizeof(vq_info)))
 break;
diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
index 11bd48c72c6c..d585425803fd 100644
--- a/include/uapi/linux/vduse.h
+++ b/include/uapi/linux/vduse.h
@@ -350,4 +350,10 @@ struct vduse_dev_response {
 };
  };

+struct vhost_reconnect_vring {
+   __u16 reconnect_time;
+   __u16 last_avail_idx;
+   _Bool avail_wrap_counter;


Please add a comment for each field.

And I never saw _Bool is used in uapi before, maybe it's better to
pack it with last_avail_idx into a __u32.


Better as two distincts __u16 IMHO.

Thanks,
Maxime



Btw, do we need to track inflight descriptors as well?

Thanks


+};
+
  #endif /* _UAPI_VDUSE_H_ */
--
2.34.3





___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC v2 4/4] vduse: Add new ioctl VDUSE_GET_RECONNECT_INFO

2023-09-29 Thread Maxime Coquelin



On 9/25/23 04:57, Jason Wang wrote:

On Thu, Sep 21, 2023 at 10:07 PM Cindy Lu  wrote:


On Mon, Sep 18, 2023 at 4:49 PM Jason Wang  wrote:


On Tue, Sep 12, 2023 at 11:01 AM Cindy Lu  wrote:


In VDUSE_GET_RECONNECT_INFO, the Userspace App can get the map size
and The number of mapping memory pages from the kernel. The userspace
App can use this information to map the pages.

Signed-off-by: Cindy Lu 
---
  drivers/vdpa/vdpa_user/vduse_dev.c | 15 +++
  include/uapi/linux/vduse.h | 15 +++
  2 files changed, 30 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 680b23dbdde2..c99f99892b5c 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1368,6 +1368,21 @@ static long vduse_dev_ioctl(struct file *file, unsigned 
int cmd,
 ret = 0;
 break;
 }
+   case VDUSE_GET_RECONNECT_INFO: {
+   struct vduse_reconnect_mmap_info info;
+
+   ret = -EFAULT;
+   if (copy_from_user(, argp, sizeof(info)))
+   break;
+
+   info.size = PAGE_SIZE;
+   info.max_index = dev->vq_num + 1;
+
+   if (copy_to_user(argp, , sizeof(info)))
+   break;
+   ret = 0;
+   break;
+   }
 default:
 ret = -ENOIOCTLCMD;
 break;
diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
index d585425803fd..ce55e34f63d7 100644
--- a/include/uapi/linux/vduse.h
+++ b/include/uapi/linux/vduse.h
@@ -356,4 +356,19 @@ struct vhost_reconnect_vring {
 _Bool avail_wrap_counter;
  };

+/**
+ * struct vduse_reconnect_mmap_info
+ * @size: mapping memory size, always page_size here
+ * @max_index: the number of pages allocated in kernel,just
+ * use for check
+ */
+
+struct vduse_reconnect_mmap_info {
+   __u32 size;
+   __u32 max_index;
+};


One thing I didn't understand is that, aren't the things we used to
store connection info belong to uAPI? If not, how can we make sure the
connections work across different vendors/implementations. If yes,
where?

Thanks


The process for this reconnecttion  is
A.The first-time connection
1> The userland app checks if the device exists
2>  use the ioctl to create the vduse device
3> Mapping the kernel page to userland and save the
App-version/features/other information to this page
4>  if the Userland app needs to exit, then the Userland app will only
unmap the page and then exit

B, the re-connection
1> the userland app finds the device is existing
2> Mapping the kernel page to userland
3> check if the information in shared memory is satisfied to
reconnect,if ok then continue to reconnect
4> continue working

  For now these information are all from userland,So here the page will
be maintained by the userland App
in the previous code we only saved the api-version by uAPI .  if  we
need to support reconnection maybe we need to add 2 new uAPI for this,
one of the uAPI is to save the reconnect  information and another is
to get the information

maybe something like

struct vhost_reconnect_data {
uint32_t version;
uint64_t features;
uint8_t status;
struct virtio_net_config config;
uint32_t nr_vrings;
};


Probably, then we can make sure the re-connection works across
different vduse-daemon implementations.


+1, we need to have this defined in the uAPI to support interoperability
across different VDUSE userspace implementations.





#define VDUSE_GET_RECONNECT_INFO _IOR (VDUSE_BASE, 0x1c, struct
vhost_reconnect_data)

#define VDUSE_SET_RECONNECT_INFO  _IOWR(VDUSE_BASE, 0x1d, struct
vhost_reconnect_data)


Not sure I get this, but the idea is to map those pages to user space,
any reason we need this uAPI?


It should not be necessary if the mmapped layout is properly defined.

Thanks,
Maxime


Thanks



Thanks
Cindy





+
+#define VDUSE_GET_RECONNECT_INFO \
+   _IOWR(VDUSE_BASE, 0x1b, struct vduse_reconnect_mmap_info)
+
  #endif /* _UAPI_VDUSE_H_ */
--
2.34.3









___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC v2 3/4] vduse: update the vq_info in ioctl

2023-09-29 Thread Maxime Coquelin



On 9/25/23 06:15, Cindy Lu wrote:

On Tue, Sep 12, 2023 at 3:39 PM Jason Wang  wrote:


On Tue, Sep 12, 2023 at 11:00 AM Cindy Lu  wrote:


In VDUSE_VQ_GET_INFO, the driver will sync the last_avail_idx
with reconnect info, After mapping the reconnect pages to userspace
The userspace App will update the reconnect_time in
struct vhost_reconnect_vring, If this is not 0 then it means this
vq is reconnected and will update the last_avail_idx

Signed-off-by: Cindy Lu 
---
  drivers/vdpa/vdpa_user/vduse_dev.c | 13 +
  include/uapi/linux/vduse.h |  6 ++
  2 files changed, 19 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 2c69f4004a6e..680b23dbdde2 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1221,6 +1221,8 @@ static long vduse_dev_ioctl(struct file *file, unsigned 
int cmd,
 struct vduse_vq_info vq_info;
 struct vduse_virtqueue *vq;
 u32 index;
+   struct vdpa_reconnect_info *area;
+   struct vhost_reconnect_vring *vq_reconnect;

 ret = -EFAULT;
 if (copy_from_user(_info, argp, sizeof(vq_info)))
@@ -1252,6 +1254,17 @@ static long vduse_dev_ioctl(struct file *file, unsigned 
int cmd,

 vq_info.ready = vq->ready;

+   area = >reconnect_info;
+
+   vq_reconnect = (struct vhost_reconnect_vring *)area->vaddr;
+   /*check if the vq is reconnect, if yes then update the 
last_avail_idx*/
+   if ((vq_reconnect->last_avail_idx !=
+vq_info.split.avail_index) &&
+   (vq_reconnect->reconnect_time != 0)) {
+   vq_info.split.avail_index =
+   vq_reconnect->last_avail_idx;
+   }
+
 ret = -EFAULT;
 if (copy_to_user(argp, _info, sizeof(vq_info)))
 break;
diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h
index 11bd48c72c6c..d585425803fd 100644
--- a/include/uapi/linux/vduse.h
+++ b/include/uapi/linux/vduse.h
@@ -350,4 +350,10 @@ struct vduse_dev_response {
 };
  };

+struct vhost_reconnect_vring {
+   __u16 reconnect_time;
+   __u16 last_avail_idx;
+   _Bool avail_wrap_counter;


Please add a comment for each field.


Sure will do


And I never saw _Bool is used in uapi before, maybe it's better to
pack it with last_avail_idx into a __u32.


Thanks will fix this

Btw, do we need to track inflight descriptors as well?


I will check this


For existing networking implemenation, this is not necessary.
But it should be for block devices.

Maxime


Thanks

cindy

Thanks


+};
+
  #endif /* _UAPI_VDUSE_H_ */
--
2.34.3







___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v3 0/3] vduse: add support for networking devices

2023-08-30 Thread Maxime Coquelin




On 8/29/23 19:05, Michael S. Tsirkin wrote:

On Tue, Aug 29, 2023 at 03:34:06PM +0200, Maxime Coquelin wrote:



On 8/11/23 00:00, Jakub Kicinski wrote:

On Thu, 10 Aug 2023 17:42:11 -0400 Michael S. Tsirkin wrote:

Directly into the stack? I thought VDUSE is vDPA in user space,
meaning to get to the kernel the packet has to first go thru
a virtio-net instance.


yes. is that a sufficient filter in your opinion?


Yes, the ability to create the device feels stronger than CAP_NET_RAW,
and a bit tangential to CAP_NET_ADMIN. But I don't have much practical
experience with virt so no strong opinion, perhaps it does make sense
for someone's deployment? Dunno..



I'm not sure CAP_NET_ADMIN should be required for creating the VDUSE
devices, as the device could be attached to vhost-vDPA and so not
visible to the Kernel networking stack.

However, CAP_NET_ADMIN should be required to attach the VDUSE device to
virtio-vdpa/virtio-net.

Does that make sense?

Maxime


OK. How are we going to enforce it?


Actually, it seems already enforced for all VDPA devices types.
Indeed, the VDPA_CMD_DEV_NEW Netlink command used to add the device to
the VDPA bus has the GENL_ADMIN_PERM flag set, and so require
CAT_NET_ADMIN.


Also, we need a way for selinux to enable/disable some of these things
but not others.


Ok, I can do it in a patch on top.
Do you have a pointer where it is done for Virtio Block devices?

Maxime

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v3 0/3] vduse: add support for networking devices

2023-08-29 Thread Maxime Coquelin




On 8/11/23 00:00, Jakub Kicinski wrote:

On Thu, 10 Aug 2023 17:42:11 -0400 Michael S. Tsirkin wrote:

Directly into the stack? I thought VDUSE is vDPA in user space,
meaning to get to the kernel the packet has to first go thru
a virtio-net instance.


yes. is that a sufficient filter in your opinion?


Yes, the ability to create the device feels stronger than CAP_NET_RAW,
and a bit tangential to CAP_NET_ADMIN. But I don't have much practical
experience with virt so no strong opinion, perhaps it does make sense
for someone's deployment? Dunno..



I'm not sure CAP_NET_ADMIN should be required for creating the VDUSE
devices, as the device could be attached to vhost-vDPA and so not
visible to the Kernel networking stack.

However, CAP_NET_ADMIN should be required to attach the VDUSE device to 
virtio-vdpa/virtio-net.


Does that make sense?

Maxime

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net-next v4 2/2] virtio-net: add cond_resched() to the command waiting loop

2023-07-21 Thread Maxime Coquelin




On 7/21/23 17:10, Michael S. Tsirkin wrote:

On Fri, Jul 21, 2023 at 04:58:04PM +0200, Maxime Coquelin wrote:



On 7/21/23 16:45, Michael S. Tsirkin wrote:

On Fri, Jul 21, 2023 at 04:37:00PM +0200, Maxime Coquelin wrote:



On 7/20/23 23:02, Michael S. Tsirkin wrote:

On Thu, Jul 20, 2023 at 01:26:20PM -0700, Shannon Nelson wrote:

On 7/20/23 1:38 AM, Jason Wang wrote:


Adding cond_resched() to the command waiting loop for a better
co-operation with the scheduler. This allows to give CPU a breath to
run other task(workqueue) instead of busy looping when preemption is
not allowed on a device whose CVQ might be slow.

Signed-off-by: Jason Wang 


This still leaves hung processes, but at least it doesn't pin the CPU any
more.  Thanks.
Reviewed-by: Shannon Nelson 



I'd like to see a full solution
1- block until interrupt


Would it make sense to also have a timeout?
And when timeout expires, set FAILED bit in device status?


virtio spec does not set any limits on the timing of vq
processing.


Indeed, but I thought the driver could decide it is too long for it.

The issue is we keep waiting with rtnl locked, it can quickly make the
system unusable.


if this is a problem we should find a way not to keep rtnl
locked indefinitely.


From the tests I have done, I think it is. With OVS, a reconfiguration 
is performed when the VDUSE device is added, and when a MLX5 device is

in the same bridge, it ends up doing an ioctl() that tries to take the
rtnl lock. In this configuration, it is not possible to kill OVS because
it is stuck trying to acquire rtnl lock for mlx5 that is held by virtio-
net.




2- still handle surprise removal correctly by waking in that case




---
 drivers/net/virtio_net.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 9f3b1d6ac33d..e7533f29b219 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2314,8 +2314,10 @@ static bool virtnet_send_command(struct virtnet_info 
*vi, u8 class, u8 cmd,
 * into the hypervisor, so the request should be handled 
immediately.
 */
while (!virtqueue_get_buf(vi->cvq, ) &&
-  !virtqueue_is_broken(vi->cvq))
+  !virtqueue_is_broken(vi->cvq)) {
+   cond_resched();
cpu_relax();
+   }

return vi->ctrl->status == VIRTIO_NET_OK;
 }
--
2.39.3

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization








___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net-next v4 2/2] virtio-net: add cond_resched() to the command waiting loop

2023-07-21 Thread Maxime Coquelin




On 7/21/23 16:45, Michael S. Tsirkin wrote:

On Fri, Jul 21, 2023 at 04:37:00PM +0200, Maxime Coquelin wrote:



On 7/20/23 23:02, Michael S. Tsirkin wrote:

On Thu, Jul 20, 2023 at 01:26:20PM -0700, Shannon Nelson wrote:

On 7/20/23 1:38 AM, Jason Wang wrote:


Adding cond_resched() to the command waiting loop for a better
co-operation with the scheduler. This allows to give CPU a breath to
run other task(workqueue) instead of busy looping when preemption is
not allowed on a device whose CVQ might be slow.

Signed-off-by: Jason Wang 


This still leaves hung processes, but at least it doesn't pin the CPU any
more.  Thanks.
Reviewed-by: Shannon Nelson 



I'd like to see a full solution
1- block until interrupt


Would it make sense to also have a timeout?
And when timeout expires, set FAILED bit in device status?


virtio spec does not set any limits on the timing of vq
processing.


Indeed, but I thought the driver could decide it is too long for it.

The issue is we keep waiting with rtnl locked, it can quickly make the
system unusable.


2- still handle surprise removal correctly by waking in that case




---
drivers/net/virtio_net.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 9f3b1d6ac33d..e7533f29b219 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2314,8 +2314,10 @@ static bool virtnet_send_command(struct virtnet_info 
*vi, u8 class, u8 cmd,
* into the hypervisor, so the request should be handled immediately.
*/
   while (!virtqueue_get_buf(vi->cvq, ) &&
-  !virtqueue_is_broken(vi->cvq))
+  !virtqueue_is_broken(vi->cvq)) {
+   cond_resched();
   cpu_relax();
+   }

   return vi->ctrl->status == VIRTIO_NET_OK;
}
--
2.39.3

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization






___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net-next v4 2/2] virtio-net: add cond_resched() to the command waiting loop

2023-07-21 Thread Maxime Coquelin




On 7/20/23 23:02, Michael S. Tsirkin wrote:

On Thu, Jul 20, 2023 at 01:26:20PM -0700, Shannon Nelson wrote:

On 7/20/23 1:38 AM, Jason Wang wrote:


Adding cond_resched() to the command waiting loop for a better
co-operation with the scheduler. This allows to give CPU a breath to
run other task(workqueue) instead of busy looping when preemption is
not allowed on a device whose CVQ might be slow.

Signed-off-by: Jason Wang 


This still leaves hung processes, but at least it doesn't pin the CPU any
more.  Thanks.
Reviewed-by: Shannon Nelson 



I'd like to see a full solution
1- block until interrupt


Would it make sense to also have a timeout?
And when timeout expires, set FAILED bit in device status?


2- still handle surprise removal correctly by waking in that case




---
   drivers/net/virtio_net.c | 4 +++-
   1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 9f3b1d6ac33d..e7533f29b219 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2314,8 +2314,10 @@ static bool virtnet_send_command(struct virtnet_info 
*vi, u8 class, u8 cmd,
   * into the hypervisor, so the request should be handled immediately.
   */
  while (!virtqueue_get_buf(vi->cvq, ) &&
-  !virtqueue_is_broken(vi->cvq))
+  !virtqueue_is_broken(vi->cvq)) {
+   cond_resched();
  cpu_relax();
+   }

  return vi->ctrl->status == VIRTIO_NET_OK;
   }
--
2.39.3

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization




___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC 2/4] vduse: Add file operation for mmap

2023-07-11 Thread Maxime Coquelin




On 6/28/23 08:59, Cindy Lu wrote:

From: Your Name 

Add the operation for mmap, The user space APP will
use this function to map the pages to userspace

Signed-off-by: Cindy Lu 
---
  drivers/vdpa/vdpa_user/vduse_dev.c | 49 ++
  1 file changed, 49 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index f845dc46b1db..1b833bf0ae37 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1313,6 +1313,54 @@ static struct vduse_dev *vduse_dev_get_from_minor(int 
minor)
return dev;
  }
  
+

+static vm_fault_t vduse_vm_fault(struct vm_fault *vmf)
+{
+   struct vduse_dev *dev = vmf->vma->vm_file->private_data;
+   struct vm_area_struct *vma = vmf->vma;
+   u16 index = vma->vm_pgoff;
+
+   struct vdpa_reconnect_info *info;
+   info = >reconnect_info[index];
+
+   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+   if (remap_pfn_range(vma, vmf->address & PAGE_MASK, PFN_DOWN(info->addr),
+   PAGE_SIZE, vma->vm_page_prot))
+   return VM_FAULT_SIGBUS;
+   return VM_FAULT_NOPAGE;
+}
+
+static const struct vm_operations_struct vduse_vm_ops = {
+   .fault = vduse_vm_fault,
+};
+
+static int vduse_mmap(struct file *file, struct vm_area_struct *vma)
+{
+   struct vduse_dev *dev = file->private_data;
+   struct vdpa_reconnect_info *info;
+   unsigned long index = vma->vm_pgoff;
+
+   if (vma->vm_end - vma->vm_start != PAGE_SIZE)
+   return -EINVAL;
+   if ((vma->vm_flags & VM_SHARED) == 0)
+   return -EINVAL;
+
+   if (index > 65535)
+   return -EINVAL;


You declare an array of 64 entries in patch 1, so it can overflow.


+
+   info = >reconnect_info[index];
+   if (info->addr & (PAGE_SIZE - 1))
+   return -EINVAL;
+   if (vma->vm_end - vma->vm_start != info->size) {
+   return -ENOTSUPP;
+   }
+
+   vm_flags_set(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP);
+   vma->vm_ops = _vm_ops;
+
+   return 0;
+}
+
  static int vduse_dev_open(struct inode *inode, struct file *file)
  {
int ret;
@@ -1345,6 +1393,7 @@ static const struct file_operations vduse_dev_fops = {
.unlocked_ioctl = vduse_dev_ioctl,
.compat_ioctl   = compat_ptr_ioctl,
.llseek = noop_llseek,
+   .mmap   = vduse_mmap,
  };
  
  static struct vduse_dev *vduse_dev_create(void)


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC 1/4] vduse: Add the struct to save the vq reconnect info

2023-07-11 Thread Maxime Coquelin

Hello Cindy,

On 6/28/23 08:59, Cindy Lu wrote:

From: Your Name 

this struct is to save the reconnect info struct, in this
struct saved the page info that alloc to save the
reconnect info

Signed-off-by: Cindy Lu 
---
  drivers/vdpa/vdpa_user/vduse_dev.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 26b7e29cb900..f845dc46b1db 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -72,6 +72,12 @@ struct vduse_umem {
struct page **pages;
struct mm_struct *mm;
  };
+struct vdpa_reconnect_info {
+   u32 index;
+   phys_addr_t addr;
+   unsigned long vaddr;
+   phys_addr_t size;
+};
  
  struct vduse_dev {

struct vduse_vdpa *vdev;
@@ -106,6 +112,7 @@ struct vduse_dev {
u32 vq_align;
struct vduse_umem *umem;
struct mutex mem_lock;
+   struct vdpa_reconnect_info reconnect_info[64];


Why 64?
Shouldn't it be part of struct vduse_virtqueue instead?


  };
  
  struct vduse_dev_msg {


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH] vduse: Use proper spinlock for IRQ injection

2023-07-05 Thread Maxime Coquelin
The IRQ injection work used spin_lock_irq() to protect the
scheduling of the softirq, but spin_lock_bh() should be
used.

With spin_lock_irq(), we noticed delay of more than 6
seconds between the time a NAPI polling work is scheduled
and the time it is executed.

Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace")
Cc: xieyon...@bytedance.com

Suggested-by: Jason Wang 
Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index dc38ed21319d..df7869537ef1 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -935,10 +935,10 @@ static void vduse_dev_irq_inject(struct work_struct *work)
 {
struct vduse_dev *dev = container_of(work, struct vduse_dev, inject);
 
-   spin_lock_irq(>irq_lock);
+   spin_lock_bh(>irq_lock);
if (dev->config_cb.callback)
dev->config_cb.callback(dev->config_cb.private);
-   spin_unlock_irq(>irq_lock);
+   spin_unlock_bh(>irq_lock);
 }
 
 static void vduse_vq_irq_inject(struct work_struct *work)
@@ -946,10 +946,10 @@ static void vduse_vq_irq_inject(struct work_struct *work)
struct vduse_virtqueue *vq = container_of(work,
struct vduse_virtqueue, inject);
 
-   spin_lock_irq(>irq_lock);
+   spin_lock_bh(>irq_lock);
if (vq->ready && vq->cb.callback)
vq->cb.callback(vq->cb.private);
-   spin_unlock_irq(>irq_lock);
+   spin_unlock_bh(>irq_lock);
 }
 
 static bool vduse_vq_signal_irqfd(struct vduse_virtqueue *vq)
-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v3 3/3] vduse: Temporarily disable control queue features

2023-07-05 Thread Maxime Coquelin
Virtio-net driver control queue implementation is not safe
when used with VDUSE. If the VDUSE application does not
reply to control queue messages, it currently ends up
hanging the kernel thread sending this command.

Some work is on-going to make the control queue
implementation robust with VDUSE. Until it is completed,
let's filter out control virtqueue and features that depend
on it by keeping only features known to be supported.

Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 36 ++
 1 file changed, 36 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 1271c9796517..7345071db0a8 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -46,6 +46,30 @@
 
 #define IRQ_UNBOUND -1
 
+#define VDUSE_NET_VALID_FEATURES_MASK   \
+   (BIT_ULL(VIRTIO_NET_F_CSUM) |   \
+BIT_ULL(VIRTIO_NET_F_GUEST_CSUM) | \
+BIT_ULL(VIRTIO_NET_F_MTU) |\
+BIT_ULL(VIRTIO_NET_F_MAC) |\
+BIT_ULL(VIRTIO_NET_F_GUEST_TSO4) | \
+BIT_ULL(VIRTIO_NET_F_GUEST_TSO6) | \
+BIT_ULL(VIRTIO_NET_F_GUEST_ECN) |  \
+BIT_ULL(VIRTIO_NET_F_GUEST_UFO) |  \
+BIT_ULL(VIRTIO_NET_F_HOST_TSO4) |  \
+BIT_ULL(VIRTIO_NET_F_HOST_TSO6) |  \
+BIT_ULL(VIRTIO_NET_F_HOST_ECN) |   \
+BIT_ULL(VIRTIO_NET_F_HOST_UFO) |   \
+BIT_ULL(VIRTIO_NET_F_MRG_RXBUF) |  \
+BIT_ULL(VIRTIO_NET_F_STATUS) | \
+BIT_ULL(VIRTIO_NET_F_HOST_USO) |   \
+BIT_ULL(VIRTIO_F_ANY_LAYOUT) | \
+BIT_ULL(VIRTIO_RING_F_INDIRECT_DESC) | \
+BIT_ULL(VIRTIO_F_EVENT_IDX) |  \
+BIT_ULL(VIRTIO_F_VERSION_1) |  \
+BIT_ULL(VIRTIO_F_IOMMU_PLATFORM) | \
+BIT_ULL(VIRTIO_F_RING_PACKED) |\
+BIT_ULL(VIRTIO_F_IN_ORDER))
+
 struct vduse_virtqueue {
u16 index;
u16 num_max;
@@ -1778,6 +1802,16 @@ static struct attribute *vduse_dev_attrs[] = {
 
 ATTRIBUTE_GROUPS(vduse_dev);
 
+static void vduse_dev_features_filter(struct vduse_dev_config *config)
+{
+   /*
+* Temporarily filter out virtio-net's control virtqueue and features
+* that depend on it while CVQ is being made more robust for VDUSE.
+*/
+   if (config->device_id == VIRTIO_ID_NET)
+   config->features &= VDUSE_NET_VALID_FEATURES_MASK;
+}
+
 static int vduse_create_dev(struct vduse_dev_config *config,
void *config_buf, u64 api_version)
 {
@@ -1793,6 +1827,8 @@ static int vduse_create_dev(struct vduse_dev_config 
*config,
if (!dev)
goto err;
 
+   vduse_dev_features_filter(config);
+
dev->api_version = api_version;
dev->device_features = config->features;
dev->device_id = config->device_id;
-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v3 1/3] vduse: validate block features only with block devices

2023-07-05 Thread Maxime Coquelin
This patch is preliminary work to enable network device
type support to VDUSE.

As VIRTIO_BLK_F_CONFIG_WCE shares the same value as
VIRTIO_NET_F_HOST_TSO4, we need to restrict its check
to Virtio-blk device type.

Acked-by: Jason Wang 
Reviewed-by: Xie Yongji 
Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index dc38ed21319d..ff9fdd6783fe 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1662,13 +1662,14 @@ static bool device_is_allowed(u32 device_id)
return false;
 }
 
-static bool features_is_valid(u64 features)
+static bool features_is_valid(struct vduse_dev_config *config)
 {
-   if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
+   if (!(config->features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
return false;
 
/* Now we only support read-only configuration space */
-   if (features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE))
+   if ((config->device_id == VIRTIO_ID_BLOCK) &&
+   (config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)))
return false;
 
return true;
@@ -1695,7 +1696,7 @@ static bool vduse_validate_config(struct vduse_dev_config 
*config)
if (!device_is_allowed(config->device_id))
return false;
 
-   if (!features_is_valid(config->features))
+   if (!features_is_valid(config))
return false;
 
return true;
-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v3 0/3] vduse: add support for networking devices

2023-07-05 Thread Maxime Coquelin
This small series enables virtio-net device type in VDUSE.
With it, basic operation have been tested, both with
virtio-vdpa and vhost-vdpa using DPDK Vhost library series
adding VDUSE support using split rings layout (merged in
DPDK v23.07-rc1).

Control queue support (and so multiqueue) has also been
tested, but requires a Kernel series from Jason Wang
relaxing control queue polling [1] to function reliably,
so while Jason rework is done, a patch is added to disable
CVQ and features that depend on it (tested also with DPDK
v23.07-rc1).

[1]: 
https://lore.kernel.org/lkml/cacgkmetgrxn3ppwsdo4oosnssljfembez0wvjgrr3whu+qa...@mail.gmail.com/T/

v2 -> v3 changes:
=
- Use allow list instead of deny list (Michael)

v1 -> v2 changes:
=
- Add a patch to disable CVQ (Michael)

RFC -> v1 changes:
==
- Fail device init if it does not support VERSION_1 (Jason)

Maxime Coquelin (3):
  vduse: validate block features only with block devices
  vduse: enable Virtio-net device type
  vduse: Temporarily disable control queue features

 drivers/vdpa/vdpa_user/vduse_dev.c | 51 +++---
 1 file changed, 47 insertions(+), 4 deletions(-)

-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v3 2/3] vduse: enable Virtio-net device type

2023-07-05 Thread Maxime Coquelin
This patch adds Virtio-net device type to the supported
devices types. Initialization fails if the device does
not support VIRTIO_F_VERSION_1 feature, in order to
guarantee the configuration space is read-only.

Acked-by: Jason Wang 
Reviewed-by: Xie Yongji 
Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index ff9fdd6783fe..1271c9796517 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -142,6 +142,7 @@ static struct workqueue_struct *vduse_irq_bound_wq;
 
 static u32 allowed_device_id[] = {
VIRTIO_ID_BLOCK,
+   VIRTIO_ID_NET,
 };
 
 static inline struct vduse_dev *vdpa_to_vduse(struct vdpa_device *vdpa)
@@ -1672,6 +1673,10 @@ static bool features_is_valid(struct vduse_dev_config 
*config)
(config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)))
return false;
 
+   if ((config->device_id == VIRTIO_ID_NET) &&
+   !(config->features & (1ULL << VIRTIO_F_VERSION_1)))
+   return false;
+
return true;
 }
 
@@ -2027,6 +2032,7 @@ static const struct vdpa_mgmtdev_ops vdpa_dev_mgmtdev_ops 
= {
 
 static struct virtio_device_id id_table[] = {
{ VIRTIO_ID_BLOCK, VIRTIO_DEV_ANY_ID },
+   { VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
{ 0 },
 };
 
-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 3/3] vduse: Temporarily disable control queue features

2023-07-04 Thread Maxime Coquelin




On 7/4/23 18:43, Michael S. Tsirkin wrote:

On Tue, Jul 04, 2023 at 06:40:45PM +0200, Maxime Coquelin wrote:

Virtio-net driver control queue implementation is not safe
when used with VDUSE. If the VDUSE application does not
reply to control queue messages, it currently ends up
hanging the kernel thread sending this command.

Some work is on-going to make the control queue
implementation robust with VDUSE. Until it is completed,
let's disable control virtqueue and features that depend on
it.

Signed-off-by: Maxime Coquelin 
---
  drivers/vdpa/vdpa_user/vduse_dev.c | 21 +
  1 file changed, 21 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 1271c9796517..04367a53802b 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1778,6 +1778,25 @@ static struct attribute *vduse_dev_attrs[] = {
  
  ATTRIBUTE_GROUPS(vduse_dev);
  
+static void vduse_dev_features_fixup(struct vduse_dev_config *config)

+{
+   if (config->device_id == VIRTIO_ID_NET) {
+   /*
+* Temporarily disable control virtqueue and features that
+* depend on it while CVQ is being made more robust for VDUSE.
+*/
+   config->features &= ~((1ULL << VIRTIO_NET_F_CTRL_VQ) |
+   (1ULL << VIRTIO_NET_F_CTRL_RX) |
+   (1ULL << VIRTIO_NET_F_CTRL_VLAN) |
+   (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) |
+   (1ULL << VIRTIO_NET_F_MQ) |
+   (1ULL << VIRTIO_NET_F_CTRL_MAC_ADDR) |
+   (1ULL << VIRTIO_NET_F_RSS) |
+   (1ULL << VIRTIO_NET_F_HASH_REPORT) |
+   (1ULL << VIRTIO_NET_F_NOTF_COAL));
+   }
+}
+



This will never be exhaustive, we are adding new features.
Please add an allowlist with just legal ones instead.


Ok, got it!
I'll post a new revision.

Thanks,
Maxime





  static int vduse_create_dev(struct vduse_dev_config *config,
void *config_buf, u64 api_version)
  {
@@ -1793,6 +1812,8 @@ static int vduse_create_dev(struct vduse_dev_config 
*config,
if (!dev)
goto err;
  
+	vduse_dev_features_fixup(config);

+
dev->api_version = api_version;
dev->device_features = config->features;
dev->device_id = config->device_id;
--
2.41.0




___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 3/3] vduse: Temporarily disable control queue features

2023-07-04 Thread Maxime Coquelin
Virtio-net driver control queue implementation is not safe
when used with VDUSE. If the VDUSE application does not
reply to control queue messages, it currently ends up
hanging the kernel thread sending this command.

Some work is on-going to make the control queue
implementation robust with VDUSE. Until it is completed,
let's disable control virtqueue and features that depend on
it.

Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 1271c9796517..04367a53802b 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1778,6 +1778,25 @@ static struct attribute *vduse_dev_attrs[] = {
 
 ATTRIBUTE_GROUPS(vduse_dev);
 
+static void vduse_dev_features_fixup(struct vduse_dev_config *config)
+{
+   if (config->device_id == VIRTIO_ID_NET) {
+   /*
+* Temporarily disable control virtqueue and features that
+* depend on it while CVQ is being made more robust for VDUSE.
+*/
+   config->features &= ~((1ULL << VIRTIO_NET_F_CTRL_VQ) |
+   (1ULL << VIRTIO_NET_F_CTRL_RX) |
+   (1ULL << VIRTIO_NET_F_CTRL_VLAN) |
+   (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE) |
+   (1ULL << VIRTIO_NET_F_MQ) |
+   (1ULL << VIRTIO_NET_F_CTRL_MAC_ADDR) |
+   (1ULL << VIRTIO_NET_F_RSS) |
+   (1ULL << VIRTIO_NET_F_HASH_REPORT) |
+   (1ULL << VIRTIO_NET_F_NOTF_COAL));
+   }
+}
+
 static int vduse_create_dev(struct vduse_dev_config *config,
void *config_buf, u64 api_version)
 {
@@ -1793,6 +1812,8 @@ static int vduse_create_dev(struct vduse_dev_config 
*config,
if (!dev)
goto err;
 
+   vduse_dev_features_fixup(config);
+
dev->api_version = api_version;
dev->device_features = config->features;
dev->device_id = config->device_id;
-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 1/3] vduse: validate block features only with block devices

2023-07-04 Thread Maxime Coquelin
This patch is preliminary work to enable network device
type support to VDUSE.

As VIRTIO_BLK_F_CONFIG_WCE shares the same value as
VIRTIO_NET_F_HOST_TSO4, we need to restrict its check
to Virtio-blk device type.

Acked-by: Jason Wang 
Reviewed-by: Xie Yongji 
Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index dc38ed21319d..ff9fdd6783fe 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1662,13 +1662,14 @@ static bool device_is_allowed(u32 device_id)
return false;
 }
 
-static bool features_is_valid(u64 features)
+static bool features_is_valid(struct vduse_dev_config *config)
 {
-   if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
+   if (!(config->features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
return false;
 
/* Now we only support read-only configuration space */
-   if (features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE))
+   if ((config->device_id == VIRTIO_ID_BLOCK) &&
+   (config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)))
return false;
 
return true;
@@ -1695,7 +1696,7 @@ static bool vduse_validate_config(struct vduse_dev_config 
*config)
if (!device_is_allowed(config->device_id))
return false;
 
-   if (!features_is_valid(config->features))
+   if (!features_is_valid(config))
return false;
 
return true;
-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 2/3] vduse: enable Virtio-net device type

2023-07-04 Thread Maxime Coquelin
This patch adds Virtio-net device type to the supported
devices types. Initialization fails if the device does
not support VIRTIO_F_VERSION_1 feature, in order to
guarantee the configuration space is read-only.

Acked-by: Jason Wang 
Reviewed-by: Xie Yongji 
Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index ff9fdd6783fe..1271c9796517 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -142,6 +142,7 @@ static struct workqueue_struct *vduse_irq_bound_wq;
 
 static u32 allowed_device_id[] = {
VIRTIO_ID_BLOCK,
+   VIRTIO_ID_NET,
 };
 
 static inline struct vduse_dev *vdpa_to_vduse(struct vdpa_device *vdpa)
@@ -1672,6 +1673,10 @@ static bool features_is_valid(struct vduse_dev_config 
*config)
(config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)))
return false;
 
+   if ((config->device_id == VIRTIO_ID_NET) &&
+   !(config->features & (1ULL << VIRTIO_F_VERSION_1)))
+   return false;
+
return true;
 }
 
@@ -2027,6 +2032,7 @@ static const struct vdpa_mgmtdev_ops vdpa_dev_mgmtdev_ops 
= {
 
 static struct virtio_device_id id_table[] = {
{ VIRTIO_ID_BLOCK, VIRTIO_DEV_ANY_ID },
+   { VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
{ 0 },
 };
 
-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 0/3] vduse: add support for networking devices

2023-07-04 Thread Maxime Coquelin
This small series enables virtio-net device type in VDUSE.
With it, basic operation have been tested, both with
virtio-vdpa and vhost-vdpa using DPDK Vhost library series
adding VDUSE support using split rings layout (merged in
DPDK v23.07-rc1).

Control queue support (and so multiqueue) has also been
tested, but requires a Kernel series from Jason Wang
relaxing control queue polling [1] to function reliably,
so while Jason rework is done, a patch is added to disable
CVQ and features that depend on it (tested also with DPDK
v23.07-rc1).

[1]: 
https://lore.kernel.org/lkml/cacgkmetgrxn3ppwsdo4oosnssljfembez0wvjgrr3whu+qa...@mail.gmail.com/T/

v1 -> v2 changes:
=
- Add a patch to disable CVQ (Michael)

RFC -> v1 changes:
==
- Fail device init if it does not support VERSION_1 (Jason)

Maxime Coquelin (3):
  vduse: validate block features only with block devices
  vduse: enable Virtio-net device type
  vduse: Temporarily disable control queue features

 drivers/vdpa/vdpa_user/vduse_dev.c | 36 ++
 1 file changed, 32 insertions(+), 4 deletions(-)

-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v1 0/2] vduse: add support for networking devices

2023-07-04 Thread Maxime Coquelin



On 7/4/23 11:59, Michael S. Tsirkin wrote:

On Tue, Jul 04, 2023 at 10:43:07AM +0200, Maxime Coquelin wrote:



On 7/3/23 23:45, Michael S. Tsirkin wrote:

On Mon, Jul 03, 2023 at 09:43:49AM +0200, Maxime Coquelin wrote:


On 7/3/23 08:44, Jason Wang wrote:

On Sun, Jul 2, 2023 at 9:37 PM Michael S. Tsirkin  wrote:


On Tue, Jun 27, 2023 at 01:36:50PM +0200, Maxime Coquelin wrote:

This small series enables virtio-net device type in VDUSE.
With it, basic operation have been tested, both with
virtio-vdpa and vhost-vdpa using DPDK Vhost library series
adding VDUSE support using split rings layout (merged in
DPDK v23.07-rc1).

Control queue support (and so multiqueue) has also been
tested, but requires a Kernel series from Jason Wang
relaxing control queue polling [1] to function reliably.

[1]: 
https://lore.kernel.org/lkml/cacgkmetgrxn3ppwsdo4oosnssljfembez0wvjgrr3whu+qa...@mail.gmail.com/T/


Jason promised to post a new version of that patch.
Right Jason?


Yes.


For now let's make sure CVQ feature flag is off?


We can do that and relax on top of my patch.


I agree? Do you prefer a features negotiation, or failing init (like
done for VERSION_1) if the VDUSE application advertises CVQ?

Thanks,
Maxime


Unfortunately guests fail probe if feature set is inconsistent.
So I don't think passing through features is a good idea,
you need a list of legal bits. And when doing this,
clear CVQ and everything that depends on it.


Since this is temporary, while cvq is made more robust, I think it is
better to fail VDUSE device creation if CVQ feature is advertised by the
VDUSE application, instead of ensuring features depending on CVQ are
also cleared.

Jason seems to think likewise, would that work for you?

Thanks,
Maxime


Nothing is more permanent than temporary solutions.
My concern would be that hardware devices then start masking CVQ
intentionally just to avoid the pain of broken software.


Got it, I'll add a patch on top that filters out CVQ feature and all the
features that depend on it.

Thanks,
Maxime







Thanks




RFC -> v1 changes:
==
- Fail device init if it does not support VERSION_1 (Jason)

Maxime Coquelin (2):
 vduse: validate block features only with block devices
 vduse: enable Virtio-net device type

drivers/vdpa/vdpa_user/vduse_dev.c | 15 +++
1 file changed, 11 insertions(+), 4 deletions(-)

--
2.41.0










___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v1 0/2] vduse: add support for networking devices

2023-07-04 Thread Maxime Coquelin



On 7/3/23 23:45, Michael S. Tsirkin wrote:

On Mon, Jul 03, 2023 at 09:43:49AM +0200, Maxime Coquelin wrote:


On 7/3/23 08:44, Jason Wang wrote:

On Sun, Jul 2, 2023 at 9:37 PM Michael S. Tsirkin  wrote:


On Tue, Jun 27, 2023 at 01:36:50PM +0200, Maxime Coquelin wrote:

This small series enables virtio-net device type in VDUSE.
With it, basic operation have been tested, both with
virtio-vdpa and vhost-vdpa using DPDK Vhost library series
adding VDUSE support using split rings layout (merged in
DPDK v23.07-rc1).

Control queue support (and so multiqueue) has also been
tested, but requires a Kernel series from Jason Wang
relaxing control queue polling [1] to function reliably.

[1]: 
https://lore.kernel.org/lkml/cacgkmetgrxn3ppwsdo4oosnssljfembez0wvjgrr3whu+qa...@mail.gmail.com/T/


Jason promised to post a new version of that patch.
Right Jason?


Yes.


For now let's make sure CVQ feature flag is off?


We can do that and relax on top of my patch.


I agree? Do you prefer a features negotiation, or failing init (like
done for VERSION_1) if the VDUSE application advertises CVQ?

Thanks,
Maxime


Unfortunately guests fail probe if feature set is inconsistent.
So I don't think passing through features is a good idea,
you need a list of legal bits. And when doing this,
clear CVQ and everything that depends on it.


Since this is temporary, while cvq is made more robust, I think it is
better to fail VDUSE device creation if CVQ feature is advertised by the
VDUSE application, instead of ensuring features depending on CVQ are
also cleared.

Jason seems to think likewise, would that work for you?

Thanks,
Maxime





Thanks




RFC -> v1 changes:
==
- Fail device init if it does not support VERSION_1 (Jason)

Maxime Coquelin (2):
vduse: validate block features only with block devices
vduse: enable Virtio-net device type

   drivers/vdpa/vdpa_user/vduse_dev.c | 15 +++
   1 file changed, 11 insertions(+), 4 deletions(-)

--
2.41.0








___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v1 0/2] vduse: add support for networking devices

2023-07-03 Thread Maxime Coquelin


On 7/3/23 08:44, Jason Wang wrote:

On Sun, Jul 2, 2023 at 9:37 PM Michael S. Tsirkin  wrote:


On Tue, Jun 27, 2023 at 01:36:50PM +0200, Maxime Coquelin wrote:

This small series enables virtio-net device type in VDUSE.
With it, basic operation have been tested, both with
virtio-vdpa and vhost-vdpa using DPDK Vhost library series
adding VDUSE support using split rings layout (merged in
DPDK v23.07-rc1).

Control queue support (and so multiqueue) has also been
tested, but requires a Kernel series from Jason Wang
relaxing control queue polling [1] to function reliably.

[1]: 
https://lore.kernel.org/lkml/cacgkmetgrxn3ppwsdo4oosnssljfembez0wvjgrr3whu+qa...@mail.gmail.com/T/


Jason promised to post a new version of that patch.
Right Jason?


Yes.


For now let's make sure CVQ feature flag is off?


We can do that and relax on top of my patch.


I agree? Do you prefer a features negotiation, or failing init (like
done for VERSION_1) if the VDUSE application advertises CVQ?

Thanks,
Maxime


Thanks




RFC -> v1 changes:
==
- Fail device init if it does not support VERSION_1 (Jason)

Maxime Coquelin (2):
   vduse: validate block features only with block devices
   vduse: enable Virtio-net device type

  drivers/vdpa/vdpa_user/vduse_dev.c | 15 +++
  1 file changed, 11 insertions(+), 4 deletions(-)

--
2.41.0






___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v1 2/2] vduse: enable Virtio-net device type

2023-06-27 Thread Maxime Coquelin
This patch adds Virtio-net device type to the supported
devices types. Initialization fails if the device does
not support VIRTIO_F_VERSION_1 feature, in order to
guarantee the configuration space is read-only.

Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index c1c2f4c711ae..89088fa27026 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -142,6 +142,7 @@ static struct workqueue_struct *vduse_irq_bound_wq;
 
 static u32 allowed_device_id[] = {
VIRTIO_ID_BLOCK,
+   VIRTIO_ID_NET,
 };
 
 static inline struct vduse_dev *vdpa_to_vduse(struct vdpa_device *vdpa)
@@ -1668,6 +1669,10 @@ static bool features_is_valid(struct vduse_dev_config 
*config)
(config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)))
return false;
 
+   if ((config->device_id == VIRTIO_ID_NET) &&
+   !(config->features & (1ULL << VIRTIO_F_VERSION_1)))
+   return false;
+
return true;
 }
 
@@ -2023,6 +2028,7 @@ static const struct vdpa_mgmtdev_ops vdpa_dev_mgmtdev_ops 
= {
 
 static struct virtio_device_id id_table[] = {
{ VIRTIO_ID_BLOCK, VIRTIO_DEV_ANY_ID },
+   { VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
{ 0 },
 };
 
-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v1 1/2] vduse: validate block features only with block devices

2023-06-27 Thread Maxime Coquelin
This patch is preliminary work to enable network device
type support to VDUSE.

As VIRTIO_BLK_F_CONFIG_WCE shares the same value as
VIRTIO_NET_F_HOST_TSO4, we need to restrict its check
to Virtio-blk device type.

Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 5f5c21674fdc..c1c2f4c711ae 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1658,13 +1658,14 @@ static bool device_is_allowed(u32 device_id)
return false;
 }
 
-static bool features_is_valid(u64 features)
+static bool features_is_valid(struct vduse_dev_config *config)
 {
-   if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
+   if (!(config->features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
return false;
 
/* Now we only support read-only configuration space */
-   if (features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE))
+   if ((config->device_id == VIRTIO_ID_BLOCK) &&
+   (config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)))
return false;
 
return true;
@@ -1691,7 +1692,7 @@ static bool vduse_validate_config(struct vduse_dev_config 
*config)
if (!device_is_allowed(config->device_id))
return false;
 
-   if (!features_is_valid(config->features))
+   if (!features_is_valid(config))
return false;
 
return true;
-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v1 0/2] vduse: add support for networking devices

2023-06-27 Thread Maxime Coquelin
This small series enables virtio-net device type in VDUSE.
With it, basic operation have been tested, both with
virtio-vdpa and vhost-vdpa using DPDK Vhost library series
adding VDUSE support using split rings layout (merged in
DPDK v23.07-rc1).

Control queue support (and so multiqueue) has also been
tested, but requires a Kernel series from Jason Wang
relaxing control queue polling [1] to function reliably.

[1]: 
https://lore.kernel.org/lkml/cacgkmetgrxn3ppwsdo4oosnssljfembez0wvjgrr3whu+qa...@mail.gmail.com/T/

RFC -> v1 changes:
==
- Fail device init if it does not support VERSION_1 (Jason)

Maxime Coquelin (2):
  vduse: validate block features only with block devices
  vduse: enable Virtio-net device type

 drivers/vdpa/vdpa_user/vduse_dev.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2] vduse: fix NULL pointer dereference

2023-06-22 Thread Maxime Coquelin
vduse_vdpa_set_vq_affinity callback can be called
with NULL value as cpu_mask when deleting the vduse
device.

This patch resets virtqueue's IRQ affinity mask value
to set all CPUs instead of dereferencing NULL cpu_mask.

[ 4760.952149] BUG: kernel NULL pointer dereference, address: 
[ 4760.959110] #PF: supervisor read access in kernel mode
[ 4760.964247] #PF: error_code(0x) - not-present page
[ 4760.969385] PGD 0 P4D 0
[ 4760.971927] Oops:  [#1] PREEMPT SMP PTI
[ 4760.976112] CPU: 13 PID: 2346 Comm: vdpa Not tainted 6.4.0-rc6+ #4
[ 4760.982291] Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.8.1 
06/26/2020
[ 4760.989769] RIP: 0010:memcpy_orig+0xc5/0x130
[ 4760.994049] Code: 16 f8 4c 89 07 4c 89 4f 08 4c 89 54 17 f0 4c 89 5c 17 f8 
c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 83 fa 08 72 1b <4c> 8b 06 
4c 8b 4c 16 f8 4c 89 07 4c 89 4c 17 f8 c3 cc cc cc cc 66
[ 4761.012793] RSP: 0018:b1d565abb830 EFLAGS: 00010246
[ 4761.018020] RAX: 9f4bf6b27898 RBX: 9f4be23969c0 RCX: 9f4bcadf6400
[ 4761.025152] RDX: 0008 RSI:  RDI: 9f4bf6b27898
[ 4761.032286] RBP:  R08: 0008 R09: 
[ 4761.039416] R10:  R11: 0600 R12: 
[ 4761.046549] R13:  R14: 0080 R15: b1d565abbb10
[ 4761.053680] FS:  7f64c2ec2740() GS:9f635f98() 
knlGS:
[ 4761.061765] CS:  0010 DS:  ES:  CR0: 80050033
[ 4761.067513] CR2:  CR3: 001875270006 CR4: 007706e0
[ 4761.074645] DR0:  DR1:  DR2: 
[ 4761.081775] DR3:  DR6: fffe0ff0 DR7: 0400
[ 4761.088909] PKRU: 5554
[ 4761.091620] Call Trace:
[ 4761.094074]  
[ 4761.096180]  ? __die+0x1f/0x70
[ 4761.099238]  ? page_fault_oops+0x171/0x4f0
[ 4761.103340]  ? exc_page_fault+0x7b/0x180
[ 4761.107265]  ? asm_exc_page_fault+0x22/0x30
[ 4761.111460]  ? memcpy_orig+0xc5/0x130
[ 4761.115126]  vduse_vdpa_set_vq_affinity+0x3e/0x50 [vduse]
[ 4761.120533]  virtnet_clean_affinity.part.0+0x3d/0x90 [virtio_net]
[ 4761.126635]  remove_vq_common+0x1a4/0x250 [virtio_net]
[ 4761.131781]  virtnet_remove+0x5d/0x70 [virtio_net]
[ 4761.136580]  virtio_dev_remove+0x3a/0x90
[ 4761.140509]  device_release_driver_internal+0x19b/0x200
[ 4761.145742]  bus_remove_device+0xc2/0x130
[ 4761.149755]  device_del+0x158/0x3e0
[ 4761.153245]  ? kernfs_find_ns+0x35/0xc0
[ 4761.157086]  device_unregister+0x13/0x60
[ 4761.161010]  unregister_virtio_device+0x11/0x20
[ 4761.165543]  device_release_driver_internal+0x19b/0x200
[ 4761.170770]  bus_remove_device+0xc2/0x130
[ 4761.174782]  device_del+0x158/0x3e0
[ 4761.178276]  ? __pfx_vdpa_name_match+0x10/0x10 [vdpa]
[ 4761.183336]  device_unregister+0x13/0x60
[ 4761.187260]  vdpa_nl_cmd_dev_del_set_doit+0x63/0xe0 [vdpa]

Fixes: 28f6288eb63d ("vduse: Support set_vq_affinity callback")
Cc: xieyon...@bytedance.com

Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 5f5c21674fdc..0d84e6a9c3cc 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -726,7 +726,11 @@ static int vduse_vdpa_set_vq_affinity(struct vdpa_device 
*vdpa, u16 idx,
 {
struct vduse_dev *dev = vdpa_to_vduse(vdpa);
 
-   cpumask_copy(>vqs[idx]->irq_affinity, cpu_mask);
+   if (cpu_mask)
+   cpumask_copy(>vqs[idx]->irq_affinity, cpu_mask);
+   else
+   cpumask_setall(>vqs[idx]->irq_affinity);
+
return 0;
 }
 
-- 
2.41.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] vduse: fix NULL pointer dereference

2023-06-15 Thread Maxime Coquelin



On 6/15/23 09:25, Yongji Xie wrote:

On Wed, Jun 14, 2023 at 7:52 PM Maxime Coquelin
 wrote:


vduse_vdpa_set_vq_affinity callback can be called
with NULL value as cpu_mask when deleting the vduse
device.

This patch clears virtqueue's IRQ affinity mask value
instead of dereferencing NULL cpu_mask.

[ 4760.952149] BUG: kernel NULL pointer dereference, address: 
[ 4760.959110] #PF: supervisor read access in kernel mode
[ 4760.964247] #PF: error_code(0x) - not-present page
[ 4760.969385] PGD 0 P4D 0
[ 4760.971927] Oops:  [#1] PREEMPT SMP PTI
[ 4760.976112] CPU: 13 PID: 2346 Comm: vdpa Not tainted 6.4.0-rc6+ #4
[ 4760.982291] Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.8.1 
06/26/2020
[ 4760.989769] RIP: 0010:memcpy_orig+0xc5/0x130
[ 4760.994049] Code: 16 f8 4c 89 07 4c 89 4f 08 4c 89 54 17 f0 4c 89 5c 17 f8 c3 cc 
cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 83 fa 08 72 1b <4c> 8b 06 4c 8b 
4c 16 f8 4c 89 07 4c 89 4c 17 f8 c3 cc cc cc cc 66
[ 4761.012793] RSP: 0018:b1d565abb830 EFLAGS: 00010246
[ 4761.018020] RAX: 9f4bf6b27898 RBX: 9f4be23969c0 RCX: 9f4bcadf6400
[ 4761.025152] RDX: 0008 RSI:  RDI: 9f4bf6b27898
[ 4761.032286] RBP:  R08: 0008 R09: 
[ 4761.039416] R10:  R11: 0600 R12: 
[ 4761.046549] R13:  R14: 0080 R15: b1d565abbb10
[ 4761.053680] FS:  7f64c2ec2740() GS:9f635f98() 
knlGS:
[ 4761.061765] CS:  0010 DS:  ES:  CR0: 80050033
[ 4761.067513] CR2:  CR3: 001875270006 CR4: 007706e0
[ 4761.074645] DR0:  DR1:  DR2: 
[ 4761.081775] DR3:  DR6: fffe0ff0 DR7: 0400
[ 4761.088909] PKRU: 5554
[ 4761.091620] Call Trace:
[ 4761.094074]  
[ 4761.096180]  ? __die+0x1f/0x70
[ 4761.099238]  ? page_fault_oops+0x171/0x4f0
[ 4761.103340]  ? exc_page_fault+0x7b/0x180
[ 4761.107265]  ? asm_exc_page_fault+0x22/0x30
[ 4761.111460]  ? memcpy_orig+0xc5/0x130
[ 4761.115126]  vduse_vdpa_set_vq_affinity+0x3e/0x50 [vduse]
[ 4761.120533]  virtnet_clean_affinity.part.0+0x3d/0x90 [virtio_net]
[ 4761.126635]  remove_vq_common+0x1a4/0x250 [virtio_net]
[ 4761.131781]  virtnet_remove+0x5d/0x70 [virtio_net]
[ 4761.136580]  virtio_dev_remove+0x3a/0x90
[ 4761.140509]  device_release_driver_internal+0x19b/0x200
[ 4761.145742]  bus_remove_device+0xc2/0x130
[ 4761.149755]  device_del+0x158/0x3e0
[ 4761.153245]  ? kernfs_find_ns+0x35/0xc0
[ 4761.157086]  device_unregister+0x13/0x60
[ 4761.161010]  unregister_virtio_device+0x11/0x20
[ 4761.165543]  device_release_driver_internal+0x19b/0x200
[ 4761.170770]  bus_remove_device+0xc2/0x130
[ 4761.174782]  device_del+0x158/0x3e0
[ 4761.178276]  ? __pfx_vdpa_name_match+0x10/0x10 [vdpa]
[ 4761.183336]  device_unregister+0x13/0x60
[ 4761.187260]  vdpa_nl_cmd_dev_del_set_doit+0x63/0xe0 [vdpa]

Fixes: 28f6288eb63d ("vduse: Support set_vq_affinity callback")
Cc: xieyon...@bytedance.com

Signed-off-by: Maxime Coquelin 
---
  drivers/vdpa/vdpa_user/vduse_dev.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 5f5c21674fdc..cdca94e85762 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -726,7 +726,11 @@ static int vduse_vdpa_set_vq_affinity(struct vdpa_device 
*vdpa, u16 idx,
  {
 struct vduse_dev *dev = vdpa_to_vduse(vdpa);

-   cpumask_copy(>vqs[idx]->irq_affinity, cpu_mask);
+   if (cpu_mask)
+   cpumask_copy(>vqs[idx]->irq_affinity, cpu_mask);
+   else
+   cpumask_clear(>vqs[idx]->irq_affinity);


I think we should set all the bits of irq affinity instead:
cpumask_setall(>vqs[idx]->irq_affinity);


I hesitated between both.
My understanding is it only happens on removal, so either would work.
But in case it can happen on other cases, it is indeed better to use
cpumask_setall().

I will post a v2 today.

Thanks,
Maxime


Thanks,
Yongji



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH] vduse: fix NULL pointer dereference

2023-06-14 Thread Maxime Coquelin
vduse_vdpa_set_vq_affinity callback can be called
with NULL value as cpu_mask when deleting the vduse
device.

This patch clears virtqueue's IRQ affinity mask value
instead of dereferencing NULL cpu_mask.

[ 4760.952149] BUG: kernel NULL pointer dereference, address: 
[ 4760.959110] #PF: supervisor read access in kernel mode
[ 4760.964247] #PF: error_code(0x) - not-present page
[ 4760.969385] PGD 0 P4D 0
[ 4760.971927] Oops:  [#1] PREEMPT SMP PTI
[ 4760.976112] CPU: 13 PID: 2346 Comm: vdpa Not tainted 6.4.0-rc6+ #4
[ 4760.982291] Hardware name: Dell Inc. PowerEdge R640/0W23H8, BIOS 2.8.1 
06/26/2020
[ 4760.989769] RIP: 0010:memcpy_orig+0xc5/0x130
[ 4760.994049] Code: 16 f8 4c 89 07 4c 89 4f 08 4c 89 54 17 f0 4c 89 5c 17 f8 
c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 83 fa 08 72 1b <4c> 8b 06 
4c 8b 4c 16 f8 4c 89 07 4c 89 4c 17 f8 c3 cc cc cc cc 66
[ 4761.012793] RSP: 0018:b1d565abb830 EFLAGS: 00010246
[ 4761.018020] RAX: 9f4bf6b27898 RBX: 9f4be23969c0 RCX: 9f4bcadf6400
[ 4761.025152] RDX: 0008 RSI:  RDI: 9f4bf6b27898
[ 4761.032286] RBP:  R08: 0008 R09: 
[ 4761.039416] R10:  R11: 0600 R12: 
[ 4761.046549] R13:  R14: 0080 R15: b1d565abbb10
[ 4761.053680] FS:  7f64c2ec2740() GS:9f635f98() 
knlGS:
[ 4761.061765] CS:  0010 DS:  ES:  CR0: 80050033
[ 4761.067513] CR2:  CR3: 001875270006 CR4: 007706e0
[ 4761.074645] DR0:  DR1:  DR2: 
[ 4761.081775] DR3:  DR6: fffe0ff0 DR7: 0400
[ 4761.088909] PKRU: 5554
[ 4761.091620] Call Trace:
[ 4761.094074]  
[ 4761.096180]  ? __die+0x1f/0x70
[ 4761.099238]  ? page_fault_oops+0x171/0x4f0
[ 4761.103340]  ? exc_page_fault+0x7b/0x180
[ 4761.107265]  ? asm_exc_page_fault+0x22/0x30
[ 4761.111460]  ? memcpy_orig+0xc5/0x130
[ 4761.115126]  vduse_vdpa_set_vq_affinity+0x3e/0x50 [vduse]
[ 4761.120533]  virtnet_clean_affinity.part.0+0x3d/0x90 [virtio_net]
[ 4761.126635]  remove_vq_common+0x1a4/0x250 [virtio_net]
[ 4761.131781]  virtnet_remove+0x5d/0x70 [virtio_net]
[ 4761.136580]  virtio_dev_remove+0x3a/0x90
[ 4761.140509]  device_release_driver_internal+0x19b/0x200
[ 4761.145742]  bus_remove_device+0xc2/0x130
[ 4761.149755]  device_del+0x158/0x3e0
[ 4761.153245]  ? kernfs_find_ns+0x35/0xc0
[ 4761.157086]  device_unregister+0x13/0x60
[ 4761.161010]  unregister_virtio_device+0x11/0x20
[ 4761.165543]  device_release_driver_internal+0x19b/0x200
[ 4761.170770]  bus_remove_device+0xc2/0x130
[ 4761.174782]  device_del+0x158/0x3e0
[ 4761.178276]  ? __pfx_vdpa_name_match+0x10/0x10 [vdpa]
[ 4761.183336]  device_unregister+0x13/0x60
[ 4761.187260]  vdpa_nl_cmd_dev_del_set_doit+0x63/0xe0 [vdpa]

Fixes: 28f6288eb63d ("vduse: Support set_vq_affinity callback")
Cc: xieyon...@bytedance.com

Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 5f5c21674fdc..cdca94e85762 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -726,7 +726,11 @@ static int vduse_vdpa_set_vq_affinity(struct vdpa_device 
*vdpa, u16 idx,
 {
struct vduse_dev *dev = vdpa_to_vduse(vdpa);
 
-   cpumask_copy(>vqs[idx]->irq_affinity, cpu_mask);
+   if (cpu_mask)
+   cpumask_copy(>vqs[idx]->irq_affinity, cpu_mask);
+   else
+   cpumask_clear(>vqs[idx]->irq_affinity);
+
return 0;
 }
 
-- 
2.40.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC 0/2] vduse: add support for networking devices

2023-04-21 Thread Maxime Coquelin



On 4/21/23 07:51, Jason Wang wrote:

On Thu, Apr 20, 2023 at 10:16 PM Maxime Coquelin
 wrote:




On 4/20/23 06:34, Jason Wang wrote:

On Wed, Apr 19, 2023 at 9:43 PM Maxime Coquelin
 wrote:


This small series enables virtio-net device type in VDUSE.
With it, basic operation have been tested, both with
virtio-vdpa and vhost-vdpa using DPDK Vhost library series
adding VDUSE support [0] using split rings layout.

Control queue support (and so multiqueue) has also been
tested, but require a Kernel series from Jason Wang
relaxing control queue polling [1] to function reliably.

Other than that, we have identified a few gaps:

1. Reconnection:
   a. VDUSE_VQ_GET_INFO ioctl() returns always 0 for avail
  index, even after the virtqueue has already been
  processed. Is that expected? I have tried instead to
  get the driver's avail index directly from the avail
  ring, but it does not seem reliable as I sometimes get
  "id %u is not a head!\n" warnings. Also such solution
  would not be possible with packed ring, as we need to
  know the wrap counters values.


Looking at the codes, it only returns the value that is set via
set_vq_state(). I think it is expected to be called before the
datapath runs.

So when bound to virtio-vdpa, it is expected to return 0. But we need
to fix the packed virtqueue case, I wonder if we need to call
set_vq_state() explicitly in virtio-vdpa before starting the device.

When bound to vhost-vdpa, Qemu will call VHOST_SET_VRING_BASE which
will end up a call to set_vq_state(). Unfortunately, it doesn't
support packed ring which needs some extension.



   b. Missing IOCTLs: it would be handy to have new IOCTLs to
  query Virtio device status,


What's the use case of this ioctl? It looks to me userspace is
notified on each status change now:

static int vduse_dev_set_status(struct vduse_dev *dev, u8 status)
{
  struct vduse_dev_msg msg = { 0 };

  msg.req.type = VDUSE_SET_STATUS;
  msg.req.s.status = status;

  return vduse_dev_msg_sync(dev, );
}


The idea was to be able to query the status at reconnect time, and
neither having to assume its value nor having to store its value in a
file (the status could change while the VDUSE application is stopped,
but maybe it would receive the notification at reconnect).


I see.



I will prototype using a tmpfs file to save needed information, and see
if it works.


It might work but then the API is not self contained. Maybe it's
better to have a dedicated ioctl.




and retrieve the config
  space set at VDUSE_CREATE_DEV time.


In order to be safe, VDUSE avoids writable config space. Otherwise
drivers could block on config writing forever. That's why we don't do
it now.


The idea was not to make the config space writable, but just to be able
to fetch what was filled at VDUSE_CREATE_DEV time.

With the tmpfs file, we can avoid doing that and just save the config
space there.


Same as the case for status.


I have cooked a DPDK patch to support reconnect with a tmpfs file as
suggested by Yongji:

https://gitlab.com/mcoquelin/dpdk-next-virtio/-/commit/53913f2b1155b02c44d5d3d298aafd357e7a8c48

That's still rough around the edges, but it seems to work reliably
for the testing I have done so far. We'll certainly want to use the
tmpfs memory to directly store available indexes and wrap counters to
avoid introducing overhead in the datapath. The tricky part will be to
manage NUMA affinity.

Regards,
Maxime



Thanks




We need to harden the config write before we can proceed to this I think.



2. VDUSE application as non-root:
We need to run the VDUSE application as non-root. There
is some race between the time the UDEV rule is applied
and the time the device starts being used. Discussing
with Jason, he suggested we may have a VDUSE daemon run
as root that would create the VDUSE device, manages its
rights and then pass its file descriptor to the VDUSE
app. However, with current IOCTLs, it means the VDUSE
daemon would need to know several information that
belongs to the VDUSE app implementing the device such
as supported Virtio features, config space, etc...
If we go that route, maybe we should have a control
IOCTL to create the device which would just pass the
device type. Then another device IOCTL to perform the
initialization. Would that make sense?


I think so. We can hear from others.



3. Coredump:
In order to be able to perform post-mortem analysis, DPDK
Vhost library marks pages used for vrings and descriptors
buffers as MADV_DODUMP using madvise(). However with
VDUSE it fails with -EINVAL. My understanding is that we
set VM_DONTEXPAND flag to the VMAs and madvise's
MADV_DODUMP fails if it is present. I'm not sure to
understand why madvise would prevent MADV_DODUMP if
VM_DONTEXPAND is set. Any thoughts?


Adding Peter who may know the answer.


Thanks!
Maxime


Th

Re: [RFC 0/2] vduse: add support for networking devices

2023-04-20 Thread Maxime Coquelin



On 4/20/23 10:13, Yongji Xie wrote:

On Wed, Apr 19, 2023 at 9:44 PM Maxime Coquelin
 wrote:


This small series enables virtio-net device type in VDUSE.
With it, basic operation have been tested, both with
virtio-vdpa and vhost-vdpa using DPDK Vhost library series
adding VDUSE support [0] using split rings layout.

Control queue support (and so multiqueue) has also been
tested, but require a Kernel series from Jason Wang
relaxing control queue polling [1] to function reliably.

Other than that, we have identified a few gaps:

1. Reconnection:
  a. VDUSE_VQ_GET_INFO ioctl() returns always 0 for avail
 index, even after the virtqueue has already been
 processed. Is that expected? I have tried instead to
 get the driver's avail index directly from the avail
 ring, but it does not seem reliable as I sometimes get
 "id %u is not a head!\n" warnings. Also such solution
 would not be possible with packed ring, as we need to
 know the wrap counters values.



I'm not sure how to handle the reconnection in the vhost-user-net
case. Can we use a tmpfs file to track inflight I/O like this [1]


We don't have inflight IOs with DPDK Vhsot library for net devices.
But yes, a solution is to have a tmpfs file to save needed data.

Advantage of this solution is it makes it possible to reconnect with
packed ring in case of application crash, as the wrap counter values
would not be lost.


[1] 
https://qemu-project.gitlab.io/qemu/interop/vhost-user.html#inflight-i-o-tracking


  b. Missing IOCTLs: it would be handy to have new IOCTLs to
 query Virtio device status, and retrieve the config
 space set at VDUSE_CREATE_DEV time.



VDUSE_GET_STATUS ioctl might be needed. Or can we use a tmpfs file to
save/restore that info.


2. VDUSE application as non-root:
   We need to run the VDUSE application as non-root. There
   is some race between the time the UDEV rule is applied
   and the time the device starts being used. Discussing
   with Jason, he suggested we may have a VDUSE daemon run
   as root that would create the VDUSE device, manages its
   rights and then pass its file descriptor to the VDUSE
   app. However, with current IOCTLs, it means the VDUSE
   daemon would need to know several information that
   belongs to the VDUSE app implementing the device such
   as supported Virtio features, config space, etc...
   If we go that route, maybe we should have a control
   IOCTL to create the device which would just pass the
   device type. Then another device IOCTL to perform the
   initialization. Would that make sense?



I think we can reuse the VDUSE_CREATE_DEV ioctl (just use name,
device_id and vendor_id) for control device here, and add a new ioctl
VDUSE_DEV_SETUP to do device initialization.


OK.
If we do that, could we also provide the possibility to pass an UUID at 
VDUSE_DEV_SETUP time?


It could be useful if we save information in a tmpfs file, in order to
be able at reconnect time to ensure the tmpfs file UUID matches with the
VDUSE device UUID, and so avoid restoring a leftover tmpfs file of a
previously detroyed then re-created VDUSE device. Would that make sense?

Regards,
Maxime


Thanks,
Yongji



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC 0/2] vduse: add support for networking devices

2023-04-20 Thread Maxime Coquelin



On 4/20/23 06:34, Jason Wang wrote:

On Wed, Apr 19, 2023 at 9:43 PM Maxime Coquelin
 wrote:


This small series enables virtio-net device type in VDUSE.
With it, basic operation have been tested, both with
virtio-vdpa and vhost-vdpa using DPDK Vhost library series
adding VDUSE support [0] using split rings layout.

Control queue support (and so multiqueue) has also been
tested, but require a Kernel series from Jason Wang
relaxing control queue polling [1] to function reliably.

Other than that, we have identified a few gaps:

1. Reconnection:
  a. VDUSE_VQ_GET_INFO ioctl() returns always 0 for avail
 index, even after the virtqueue has already been
 processed. Is that expected? I have tried instead to
 get the driver's avail index directly from the avail
 ring, but it does not seem reliable as I sometimes get
 "id %u is not a head!\n" warnings. Also such solution
 would not be possible with packed ring, as we need to
 know the wrap counters values.


Looking at the codes, it only returns the value that is set via
set_vq_state(). I think it is expected to be called before the
datapath runs.

So when bound to virtio-vdpa, it is expected to return 0. But we need
to fix the packed virtqueue case, I wonder if we need to call
set_vq_state() explicitly in virtio-vdpa before starting the device.

When bound to vhost-vdpa, Qemu will call VHOST_SET_VRING_BASE which
will end up a call to set_vq_state(). Unfortunately, it doesn't
support packed ring which needs some extension.



  b. Missing IOCTLs: it would be handy to have new IOCTLs to
 query Virtio device status,


What's the use case of this ioctl? It looks to me userspace is
notified on each status change now:

static int vduse_dev_set_status(struct vduse_dev *dev, u8 status)
{
 struct vduse_dev_msg msg = { 0 };

 msg.req.type = VDUSE_SET_STATUS;
 msg.req.s.status = status;

 return vduse_dev_msg_sync(dev, );
}


The idea was to be able to query the status at reconnect time, and
neither having to assume its value nor having to store its value in a
file (the status could change while the VDUSE application is stopped,
but maybe it would receive the notification at reconnect).

I will prototype using a tmpfs file to save needed information, and see
if it works.


and retrieve the config
 space set at VDUSE_CREATE_DEV time.


In order to be safe, VDUSE avoids writable config space. Otherwise
drivers could block on config writing forever. That's why we don't do
it now.


The idea was not to make the config space writable, but just to be able
to fetch what was filled at VDUSE_CREATE_DEV time.

With the tmpfs file, we can avoid doing that and just save the config
space there.


We need to harden the config write before we can proceed to this I think.



2. VDUSE application as non-root:
   We need to run the VDUSE application as non-root. There
   is some race between the time the UDEV rule is applied
   and the time the device starts being used. Discussing
   with Jason, he suggested we may have a VDUSE daemon run
   as root that would create the VDUSE device, manages its
   rights and then pass its file descriptor to the VDUSE
   app. However, with current IOCTLs, it means the VDUSE
   daemon would need to know several information that
   belongs to the VDUSE app implementing the device such
   as supported Virtio features, config space, etc...
   If we go that route, maybe we should have a control
   IOCTL to create the device which would just pass the
   device type. Then another device IOCTL to perform the
   initialization. Would that make sense?


I think so. We can hear from others.



3. Coredump:
   In order to be able to perform post-mortem analysis, DPDK
   Vhost library marks pages used for vrings and descriptors
   buffers as MADV_DODUMP using madvise(). However with
   VDUSE it fails with -EINVAL. My understanding is that we
   set VM_DONTEXPAND flag to the VMAs and madvise's
   MADV_DODUMP fails if it is present. I'm not sure to
   understand why madvise would prevent MADV_DODUMP if
   VM_DONTEXPAND is set. Any thoughts?


Adding Peter who may know the answer.


Thanks!
Maxime


Thanks



[0]: 
https://patchwork.dpdk.org/project/dpdk/list/?series=27594=%2A=both
[1]: 
https://lore.kernel.org/lkml/cacgkmetgrxn3ppwsdo4oosnssljfembez0wvjgrr3whu+qa...@mail.gmail.com/T/

Maxime Coquelin (2):
   vduse: validate block features only with block devices
   vduse: enable Virtio-net device type

  drivers/vdpa/vdpa_user/vduse_dev.c | 11 +++
  1 file changed, 7 insertions(+), 4 deletions(-)

--
2.39.2





___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC 1/2] vduse: validate block features only with block devices

2023-04-20 Thread Maxime Coquelin



On 4/20/23 06:06, Jason Wang wrote:

On Wed, Apr 19, 2023 at 9:43 PM Maxime Coquelin
 wrote:


This patch is preliminary work to enable network device
type support to VDUSE.

As VIRTIO_BLK_F_CONFIG_WCE shares the same value as
VIRTIO_NET_F_HOST_TSO4, we need to restrict its check
to Virtio-blk device type.

Signed-off-by: Maxime Coquelin 
---
  drivers/vdpa/vdpa_user/vduse_dev.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 0c3b48616a9f..6fa598a03d8e 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1416,13 +1416,14 @@ static bool device_is_allowed(u32 device_id)
 return false;
  }

-static bool features_is_valid(u64 features)
+static bool features_is_valid(struct vduse_dev_config *config)
  {
-   if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
+   if (!(config->features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
 return false;

 /* Now we only support read-only configuration space */
-   if (features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE))
+   if ((config->device_id == VIRTIO_ID_BLOCK) &&
+   (config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)))


The reason we filter WCE out is to avoid writable config space which
might block the driver with a buggy userspace.

For networking, I guess we should fail if VERSION_1 is not negotiated,
then we can avoid setting mac addresses via the config space.


 Ok, I will add it to patch 2 in V1.

Thanks,
Maxime



Thanks


 return false;

 return true;
@@ -1446,7 +1447,7 @@ static bool vduse_validate_config(struct vduse_dev_config 
*config)
 if (!device_is_allowed(config->device_id))
 return false;

-   if (!features_is_valid(config->features))
+   if (!features_is_valid(config))
 return false;

 return true;
--
2.39.2





___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[RFC 1/2] vduse: validate block features only with block devices

2023-04-19 Thread Maxime Coquelin
This patch is preliminary work to enable network device
type support to VDUSE.

As VIRTIO_BLK_F_CONFIG_WCE shares the same value as
VIRTIO_NET_F_HOST_TSO4, we need to restrict its check
to Virtio-blk device type.

Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 0c3b48616a9f..6fa598a03d8e 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -1416,13 +1416,14 @@ static bool device_is_allowed(u32 device_id)
return false;
 }
 
-static bool features_is_valid(u64 features)
+static bool features_is_valid(struct vduse_dev_config *config)
 {
-   if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
+   if (!(config->features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
return false;
 
/* Now we only support read-only configuration space */
-   if (features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE))
+   if ((config->device_id == VIRTIO_ID_BLOCK) &&
+   (config->features & (1ULL << VIRTIO_BLK_F_CONFIG_WCE)))
return false;
 
return true;
@@ -1446,7 +1447,7 @@ static bool vduse_validate_config(struct vduse_dev_config 
*config)
if (!device_is_allowed(config->device_id))
return false;
 
-   if (!features_is_valid(config->features))
+   if (!features_is_valid(config))
return false;
 
return true;
-- 
2.39.2

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[RFC 0/2] vduse: add support for networking devices

2023-04-19 Thread Maxime Coquelin
This small series enables virtio-net device type in VDUSE.
With it, basic operation have been tested, both with
virtio-vdpa and vhost-vdpa using DPDK Vhost library series
adding VDUSE support [0] using split rings layout.

Control queue support (and so multiqueue) has also been
tested, but require a Kernel series from Jason Wang
relaxing control queue polling [1] to function reliably.

Other than that, we have identified a few gaps:

1. Reconnection:
 a. VDUSE_VQ_GET_INFO ioctl() returns always 0 for avail
index, even after the virtqueue has already been
processed. Is that expected? I have tried instead to
get the driver's avail index directly from the avail
ring, but it does not seem reliable as I sometimes get
"id %u is not a head!\n" warnings. Also such solution
would not be possible with packed ring, as we need to
know the wrap counters values.

 b. Missing IOCTLs: it would be handy to have new IOCTLs to
query Virtio device status, and retrieve the config
space set at VDUSE_CREATE_DEV time.

2. VDUSE application as non-root:
  We need to run the VDUSE application as non-root. There
  is some race between the time the UDEV rule is applied
  and the time the device starts being used. Discussing
  with Jason, he suggested we may have a VDUSE daemon run
  as root that would create the VDUSE device, manages its
  rights and then pass its file descriptor to the VDUSE
  app. However, with current IOCTLs, it means the VDUSE
  daemon would need to know several information that
  belongs to the VDUSE app implementing the device such
  as supported Virtio features, config space, etc...
  If we go that route, maybe we should have a control
  IOCTL to create the device which would just pass the
  device type. Then another device IOCTL to perform the
  initialization. Would that make sense?

3. Coredump:
  In order to be able to perform post-mortem analysis, DPDK
  Vhost library marks pages used for vrings and descriptors
  buffers as MADV_DODUMP using madvise(). However with
  VDUSE it fails with -EINVAL. My understanding is that we
  set VM_DONTEXPAND flag to the VMAs and madvise's
  MADV_DODUMP fails if it is present. I'm not sure to
  understand why madvise would prevent MADV_DODUMP if
  VM_DONTEXPAND is set. Any thoughts?

[0]: 
https://patchwork.dpdk.org/project/dpdk/list/?series=27594=%2A=both
[1]: 
https://lore.kernel.org/lkml/cacgkmetgrxn3ppwsdo4oosnssljfembez0wvjgrr3whu+qa...@mail.gmail.com/T/

Maxime Coquelin (2):
  vduse: validate block features only with block devices
  vduse: enable Virtio-net device type

 drivers/vdpa/vdpa_user/vduse_dev.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

-- 
2.39.2

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[RFC 2/2] vduse: enable Virtio-net device type

2023-04-19 Thread Maxime Coquelin
This patch adds Virtio-net device type to the supported
devices types.

Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 6fa598a03d8e..26b7e29cb900 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -131,6 +131,7 @@ static struct workqueue_struct *vduse_irq_wq;
 
 static u32 allowed_device_id[] = {
VIRTIO_ID_BLOCK,
+   VIRTIO_ID_NET,
 };
 
 static inline struct vduse_dev *vdpa_to_vduse(struct vdpa_device *vdpa)
@@ -1738,6 +1739,7 @@ static const struct vdpa_mgmtdev_ops vdpa_dev_mgmtdev_ops 
= {
 
 static struct virtio_device_id id_table[] = {
{ VIRTIO_ID_BLOCK, VIRTIO_DEV_ANY_ID },
+   { VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
{ 0 },
 };
 
-- 
2.39.2

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH net-next V2 0/2] virtio-net: don't busy poll for cvq command

2023-04-13 Thread Maxime Coquelin



On 4/13/23 15:02, Maxime Coquelin wrote:

Hi Jason,

On 4/13/23 08:40, Jason Wang wrote:

Hi all:

The code used to busy poll for cvq command which turns out to have
several side effects:

1) infinite poll for buggy devices
2) bad interaction with scheduler

So this series tries to use sleep instead of busy polling. In this
version, I take a step back: the hardening part is not implemented and
leave for future investigation. We use to aggree to use interruptible
sleep but it doesn't work for a general workqueue.

Please review.


Thanks for working on this.
My DPDK VDUSE RFC missed to set the interrupt, as Xuan Zhou highlighted
it makes the vdpa dev add/del commands to freeze:
[<0>] device_del+0x37/0x3d0
[<0>] device_unregister+0x13/0x60
[<0>] unregister_virtio_device+0x11/0x20
[<0>] device_release_driver_internal+0x193/0x200
[<0>] bus_remove_device+0xbf/0x130
[<0>] device_del+0x174/0x3d0
[<0>] device_unregister+0x13/0x60
[<0>] vdpa_nl_cmd_dev_del_set_doit+0x66/0xe0 [vdpa]
[<0>] genl_family_rcv_msg_doit.isra.0+0xb8/0x100
[<0>] genl_rcv_msg+0x151/0x290
[<0>] netlink_rcv_skb+0x54/0x100
[<0>] genl_rcv+0x24/0x40
[<0>] netlink_unicast+0x217/0x340
[<0>] netlink_sendmsg+0x23e/0x4a0
[<0>] sock_sendmsg+0x8f/0xa0
[<0>] __sys_sendto+0xfc/0x170
[<0>] __x64_sys_sendto+0x20/0x30
[<0>] do_syscall_64+0x59/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x72/0xdc

Once fixed on DPDK side (you can use my vduse_v1 branch [0] for
testing), it works fine:

Tested-by: Maxime Coquelin 

For the potential missing interrupt with non-compliant devices, I guess
it could be handled with the hardening work as same thing could happen
if the VDUSE application crashed for example.

Regards,
Maxime

[0]:


Better with the link...

[0]: https://gitlab.com/mcoquelin/dpdk-next-virtio/-/commits/vduse_v1/


Thanks

Changes since V1:
- use RTNL to synchronize rx mode worker
- use completion for simplicity
- don't try to harden CVQ command

Changes since RFC:

- switch to use BAD_RING in virtio_break_device()
- check virtqueue_is_broken() after being woken up
- use more_used() instead of virtqueue_get_buf() to allow caller to
   get buffers afterwards
   - break the virtio-net device when timeout
   - get buffer manually since the virtio core check more_used() instead

Jason Wang (2):
   virtio-net: convert rx mode setting to use workqueue
   virtio-net: sleep instead of busy waiting for cvq command

  drivers/net/virtio_net.c | 76 ++--
  1 file changed, 66 insertions(+), 10 deletions(-)



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH net-next V2 0/2] virtio-net: don't busy poll for cvq command

2023-04-13 Thread Maxime Coquelin

Hi Jason,

On 4/13/23 08:40, Jason Wang wrote:

Hi all:

The code used to busy poll for cvq command which turns out to have
several side effects:

1) infinite poll for buggy devices
2) bad interaction with scheduler

So this series tries to use sleep instead of busy polling. In this
version, I take a step back: the hardening part is not implemented and
leave for future investigation. We use to aggree to use interruptible
sleep but it doesn't work for a general workqueue.

Please review.


Thanks for working on this.
My DPDK VDUSE RFC missed to set the interrupt, as Xuan Zhou highlighted
it makes the vdpa dev add/del commands to freeze:
[<0>] device_del+0x37/0x3d0
[<0>] device_unregister+0x13/0x60
[<0>] unregister_virtio_device+0x11/0x20
[<0>] device_release_driver_internal+0x193/0x200
[<0>] bus_remove_device+0xbf/0x130
[<0>] device_del+0x174/0x3d0
[<0>] device_unregister+0x13/0x60
[<0>] vdpa_nl_cmd_dev_del_set_doit+0x66/0xe0 [vdpa]
[<0>] genl_family_rcv_msg_doit.isra.0+0xb8/0x100
[<0>] genl_rcv_msg+0x151/0x290
[<0>] netlink_rcv_skb+0x54/0x100
[<0>] genl_rcv+0x24/0x40
[<0>] netlink_unicast+0x217/0x340
[<0>] netlink_sendmsg+0x23e/0x4a0
[<0>] sock_sendmsg+0x8f/0xa0
[<0>] __sys_sendto+0xfc/0x170
[<0>] __x64_sys_sendto+0x20/0x30
[<0>] do_syscall_64+0x59/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x72/0xdc

Once fixed on DPDK side (you can use my vduse_v1 branch [0] for
testing), it works fine:

Tested-by: Maxime Coquelin 

For the potential missing interrupt with non-compliant devices, I guess
it could be handled with the hardening work as same thing could happen
if the VDUSE application crashed for example.

Regards,
Maxime

[0]:

Thanks

Changes since V1:
- use RTNL to synchronize rx mode worker
- use completion for simplicity
- don't try to harden CVQ command

Changes since RFC:

- switch to use BAD_RING in virtio_break_device()
- check virtqueue_is_broken() after being woken up
- use more_used() instead of virtqueue_get_buf() to allow caller to
   get buffers afterwards
   - break the virtio-net device when timeout
   - get buffer manually since the virtio core check more_used() instead

Jason Wang (2):
   virtio-net: convert rx mode setting to use workqueue
   virtio-net: sleep instead of busy waiting for cvq command

  drivers/net/virtio_net.c | 76 ++--
  1 file changed, 66 insertions(+), 10 deletions(-)



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v3] vduse: prevent uninitialized memory accesses

2022-08-31 Thread Maxime Coquelin
If the VDUSE application provides a smaller config space
than the driver expects, the driver may use uninitialized
memory from the stack.

This patch prevents it by initializing the buffer passed by
the driver to store the config value.

This fix addresses CVE-2022-2308.

Cc: sta...@vger.kernel.org # v5.15+
Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace")
Reviewed-by: Xie Yongji 
Acked-by: Jason Wang 
Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 41c0b29739f1..35dceee3ed56 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -673,10 +673,15 @@ static void vduse_vdpa_get_config(struct vdpa_device 
*vdpa, unsigned int offset,
 {
struct vduse_dev *dev = vdpa_to_vduse(vdpa);
 
-   if (offset > dev->config_size ||
-   len > dev->config_size - offset)
+   /* Initialize the buffer in case of partial copy. */
+   memset(buf, 0, len);
+
+   if (offset > dev->config_size)
return;
 
+   if (len > dev->config_size - offset)
+   len = dev->config_size - offset;
+
memcpy(buf, dev->config + offset, len);
 }
 
-- 
2.37.2

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2] vduse: prevent uninitialized memory accesses

2022-08-31 Thread Maxime Coquelin




On 8/31/22 17:46, Michael S. Tsirkin wrote:

On Wed, Aug 31, 2022 at 05:01:11PM +0200, Maxime Coquelin wrote:

On 8/29/22 09:48, Greg KH wrote:

On Mon, Aug 29, 2022 at 09:34:24AM +0200, Maxime Coquelin wrote:

If the VDUSE application provides a smaller config space
than the driver expects, the driver may use uninitialized
memory from the stack.

This patch prevents it by initializing the buffer passed by
the driver to store the config value.

This fix addresses CVE-2022-2308.

Cc: xieyon...@bytedance.com
Cc: sta...@vger.kernel.org # v5.15+
Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace")

Acked-by: Jason Wang 
Signed-off-by: Maxime Coquelin 


Please no blank line above the Acked-by: line here if possible.


Sure.

Jason, do you prefer I post a new revision with this single change or
you will handle it while applying? Either way is fine to me.

Thanks,
Maxime


I queue these normally.


Ok, I'm preparing the V2.

Thanks,
Maxime


thanks,

greg k-h





___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2] vduse: prevent uninitialized memory accesses

2022-08-31 Thread Maxime Coquelin

On 8/29/22 09:48, Greg KH wrote:

On Mon, Aug 29, 2022 at 09:34:24AM +0200, Maxime Coquelin wrote:

If the VDUSE application provides a smaller config space
than the driver expects, the driver may use uninitialized
memory from the stack.

This patch prevents it by initializing the buffer passed by
the driver to store the config value.

This fix addresses CVE-2022-2308.

Cc: xieyon...@bytedance.com
Cc: sta...@vger.kernel.org # v5.15+
Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace")

Acked-by: Jason Wang 
Signed-off-by: Maxime Coquelin 


Please no blank line above the Acked-by: line here if possible.


Sure.

Jason, do you prefer I post a new revision with this single change or
you will handle it while applying? Either way is fine to me.

Thanks,
Maxime


thanks,

greg k-h



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2] vduse: prevent uninitialized memory accesses

2022-08-29 Thread Maxime Coquelin
If the VDUSE application provides a smaller config space
than the driver expects, the driver may use uninitialized
memory from the stack.

This patch prevents it by initializing the buffer passed by
the driver to store the config value.

This fix addresses CVE-2022-2308.

Cc: xieyon...@bytedance.com
Cc: sta...@vger.kernel.org # v5.15+
Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace")

Acked-by: Jason Wang 
Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 41c0b29739f1..35dceee3ed56 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -673,10 +673,15 @@ static void vduse_vdpa_get_config(struct vdpa_device 
*vdpa, unsigned int offset,
 {
struct vduse_dev *dev = vdpa_to_vduse(vdpa);
 
-   if (offset > dev->config_size ||
-   len > dev->config_size - offset)
+   /* Initialize the buffer in case of partial copy. */
+   memset(buf, 0, len);
+
+   if (offset > dev->config_size)
return;
 
+   if (len > dev->config_size - offset)
+   len = dev->config_size - offset;
+
memcpy(buf, dev->config + offset, len);
 }
 
-- 
2.37.2

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] vduse: prevent uninitialized memory accesses

2022-08-29 Thread Maxime Coquelin




On 8/27/22 08:54, Dan Carpenter wrote:

On Fri, Aug 26, 2022 at 06:16:05PM +0200, Maxime Coquelin wrote:

If the VDUSE application provides a smaller config space
than the driver expects, the driver may use uninitialized
memory from the stack.

This patch prevents it by initializing the buffer passed by
the driver to store the config value.

Signed-off-by: Maxime Coquelin 


This sounds like it needs a Fixes tag?


Yes, I actually did it, but somehow forgot to generate the patch bedore
posting.

I'll post a v2 soon.

Thanks,
Maxime


regards,
dan carpenter



___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH] vduse: prevent uninitialized memory accesses

2022-08-26 Thread Maxime Coquelin
If the VDUSE application provides a smaller config space
than the driver expects, the driver may use uninitialized
memory from the stack.

This patch prevents it by initializing the buffer passed by
the driver to store the config value.

Signed-off-by: Maxime Coquelin 
---
 drivers/vdpa/vdpa_user/vduse_dev.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 41c0b29739f1..35dceee3ed56 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -673,10 +673,15 @@ static void vduse_vdpa_get_config(struct vdpa_device 
*vdpa, unsigned int offset,
 {
struct vduse_dev *dev = vdpa_to_vduse(vdpa);
 
-   if (offset > dev->config_size ||
-   len > dev->config_size - offset)
+   /* Initialize the buffer in case of partial copy. */
+   memset(buf, 0, len);
+
+   if (offset > dev->config_size)
return;
 
+   if (len > dev->config_size - offset)
+   len = dev->config_size - offset;
+
memcpy(buf, dev->config + offset, len);
 }
 
-- 
2.37.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/2] vdpa: ifcvf: return err when fail to request config irq

2020-08-12 Thread Maxime Coquelin
Hi,

On 7/23/20 11:12 AM, Jason Wang wrote:
> We ignore the err of requesting config interrupt, fix this.
> 
> Fixes: e7991f376a4d ("ifcvf: implement config interrupt in IFCVF")
> Cc: Zhu Lingshan 
> Signed-off-by: Jason Wang 
> ---
>  drivers/vdpa/ifcvf/ifcvf_main.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
> index f5a60c14b979..ae7110955a44 100644
> --- a/drivers/vdpa/ifcvf/ifcvf_main.c
> +++ b/drivers/vdpa/ifcvf/ifcvf_main.c
> @@ -76,6 +76,10 @@ static int ifcvf_request_irq(struct ifcvf_adapter *adapter)
>   ret = devm_request_irq(>dev, irq,
>  ifcvf_config_changed, 0,
>  vf->config_msix_name, vf);
> + if (ret) {
> + IFCVF_ERR(pdev, "Failed to request config irq\n");
> + return ret;
> + }
>  
>   for (i = 0; i < IFCVF_MAX_QUEUE_PAIRS * 2; i++) {
>   snprintf(vf->vring[i].msix_name, 256, "ifcvf[%s]-%d\n",
> 

This series fixes below Kernel BUG I faced while doing some experiments:

[ 1398.695362] kernel BUG at drivers/pci/msi.c:375!
[ 1398.700561] invalid opcode:  [#1] SMP PTI
[ 1398.705423] CPU: 0 PID: 25110 Comm: dpdk-testpmd Not tainted
5.8.0-amorenoz-vdpa+ #2
[ 1398.714063] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS
2.4.3 01/17/2017
[ 1398.722415] RIP: 0010:free_msi_irqs+0x189/0x1c0
[ 1398.727470] Code: 14 85 c0 0f 84 cc fe ff ff 31 ed eb 0f 83 c5 01 39
6b 14 0f 86 bc fe ff ff 8b 7b 10 01 ef e8 7e 94 b9 ff 48 83 78 70 00d
[ 1398.748426] RSP: 0018:b48ac5dd3db8 EFLAGS: 00010286
[ 1398.754257] RAX: 9ab298b85400 RBX: 9ab285d97100 RCX:

[ 1398.762219] RDX:  RSI: 0073 RDI:
ac65e8a0
[ 1398.770182] RBP:  R08: 9ab298b85400 R09:
9ab74a8c3d98
[ 1398.778145] R10: 9ab74a8c3f58 R11:  R12:
9ab719fd82e0
[ 1398.786107] R13: 9ab719fd8000 R14: 9ab719fd8000 R15:
9ab719fd80b0
[ 1398.794069] FS:  7efc5dea9000() GS:9ab75fc0()
knlGS:
[ 1398.803099] CS:  0010 DS:  ES:  CR0: 80050033
[ 1398.809508] CR2: 00c79ff8 CR3: 00038283e005 CR4:
003606f0
[ 1398.817471] DR0:  DR1:  DR2:

[ 1398.825434] DR3:  DR6: fffe0ff0 DR7:
0400
[ 1398.833394] Call Trace:
[ 1398.836127]  pci_disable_msix+0xf7/0x120
[ 1398.840504]  pci_free_irq_vectors+0xe/0x20
[ 1398.845074]  ifcvf_vdpa_set_status+0xda/0x301
[ 1398.849938]  vhost_vdpa_unlocked_ioctl+0x61d/0x790
[ 1398.855277]  ? vhost_vdpa_process_iotlb_msg+0x2f0/0x2f0
[ 1398.861109]  ksys_ioctl+0x87/0xc0
[ 1398.864808]  __x64_sys_ioctl+0x16/0x20
[ 1398.868992]  do_syscall_64+0x52/0x90
[ 1398.872982]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1398.878610] RIP: 0033:0x7efc5df9ff9b
[ 1398.882598] Code: 0f 1e fa 48 8b 05 ed ce 0c 00 64 c7 00 26 00 00 00
48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 008
[ 1398.903551] RSP: 002b:7ffd0948e378 EFLAGS: 0283 ORIG_RAX:
0010
[ 1398.911998] RAX: ffda RBX:  RCX:
7efc5df9ff9b
[ 1398.919960] RDX: 7ffd0948e3d4 RSI: 4001af72 RDI:
002c
[ 1398.927921] RBP: 7ffd0948e3c0 R08: 02651bf8 R09:

[ 1398.935883] R10: 7ffd0948e417 R11: 0283 R12:
00408950
[ 1398.943845] R13: 7ffd0948e6a0 R14:  R15:

[ 1398.951809] Modules linked in: vxlan ip6_udp_tunnel udp_tunnel
ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs xt_comment xt_mark nf_tables xt_nat vetht
[ 1398.951847]  ghash_clmulni_intel iTCO_vendor_support mlx5_core dcdbas
rapl intel_cstate intel_uncore ipmi_ssif pcspkr mxm_wmi mlxfw virtii

Tested-by: Maxime Coquelin 

Thanks,
Maxime

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization