Re: [PATCH v6 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-04-15 Thread Jason Wang


On 2021/3/31 4:05 PM, Xie Yongji wrote:

+	}
+	case VDUSE_INJECT_VQ_IRQ:
+		ret = -EINVAL;
+		if (arg >= dev->vq_num)
+			break;
+
+		ret = 0;
+		queue_work(vduse_irq_wq, &dev->vqs[arg].inject);
+		break;



One additional note:

Please use array_index_nospec() for all vqs[idx] accesses where idx is
under the control of userspace, to avoid potential Spectre exploitation.
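
A minimal userspace sketch of the check-then-clamp pattern, for illustration only. In the kernel the helper is array_index_nospec() from <linux/nospec.h>; index_mask_nospec(), clamp_index_nospec() and vq_slot() below are hypothetical stand-ins that mimic the generic mask computation so the idea compiles outside the kernel. They assume that >> on a negative long is an arithmetic shift (true for GCC and Clang), and they are not a real speculation barrier.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's generic index mask:
 * all ones when index < size, zero otherwise, computed without a branch.
 * Assumes index and size are at most LONG_MAX and that >> on a negative
 * long is an arithmetic shift (implementation-defined in ISO C). */
static unsigned long index_mask_nospec(unsigned long index, unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (sizeof(long) * CHAR_BIT - 1));
}

/* Clamp an index: an in-range value is unchanged, anything else becomes 0. */
static unsigned long clamp_index_nospec(unsigned long index, unsigned long size)
{
	return index & index_mask_nospec(index, size);
}

/* Hypothetical lookup shaped like the VDUSE_INJECT_VQ_IRQ path:
 * returns -1 for an invalid index, else the clamped slot to use. */
static long vq_slot(unsigned long arg, unsigned long vq_num)
{
	if (arg >= vq_num)
		return -1;
	return (long)clamp_index_nospec(arg, vq_num);
}
```

In the patch itself this would presumably be a single line after the existing range check, along the lines of arg = array_index_nospec(arg, dev->vq_num), before dev->vqs[arg] is touched.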


Thanks

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v6 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-04-14 Thread Jason Wang


On 2021/4/13 12:28 PM, Yongji Xie wrote:

On Tue, Apr 13, 2021 at 11:35 AM Jason Wang  wrote:


On 2021/4/12 5:59 PM, Yongji Xie wrote:

On Mon, Apr 12, 2021 at 5:37 PM Jason Wang wrote:

On 2021/4/12 4:02 PM, Yongji Xie wrote:

On Mon, Apr 12, 2021 at 3:16 PM Jason Wang wrote:

On 2021/4/9 4:02 PM, Yongji Xie wrote:

+};
+
+struct vduse_dev_config_data {
+ __u32 offset; /* offset from the beginning of config space */
+ __u32 len; /* the length to read/write */
+ __u8 data[VDUSE_CONFIG_DATA_LEN]; /* data buffer used to read/write */

Note that since VDUSE_CONFIG_DATA_LEN is part of the uAPI, we cannot
change it in the future.

So this might not be sufficient for future features or for all types of
virtio devices.


Do you mean 256 is not enough here?

Yes.


But this request will be submitted multiple times if the config length is
larger than 256. So do you think we need to extend the size to
512 or larger?

So I think you'd better either:

1) document the limitation (256) somewhere (preferably in both the uAPI header and the docs)


But VDUSE_CONFIG_DATA_LEN is not a limit on the size of the
configuration space. It is only the maximum size of one data
transfer for the configuration space. Do you mean I should document this?

Yes, and another thing is that since you're using
data[VDUSE_CONFIG_DATA_LEN] in the uAPI, it implies the length is always
256, which does not seem good and is not what the code actually does.


How about renaming VDUSE_CONFIG_DATA_LEN to VDUSE_MAX_TRANSFER_LEN?

Thanks,
Yongji


So the question is: what is the reason for having this limit in the uAPI?
Note that vhost-vdpa has no such limit:

struct vhost_vdpa_config {
  __u32 off;
  __u32 len;
  __u8 buf[0];
};


If so, we would need to call read()/write() multiple times for each
request or response, in both userspace and the kernel. For
example,

1. read and check request/response type
2. read and check config length if type is VDUSE_SET_CONFIG or VDUSE_GET_CONFIG
3. read the payload

Not sure if it's worth it.
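
The three-step receive described above can be sketched as follows. This is only an illustration of the cost being discussed, not code from the patch: struct stream is a hypothetical in-memory stand-in for the device file descriptor, and the message layout and type codes are assumptions made up for the example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the device file descriptor. */
struct stream {
	const uint8_t *buf;
	size_t len;
	size_t pos;
	unsigned int calls;	/* number of read-like calls issued */
};

/* Copy exactly n bytes out of the stream, like a read() that must not be short. */
static int stream_read(struct stream *s, void *dst, size_t n)
{
	if (s->len - s->pos < n)
		return -1;
	memcpy(dst, s->buf + s->pos, n);
	s->pos += n;
	s->calls++;
	return 0;
}

enum { MSG_GET_CONFIG = 1, MSG_SET_CONFIG = 2 };	/* hypothetical type codes */

/* Assumed variable-length layout: type, then offset/len, then len payload bytes. */
static int recv_variable(struct stream *s, uint8_t *payload, size_t cap,
			 uint32_t *len_out)
{
	uint32_t type, hdr[2];

	if (stream_read(s, &type, sizeof(type)))	/* 1. type */
		return -1;
	if (type != MSG_GET_CONFIG && type != MSG_SET_CONFIG)
		return -1;
	if (stream_read(s, hdr, sizeof(hdr)))		/* 2. offset + length */
		return -1;
	if (hdr[1] > cap)
		return -1;
	if (stream_read(s, payload, hdr[1]))		/* 3. payload */
		return -1;
	*len_out = hdr[1];
	return 0;
}
```

Each config message costs three read-like calls here, whereas a fixed-size message can be fetched with one call at the price of a hard upper bound on the payload.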

Thanks,
Yongji



Right, I see.

So I'm fine with current approach.

Thanks








Re: [PATCH v6 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-04-12 Thread Jason Wang


On 2021/4/12 5:59 PM, Yongji Xie wrote:

On Mon, Apr 12, 2021 at 5:37 PM Jason Wang  wrote:


On 2021/4/12 4:02 PM, Yongji Xie wrote:

On Mon, Apr 12, 2021 at 3:16 PM Jason Wang wrote:

On 2021/4/9 4:02 PM, Yongji Xie wrote:

+};
+
+struct vduse_dev_config_data {
+ __u32 offset; /* offset from the beginning of config space */
+ __u32 len; /* the length to read/write */
+ __u8 data[VDUSE_CONFIG_DATA_LEN]; /* data buffer used to read/write */

Note that since VDUSE_CONFIG_DATA_LEN is part of the uAPI, we cannot
change it in the future.

So this might not be sufficient for future features or for all types of
virtio devices.


Do you mean 256 is not enough here?

Yes.


But this request will be submitted multiple times if the config length is
larger than 256. So do you think we need to extend the size to
512 or larger?

So I think you'd better either:

1) document the limitation (256) somewhere (preferably in both the uAPI header and the docs)


But VDUSE_CONFIG_DATA_LEN is not a limit on the size of the
configuration space. It is only the maximum size of one data
transfer for the configuration space. Do you mean I should document this?


Yes, and another thing is that since you're using
data[VDUSE_CONFIG_DATA_LEN] in the uAPI, it implies the length is always
256, which does not seem good and is not what the code actually does.


How about renaming VDUSE_CONFIG_DATA_LEN to VDUSE_MAX_TRANSFER_LEN?

Thanks,
Yongji



So the question is: what is the reason for having this limit in the uAPI?
Note that vhost-vdpa has no such limit:


struct vhost_vdpa_config {
    __u32 off;
    __u32 len;
    __u8 buf[0];
};
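
For readers less familiar with the trailing buf[0] idiom, here is a small userspace sketch of how such a variable-length message is typically allocated and sized. struct var_config and both helpers are hypothetical names; the struct copies the shape of the one quoted above, with fixed-width types in place of __u32/__u8 and a C99 flexible array member in place of the zero-length array.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as the quoted struct, written with a C99 flexible array member. */
struct var_config {
	uint32_t off;
	uint32_t len;
	uint8_t buf[];
};

/* Allocate a message whose payload length is chosen at run time.
 * Returns NULL on allocation failure; the caller frees the result. */
static struct var_config *var_config_alloc(uint32_t off, const void *data,
					   uint32_t len)
{
	struct var_config *cfg = malloc(sizeof(*cfg) + len);

	if (!cfg)
		return NULL;
	cfg->off = off;
	cfg->len = len;
	if (len)
		memcpy(cfg->buf, data, len);
	return cfg;
}

/* Total number of bytes that make up this message, header plus payload. */
static size_t var_config_size(const struct var_config *cfg)
{
	return sizeof(*cfg) + cfg->len;
}
```

The header carries the length, so no compile-time maximum is baked into the uAPI; the receiver has to learn len before it knows how many bytes follow.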

Thanks







Re: [PATCH v6 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-04-12 Thread Jason Wang


On 2021/4/12 4:02 PM, Yongji Xie wrote:

On Mon, Apr 12, 2021 at 3:16 PM Jason Wang  wrote:


On 2021/4/9 4:02 PM, Yongji Xie wrote:

+};
+
+struct vduse_dev_config_data {
+ __u32 offset; /* offset from the beginning of config space */
+ __u32 len; /* the length to read/write */
+ __u8 data[VDUSE_CONFIG_DATA_LEN]; /* data buffer used to read/write */

Note that since VDUSE_CONFIG_DATA_LEN is part of the uAPI, we cannot
change it in the future.

So this might not be sufficient for future features or for all types of
virtio devices.


Do you mean 256 is not enough here?

Yes.


But this request will be submitted multiple times if the config length is
larger than 256. So do you think we need to extend the size to
512 or larger?


So I think you'd better either:

1) document the limitation (256) somewhere (preferably in both the uAPI header and the docs)


But VDUSE_CONFIG_DATA_LEN is not a limit on the size of the
configuration space. It is only the maximum size of one data
transfer for the configuration space. Do you mean I should document this?



Yes, and another thing is that since you're using
data[VDUSE_CONFIG_DATA_LEN] in the uAPI, it implies the length is always
256, which does not seem good and is not what the code actually does.


Thanks




Thanks,
Yongji




Re: [PATCH v6 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-04-12 Thread Jason Wang


On 2021/4/9 4:02 PM, Yongji Xie wrote:

+};
+
+struct vduse_dev_config_data {
+ __u32 offset; /* offset from the beginning of config space */
+ __u32 len; /* the length to read/write */
+ __u8 data[VDUSE_CONFIG_DATA_LEN]; /* data buffer used to read/write */

Note that since VDUSE_CONFIG_DATA_LEN is part of the uAPI, we cannot
change it in the future.

So this might not be sufficient for future features or for all types of
virtio devices.


Do you mean 256 is not enough here?

Yes.


But this request will be submitted multiple times if the config length is
larger than 256. So do you think we need to extend the size to
512 or larger?
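
To make the "submitted multiple times" behaviour concrete, here is a rough userspace sketch of splitting one config write into fixed-size requests. Only the 256-byte limit and the offset/len/data layout come from the thread; MAX_TRANSFER_LEN, struct config_req, struct config_space and the helpers are hypothetical names for illustration, and the 1024-byte config space is an arbitrary assumption.

```c
#include <stdint.h>
#include <string.h>

#define MAX_TRANSFER_LEN 256	/* per-request limit discussed in the thread */

/* Hypothetical fixed-size request, modeled on vduse_dev_config_data. */
struct config_req {
	uint32_t offset;
	uint32_t len;
	uint8_t data[MAX_TRANSFER_LEN];
};

/* Hypothetical device model: a flat config space that requests are applied to. */
struct config_space {
	uint8_t bytes[1024];
};

static void apply_req(struct config_space *cs, const struct config_req *req)
{
	memcpy(cs->bytes + req->offset, req->data, req->len);
}

/* Write len bytes at offset by issuing as many fixed-size requests as needed.
 * Returns the number of requests submitted, or -1 if the range is out of bounds. */
static int set_config(struct config_space *cs, uint32_t offset,
		      const uint8_t *src, uint32_t len)
{
	int nreq = 0;

	if (offset > sizeof(cs->bytes) || len > sizeof(cs->bytes) - offset)
		return -1;
	while (len) {
		struct config_req req;
		uint32_t chunk = len < MAX_TRANSFER_LEN ? len : MAX_TRANSFER_LEN;

		req.offset = offset;
		req.len = chunk;
		memcpy(req.data, src, chunk);
		apply_req(cs, &req);
		offset += chunk;
		src += chunk;
		len -= chunk;
		nreq++;
	}
	return nreq;
}
```

With these assumptions a 600-byte write turns into three requests (256 + 256 + 88), so the constant bounds one transfer rather than the config space as a whole.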



So I think you'd better either:

1) document the limitation (256) somewhere (preferably in both the uAPI header and the docs)

or

2) make it variable

Thanks







Re: [PATCH v6 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-04-08 Thread Jason Wang


On 2021/4/8 5:36 PM, Yongji Xie wrote:

On Thu, Apr 8, 2021 at 2:57 PM Jason Wang  wrote:


On 2021/3/31 4:05 PM, Xie Yongji wrote:

This VDUSE driver enables implementing vDPA devices in userspace.
Both control path and data path of vDPA devices will be able to
be handled in userspace.

In the control path, the VDUSE driver will make use of a message
mechanism to forward the config operations from the vdpa bus driver
to userspace. Userspace can use read()/write() to receive and reply to
those control messages.

In the data path, VDUSE_IOTLB_GET_FD ioctl will be used to get
the file descriptors referring to vDPA device's iova regions. Then
userspace can use mmap() to access those iova regions. Besides,
userspace can use ioctl() to inject interrupt and use the eventfd
mechanism to receive virtqueue kicks.
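
The eventfd side of the data path can be tried out in isolation with the sketch below (Linux only). It uses nothing from the VDUSE uAPI: the kickfd_* helper names are hypothetical, and in this sketch the program signals its own eventfd, whereas with VDUSE the kick would be signalled by the kernel.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create the kind of eventfd a userspace device would register as a kick fd. */
static int kickfd_create(void)
{
	return eventfd(0, EFD_CLOEXEC);
}

/* Producer side: stands in for whoever signals a virtqueue kick. */
static int kickfd_signal(int fd)
{
	uint64_t one = 1;

	return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Consumer side: block until at least one kick has arrived. Returns the
 * number of kicks coalesced into this wakeup, or 0 on error. */
static uint64_t kickfd_wait(int fd)
{
	uint64_t count = 0;

	if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return 0;
	return count;
}
```

Because a plain (non-semaphore) eventfd accumulates a counter, several kicks that arrive before the reader wakes up are reported by a single read(), so a device loop should drain the virtqueue rather than assume one kick per buffer.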

Signed-off-by: Xie Yongji 
---
   Documentation/userspace-api/ioctl/ioctl-number.rst |1 +
   drivers/vdpa/Kconfig   |   10 +
   drivers/vdpa/Makefile  |1 +
   drivers/vdpa/vdpa_user/Makefile|5 +
   drivers/vdpa/vdpa_user/vduse_dev.c | 1362
   include/uapi/linux/vduse.h |  175 +++
   6 files changed, 1554 insertions(+)
   create mode 100644 drivers/vdpa/vdpa_user/Makefile
   create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c
   create mode 100644 include/uapi/linux/vduse.h

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index a4c75a28c839..71722e6f8f23 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -300,6 +300,7 @@ Code  Seq#    Include File                       Comments
   'z'   10-4F  drivers/s390/crypto/zcrypt_api.h    conflict!
   '|'   00-7F  linux/media.h
   0x80  00-1F  linux/fb.h
+0x81  00-1F  linux/vduse.h
   0x89  00-06  arch/x86/include/asm/sockios.h
   0x89  0B-DF  linux/sockios.h
   0x89  E0-EF  linux/sockios.h                     SIOCPROTOPRIVATE range
diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index a245809c99d0..77a1da522c21 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -25,6 +25,16 @@ config VDPA_SIM_NET
   help
 vDPA networking device simulator which loops TX traffic back to RX.

+config VDPA_USER
+ tristate "VDUSE (vDPA Device in Userspace) support"
+ depends on EVENTFD && MMU && HAS_DMA
+ select DMA_OPS
+ select VHOST_IOTLB
+ select IOMMU_IOVA
+ help
+   With VDUSE it is possible to emulate a vDPA Device
+   in a userspace program.
+
   config IFCVF
   tristate "Intel IFC VF vDPA driver"
   depends on PCI_MSI
diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
index 67fe7f3d6943..f02ebed33f19 100644
--- a/drivers/vdpa/Makefile
+++ b/drivers/vdpa/Makefile
@@ -1,6 +1,7 @@
   # SPDX-License-Identifier: GPL-2.0
   obj-$(CONFIG_VDPA) += vdpa.o
   obj-$(CONFIG_VDPA_SIM) += vdpa_sim/
+obj-$(CONFIG_VDPA_USER) += vdpa_user/
   obj-$(CONFIG_IFCVF)+= ifcvf/
   obj-$(CONFIG_MLX5_VDPA) += mlx5/
   obj-$(CONFIG_VP_VDPA)+= virtio_pci/
diff --git a/drivers/vdpa/vdpa_user/Makefile b/drivers/vdpa/vdpa_user/Makefile
new file mode 100644
index ..260e0b26af99
--- /dev/null
+++ b/drivers/vdpa/vdpa_user/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0
+
+vduse-y := vduse_dev.o iova_domain.o
+
+obj-$(CONFIG_VDPA_USER) += vduse.o
diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
new file mode 100644
index ..51ca73464d0d
--- /dev/null
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -0,0 +1,1362 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VDUSE: vDPA Device in Userspace
+ *
+ * Copyright (C) 2020-2021 Bytedance Inc. and/or its affiliates. All rights reserved.
+ *
+ * Author: Xie Yongji 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "iova_domain.h"
+
+#define DRV_VERSION  "1.0"
+#define DRV_AUTHOR   "Yongji Xie "
+#define DRV_DESC "vDPA Device in Userspace"
+#define DRV_LICENSE  "GPL v2"
+
+#define VDUSE_DEV_MAX (1U << MINORBITS)
+
+struct vduse_virtqueue {
+ u16 index;
+ bool ready;
+ spinlock_t kick_lock;
+ spinlock_t irq_lock;
+ struct eventfd_ctx *kickfd;
+ struct vdpa_callback cb;
+ struct work_struct inject;
+};
+
+struct vduse_dev;
+
+struct vduse_vdpa {
+ struct vdpa_device vdpa;
+ struct vduse_dev *dev;
+};
+
+struct vduse_dev {
+ struct vduse_vdpa *vdev;
+ struct device dev;
+ struct cdev cdev;
+ struct vduse_virtqueue *vqs;
+ struct vduse_iova_domain *domain;
+ struct mutex lock;
+ spinlock_t msg_lock;
+ atomic64_t msg_unique;
+ wait_queue_head_t waitq;
+

Re: [PATCH v6 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-04-07 Thread Jason Wang


On 2021/3/31 4:05 PM, Xie Yongji wrote:

This VDUSE driver enables implementing vDPA devices in userspace.
Both control path and data path of vDPA devices will be able to
be handled in userspace.

In the control path, the VDUSE driver will make use of a message
mechanism to forward the config operations from the vdpa bus driver
to userspace. Userspace can use read()/write() to receive and reply to
those control messages.

In the data path, VDUSE_IOTLB_GET_FD ioctl will be used to get
the file descriptors referring to vDPA device's iova regions. Then
userspace can use mmap() to access those iova regions. Besides,
userspace can use ioctl() to inject interrupt and use the eventfd
mechanism to receive virtqueue kicks.

Signed-off-by: Xie Yongji 
---
  Documentation/userspace-api/ioctl/ioctl-number.rst |1 +
  drivers/vdpa/Kconfig   |   10 +
  drivers/vdpa/Makefile  |1 +
  drivers/vdpa/vdpa_user/Makefile|5 +
  drivers/vdpa/vdpa_user/vduse_dev.c | 1362 
  include/uapi/linux/vduse.h |  175 +++
  6 files changed, 1554 insertions(+)
  create mode 100644 drivers/vdpa/vdpa_user/Makefile
  create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c
  create mode 100644 include/uapi/linux/vduse.h

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index a4c75a28c839..71722e6f8f23 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -300,6 +300,7 @@ Code  Seq#    Include File                       Comments
  'z'   10-4F  drivers/s390/crypto/zcrypt_api.h    conflict!
  '|'   00-7F  linux/media.h
  0x80  00-1F  linux/fb.h
+0x81  00-1F  linux/vduse.h
  0x89  00-06  arch/x86/include/asm/sockios.h
  0x89  0B-DF  linux/sockios.h
  0x89  E0-EF  linux/sockios.h                     SIOCPROTOPRIVATE range
diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig
index a245809c99d0..77a1da522c21 100644
--- a/drivers/vdpa/Kconfig
+++ b/drivers/vdpa/Kconfig
@@ -25,6 +25,16 @@ config VDPA_SIM_NET
help
  vDPA networking device simulator which loops TX traffic back to RX.
  
+config VDPA_USER
+   tristate "VDUSE (vDPA Device in Userspace) support"
+   depends on EVENTFD && MMU && HAS_DMA
+   select DMA_OPS
+   select VHOST_IOTLB
+   select IOMMU_IOVA
+   help
+ With VDUSE it is possible to emulate a vDPA Device
+ in a userspace program.
+
  config IFCVF
tristate "Intel IFC VF vDPA driver"
depends on PCI_MSI
diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile
index 67fe7f3d6943..f02ebed33f19 100644
--- a/drivers/vdpa/Makefile
+++ b/drivers/vdpa/Makefile
@@ -1,6 +1,7 @@
  # SPDX-License-Identifier: GPL-2.0
  obj-$(CONFIG_VDPA) += vdpa.o
  obj-$(CONFIG_VDPA_SIM) += vdpa_sim/
+obj-$(CONFIG_VDPA_USER) += vdpa_user/
  obj-$(CONFIG_IFCVF)+= ifcvf/
  obj-$(CONFIG_MLX5_VDPA) += mlx5/
  obj-$(CONFIG_VP_VDPA)+= virtio_pci/
diff --git a/drivers/vdpa/vdpa_user/Makefile b/drivers/vdpa/vdpa_user/Makefile
new file mode 100644
index ..260e0b26af99
--- /dev/null
+++ b/drivers/vdpa/vdpa_user/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0
+
+vduse-y := vduse_dev.o iova_domain.o
+
+obj-$(CONFIG_VDPA_USER) += vduse.o
diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
new file mode 100644
index ..51ca73464d0d
--- /dev/null
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -0,0 +1,1362 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VDUSE: vDPA Device in Userspace
+ *
+ * Copyright (C) 2020-2021 Bytedance Inc. and/or its affiliates. All rights reserved.
+ *
+ * Author: Xie Yongji 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "iova_domain.h"
+
+#define DRV_VERSION  "1.0"
+#define DRV_AUTHOR   "Yongji Xie "
+#define DRV_DESC "vDPA Device in Userspace"
+#define DRV_LICENSE  "GPL v2"
+
+#define VDUSE_DEV_MAX (1U << MINORBITS)
+
+struct vduse_virtqueue {
+   u16 index;
+   bool ready;
+   spinlock_t kick_lock;
+   spinlock_t irq_lock;
+   struct eventfd_ctx *kickfd;
+   struct vdpa_callback cb;
+   struct work_struct inject;
+};
+
+struct vduse_dev;
+
+struct vduse_vdpa {
+   struct vdpa_device vdpa;
+   struct vduse_dev *dev;
+};
+
+struct vduse_dev {
+   struct vduse_vdpa *vdev;
+   struct device dev;
+   struct cdev cdev;
+   struct vduse_virtqueue *vqs;
+   struct vduse_iova_domain *domain;
+   struct mutex lock;
+   spinlock_t msg_lock;
+   atomic64_t msg_unique;
+   wait_queue_head_t waitq;
+   struct list_head send_list;
+   struct list_he