Re: [PATCH] virtio: Work around frames incorrectly marked as gso

2020-02-19 Thread Michael S. Tsirkin
On Thu, Feb 13, 2020 at 04:23:24PM +, Anton Ivanov wrote:
> 
> On 13/02/2020 15:53, Michael S. Tsirkin wrote:
> > On Thu, Feb 13, 2020 at 07:44:06AM -0800, Eric Dumazet wrote:
> > > 
> > > On 2/13/20 2:00 AM, Michael S. Tsirkin wrote:
> > > > On Wed, Feb 12, 2020 at 05:38:09PM +, Anton Ivanov wrote:
> > > > > 
> > > > > On 11/02/2020 10:37, Michael S. Tsirkin wrote:
> > > > > > On Tue, Feb 11, 2020 at 07:42:37AM +, Anton Ivanov wrote:
> > > > > > > On 11/02/2020 02:51, Jason Wang wrote:
> > > > > > > > On 2020/2/11 12:55 AM, Anton Ivanov wrote:
> > > > > > > > > 
> > > > > > > > > On 09/12/2019 10:48, anton.iva...@cambridgegreys.com wrote:
> > > > > > > > > > From: Anton Ivanov 
> > > > > > > > > > 
> > > > > > > > > > Some of the frames marked as GSO which arrive at
> > > > > > > > > > virtio_net_hdr_from_skb() have no GSO_TYPE, no
> > > > > > > > > > fragments (data_len = 0) and length significantly shorter
> > > > > > > > > > than the MTU (752 in my experiments).
> > > > > > > > > > 
> > > > > > > > > > This is observed on raw sockets reading off vEth interfaces
> > > > > > > > > > in all 4.x and 5.x kernels I tested.
> > > > > > > > > > 
> > > > > > > > > > These frames are reported as invalid while they are in fact
> > > > > > > > > > gso-less frames.
> > > > > > > > > > 
> > > > > > > > > > This patch marks the vnet header as no-GSO for them instead
> > > > > > > > > > of reporting it as invalid.
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Anton Ivanov 
> > > > > > > > > > 
> > > > > > > > > > ---
> > > > > > > > > >     include/linux/virtio_net.h | 8 ++--
> > > > > > > > > >     1 file changed, 6 insertions(+), 2 deletions(-)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/include/linux/virtio_net.h 
> > > > > > > > > > b/include/linux/virtio_net.h
> > > > > > > > > > index 0d1fe9297ac6..d90d5cff1b9a 100644
> > > > > > > > > > --- a/include/linux/virtio_net.h
> > > > > > > > > > +++ b/include/linux/virtio_net.h
> > > > > > > > > > @@ -112,8 +112,12 @@ static inline int
> > > > > > > > > > virtio_net_hdr_from_skb(const struct sk_buff *skb,
> > > > > > > > > >     hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV4;
> > > > > > > > > >     else if (sinfo->gso_type & SKB_GSO_TCPV6)
> > > > > > > > > >     hdr->gso_type = VIRTIO_NET_HDR_GSO_TCPV6;
> > > > > > > > > > -    else
> > > > > > > > > > -    return -EINVAL;
> > > > > > > > > > +    else {
> > > > > > > > > > +    if (skb->data_len == 0)
> > > > > > > > > > +    hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;
> > > > > > > > > > +    else
> > > > > > > > > > +    return -EINVAL;
> > > > > > > > > > +    }
> > > > > > > > > >     if (sinfo->gso_type & SKB_GSO_TCP_ECN)
> > > > > > > > > >     hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN;
> > > > > > > > > >     } else
> > > > > > > > > > 
> > > > > > > > > ping.
> > > > > > > > > 
> > > > > > > > Do you mean gso_size is set but gso_type is not? Looks like a bug
> > > > > > > > elsewhere.
> > > > > > > > 
> > > > > > > > Thanks
> > > > > > > > 
> > > > > > > > 
> > > > > > > Yes.
> > > > > > > 
> > > > > > > I could not trace it where it is coming from.
> > > > > > > 
> > > > > > > I see it when doing recvmmsg on raw sockets in the UML vector network
> > > > > > > drivers.
> > > > > > > 
> > > > > > I think we need to find the culprit and fix it there, lots of other things
> > > > > > can break otherwise.
> > > > > > Just printing out skb->dev->name should do the trick, no?
> > > > > The printk in virtio_net_hdr_from_skb says NULL.
> > > > > 
> > > > > That is probably normal for a locally originated frame.
> > > > > 
> > > > > I cannot reproduce this with network traffic by the way - it happens 
> > > > > only if the traffic is locally originated on the host.
> > > > > 
> > > > > A,
> > > > OK so is it code in __tcp_transmit_skb that sets gso_size to non-null
> > > > when gso_type is 0?
> > > > 
> > > The correct way to determine whether a packet is a gso one is to look at
> > > gso_size. Only then is it legal to look at gso_type.
> > > 
> > > 
> > > static inline bool skb_is_gso(const struct sk_buff *skb)
> > > {
> > >  return skb_shinfo(skb)->gso_size;
> > > }
> > > 
> > > /* Note: Should be called only if skb_is_gso(skb) is true */
> > > static inline bool skb_is_gso_v6(const struct sk_buff *skb)
> > > ...
> > > 
> > > 
> > > There is absolutely no relation between GSO and skb->data_len, skb can be
> > > linearized for various orthogonal reasons.
> > The reported problem is that virtio gets a packet where gso_size
> > is !0 but gso_type is 0.
> > 
> > It currently drops these on the assumption that it's some type
> > of a gso packet it does not know how to handle.
> > 
> > 
> > So you are saying if skb_is_gso we can still have gso_type set to 0,
> > and that's an expected configuration?
> > 
> > So the patch 
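
A small userspace model may help to follow the discussion above. It is only a sketch: the model_* names and constants are hypothetical stand-ins, not the kernel's sk_buff or virtio_net_hdr definitions. It applies the rule Eric describes (test gso_size first, consult gso_type only afterwards) and deliberately ignores data_len:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; these are not the kernel's definitions. */
enum model_gso_type {
	MODEL_GSO_TCPV4 = 1 << 0,
	MODEL_GSO_TCPV6 = 1 << 1,
};

enum model_hdr_gso {
	MODEL_HDR_GSO_NONE,
	MODEL_HDR_GSO_TCPV4,
	MODEL_HDR_GSO_TCPV6,
	MODEL_HDR_GSO_INVALID,	/* models the -EINVAL return path */
};

struct model_shinfo {
	uint16_t gso_size;
	uint32_t gso_type;
};

/* Same test as skb_is_gso(): a packet is GSO iff gso_size is non-zero. */
static bool model_is_gso(const struct model_shinfo *si)
{
	return si->gso_size != 0;
}

/* gso_type is only consulted once gso_size says the packet is GSO. */
static enum model_hdr_gso model_classify(const struct model_shinfo *si)
{
	if (!model_is_gso(si))
		return MODEL_HDR_GSO_NONE;
	if (si->gso_type & MODEL_GSO_TCPV4)
		return MODEL_HDR_GSO_TCPV4;
	if (si->gso_type & MODEL_GSO_TCPV6)
		return MODEL_HDR_GSO_TCPV6;
	/* gso_size != 0 but no known gso_type: the case under discussion. */
	return MODEL_HDR_GSO_INVALID;
}
```

In this model a frame with gso_size != 0 and gso_type == 0 lands in the invalid case, which is the combination reported in the thread.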

[PATCH V4 5/5] vdpasim: vDPA device simulator

2020-02-19 Thread Jason Wang
This patch implements a software vDPA networking device. The datapath
is implemented through vringh and a workqueue. The device has an on-chip
IOMMU which translates IOVA to PA. For kernel virtio drivers, the vDPA
simulator driver provides dma_ops. For vhost drivers, the set_map() method
of vdpa_config_ops is implemented to accept mappings from vhost.

Currently, the vDPA device simulator loops TX traffic back to RX, so
the main use case for the device is vDPA feature testing, prototyping
and development.

Note that no management API is implemented yet: a vDPA device is
registered once the module is probed. This needs to be handled in
future development.

Signed-off-by: Jason Wang 
---
 drivers/virtio/vdpa/Kconfig |  18 +
 drivers/virtio/vdpa/Makefile|   1 +
 drivers/virtio/vdpa/vdpa_sim/Makefile   |   2 +
 drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c | 660 
 4 files changed, 681 insertions(+)
 create mode 100644 drivers/virtio/vdpa/vdpa_sim/Makefile
 create mode 100644 drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c

diff --git a/drivers/virtio/vdpa/Kconfig b/drivers/virtio/vdpa/Kconfig
index 9aac904a9515..9e7dc95e0c89 100644
--- a/drivers/virtio/vdpa/Kconfig
+++ b/drivers/virtio/vdpa/Kconfig
@@ -6,3 +6,21 @@ config VDPA
  datapath which complies with virtio specifications with
  vendor specific control path.
 
+menuconfig VDPA_MENU
+   bool "VDPA drivers"
+   default n
+
+if VDPA_MENU
+
+config VDPA_SIM
+   tristate "vDPA device simulator"
+   select VDPA
+   depends on RUNTIME_TESTING_MENU
+   default n
+   help
+ vDPA networking device simulator which loop TX traffic back
+ to RX. This device is used for testing, prototyping and
+ development of vDPA.
+
+endif # VDPA_MENU
+
diff --git a/drivers/virtio/vdpa/Makefile b/drivers/virtio/vdpa/Makefile
index ee6a35e8a4fb..3814af8e097b 100644
--- a/drivers/virtio/vdpa/Makefile
+++ b/drivers/virtio/vdpa/Makefile
@@ -1,2 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_VDPA) += vdpa.o
+obj-$(CONFIG_VDPA_SIM) += vdpa_sim/
diff --git a/drivers/virtio/vdpa/vdpa_sim/Makefile b/drivers/virtio/vdpa/vdpa_sim/Makefile
new file mode 100644
index ..b40278f65e04
--- /dev/null
+++ b/drivers/virtio/vdpa/vdpa_sim/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_VDPA_SIM) += vdpa_sim.o
diff --git a/drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c b/drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c
new file mode 100644
index ..59d464f72ac2
--- /dev/null
+++ b/drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c
@@ -0,0 +1,660 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VDPA networking device simulator.
+ *
+ * Copyright (c) 2020, Red Hat Inc. All rights reserved.
+ * Author: Jason Wang 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DRV_VERSION  "0.1"
+#define DRV_AUTHOR   "Jason Wang "
+#define DRV_DESC "vDPA Device Simulator"
+#define DRV_LICENSE  "GPL v2"
+
+struct vdpasim_virtqueue {
+   struct vringh vring;
+   struct vringh_kiov iov;
+   unsigned short head;
+   bool ready;
+   u64 desc_addr;
+   u64 device_addr;
+   u64 driver_addr;
+   u32 num;
+   void *private;
+   irqreturn_t (*cb)(void *data);
+};
+
+#define VDPASIM_QUEUE_ALIGN PAGE_SIZE
+#define VDPASIM_QUEUE_MAX 256
+#define VDPASIM_DEVICE_ID 0x1
+#define VDPASIM_VENDOR_ID 0
+#define VDPASIM_VQ_NUM 0x2
+#define VDPASIM_NAME "vdpasim-netdev"
+
+static u64 vdpasim_features = (1ULL << VIRTIO_F_ANY_LAYOUT) |
+ (1ULL << VIRTIO_F_VERSION_1)  |
+ (1ULL << VIRTIO_F_IOMMU_PLATFORM);
+
+/* State of each vdpasim device */
+struct vdpasim {
+   struct vdpasim_virtqueue vqs[2];
+   struct work_struct work;
+   /* spinlock to synchronize virtqueue state */
+   spinlock_t lock;
+   struct vdpa_device *vdpa;
+   struct device dev;
+   struct virtio_net_config config;
+   struct vhost_iotlb *iommu;
+   void *buffer;
+   u32 status;
+   u32 generation;
+   u64 features;
+};
+
+struct vdpasim *vdpasim_dev;
+
+static struct vdpasim *dev_to_sim(struct device *dev)
+{
+   return container_of(dev, struct vdpasim, dev);
+}
+
+static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
+{
+   struct device *d = &vdpa->dev;
+
+   return dev_to_sim(d->parent);
+}
+
+static void vdpasim_queue_ready(struct vdpasim *vdpasim, unsigned int idx)
+{
+   struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
+   int ret;
+
+   ret = vringh_init_iotlb(&vq->vring, vdpasim_features,
+   VDPASIM_QUEUE_MAX, false,
+   (struct vring_desc *)(uintptr_t)vq->desc_addr,
+   (struct vring_avail *)
+

[PATCH V4 3/5] vDPA: introduce vDPA bus

2020-02-19 Thread Jason Wang
A vDPA device is a device that uses a datapath which complies with the
virtio specifications together with a vendor specific control path. vDPA
devices can be either physically located on the hardware or emulated by
software. vDPA hardware devices are usually implemented through PCIE
with the following types:

- PF (Physical Function) - A single Physical Function
- VF (Virtual Function) - Device that supports single root I/O
  virtualization (SR-IOV). Its Virtual Function (VF) represents a
  virtualized instance of the device that can be assigned to different
  partitions
- ADI (Assignable Device Interface) and its equivalents - With
  technologies such as Intel Scalable IOV, a virtual device (VDEV) is
  composed by the host OS utilizing one or more ADIs. An equivalent
  is the SF (Sub function) from Mellanox.

From a driver's perspective, depending on how and where the DMA
translation is done, vDPA devices are split into two types:

- Platform specific DMA translation - From the driver's perspective,
  the device can be used on a platform where device access to data in
  memory is limited and/or translated. An example is a PCIE vDPA whose
  DMA requests are tagged in a bus (e.g. PCIE) specific way. DMA
  translation and protection are done at the PCIE bus IOMMU level.
- Device specific DMA translation - The device implements DMA
  isolation and protection through its own logic. An example is a vDPA
  device which uses on-chip IOMMU.

To hide the differences and complexity of the above device types and
IOMMU options, and to present a generic virtio device to the upper
layer, a device agnostic framework is required.

This patch introduces a software vDPA bus which abstracts the
common attributes of a vDPA device, the vDPA bus driver and the
communication method (vdpa_config_ops) between the vDPA device
abstraction and the vDPA bus driver. This allows multiple types of
drivers, such as virtio_vdpa and vhost_vdpa, to operate on the bus,
so that a vDPA device can be used by either kernel virtio drivers or
userspace vhost drivers:

   virtio drivers    vhost drivers
          |                |
    [virtio bus]     [vhost uAPI]
          |                |
   virtio device     vhost device
   virtio_vdpa drv   vhost_vdpa drv
             \           /
              [vDPA bus]
                  |
             vDPA device
             hardware drv
                  |
            [hardware bus]
                  |
            vDPA hardware

With the abstraction of the vDPA bus and vDPA bus operations, the
differences and complexity of the underlying hardware are hidden from
the upper layer. The vDPA bus drivers on top can use a unified
vdpa_config_ops to control different types of vDPA devices.
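
As a rough illustration of that idea, the sketch below shows a bus-driver side that only ever calls through an ops table, so it does not need to know which device sits underneath. Everything named toy_* is hypothetical and much reduced; it is not the vdpa_config_ops layout from this series. The optional callback mirrors the NULL check that the transport patch does for get_generation:

```c
#include <stddef.h>
#include <stdint.h>

struct toy_dev;

/* Hypothetical, much-reduced ops table (not the real vdpa_config_ops). */
struct toy_config_ops {
	uint8_t (*get_status)(struct toy_dev *dev);
	void (*set_status)(struct toy_dev *dev, uint8_t status);
	/* Optional: a device may leave this NULL. */
	uint32_t (*get_generation)(struct toy_dev *dev);
};

struct toy_dev {
	const struct toy_config_ops *config;
	uint8_t status;
};

/* "Device" side: one possible implementation of the ops. */
static uint8_t toy_sim_get_status(struct toy_dev *dev)
{
	return dev->status;
}

static void toy_sim_set_status(struct toy_dev *dev, uint8_t status)
{
	dev->status = status;
}

static const struct toy_config_ops toy_sim_ops = {
	.get_status = toy_sim_get_status,
	.set_status = toy_sim_set_status,
	/* .get_generation intentionally left NULL */
};

/* "Bus driver" side: talks to any device purely through its ops. */
static void toy_bus_reset(struct toy_dev *dev)
{
	dev->config->set_status(dev, 0);
}

static uint32_t toy_bus_generation(struct toy_dev *dev)
{
	if (dev->config->get_generation)
		return dev->config->get_generation(dev);
	return 0;	/* fallback when the device has no generation counter */
}
```

A second device type would supply its own ops table; the toy_bus_* helpers would stay unchanged.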

Signed-off-by: Jason Wang 
---
 MAINTAINERS  |   1 +
 drivers/virtio/Kconfig   |   2 +
 drivers/virtio/Makefile  |   1 +
 drivers/virtio/vdpa/Kconfig  |   8 ++
 drivers/virtio/vdpa/Makefile |   2 +
 drivers/virtio/vdpa/vdpa.c   | 167 +
 include/linux/vdpa.h | 232 +++
 7 files changed, 413 insertions(+)
 create mode 100644 drivers/virtio/vdpa/Kconfig
 create mode 100644 drivers/virtio/vdpa/Makefile
 create mode 100644 drivers/virtio/vdpa/vdpa.c
 create mode 100644 include/linux/vdpa.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 0fb645b5a7df..2b8d9fa38d9a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17701,6 +17701,7 @@ F:  tools/virtio/
 F: drivers/net/virtio_net.c
 F: drivers/block/virtio_blk.c
 F: include/linux/virtio*.h
+F: include/linux/vdpa.h
 F: include/uapi/linux/virtio_*.h
 F: drivers/crypto/virtio/
 F: mm/balloon_compaction.c
diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 078615cf2afc..9c4fdb64d9ac 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -96,3 +96,5 @@ config VIRTIO_MMIO_CMDLINE_DEVICES
 If unsure, say 'N'.
 
 endif # VIRTIO_MENU
+
+source "drivers/virtio/vdpa/Kconfig"
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 3a2b5c5dcf46..fdf5eacd0d0a 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
 virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
 obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
+obj-$(CONFIG_VDPA) += vdpa/
diff --git a/drivers/virtio/vdpa/Kconfig b/drivers/virtio/vdpa/Kconfig
new file mode 100644
index ..9aac904a9515
--- /dev/null
+++ b/drivers/virtio/vdpa/Kconfig
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config VDPA
+   tristate
+   help
+ Enable this module to support vDPA device that uses a
+ datapath which complies with virtio specifications with
+ vendor specific control path.
+
diff --git a/drivers/virtio/vdpa/Makefile b/drivers/virtio/vdpa/Makefile
new file mode 100644
index ..ee6a35e8a4fb
--- 

[PATCH V4 4/5] virtio: introduce a vDPA based transport

2020-02-19 Thread Jason Wang
This patch introduces a vDPA transport for virtio. It lets kernel
virtio drivers drive a vDPA device that is capable of populating
virtqueues directly.

A new virtio-vdpa driver is registered on the vDPA bus. When a new
virtio-vdpa device is probed, the driver registers the device with
vdpa based config ops. This means it is a software transport between
the vDPA driver and the vDPA device. The transport is implemented
through the bus_ops of the vDPA parent.

Signed-off-by: Jason Wang 
---
 drivers/virtio/Kconfig   |  13 ++
 drivers/virtio/Makefile  |   1 +
 drivers/virtio/virtio_vdpa.c | 392 +++
 3 files changed, 406 insertions(+)
 create mode 100644 drivers/virtio/virtio_vdpa.c

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 9c4fdb64d9ac..99e424570644 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -43,6 +43,19 @@ config VIRTIO_PCI_LEGACY
 
  If unsure, say Y.
 
+config VIRTIO_VDPA
+   tristate "vDPA driver for virtio devices"
+   select VDPA
+   select VIRTIO
+   help
+ This driver provides support for virtio based paravirtual
+ device driver over vDPA bus. For this to be useful, you need
+ an appropriate vDPA device implementation that operates on a
+ physical device to allow the datapath of virtio to be
+ offloaded to hardware.
+
+ If unsure, say M.
+
 config VIRTIO_PMEM
tristate "Support for virtio pmem driver"
depends on VIRTIO
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index fdf5eacd0d0a..3407ac03fe60 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -6,4 +6,5 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
 virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
 obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
+obj-$(CONFIG_VIRTIO_VDPA) += virtio_vdpa.o
 obj-$(CONFIG_VDPA) += vdpa/
diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
new file mode 100644
index ..077796087abf
--- /dev/null
+++ b/drivers/virtio/virtio_vdpa.c
@@ -0,0 +1,392 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VIRTIO based driver for vDPA device
+ *
+ * Copyright (c) 2020, Red Hat. All rights reserved.
+ * Author: Jason Wang 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MOD_VERSION  "0.1"
+#define MOD_AUTHOR   "Jason Wang "
+#define MOD_DESC "vDPA bus driver for virtio devices"
+#define MOD_LICENSE  "GPL v2"
+
+struct virtio_vdpa_device {
+   struct virtio_device vdev;
+   struct vdpa_device *vdpa;
+   u64 features;
+
+   /* The lock to protect virtqueue list */
+   spinlock_t lock;
+   /* List of virtio_vdpa_vq_info */
+   struct list_head virtqueues;
+};
+
+struct virtio_vdpa_vq_info {
+   /* the actual virtqueue */
+   struct virtqueue *vq;
+
+   /* the list node for the virtqueues list */
+   struct list_head node;
+};
+
+static inline struct virtio_vdpa_device *
+to_virtio_vdpa_device(struct virtio_device *dev)
+{
+   return container_of(dev, struct virtio_vdpa_device, vdev);
+}
+
+static struct vdpa_device *vd_get_vdpa(struct virtio_device *vdev)
+{
+   return to_virtio_vdpa_device(vdev)->vdpa;
+}
+
+static void virtio_vdpa_get(struct virtio_device *vdev, unsigned offset,
+   void *buf, unsigned len)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   ops->get_config(vdpa, offset, buf, len);
+}
+
+static void virtio_vdpa_set(struct virtio_device *vdev, unsigned offset,
+   const void *buf, unsigned len)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   ops->set_config(vdpa, offset, buf, len);
+}
+
+static u32 virtio_vdpa_generation(struct virtio_device *vdev)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   if (ops->get_generation)
+   return ops->get_generation(vdpa);
+
+   return 0;
+}
+
+static u8 virtio_vdpa_get_status(struct virtio_device *vdev)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   return ops->get_status(vdpa);
+}
+
+static void virtio_vdpa_set_status(struct virtio_device *vdev, u8 status)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   return ops->set_status(vdpa, status);
+}
+
+static void virtio_vdpa_reset(struct virtio_device *vdev)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   return ops->set_status(vdpa, 0);
+}
+
+static bool 

[PATCH V4 2/5] vringh: IOTLB support

2020-02-19 Thread Jason Wang
This patch implements a third memory accessor for vringh, besides the
current kernel and userspace accessors. The idea is to allow vringh
to do the address translation through an IOTLB which is implemented
via the vhost_map interval tree. Users should set up an IOVA to PA
mapping in this IOTLB.

This allows us to:

- use vringh to access virtqueues with vIOMMU
- use vringh to implement software virtqueues for vDPA devices
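
The translation step can be pictured with a small userspace sketch. It is an assumption-laden toy, not the code in this patch: the toy_* names are hypothetical, and a linear scan stands in for the interval tree that the commit message describes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping entry: IOVA range [start, last] maps to PA base pa. */
struct toy_map {
	uint64_t start;
	uint64_t last;	/* inclusive */
	uint64_t pa;
};

/*
 * Translate iova to a physical address. Returns false when no mapping
 * covers the address. A linear scan is used for brevity; an interval
 * tree would make the lookup logarithmic.
 */
static bool toy_translate(const struct toy_map *maps, size_t n,
			  uint64_t iova, uint64_t *pa)
{
	for (size_t i = 0; i < n; i++) {
		if (iova >= maps[i].start && iova <= maps[i].last) {
			*pa = maps[i].pa + (iova - maps[i].start);
			return true;
		}
	}
	return false;
}
```

An accessor built on such a lookup would translate each descriptor address before copying, and fail the access when the lookup returns false.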

Signed-off-by: Jason Wang 
---
 drivers/vhost/Kconfig.vringh |   1 +
 drivers/vhost/vringh.c   | 421 +--
 include/linux/vringh.h   |  36 +++
 3 files changed, 435 insertions(+), 23 deletions(-)

diff --git a/drivers/vhost/Kconfig.vringh b/drivers/vhost/Kconfig.vringh
index c1fe36a9b8d4..a8d4dd0cb06e 100644
--- a/drivers/vhost/Kconfig.vringh
+++ b/drivers/vhost/Kconfig.vringh
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 config VHOST_RING
tristate
+   select VHOST_IOTLB
---help---
  This option is selected by any driver which needs to access
  the host side of a virtio ring.
diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
index a0a2d74967ef..ee0491f579ac 100644
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -13,6 +13,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
 
 static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
@@ -71,9 +74,11 @@ static inline int __vringh_get_head(const struct vringh *vrh,
 }
 
 /* Copy some bytes to/from the iovec.  Returns num copied. */
-static inline ssize_t vringh_iov_xfer(struct vringh_kiov *iov,
+static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
+ struct vringh_kiov *iov,
  void *ptr, size_t len,
- int (*xfer)(void *addr, void *ptr,
+ int (*xfer)(const struct vringh *vrh,
+ void *addr, void *ptr,
  size_t len))
 {
int err, done = 0;
@@ -82,7 +87,7 @@ static inline ssize_t vringh_iov_xfer(struct vringh_kiov *iov,
size_t partlen;
 
partlen = min(iov->iov[iov->i].iov_len, len);
-   err = xfer(iov->iov[iov->i].iov_base, ptr, partlen);
+   err = xfer(vrh, iov->iov[iov->i].iov_base, ptr, partlen);
if (err)
return err;
done += partlen;
@@ -96,6 +101,7 @@ static inline ssize_t vringh_iov_xfer(struct vringh_kiov *iov,
/* Fix up old iov element then increment. */
iov->iov[iov->i].iov_len = iov->consumed;
iov->iov[iov->i].iov_base -= iov->consumed;
+

iov->consumed = 0;
iov->i++;
@@ -227,7 +233,8 @@ static int slow_copy(struct vringh *vrh, void *dst, const void *src,
  u64 addr,
  struct vringh_range *r),
 struct vringh_range *range,
-int (*copy)(void *dst, const void *src, size_t len))
+int (*copy)(const struct vringh *vrh,
+void *dst, const void *src, size_t len))
 {
size_t part, len = sizeof(struct vring_desc);
 
@@ -241,7 +248,7 @@ static int slow_copy(struct vringh *vrh, void *dst, const void *src,
if (!rcheck(vrh, addr, &part, range, getrange))
return -EINVAL;
 
-   err = copy(dst, src, part);
+   err = copy(vrh, dst, src, part);
if (err)
return err;
 
@@ -262,7 +269,8 @@ __vringh_iov(struct vringh *vrh, u16 i,
 struct vringh_range *)),
 bool (*getrange)(struct vringh *, u64, struct vringh_range *),
 gfp_t gfp,
-int (*copy)(void *dst, const void *src, size_t len))
+int (*copy)(const struct vringh *vrh,
+void *dst, const void *src, size_t len))
 {
int err, count = 0, up_next, desc_max;
struct vring_desc desc, *descs;
@@ -291,7 +299,7 @@ __vringh_iov(struct vringh *vrh, u16 i,
err = slow_copy(vrh, &desc, &descs[i], rcheck, getrange,
&slowrange, copy);
else
-   err = copy(&desc, &descs[i], sizeof(desc));
+   err = copy(vrh, &desc, &descs[i], sizeof(desc));
if (unlikely(err))
goto fail;
 
@@ -404,7 +412,8 @@ static inline int __vringh_complete(struct vringh *vrh,
unsigned int num_used,
int (*putu16)(const struct vringh *vrh,
  __virtio16 *p, u16 val),
-   

[PATCH V4 1/5] vhost: factor out IOTLB

2020-02-19 Thread Jason Wang
This patch factors out the IOTLB into a dedicated module so that it can
be reused by other modules like vringh. Users may enable automatic
retiring by specifying the VHOST_IOTLB_FLAG_RETIRE flag, which fits the
case of the vhost device IOTLB implementation.
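
One way to picture automatic retiring is a bounded table that drops its oldest entry when a new mapping arrives and the table is full. The sketch below is a toy under that assumption: the toy_* names, the fixed-size array and TOY_FLAG_RETIRE are hypothetical and are not taken from vhost_iotlb.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_FLAG_RETIRE 0x1	/* hypothetical stand-in for the retire flag */
#define TOY_CAPACITY 4

struct toy_entry {
	uint64_t start;
	uint64_t last;
	uint64_t pa;
};

struct toy_iotlb {
	struct toy_entry entries[TOY_CAPACITY];	/* kept oldest first */
	size_t count;
	size_t limit;	/* 0 means bounded only by TOY_CAPACITY */
	unsigned int flags;
};

/* Returns 0 on success, -1 if the range is invalid or the table is full. */
static int toy_add(struct toy_iotlb *t, uint64_t start, uint64_t last,
		   uint64_t pa)
{
	size_t limit = (t->limit && t->limit < TOY_CAPACITY) ?
		       t->limit : TOY_CAPACITY;

	if (last < start)
		return -1;

	if (t->count == limit) {
		if (!(t->flags & TOY_FLAG_RETIRE))
			return -1;	/* full, and retiring is not enabled */
		/* Retire the oldest entry to make room for the new one. */
		memmove(&t->entries[0], &t->entries[1],
			(t->count - 1) * sizeof(t->entries[0]));
		t->count--;
	}

	t->entries[t->count++] = (struct toy_entry){ start, last, pa };
	return 0;
}
```

Without the flag, a full table rejects new mappings and the caller has to remove entries explicitly.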

Signed-off-by: Jason Wang 
---
 MAINTAINERS |   1 +
 drivers/vhost/Kconfig   |   6 +
 drivers/vhost/Makefile  |   2 +
 drivers/vhost/net.c |   2 +-
 drivers/vhost/vhost.c   | 221 +++-
 drivers/vhost/vhost.h   |  36 ++
 drivers/vhost/vhost_iotlb.c | 171 
 include/linux/vhost_iotlb.h |  45 
 8 files changed, 303 insertions(+), 181 deletions(-)
 create mode 100644 drivers/vhost/vhost_iotlb.c
 create mode 100644 include/linux/vhost_iotlb.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c74e4ea714a5..0fb645b5a7df 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17768,6 +17768,7 @@ T:  git git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
 S: Maintained
 F: drivers/vhost/
 F: include/uapi/linux/vhost.h
+F: include/linux/vhost_iotlb.h
 
 VIRTIO INPUT DRIVER
 M: Gerd Hoffmann 
diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index 3d03ccbd1adc..e76a72490563 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -36,6 +36,7 @@ config VHOST_VSOCK
 
 config VHOST
tristate
+   select VHOST_IOTLB
---help---
  This option is selected by any driver which needs to access
  the core of vhost.
@@ -54,3 +55,8 @@ config VHOST_CROSS_ENDIAN_LEGACY
  adds some overhead, it is disabled by default.
 
  If unsure, say "N".
+
+config VHOST_IOTLB
+   tristate
+   help
+ Generic IOTLB implementation for vhost and vringh.
diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
index 6c6df24f770c..df99756fbb26 100644
--- a/drivers/vhost/Makefile
+++ b/drivers/vhost/Makefile
@@ -11,3 +11,5 @@ vhost_vsock-y := vsock.o
 obj-$(CONFIG_VHOST_RING) += vringh.o
 
 obj-$(CONFIG_VHOST)+= vhost.o
+
+obj-$(CONFIG_VHOST_IOTLB) += vhost_iotlb.o
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index e158159671fa..e4a20d7a2921 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1594,7 +1594,7 @@ static long vhost_net_reset_owner(struct vhost_net *n)
struct socket *tx_sock = NULL;
struct socket *rx_sock = NULL;
long err;
-   struct vhost_umem *umem;
+   struct vhost_iotlb *umem;
 
mutex_lock(&n->dev.mutex);
err = vhost_dev_check_owner(&n->dev);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index f44340b41494..9059b95cac83 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -50,10 +50,6 @@ enum {
 #define vhost_used_event(vq) ((__virtio16 __user *)&vq->avail->ring[vq->num])
 #define vhost_avail_event(vq) ((__virtio16 __user *)&vq->used->ring[vq->num])
 
-INTERVAL_TREE_DEFINE(struct vhost_umem_node,
-rb, __u64, __subtree_last,
-START, LAST, static inline, vhost_umem_interval_tree);
-
 #ifdef CONFIG_VHOST_CROSS_ENDIAN_LEGACY
 static void vhost_disable_cross_endian(struct vhost_virtqueue *vq)
 {
@@ -581,21 +577,25 @@ long vhost_dev_set_owner(struct vhost_dev *dev)
 }
 EXPORT_SYMBOL_GPL(vhost_dev_set_owner);
 
-struct vhost_umem *vhost_dev_reset_owner_prepare(void)
+static struct vhost_iotlb *iotlb_alloc(void)
+{
+   return vhost_iotlb_alloc(max_iotlb_entries,
+VHOST_IOTLB_FLAG_RETIRE);
+}
+
+struct vhost_iotlb *vhost_dev_reset_owner_prepare(void)
 {
-   return kvzalloc(sizeof(struct vhost_umem), GFP_KERNEL);
+   return iotlb_alloc();
 }
 EXPORT_SYMBOL_GPL(vhost_dev_reset_owner_prepare);
 
 /* Caller should have device mutex */
-void vhost_dev_reset_owner(struct vhost_dev *dev, struct vhost_umem *umem)
+void vhost_dev_reset_owner(struct vhost_dev *dev, struct vhost_iotlb *umem)
 {
int i;
 
vhost_dev_cleanup(dev);
 
-   /* Restore memory to default empty mapping. */
-   INIT_LIST_HEAD(>umem_list);
dev->umem = umem;
/* We don't need VQ locks below since vhost_dev_cleanup makes sure
 * VQs aren't running.
@@ -618,28 +618,6 @@ void vhost_dev_stop(struct vhost_dev *dev)
 }
 EXPORT_SYMBOL_GPL(vhost_dev_stop);
 
-static void vhost_umem_free(struct vhost_umem *umem,
-   struct vhost_umem_node *node)
-{
-   vhost_umem_interval_tree_remove(node, >umem_tree);
-   list_del(>link);
-   kfree(node);
-   umem->numem--;
-}
-
-static void vhost_umem_clean(struct vhost_umem *umem)
-{
-   struct vhost_umem_node *node, *tmp;
-
-   if (!umem)
-   return;
-
-   list_for_each_entry_safe(node, tmp, >umem_list, link)
-   vhost_umem_free(umem, node);
-
-   kvfree(umem);
-}
-
 static void vhost_clear_msg(struct vhost_dev *dev)
 {
struct vhost_msg_node *node, *n;
@@ -677,9 +655,9 @@ void 

[PATCH V4 0/5] vDPA support

2020-02-19 Thread Jason Wang
Hi all:

This is an updated version of vDPA support in the kernel.

A vDPA device is a device that uses a datapath which complies with the
virtio specifications together with a vendor specific control path. vDPA
devices can be either physically located on the hardware or emulated by
software. vDPA hardware devices are usually implemented through PCIE
with the following types:

- PF (Physical Function) - A single Physical Function
- VF (Virtual Function) - Device that supports single root I/O
  virtualization (SR-IOV). Its Virtual Function (VF) represents a
  virtualized instance of the device that can be assigned to different
  partitions
- ADI (Assignable Device Interface) and its equivalents - With
  technologies such as Intel Scalable IOV, a virtual device (VDEV) is
  composed by the host OS utilizing one or more ADIs. An equivalent
  is the SF (Sub function) from Mellanox.

From a driver's perspective, depending on how and where the DMA
translation is done, vDPA devices are split into two types:

- Platform specific DMA translation - From the driver's perspective,
  the device can be used on a platform where device access to data in
  memory is limited and/or translated. An example is a PCIE vDPA whose
  DMA requests are tagged in a bus (e.g. PCIE) specific way. DMA
  translation and protection are done at the PCIE bus IOMMU level.
- Device specific DMA translation - The device implements DMA
  isolation and protection through its own logic. An example is a vDPA
  device which uses on-chip IOMMU.

To hide the differences and complexity of the above device types and
IOMMU options, and to present a generic virtio device to the upper
layer, a device agnostic framework is required.

This series introduces a software vDPA bus which abstracts the
common attributes of a vDPA device, the vDPA bus driver and the
communication method, i.e. the bus operations (vdpa_config_ops),
between the vDPA device abstraction and the vDPA bus driver. This
allows multiple types of drivers, such as virtio_vdpa and vhost_vdpa,
to operate on the bus, so that a vDPA device can be used by either
kernel virtio drivers or userspace vhost drivers:

   virtio drivers    vhost drivers
          |                |
    [virtio bus]     [vhost uAPI]
          |                |
   virtio device     vhost device
   virtio_vdpa drv   vhost_vdpa drv
             \           /
              [vDPA bus]
                  |
             vDPA device
             hardware drv
                  |
            [hardware bus]
                  |
            vDPA hardware

The virtio_vdpa driver is a transport implementation for kernel virtio
drivers on top of vDPA bus operations. An alternative is to refactor
the virtio bus, which is sub-optimal: since the bus and drivers are
designed to be used by kernel subsystems, a non-trivial major
refactoring would be needed, which may impact a bunch of driver and
device implementations inside the kernel. Using a new transport may
greatly simplify both the design and the changes.

The vhost_vdpa driver is a new type of vhost device which allows
userspace vhost drivers to use vDPA devices via the vhost uAPI (with a
minor extension). This helps to minimize the changes to existing vhost
drivers for using vDPA devices.

With the abstraction of the vDPA bus and vDPA bus operations, the
differences and complexity of the underlying hardware are hidden from
the upper layer. The vDPA bus drivers on top can use a unified
vdpa_config_ops to control different types of vDPA devices.

This series contains the bus and virtio_vdpa implementation. We are
working on the vhost part and IFCVF (vDPA driver from Intel), which
will be posted in the next few days.

Thanks

Changes from V3:

- various Kconfig fixes (Randy)

Changes from V2:

- release idr in the release function for put_device() unwind (Jason)
- don't panic when failing to register the vdpa bus (Jason)
- use unsigned int instead of int for ida (Jason)
- fix the wrong commit log in virtio_vdpa patches (Jason)
- make vdpa_sim depends on RUNTIME_TESTING_MENU (Michael)
- provide a bus release function for vDPA device (Jason)
- fix the wrong unwind when creating devices for vDPA simulator (Jason)
- move vDPA simulator to a dedicated directory (Lingshan)
- cancel the work before releasing the vDPA simulator

Changes from V1:

- drop sysfs API, leave the management interface to future development
  (Michael)
- introduce incremental DMA ops (dma_map/dma_unmap) (Michael)
- introduce dma_device and use it instead of parent device for doing
  IOMMU or DMA from bus driver (Michael, Jason, Ling Shan, Tiwei)
- accept parent device and dma device when registering a vdpa device
- coding style and compile fixes (Randy)
- use vdpa_xxx instead of xxx_vdpa (Jason)
- move vDPA accessors to header and make them static inline (Jason)
- split vdp_register_device() into two helpers vdpa_init_device() and
  vdpa_register_device() which allows intermediate step to be done (Jason)
- warn on invalid queue state when failing to create a virtqueue (Jason)
- make 

Re: [PATCH V3 5/5] vdpasim: vDPA device simulator

2020-02-19 Thread Jason Wang


On 2020/2/20 12:09 PM, Randy Dunlap wrote:
> On 2/19/20 7:56 PM, Jason Wang wrote:
>> diff --git a/drivers/virtio/vdpa/Kconfig b/drivers/virtio/vdpa/Kconfig
>> index 7a99170e6c30..e3656b722654 100644
>> --- a/drivers/virtio/vdpa/Kconfig
>> +++ b/drivers/virtio/vdpa/Kconfig
>> @@ -7,3 +7,21 @@ config VDPA
>> datapath which complies with virtio specifications with
>> vendor specific control path.
>>
>> +menuconfig VDPA_MENU
>> +   bool "VDPA drivers"
>> +   default n
>> +
>> +if VDPA_MENU
>> +
>> +config VDPA_SIM
>> +   tristate "vDPA device simulator"
>> +select VDPA
>> +depends on RUNTIME_TESTING_MENU
>> +default n
>> +help
>> +  vDPA networking device simulator which loop TX traffic back
>> +  to RX. This device is used for testing, prototyping and
>> +  development of vDPA.
>> +
>> +endif # VDPA_MENU
>> +
>
> Use 1 tab for indentation for tristate/select/depends/default/help,
> and then 1 tab + 2 spaces for help text.

Yes.

Thanks

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH V3 4/5] virtio: introduce a vDPA based transport

2020-02-19 Thread Jason Wang


On 2020/2/20 12:07 PM, Randy Dunlap wrote:
> On 2/19/20 7:56 PM, Jason Wang wrote:
>> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
>> index 9c4fdb64d9ac..0df3676b0f4f 100644
>> --- a/drivers/virtio/Kconfig
>> +++ b/drivers/virtio/Kconfig
>> @@ -43,6 +43,19 @@ config VIRTIO_PCI_LEGACY
>>
>>   	  If unsure, say Y.
>>
>> +config VIRTIO_VDPA
>> +   tristate "vDPA driver for virtio devices"
>> +select VDPA
>> +select VIRTIO
>> +   help
>> + This driver provides support for virtio based paravirtual
>> + device driver over vDPA bus. For this to be useful, you need
>> + an appropriate vDPA device implementation that operates on a
>> +  physical device to allow the datapath of virtio to be
>> + offloaded to hardware.
>> +
>> + If unsure, say M.
>> +
>
> Please use tabs consistently for indentation, not spaces,
> except in the Kconfig help text, which should be 1 tab + 2 spaces.

Fixed.

Thanks

>>   config VIRTIO_PMEM
>> tristate "Support for virtio pmem driver"
>> depends on VIRTIO


Re: [PATCH V3 3/5] vDPA: introduce vDPA bus

2020-02-19 Thread Jason Wang


On 2020/2/20 12:06 PM, Randy Dunlap wrote:
> On 2/19/20 7:56 PM, Jason Wang wrote:
>> diff --git a/drivers/virtio/vdpa/Kconfig b/drivers/virtio/vdpa/Kconfig
>> new file mode 100644
>> index ..7a99170e6c30
>> --- /dev/null
>> +++ b/drivers/virtio/vdpa/Kconfig
>> @@ -0,0 +1,9 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +config VDPA
>> +   tristate
>> +default m
>
> Don't add drivers that are enabled by default, unless they are required
> for a system to boot.
>
> And anything that wants VDPA should just select it, so this is not needed.

Right, fixed.

Thanks

>> +help
>> +  Enable this module to support vDPA device that uses a
>> +  datapath which complies with virtio specifications with
>> +  vendor specific control path.
>> +
>
> thanks.



Re: [PATCH V3 1/5] vhost: factor out IOTLB

2020-02-19 Thread Jason Wang


On 2020/2/20 12:04 PM, Randy Dunlap wrote:
> On 2/19/20 7:56 PM, Jason Wang wrote:
>> This patch factors out IOTLB into a dedicated module in order to be
>> reused by other modules like vringh. User may choose to enable the
>> automatic retiring by specifying VHOST_IOTLB_FLAG_RETIRE flag to fit
>> for the case of vhost device IOTLB implementation.
>>
>> Signed-off-by: Jason Wang 
>> ---
>>   MAINTAINERS |   1 +
>>   drivers/vhost/Kconfig   |   7 ++
>>   drivers/vhost/Makefile  |   2 +
>>   drivers/vhost/net.c |   2 +-
>>   drivers/vhost/vhost.c   | 221 +++-
>>   drivers/vhost/vhost.h   |  36 ++
>>   drivers/vhost/vhost_iotlb.c | 171 
>>   include/linux/vhost_iotlb.h |  45 
>>   8 files changed, 304 insertions(+), 181 deletions(-)
>>   create mode 100644 drivers/vhost/vhost_iotlb.c
>>   create mode 100644 include/linux/vhost_iotlb.h
>
> Hi,
> Sorry if you have gone over this previously:

Thanks for the review, it's really helpful.

>> diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
>> index 3d03ccbd1adc..eef634ff9a6e 100644
>> --- a/drivers/vhost/Kconfig
>> +++ b/drivers/vhost/Kconfig
>> @@ -36,6 +36,7 @@ config VHOST_VSOCK
>>
>>   config VHOST
>> tristate
>> +   select VHOST_IOTLB
>> ---help---
>>   This option is selected by any driver which needs to access
>>   the core of vhost.
>> @@ -54,3 +55,9 @@ config VHOST_CROSS_ENDIAN_LEGACY
>>   adds some overhead, it is disabled by default.
>>
>>   	  If unsure, say "N".
>> +
>> +config VHOST_IOTLB
>> +   tristate
>> +   default m
>
> "default m" should not be needed. Just make whatever needs it select it.

Yes, will fix.

Thanks

>> +   help
>> + Generic IOTLB implementation for vhost and vringh.


Re: [PATCH V3 5/5] vdpasim: vDPA device simulator

2020-02-19 Thread Randy Dunlap
On 2/19/20 7:56 PM, Jason Wang wrote:
> diff --git a/drivers/virtio/vdpa/Kconfig b/drivers/virtio/vdpa/Kconfig
> index 7a99170e6c30..e3656b722654 100644
> --- a/drivers/virtio/vdpa/Kconfig
> +++ b/drivers/virtio/vdpa/Kconfig
> @@ -7,3 +7,21 @@ config VDPA
>datapath which complies with virtio specifications with
>vendor specific control path.
>  
> +menuconfig VDPA_MENU
> + bool "VDPA drivers"
> + default n
> +
> +if VDPA_MENU
> +
> +config VDPA_SIM
> + tristate "vDPA device simulator"
> +select VDPA
> +depends on RUNTIME_TESTING_MENU
> +default n
> +help
> +  vDPA networking device simulator which loop TX traffic back
> +  to RX. This device is used for testing, prototyping and
> +  development of vDPA.
> +
> +endif # VDPA_MENU
> +

Use 1 tab for indentation for tristate/select/depends/default/help,
and then 1 tab + 2 spaces for help text.

-- 
~Randy



Re: [PATCH V3 4/5] virtio: introduce a vDPA based transport

2020-02-19 Thread Randy Dunlap
On 2/19/20 7:56 PM, Jason Wang wrote:
> diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
> index 9c4fdb64d9ac..0df3676b0f4f 100644
> --- a/drivers/virtio/Kconfig
> +++ b/drivers/virtio/Kconfig
> @@ -43,6 +43,19 @@ config VIRTIO_PCI_LEGACY
>  
> If unsure, say Y.
>  
> +config VIRTIO_VDPA
> + tristate "vDPA driver for virtio devices"
> +select VDPA
> +select VIRTIO
> + help
> +   This driver provides support for virtio based paravirtual
> +   device driver over vDPA bus. For this to be useful, you need
> +   an appropriate vDPA device implementation that operates on a
> +  physical device to allow the datapath of virtio to be
> +   offloaded to hardware.
> +
> +   If unsure, say M.
> +

Please use tabs consistently for indentation, not spaces,
except in the Kconfig help text, which should be 1 tab + 2 spaces.

>  config VIRTIO_PMEM
>   tristate "Support for virtio pmem driver"
>   depends on VIRTIO


-- 
~Randy



Re: [PATCH V3 3/5] vDPA: introduce vDPA bus

2020-02-19 Thread Randy Dunlap
On 2/19/20 7:56 PM, Jason Wang wrote:
> diff --git a/drivers/virtio/vdpa/Kconfig b/drivers/virtio/vdpa/Kconfig
> new file mode 100644
> index ..7a99170e6c30
> --- /dev/null
> +++ b/drivers/virtio/vdpa/Kconfig
> @@ -0,0 +1,9 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +config VDPA
> + tristate
> +default m

Don't add drivers that are enabled by default, unless they are required
for a system to boot.

And anything that wants VDPA should just select it, so this is not needed.

> +help
> +  Enable this module to support vDPA device that uses a
> +  datapath which complies with virtio specifications with
> +  vendor specific control path.
> +

thanks.
-- 
~Randy



Re: [PATCH V3 1/5] vhost: factor out IOTLB

2020-02-19 Thread Randy Dunlap
On 2/19/20 7:56 PM, Jason Wang wrote:
> This patch factors out IOTLB into a dedicated module in order to be
> reused by other modules like vringh. User may choose to enable the
> automatic retiring by specifying VHOST_IOTLB_FLAG_RETIRE flag to fit
> for the case of vhost device IOTLB implementation.
> 
> Signed-off-by: Jason Wang 
> ---
>  MAINTAINERS |   1 +
>  drivers/vhost/Kconfig   |   7 ++
>  drivers/vhost/Makefile  |   2 +
>  drivers/vhost/net.c |   2 +-
>  drivers/vhost/vhost.c   | 221 +++-
>  drivers/vhost/vhost.h   |  36 ++
>  drivers/vhost/vhost_iotlb.c | 171 
>  include/linux/vhost_iotlb.h |  45 
>  8 files changed, 304 insertions(+), 181 deletions(-)
>  create mode 100644 drivers/vhost/vhost_iotlb.c
>  create mode 100644 include/linux/vhost_iotlb.h
> 

Hi,
Sorry if you have gone over this previously:

> diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
> index 3d03ccbd1adc..eef634ff9a6e 100644
> --- a/drivers/vhost/Kconfig
> +++ b/drivers/vhost/Kconfig
> @@ -36,6 +36,7 @@ config VHOST_VSOCK
>  
>  config VHOST
>   tristate
> + select VHOST_IOTLB
>   ---help---
> This option is selected by any driver which needs to access
> the core of vhost.
> @@ -54,3 +55,9 @@ config VHOST_CROSS_ENDIAN_LEGACY
> adds some overhead, it is disabled by default.
>  
> If unsure, say "N".
> +
> +config VHOST_IOTLB
> + tristate
> + default m

"default m" should not be needed. Just make whatever needs it select it.

> + help
> +   Generic IOTLB implementation for vhost and vringh.


-- 
~Randy



[PATCH V3 5/5] vdpasim: vDPA device simulator

2020-02-19 Thread Jason Wang
This patch implements a software vDPA networking device. The datapath
is implemented through vringh and workqueue. The device has an on-chip
IOMMU which translates IOVA to PA. For kernel virtio drivers, the vDPA
simulator driver provides dma_ops. For vhost drivers, the set_map()
method of vdpa_config_ops is implemented to accept mappings from vhost.

Currently, vDPA device simulator will loopback TX traffic to RX. So
the main use case for the device is vDPA feature testing, prototyping
and development.
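
The loopback behaviour described above can be pictured with a small
userspace sketch. This is not the vdpasim datapath (which goes through
vringh and a workqueue); struct toy_queue, TOY_BUF_SIZE and
toy_loopback() are hypothetical names used only for illustration, and
each queue is assumed to hold a single pending frame.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUF_SIZE 2048	/* hypothetical per-queue buffer size */

/* Hypothetical queue holding at most one pending frame. */
struct toy_queue {
	unsigned char data[TOY_BUF_SIZE];
	size_t len;	/* bytes pending in data[], 0 if empty */
	int ready;	/* non-zero once the queue has been enabled */
};

/*
 * Move one pending TX frame into the RX queue.
 * Returns the number of bytes looped back, or 0 if there was nothing
 * to do (a queue is not ready, TX is empty, or the length is bogus).
 */
static size_t toy_loopback(struct toy_queue *txq, struct toy_queue *rxq)
{
	size_t n;

	assert(txq != NULL && rxq != NULL);

	if (!txq->ready || !rxq->ready)
		return 0;
	if (txq->len == 0 || txq->len > TOY_BUF_SIZE)
		return 0;

	n = txq->len;
	memcpy(rxq->data, txq->data, n);
	rxq->len = n;
	txq->len = 0;	/* the TX frame has been consumed */
	return n;
}
```

A caller would fill txq->data, set txq->len, and then find the same
bytes in rxq->data after toy_loopback() returns a non-zero count.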

Note that no management API is implemented; a vDPA device is
registered once the module is probed. This needs to be handled in
future development.

Signed-off-by: Jason Wang 
---
 drivers/virtio/vdpa/Kconfig |  18 +
 drivers/virtio/vdpa/Makefile|   1 +
 drivers/virtio/vdpa/vdpa_sim/Makefile   |   2 +
 drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c | 660 
 4 files changed, 681 insertions(+)
 create mode 100644 drivers/virtio/vdpa/vdpa_sim/Makefile
 create mode 100644 drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c

diff --git a/drivers/virtio/vdpa/Kconfig b/drivers/virtio/vdpa/Kconfig
index 7a99170e6c30..e3656b722654 100644
--- a/drivers/virtio/vdpa/Kconfig
+++ b/drivers/virtio/vdpa/Kconfig
@@ -7,3 +7,21 @@ config VDPA
   datapath which complies with virtio specifications with
   vendor specific control path.
 
+menuconfig VDPA_MENU
+   bool "VDPA drivers"
+   default n
+
+if VDPA_MENU
+
+config VDPA_SIM
+   tristate "vDPA device simulator"
+select VDPA
+depends on RUNTIME_TESTING_MENU
+default n
+help
+  vDPA networking device simulator which loop TX traffic back
+  to RX. This device is used for testing, prototyping and
+  development of vDPA.
+
+endif # VDPA_MENU
+
diff --git a/drivers/virtio/vdpa/Makefile b/drivers/virtio/vdpa/Makefile
index ee6a35e8a4fb..3814af8e097b 100644
--- a/drivers/virtio/vdpa/Makefile
+++ b/drivers/virtio/vdpa/Makefile
@@ -1,2 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_VDPA) += vdpa.o
+obj-$(CONFIG_VDPA_SIM) += vdpa_sim/
diff --git a/drivers/virtio/vdpa/vdpa_sim/Makefile 
b/drivers/virtio/vdpa/vdpa_sim/Makefile
new file mode 100644
index ..b40278f65e04
--- /dev/null
+++ b/drivers/virtio/vdpa/vdpa_sim/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_VDPA_SIM) += vdpa_sim.o
diff --git a/drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c 
b/drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c
new file mode 100644
index ..59d464f72ac2
--- /dev/null
+++ b/drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c
@@ -0,0 +1,660 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VDPA networking device simulator.
+ *
+ * Copyright (c) 2020, Red Hat Inc. All rights reserved.
+ * Author: Jason Wang 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DRV_VERSION  "0.1"
+#define DRV_AUTHOR   "Jason Wang "
+#define DRV_DESC "vDPA Device Simulator"
+#define DRV_LICENSE  "GPL v2"
+
+struct vdpasim_virtqueue {
+   struct vringh vring;
+   struct vringh_kiov iov;
+   unsigned short head;
+   bool ready;
+   u64 desc_addr;
+   u64 device_addr;
+   u64 driver_addr;
+   u32 num;
+   void *private;
+   irqreturn_t (*cb)(void *data);
+};
+
+#define VDPASIM_QUEUE_ALIGN PAGE_SIZE
+#define VDPASIM_QUEUE_MAX 256
+#define VDPASIM_DEVICE_ID 0x1
+#define VDPASIM_VENDOR_ID 0
+#define VDPASIM_VQ_NUM 0x2
+#define VDPASIM_NAME "vdpasim-netdev"
+
+static u64 vdpasim_features = (1ULL << VIRTIO_F_ANY_LAYOUT) |
+ (1ULL << VIRTIO_F_VERSION_1)  |
+ (1ULL << VIRTIO_F_IOMMU_PLATFORM);
+
+/* State of each vdpasim device */
+struct vdpasim {
+   struct vdpasim_virtqueue vqs[2];
+   struct work_struct work;
+   /* spinlock to synchronize virtqueue state */
+   spinlock_t lock;
+   struct vdpa_device *vdpa;
+   struct device dev;
+   struct virtio_net_config config;
+   struct vhost_iotlb *iommu;
+   void *buffer;
+   u32 status;
+   u32 generation;
+   u64 features;
+};
+
+struct vdpasim *vdpasim_dev;
+
+static struct vdpasim *dev_to_sim(struct device *dev)
+{
+   return container_of(dev, struct vdpasim, dev);
+}
+
+static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
+{
+   struct device *d = &vdpa->dev;
+
+   return dev_to_sim(d->parent);
+}
+
+static void vdpasim_queue_ready(struct vdpasim *vdpasim, unsigned int idx)
+{
+   struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
+   int ret;
+
+   ret = vringh_init_iotlb(&vq->vring, vdpasim_features,
+   VDPASIM_QUEUE_MAX, false,
+   (struct vring_desc *)(uintptr_t)vq->desc_addr,
+   (struct vring_avail *)
+   

[PATCH V3 3/5] vDPA: introduce vDPA bus

2020-02-19 Thread Jason Wang
A vDPA device is a device that uses a datapath which complies with the
virtio specifications, with a vendor-specific control path. vDPA devices
can be either physically located on the hardware or emulated by
software. vDPA hardware devices are usually implemented through PCIe
with the following types:

- PF (Physical Function) - A single Physical Function
- VF (Virtual Function) - Device that supports single root I/O
  virtualization (SR-IOV). Its Virtual Function (VF) represents a
  virtualized instance of the device that can be assigned to different
  partitions
- ADI (Assignable Device Interface) and its equivalents - With
  technologies such as Intel Scalable IOV, a virtual device (VDEV) is
  composed by the host OS utilizing one or more ADIs, or an equivalent
  such as SF (Sub function) from Mellanox.

From a driver's perspective, depending on how and where the DMA
translation is done, vDPA devices are split into two types:

- Platform specific DMA translation - From the driver's perspective,
  the device can be used on a platform where device access to data in
  memory is limited and/or translated. An example is a PCIe vDPA device
  whose DMA requests are tagged in a bus-specific (e.g. PCIe) way. DMA
  translation and protection are done at the PCIe bus IOMMU level.
- Device specific DMA translation - The device implements DMA
  isolation and protection through its own logic. An example is a vDPA
  device which uses an on-chip IOMMU.

To hide the differences and complexity of the above vDPA device types
and IOMMU options, and to present a generic virtio device to the upper
layer, a device-agnostic framework is required.

This patch introduces a software vDPA bus which abstracts the
common attributes of a vDPA device, the vDPA bus driver and the
communication method (vdpa_config_ops) between the vDPA device
abstraction and the vDPA bus driver. This allows multiple types of
drivers, such as virtio_vdpa and vhost_vdpa, to operate on the bus,
so that a vDPA device can be used by either kernel virtio drivers or
userspace vhost drivers:

   virtio drivers  vhost drivers
          |             |
    [virtio bus]   [vhost uAPI]
          |             |
   virtio device   vhost device
   virtio_vdpa drv vhost_vdpa drv
             \       /
            [vDPA bus]
                 |
            vDPA device
            hardware drv
                 |
            [hardware bus]
                 |
            vDPA hardware

With the abstraction of the vDPA bus and vDPA bus operations, the
differences and complexity of the underlying hardware are hidden from
the upper layer. The vDPA bus drivers on top can use a unified
vdpa_config_ops to control different types of vDPA devices.
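
As a rough illustration of the "unified ops" idea, the sketch below
shows a bus driver that only talks to a device through a table of
function pointers. The struct and function names (toy_config_ops,
toy_vdpa_device, toy_bus_reset, toy_bus_generation) are hypothetical
stand-ins; only the get_status/set_status/get_generation callbacks
mirror names used with vdpa_config_ops elsewhere in this series, and
the real structure carries many more operations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_vdpa_device;

/* Hypothetical, trimmed-down stand-in for vdpa_config_ops. */
struct toy_config_ops {
	uint8_t (*get_status)(struct toy_vdpa_device *vdpa);
	void (*set_status)(struct toy_vdpa_device *vdpa, uint8_t status);
	/* Optional callback: may be left NULL by a device. */
	uint32_t (*get_generation)(struct toy_vdpa_device *vdpa);
};

struct toy_vdpa_device {
	const struct toy_config_ops *config;
	uint8_t status;
};

/* One possible "device" implementation: a trivial simulator. */
static uint8_t sim_get_status(struct toy_vdpa_device *vdpa)
{
	return vdpa->status;
}

static void sim_set_status(struct toy_vdpa_device *vdpa, uint8_t status)
{
	vdpa->status = status;
}

static const struct toy_config_ops toy_sim_ops = {
	.get_status = sim_get_status,
	.set_status = sim_set_status,
	/* .get_generation intentionally left NULL */
};

/* Bus-driver side: knows nothing about the device behind the ops. */
static void toy_bus_reset(struct toy_vdpa_device *vdpa)
{
	assert(vdpa != NULL && vdpa->config != NULL);
	vdpa->config->set_status(vdpa, 0);
}

static uint32_t toy_bus_generation(struct toy_vdpa_device *vdpa)
{
	const struct toy_config_ops *ops = vdpa->config;

	/* Fall back to 0 when the device does not implement the hook. */
	return ops->get_generation ? ops->get_generation(vdpa) : 0;
}
```

A second device type would supply its own toy_config_ops instance, and
toy_bus_reset() and toy_bus_generation() would work on it unchanged.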

Signed-off-by: Jason Wang 
---
 MAINTAINERS  |   1 +
 drivers/virtio/Kconfig   |   2 +
 drivers/virtio/Makefile  |   1 +
 drivers/virtio/vdpa/Kconfig  |   9 ++
 drivers/virtio/vdpa/Makefile |   2 +
 drivers/virtio/vdpa/vdpa.c   | 167 +
 include/linux/vdpa.h | 232 +++
 7 files changed, 414 insertions(+)
 create mode 100644 drivers/virtio/vdpa/Kconfig
 create mode 100644 drivers/virtio/vdpa/Makefile
 create mode 100644 drivers/virtio/vdpa/vdpa.c
 create mode 100644 include/linux/vdpa.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 0fb645b5a7df..2b8d9fa38d9a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17701,6 +17701,7 @@ F:  tools/virtio/
 F: drivers/net/virtio_net.c
 F: drivers/block/virtio_blk.c
 F: include/linux/virtio*.h
+F: include/linux/vdpa.h
 F: include/uapi/linux/virtio_*.h
 F: drivers/crypto/virtio/
 F: mm/balloon_compaction.c
diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 078615cf2afc..9c4fdb64d9ac 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -96,3 +96,5 @@ config VIRTIO_MMIO_CMDLINE_DEVICES
 If unsure, say 'N'.
 
 endif # VIRTIO_MENU
+
+source "drivers/virtio/vdpa/Kconfig"
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index 3a2b5c5dcf46..fdf5eacd0d0a 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -6,3 +6,4 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
 virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
 obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
+obj-$(CONFIG_VDPA) += vdpa/
diff --git a/drivers/virtio/vdpa/Kconfig b/drivers/virtio/vdpa/Kconfig
new file mode 100644
index ..7a99170e6c30
--- /dev/null
+++ b/drivers/virtio/vdpa/Kconfig
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config VDPA
+   tristate
+default m
+help
+  Enable this module to support vDPA device that uses a
+  datapath which complies with virtio specifications with
+  vendor specific control path.
+
diff --git a/drivers/virtio/vdpa/Makefile b/drivers/virtio/vdpa/Makefile
new file mode 100644
index 

[PATCH V3 4/5] virtio: introduce a vDPA based transport

2020-02-19 Thread Jason Wang
This patch introduces a vDPA transport for virtio. It allows kernel
virtio drivers to drive a vDPA device that is capable of populating
the virtqueue directly.

A new virtio-vdpa driver will be registered to the vDPA bus. When a
new virtio-vdpa device is probed, it will register the device with
vdpa based config ops. This means it is a software transport between
vDPA driver and vDPA device. The transport was implemented through
bus_ops of vDPA parent.

Signed-off-by: Jason Wang 
---
 drivers/virtio/Kconfig   |  13 ++
 drivers/virtio/Makefile  |   1 +
 drivers/virtio/virtio_vdpa.c | 392 +++
 3 files changed, 406 insertions(+)
 create mode 100644 drivers/virtio/virtio_vdpa.c

diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig
index 9c4fdb64d9ac..0df3676b0f4f 100644
--- a/drivers/virtio/Kconfig
+++ b/drivers/virtio/Kconfig
@@ -43,6 +43,19 @@ config VIRTIO_PCI_LEGACY
 
  If unsure, say Y.
 
+config VIRTIO_VDPA
+   tristate "vDPA driver for virtio devices"
+select VDPA
+select VIRTIO
+   help
+ This driver provides support for virtio based paravirtual
+ device driver over vDPA bus. For this to be useful, you need
+ an appropriate vDPA device implementation that operates on a
+  physical device to allow the datapath of virtio to be
+ offloaded to hardware.
+
+ If unsure, say M.
+
 config VIRTIO_PMEM
tristate "Support for virtio pmem driver"
depends on VIRTIO
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index fdf5eacd0d0a..3407ac03fe60 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -6,4 +6,5 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o
 virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o
 obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o
 obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o
+obj-$(CONFIG_VIRTIO_VDPA) += virtio_vdpa.o
 obj-$(CONFIG_VDPA) += vdpa/
diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c
new file mode 100644
index ..077796087abf
--- /dev/null
+++ b/drivers/virtio/virtio_vdpa.c
@@ -0,0 +1,392 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VIRTIO based driver for vDPA device
+ *
+ * Copyright (c) 2020, Red Hat. All rights reserved.
+ * Author: Jason Wang 
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MOD_VERSION  "0.1"
+#define MOD_AUTHOR   "Jason Wang "
+#define MOD_DESC "vDPA bus driver for virtio devices"
+#define MOD_LICENSE  "GPL v2"
+
+struct virtio_vdpa_device {
+   struct virtio_device vdev;
+   struct vdpa_device *vdpa;
+   u64 features;
+
+   /* The lock to protect virtqueue list */
+   spinlock_t lock;
+   /* List of virtio_vdpa_vq_info */
+   struct list_head virtqueues;
+};
+
+struct virtio_vdpa_vq_info {
+   /* the actual virtqueue */
+   struct virtqueue *vq;
+
+   /* the list node for the virtqueues list */
+   struct list_head node;
+};
+
+static inline struct virtio_vdpa_device *
+to_virtio_vdpa_device(struct virtio_device *dev)
+{
+   return container_of(dev, struct virtio_vdpa_device, vdev);
+}
+
+static struct vdpa_device *vd_get_vdpa(struct virtio_device *vdev)
+{
+   return to_virtio_vdpa_device(vdev)->vdpa;
+}
+
+static void virtio_vdpa_get(struct virtio_device *vdev, unsigned offset,
+   void *buf, unsigned len)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   ops->get_config(vdpa, offset, buf, len);
+}
+
+static void virtio_vdpa_set(struct virtio_device *vdev, unsigned offset,
+   const void *buf, unsigned len)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   ops->set_config(vdpa, offset, buf, len);
+}
+
+static u32 virtio_vdpa_generation(struct virtio_device *vdev)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   if (ops->get_generation)
+   return ops->get_generation(vdpa);
+
+   return 0;
+}
+
+static u8 virtio_vdpa_get_status(struct virtio_device *vdev)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   return ops->get_status(vdpa);
+}
+
+static void virtio_vdpa_set_status(struct virtio_device *vdev, u8 status)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   return ops->set_status(vdpa, status);
+}
+
+static void virtio_vdpa_reset(struct virtio_device *vdev)
+{
+   struct vdpa_device *vdpa = vd_get_vdpa(vdev);
+   const struct vdpa_config_ops *ops = vdpa->config;
+
+   return ops->set_status(vdpa, 0);
+}
+
+static bool 

[PATCH V3 2/5] vringh: IOTLB support

2020-02-19 Thread Jason Wang
This patch implements the third memory accessor for vringh besides
the current kernel and userspace accessors. The idea is to allow vringh
to do the address translation through an IOTLB which is implemented
via the vhost_map interval tree. Users should set up an IOVA to PA
mapping in this IOTLB.

This allows us to:

- use vringh to access virtqueues with vIOMMU
- use vringh to implement software vDPA devices
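
To make the IOVA to PA lookup concrete, here is a minimal userspace
sketch. It deliberately swaps the interval tree for a sorted array with
binary search, which gives the same answers for non-overlapping ranges;
struct toy_map, struct toy_iotlb and toy_iotlb_translate() are
hypothetical names, not the interface added by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping entry: [iova_start, iova_last] -> pa_start. */
struct toy_map {
	uint64_t iova_start;
	uint64_t iova_last;	/* inclusive */
	uint64_t pa_start;
};

/* Hypothetical table: sorted by iova_start, ranges do not overlap. */
struct toy_iotlb {
	const struct toy_map *maps;
	size_t nmaps;
};

/*
 * Translate an IOVA to a physical address using binary search.
 * Returns 0 and stores the result in *pa on a hit, -1 on a miss.
 */
static int toy_iotlb_translate(const struct toy_iotlb *tlb, uint64_t iova,
			       uint64_t *pa)
{
	size_t lo = 0, hi;

	assert(tlb != NULL && pa != NULL);

	hi = tlb->nmaps;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct toy_map *m = &tlb->maps[mid];

		if (iova < m->iova_start) {
			hi = mid;
		} else if (iova > m->iova_last) {
			lo = mid + 1;
		} else {
			/* Hit: keep the offset within the range. */
			*pa = m->pa_start + (iova - m->iova_start);
			return 0;
		}
	}
	return -1;
}
```

An accessor built on such a lookup would translate each descriptor
address before touching memory and fail the access on a miss.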

Signed-off-by: Jason Wang 
---
 drivers/vhost/Kconfig.vringh |   1 +
 drivers/vhost/vringh.c   | 421 +--
 include/linux/vringh.h   |  36 +++
 3 files changed, 435 insertions(+), 23 deletions(-)

diff --git a/drivers/vhost/Kconfig.vringh b/drivers/vhost/Kconfig.vringh
index c1fe36a9b8d4..a8d4dd0cb06e 100644
--- a/drivers/vhost/Kconfig.vringh
+++ b/drivers/vhost/Kconfig.vringh
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 config VHOST_RING
tristate
+   select VHOST_IOTLB
---help---
  This option is selected by any driver which needs to access
  the host side of a virtio ring.
diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
index a0a2d74967ef..ee0491f579ac 100644
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -13,6 +13,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
 
 static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
@@ -71,9 +74,11 @@ static inline int __vringh_get_head(const struct vringh *vrh,
 }
 
 /* Copy some bytes to/from the iovec.  Returns num copied. */
-static inline ssize_t vringh_iov_xfer(struct vringh_kiov *iov,
+static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
+ struct vringh_kiov *iov,
  void *ptr, size_t len,
- int (*xfer)(void *addr, void *ptr,
+ int (*xfer)(const struct vringh *vrh,
+ void *addr, void *ptr,
  size_t len))
 {
int err, done = 0;
@@ -82,7 +87,7 @@ static inline ssize_t vringh_iov_xfer(struct vringh_kiov *iov,
size_t partlen;
 
partlen = min(iov->iov[iov->i].iov_len, len);
-   err = xfer(iov->iov[iov->i].iov_base, ptr, partlen);
+   err = xfer(vrh, iov->iov[iov->i].iov_base, ptr, partlen);
if (err)
return err;
done += partlen;
@@ -96,6 +101,7 @@ static inline ssize_t vringh_iov_xfer(struct vringh_kiov 
*iov,
/* Fix up old iov element then increment. */
iov->iov[iov->i].iov_len = iov->consumed;
iov->iov[iov->i].iov_base -= iov->consumed;
+

iov->consumed = 0;
iov->i++;
@@ -227,7 +233,8 @@ static int slow_copy(struct vringh *vrh, void *dst, const 
void *src,
  u64 addr,
  struct vringh_range *r),
 struct vringh_range *range,
-int (*copy)(void *dst, const void *src, size_t len))
+int (*copy)(const struct vringh *vrh,
+void *dst, const void *src, size_t len))
 {
size_t part, len = sizeof(struct vring_desc);
 
@@ -241,7 +248,7 @@ static int slow_copy(struct vringh *vrh, void *dst, const 
void *src,
if (!rcheck(vrh, addr, &part, range, getrange))
return -EINVAL;
 
-   err = copy(dst, src, part);
+   err = copy(vrh, dst, src, part);
if (err)
return err;
 
@@ -262,7 +269,8 @@ __vringh_iov(struct vringh *vrh, u16 i,
 struct vringh_range *)),
 bool (*getrange)(struct vringh *, u64, struct vringh_range *),
 gfp_t gfp,
-int (*copy)(void *dst, const void *src, size_t len))
+int (*copy)(const struct vringh *vrh,
+void *dst, const void *src, size_t len))
 {
int err, count = 0, up_next, desc_max;
struct vring_desc desc, *descs;
@@ -291,7 +299,7 @@ __vringh_iov(struct vringh *vrh, u16 i,
err = slow_copy(vrh, &desc, &descs[i], rcheck, getrange,
&slowrange, copy);
else
-   err = copy(&desc, &descs[i], sizeof(desc));
+   err = copy(vrh, &desc, &descs[i], sizeof(desc));
if (unlikely(err))
goto fail;
 
@@ -404,7 +412,8 @@ static inline int __vringh_complete(struct vringh *vrh,
unsigned int num_used,
int (*putu16)(const struct vringh *vrh,
  __virtio16 *p, u16 val),
-   

[PATCH V3 0/5] vDPA support

2020-02-19 Thread Jason Wang
Hi all:

This is an updated version of vDPA support in the kernel.

A vDPA device is a device that uses a datapath which complies with the
virtio specifications, with a vendor-specific control path. vDPA devices
can be either physically located on the hardware or emulated by
software. vDPA hardware devices are usually implemented through PCIe
with the following types:

- PF (Physical Function) - A single Physical Function
- VF (Virtual Function) - Device that supports single root I/O
  virtualization (SR-IOV). Its Virtual Function (VF) represents a
  virtualized instance of the device that can be assigned to different
  partitions
- ADI (Assignable Device Interface) and its equivalents - With
  technologies such as Intel Scalable IOV, a virtual device (VDEV) is
  composed by the host OS utilizing one or more ADIs, or an equivalent
  such as SF (Sub function) from Mellanox.

From a driver's perspective, depending on how and where the DMA
translation is done, vDPA devices are split into two types:

- Platform specific DMA translation - From the driver's perspective,
  the device can be used on a platform where device access to data in
  memory is limited and/or translated. An example is a PCIe vDPA device
  whose DMA requests are tagged in a bus-specific (e.g. PCIe) way. DMA
  translation and protection are done at the PCIe bus IOMMU level.
- Device specific DMA translation - The device implements DMA
  isolation and protection through its own logic. An example is a vDPA
  device which uses an on-chip IOMMU.

To hide the differences and complexity of the above vDPA device types
and IOMMU options, and to present a generic virtio device to the upper
layer, a device-agnostic framework is required.

This series introduces a software vDPA bus which abstracts the
common attributes of a vDPA device, the vDPA bus driver and the
communication method, the bus operations (vdpa_config_ops), between the
vDPA device abstraction and the vDPA bus driver. This allows multiple
types of drivers, such as virtio_vdpa and vhost_vdpa, to operate on
the bus, so that a vDPA device can be used by either kernel virtio
drivers or userspace vhost drivers:

   virtio drivers  vhost drivers
          |             |
    [virtio bus]   [vhost uAPI]
          |             |
   virtio device   vhost device
   virtio_vdpa drv vhost_vdpa drv
             \       /
            [vDPA bus]
                 |
            vDPA device
            hardware drv
                 |
            [hardware bus]
                 |
            vDPA hardware

The virtio_vdpa driver is a transport implementation for kernel virtio
drivers on top of vDPA bus operations. An alternative is to refactor
the virtio bus, which is sub-optimal: since the bus and drivers are
designed to be used by kernel subsystems, a non-trivial refactoring
would be needed, which may impact a bunch of driver and device
implementations inside the kernel. Using a new transport may greatly
simplify both the design and the changes.

The vhost_vdpa driver is a new type of vhost device which allows
userspace vhost drivers to use vDPA devices via the vhost uAPI (with a
minor extension). This helps to minimize the changes needed in existing
vhost drivers for using vDPA devices.

With the abstraction of the vDPA bus and vDPA bus operations, the
differences and complexity of the underlying hardware are hidden from
the upper layer. The vDPA bus drivers on top can use a unified
vdpa_config_ops to control different types of vDPA devices.

This series contains the bus and virtio_vdpa implementation. We are
working on the vhost part and IFCVF (vDPA driver from Intel), which
will be posted in the coming days.

Thanks

Changes from V2:

- release idr in the release function for put_device() unwind (Jason)
- don't panic when failing to register the vdpa bus (Jason)
- use unsigned int instead of int for ida (Jason)
- fix the wrong commit log in virtio_vdpa patches (Jason)
- make vdpa_sim depend on RUNTIME_TESTING_MENU (Michael)
- provide a bus release function for vDPA device (Jason)
- fix the wrong unwind when creating devices for vDPA simulator (Jason)
- move vDPA simulator to a dedicated directory (Lingshan)
- cancel the work before releasing the vDPA simulator

Changes from V1:

- drop sysfs API, leave the management interface to future development
  (Michael)
- introduce incremental DMA ops (dma_map/dma_unmap) (Michael)
- introduce dma_device and use it instead of parent device for doing
  IOMMU or DMA from bus driver (Michael, Jason, Ling Shan, Tiwei)
- accept parent device and dma device when registering a vdpa device
- coding style and compile fixes (Randy)
- using vdpa_xxx instead of xxx_vdpa (Jason)
- move vDPA accessors to header and make them static inline (Jason)
- split vdpa_register_device() into two helpers vdpa_init_device() and
  vdpa_register_device() which allows intermediate steps to be done (Jason)
- warn on invalid queue state when failing to create a virtqueue (Jason)
- make to_virtio_vdpa_device() static (Jason)
- use kmalloc/kfree instead of 

[PATCH V3 1/5] vhost: factor out IOTLB

2020-02-19 Thread Jason Wang
This patch factors out IOTLB into a dedicated module so that it can be
reused by other modules like vringh. Users may enable automatic retiring
of entries by specifying the VHOST_IOTLB_FLAG_RETIRE flag, which fits the
vhost device IOTLB implementation.
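
As a rough illustration of what "automatic retiring" could mean for a bounded
translation table, here is a minimal userspace sketch. It assumes that, when
the table is full and the retire flag is set, the oldest entry is dropped to
make room, and that without the flag the add simply fails. All names
(demo_iotlb, DEMO_IOTLB_FLAG_RETIRE, demo_iotlb_add, ...) are hypothetical;
the real module uses an interval tree and its own API.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define DEMO_IOTLB_FLAG_RETIRE 0x1	/* hypothetical analogue of VHOST_IOTLB_FLAG_RETIRE */
#define DEMO_IOTLB_MAX 4		/* storage size of this toy table */

struct demo_map {
	uint64_t start, last;	/* inclusive IOVA range */
	uint64_t addr;		/* translated address of 'start' */
};

struct demo_iotlb {
	struct demo_map maps[DEMO_IOTLB_MAX];	/* kept in insertion order */
	unsigned int nmaps;
	unsigned int limit;
	unsigned int flags;
};

static void demo_iotlb_init(struct demo_iotlb *t, unsigned int limit,
			    unsigned int flags)
{
	memset(t, 0, sizeof(*t));
	t->limit = limit > DEMO_IOTLB_MAX ? DEMO_IOTLB_MAX : limit;
	t->flags = flags;
}

static int demo_iotlb_add(struct demo_iotlb *t, uint64_t start, uint64_t last,
			  uint64_t addr)
{
	if (last < start)
		return -EINVAL;
	if (t->limit == 0)
		return -ENOSPC;

	if (t->nmaps == t->limit) {
		if (!(t->flags & DEMO_IOTLB_FLAG_RETIRE))
			return -ENOSPC;
		/* Retire the oldest mapping by shifting the rest down. */
		memmove(&t->maps[0], &t->maps[1],
			(t->nmaps - 1) * sizeof(t->maps[0]));
		t->nmaps--;
	}

	t->maps[t->nmaps].start = start;
	t->maps[t->nmaps].last = last;
	t->maps[t->nmaps].addr = addr;
	t->nmaps++;
	return 0;
}

/* Linear lookup; returns NULL when no mapping covers 'iova'. */
static const struct demo_map *demo_iotlb_lookup(const struct demo_iotlb *t,
						uint64_t iova)
{
	for (unsigned int i = 0; i < t->nmaps; i++)
		if (iova >= t->maps[i].start && iova <= t->maps[i].last)
			return &t->maps[i];
	return NULL;
}
```

A device-side IOTLB that behaves like a cache would plausibly want the retire
behaviour, while a user that needs every mapping to stay valid would leave the
flag clear and handle the error instead.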

Signed-off-by: Jason Wang 
---
 MAINTAINERS |   1 +
 drivers/vhost/Kconfig   |   7 ++
 drivers/vhost/Makefile  |   2 +
 drivers/vhost/net.c |   2 +-
 drivers/vhost/vhost.c   | 221 +++-
 drivers/vhost/vhost.h   |  36 ++
 drivers/vhost/vhost_iotlb.c | 171 
 include/linux/vhost_iotlb.h |  45 
 8 files changed, 304 insertions(+), 181 deletions(-)
 create mode 100644 drivers/vhost/vhost_iotlb.c
 create mode 100644 include/linux/vhost_iotlb.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c74e4ea714a5..0fb645b5a7df 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17768,6 +17768,7 @@ T:  git 
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
 S: Maintained
 F: drivers/vhost/
 F: include/uapi/linux/vhost.h
+F: include/linux/vhost_iotlb.h
 
 VIRTIO INPUT DRIVER
 M: Gerd Hoffmann 
diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index 3d03ccbd1adc..eef634ff9a6e 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -36,6 +36,7 @@ config VHOST_VSOCK
 
 config VHOST
tristate
+   select VHOST_IOTLB
---help---
  This option is selected by any driver which needs to access
  the core of vhost.
@@ -54,3 +55,9 @@ config VHOST_CROSS_ENDIAN_LEGACY
  adds some overhead, it is disabled by default.
 
  If unsure, say "N".
+
+config VHOST_IOTLB
+   tristate
+   default m
+   help
+ Generic IOTLB implementation for vhost and vringh.
diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
index 6c6df24f770c..df99756fbb26 100644
--- a/drivers/vhost/Makefile
+++ b/drivers/vhost/Makefile
@@ -11,3 +11,5 @@ vhost_vsock-y := vsock.o
 obj-$(CONFIG_VHOST_RING) += vringh.o
 
 obj-$(CONFIG_VHOST)+= vhost.o
+
+obj-$(CONFIG_VHOST_IOTLB) += vhost_iotlb.o
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index e158159671fa..e4a20d7a2921 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1594,7 +1594,7 @@ static long vhost_net_reset_owner(struct vhost_net *n)
struct socket *tx_sock = NULL;
struct socket *rx_sock = NULL;
long err;
-   struct vhost_umem *umem;
+   struct vhost_iotlb *umem;
 
mutex_lock(&n->dev.mutex);
err = vhost_dev_check_owner(&n->dev);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index f44340b41494..9059b95cac83 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -50,10 +50,6 @@ enum {
 #define vhost_used_event(vq) ((__virtio16 __user *)&vq->avail->ring[vq->num])
 #define vhost_avail_event(vq) ((__virtio16 __user *)&vq->used->ring[vq->num])
 
-INTERVAL_TREE_DEFINE(struct vhost_umem_node,
-rb, __u64, __subtree_last,
-START, LAST, static inline, vhost_umem_interval_tree);
-
 #ifdef CONFIG_VHOST_CROSS_ENDIAN_LEGACY
 static void vhost_disable_cross_endian(struct vhost_virtqueue *vq)
 {
@@ -581,21 +577,25 @@ long vhost_dev_set_owner(struct vhost_dev *dev)
 }
 EXPORT_SYMBOL_GPL(vhost_dev_set_owner);
 
-struct vhost_umem *vhost_dev_reset_owner_prepare(void)
+static struct vhost_iotlb *iotlb_alloc(void)
+{
+   return vhost_iotlb_alloc(max_iotlb_entries,
+VHOST_IOTLB_FLAG_RETIRE);
+}
+
+struct vhost_iotlb *vhost_dev_reset_owner_prepare(void)
 {
-   return kvzalloc(sizeof(struct vhost_umem), GFP_KERNEL);
+   return iotlb_alloc();
 }
 EXPORT_SYMBOL_GPL(vhost_dev_reset_owner_prepare);
 
 /* Caller should have device mutex */
-void vhost_dev_reset_owner(struct vhost_dev *dev, struct vhost_umem *umem)
+void vhost_dev_reset_owner(struct vhost_dev *dev, struct vhost_iotlb *umem)
 {
int i;
 
vhost_dev_cleanup(dev);
 
-   /* Restore memory to default empty mapping. */
-   INIT_LIST_HEAD(&umem->umem_list);
dev->umem = umem;
/* We don't need VQ locks below since vhost_dev_cleanup makes sure
 * VQs aren't running.
@@ -618,28 +618,6 @@ void vhost_dev_stop(struct vhost_dev *dev)
 }
 EXPORT_SYMBOL_GPL(vhost_dev_stop);
 
-static void vhost_umem_free(struct vhost_umem *umem,
-   struct vhost_umem_node *node)
-{
-   vhost_umem_interval_tree_remove(node, &umem->umem_tree);
-   list_del(&node->link);
-   kfree(node);
-   umem->numem--;
-}
-
-static void vhost_umem_clean(struct vhost_umem *umem)
-{
-   struct vhost_umem_node *node, *tmp;
-
-   if (!umem)
-   return;
-
-   list_for_each_entry_safe(node, tmp, &umem->umem_list, link)
-   vhost_umem_free(umem, node);
-
-   kvfree(umem);
-}
-
 static void vhost_clear_msg(struct vhost_dev *dev)
 {
struct vhost_msg_node *node, *n;
@@ -677,9 +655,9 @@ 

Re: [PATCH] vhost: introduce vDPA based backend

2020-02-19 Thread Tiwei Bie
On Wed, Feb 19, 2020 at 09:11:02AM -0400, Jason Gunthorpe wrote:
> On Wed, Feb 19, 2020 at 10:52:38AM +0800, Tiwei Bie wrote:
> > > > +static int __init vhost_vdpa_init(void)
> > > > +{
> > > > +   int r;
> > > > +
> > > > +   idr_init(&vhost_vdpa.idr);
> > > > +   mutex_init(&vhost_vdpa.mutex);
> > > > +   init_waitqueue_head(&vhost_vdpa.release_q);
> > > > +
> > > > +   /* /dev/vhost-vdpa/$vdpa_device_index */
> > > > +   vhost_vdpa.class = class_create(THIS_MODULE, "vhost-vdpa");
> > > > +   if (IS_ERR(vhost_vdpa.class)) {
> > > > +   r = PTR_ERR(vhost_vdpa.class);
> > > > +   goto err_class;
> > > > +   }
> > > > +
> > > > +   vhost_vdpa.class->devnode = vhost_vdpa_devnode;
> > > > +
> > > > +   r = alloc_chrdev_region(&vhost_vdpa.devt, 0, MINORMASK + 1,
> > > > +   "vhost-vdpa");
> > > > +   if (r)
> > > > +   goto err_alloc_chrdev;
> > > > +
> > > > +   cdev_init(&vhost_vdpa.cdev, &vhost_vdpa_fops);
> > > > +   r = cdev_add(&vhost_vdpa.cdev, vhost_vdpa.devt, MINORMASK + 1);
> > > > +   if (r)
> > > > +   goto err_cdev_add;
> > > 
> > > It is very strange, is the intention to create a single global char
> > > dev?
> > 
> > No. It's to create a per-vdpa char dev named
> > vhost-vdpa/$vdpa_device_index under /dev.
> > 
> > I followed the code in VFIO which creates char dev
> > vfio/$GROUP dynamically, e.g.:
> > 
> > https://github.com/torvalds/linux/blob/b1da3acc781c/drivers/vfio/vfio.c#L2164-L2180
> > https://github.com/torvalds/linux/blob/b1da3acc781c/drivers/vfio/vfio.c#L373-L387
> > https://github.com/torvalds/linux/blob/b1da3acc781c/drivers/vfio/vfio.c#L1553
> > 
> > Is it something unwanted?
> 
> Yes it is unwanted. This is some special pattern for vfio's unique
> needs. 
> 
> Since this has a struct device for each char dev instance please use
> the normal cdev_device_add() driven pattern here, or justify why it
> needs to be special like this.

I see. Thanks! I will embed the cdev in each vhost_vdpa
structure directly.

Regards,
Tiwei

> 
> Jason
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH] virtio_balloon: Fix build error seen with CONFIG_BALLOON_COMPACTION=n

2020-02-19 Thread Guenter Roeck
0day reports:

drivers//virtio/virtio_balloon.c: In function 'virtballoon_probe':
drivers//virtio/virtio_balloon.c:960:1: error:
label 'out_del_vqs' defined but not used [-Werror=unused-label]

This is seen with CONFIG_BALLOON_COMPACTION=n.

Reported-by: kbuild test robot 
Fixes: 1ad6f58ea936 ("virtio_balloon: Fix memory leaks on errors in virtballoon_probe()")
Cc: David Hildenbrand 
Cc: Michael S. Tsirkin 
Signed-off-by: Guenter Roeck 
---
 drivers/virtio/virtio_balloon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 7bfe365d9372..341458fd95ca 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -959,8 +959,8 @@ static int virtballoon_probe(struct virtio_device *vdev)
iput(vb->vb_dev_info.inode);
 out_kern_unmount:
kern_unmount(balloon_mnt);
-#endif
 out_del_vqs:
+#endif
vdev->config->del_vqs(vdev);
 out_free_vb:
kfree(vb);
-- 
2.17.1
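
The effect of moving the #endif can be reproduced outside the kernel with a
small goto-unwind function. This is a sketch under assumptions: DEMO_COMPACTION
is a hypothetical macro standing in for CONFIG_BALLOON_COMPACTION, and the
steps and cleanup flags are invented for illustration. The only jump to
out_del_vqs lives inside the #ifdef, so the label has to be inside it as well,
otherwise a build without the macro sees a label nobody uses.

```c
/* Cleanup steps recorded as bits so the unwind path can be inspected. */
enum {
	CLEAN_UNMOUNT = 1,
	CLEAN_DEL_VQS = 2,
	CLEAN_FREE = 4,
};

/*
 * fail_step selects which setup step "fails" (0 means success).
 * Build with -DDEMO_COMPACTION to mimic CONFIG_BALLOON_COMPACTION=y.
 */
static int demo_probe(int fail_step)
{
	int cleaned = 0;

	if (fail_step == 1)		/* virtqueue setup failed */
		goto out_free_vb;
#ifdef DEMO_COMPACTION
	if (fail_step == 2)		/* the only user of out_del_vqs */
		goto out_del_vqs;
	if (fail_step == 3)
		goto out_kern_unmount;
#endif
	return 0;

#ifdef DEMO_COMPACTION
out_kern_unmount:
	cleaned |= CLEAN_UNMOUNT;
out_del_vqs:
#endif
	/* Statement stays outside the #ifdef; only the label is conditional. */
	cleaned |= CLEAN_DEL_VQS;
out_free_vb:
	cleaned |= CLEAN_FREE;
	return cleaned;
}
```

With the #endif placed before out_del_vqs instead, compiling without
DEMO_COMPACTION and with -Werror=unused-label should fail in the same way as
the 0day report.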



Re: [PATCH 0/2] virtio-blk: improve handling of DMA mapping failures

2020-02-19 Thread Stefan Hajnoczi
On Thu, Feb 13, 2020 at 01:37:26PM +0100, Halil Pasic wrote:
> Two patches are handling new edge cases introduced by doing DMA mappings
> (which can fail) in virtio core.
> 
> I stumbled upon this while stress testing I/O for Protected Virtual
> Machines. I deliberately chose a tiny swiotlb size and have generated
> load with fio. With more than one virtio-blk disk in use I experienced
> hangs.
> 
> The goal of this series is to fix those hangs.
> 
> Halil Pasic (2):
>   virtio-blk: fix hw_queue stopped on arbitrary error
>   virtio-blk: improve virtqueue error to BLK_STS
> 
>  drivers/block/virtio_blk.c | 17 -
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> 
> base-commit: 39bed42de2e7d74686a2d5a45638d6a5d7e7d473
> -- 
> 2.17.1
> 

Reviewed-by: Stefan Hajnoczi 
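
The two patch titles suggest separating "the virtqueue is full" from other
submission failures. A minimal userspace sketch of such a classification is
shown below. The enum values, the struct, and the exact errno-to-status mapping
are assumptions made for illustration and are not copied from the patches; the
point is only that a full ring is the one case where stopping the queue is
safe, because a completion is guaranteed to restart it.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for block-layer status codes. */
enum demo_blk_status {
	DEMO_STS_OK,
	DEMO_STS_DEV_RESOURCE,	/* device-side shortage, retry after a completion */
	DEMO_STS_RESOURCE,	/* system-wide shortage, retry later */
	DEMO_STS_IOERR,		/* give up on this request */
};

struct demo_verdict {
	enum demo_blk_status status;
	bool stop_queue;	/* true only when a completion will restart us */
};

/* Classify the (assumed) return value of adding a request to the ring. */
static struct demo_verdict demo_classify_add_error(int err)
{
	struct demo_verdict v = { DEMO_STS_OK, false };

	switch (err) {
	case 0:
		break;
	case -ENOSPC:
		/* Ring full: in-flight requests exist, so stopping is safe. */
		v.status = DEMO_STS_DEV_RESOURCE;
		v.stop_queue = true;
		break;
	case -ENOMEM:
		/* e.g. DMA mapping failed; the ring may be empty, do not stop. */
		v.status = DEMO_STS_RESOURCE;
		break;
	default:
		v.status = DEMO_STS_IOERR;
		break;
	}
	return v;
}
```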



[RESEND PATCH v2 9/9] ath5k: Constify ioreadX() iomem argument (as in generic implementation)

2020-02-19 Thread Krzysztof Kozlowski
The ioreadX() helpers have an inconsistent interface.  On some architectures
the void __iomem * address argument is a pointer to const, on others it is not.

Implementations of ioreadX() do not modify the memory under the address
so they can be converted to a "const" version for const-safety and
consistency among architectures.

Signed-off-by: Krzysztof Kozlowski 
Acked-by: Kalle Valo 
---
 drivers/net/wireless/ath/ath5k/ahb.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/wireless/ath/ath5k/ahb.c 
b/drivers/net/wireless/ath/ath5k/ahb.c
index 2c9cec8b53d9..8bd01df369fb 100644
--- a/drivers/net/wireless/ath/ath5k/ahb.c
+++ b/drivers/net/wireless/ath/ath5k/ahb.c
@@ -138,18 +138,18 @@ static int ath_ahb_probe(struct platform_device *pdev)
 
if (bcfg->devid >= AR5K_SREV_AR2315_R6) {
/* Enable WMAC AHB arbitration */
-   reg = ioread32((void __iomem *) AR5K_AR2315_AHB_ARB_CTL);
+   reg = ioread32((const void __iomem *) AR5K_AR2315_AHB_ARB_CTL);
reg |= AR5K_AR2315_AHB_ARB_CTL_WLAN;
iowrite32(reg, (void __iomem *) AR5K_AR2315_AHB_ARB_CTL);
 
/* Enable global WMAC swapping */
-   reg = ioread32((void __iomem *) AR5K_AR2315_BYTESWAP);
+   reg = ioread32((const void __iomem *) AR5K_AR2315_BYTESWAP);
reg |= AR5K_AR2315_BYTESWAP_WMAC;
iowrite32(reg, (void __iomem *) AR5K_AR2315_BYTESWAP);
} else {
/* Enable WMAC DMA access (assuming 5312 or 231x*/
/* TODO: check other platforms */
-   reg = ioread32((void __iomem *) AR5K_AR5312_ENABLE);
+   reg = ioread32((const void __iomem *) AR5K_AR5312_ENABLE);
if (to_platform_device(ah->dev)->id == 0)
reg |= AR5K_AR5312_ENABLE_WLAN0;
else
@@ -202,12 +202,12 @@ static int ath_ahb_remove(struct platform_device *pdev)
 
if (bcfg->devid >= AR5K_SREV_AR2315_R6) {
/* Disable WMAC AHB arbitration */
-   reg = ioread32((void __iomem *) AR5K_AR2315_AHB_ARB_CTL);
+   reg = ioread32((const void __iomem *) AR5K_AR2315_AHB_ARB_CTL);
reg &= ~AR5K_AR2315_AHB_ARB_CTL_WLAN;
iowrite32(reg, (void __iomem *) AR5K_AR2315_AHB_ARB_CTL);
} else {
/*Stop DMA access */
-   reg = ioread32((void __iomem *) AR5K_AR5312_ENABLE);
+   reg = ioread32((const void __iomem *) AR5K_AR5312_ENABLE);
if (to_platform_device(ah->dev)->id == 0)
reg &= ~AR5K_AR5312_ENABLE_WLAN0;
else
-- 
2.17.1



[RESEND PATCH v2 7/9] drm/nouveau: Constify ioreadX() iomem argument (as in generic implementation)

2020-02-19 Thread Krzysztof Kozlowski
The ioreadX() helpers have an inconsistent interface.  On some architectures
the void __iomem * address argument is a pointer to const, on others it is not.

Implementations of ioreadX() do not modify the memory under the address
so they can be converted to a "const" version for const-safety and
consistency among architectures.

Signed-off-by: Krzysztof Kozlowski 
---
 drivers/gpu/drm/nouveau/nouveau_bo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c 
b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 1b62ccc57aef..d95bdd65dbca 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -613,7 +613,7 @@ nouveau_bo_rd32(struct nouveau_bo *nvbo, unsigned index)
mem += index;
 
if (is_iomem)
-   return ioread32_native((void __force __iomem *)mem);
+   return ioread32_native((const void __force __iomem *)mem);
else
return *mem;
 }
-- 
2.17.1



[RESEND PATCH v2 8/9] media: fsl-viu: Constify ioreadX() iomem argument (as in generic implementation)

2020-02-19 Thread Krzysztof Kozlowski
The ioreadX() helpers have an inconsistent interface.  On some architectures
the void __iomem * address argument is a pointer to const, on others it is not.

Implementations of ioreadX() do not modify the memory under the address
so they can be converted to a "const" version for const-safety and
consistency among architectures.

Signed-off-by: Krzysztof Kozlowski 
---
 drivers/media/platform/fsl-viu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/platform/fsl-viu.c b/drivers/media/platform/fsl-viu.c
index 81a8faedbba6..991d9dc82749 100644
--- a/drivers/media/platform/fsl-viu.c
+++ b/drivers/media/platform/fsl-viu.c
@@ -34,7 +34,7 @@
 /* Allow building this driver with COMPILE_TEST */
 #if !defined(CONFIG_PPC) && !defined(CONFIG_MICROBLAZE)
 #define out_be32(v, a) iowrite32be(a, (void __iomem *)v)
-#define in_be32(a) ioread32be((void __iomem *)a)
+#define in_be32(a) ioread32be((const void __iomem *)a)
 #endif
 
 #define BUFFER_TIMEOUT msecs_to_jiffies(500)  /* 0.5 seconds */
-- 
2.17.1



[RESEND PATCH v2 5/9] arc: Constify ioreadX() iomem argument (as in generic implementation)

2020-02-19 Thread Krzysztof Kozlowski
The ioreadX() helpers have an inconsistent interface.  On some architectures
the void __iomem * address argument is a pointer to const, on others it is not.

Implementations of ioreadX() do not modify the memory under the
address so they can be converted to a "const" version for const-safety
and consistency among architectures.

Signed-off-by: Krzysztof Kozlowski 
Acked-by: Alexey Brodkin 

---

Changes since v1:
1. Add Alexey's ack.
---
 arch/arc/plat-axs10x/axs10x.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arc/plat-axs10x/axs10x.c b/arch/arc/plat-axs10x/axs10x.c
index 63ea5a606ecd..180c260a8221 100644
--- a/arch/arc/plat-axs10x/axs10x.c
+++ b/arch/arc/plat-axs10x/axs10x.c
@@ -84,7 +84,7 @@ static void __init axs10x_print_board_ver(unsigned int creg, 
const char *str)
unsigned int val;
} board;
 
-   board.val = ioread32((void __iomem *)creg);
+   board.val = ioread32((const void __iomem *)creg);
pr_info("AXS: %s FPGA Date: %u-%u-%u\n", str, board.d, board.m,
board.y);
 }
@@ -95,7 +95,7 @@ static void __init axs10x_early_init(void)
char mb[32];
 
/* Determine motherboard version */
-   if (ioread32((void __iomem *) CREG_MB_CONFIG) & (1 << 28))
+   if (ioread32((const void __iomem *) CREG_MB_CONFIG) & (1 << 28))
mb_rev = 3; /* HT-3 (rev3.0) */
else
mb_rev = 2; /* HT-2 (rev2.0) */
-- 
2.17.1



[RESEND PATCH v2 3/9] ntb: intel: Constify ioreadX() iomem argument (as in generic implementation)

2020-02-19 Thread Krzysztof Kozlowski
The ioreadX() helpers have an inconsistent interface.  On some architectures
the void __iomem * address argument is a pointer to const, on others it is not.

Implementations of ioreadX() do not modify the memory under the address
so they can be converted to a "const" version for const-safety and
consistency among architectures.

Signed-off-by: Krzysztof Kozlowski 
Reviewed-by: Geert Uytterhoeven 
Acked-by: Dave Jiang 

---

Changes since v1:
1. Add Geert's review.
2. Add Dave's ack.
---
 drivers/ntb/hw/intel/ntb_hw_gen1.c  | 2 +-
 drivers/ntb/hw/intel/ntb_hw_gen3.h  | 2 +-
 drivers/ntb/hw/intel/ntb_hw_intel.h | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/ntb/hw/intel/ntb_hw_gen1.c 
b/drivers/ntb/hw/intel/ntb_hw_gen1.c
index bb57ec239029..9202502a9787 100644
--- a/drivers/ntb/hw/intel/ntb_hw_gen1.c
+++ b/drivers/ntb/hw/intel/ntb_hw_gen1.c
@@ -1202,7 +1202,7 @@ int intel_ntb_peer_spad_write(struct ntb_dev *ntb, int 
pidx, int sidx,
   ndev->peer_reg->spad);
 }
 
-static u64 xeon_db_ioread(void __iomem *mmio)
+static u64 xeon_db_ioread(const void __iomem *mmio)
 {
return (u64)ioread16(mmio);
 }
diff --git a/drivers/ntb/hw/intel/ntb_hw_gen3.h 
b/drivers/ntb/hw/intel/ntb_hw_gen3.h
index 75fb86ca27bb..d1455f24ec99 100644
--- a/drivers/ntb/hw/intel/ntb_hw_gen3.h
+++ b/drivers/ntb/hw/intel/ntb_hw_gen3.h
@@ -91,7 +91,7 @@
 #define GEN3_DB_TOTAL_SHIFT33
 #define GEN3_SPAD_COUNT16
 
-static inline u64 gen3_db_ioread(void __iomem *mmio)
+static inline u64 gen3_db_ioread(const void __iomem *mmio)
 {
return ioread64(mmio);
 }
diff --git a/drivers/ntb/hw/intel/ntb_hw_intel.h 
b/drivers/ntb/hw/intel/ntb_hw_intel.h
index e071e28bca3f..3c0a5a2da241 100644
--- a/drivers/ntb/hw/intel/ntb_hw_intel.h
+++ b/drivers/ntb/hw/intel/ntb_hw_intel.h
@@ -102,7 +102,7 @@ struct intel_ntb_dev;
 struct intel_ntb_reg {
int (*poll_link)(struct intel_ntb_dev *ndev);
int (*link_is_up)(struct intel_ntb_dev *ndev);
-   u64 (*db_ioread)(void __iomem *mmio);
+   u64 (*db_ioread)(const void __iomem *mmio);
void (*db_iowrite)(u64 db_bits, void __iomem *mmio);
unsigned long   ntb_ctl;
resource_size_t db_size;
-- 
2.17.1



[RESEND PATCH v2 6/9] drm/mgag200: Constify ioreadX() iomem argument (as in generic implementation)

2020-02-19 Thread Krzysztof Kozlowski
The ioreadX() helpers have an inconsistent interface.  On some architectures
the void __iomem * address argument is a pointer to const, on others it is not.

Implementations of ioreadX() do not modify the memory under the address
so they can be converted to a "const" version for const-safety and
consistency among architectures.

Signed-off-by: Krzysztof Kozlowski 
Reviewed-by: Thomas Zimmermann 

---

Changes since v1:
1. Add Thomas' review.
---
 drivers/gpu/drm/mgag200/mgag200_drv.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h 
b/drivers/gpu/drm/mgag200/mgag200_drv.h
index aa32aad222c2..6512b3af4fb7 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.h
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.h
@@ -34,9 +34,9 @@
 
 #define MGAG200FB_CONN_LIMIT 1
 
-#define RREG8(reg) ioread8(((void __iomem *)mdev->rmmio) + (reg))
+#define RREG8(reg) ioread8(((const void __iomem *)mdev->rmmio) + (reg))
 #define WREG8(reg, v) iowrite8(v, ((void __iomem *)mdev->rmmio) + (reg))
-#define RREG32(reg) ioread32(((void __iomem *)mdev->rmmio) + (reg))
+#define RREG32(reg) ioread32(((const void __iomem *)mdev->rmmio) + (reg))
 #define WREG32(reg, v) iowrite32(v, ((void __iomem *)mdev->rmmio) + (reg))
 
 #define ATTR_INDEX 0x1fc0
-- 
2.17.1



[RESEND PATCH v2 2/9] rtl818x: Constify ioreadX() iomem argument (as in generic implementation)

2020-02-19 Thread Krzysztof Kozlowski
The ioreadX() helpers have an inconsistent interface.  On some architectures
the void __iomem * address argument is a pointer to const, on others it is not.

Implementations of ioreadX() do not modify the memory under the address
so they can be converted to a "const" version for const-safety and
consistency among architectures.

Signed-off-by: Krzysztof Kozlowski 
Reviewed-by: Geert Uytterhoeven 
Acked-by: Kalle Valo 

---

Changes since v1:
1. Add Geert's review.
2. Add Kalle's ack. Fix subject prefix.
---
 drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h 
b/drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h
index 7948a2da195a..2ff00800d45b 100644
--- a/drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h
+++ b/drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h
@@ -150,17 +150,17 @@ void rtl8180_write_phy(struct ieee80211_hw *dev, u8 addr, 
u32 data);
 void rtl8180_set_anaparam(struct rtl8180_priv *priv, u32 anaparam);
 void rtl8180_set_anaparam2(struct rtl8180_priv *priv, u32 anaparam2);
 
-static inline u8 rtl818x_ioread8(struct rtl8180_priv *priv, u8 __iomem *addr)
+static inline u8 rtl818x_ioread8(struct rtl8180_priv *priv, const u8 __iomem *addr)
 {
return ioread8(addr);
 }
 
-static inline u16 rtl818x_ioread16(struct rtl8180_priv *priv, __le16 __iomem *addr)
+static inline u16 rtl818x_ioread16(struct rtl8180_priv *priv, const __le16 __iomem *addr)
 {
return ioread16(addr);
 }
 
-static inline u32 rtl818x_ioread32(struct rtl8180_priv *priv, __le32 __iomem *addr)
+static inline u32 rtl818x_ioread32(struct rtl8180_priv *priv, const __le32 __iomem *addr)
 {
return ioread32(addr);
 }
-- 
2.17.1



[RESEND PATCH v2 4/9] virtio: pci: Constify ioreadX() iomem argument (as in generic implementation)

2020-02-19 Thread Krzysztof Kozlowski
The ioreadX() helpers have an inconsistent interface.  On some architectures
the void __iomem * address argument is a pointer to const, on others it is not.

Implementations of ioreadX() do not modify the memory under the address
so they can be converted to a "const" version for const-safety and
consistency among architectures.

Signed-off-by: Krzysztof Kozlowski 
Reviewed-by: Geert Uytterhoeven 

---

Changes since v1:
1. Add Geert's review.
---
 drivers/virtio/virtio_pci_modern.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_pci_modern.c 
b/drivers/virtio/virtio_pci_modern.c
index 7abcc50838b8..fc58db4ab6c3 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -26,16 +26,16 @@
  * method, i.e. 32-bit accesses for 32-bit fields, 16-bit accesses
  * for 16-bit fields and 8-bit accesses for 8-bit fields.
  */
-static inline u8 vp_ioread8(u8 __iomem *addr)
+static inline u8 vp_ioread8(const u8 __iomem *addr)
 {
return ioread8(addr);
 }
-static inline u16 vp_ioread16 (__le16 __iomem *addr)
+static inline u16 vp_ioread16 (const __le16 __iomem *addr)
 {
return ioread16(addr);
 }
 
-static inline u32 vp_ioread32(__le32 __iomem *addr)
+static inline u32 vp_ioread32(const __le32 __iomem *addr)
 {
return ioread32(addr);
 }
-- 
2.17.1



[RESEND PATCH v2 0/9] iomap: Constify ioreadX() iomem argument

2020-02-19 Thread Krzysztof Kozlowski
Hi,


Changes since v1
===
https://lore.kernel.org/lkml/1578415992-24054-1-git-send-email-k...@kernel.org/
1. Constify also ioreadX_rep() and mmio_insX(),
2. Squash lib+alpha+powerpc+parisc+sh into one patch for bisectability,
3. Add acks and reviews,
4. Re-order patches so all optional driver changes are at the end.


Description
===
The ioread8/16/32() helpers and others have an inconsistent interface among
the architectures: some take the address as a pointer to const, some do not.

It seems there is nothing really stopping all of them from taking a
pointer to const.

The patchset was only compile-tested on the affected architectures.  No
runtime testing was done.
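
Why the qualifier matters to callers can be shown with a small userspace
sketch. The names demo_ioread32(), demo_dev, and demo_read_status() are
hypothetical, and a plain pointer stands in for the __iomem annotation, which
only exists for sparse inside the kernel. A driver that keeps its register
window behind a pointer to const can call a reader that takes "const void *"
directly, whereas a reader declared with "void *" would force a cast or a
discarded-qualifiers warning.

```c
#include <stdint.h>

/*
 * Hypothetical reader in the spirit of ioread32(): it never writes through
 * 'addr', so the parameter can be a pointer to const.
 */
static uint32_t demo_ioread32(const void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* A driver that treats its register window as read-only. */
struct demo_dev {
	const void *regs;
};

static uint32_t demo_read_status(const struct demo_dev *dev)
{
	/*
	 * With a prototype of 'uint32_t demo_ioread32(void *addr)', this call
	 * would need a cast to drop the const qualifier.
	 */
	return demo_ioread32(dev->regs);
}
```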


volatile
===
There is still an interface inconsistency between architectures around
the "volatile" qualifier:
 - include/asm-generic/io.h:static inline u32 ioread32(const volatile void __iomem *addr)
 - include/asm-generic/iomap.h:extern unsigned int ioread32(const void __iomem *);

This is still being discussed and is out of scope for this patchset.


Merging
===
Multiple architectures are affected by the first patch, so acks are welcome.

1. All patches depend on the first patch,
2. Patches 2-4 unify the interface also in a few drivers,
3. Patches 5-9 are optional cleanup, without actual impact.


Best regards,
Krzysztof


Krzysztof Kozlowski (9):
  iomap: Constify ioreadX() iomem argument (as in generic
implementation)
  rtl818x: Constify ioreadX() iomem argument (as in generic
implementation)
  ntb: intel: Constify ioreadX() iomem argument (as in generic
implementation)
  virtio: pci: Constify ioreadX() iomem argument (as in generic
implementation)
  arc: Constify ioreadX() iomem argument (as in generic implementation)
  drm/mgag200: Constify ioreadX() iomem argument (as in generic
implementation)
  drm/nouveau: Constify ioreadX() iomem argument (as in generic
implementation)
  media: fsl-viu: Constify ioreadX() iomem argument (as in generic
implementation)
  ath5k: Constify ioreadX() iomem argument (as in generic
implementation)

 arch/alpha/include/asm/core_apecs.h   |  6 +-
 arch/alpha/include/asm/core_cia.h |  6 +-
 arch/alpha/include/asm/core_lca.h |  6 +-
 arch/alpha/include/asm/core_marvel.h  |  4 +-
 arch/alpha/include/asm/core_mcpcia.h  |  6 +-
 arch/alpha/include/asm/core_t2.h  |  2 +-
 arch/alpha/include/asm/io.h   | 12 ++--
 arch/alpha/include/asm/io_trivial.h   | 16 ++---
 arch/alpha/include/asm/jensen.h   |  2 +-
 arch/alpha/include/asm/machvec.h  |  6 +-
 arch/alpha/kernel/core_marvel.c   |  2 +-
 arch/alpha/kernel/io.c| 12 ++--
 arch/arc/plat-axs10x/axs10x.c |  4 +-
 arch/parisc/include/asm/io.h  |  4 +-
 arch/parisc/lib/iomap.c   | 72 +--
 arch/powerpc/kernel/iomap.c   | 28 
 arch/sh/kernel/iomap.c| 22 +++---
 drivers/gpu/drm/mgag200/mgag200_drv.h |  4 +-
 drivers/gpu/drm/nouveau/nouveau_bo.c  |  2 +-
 drivers/media/platform/fsl-viu.c  |  2 +-
 drivers/net/wireless/ath/ath5k/ahb.c  | 10 +--
 .../realtek/rtl818x/rtl8180/rtl8180.h |  6 +-
 drivers/ntb/hw/intel/ntb_hw_gen1.c|  2 +-
 drivers/ntb/hw/intel/ntb_hw_gen3.h|  2 +-
 drivers/ntb/hw/intel/ntb_hw_intel.h   |  2 +-
 drivers/virtio/virtio_pci_modern.c|  6 +-
 include/asm-generic/iomap.h   | 28 
 include/linux/io-64-nonatomic-hi-lo.h |  4 +-
 include/linux/io-64-nonatomic-lo-hi.h |  4 +-
 lib/iomap.c   | 30 
 30 files changed, 156 insertions(+), 156 deletions(-)

-- 
2.17.1



[RESEND PATCH v2 1/9] iomap: Constify ioreadX() iomem argument (as in generic implementation)

2020-02-19 Thread Krzysztof Kozlowski
The ioreadX() and ioreadX_rep() helpers have an inconsistent interface.  On
some architectures the void __iomem * address argument is a pointer to const,
on others it is not.

Implementations of ioreadX() do not modify the memory under the address
so they can be converted to a "const" version for const-safety and
consistency among architectures.

Suggested-by: Geert Uytterhoeven 
Signed-off-by: Krzysztof Kozlowski 
Reviewed-by: Geert Uytterhoeven 
Reviewed-by: Arnd Bergmann 

---

Changes since v1:
1. Constify also ioreadX_rep() and mmio_insX(),
2. Squash lib+alpha+powerpc+parisc+sh into one patch for bisectability,
3. Add Geert's review.
4. Add Arnd's review.
---
 arch/alpha/include/asm/core_apecs.h   |  6 +--
 arch/alpha/include/asm/core_cia.h |  6 +--
 arch/alpha/include/asm/core_lca.h |  6 +--
 arch/alpha/include/asm/core_marvel.h  |  4 +-
 arch/alpha/include/asm/core_mcpcia.h  |  6 +--
 arch/alpha/include/asm/core_t2.h  |  2 +-
 arch/alpha/include/asm/io.h   | 12 ++---
 arch/alpha/include/asm/io_trivial.h   | 16 +++---
 arch/alpha/include/asm/jensen.h   |  2 +-
 arch/alpha/include/asm/machvec.h  |  6 +--
 arch/alpha/kernel/core_marvel.c   |  2 +-
 arch/alpha/kernel/io.c| 12 ++---
 arch/parisc/include/asm/io.h  |  4 +-
 arch/parisc/lib/iomap.c   | 72 +--
 arch/powerpc/kernel/iomap.c   | 28 +--
 arch/sh/kernel/iomap.c| 22 
 include/asm-generic/iomap.h   | 28 +--
 include/linux/io-64-nonatomic-hi-lo.h |  4 +-
 include/linux/io-64-nonatomic-lo-hi.h |  4 +-
 lib/iomap.c   | 30 +--
 20 files changed, 136 insertions(+), 136 deletions(-)

diff --git a/arch/alpha/include/asm/core_apecs.h 
b/arch/alpha/include/asm/core_apecs.h
index 0a07055bc0fe..2d9726fc02ef 100644
--- a/arch/alpha/include/asm/core_apecs.h
+++ b/arch/alpha/include/asm/core_apecs.h
@@ -384,7 +384,7 @@ struct el_apecs_procdata
}   \
} while (0)
 
-__EXTERN_INLINE unsigned int apecs_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int apecs_ioread8(const void __iomem *xaddr)
 {
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -420,7 +420,7 @@ __EXTERN_INLINE void apecs_iowrite8(u8 b, void __iomem 
*xaddr)
*(vuip) ((addr << 5) + base_and_type) = w;
 }
 
-__EXTERN_INLINE unsigned int apecs_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int apecs_ioread16(const void __iomem *xaddr)
 {
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -456,7 +456,7 @@ __EXTERN_INLINE void apecs_iowrite16(u16 b, void __iomem 
*xaddr)
*(vuip) ((addr << 5) + base_and_type) = w;
 }
 
-__EXTERN_INLINE unsigned int apecs_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int apecs_ioread32(const void __iomem *xaddr)
 {
unsigned long addr = (unsigned long) xaddr;
if (addr < APECS_DENSE_MEM)
diff --git a/arch/alpha/include/asm/core_cia.h 
b/arch/alpha/include/asm/core_cia.h
index c706a7f2b061..cb22991f6761 100644
--- a/arch/alpha/include/asm/core_cia.h
+++ b/arch/alpha/include/asm/core_cia.h
@@ -342,7 +342,7 @@ struct el_CIA_sysdata_mcheck {
 #define vuip   volatile unsigned int __force *
 #define vulp   volatile unsigned long __force *
 
-__EXTERN_INLINE unsigned int cia_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int cia_ioread8(const void __iomem *xaddr)
 {
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -374,7 +374,7 @@ __EXTERN_INLINE void cia_iowrite8(u8 b, void __iomem *xaddr)
*(vuip) ((addr << 5) + base_and_type) = w;
 }
 
-__EXTERN_INLINE unsigned int cia_ioread16(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int cia_ioread16(const void __iomem *xaddr)
 {
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -404,7 +404,7 @@ __EXTERN_INLINE void cia_iowrite16(u16 b, void __iomem 
*xaddr)
*(vuip) ((addr << 5) + base_and_type) = w;
 }
 
-__EXTERN_INLINE unsigned int cia_ioread32(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int cia_ioread32(const void __iomem *xaddr)
 {
unsigned long addr = (unsigned long) xaddr;
if (addr < CIA_DENSE_MEM)
diff --git a/arch/alpha/include/asm/core_lca.h 
b/arch/alpha/include/asm/core_lca.h
index 84d5e5b84f4f..ec86314418cb 100644
--- a/arch/alpha/include/asm/core_lca.h
+++ b/arch/alpha/include/asm/core_lca.h
@@ -230,7 +230,7 @@ union el_lca {
} while (0)
 
 
-__EXTERN_INLINE unsigned int lca_ioread8(void __iomem *xaddr)
+__EXTERN_INLINE unsigned int lca_ioread8(const void __iomem *xaddr)
 {
unsigned long addr = (unsigned long) xaddr;
unsigned long result, base_and_type;
@@ -266,7 +266,7 @@ __EXTERN_INLINE void lca_iowrite8(u8 b, void __iomem *xaddr)
*(vuip) ((addr << 5) + base_and_type) 

Re: [PATCH 1/2] virtio-blk: fix hw_queue stopped on arbitrary error

2020-02-19 Thread Halil Pasic
On Wed, 19 Feb 2020 09:46:56 +0800
Ming Lei  wrote:

> On Tue, Feb 18, 2020 at 8:35 PM Halil Pasic  wrote:
> >
> > On Tue, 18 Feb 2020 10:21:18 +0800
> > Ming Lei  wrote:
> >
> > > On Thu, Feb 13, 2020 at 8:38 PM Halil Pasic  wrote:
> > > >
> > > > Since nobody else is going to restart our hw_queue for us, the
> > > > blk_mq_start_stopped_hw_queues() in virtblk_done() is not
> > > > necessarily sufficient to ensure that the queue will get started again.
> > > > In case of global resource outage (-ENOMEM because of a mapping failure,
> > > > because of swiotlb full) our virtqueue may be empty and we can get
> > > > stuck with a stopped hw_queue.
> > > >
> > > > Let us not stop the queue on arbitrary errors, but only on -ENOSPC which
> > > > indicates a full virtqueue, where the hw_queue is guaranteed to get
> > > > started by virtblk_done() when it makes sense to carry on
> > > > submitting requests. Let us also remove a stale comment.
> > >
> > > The generic solution may be to stop queue only when there is any
> > > in-flight request
> > > not completed.
> > >
> >
> > I think this is pretty close to that. The queue is stopped only on
> > ENOSPC, which means virtqueue is full.
> >
> > > Checking -ENOMEM may not be enough, given -EIO can be returned from
> > > virtqueue_add()
> > > too in case of dma map failure.
> >
> > I'm not checking on -ENOMEM. So the queue would not be stopped on EIO.
> > Maybe I'm misunderstanding something. In any case, please have another
> > look at the diff, and if your concerns persist please help me understand.
> 
> Looks like I misread the patch, and this patch is fine:
> 
> Reviewed-by: Ming Lei 

Thank you very much!

Regards,
Halil

> 
> 
> Thanks,
> Ming Lei
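
The policy discussed in this thread (stop the hw_queue only on -ENOSPC) can be illustrated with a small user-space model. This is a hedged sketch, not the driver code: every `sketch_` name is hypothetical, and the mapping of -ENOMEM and other errors to status values is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for block layer status codes. */
enum sketch_status {
	SKETCH_STS_OK,
	SKETCH_STS_RESOURCE,     /* global shortage, retry later */
	SKETCH_STS_DEV_RESOURCE, /* device ring full, retry after a completion */
	SKETCH_STS_IOERR,
};

/* Hypothetical model of one hardware queue. */
struct sketch_hwq {
	bool stopped;
	unsigned int in_flight;
};

/* Handle the (zero or negative) result of adding a request to the ring. */
static enum sketch_status sketch_handle_add(struct sketch_hwq *q, int err)
{
	assert(err <= 0);
	switch (err) {
	case 0:
		q->in_flight++;
		return SKETCH_STS_OK;
	case -ENOSPC:
		/* Ring full: an in-flight request exists, so a completion
		 * will come along and restart the queue. Safe to stop. */
		q->stopped = true;
		return SKETCH_STS_DEV_RESOURCE;
	case -ENOMEM:
		/* Global outage: the ring may be empty, so no completion
		 * is guaranteed. Do not stop the queue. */
		return SKETCH_STS_RESOURCE;
	default:
		return SKETCH_STS_IOERR;
	}
}

/* Completion path: retire one request and restart a stopped queue. */
static void sketch_done(struct sketch_hwq *q)
{
	if (q->in_flight)
		q->in_flight--;
	q->stopped = false;
}
```

The point of the model is that only the -ENOSPC branch sets `stopped`, because only in that case is a later call to the completion path guaranteed.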

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] vhost: introduce vDPA based backend

2020-02-19 Thread Jason Gunthorpe
On Wed, Feb 19, 2020 at 10:52:38AM +0800, Tiwei Bie wrote:
> > > +static int __init vhost_vdpa_init(void)
> > > +{
> > > + int r;
> > > +
> > > + idr_init(&vhost_vdpa.idr);
> > > + mutex_init(&vhost_vdpa.mutex);
> > > + init_waitqueue_head(&vhost_vdpa.release_q);
> > > +
> > > + /* /dev/vhost-vdpa/$vdpa_device_index */
> > > + vhost_vdpa.class = class_create(THIS_MODULE, "vhost-vdpa");
> > > + if (IS_ERR(vhost_vdpa.class)) {
> > > + r = PTR_ERR(vhost_vdpa.class);
> > > + goto err_class;
> > > + }
> > > +
> > > + vhost_vdpa.class->devnode = vhost_vdpa_devnode;
> > > +
> > > + r = alloc_chrdev_region(&vhost_vdpa.devt, 0, MINORMASK + 1,
> > > + "vhost-vdpa");
> > > + if (r)
> > > + goto err_alloc_chrdev;
> > > +
> > > + cdev_init(&vhost_vdpa.cdev, &vhost_vdpa_fops);
> > > + r = cdev_add(&vhost_vdpa.cdev, vhost_vdpa.devt, MINORMASK + 1);
> > > + if (r)
> > > + goto err_cdev_add;
> > 
> > It is very strange, is the intention to create a single global char
> > dev?
> 
> No. It's to create a per-vdpa char dev named
> vhost-vdpa/$vdpa_device_index in dev.
> 
> I followed the code in VFIO which creates char dev
> vfio/$GROUP dynamically, e.g.:
> 
> https://github.com/torvalds/linux/blob/b1da3acc781c/drivers/vfio/vfio.c#L2164-L2180
> https://github.com/torvalds/linux/blob/b1da3acc781c/drivers/vfio/vfio.c#L373-L387
> https://github.com/torvalds/linux/blob/b1da3acc781c/drivers/vfio/vfio.c#L1553
> 
> Is it something unwanted?

Yes it is unwanted. This is some special pattern for vfio's unique
needs. 

Since this has a struct device for each char dev instance please use
the normal cdev_device_add() driven pattern here, or justify why it
needs to be special like this.

Jason


Re: [PATCH V2 3/5] vDPA: introduce vDPA bus

2020-02-19 Thread Jason Gunthorpe
On Wed, Feb 19, 2020 at 01:35:25PM +0800, Jason Wang wrote:
> > But it is
> > open coded and duplicated because .. vdpa?
> 
> 
> I'm not sure I get here, vhost module is reused for vhost-vdpa and all
> current vhost device (e.g net) uses their own char device.

I mean there shouldn't be two fops implementing the same uAPI

Jason


Re: [PATCH] iommu/virtio: Build virtio-iommu as module

2020-02-19 Thread Joerg Roedel
On Wed, Feb 19, 2020 at 12:41:33PM +0100, Jean-Philippe Brucker wrote:
> No, I meant Will's changes in 5.6 to make the SMMU drivers modular. This
> patch doesn't depend on the x86 enablement patch-set, but there is a small
> conflict in Kconfig since they both modify it (locally I have this patch
> applied on top of the x86 enablement).

Yeah, I noticed the conflict when I applied it and that's why I asked.
Thanks for clarifying, the patch is now applied.


Regards,

Joerg


Re: [PATCH] iommu/virtio: Build virtio-iommu as module

2020-02-19 Thread Jean-Philippe Brucker
Hi Joerg,

On Wed, Feb 19, 2020 at 12:16:04PM +0100, Joerg Roedel wrote:
> On Fri, Feb 14, 2020 at 05:38:27PM +0100, Jean-Philippe Brucker wrote:
> > From: Jean-Philippe Brucker 
> > 
> > Now that the infrastructure changes are in place, enable virtio-iommu to
> > be built as a module. Remove the redundant pci_request_acs() call, since
> > it's not exported but is already invoked during DMA setup.
> 
> Which infrastructure changes do you mean? Does this depend on the x86
> enablement patch-set in any way?

No, I meant Will's changes in 5.6 to make the SMMU drivers modular. This
patch doesn't depend on the x86 enablement patch-set, but there is a small
conflict in Kconfig since they both modify it (locally I have this patch
applied on top of the x86 enablement).

Thanks,
Jean


Re: [PATCH] iommu/virtio: Build virtio-iommu as module

2020-02-19 Thread Joerg Roedel
On Fri, Feb 14, 2020 at 05:38:27PM +0100, Jean-Philippe Brucker wrote:
> From: Jean-Philippe Brucker 
> 
> Now that the infrastructure changes are in place, enable virtio-iommu to
> be built as a module. Remove the redundant pci_request_acs() call, since
> it's not exported but is already invoked during DMA setup.

Which infrastructure changes do you mean? Does this depend on the x86
enablement patch-set in any way?


Regards,

Joerg



Re: [PATCH 23/62] x86/idt: Move IDT to data segment

2020-02-19 Thread Jürgen Groß

On 19.02.20 11:42, Joerg Roedel wrote:
> Hi Jürgen,
>
> On Wed, Feb 12, 2020 at 05:28:21PM +0100, Jürgen Groß wrote:
>> Xen-PV is clearing BSS as the very first action.
>
> In the kernel image? Or in the ELF loader before jumping to the kernel
> image?

In the kernel image.

See arch/x86/xen/xen-head.S - startup_xen is the entry point of the
kernel.


Juergen

Re: [PATCH 23/62] x86/idt: Move IDT to data segment

2020-02-19 Thread Joerg Roedel
Hi Jürgen,

On Wed, Feb 12, 2020 at 05:28:21PM +0100, Jürgen Groß wrote:
> Xen-PV is clearing BSS as the very first action.

In the kernel image? Or in the ELF loader before jumping to the kernel
image?

Regards,

Joerg


[PATCH 29/52] drm/bochs: Drop explicit drm_mode_config_cleanup

2020-02-19 Thread Daniel Vetter
Instead rely on the automatic cleanup, for which we just need to check
that drm_mode_config_init succeeded. To avoid an inversion in the
cleanup we also have to move the dev_private allocation over to
drmm_kzalloc.

Signed-off-by: Daniel Vetter 
Cc: Gerd Hoffmann 
Cc: virtualization@lists.linux-foundation.org
---
 drivers/gpu/drm/bochs/bochs.h |  1 -
 drivers/gpu/drm/bochs/bochs_drv.c |  6 ++
 drivers/gpu/drm/bochs/bochs_kms.c | 14 +-
 3 files changed, 7 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/bochs/bochs.h b/drivers/gpu/drm/bochs/bochs.h
index 917767173ee6..e5bd1d517a18 100644
--- a/drivers/gpu/drm/bochs/bochs.h
+++ b/drivers/gpu/drm/bochs/bochs.h
@@ -92,7 +92,6 @@ void bochs_mm_fini(struct bochs_device *bochs);
 
 /* bochs_kms.c */
 int bochs_kms_init(struct bochs_device *bochs);
-void bochs_kms_fini(struct bochs_device *bochs);
 
 /* bochs_fbdev.c */
 extern const struct drm_mode_config_funcs bochs_mode_funcs;
diff --git a/drivers/gpu/drm/bochs/bochs_drv.c b/drivers/gpu/drm/bochs/bochs_drv.c
index addb0568c1af..e18c51de1196 100644
--- a/drivers/gpu/drm/bochs/bochs_drv.c
+++ b/drivers/gpu/drm/bochs/bochs_drv.c
@@ -7,6 +7,7 @@
 
 #include 
 #include 
+#include 
 
 #include "bochs.h"
 
@@ -21,10 +22,7 @@ static void bochs_unload(struct drm_device *dev)
 {
struct bochs_device *bochs = dev->dev_private;
 
-   bochs_kms_fini(bochs);
bochs_mm_fini(bochs);
-   kfree(bochs);
-   dev->dev_private = NULL;
 }
 
 static int bochs_load(struct drm_device *dev)
@@ -32,7 +30,7 @@ static int bochs_load(struct drm_device *dev)
struct bochs_device *bochs;
int ret;
 
-   bochs = kzalloc(sizeof(*bochs), GFP_KERNEL);
+   bochs = drmm_kzalloc(dev, sizeof(*bochs), GFP_KERNEL);
if (bochs == NULL)
return -ENOMEM;
dev->dev_private = bochs;
diff --git a/drivers/gpu/drm/bochs/bochs_kms.c b/drivers/gpu/drm/bochs/bochs_kms.c
index e8cc8156d773..8285c03a6a95 100644
--- a/drivers/gpu/drm/bochs/bochs_kms.c
+++ b/drivers/gpu/drm/bochs/bochs_kms.c
@@ -134,7 +134,11 @@ const struct drm_mode_config_funcs bochs_mode_funcs = {
 
 int bochs_kms_init(struct bochs_device *bochs)
 {
-   drm_mode_config_init(bochs->dev);
+   int ret;
+
+   ret = drm_mode_config_init(bochs->dev);
+   if (ret)
+   return ret;
 
bochs->dev->mode_config.max_width = 8192;
bochs->dev->mode_config.max_height = 8192;
@@ -160,11 +164,3 @@ int bochs_kms_init(struct bochs_device *bochs)
 
return 0;
 }
-
-void bochs_kms_fini(struct bochs_device *bochs)
-{
-   if (!bochs->dev->mode_config.num_connector)
-   return;
-
-   drm_mode_config_cleanup(bochs->dev);
-}
-- 
2.24.1



[PATCH 31/52] drm/cirrus: Fully embrace devm_

2020-02-19 Thread Daniel Vetter
With the drm_device lifetime fun cleaned up there's nothing in the way
anymore to use devm_ for everything hw related. Do it, and in the
process, throw out the entire onion unwinding.

Signed-off-by: Daniel Vetter 
Cc: Dave Airlie 
Cc: Gerd Hoffmann 
Cc: Daniel Vetter 
Cc: "Noralf Trønnes" 
Cc: Emil Velikov 
Cc: Thomas Zimmermann 
Cc: virtualization@lists.linux-foundation.org
---
 drivers/gpu/drm/cirrus/cirrus.c | 44 +++--
 1 file changed, 14 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/cirrus/cirrus.c b/drivers/gpu/drm/cirrus/cirrus.c
index 6ac0286810ec..1b78a2f88f69 100644
--- a/drivers/gpu/drm/cirrus/cirrus.c
+++ b/drivers/gpu/drm/cirrus/cirrus.c
@@ -558,7 +558,7 @@ static int cirrus_pci_probe(struct pci_dev *pdev,
if (ret)
return ret;
 
-   ret = pci_enable_device(pdev);
+   ret = pcim_enable_device(pdev);
if (ret)
return ret;
 
@@ -569,39 +569,38 @@ static int cirrus_pci_probe(struct pci_dev *pdev,
ret = -ENOMEM;
cirrus = kzalloc(sizeof(*cirrus), GFP_KERNEL);
if (cirrus == NULL)
-   goto err_pci_release;
+   return ret;
 
dev = &cirrus->dev;
-   ret = drm_dev_init(dev, &cirrus_driver, &pdev->dev);
+   ret = devm_drm_dev_init(&pdev->dev, dev, &cirrus_driver);
if (ret) {
kfree(cirrus);
-   goto err_pci_release;
+   return ret;
}
dev->dev_private = cirrus;
drmm_add_final_kfree(dev, cirrus);
 
-   ret = -ENOMEM;
-   cirrus->vram = ioremap(pci_resource_start(pdev, 0),
-  pci_resource_len(pdev, 0));
+   cirrus->vram = devm_ioremap(&pdev->dev, pci_resource_start(pdev, 0),
+   pci_resource_len(pdev, 0));
if (cirrus->vram == NULL)
-   goto err_dev_put;
+   return -ENOMEM;
 
-   cirrus->mmio = ioremap(pci_resource_start(pdev, 1),
-  pci_resource_len(pdev, 1));
+   cirrus->mmio = devm_ioremap(&pdev->dev, pci_resource_start(pdev, 1),
+   pci_resource_len(pdev, 1));
if (cirrus->mmio == NULL)
-   goto err_unmap_vram;
+   return -ENOMEM;
 
ret = cirrus_mode_config_init(cirrus);
if (ret)
-   goto err_cleanup;
+   return ret;
 
ret = cirrus_conn_init(cirrus);
if (ret < 0)
-   goto err_cleanup;
+   return ret;
 
ret = cirrus_pipe_init(cirrus);
if (ret < 0)
-   goto err_cleanup;
+   return ret;
 
drm_mode_config_reset(dev);
 
@@ -609,33 +608,18 @@ static int cirrus_pci_probe(struct pci_dev *pdev,
pci_set_drvdata(pdev, dev);
ret = drm_dev_register(dev, 0);
if (ret)
-   goto err_cleanup;
+   return ret;
 
drm_fbdev_generic_setup(dev, dev->mode_config.preferred_depth);
return 0;
-
-err_cleanup:
-   iounmap(cirrus->mmio);
-err_unmap_vram:
-   iounmap(cirrus->vram);
-err_dev_put:
-   drm_dev_put(dev);
-err_pci_release:
-   pci_release_regions(pdev);
-   return ret;
 }
 
 static void cirrus_pci_remove(struct pci_dev *pdev)
 {
struct drm_device *dev = pci_get_drvdata(pdev);
-   struct cirrus_device *cirrus = dev->dev_private;
 
drm_dev_unplug(dev);
drm_atomic_helper_shutdown(dev);
-   iounmap(cirrus->mmio);
-   iounmap(cirrus->vram);
-   drm_dev_put(dev);
-   pci_release_regions(pdev);
 }
 
 static const struct pci_device_id pciidlist[] = {
-- 
2.24.1
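
The "onion unwinding" removed by the patch above can be understood through a small user-space model of device-managed cleanup. This is only a sketch under stated assumptions: all `sketch_` names are hypothetical, and the model captures just the idea that an owner object records release actions and runs them in reverse order of registration, which is what lets a probe function return early without goto labels.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_ACTIONS 8

typedef void (*sketch_release_fn)(void *data);

struct sketch_action {
	sketch_release_fn fn;
	void *data;
};

/* Hypothetical owner object, loosely playing the role of a device. */
struct sketch_owner {
	struct sketch_action actions[SKETCH_MAX_ACTIONS];
	size_t count;
};

/* Register a release action; returns 0 on success, -1 if the table is full. */
static int sketch_add_action(struct sketch_owner *o, sketch_release_fn fn,
			     void *data)
{
	if (o->count == SKETCH_MAX_ACTIONS)
		return -1;
	o->actions[o->count].fn = fn;
	o->actions[o->count].data = data;
	o->count++;
	return 0;
}

/* Run every registered action, most recently registered first. */
static void sketch_release_all(struct sketch_owner *o)
{
	while (o->count > 0) {
		o->count--;
		o->actions[o->count].fn(o->actions[o->count].data);
	}
}

/* Log of released resources, so that the release order is observable. */
struct sketch_log {
	char buf[SKETCH_MAX_ACTIONS + 1];
	size_t len;
};

struct sketch_res {
	struct sketch_log *log;
	char tag;
};

/* Example release callback: append the resource tag to the log. */
static void sketch_release_res(void *data)
{
	struct sketch_res *r = data;

	assert(r->log->len < SKETCH_MAX_ACTIONS);
	r->log->buf[r->log->len++] = r->tag;
	r->log->buf[r->log->len] = '\0';
}
```

With this shape, an error path only needs to return: whatever was registered so far is released when the owner goes away, in the opposite order of acquisition.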


[PATCH 30/52] drm/cirrus: Drop explicit drm_mode_config_cleanup call

2020-02-19 Thread Daniel Vetter
We can even delete the drm_driver.release hook now!

Signed-off-by: Daniel Vetter 
Cc: Dave Airlie 
Cc: Gerd Hoffmann 
Cc: Daniel Vetter 
Cc: "Noralf Trønnes" 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
Cc: virtualization@lists.linux-foundation.org
---
 drivers/gpu/drm/cirrus/cirrus.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/cirrus/cirrus.c b/drivers/gpu/drm/cirrus/cirrus.c
index a9d789a56536..6ac0286810ec 100644
--- a/drivers/gpu/drm/cirrus/cirrus.c
+++ b/drivers/gpu/drm/cirrus/cirrus.c
@@ -510,11 +510,15 @@ static const struct drm_mode_config_funcs cirrus_mode_config_funcs = {
.atomic_commit = drm_atomic_helper_commit,
 };
 
-static void cirrus_mode_config_init(struct cirrus_device *cirrus)
+static int cirrus_mode_config_init(struct cirrus_device *cirrus)
 {
struct drm_device *dev = &cirrus->dev;
+   int ret;
+
+   ret = drm_mode_config_init(dev);
+   if (ret)
+   return ret;
 
-   drm_mode_config_init(dev);
dev->mode_config.min_width = 0;
dev->mode_config.min_height = 0;
dev->mode_config.max_width = CIRRUS_MAX_PITCH / 2;
@@ -522,15 +526,12 @@ static void cirrus_mode_config_init(struct cirrus_device *cirrus)
dev->mode_config.preferred_depth = 16;
dev->mode_config.prefer_shadow = 0;
dev->mode_config.funcs = &cirrus_mode_config_funcs;
+
+   return 0;
 }
 
 /* -- */
 
-static void cirrus_release(struct drm_device *dev)
-{
-   drm_mode_config_cleanup(dev);
-}
-
 DEFINE_DRM_GEM_FOPS(cirrus_fops);
 
 static struct drm_driver cirrus_driver = {
@@ -544,7 +545,6 @@ static struct drm_driver cirrus_driver = {
 
.fops= &cirrus_fops,
DRM_GEM_SHMEM_DRIVER_OPS,
-   .release = cirrus_release,
 };
 
 static int cirrus_pci_probe(struct pci_dev *pdev,
@@ -591,7 +591,9 @@ static int cirrus_pci_probe(struct pci_dev *pdev,
if (cirrus->mmio == NULL)
goto err_unmap_vram;
 
-   cirrus_mode_config_init(cirrus);
+   ret = cirrus_mode_config_init(cirrus);
+   if (ret)
+   goto err_cleanup;
 
ret = cirrus_conn_init(cirrus);
if (ret < 0)
@@ -613,7 +615,6 @@ static int cirrus_pci_probe(struct pci_dev *pdev,
return 0;
 
 err_cleanup:
-   drm_mode_config_cleanup(dev);
iounmap(cirrus->mmio);
 err_unmap_vram:
iounmap(cirrus->vram);
-- 
2.24.1


[PATCH 08/52] drm/qxl: Use drmm_add_final_kfree

2020-02-19 Thread Daniel Vetter
With this we can drop the final kfree from the release function.

Signed-off-by: Daniel Vetter 
Cc: Dave Airlie 
Cc: Gerd Hoffmann 
Cc: virtualization@lists.linux-foundation.org
Cc: spice-de...@lists.freedesktop.org
---
 drivers/gpu/drm/qxl/qxl_drv.c | 2 --
 drivers/gpu/drm/qxl/qxl_kms.c | 2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
index 4fda3f9b29f4..09102e2efabc 100644
--- a/drivers/gpu/drm/qxl/qxl_drv.c
+++ b/drivers/gpu/drm/qxl/qxl_drv.c
@@ -144,8 +144,6 @@ static void qxl_drm_release(struct drm_device *dev)
 */
qxl_modeset_fini(qdev);
qxl_device_fini(qdev);
-   dev->dev_private = NULL;
-   kfree(qdev);
 }
 
 static void
diff --git a/drivers/gpu/drm/qxl/qxl_kms.c b/drivers/gpu/drm/qxl/qxl_kms.c
index 70b20ee4741a..09d7b5f6d172 100644
--- a/drivers/gpu/drm/qxl/qxl_kms.c
+++ b/drivers/gpu/drm/qxl/qxl_kms.c
@@ -27,6 +27,7 @@
 #include 
 
 #include 
+#include 
 #include 
 
 #include "qxl_drv.h"
@@ -121,6 +122,7 @@ int qxl_device_init(struct qxl_device *qdev,
qdev->ddev.pdev = pdev;
pci_set_drvdata(pdev, &qdev->ddev);
qdev->ddev.dev_private = qdev;
+   drmm_add_final_kfree(&qdev->ddev, qdev);
 
mutex_init(&qdev->gem.mutex);
mutex_init(&qdev->update_area_mutex);
-- 
2.24.1



[PATCH 10/52] drm/cirrus: Use drmm_add_final_kfree

2020-02-19 Thread Daniel Vetter
With this we can drop the final kfree from the release function.

I also noticed that cirrus forgot to call drm_dev_fini().

v2: Don't call kfree(cirrus) after we've handed ownership of that to
drm_device and the drmm_ stuff.

Signed-off-by: Daniel Vetter 
Cc: Dave Airlie 
Cc: Gerd Hoffmann 
Cc: Daniel Vetter 
Cc: "Noralf Trønnes" 
Cc: Linus Walleij 
Cc: Sam Ravnborg 
Cc: Thomas Zimmermann 
Cc: virtualization@lists.linux-foundation.org
---
 drivers/gpu/drm/cirrus/cirrus.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/cirrus/cirrus.c b/drivers/gpu/drm/cirrus/cirrus.c
index d2ff63ce8eaf..2232556ce34c 100644
--- a/drivers/gpu/drm/cirrus/cirrus.c
+++ b/drivers/gpu/drm/cirrus/cirrus.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -527,10 +528,8 @@ static void cirrus_mode_config_init(struct cirrus_device *cirrus)
 
 static void cirrus_release(struct drm_device *dev)
 {
-   struct cirrus_device *cirrus = dev->dev_private;
-
drm_mode_config_cleanup(dev);
-   kfree(cirrus);
+   drm_dev_fini(dev);
 }
 
 DEFINE_DRM_GEM_FOPS(cirrus_fops);
@@ -575,9 +574,12 @@ static int cirrus_pci_probe(struct pci_dev *pdev,
 
dev = &cirrus->dev;
ret = drm_dev_init(dev, &cirrus_driver, &pdev->dev);
-   if (ret)
-   goto err_free_cirrus;
+   if (ret) {
+   kfree(cirrus);
+   goto err_pci_release;
+   }
dev->dev_private = cirrus;
+   drmm_add_final_kfree(dev, cirrus);
 
ret = -ENOMEM;
cirrus->vram = ioremap(pci_resource_start(pdev, 0),
@@ -618,8 +620,6 @@ static int cirrus_pci_probe(struct pci_dev *pdev,
iounmap(cirrus->vram);
 err_dev_put:
drm_dev_put(dev);
-err_free_cirrus:
-   kfree(cirrus);
 err_pci_release:
pci_release_regions(pdev);
return ret;
-- 
2.24.1


Re: [PATCH] x86/ioperm: add new paravirt function update_io_bitmap

2020-02-19 Thread Jürgen Groß

On 19.02.20 10:22, Thomas Gleixner wrote:
> Jürgen Groß  writes:
>> On 18.02.20 22:03, Thomas Gleixner wrote:
>>> BTW, why isn't stuff like this caught during next or at least
>>> before the final release? Is nothing running CI on upstream with all
>>> that XEN muck active?
>>
>> This problem showed up by not being able to start the X server (probably
>> not the freshest one) in dom0 on a moderately aged AMD system.
>>
>> Our CI tests tend to be more text console based for dom0.
>
> tools/testing/selftests/x86/io[perm|pl] should have caught that as well,
> right? If not, we need to fix the selftests.

Hmm, yes. Thanks for the pointer.

Will ask our testing specialist what is done in this regard and how it
can be enhanced.


Juergen

Re: [PATCH] x86/ioperm: add new paravirt function update_io_bitmap

2020-02-19 Thread Thomas Gleixner
Jürgen Groß  writes:
> On 18.02.20 22:03, Thomas Gleixner wrote:
>> BTW, why isn't stuff like this caught during next or at least
>> before the final release? Is nothing running CI on upstream with all
>> that XEN muck active?
>
> This problem showed up by not being able to start the X server (probably
> not the freshest one) in dom0 on a moderately aged AMD system.
>
> Our CI tests tend to be more text console based for dom0.

tools/testing/selftests/x86/io[perm|pl] should have caught that as well,
right? If not, we need to fix the selftests.

Thanks,

tglx

Re: [PATCH] x86/ioperm: add new paravirt function update_io_bitmap

2020-02-19 Thread Jan Beulich
On 19.02.2020 06:35, Jürgen Groß wrote:
> On 18.02.20 22:03, Thomas Gleixner wrote:
>> Juergen Gross  writes:
>>> Commit 111e7b15cf10f6 ("x86/ioperm: Extend IOPL config to control
>>> ioperm() as well") reworked the iopl syscall to use I/O bitmaps.
>>>
>>> Unfortunately this broke Xen PV domains using that syscall as there
>>> is currently no I/O bitmap support in PV domains.
>>>
>>> Add I/O bitmap support via a new paravirt function update_io_bitmap
>>> which Xen PV domains can use to update their I/O bitmaps via a
>>> hypercall.
>>>
>>> Fixes: 111e7b15cf10f6 ("x86/ioperm: Extend IOPL config to control ioperm() as well")
>>> Reported-by: Jan Beulich 
>>> Cc:  # 5.5
>>> Signed-off-by: Juergen Gross 
>>> Reviewed-by: Jan Beulich 
>>> Tested-by: Jan Beulich 
>>
>> Duh, sorry about that and thanks for fixing it.
>>
>> BTW, why isn't stuff like this caught during next or at least
>> before the final release? Is nothing running CI on upstream with all
>> that XEN muck active?
> 
> This problem showed up by not being able to start the X server (probably
> not the freshest one) in dom0 on a moderately aged AMD system.

Not the freshest one, yes, but also on a system where KMS would not
be available (my success rate with KMS is rather low overall, and
with newer Linux I see rather more systems to stop working than ones
to become working, but I simply don't have the time to investigate).

Jan
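
For readers unfamiliar with the I/O bitmap mentioned in the quoted commit message, the following user-space model may help. It is a hedged sketch only: the `sketch_` names are hypothetical, nothing here touches real hardware or the paravirt interface, and it assumes the usual x86 convention that a set bit denies access to a port and a cleared bit allows it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NUM_PORTS 65536u

/* One bit per I/O port: 1 = access denied, 0 = access allowed. */
struct sketch_io_bitmap {
	uint8_t bits[SKETCH_NUM_PORTS / 8];
};

/* Start from the restrictive default: every port denied. */
static void sketch_deny_all(struct sketch_io_bitmap *bm)
{
	memset(bm->bits, 0xff, sizeof(bm->bits));
}

/*
 * Model of an ioperm(from, num, turn_on) style request.
 * Returns 0 on success, -1 if the range leaves the port space.
 */
static int sketch_ioperm(struct sketch_io_bitmap *bm, uint32_t from,
			 uint32_t num, bool turn_on)
{
	if (from >= SKETCH_NUM_PORTS || num > SKETCH_NUM_PORTS - from)
		return -1;
	for (uint32_t port = from; port < from + num; port++) {
		uint8_t mask = (uint8_t)(1u << (port % 8));

		if (turn_on)
			bm->bits[port / 8] &= (uint8_t)~mask; /* allow */
		else
			bm->bits[port / 8] |= mask;           /* deny */
	}
	return 0;
}

/* Check whether an access to the given port would be permitted. */
static bool sketch_port_allowed(const struct sketch_io_bitmap *bm,
				uint32_t port)
{
	assert(port < SKETCH_NUM_PORTS);
	return (bm->bits[port / 8] & (1u << (port % 8))) == 0;
}
```

In this picture, a guest that cannot install such a bitmap itself needs some other channel (in the patch, a hypercall behind the new paravirt function) to hand the updated bitmap to whoever enforces it.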