Re: [virtio-dev] [RFC PATCH linux] iommu: Add virtio-iommu driver

2017-06-16 Thread Jean-Philippe Brucker
On 16/06/17 09:48, Bharat Bhushan wrote:
> Hi Jean
>> +static int viommu_map(struct iommu_domain *domain, unsigned long iova,
>> +  phys_addr_t paddr, size_t size, int prot) {
>> +int ret;
>> +struct viommu_domain *vdomain = to_viommu_domain(domain);
>> +struct virtio_iommu_req_map req = {
>> +.head.type  = VIRTIO_IOMMU_T_MAP,
>> +.address_space  = cpu_to_le32(vdomain->id),
>> +.virt_addr  = cpu_to_le64(iova),
>> +.phys_addr  = cpu_to_le64(paddr),
>> +.size   = cpu_to_le64(size),
>> +};
>> +
>> +pr_debug("map %llu 0x%lx -> 0x%llx (%zu)\n", vdomain->id, iova,
>> + paddr, size);
> 
> A query, when I am tracing above prints I see same physical address is mapped 
> with two different virtual address, do you know why kernel does this?

That really depends which driver is calling into viommu. iommu_map is
called from the DMA API, which can be used by any device drivers. Within
an address space, multiple IOVAs pointing to the same PA isn't forbidden.

For example, looking at MAP requests for a virtio-net device, I get the
following trace:

ioas[1] map   0xfff3000 -> 0x8faa (4096)
ioas[1] map   0xfff2000 -> 0x8faa (4096)
ioas[1] map   0xfff1000 -> 0x8faa (4096)
ioas[1] map   0xfff -> 0x8faa (4096)
ioas[1] map   0xffef000 -> 0x8faa (4096)
ioas[1] map   0xffee000 -> 0x8faa (4096)
ioas[1] map   0xffed000 -> 0x8faa (4096)
ioas[1] map   0xffec000 -> 0x8faa (4096)
ioas[1] map   0xffeb000 -> 0x8faa (4096)
ioas[1] map   0xffea000 -> 0x8faa (4096)
ioas[1] map   0xffe8000 -> 0x8faa (8192)
...

During initialization, the virtio-net driver primes the rx queue with
receive buffers, that the host will then fill with network packets. It
calls virtqueue_add_inbuf_ctx to create descriptors on the rx virtqueue
for each buffer. Each buffer is 0x180 bytes here, so one 4k page can
contain around 10 (actually 11, with the last one crossing page boundary).

I guess the call trace goes like this:
 virtnet_open
 try_fill_recv
 add_recvbuf_mergeable
 virtqueue_add_inbuf_ctx
 vring_map_one_sg
 dma_map_page
 __iommu_dma_map

But the IOMMU cannot map fragments of pages, since the granule is 0x1000.
Therefore when virtqueue_add_inbuf_ctx maps the buffer, __iommu_dma_map
aligns address and size on full pages. Someone motivated could probably
optimize this by caching mapped pages and reusing IOVAs, but currently
that's how it goes.

Thanks,
Jean
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


RE: [virtio-dev] [RFC PATCH linux] iommu: Add virtio-iommu driver

2017-06-16 Thread Bharat Bhushan
Hi Jean

> -Original Message-
> From: virtio-...@lists.oasis-open.org [mailto:virtio-dev@lists.oasis-
> open.org] On Behalf Of Jean-Philippe Brucker
> Sent: Saturday, April 08, 2017 12:53 AM
> To: io...@lists.linux-foundation.org; k...@vger.kernel.org;
> virtualization@lists.linux-foundation.org; virtio-...@lists.oasis-open.org
> Cc: cd...@linaro.org; will.dea...@arm.com; robin.mur...@arm.com;
> lorenzo.pieral...@arm.com; j...@8bytes.org; m...@redhat.com;
> jasow...@redhat.com; alex.william...@redhat.com;
> marc.zyng...@arm.com
> Subject: [virtio-dev] [RFC PATCH linux] iommu: Add virtio-iommu driver
> 
> The virtio IOMMU is a para-virtualized device, allowing to send IOMMU
> requests such as map/unmap over virtio-mmio transport. This driver should
> illustrate the initial proposal for virtio-iommu, that you hopefully received
> with it. It handle attach, detach, map and unmap requests.
> 
> The bulk of the code is to create requests and send them through virtio.
> Implementing the IOMMU API is fairly straightforward since the virtio-iommu
> MAP/UNMAP interface is almost identical. I threw in a custom
> map_sg() function which takes up some space, but is optional. The core
> function would send a sequence of map requests, waiting for a reply
> between each mapping. This optimization avoids yielding to the host after
> each map, and instead prepares a batch of requests in the virtio ring and
> kicks the host once.
> 
> It must be applied on top of the probe deferral work for IOMMU, currently
> under discussion. This allows to dissociate early driver detection and device
> probing: device-tree or ACPI is parsed early to find which devices are
> translated by the IOMMU, but the IOMMU itself cannot be probed until the
> core virtio module is loaded.
> 
> Enabling DEBUG makes it extremely verbose at the moment, but it should be
> calmer in next versions.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.bruc...@arm.com>
> ---
>  drivers/iommu/Kconfig |  11 +
>  drivers/iommu/Makefile|   1 +
>  drivers/iommu/virtio-iommu.c  | 980
> ++
>  include/uapi/linux/Kbuild |   1 +
>  include/uapi/linux/virtio_ids.h   |   1 +
>  include/uapi/linux/virtio_iommu.h | 142 ++
>  6 files changed, 1136 insertions(+)
>  create mode 100644 drivers/iommu/virtio-iommu.c  create mode 100644
> include/uapi/linux/virtio_iommu.h
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index
> 37e204f3d9be..8cd56ee9a93a 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -359,4 +359,15 @@ config MTK_IOMMU_V1
> 
> if unsure, say N here.
> 
> +config VIRTIO_IOMMU
> + tristate "Virtio IOMMU driver"
> + depends on VIRTIO_MMIO
> + select IOMMU_API
> + select INTERVAL_TREE
> + select ARM_DMA_USE_IOMMU if ARM
> + help
> +   Para-virtualised IOMMU driver with virtio.
> +
> +   Say Y here if you intend to run this kernel as a guest.
> +
>  endif # IOMMU_SUPPORT
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index
> 195f7b997d8e..1199d8475802 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-
> smmu.o
>  obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
>  obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
>  obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
> +obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> new file mode 100644 index ..1cf4f57b7817
> --- /dev/null
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -0,0 +1,980 @@
> +/*
> + * Virtio driver for the paravirtualized IOMMU
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307,
> USA.
> + *
> + * Copyright (C) 2017 ARM Limited
> + *
> + * Author: Jean-Philippe Brucker <jean-philippe.bruc...@arm.com>  */
> +
> +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> +
> +#include 
> +#include 
> +#inclu

[RFC PATCH linux] iommu: Add virtio-iommu driver

2017-04-07 Thread Jean-Philippe Brucker
The virtio IOMMU is a para-virtualized device, allowing to send IOMMU
requests such as map/unmap over virtio-mmio transport. This driver should
illustrate the initial proposal for virtio-iommu, that you hopefully
received with it. It handle attach, detach, map and unmap requests.

The bulk of the code is to create requests and send them through virtio.
Implementing the IOMMU API is fairly straightforward since the
virtio-iommu MAP/UNMAP interface is almost identical. I threw in a custom
map_sg() function which takes up some space, but is optional. The core
function would send a sequence of map requests, waiting for a reply
between each mapping. This optimization avoids yielding to the host after
each map, and instead prepares a batch of requests in the virtio ring and
kicks the host once.

It must be applied on top of the probe deferral work for IOMMU, currently
under discussion. This allows to dissociate early driver detection and
device probing: device-tree or ACPI is parsed early to find which devices
are translated by the IOMMU, but the IOMMU itself cannot be probed until
the core virtio module is loaded.

Enabling DEBUG makes it extremely verbose at the moment, but it should be
calmer in next versions.

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/Kconfig |  11 +
 drivers/iommu/Makefile|   1 +
 drivers/iommu/virtio-iommu.c  | 980 ++
 include/uapi/linux/Kbuild |   1 +
 include/uapi/linux/virtio_ids.h   |   1 +
 include/uapi/linux/virtio_iommu.h | 142 ++
 6 files changed, 1136 insertions(+)
 create mode 100644 drivers/iommu/virtio-iommu.c
 create mode 100644 include/uapi/linux/virtio_iommu.h

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 37e204f3d9be..8cd56ee9a93a 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -359,4 +359,15 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config VIRTIO_IOMMU
+   tristate "Virtio IOMMU driver"
+   depends on VIRTIO_MMIO
+   select IOMMU_API
+   select INTERVAL_TREE
+   select ARM_DMA_USE_IOMMU if ARM
+   help
+ Para-virtualised IOMMU driver with virtio.
+
+ Say Y here if you intend to run this kernel as a guest.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b997d8e..1199d8475802 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
new file mode 100644
index ..1cf4f57b7817
--- /dev/null
+++ b/drivers/iommu/virtio-iommu.c
@@ -0,0 +1,980 @@
+/*
+ * Virtio driver for the paravirtualized IOMMU
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2017 ARM Limited
+ *
+ * Author: Jean-Philippe Brucker 
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+struct viommu_dev {
+   struct iommu_device iommu;
+   struct device   *dev;
+   struct virtio_device*vdev;
+
+   struct virtqueue*vq;
+   struct list_headpending_requests;
+   /* Serialize anything touching the vq and the request list */
+   spinlock_t  vq_lock;
+
+   struct list_headlist;
+
+   /* Device configuration */
+   u64 pgsize_bitmap;
+   u64 aperture_start;
+   u64 aperture_end;
+};
+
+struct viommu_mapping {
+   phys_addr_t paddr;
+   struct interval_tree_node   iova;
+};
+
+struct viommu_domain {
+   struct iommu_domain domain;
+   struct viommu_dev   *viommu;
+   struct mutexmutex;
+   u64 id;
+
+   spinlock_t