Hi Eric,

>-----Original Message-----
>From: Eric Auger <eric.au...@redhat.com>
>Sent: Tuesday, July 4, 2023 7:15 PM
>Subject: [PATCH 1/2] virtio-iommu: Fix 64kB host page size VFIO device
>assignment
>
>When running on a 64kB page size host and protecting a VFIO device
>with the virtio-iommu, qemu crashes with this kind of message:
>
>qemu-kvm: virtio-iommu page mask 0xfffffffffffff000 is incompatible
>with mask 0x20010000
Does 0x20010000 mean that the host IOMMU hardware supports only 512MB and
64KB superpage mappings, with 4KB not supported? There is a check on the
guest kernel side (below) which fails when the granule is larger than the
guest page size, so with this patch forcing viommu->pgsize_bitmap to
0x20010000, a guest using 4KB pages would fail that check. Does this device
still work in the guest? Please correct me if I've misunderstood.

static int viommu_domain_finalise(struct viommu_endpoint *vdev,
                                  struct iommu_domain *domain)
{
...
        viommu_page_size = 1UL << __ffs(viommu->pgsize_bitmap);
        if (viommu_page_size > PAGE_SIZE) {
                dev_err(vdev->dev,
                        "granule 0x%lx larger than system page size 0x%lx\n",
                        viommu_page_size, PAGE_SIZE);
                return -ENODEV;
        }

Another question: presuming 0x20010000 does mean that only 512MB and 64KB
are supported, is the host hva->hpa mapping guaranteed to be compatible
with at least 64KB mappings? If the host MMU supports only 4KB mappings,
or some other granule incompatible with 0x20010000, will vfio_dma_map()
fail?
>
>Fixes: 90519b9053 ("virtio-iommu: Add bypass mode support to assigned
>device")
>Signed-off-by: Eric Auger <eric.au...@redhat.com>
>---
> include/hw/virtio/virtio-iommu.h |  2 ++
> hw/virtio/virtio-iommu.c         | 30 ++++++++++++++++++++++++++++--
> hw/virtio/trace-events           |  1 +
> 3 files changed, 31 insertions(+), 2 deletions(-)
>
>diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-
>iommu.h
>index 2ad5ee320b..a93fc5383e 100644
>--- a/include/hw/virtio/virtio-iommu.h
>+++ b/include/hw/virtio/virtio-iommu.h
>@@ -61,6 +61,8 @@ struct VirtIOIOMMU {
>     QemuRecMutex mutex;
>     GTree *endpoints;
>     bool boot_bypass;
>+    Notifier machine_done;
>+    bool granule_frozen;
> };
>
> #endif
>diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>index 1cd258135d..1eaf81bab5 100644
>--- a/hw/virtio/virtio-iommu.c
>+++ b/hw/virtio/virtio-iommu.c
>@@ -24,6 +24,7 @@
> #include "hw/virtio/virtio.h"
> #include "sysemu/kvm.h"
> #include "sysemu/reset.h"
>+#include "sysemu/sysemu.h"
> #include "qapi/error.h"
> #include "qemu/error-report.h"
> #include "trace.h"
>@@ -1106,12 +1107,12 @@ static int
>virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr,
>     }
>
>     /*
>-     * After the machine is finalized, we can't change the mask anymore. If by
>+     * Once the granule is frozen we can't change the mask anymore. If by
>      * chance the hotplugged device supports the same granule, we can still
>      * accept it. Having a different masks is possible but the guest will use
>      * sub-optimal block sizes, so warn about it.
>      */
>-    if (phase_check(PHASE_MACHINE_READY)) {
>+    if (s->granule_frozen) {
>         int new_granule = ctz64(new_mask);
>         int cur_granule = ctz64(cur_mask);
>
>@@ -1146,6 +1147,27 @@ static void virtio_iommu_system_reset(void
>*opaque)
>
> }
>
>+static void virtio_iommu_freeze_granule(Notifier *notifier, void *data)
>+{
>+    VirtIOIOMMU *s = container_of(notifier, VirtIOIOMMU, machine_done);
>+    bool boot_bypass = s->config.bypass;
>+    int granule;
>+
>+    /*
>+     * Transient IOMMU MR enable to collect page_size_mask requirement
>+     * through memory_region_iommu_set_page_size_mask() called by
>+     * VFIO region_add() callback
>+     */
>+    s->config.bypass = 0;
>+    virtio_iommu_switch_address_space_all(s);
>+    /* restore default */
>+    s->config.bypass = boot_bypass;
>+    virtio_iommu_switch_address_space_all(s);
>+    s->granule_frozen = true;
>+    granule = ctz64(s->config.page_size_mask);
>+    trace_virtio_iommu_freeze_granule(BIT(granule));
>+}

This looks a bit heavy just to get the page_size_mask from the host side,
but maybe it is the only way with the current interface.
Thanks,
Zhenzhong

>+
> static void virtio_iommu_device_realize(DeviceState *dev, Error **errp)
> {
>     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>@@ -1189,6 +1211,9 @@ static void virtio_iommu_device_realize(DeviceState
>*dev, Error **errp)
>         error_setg(errp, "VIRTIO-IOMMU is not attached to any PCI bus!");
>     }
>
>+    s->machine_done.notify = virtio_iommu_freeze_granule;
>+    qemu_add_machine_init_done_notifier(&s->machine_done);
>+
>     qemu_register_reset(virtio_iommu_system_reset, s);
> }
>
>@@ -1198,6 +1223,7 @@ static void
>virtio_iommu_device_unrealize(DeviceState *dev)
> {
>     VirtIOIOMMU *s = VIRTIO_IOMMU(dev);
>
>     qemu_unregister_reset(virtio_iommu_system_reset, s);
>+    qemu_remove_machine_init_done_notifier(&s->machine_done);
>
>     g_hash_table_destroy(s->as_by_busptr);
>     if (s->domains) {
>diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
>index 8f8d05cf9b..68b752e304 100644
>--- a/hw/virtio/trace-events
>+++ b/hw/virtio/trace-events
>@@ -131,6 +131,7 @@ virtio_iommu_set_page_size_mask(const char *name,
>uint64_t old, uint64_t new) "m
> virtio_iommu_notify_flag_add(const char *name) "add notifier to mr %s"
> virtio_iommu_notify_flag_del(const char *name) "del notifier from mr %s"
> virtio_iommu_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn,
>bool on) "Device %02x:%02x.%x switching address space (iommu
>enabled=%d)"
>+virtio_iommu_freeze_granule(uint64_t page_size_mask) "granule set to
>0x%"PRIx64
>
> # virtio-mem.c
> virtio_mem_send_response(uint16_t type) "type=%" PRIu16
>--
>2.38.1