On 2026/01/27 15:14, Kasireddy, Vivek wrote:
Hi Akihiko,

Subject: Re: [PATCH v4 8/8] virtio-gpu-dmabuf: Create dmabuf for blobs
associated with VFIO devices

On 2026/01/23 15:17, Vivek Kasireddy wrote:
In addition to memfd, a blob resource can also have its backing
storage in a VFIO device region. Since, there is no effective way
to determine where the backing storage is located, we first try to
create a dmabuf assuming it is in memfd. If that fails, we try to
create a dmabuf assuming it is in VFIO device region.

So, we first call virtio_gpu_create_udmabuf() to check if the blob's
backing storage is located in a memfd or not. If it is not, we call
vfio_create_dmabuf() which identifies the right VFIO device and
eventually invokes the vfio_device_create_dmabuf_fd() API to have
a dmabuf fd created.

Note that in virtio_gpu_remap_dmabuf(), we first try to test if
the VFIO dmabuf exporter supports mmap or not. If it doesn't, we
use the VFIO device fd directly to create the CPU mapping.

While at it, remove the unnecessary rcu_read_lock/rcu_read_unlock
from virtio_gpu_create_udmabuf() and also replace warn_report()
with qemu_log_mask().

Cc: Marc-André Lureau <[email protected]>
Cc: Alex Bennée <[email protected]>
Cc: Akihiko Odaki <[email protected]>
Cc: Dmitry Osipenko <[email protected]>
Cc: Alex Williamson <[email protected]>
Cc: Cédric Le Goater <[email protected]>
Signed-off-by: Vivek Kasireddy <[email protected]>
---
   hw/display/Kconfig             |   5 ++
   hw/display/virtio-gpu-dmabuf.c | 117
++++++++++++++++++++++++++++++---
   2 files changed, 113 insertions(+), 9 deletions(-)

diff --git a/hw/display/Kconfig b/hw/display/Kconfig
index 1e95ab28ef..0d090f25f5 100644
--- a/hw/display/Kconfig
+++ b/hw/display/Kconfig
@@ -106,6 +106,11 @@ config VIRTIO_VGA
       depends on VIRTIO_PCI
       select VGA

+config VIRTIO_GPU_VFIO_BLOB
+    bool
+    default y
+    depends on VFIO
+
   config VHOST_USER_GPU
       bool
       default y
diff --git a/hw/display/virtio-gpu-dmabuf.c b/hw/display/virtio-gpu-
dmabuf.c
index c34d4c85bc..2dfe70d7eb 100644
--- a/hw/display/virtio-gpu-dmabuf.c
+++ b/hw/display/virtio-gpu-dmabuf.c
@@ -18,6 +18,7 @@
   #include "ui/console.h"
   #include "hw/virtio/virtio-gpu.h"
   #include "hw/virtio/virtio-gpu-pixman.h"
+#include "hw/vfio/vfio-device.h"
   #include "trace.h"
   #include "system/ramblock.h"
   #include "system/hostmem.h"
@@ -27,6 +28,38 @@
   #include "standard-headers/linux/udmabuf.h"
   #include "standard-headers/drm/drm_fourcc.h"

+static void vfio_create_dmabuf(struct virtio_gpu_simple_resource
*res)
+{
+#if defined(VIRTIO_GPU_VFIO_BLOB)
+    VFIODevice *vbasedev;
+    RAMBlock *first_rb;
+    ram_addr_t offset;
+
+    first_rb = qemu_ram_block_from_host(res->iov[0].iov_base, false,
&offset);
+    if (!first_rb) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: Could not find ramblock\n", __func__);
+        return;
+    }
+
+    vbasedev = vfio_device_lookup(first_rb->mr);
+    if (!vbasedev) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: No VFIO device found to create dmabuf from\n",
+                      __func__);
+        return;
+    }
+
+    res->dmabuf_fd = vfio_device_create_dmabuf_fd(vbasedev,
+                                                  res->iov, res->iov_cnt);
+    if (res->dmabuf_fd < 0) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: VFIO_DEVICE_FEATURE_DMA_BUF: %s\n",
+                      __func__, strerror(errno));
+    }

This API design of vfio_device_create_dmabuf_fd() has some room of
improvements.

First, this function requires to pass VFIODevice * which is derived from
res->iov[0]. Since it is getting res->iov[0] anyway, such a parameter is
redundant. vfio_device_create_dmabuf_fd() can call
vfio_device_lookup()
itself instead.
Yeah, I agree. I'll move the call to lookup into vfio_device_create_dmabuf_fd().


The second problem is the error handling. The error may not be caused by
VFIO_DEVICE_FEATURE_DMA_BUF but may be caused by the guest
passing
regions that do not belong to one VFIO device. The two cases should be
distinguished for a correct message.
Ok, I'll add separate error messages.


+#endif
+}
+
   static void virtio_gpu_create_udmabuf(struct
virtio_gpu_simple_resource *res)
   {
       struct udmabuf_create_list *list;
@@ -43,10 +76,7 @@ static void virtio_gpu_create_udmabuf(struct
virtio_gpu_simple_resource *res)
                        sizeof(struct udmabuf_create_item) * res->iov_cnt);

       for (i = 0; i < res->iov_cnt; i++) {
-        rcu_read_lock();
           rb = qemu_ram_block_from_host(res->iov[i].iov_base, false,
&offset);
-        rcu_read_unlock();
-
           if (!rb || rb->fd < 0) {
               g_free(list);
               return;
@@ -62,17 +92,84 @@ static void virtio_gpu_create_udmabuf(struct
virtio_gpu_simple_resource *res)

       res->dmabuf_fd = ioctl(udmabuf, UDMABUF_CREATE_LIST, list);
       if (res->dmabuf_fd < 0) {
-        warn_report("%s: UDMABUF_CREATE_LIST: %s", __func__,
-                    strerror(errno));
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "%s: UDMABUF_CREATE_LIST: %s\n",
+                      __func__, strerror(errno));
       }
       g_free(list);
   }

These changes of virtio_gpu_create_udmabuf() are independent so
should
go into separate patches with explanation.
I called these out in the commit message and included them as they were
trivial but I guess I could move them into separate patches.

As docs/devel/submitting-a-patch.rst says,
> A commit message that mentions "Also, ..." is often a good candidate
> for splitting into multiple patches.

I think there will be another round of review so I suggest taking the chance to do so.




+static void *vfio_dmabuf_mmap(struct virtio_gpu_simple_resource
*res)
+{
+    struct vfio_region_info *info = NULL;
+    VFIODevice *vbasedev = NULL;
+    ram_addr_t offset, len = 0;
+    RAMBlock *first_rb, *rb;
+    void *map, *submap;
+    int i, ret = -1;
+
+    first_rb = qemu_ram_block_from_host(res->iov[0].iov_base, false,
&offset);
+    if (!first_rb) {
+        return MAP_FAILED;
+    }
+#if defined(VIRTIO_GPU_VFIO_BLOB)
+    vbasedev = vfio_device_lookup(first_rb->mr);
+#endif
+    if (!vbasedev) {
+        return MAP_FAILED;
+    }
+
+    /*
+     * We first reserve a contiguous chunk of address space for the entire
+     * dmabuf, then replace it with smaller mappings that correspond to
the
+     * individual segments of the dmabuf.
+     */
+    map = mmap(NULL, res->blob_size, PROT_READ, MAP_SHARED,
vbasedev->fd, 0);

It should have PROT_NONE instead of PROT_READ to prevent accidental
accesses to the placeholder mapping.
Likewise, -1 should be passed instead of vbasedev->fd so that the
placeholder mapping won't result in a mapping to real device.
I am OK with both suggestions. However, I don't see any problem if the
placeholder mapping is associated with the real device. In other words,
what specific problem do you see if we keep vbasedev->fd instead of -1?

These two suggestions are to avoid accidents. If the code is right, there will be no accesses to the head of the device memory. But if there is something wrong with it, these two changes will be safe guards. In other words, with these two changes, you will not have such an accidental access unless all of the following three happen:
1) Accidentally accessing the mapped page.
2) The protection is set something else PROT_NONE.
3) The mapping points to the device.

Each of these conditions make accidents rarer to occur.



+    if (map == MAP_FAILED) {
+        return map;
+    }
+
+    for (i = 0; i < res->iov_cnt; i++) {
+        rb = qemu_ram_block_from_host(res->iov[i].iov_base, false,
&offset);
+        if (rb != first_rb) {
+            goto err;
+        }
+#if defined(VIRTIO_GPU_VFIO_BLOB)
+        ret = vfio_get_region_index_from_mr(rb->mr);
+        if (ret < 0) {
+            goto err;
+        }
+
+        ret = vfio_device_get_region_info(vbasedev, ret, &info);

This function can take rb->mr and call vfio_device_lookup() and
vfio_get_region_index_from_mr() internally to simplify the API as
similar can be done for vfio_device_create_dmabuf_fd().
Right, but vfio_device_get_region_info() is an already existing API
that is called from various places that I am trying to leverage. I could
create another API to wrap the calls to vfio_get_region_index_from_mr()
and vfio_device_get_region_info() but not sure if that makes sense given
that it would only have one caller.

vfio_device_get_region_info() should be regarded as an internal function of VFIO.

Thinking of that, the current API design that relies on vfio_device_lookup() to expose VFIODevice * is not optimal. We never want virtio-gpu or anyone else VFIO to mess up with it using functions declared in include/hw/vfio/vfio-device.h. Likewise, exposing
vbasedev->fd introduces bunch of hazards.

I think this entire function is better moved to VFIO for better encapsulation. I also think it should be placed close to vfio_device_create_dmabuf_fd(). These functions are quite similar so are better reviewed togather. Any future event to change either of the function is also likely to require a change to the other, so placing them nearby will be convenient.

Regards,
Akihiko Odaki

Reply via email to