On 2025/10/22 14:10, Kasireddy, Vivek wrote:
Hi Akihiko,
Subject: Re: [PATCH v1 2/7] virtio-gpu: Don't rely on res->blob to
identify
blob resources
Subject: Re: [PATCH v1 2/7] virtio-gpu: Don't rely on res->blob to
identify blob
resources
Subject: Re: [PATCH v1 2/7] virtio-gpu: Don't rely on res->blob
to
identify
blob
resources
On 2025/10/04 8:35, Vivek Kasireddy wrote:
The res->blob pointer may not be valid (non-NULL) for some
blobs
where the backing storage is not memfd based. Therefore,
we
cannot
use it to determine if a resource is a blob or not. Instead, we
could use res->blob_size to make this determination as it is
non-zero for blob resources regardless of where their backing
storage is located.
I think this patch is no longer necessary since now you add
code to
mmap() VFIO storage with "[PATCH v1 7/7] virtio-gpu-
udmabuf:
Create
dmabuf for blobs associated with VFIO devices".
Right, but given that mmap() can still fail for various reasons
and
this
use-case can work as long as dmabuf creation succeeds, I think
it
makes
sense to not rely on res->blob to determine if a resource is blob
or
not.
I think the code will be simpler by making resource creation fail
when
mmap() fails, and I am concerned that the guest may
mulfunction
with
such an incomplete resource.
AFAICT, mmap() is a slow, optional path except for the cursor
(which
needs
further improvement). So, failing resource creation when mmap()
fails
does not seem like a good idea to me given the fact that
supporting
mmap()
is considered optional for dmabuf providers. And, even with vfio,
mmap()
can be blocked for various reasons by the kernel driver IIUC.
Reviewing the code again, I don't think mmap() can fail with the
current
version of this series.
udmabuf obviously always supports mmap().
For VFIO, checking memory_region_is_ram_device() ensures that
VFIO
supports mmap(); memory_region_init_ram_device_ptr() is called
from
vfio_region_mmap(), which is only called when VFIO supports
mmap().
My point is not whether a dmabuf provider provides support for
mmap()
or not but about the fact that mmap() can fail (for various reasons
because
it is not a guarantee) making res->blob NULL. But we are incorrectly
using
res->blob pointer to determine whether a resource is a blob (and
usable)
or not which doesn't make sense because even if res->blob is NULL,
the
resource is still valid and usable via the dmabuf fd, which is the
preferred,
accelerated path.
Failing to mmap something that is already mmap-ed to another
address is
very unrealistic and I can't really think of a possibility of such a
failure aside bugs.
The fact that it is already mmap'd to another address would only be
true for
VFIO devices but as I mentioned previously, we cannot make such
assumptions
with other (future) dmabuf providers.
It is true for udmabuf, though the memfds that back udmabuf are
directly
mapped instead; I don't think the indirection of udmabuf makes any
difference.
If it's only for future DMA-BUF exporter, it is better to make the
change when the exporter is actually added, or we are adding code that
cannot be tested right now and may or may not work when such an
exporter
is added.
If this condition (a valid resource with a NULL res->blob) could only
happen due to a bug, then, in my opinion, marking such a resource as
invalid is actually a more defensive and desirable approach. If a core
operation like mmap fails unexpectedly on a resource that should
support
But mmap is not considered as a core operation for dmabuf. It is
considered
optional by dmabuf providers. For example, although very unlikely, it
might
be possible that support for mmap() can be removed from udmabuf
driver
driver for some reason. And, when this happens, the only adverse effect
would
be that gl=off would not work, which is not great but definitely not
catastrophic.
We should be able to safely assume it never happens due to the "no
regressions" rule of Linux. If a userspace program breaks due to a UAPI
I help maintain the udmabuf driver in the kernel and AFAICT, that rule
does
not apply here because udmabuf driver providing support for mmap()
cannot
be considered UAPI because it is not providing any direct (user visible)
interface
to invoke mmap() as described here:
https://docs.kernel.org/admin-guide/reporting-regressions.html#what-is-
a-regression-and-what-is-the-no-regressions-rule
I suppose that the result of open("/dev/udmabuf", O_RDWR) and the
UDMABUF_CREATE_LIST ioctl is always compatible with mmap(). Not sure
what you mean by "direct" interface, but they are all plain userspace
interfaces.
I suggest asking people maintaining the udmabuf and the DMA-BUF
infrastructure if you know them.
I am one of the two maintainers of udmabuf driver and I am also involved
in the discussions with dmabuf maintainers regarding upstreaming of
vfio-pci dmabuf feature in the kernel. And, based on these interactions,
it is clear that mmap() is completely optional for dmabuf providers and
userspace should never make any assumptions about whether mmap()
(for dmabuf) works or not. Here is one reference:
https://lore.kernel.org/dri-devel/[email protected]/
That adds nothing more than you have already told in the past discussion
in this thread.
I started this thread because I found that the current implementation
only exercises udmabuf and VFIO devices that is known to implement
mmap(). On the other hand, the cited email discusses VFIO devices that
are not known to implement mmap() or future DMA-BUF exporters in
general. It says:
> Just wanted to confirm that it's entirely legit to not implement
> dma-buf mmap. Same for the in-kernel vmap functions. Especially for
> really funny buffers like these it's just not a good idea, and the
> dma-buf interfaces are intentionally "everything is optional".
This doesn't apply to udmabuf and VFIO devices we are concerned with.
udmabuf is obviously not "funny buffers", and can be vmapped. Besides,
the udmabuf already implements mmap() and *removing* it can break
userspace unlike a DMA-BUF exporter that never implemented mmap(), which
was discussed in the cited email.
Speaking of VFIO devices, the code already checks it's mmapped by
calling memory_region_is_ram_device() and yet this patch tries to allow
blobs that cannot be mmapped, which is inconsistent.
So, the discussion should be summarized as the following two questions:
1. Can the kernel remove an existing mmap() implementation that is known
to be depended by the userspace (udmabuf)?
2. Can the kernel let an mmap() implementation fail though it once
worked (VFIO)?
The cited email doesn't answer either of them.
Regards,
Akihiko Odaki