Hi Akihiko, > > > >> Subject: Re: [PATCH v1 2/7] virtio-gpu: Don't rely on res->blob to identify > >> blob resources > >> > >>>>>> > >>>>>>> Subject: Re: [PATCH v1 2/7] virtio-gpu: Don't rely on res->blob to > >>>>>>> identify blob > >>>>>>> resources > >>>>>>> > >>>>>>>> > >>>>>>>>> Subject: Re: [PATCH v1 2/7] virtio-gpu: Don't rely on res->blob to > >>>>>>>>> identify > >>>>>>> blob > >>>>>>>>> resources > >>>>>>>>> > >>>>>>>>> On 2025/10/04 8:35, Vivek Kasireddy wrote: > >>>>>>>>>> The res->blob pointer may not be valid (non-NULL) for some > blobs > >>>>>>>>>> where the backing storage is not memfd based. Therefore, we > >> cannot > >>>>>>>>>> use it to determine if a resource is a blob or not. Instead, we > >>>>>>>>>> could use res->blob_size to make this determination as it is > >>>>>>>>>> non-zero for blob resources regardless of where their backing > >>>>>>>>>> storage is located. > >>>>>>>>> > >>>>>>>>> I think this patch is no longer necessary since now you add code to > >>>>>>>>> mmap() VFIO storage with "[PATCH v1 7/7] virtio-gpu-udmabuf: > >> Create > >>>>>>>>> dmabuf for blobs associated with VFIO devices". > >>>>>>>> Right, but given that mmap() can still fail for various reasons and > >>>>>>>> this > >>>>>>>> use-case can work as long as dmabuf creation succeeds, I think it > >> makes > >>>>>>>> sense to not rely on res->blob to determine if a resource is blob or > >>>>>>>> not. > >>>>>>> > >>>>>>> I think the code will be simpler by making resource creation fail when > >>>>>>> mmap() fails, and I am concerned that the guest may mulfunction > >> with > >>>>>>> such an incomplete resource. > >>>>>> AFAICT, mmap() is a slow, optional path except for the cursor (which > >>>>>> needs > >>>>>> further improvement). So, failing resource creation when mmap() fails > >>>>>> does not seem like a good idea to me given the fact that supporting > >>>>>> mmap() > >>>>>> is considered optional for dmabuf providers. And, even with vfio, > >> mmap() > >>>>>> can be blocked for various reasons by the kernel driver IIUC. > >>>> > >>>> Reviewing the code again, I don't think mmap() can fail with the current > >>>> version of this series. > >>>> > >>>> udmabuf obviously always supports mmap(). > >>>> > >>>> For VFIO, checking memory_region_is_ram_device() ensures that VFIO > >>>> supports mmap(); memory_region_init_ram_device_ptr() is called from > >>>> vfio_region_mmap(), which is only called when VFIO supports mmap(). > >>> My point is not whether a dmabuf provider provides support for mmap() > >>> or not but about the fact that mmap() can fail (for various reasons > >> because > >>> it is not a guarantee) making res->blob NULL. But we are incorrectly using > >>> res->blob pointer to determine whether a resource is a blob (and usable) > >>> or not which doesn't make sense because even if res->blob is NULL, the > >>> resource is still valid and usable via the dmabuf fd, which is the > >>> preferred, > >>> accelerated path. > >> > >> Failing to mmap something that is already mmap-ed to another address is > >> very unrealistic and I can't really think of a possibility of such a > >> failure aside bugs. > > The fact that it is already mmap'd to another address would only be true for > > VFIO devices but as I mentioned previously, we cannot make such > assumptions > > with other (future) dmabuf providers. > > It is true for udmabuf, though the memfds that back udmabuf are directly > mapped instead; I don't think the indirection of udmabuf makes any > difference. > > If it's only for future DMA-BUF exporter, it is better to make the > change when the exporter is actually added, or we are adding code that > cannot be tested right now and may or may not work when such an exporter > is added. > > > > >> > >> If this condition (a valid resource with a NULL res->blob) could only > >> happen due to a bug, then, in my opinion, marking such a resource as > >> invalid is actually a more defensive and desirable approach. If a core > >> operation like mmap fails unexpectedly on a resource that should support > > But mmap is not considered as a core operation for dmabuf. It is considered > > optional by dmabuf providers. For example, although very unlikely, it might > > be possible that support for mmap() can be removed from udmabuf driver > > driver for some reason. And, when this happens, the only adverse effect > would > > be that gl=off would not work, which is not great but definitely not > catastrophic. > > We should be able to safely assume it never happens due to the "no > regressions" rule of Linux. If a userspace program breaks due to a UAPI I help maintain the udmabuf driver in the kernel and AFAICT, that rule does not apply here because udmabuf driver providing support for mmap() cannot be considered UAPI because it is not providing any direct (user visible) interface to invoke mmap() as described here: https://docs.kernel.org/admin-guide/reporting-regressions.html#what-is-a-regression-and-what-is-the-no-regressions-rule
> change, the UAPI change is a breaking change, which kernel developers > carefully avoid. > > Existing QEMU versions will break if such a change happens. Perhaps the Qemu should not have assumed that udmabuf driver (or any dmabuf provider) would always support mmap(), because for dmabuf providers like udmabuf, supporting mmap() is optional as mentioned here: https://elixir.bootlin.com/linux/v6.17/source/include/linux/dma-buf.h#L269 And, as mentioned previously, Qemu would not break if a dmabuf provider does not support mmap() because the preferred, fd based fast path (gl=on) would still be available. Thanks, Vivek > no regressions policy may be exempted if the relevant userspace program > is so broken that it depends on something never meant to be stable, but > it is not the case. > > Regards, > Akihiko Odaki
