Hi Alex,
On 21/04/2026 20:31, Alex Williamson wrote:
On Tue, 21 Apr 2026 10:41:40 -0700
Matt Evans <[email protected]> wrote:
The previous lazy-setup of the barmaps provided opportunities to
forget to do it (for example, DMABUF export). Since all users will
ultimately require BAR resources to have been requested, request them
in vfio_pci_core_enable.
Existing calls to vfio_pci_core_setup_barmap() are now benign, but
remain because some callers use it to validate a BAR index. This
fixes at least the case where DMABUF export could succeed before
resources were requested.
This keeps resource request and ioremap() done at the same time, but
in future the ioremap() could be done on-demand (not all VFIO users
need it).
Fixes: 7f5764e179c6 ("vfio: use vfio_pci_core_setup_barmap to map bar in mmap")
Fixes: 0d77ed3589ac0 ("vfio/pci: Pull BAR mapping setup from read-write path")
Signed-off-by: Matt Evans <[email protected]>
---
drivers/vfio/pci/vfio_pci_core.c | 63 +++++++++++++++++++++++++++-----
drivers/vfio/pci/vfio_pci_rdwr.c | 25 +++----------
2 files changed, 60 insertions(+), 28 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 3f8d093aacf8..4a314213f3ae 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -482,6 +482,55 @@ static int vfio_pci_core_runtime_resume(struct device *dev)
}
#endif /* CONFIG_PM */
+static void __vfio_pci_core_unmap_bars(struct vfio_pci_core_device *vdev)
+{
+ struct pci_dev *pdev = vdev->pdev;
+ int i;
+
+ for (i = 0; i < PCI_STD_NUM_BARS; i++) {
+ int bar = i + PCI_STD_RESOURCES;
+
+ if (!vdev->barmap[i])
+ continue;
+ pci_iounmap(pdev, vdev->barmap[bar]);
+ pci_release_selected_regions(pdev, 1 << bar);
+ vdev->barmap[bar] = NULL;
+ }
+}
+
+static int __vfio_pci_core_map_bars(struct vfio_pci_core_device *vdev)
+{
+ struct pci_dev *pdev = vdev->pdev;
+ int ret = 0;
+ int i;
+
+ /* Eager-request BAR resources (and iomap) */
+ for (i = 0; i < PCI_STD_NUM_BARS; i++) {
+ int bar = i + PCI_STD_RESOURCES;
+ void __iomem *io;
+
+ if (pci_resource_len(pdev, i) == 0)
+ continue;
+
+ ret = pci_request_selected_regions(pdev, 1 << bar, "vfio");
+ if (ret)
+ goto err_free_barmap;
+
+ io = pci_iomap(pdev, bar, 0);
+ if (!io) {
+ pci_release_selected_regions(pdev, 1 << bar);
+ ret = -ENOMEM;
+ goto err_free_barmap;
+ }
+ vdev->barmap[bar] = io;
+ }
+ return 0;
+
+err_free_barmap:
+ __vfio_pci_core_unmap_bars(vdev);
+ return ret;
+}
+
/*
* The pci-driver core runtime PM routines always save the device state
* before going into suspended state. If the device is going into low power
@@ -568,6 +617,9 @@ int vfio_pci_core_enable(struct vfio_pci_core_device *vdev)
if (!vfio_vga_disabled() && vfio_pci_is_vga(pdev))
vdev->has_vga = true;
+ ret = __vfio_pci_core_map_bars(vdev);
+ if (ret)
+ goto out_free_zdev;
Beyond simply changing to a preemptive mapping scheme, this hard
failure makes me concerned about regressing userspace. With the
current lazy scheme, we only claim the BAR if the user accesses it and
the failure only occurs in the access path. That means we could have
BARs that are never mapped. With a hard failure here, with might be
uncovering some latent issues, and if those latent issues are with BARs
that we never cared to map previously, we're causing trouble for no
gain.
Thanks. Hm, so for example a valid non-zero-sized BAR but the
pci_iomap() hitting ENOMEM. Good point, that could be annoying if the
BAR was otherwise unused.
I'd suggest setup should be a void function that generates pci_warn()
on conflict or iomap error, but doesn't block the device. Each usage
path (including mmap that gets removed in the next path) still needs to
validate the mapping is present for the given BAR and fail the call
path otherwise.
I'd hoped to be able to assume resources have been successfully
requested, since we've at least one example of forgetting to do it
explicitly (DMABUF export), but no worries. I'll re-add explicit checks
to usage paths (incl. for DMABUF export) to make failures consistent
with current behaviour.
There's also a subtle range check added in the virtio driver in the
next patch, is that fixing a bug or paranoia? Do we need a helper that
does a range test and barmap test? Thanks,
That'll be nicer, I agree. Given the point above, two tiny helpers make
sense for the two "uses":
- Was resource X requested successfully? (For DMABUF, mmap, etc.)
- Get X's iomapped base address. (For rdwr, virtio, nvgpu-grace.)
(The motivation for the range check was just robustness, not a specific
bug. It's cheap to sanity-check and Pretty Bad if the index is ever out
of range.)
Reposting... Thanks,
Matt