On Thu, 30 Apr 2026 03:03:20 -0700
Matt Evans <[email protected]> wrote:

> Previously BAR resource requests and the corresponding pci_iomap()
> were performed on-demand and without synchronisation, which was racy.
> Rather than add synchronisation, it's simplest to address this by
> doing both activities from vfio_pci_core_enable().
> 
> The resource allocation and/or pci_iomap() can still fail; their
> status is tracked and existing calls to vfio_pci_core_setup_barmap()
> will fail in a similar way to before.  This keeps the point of failure
> as observed by userspace the same, i.e. failures to request/map unused
> BARs are benign.
> 
> Fixes: 7f5764e179c6 ("vfio: use vfio_pci_core_setup_barmap to map bar in mmap")
> Fixes: 0d77ed3589ac0 ("vfio/pci: Pull BAR mapping setup from read-write path")

Neither of these introduced the race; they only moved what they were
already doing into a function, or made use of that shared function for
what they were already doing.  I'm inclined to believe the raciness
has existed since the introduction, 89e1f7d4c66d.

> Signed-off-by: Matt Evans <[email protected]>
> ---
>  drivers/vfio/pci/vfio_pci_core.c | 33 ++++++++++++++++++++++++++++++++
>  drivers/vfio/pci/vfio_pci_rdwr.c | 29 ++++++++++++----------------
>  2 files changed, 45 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index 3f8d093aacf8..eab4f2626b39 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -482,6 +482,38 @@ static int vfio_pci_core_runtime_resume(struct device *dev)
>  }
>  #endif /* CONFIG_PM */
>  
> +static void vfio_pci_core_map_bars(struct vfio_pci_core_device *vdev)
> +{
> +     struct pci_dev *pdev = vdev->pdev;
> +     int i;
> +
> +     /*
> +      * Eager-request BAR resources, and iomap.  Soft failures are
> +      * allowed, and consumers must check the barmap before use in
> +      * order to give compatible user-visible behaviour with the
> +      * previous on-demand allocation method.
> +      */
> +     for (i = 0; i < PCI_STD_NUM_BARS; i++) {
> +             int bar = i + PCI_STD_RESOURCES;
> +             void __iomem *io = ERR_PTR(-ENODEV);

It would collapse the nesting depth to just do:

                vdev->barmap[bar] = ERR_PTR(-ENODEV);

                if (!pci_resource_len(pdev, i))
                        continue;

                if (pci_request_selected_regions(pdev, 1 << bar, "vfio")) {
                        pci_dbg(vdev->pdev, "Failed to reserve region %d\n",
                                bar);
                        vdev->barmap[bar] = ERR_PTR(-EBUSY);
                        continue;
                }

                vdev->barmap[bar] = pci_iomap(pdev, bar, 0);
                if (!vdev->barmap[bar]) {
                        pci_dbg(vdev->pdev, "Failed to iomap region %d\n", bar);
                        vdev->barmap[bar] = ERR_PTR(-ENOMEM);
                }

It's debatable what level to use for the errors, but we were previously
silent on this, so going all the way to pci_warn() seems unnecessary.

> +
> +             if (pci_resource_len(pdev, i) > 0) {
> +                     if (pci_request_selected_regions(pdev, 1 << bar, "vfio")) {
> +                             pci_warn(vdev->pdev, "Failed to reserve region %d\n", bar);
> +                             io = ERR_PTR(-EBUSY);
> +                     } else {
> +                             io = pci_iomap(pdev, bar, 0);
> +                             if (!io) {
> +                                     pci_warn(vdev->pdev, "Failed to iomap region %d\n",
> +                                              bar);
> +                                     io = ERR_PTR(-ENOMEM);
> +                             }
> +                     }
> +             }
> +             vdev->barmap[bar] = io;
> +     }
> +}
> +
>  /*
>   * The pci-driver core runtime PM routines always save the device state
>   * before going into suspended state. If the device is going into low power
> @@ -568,6 +600,7 @@ int vfio_pci_core_enable(struct vfio_pci_core_device *vdev)
>       if (!vfio_vga_disabled() && vfio_pci_is_vga(pdev))
>               vdev->has_vga = true;
>  
> +     vfio_pci_core_map_bars(vdev);
>  
>       return 0;

You're missing the barmap test in vfio_pci_core_disable() now; it's
still testing for NULL, which is (almost?) never true.  It needs to
convert to IS_ERR_OR_NULL().
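
To illustrate, here is a minimal userspace sketch, not kernel code.
ERR_PTR(), IS_ERR() and IS_ERR_OR_NULL() are local stand-ins modelled
on <linux/err.h>, NUM_BARS and the two unmap_count_*() helpers are
hypothetical, and it assumes the disable loop decides whether to unmap
a BAR from a single test of its barmap[] slot:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* stand-in for the kernel's limit on encoded errnos */
#define NUM_BARS  6	/* hypothetical local name; six standard BARs */

/* Userspace stand-ins for the <linux/err.h> helpers. */
static void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* NULL-only test: ERR_PTR-encoded slots are treated as live mappings. */
static int unmap_count_null_test(void *const barmap[NUM_BARS])
{
	int i, n = 0;

	for (i = 0; i < NUM_BARS; i++) {
		if (!barmap[i])
			continue;
		n++;	/* the unmap and region release would happen here */
	}
	return n;
}

/* IS_ERR_OR_NULL() test: only slots holding a real mapping are counted. */
static int unmap_count_err_or_null_test(void *const barmap[NUM_BARS])
{
	int i, n = 0;

	for (i = 0; i < NUM_BARS; i++) {
		if (IS_ERR_OR_NULL(barmap[i]))
			continue;
		n++;	/* the unmap and region release would happen here */
	}
	return n;
}
```

With a barmap holding two real mappings, three ERR_PTR() slots and one
NULL, the first helper would try to unmap five entries and the second
only two.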

>  
> diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
> index 4251ee03e146..f66ad3d96481 100644
> --- a/drivers/vfio/pci/vfio_pci_rdwr.c
> +++ b/drivers/vfio/pci/vfio_pci_rdwr.c
> @@ -200,25 +200,20 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_do_io_rw);
>  
>  int vfio_pci_core_setup_barmap(struct vfio_pci_core_device *vdev, int bar)
>  {
> -     struct pci_dev *pdev = vdev->pdev;
> -     int ret;
> -     void __iomem *io;
> -
> -     if (vdev->barmap[bar])
> -             return 0;
> -
> -     ret = pci_request_selected_regions(pdev, 1 << bar, "vfio");
> -     if (ret)
> -             return ret;
> -
> -     io = pci_iomap(pdev, bar, 0);
> -     if (!io) {
> -             pci_release_selected_regions(pdev, 1 << bar);
> -             return -ENOMEM;
> -     }
> +     /*
> +      * The barmap is set up in vfio_pci_core_enable().  Callers
> +      * use this function to check that the BAR resources are
> +      * requested or that the pci_iomap() was done.
> +      */

Looks like a function level comment to be placed above the function
definition.  TBH, the comment in the previous function could also be
pulled up as a function level comment.

> +     if (bar < 0 || bar >= PCI_STD_NUM_BARS)

Maybe `if ((unsigned)bar >= PCI_STD_NUM_BARS)` but really author
preference here.
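
For reference, a small userspace sketch of why the single unsigned
comparison covers both halves of the test.  The helper names are made
up for illustration, and PCI_STD_NUM_BARS is redefined locally with
the kernel's value of 6:

```c
#include <stdbool.h>

#define PCI_STD_NUM_BARS 6	/* local copy of the kernel's value */

/* Two-sided test, as written in the patch. */
static bool bar_in_range_signed(int bar)
{
	return !(bar < 0 || bar >= PCI_STD_NUM_BARS);
}

/*
 * Single comparison: converting a negative int to unsigned wraps it to
 * a value far above PCI_STD_NUM_BARS, so one test rejects both negative
 * and too-large indexes.
 */
static bool bar_in_range_unsigned(int bar)
{
	return !((unsigned int)bar >= PCI_STD_NUM_BARS);
}
```

Both helpers accept 0 through 5 and reject everything else, so the
choice is purely stylistic.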

> +             return -EINVAL;
>  
> -     vdev->barmap[bar] = io;
> +     /* Did vfio_pci_core_map_bars() set it up yet? */
> +     if (!vdev->barmap[bar])
> +             return -ENODEV;

What hits this?  Should it be a WARN_ON_ONCE?  It would need to be a use
case that accesses barmap outside of the window between enable and
disable, where I think we're defining the contract that it's only valid
between those events.  Both this and the range check could move to the
iomap implementation to keep the Fixes: patch reasonably small since
afaik they're not triggered.  The BAR range test could be WARN_ON_ONCE
as well, only driver bugs should hit it.  Thanks,

Alex

>  
> +     if (IS_ERR(vdev->barmap[bar]))
> +             return PTR_ERR(vdev->barmap[bar]);
>       return 0;
>  }
>  EXPORT_SYMBOL_GPL(vfio_pci_core_setup_barmap);

