On Tue, 23 Sep 2025 12:04:33 -0500 (CDT) Timothy Pearson <[email protected]> wrote:
> PCI devices prior to PCI 2.3 both use level interrupts and do not support
> interrupt masking, leading to a failure when passed through to a KVM guest
> on at least the ppc64 platform.  This failure manifests as receiving and
> acknowledging a single interrupt in the guest, while the device continues
> to assert the level interrupt indicating a need for further servicing.
>
> When lazy IRQ masking is used on DisINTx- (non-PCI 2.3) hardware, the
> following sequence occurs:
>
>  * Level IRQ assertion on device
>  * IRQ marked disabled in kernel
>  * Host interrupt handler exits without clearing the interrupt on the device
>  * Eventfd is delivered to userspace
>  * Guest processes IRQ and clears device interrupt
>  * Device de-asserts INTx, then re-asserts INTx while the interrupt is masked
>  * Newly asserted interrupt acknowledged by kernel VMM without being handled
>  * Software mask removed by VFIO driver
>  * Device INTx still asserted, host controller does not see new edge after EOI
>
> The behavior is now platform-dependent.  Some platforms (amd64) will
> continue to spew IRQs for as long as the INTx line remains asserted,
> therefore the IRQ will be handled by the host as soon as the mask is
> dropped.  Others (ppc64) will only send the one request, and if it is not
> handled no further interrupts will be sent.  The former behavior
> theoretically leaves the system vulnerable to interrupt storm, and the
> latter will result in the device stalling after receiving exactly one
> interrupt in the guest.
>
> Work around this by disabling lazy IRQ masking for DisINTx- INTx devices.
>
> Signed-off-by: Timothy Pearson <[email protected]>
> ---
>  drivers/vfio/pci/vfio_pci_intrs.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> index 123298a4dc8f..61d29f6b3730 100644
> --- a/drivers/vfio/pci/vfio_pci_intrs.c
> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> @@ -304,9 +304,14 @@ static int vfio_intx_enable(struct vfio_pci_core_device *vdev,
>  
>  	vdev->irq_type = VFIO_PCI_INTX_IRQ_INDEX;
>  
> +	if (!vdev->pci_2_3)
> +		irq_set_status_flags(pdev->irq, IRQ_DISABLE_UNLAZY);
> +
>  	ret = request_irq(pdev->irq, vfio_intx_handler,
>  			  irqflags, ctx->name, ctx);
>  	if (ret) {
> +		if (!vdev->pci_2_3)
> +			irq_clear_status_flags(pdev->irq, IRQ_DISABLE_UNLAZY);
>  		vdev->irq_type = VFIO_PCI_NUM_IRQS;
>  		kfree(name);
>  		vfio_irq_ctx_free(vdev, ctx, 0);
> @@ -352,6 +357,8 @@ static void vfio_intx_disable(struct vfio_pci_core_device *vdev)
>  		vfio_virqfd_disable(&ctx->unmask);
>  		vfio_virqfd_disable(&ctx->mask);
>  		free_irq(pdev->irq, ctx);
> +		if (!vdev->pci_2_3)
> +			irq_clear_status_flags(pdev->irq, IRQ_DISABLE_UNLAZY);
>  		if (ctx->trigger)
>  			eventfd_ctx_put(ctx->trigger);
>  		kfree(ctx->name);

As expected, I don't note any functional issues with this on x86.  I didn't
do a full statistical analysis, but I suspect this might slightly reduce
the mean interrupt rate (netperf TCP_RR) and increase the standard
deviation, though nothing sufficiently worrisome for a niche use case like
this.

Applied to vfio next branch for v6.18.  Thanks,

Alex
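For readers following the mechanism the patch relies on: IRQ_DISABLE_UNLAZY
tells the genirq core to mask the line at the interrupt controller
immediately when disable_irq_nosync() is called, instead of the default
lazy behavior of only marking the line disabled and masking it when the
next (unhandled) interrupt arrives.  Below is a minimal sketch of the same
set-before-request / clear-after-free pattern the patch applies; the
example_dev structure, example_handler, and can_mask_at_source flag are
hypothetical names invented for illustration and are not part of vfio.

#include <linux/interrupt.h>
#include <linux/irq.h>

/* Hypothetical device state, for illustration only. */
struct example_dev {
	unsigned int irq;
	bool can_mask_at_source;	/* e.g. PCI 2.3 DisINTx support */
};

static irqreturn_t example_handler(int irq, void *data)
{
	/*
	 * A real driver would quiesce or mask the device here; if it
	 * cannot, it may call disable_irq_nosync() and defer servicing,
	 * which is exactly where the unlazy disable matters.
	 */
	return IRQ_HANDLED;
}

static int example_request_intx(struct example_dev *edev)
{
	int ret;

	/*
	 * Force an immediate hardware mask on disable_irq_nosync()
	 * rather than the default lazy disable, because the device
	 * cannot stop asserting the level interrupt on its own.
	 */
	if (!edev->can_mask_at_source)
		irq_set_status_flags(edev->irq, IRQ_DISABLE_UNLAZY);

	/* Exclusive (non-shared) line, since the device cannot mask INTx. */
	ret = request_irq(edev->irq, example_handler, 0,
			  "example-intx", edev);
	if (ret && !edev->can_mask_at_source)
		irq_clear_status_flags(edev->irq, IRQ_DISABLE_UNLAZY);

	return ret;
}

static void example_free_intx(struct example_dev *edev)
{
	free_irq(edev->irq, edev);
	if (!edev->can_mask_at_source)
		irq_clear_status_flags(edev->irq, IRQ_DISABLE_UNLAZY);
}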
