Hi Andy

On Mon, 2026/02/23 at 23:50, Andy Shevchenko wrote:
On Mon, Feb 23, 2026 at 5:32 PM Shawn Lin <[email protected]> wrote:

This patch series addresses a long-standing design issue in the PCI/MSI
subsystem where the implicit, automatic management of IRQ vectors by
the devres framework conflicts with explicit driver cleanup, creating
ambiguity and potential resource management bugs.

==== The Problem: Implicit vs. Explicit Management ====
Historically, `pcim_enable_device()` not only manages standard PCI resources
(BARs) via devres but also implicitly triggers automatic IRQ vector management
by setting a flag that registers `pcim_msi_release()` as a cleanup action.

This creates an ambiguous ownership model. Many drivers follow a pattern of:
1. Calling `pci_alloc_irq_vectors()` to allocate interrupts.
2. Also calling `pci_free_irq_vectors()` in their error paths or remove routines.

When such a driver also uses `pcim_enable_device()`, the devres framework may
attempt to free the IRQ vectors a second time upon device release, leading to
a double-free. Analysis of the tree shows this hazardous pattern exists widely,
while 35 other drivers correctly rely solely on the implicit cleanup.
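
To make the hazard concrete, here is a minimal userspace sketch of the hybrid
pattern described above. It is a toy model, not kernel code: every `toy_*`
name is hypothetical, and the "legacy" release rule is modelled only from the
description in this cover letter (cleanup armed by the enable call alone). It
counts free attempts rather than freeing anything.

```c
/* Userspace toy model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct toy_pci_dev {
	bool devres_enabled;	/* set by the pcim_enable_device() stand-in */
	int free_calls;		/* number of attempts to free the IRQ vectors */
};

/* Stand-in for pcim_enable_device(): in the legacy model this alone
 * arms the implicit IRQ vector cleanup. */
static void toy_pcim_enable_device(struct toy_pci_dev *dev)
{
	dev->devres_enabled = true;
}

/* Stand-in for pci_alloc_irq_vectors(): nothing to track in this model. */
static void toy_pci_alloc_irq_vectors(struct toy_pci_dev *dev)
{
	(void)dev;
}

/* Stand-in for pci_free_irq_vectors(): only records the attempt. */
static void toy_pci_free_irq_vectors(struct toy_pci_dev *dev)
{
	dev->free_calls++;
}

/* Stand-in for the legacy devres release action: it runs the free
 * whenever the device was enabled through the managed call. */
static void toy_legacy_devres_release(struct toy_pci_dev *dev)
{
	if (dev->devres_enabled)
		toy_pci_free_irq_vectors(dev);
}

/* Hybrid driver: managed enable plus explicit alloc and explicit free.
 * Returns how many times a free was attempted over the device lifetime. */
static int toy_hybrid_driver_lifecycle(void)
{
	struct toy_pci_dev dev = { 0 };

	toy_pcim_enable_device(&dev);
	toy_pci_alloc_irq_vectors(&dev);
	toy_pci_free_irq_vectors(&dev);		/* driver's remove() path */
	toy_legacy_devres_release(&dev);	/* devres on device release */

	return dev.free_calls;
}
```

In this model toy_hybrid_driver_lifecycle() reports two free attempts for a
single allocation, which is the ambiguity in ownership described above.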

Is this confirmed? What I read from the cover letter, this series was
only compile-tested, so how can you prove the problem exists in the
first place?

Yes, it's confirmed. I ran into it while debugging a double free in an
out-of-tree PCIe Wi-Fi driver that uses
pcim_enable_device() + pci_alloc_irq_vectors() + pci_free_irq_vectors().
We also had a TODO to clean up this hybrid usage, targeted for this
cycle [1], as suggested by Philipp:

[1] https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/log/?h=msi


==== The Solution: Making Management Explicit ====
This series enforces a clear, predictable model:
1.  New Managed API (Patch 1/37): Introduces pcim_alloc_irq_vectors() and
     pcim_alloc_irq_vectors_affinity(). Drivers that desire devres-managed IRQ
     vectors should use these functions, which set the is_msi_managed flag and
     ensure automatic cleanup.
2.  Patches 2 through 36 convert each driver that uses pcim_enable_device()
     alongside pci_alloc_irq_vectors() and relies on devres for IRQ vector
     cleanup to instead make an explicit call to pcim_alloc_irq_vectors().
3.  Core Change (Patch 37/37): With those conversions in place, modifies
     pcim_setup_msi_release() to check only the is_msi_managed flag. This
     decouples automatic IRQ cleanup from pcim_enable_device(). IRQ vectors
     allocated via pci_alloc_irq_vectors*() are now solely the driver's
     responsibility to free with pci_free_irq_vectors().

With these changes, we get a clear ownership model: explicit resource
management eliminates ambiguity and follows the "principle of least surprise."
New drivers should choose one model and be consistent:
- Use `pci_alloc_irq_vectors()` + `pci_free_irq_vectors()` for explicit control.
- Use `pcim_alloc_irq_vectors()` for devres-managed, automatic cleanup.
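
The two models can be sketched with a small userspace toy, assuming the
release action checks only the is_msi_managed flag as described for patch
37/37. This is not kernel code: the `toy_*` names are hypothetical stand-ins,
and the model only counts free attempts.

```c
/* Userspace toy model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct toy_pci_dev {
	bool is_msi_managed;	/* mirrors the flag named in this cover letter */
	int free_calls;		/* number of attempts to free the IRQ vectors */
};

/* Stand-in for pci_alloc_irq_vectors(): leaves the managed flag clear. */
static void toy_pci_alloc_irq_vectors(struct toy_pci_dev *dev)
{
	(void)dev;
}

/* Stand-in for pcim_alloc_irq_vectors(): opts in to devres cleanup. */
static void toy_pcim_alloc_irq_vectors(struct toy_pci_dev *dev)
{
	dev->is_msi_managed = true;
}

/* Stand-in for pci_free_irq_vectors(): only records the attempt. */
static void toy_pci_free_irq_vectors(struct toy_pci_dev *dev)
{
	dev->free_calls++;
}

/* Stand-in for the release action after the core change:
 * it acts only when the managed flag is set. */
static void toy_devres_release(struct toy_pci_dev *dev)
{
	if (dev->is_msi_managed)
		toy_pci_free_irq_vectors(dev);
}

/* Model 1: explicit control. The driver frees, devres stays out of it. */
static int toy_explicit_driver_lifecycle(void)
{
	struct toy_pci_dev dev = { 0 };

	toy_pci_alloc_irq_vectors(&dev);
	toy_pci_free_irq_vectors(&dev);	/* driver's remove() path */
	toy_devres_release(&dev);	/* no-op: flag is clear */

	return dev.free_calls;
}

/* Model 2: devres-managed. The driver never frees by hand. */
static int toy_managed_driver_lifecycle(void)
{
	struct toy_pci_dev dev = { 0 };

	toy_pcim_alloc_irq_vectors(&dev);
	toy_devres_release(&dev);	/* frees on device release */

	return dev.free_calls;
}
```

In both lifecycles the model records exactly one free attempt per allocation,
which is the property the series is aiming for.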

Have you checked previous attempts? Why is your series better than those?

There do not seem to be any previous attempts.


==== Testing And Review ====
1. This series has only been compile-tested with allmodconfig.
2. Given the substantial size of this patch series, I have structured the
    mailing to facilitate efficient review. The cover letter, the first patch
    and the last one will be sent to all relevant mailing lists and key
    maintainers to ensure broad visibility and initial feedback on the overall
    approach. The remaining subsystem-specific patches will be sent only to
    the respective subsystem maintainers and their associated mailing lists,
    reducing noise.
