[bug report][PPC]: rmod nvme driver causes the kernel panic

Nilay Shroff Fri, 17 Oct 2025 22:48:00 -0700

Hi Nam,

On the latest upstream mainline kernel, I am encountering a kernel
crash when attempting to unload the NVMe driver module (rmmod nvme)
on a POWER9 system. The crash appears to be triggered by the recent
work on using MSI parent domains, discussed here: 
https://lore.kernel.org/all/[email protected]/


System details:
===============
Architecture: PowerPC (POWER9, IBM 9008-22L)
Kernel: 6.18.0-rc1 (mainline, unmodified)
Platform: pSeries / PHYP
Reproducibility: Always, when running rmmod nvme

Crash trace:
============
Kernel attempted to read user page (8) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000008
Faulting instruction address: 0xc000000000b30638
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash  SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp bonding tls nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet n
CPU: 14 UID: 0 PID: 1973 Comm: rmmod Not tainted 6.18.0-rc1 #63 VOLUNTARY 
Hardware name: IBM,9008-22L POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW950.80 (VL950_131) hv:phyp pSeries
NIP:  c000000000b30638 LR: c000000000111d90 CTR: c000000000111d6c
REGS: c00000011f1076e0 TRAP: 0300   Not tainted  (6.18.0-rc1)
MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 48008228  XER: 200400cf
CFAR: c00000000000d9cc DAR: 0000000000000008 DSISR: 40000000 IRQMASK: 0 
GPR00: c000000000111d90 c00000011f107980 c000000001da8100 0000000000000000 
GPR04: c0000000bcf535e8 0000000000000000 73efa01ced0dd290 00000000b0734e18 
GPR08: 0000000ffb4c0000 c0000000bcf53540 0000000000000000 0000000048008222 
GPR12: c000000000111d6c c000000017ff1c80 0000000000000000 0000000000000000 
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR24: 0000000000000000 0000000000000001 c0000000b70bd980 c0000000995890c8 
GPR28: 0000000000000000 c0000000bcf53590 c00000000e79c800 c0000000995890c8 
NIP [c000000000b30638] msi_desc_to_pci_dev+0x8/0x14
LR [c000000000111d90] pseries_msi_ops_teardown+0x24/0x38
Call Trace:
[c00000011f107980] [c0000000995890c8] 0xc0000000995890c8 (unreliable)
[c00000011f1079a0] [c000000000276118] msi_remove_device_irq_domain+0x9c/0x18c
[c00000011f1079e0] [c00000000027623c] msi_device_data_release+0x34/0xa8
[c00000011f107a10] [c000000000c657b8] release_nodes+0xac/0x1f0
[c00000011f107ab0] [c000000000c675e8] devres_release_all+0xc0/0x138
[c00000011f107b20] [c000000000c5bb8c] device_unbind_cleanup+0x2c/0xb0
[c00000011f107b50] [c000000000c5dfc8] device_release_driver_internal+0x2fc/0x34c
[c00000011f107ba0] [c000000000c5e0c4] driver_detach+0x74/0xe0
[c00000011f107bd0] [c000000000c5b3e0] bus_remove_driver+0x94/0x140
[c00000011f107c50] [c000000000c5f1c8] driver_unregister+0x48/0x88
[c00000011f107cc0] [c000000000b228ec] pci_unregister_driver+0x40/0x128
[c00000011f107d10] [c008000004b6834c] nvme_exit+0x20/0x7cd4 [nvme]
[c00000011f107d30] [c0000000002becb8] __do_sys_delete_module.constprop.0+0x1ac/0x3ec
[c00000011f107e10] [c000000000032324] system_call_exception+0x134/0x360
[c00000011f107e50] [c00000000000cedc] system_call_vectored_common+0x15c/0x2ec
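
One detail in the trace is worth spelling out. The fault address is 0x8, and
GPR03 (the first argument register in the PowerPC ELF ABI) is zero at the
faulting instruction in msi_desc_to_pci_dev. That pattern is what a load of a
member at offset 8 through a NULL pointer would produce. A minimal sketch,
assuming a descriptor layout that starts with two unsigned int fields followed
by a device pointer; the structs and the helper name below are simplified,
hypothetical stand-ins, not the kernel's definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct device. */
struct model_device {
    int placeholder;
};

/*
 * Hypothetical, simplified descriptor. The field order is an assumption
 * chosen to mirror the start of the kernel's MSI descriptor.
 */
struct model_msi_desc {
    unsigned int irq;          /* offset 0 */
    unsigned int nvec_used;    /* offset 4 when unsigned int is 4 bytes */
    struct model_device *dev;  /* offset 8 on common 64-bit ABIs */
};

/*
 * Address that a load of desc->dev would touch if desc were NULL:
 * the base address 0 plus the member offset.
 */
size_t null_desc_fault_address(void)
{
    return offsetof(struct model_msi_desc, dev);
}
```

Under these assumptions null_desc_fault_address() returns 8, which matches
the DAR value in the register dump above.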

Analysis:
=========
From tracing the cleanup path, it appears that the crash happens because the
MSI descriptor is freed before the MSI teardown is invoked. Specifically,
during NVMe module unload (rmmod nvme), the call sequence is as follows:

cleanup_module
 -> pci_unregister_driver
   -> driver_unregister
     -> bus_remove_driver
       -> driver_detach
         -> device_release_driver_internal
           -> device_remove
             -> pci_device_remove
              -> nvme_remove
                -> nvme_dev_disable
                  -> pci_free_irq_vectors
                    -> pci_disable_msix
                      -> pci_free_msi_irqs
                        -> pci_msi_teardown_msi_irqs  ==> here we free msi_desc


Later, as the call stack continues to unwind:
-> device_release_driver_internal
  -> device_unbind_cleanup
    -> devres_release_all
      -> release_nodes
        -> msi_device_data_release
          -> msi_remove_device_irq_domain
            -> pseries_msi_ops_teardown  ==> here the freed msi_desc is dereferenced, leading to the crash
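
The two phases above can be modelled in user space to make the ordering
problem concrete. This is only a sketch: every name below is hypothetical,
and the NULL check in the teardown model exists so that the example can
report the problem instead of crashing. It is not a proposed kernel fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the PCI device and its MSI descriptor. */
struct model_pdev {
    int id;
};

struct model_desc {
    unsigned int irq;
    struct model_pdev *pdev;
};

struct model_dev {
    struct model_pdev pdev;
    struct model_desc *desc;   /* NULL once the descriptor is freed */
};

/* Phase 1: models the driver remove path that frees the descriptor. */
void model_free_irqs(struct model_dev *d)
{
    free(d->desc);
    d->desc = NULL;
}

/*
 * Phase 2: models the devres-time domain teardown callback.
 * Returns false where the callback would have had to dereference a
 * descriptor that no longer exists (the crash site in the trace).
 */
bool model_domain_teardown(const struct model_dev *d)
{
    if (d->desc == NULL)
        return false;
    (void)d->desc->pdev->id;   /* the descriptor is still valid here */
    return true;
}

/*
 * Runs both phases in the requested order and reports whether the
 * teardown callback saw a valid descriptor.
 */
bool model_unbind(bool free_desc_first)
{
    struct model_dev d = { .pdev = { .id = 1 }, .desc = NULL };
    bool teardown_ok;

    d.desc = calloc(1, sizeof(*d.desc));
    if (d.desc == NULL)
        return false;          /* allocation failure, nothing to model */
    d.desc->pdev = &d.pdev;

    if (free_desc_first) {
        model_free_irqs(&d);   /* ordering seen in this report */
        teardown_ok = model_domain_teardown(&d);
    } else {
        teardown_ok = model_domain_teardown(&d);
        model_free_irqs(&d);   /* descriptor outlives the teardown */
    }
    return teardown_ok;
}
```

In this model, model_unbind(true) follows the ordering from the trace and
returns false, while model_unbind(false) keeps the descriptor alive until
after the teardown and returns true.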

Possible Cause:
===============
This looks like a cleanup ordering issue introduced by the recent MSI parent
domain rework. The PCI/MSI teardown seems to assume that the MSI descriptor
remains valid until after the domain teardown path executes, which no longer
appears to hold true in this sequence.

Expected behavior:
==================
The rmmod nvme operation should cleanly unload the module without triggering a
crash or accessing freed MSI descriptors.

Additional notes:
=================
- The crash reproduces consistently on PowerPC (pseries, PHYP).
- It did not occur before the MSI parent domain series was merged.
- Likely to affect other MSI-capable PCI drivers.

Let me know if you need any further details. Also, once a fix is available,
I'd be glad to help validate it on PPC.

Thanks,
--Nilay
