On Mon, Sep 29, 2025 at 02:15:47AM -0700, Breno Leitao wrote:
> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
> when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
> calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
> does not rate limit, given this is fatal.
>
> This prevents a kernel crash triggered by dereferencing a NULL pointer
> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
> AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
> which already performs this NULL check.
>
> Cc: [email protected]
> Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error
> logging")
> Signed-off-by: Breno Leitao <[email protected]>

Thanks, Breno, I applied this to pci/aer for v6.18.  I added a little
more detail to the commit log because the path where we hit this is a
bit obscure.  Please take a look and see if it makes sense:

  https://git.kernel.org/cgit/linux/kernel/git/pci/pci.git/commit/?id=451f30b97807

> ---
> - This problem is still happening in upstream, and unfortunately no action
>   was done in the previous discussion.
> - Link to previous post:
>   https://lore.kernel.org/r/[email protected]
> ---
>  drivers/pci/pcie/aer.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index e286c197d7167..55abc5e17b8b1 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev
> *pdev,
>
>  static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
>  {
> +	if (!dev->aer_info)
> +		return 1;
> +
>  	switch (severity) {
>  	case AER_NONFATAL:
>  		return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
>
> ---
> base-commit: e5f0a698b34ed76002dc5cff3804a61c80233a7a
> change-id: 20250801-aer_crash_2-b21cc2ef0d00
>
> Best regards,
> --
> Breno Leitao <[email protected]>
>
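
For anyone following along, here is a rough user-space sketch of the
guard the patch adds. It is only a model: the model_* types, the SEV_*
constants and the simple budget counter are hypothetical stand-ins made
up for illustration, not the kernel's struct pci_dev, AER_* severities
or __ratelimit(). The point is just that a missing aer_info now returns
1 ("go ahead and print") instead of being dereferenced.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
enum model_severity { SEV_CORRECTABLE, SEV_NONFATAL, SEV_FATAL };

/* Toy rate limiter: allows 'budget' more messages, then suppresses. */
struct model_ratelimit {
	int budget;
};

struct model_aer_info {
	struct model_ratelimit correctable;
	struct model_ratelimit nonfatal;
};

struct model_dev {
	struct model_aer_info *aer_info;	/* may be NULL */
};

/* Returns 1 if the message may be printed, 0 if it is suppressed. */
static int model_ratelimit_ok(struct model_ratelimit *rl)
{
	if (rl->budget <= 0)
		return 0;
	rl->budget--;
	return 1;
}

static int model_aer_ratelimit(struct model_dev *dev,
			       enum model_severity severity)
{
	/* The guard from the patch: no AER info means no rate limiting. */
	if (!dev->aer_info)
		return 1;

	switch (severity) {
	case SEV_NONFATAL:
		return model_ratelimit_ok(&dev->aer_info->nonfatal);
	case SEV_CORRECTABLE:
		return model_ratelimit_ok(&dev->aer_info->correctable);
	default:
		return 1;	/* this model never limits fatal errors */
	}
}
```

Without the guard, calling model_aer_ratelimit() on a device whose
aer_info is NULL would dereference NULL in the SEV_NONFATAL and
SEV_CORRECTABLE cases, which corresponds to the crash described in the
commit log.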
