Hi, I'm experiencing a regression in EEH that was introduced somewhere between 4.0 and 4.1.
I have been reproducing this with a CAPI (CXL) card, but the behaviour isn't CAPI-related and the triggering code hasn't changed.

CAPI cards are reprogrammed by PERSTing the slot they sit in, so CAPI exposes a 'reset' file in sysfs that does "pci_set_pcie_reset_state(dev, pcie_warm_reset)", and then relies on EEH noticing to properly reset the card.

In 4.0 and earlier, this worked: the slot would be PERSTed, EEH would notice and hotplug the card. You could do this as many times as you liked.

In 4.1 and later, you can do one successful reset, but any subsequent reset causes the following to be printed in dmesg:

[ 225.118656] cxl-pci 0006:01:00.0: CXL reset
[ 225.118663] pcibios_set_pcie_reset_state: No PE found on PCI device 0006:01:00.0
[ 225.118672] cxl-pci 0006:01:00.0: cxl: pcie_warm_reset failed

I'm digging through the commits between 4.0 and 4.1 at the moment, but I thought I'd post it here in the hope that someone has an idea what the root cause is.

-- 
Regards,
Daniel
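For anyone wanting to reproduce the repeated-reset sequence described above, it could be driven from userspace along these lines. This is only a minimal sketch under stated assumptions, not the code used in the original report: the helper name `repeat_reset`, the delay parameter, and the idea that writing "1" to the reset file triggers the reset are all assumptions, and the actual sysfs path of the 'reset' file must be supplied by the caller.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "1" to the sysfs reset file at `path`
 * `count` times, sleeping `delay_secs` between attempts so that EEH
 * recovery and hotplug have a chance to finish (the delay needed is
 * an assumption and will depend on the system).
 *
 * Returns 0 if every write succeeded, otherwise the 1-based number
 * of the first attempt that failed.
 */
int repeat_reset(const char *path, int count, unsigned int delay_secs)
{
    for (int attempt = 1; attempt <= count; attempt++) {
        /* Reopen each time: the device may have been hotplugged. */
        int fd = open(path, O_WRONLY);
        if (fd < 0) {
            fprintf(stderr, "attempt %d: open %s: %s\n",
                    attempt, path, strerror(errno));
            return attempt;
        }

        ssize_t written = write(fd, "1", 1);
        int saved_errno = errno;
        close(fd);

        if (written != 1) {
            fprintf(stderr, "attempt %d: write %s: %s\n",
                    attempt, path,
                    written < 0 ? strerror(saved_errno) : "short write");
            return attempt;
        }

        if (attempt < count && delay_secs > 0)
            sleep(delay_secs);
    }
    return 0;
}
```

If the failing reset is reported back to the writer as an error, a call with a count of 2 would presumably return 2 on an affected kernel and 0 on 4.0; the dmesg output remains the authoritative indicator either way.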
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev