EEH regression: PE <-> device binding lost after reset

Daniel Axtens Sun, 09 Aug 2015 16:26:48 -0700

Hi,

I'm experiencing a regression in EEH that was introduced somewhere
between 4.0 and 4.1.


I have been reproducing this with a CAPI (CXL) card, but the behaviour
isn't CAPI related and the triggering code hasn't changed. CAPI cards
are reprogrammed by PERSTing the slot they sit in, so CAPI exposes a
'reset' file in sysfs that does "pci_set_pcie_reset_state(dev,
pcie_warm_reset)", and then relies on EEH noticing to properly reset the
card.
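The repeated-reset reproduction can be scripted. The following is a minimal sketch, not the procedure actually used for this report: the sysfs path /sys/class/cxl/card0/reset, the value written ("1"), and the settle delay are all assumptions, and the helper names are made up for illustration. The report also does not say whether the write itself returns an error when the reset fails, so the dmesg output still needs to be checked.

```python
import time
from pathlib import Path

# ASSUMPTION: location of the cxl 'reset' file; adjust for the card under test.
RESET_FILE = Path("/sys/class/cxl/card0/reset")


def trigger_reset(reset_file=RESET_FILE):
    """Write '1' to the sysfs reset file, asking the driver to PERST the slot."""
    Path(reset_file).write_text("1\n")


def reset_repeatedly(count, settle_seconds=30.0, reset_file=RESET_FILE):
    """Trigger `count` resets in a row.

    Sleeps `settle_seconds` between attempts so that EEH recovery has time
    to complete (the right delay is a guess and is hardware dependent).
    Returns a list of (attempt_number, error_or_None) tuples.
    """
    results = []
    for attempt in range(1, count + 1):
        try:
            trigger_reset(reset_file)
            results.append((attempt, None))
        except OSError as exc:
            # The write was rejected, e.g. the file is missing or the
            # driver returned an error code.
            results.append((attempt, exc))
        if attempt < count:
            time.sleep(settle_seconds)
    return results
```

Calling reset_repeatedly(2) as root and then reading dmesg should be enough to show whether the second reset hits the problem described here.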

In 4.0 and earlier, this worked: the slot would be PERSTed, EEH would
notice and hotplug the card. You could do this as many times as you liked.

In 4.1 and later, you can do one successful reset, but any subsequent
reset fails and causes the following to be printed in dmesg:

[  225.118656] cxl-pci 0006:01:00.0: CXL reset
[  225.118663] pcibios_set_pcie_reset_state: No PE found on PCI device 0006:01:00.0
[  225.118672] cxl-pci 0006:01:00.0: cxl: pcie_warm_reset failed
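When bisecting, it may help to check for this failure automatically. Below is a small sketch that scans dmesg text for the "No PE found" line and pulls out the affected PCI address. The function and constant names are hypothetical, and the pattern is based only on the message format shown above.

```python
import re

# Matches the failure line quoted above. The optional dmesg timestamp is
# simply not anchored, and \s+ tolerates the address being wrapped onto
# the next line.
NO_PE_RE = re.compile(
    r"pcibios_set_pcie_reset_state: No PE found on PCI device\s+"
    r"(?P<bdf>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7])"
)


def devices_without_pe(log_text):
    """Return the PCI addresses reported as having no PE, in log order."""
    return [match.group("bdf") for match in NO_PE_RE.finditer(log_text)]


SAMPLE_LOG = """\
[  225.118656] cxl-pci 0006:01:00.0: CXL reset
[  225.118663] pcibios_set_pcie_reset_state: No PE found on PCI device 0006:01:00.0
[  225.118672] cxl-pci 0006:01:00.0: cxl: pcie_warm_reset failed
"""

if __name__ == "__main__":
    print(devices_without_pe(SAMPLE_LOG))
```

A bisect step could feed the output of dmesg into devices_without_pe() after the second reset and mark the commit bad if the list is non-empty.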

I'm digging through the commits between 4.0 and 4.1 at the moment, but I
thought I'd post it here in the hope that someone has an idea what the
root cause is.


-- 
Regards,
Daniel


_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
