Hi all,
There is (still) an issue with how Linux enables PCIe PTM. Linux enables
PTM automatically if certain capabilities are set. However, it turns out
that check is not sufficient: when we enumerate a PCIe Switch Upstream
Port we enable PTM on it right away, but the Downstream Ports are not yet
enumerated. This triggers floods of AER errors like this:
pcieport 0000:00:07.1: AER: Multiple Uncorrectable (Non-Fatal) error message received from 0000:00:07.1
pcieport 0000:00:07.1: PCIe Bus Error: severity=Uncorrectable (Non-Fatal), type=Transaction Layer, (Receiver ID)
pcieport 0000:00:07.1: device [8086:d44f] error status/mask=00200000/00000000
pcieport 0000:00:07.1: [21] ACSViol (First)
pcieport 0000:00:07.1: AER: TLP Header: 0x34000000 0x00000052 0x00000000 0x00000000
pcieport 0000:00:07.1: AER: device recovery successful
pcieport 0000:00:07.1: AER: Uncorrectable (Non-Fatal) error message received from 0000:00:07.1
We have ACS Source Validation enabled, so Requester ID 0, which is sent by
the not-yet-enumerated Downstream Port, triggers the ACS violation AER.
This can be prevented by enabling PTM only once the whole topology has
been enumerated. Doing it that way seems reasonable anyway: only a couple
of drivers enable PTM today, so it does not make sense to enable it
otherwise, as it consumes bandwidth.
I did that fix and the problem went away, but I wanted to test with a
device and driver that actually enables PTM. I have a couple of igc NICs
here that have this support. However, when testing I noticed that during
power state transitions we still get errors like this from igc:
igc 0000:03:00.0 enp3s0: Timeout reading IGC_PTM_STAT register
and after this PTM for the device stays disabled.
This series includes fixes for igc that deal with the issues I found. Now
PTM gets successfully enabled and works across suspend and runtime suspend
of igc, and there is no flood of AER errors as above. While there, there
is also one cleanup patch in the middle that drops an unused parameter.
Mika Westerberg (5):
igc: Call netif_queue_set_napi() with rtnl locked
igc: Let the PCI core deal with the PM resume flow
igc: Don't reset the hardware on suspend path
PCI/PTM: Drop granularity parameter from pci_enable_ptm()
PCI/PTM: Do not enable PTM automatically for Root and Switch Upstream Ports
drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
drivers/net/ethernet/intel/idpf/idpf_main.c | 2 +-
drivers/net/ethernet/intel/igc/igc.h | 2 +-
drivers/net/ethernet/intel/igc/igc_ethtool.c | 6 +-
drivers/net/ethernet/intel/igc/igc_main.c | 33 ++++----
.../net/ethernet/mellanox/mlx5/core/main.c | 2 +-
drivers/pci/pcie/ptm.c | 77 ++++++++++---------
include/linux/pci.h | 6 +-
8 files changed, 64 insertions(+), 66 deletions(-)
--
2.50.1