18.01.2024 21:51, Matthew Rosato wrote:
Commit ef1535901a0 (re-)introduced an issue where passthrough ISM devices
on s390x would enter an error state after reboot. This was previously fixed
by 03451953c79e using device reset callbacks; however, the change in
ef1535901a0 effectively triggers a cold reset of the PCI bus before the
device reset callbacks are triggered.
To resolve this, this series proposes to remove the use of the reset callback
for ISM cleanup and instead trigger ISM reset from subsystem_reset before
triggering bus resets. This has to happen before the bus resets because the
reset of s390-pcihost will trigger reset of the PCI bus followed by the
s390-pci bus, and the former will trigger vfio-pci reset / the aperture-wide
unmap that ISM gets upset about.
  /s390-pcihost (s390-pcihost)
    /pci.0 (PCI)
    /s390-pcibus.0 (s390-pcibus)
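The ordering described above can be sketched as a toy model. This is only
an illustration of the sequencing, not the QEMU code: every name here
(toy_subsystem_reset, toy_ism_reset, the event log, and so on) is
hypothetical and does not correspond to a real QEMU function.

```c
#include <stddef.h>

/* Hypothetical event identifiers, used only to record the reset order. */
enum toy_event {
    TOY_ISM_RESET,          /* ISM-specific cleanup */
    TOY_PCI_BUS_RESET,      /* pci.0: would lead to vfio-pci reset / unmap */
    TOY_S390_PCIBUS_RESET   /* s390-pcibus.0 */
};

#define TOY_MAX_EVENTS 8
static enum toy_event toy_log[TOY_MAX_EVENTS];
static size_t toy_log_len;

/* Append one event to the log, silently dropping it if the log is full. */
static void toy_record(enum toy_event ev)
{
    if (toy_log_len < TOY_MAX_EVENTS) {
        toy_log[toy_log_len++] = ev;
    }
}

static void toy_ism_reset(void)         { toy_record(TOY_ISM_RESET); }
static void toy_pci_bus_reset(void)     { toy_record(TOY_PCI_BUS_RESET); }
static void toy_s390_pcibus_reset(void) { toy_record(TOY_S390_PCIBUS_RESET); }

/* Resetting the host bridge resets its child buses: pci.0 first,
 * then s390-pcibus.0, matching the tree above. */
static void toy_pcihost_reset(void)
{
    toy_pci_bus_reset();
    toy_s390_pcibus_reset();
}

/* Assumed shape of the fix: ISM is reset explicitly before any bus
 * reset has a chance to run. */
static void toy_subsystem_reset(void)
{
    toy_ism_reset();
    toy_pcihost_reset();
}
```

In this sketch, calling toy_subsystem_reset() once records the ISM reset
ahead of both bus resets, which is the property the series relies on.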
While fixing this, it was also noted that kernel warnings could be seen that
indicate a guest ISC reference count error. That's because in some reset
cases we were not bothering to disable AIF, but would still re-enable it after
the reset (causing the reference count to grow erroneously). This was a
pre-existing issue that went unnoticed because the kernel previously did not
detect and issue a warning for this scenario.
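A minimal sketch of the reference count problem and one way to guard
against it, assuming a per-device flag that tracks whether AIF is
currently enabled. The names (toy_dev, toy_aif_enable, isc_refcount) are
hypothetical and only stand in for the real device state and the
kernel-side count; this is not the code from the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the guest ISC reference count. */
static int isc_refcount;

/* Hypothetical device state; aif_enabled is the guard flag. */
struct toy_dev {
    bool aif_enabled;
};

/* Enable AIF only if it is not already enabled, so a repeated enable
 * cannot take a second reference. Returns true if state changed. */
static bool toy_aif_enable(struct toy_dev *d)
{
    if (d->aif_enabled) {
        return false;
    }
    isc_refcount++;
    d->aif_enabled = true;
    return true;
}

/* Disable AIF only if it is currently enabled, so a repeated disable
 * cannot drop a reference that was never taken. */
static bool toy_aif_disable(struct toy_dev *d)
{
    if (!d->aif_enabled) {
        return false;
    }
    assert(isc_refcount > 0);
    isc_refcount--;
    d->aif_enabled = false;
    return true;
}

/* Reset path in this model: disable first, then the later re-enable
 * takes exactly one reference again. */
static void toy_reset(struct toy_dev *d)
{
    toy_aif_disable(d);
    toy_aif_enable(d);
}
```

With the guard in place, any number of toy_reset() calls leaves the count
at one for an enabled device; skipping the disable in toy_reset() while
removing the guard would let the count grow on every reset.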
Matthew Rosato (3):
  s390x/pci: avoid double enable/disable of aif
  s390x/pci: refresh fh before disabling aif
  s390x/pci: drive ISM reset from subsystem reset
Is this material for -stable, or is there no need to bother?
(Changes 1 and 2 apply to 7.2, although change 2 fixes a later change;
all 3 apply to 8.1, although change 3 fixes a later change; and all 3 can be
picked up for 8.2, I guess.)
Thanks,
/mjt