** Description changed:

+ [ Impact ]
+ 
+ s390/pci: Fix immediate re-add of PCI function after remove
+ 
+ A PCI function may be reserved directly after being
+ deconfigured. If it subsequently returns in the standby
+ state, Linux may not be able to use the new instance and
+ generates a kernel warning about trying to create an already
+ existing sysfs file for the IOMMU.
+ 
+ The problem occurs because the new instance of the same
+ underlying device is created before the prior instance is
+ completely torn down. This happens because the lifetime of the
+ PCI device representation in Linux is determined by reference
+ counts. A driver, the network stack, or even user-space
+ (including via vfio-pci) may be holding onto the device
+ representation even after the underlying device is gone.
+ 
+ The solution to this is twofold. Firstly, allow re-using the
+ pre-existing struct zpci_dev and/or struct pci_dev for the newly
+ re-added instance of the underlying device up until the point
+ where the struct zpci_dev is fully removed. Secondly, serialize
+ the addition and removal of PCI functions such that re-adding
+ a new instance, after the old one is already being removed, will
+ wait for the removal to finish before adding the new instance.
+ This fix also builds on prior upstream work of serializing state
+ transitions for PCI devices, e.g. from configured to standby.
+ 
+ [ Fix ]
+ 
+ Backport from mainline:
+ - 0d48566d4b58 s390/pci: rename lock member in struct zpci_dev
+ - bcb5d6c76903 s390/pci: introduce lock to synchronize state of zpci_dev's
+ - 6ee600bfbe0f s390/pci: remove hotplug slot when releasing the device
+ - c4a585e952ca s390/pci: Fix potential double remove of hotplug slot
+ - 42420c50c68f s390/pci: Fix missing check for zpci_create_device() error return
+ - 05a2538f2b48 s390/pci: Fix duplicate pci_dev_put() in disable_slot() when PF has child VFs
+ - d76f96332967 s390/pci: Remove redundant bus removal and disable from zpci_release_device()
+ - 47c397844869 s390/pci: Prevent self deletion in disable_slot()
+ - 4b1815a52d7e s390/pci: Allow re-add of a reserved but not yet removed device
+ - 774a1fa880bc s390/pci: Serialize device addition and removal
+ 
+ [ Test Plan ]
+ 
+ The issue can be reproduced by observing how the kernel handles
+ NETH PCI functions. IBM Z firmware temporarily reserves NETH
+ PCI functions to check for pending service when the last FID of a PCHID
+ is deconfigured. When nothing is pending the PCI function is immediately
+ returned in the standby state, thus triggering this issue quite
+ reliably.
+ 
+ [ Where Problems Could Occur ]
+ 
+ The fix affects the PCI function lifecycle management in the s390 PCI
+ hotplug infrastructure, specifically the serialization and reuse logic
+ of zpci_dev and pci_dev structures during rapid remove and re-add
+ cycles. An issue with this fix may introduce problems such as stale or
+ incorrectly reused device state, leading to improper reinitialization of
+ PCI functions.
+ 
+ 
+ ---
+ 
  Description:   s390/pci: Fix immediate re-add of PCI function after
  remove
  
  Symptom:       A PCI function may be reserved directly after being
-                deconfigured. If it subsequently returns back in the standby
-                state Linux may not be able to use the new instance generating
-                a kernel warning about trying to create an already existing
-                sysfs file for the IOMMU.
+                deconfigured. If it subsequently returns back in the standby
+                state Linux may not be able to use the new instance generating
+                a kernel warning about trying to create an already existing
+                sysfs file for the IOMMU.
  
  Problem:       The problem occurs because the new instance of the same
-                underlying device is created before the prior instance is
-                completely torn down. This happens because the lifetime of the
-                PCI device representation in Linux is determined by reference
-                counts. A driver, the network stack, or even user-space
-                (including via vfio-pci) may be holding onto the device
-                representation even after the underlying device is gone.
+                underlying device is created before the prior instance is
+                completely torn down. This happens because the lifetime of the
+                PCI device representation in Linux is determined by reference
+                counts. A driver, the network stack, or even user-space
+                (including via vfio-pci) may be holding onto the device
+                representation even after the underlying device is gone.
  
  Solution:      The solution to this is twofold. Firstly allow re-using the
-                pre-existing struct zpci_dev and/or struct pci_dev for the newly
-                re-added instance of the underlying device up until the point
-                where the struct zpci_dev is fully removed. Secondly serialize
-                the addition and removal of PCI functions such that re-adding
-                a new instance, after the old one is already being removed, will
-                wait for the removal to finish before adding the new instance.
-                This fix also builds on prior upstream work of serializing state
-                transitions for PCI devices e.g. from configured to standby.
+                pre-existing struct zpci_dev and/or struct pci_dev for the newly
+                re-added instance of the underlying device up until the point
+                where the struct zpci_dev is fully removed. Secondly serialize
+                the addition and removal of PCI functions such that re-adding
+                a new instance, after the old one is already being removed, will
+                wait for the removal to finish before adding the new instance.
+                This fix also builds on prior upstream work of serializing state
+                transitions for PCI devices e.g. from configured to standby.
  
  Reproduction:  This problem was originally found with firmware which
-                temporarily reserves NETH PCI functions to check for pending
-                service when the last FID of a PCHID is deconfigured. When
-                nothing is pending the PCI function is immediately returned in
-                the standby state, thus triggering this issue quite reliably.
+                temporarily reserves NETH PCI functions to check for pending
+                service when the last FID of a PCHID is deconfigured. When
+                nothing is pending the PCI function is immediately returned in
+                the standby state, thus triggering this issue quite reliably.
  
  Upstream-ID:   0d48566d4b58946c8e1b0baac0347616060a81c9
-                bcb5d6c769039c8358a2359e7c3ea5d97ce93108
-                6ee600bfbe0f818ffb7748d99e9b0c89d0d9f02a
-                c4a585e952ca403a370586d3f16e8331a7564901
-                42420c50c68f3e95e90de2479464f420602229fc
-                05a2538f2b48500cf4e8a0a0ce76623cc5bafcf1
-                d76f9633296785343d45f85199f4138cb724b6d2
-                47c397844869ad0e6738afb5879c7492f4691122
-                4b1815a52d7eb03b3e0e6742c6728bc16a4b2d1d
-                774a1fa880bc949d88b5ddec9494a13be733dfa8
+                bcb5d6c769039c8358a2359e7c3ea5d97ce93108
+                6ee600bfbe0f818ffb7748d99e9b0c89d0d9f02a
+                c4a585e952ca403a370586d3f16e8331a7564901
+                42420c50c68f3e95e90de2479464f420602229fc
+                05a2538f2b48500cf4e8a0a0ce76623cc5bafcf1
+                d76f9633296785343d45f85199f4138cb724b6d2
+                47c397844869ad0e6738afb5879c7492f4691122
+                4b1815a52d7eb03b3e0e6742c6728bc16a4b2d1d
+                774a1fa880bc949d88b5ddec9494a13be733dfa8
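
The two-part solution described above (reuse of a not-yet-released device
representation, plus serialized add and remove) can be illustrated with a
small userspace toy model. This is only a sketch under stated assumptions,
not the kernel code: the class FunctionRegistry, its methods, the state
strings, and the reference-count bookkeeping are all hypothetical names
invented for illustration, and a plain Python lock stands in for whatever
locking the backported patches actually use.

```python
import threading


class FunctionRegistry:
    """Toy model of PCI function add/remove; every name here is hypothetical."""

    def __init__(self):
        # One lock serializes add and remove, so an add that arrives while a
        # removal is in progress waits until that removal has finished.
        self._lock = threading.Lock()
        self._entries = {}  # fid -> {"state": str, "refs": int}

    def add(self, fid):
        """Add a function, reusing an entry that is reserved but not yet gone."""
        with self._lock:
            entry = self._entries.get(fid)
            if entry is None:
                # No prior instance left: create one, held by the registry.
                self._entries[fid] = {"state": "standby", "refs": 1}
                return "created"
            if entry["state"] == "reserved":
                # Prior instance still pinned by another holder: reuse it
                # instead of creating a conflicting duplicate.
                entry["state"] = "standby"
                entry["refs"] += 1
                return "reused"
            return "exists"

    def get(self, fid):
        """Take an extra reference, as a driver or user space might."""
        with self._lock:
            self._entries[fid]["refs"] += 1

    def put(self, fid):
        """Drop a reference; the entry disappears with the last one."""
        with self._lock:
            self._put_locked(fid)

    def reserve(self, fid):
        """Mark the function reserved and drop the registry's own reference."""
        with self._lock:
            self._entries[fid]["state"] = "reserved"
            self._put_locked(fid)

    def _put_locked(self, fid):
        entry = self._entries[fid]
        entry["refs"] -= 1
        if entry["refs"] == 0:
            del self._entries[fid]
```

In this model, a function that is reserved while some other holder still has
a reference stays in the table, so an immediate add() reuses it rather than
colliding with it. Once the last reference is gone, a later add() creates a
fresh entry.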

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2114174

Title:
  [UBUNTU 24.04] s390/pci: Fix immediate re-add of PCI function after
  remove
