Imre Deak wrote:
[..]
> > > Also Imre tried with 2 PCI patches together 
> > > https://patchwork.freedesktop.org/series/134193/ 
> > > And still not good for those 4 systems (mtlp-9, bat-dg2-13/14 and 
> > > bat-adlp-11) :
> > > https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/index.html? 
> > > Dave, Dan, thoughts? 
> > 
> > Can you provide the dmesg from the failure system with the 2 patches 
> > applied please?
> 
> For the above 4 machines, mtlp-9 not having the originally reported WARN
> (at pci.c:4886) only some other lockdep issue, while the other 3
> machines having both the originally reported one and the other lockdep
> issue:

> https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/bat-mtlp-9/boot0.txt

This one does not seem to implicate cfg_access_lock at all. I wonder if
you revert the lockdep annotation completely if it still fails. I.e.
this is a new lockdep report for v6.10-rc independent of this new
cfg_access_lock annotation.

> https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/bat-dg2-13/boot0.txt
> https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/bat-dg2-14/boot0.txt
> https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_134193v1/bat-adlp-11/boot0.txt

These are all identical and are pointing out that vmd, via
pci_reset_bus(), has long been performing an unlocked secondary bus
reset that userspace could race and confuse the kernel.

I think the fix for that is below, but this is an increasingly spicy
level of change that gives me some pause, i.e. teach pci_bus_lock() to
lock the bridge itself:

-- 8< --
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 59e0949fb079..ac3999bc59e8 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5442,6 +5442,7 @@ static void pci_bus_lock(struct pci_bus *bus)
 {
        struct pci_dev *dev;
 
+       pci_dev_lock(bus->self);
        list_for_each_entry(dev, &bus->devices, bus_list) {
                pci_dev_lock(dev);
                if (dev->subordinate)
@@ -5459,6 +5460,7 @@ static void pci_bus_unlock(struct pci_bus *bus)
                        pci_bus_unlock(dev->subordinate);
                pci_dev_unlock(dev);
        }
+       pci_dev_unlock(bus->self);
 }
 
 /* Return 1 on successful lock, 0 on contention */
@@ -5466,6 +5468,7 @@ static int pci_bus_trylock(struct pci_bus *bus)
 {
        struct pci_dev *dev;
 
+       pci_dev_lock(bus->self);
        list_for_each_entry(dev, &bus->devices, bus_list) {
                if (!pci_dev_trylock(dev))
                        goto unlock;
@@ -5484,6 +5487,7 @@ static int pci_bus_trylock(struct pci_bus *bus)
                        pci_bus_unlock(dev->subordinate);
                pci_dev_unlock(dev);
        }
+       pci_dev_unlock(bus->self);
        return 0;
 }
 

Reply via email to