Re: [PATCH v1] powerpc/pci: restore LSI mappings on card present state change
On 8/27/24 22:27, Oleksandr Ocheretnyi wrote: Commit 450be4960a0f ("powerpc/pci: Remove LSI mappings on device teardown") frees irq descriptors on PCIe hotplug link change event (Link Down), but the disposed mappings are not restored back on PCIe hotplug link change event (Card present). This change restores IRQ mappings disposed earlier when pcieport link's gone down. So, the call pci_read_irq_line is invoked again on pcieport's state change (Card present). Few things are important to know regarding these change-sets: 1. The commit (450be4960aof) addressed an issue where the removal or hot-unplug of an LSI passthroughed IO adapter was not working on pseries machines. This was due to interrupt resources not getting cleaned up before removal. Since there were no pcibios_* hooks for the interrupt cleanup, the interrupt-related resource cleanup has been addressed using the notifier framework and an explicit call of ppc_pci_intx_release(). 2. Does without your current patch and after hot-plug operation, device is not working (io not happening or interrupt not getting generated) correctly ? 3. There is already a pcibios_* hook available for creating the interrupt mapping. Here's a snippet - /* * Called after device has been added to bus and * before sysfs has been created. */ void (*pcibios_bus_add_device)(struct pci_dev *pdev); Above function calls - below function to restore the irq mapping. /* Read default IRQs and fixup if necessary */ pci_read_irq_line(dev); 4. Does the above pcibios_* hook not sufficient for enabling the interrupt mapping or does it not getting invoked during hot-plug operation ? Fixes 450be4960a0f ("powerpc/pci: Remove LSI mappings on device teardown") Signed-off-by: Oleksandr Ocheretnyi --- arch/powerpc/kernel/pci-common.c | 30 -- 1 file changed, 20 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index eac84d687b53..a0e7cab2baa7 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -390,22 +390,32 @@ static void ppc_pci_intx_release(struct kref *kref) kfree(vi); } +static int pci_read_irq_line(struct pci_dev *pci_dev); + static int ppc_pci_unmap_irq_line(struct notifier_block *nb, unsigned long action, void *data) { struct pci_dev *pdev = to_pci_dev(data); - if (action == BUS_NOTIFY_DEL_DEVICE) { - struct pci_intx_virq *vi; - - mutex_lock(&intx_mutex); - list_for_each_entry(vi, &intx_list, list_node) { - if (vi->virq == pdev->irq) { - kref_put(&vi->kref, ppc_pci_intx_release); - break; + switch (action) { + case BUS_NOTIFY_DEL_DEVICE: + { + struct pci_intx_virq *vi; + + mutex_lock(&intx_mutex); + list_for_each_entry(vi, &intx_list, list_node) { + if (vi->virq == pdev->irq) { + kref_put(&vi->kref, ppc_pci_intx_release); + break; + } + } + mutex_unlock(&intx_mutex); } - } - mutex_unlock(&intx_mutex); + break; + + case BUS_NOTIFY_ADD_DEVICE: + pci_read_irq_line(pdev); + break; The above code is fine only if my aforementioned points do not hold. } return NOTIFY_DONE; Best Regards, Krishna
Re: [PATCH v4 1/2] pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv
Hi Michael, When you apply/merge the patch, please consider below URL in commit message. This is the URL where this issue was reported - https://lists.ozlabs.org/pipermail/linuxppc-dev/2023-December/267034.html Thanks, Krishna On 7/1/24 1:15 PM, Krishna Kumar wrote: > Description of the problem: The hotplug driver for powerpc > (pci/hotplug/pnv_php.c) gives kernel crash when we try to > hot-unplug/disable the PCIe switch/bridge from the PHB. > > Root Cause of Crash: The crash is due to the reason that, though the msi > data structure has been released during disable/hot-unplug path and it > has been assigned with NULL, still during unregistration the code was > again trying to explicitly disable the MSI which causes the NULL pointer > dereference and kernel crash. > > The patch fixes the check during unregistration path to prevent invoking > pci_disable_msi/msix() since its data structure is already freed. > > Cc: Michael Ellerman > Cc: Nicholas Piggin > Cc: Christophe Leroy > Cc: "Aneesh Kumar K.V" > Cc: Bjorn Helgaas > Cc: Gaurav Batra > Cc: Nathan Lynch > Cc: Brian King > > Acked-by: Bjorn Helgaas > Tested-by: Shawn Anastasio > Signed-off-by: Krishna Kumar > --- > drivers/pci/hotplug/pnv_php.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c > index 694349be9d0a..573a41869c15 100644 > --- a/drivers/pci/hotplug/pnv_php.c > +++ b/drivers/pci/hotplug/pnv_php.c > @@ -40,7 +40,6 @@ static void pnv_php_disable_irq(struct pnv_php_slot > *php_slot, > bool disable_device) > { > struct pci_dev *pdev = php_slot->pdev; > - int irq = php_slot->irq; > u16 ctrl; > > if (php_slot->irq > 0) { > @@ -59,7 +58,7 @@ static void pnv_php_disable_irq(struct pnv_php_slot > *php_slot, > php_slot->wq = NULL; > } > > - if (disable_device || irq > 0) { > + if (disable_device) { > if (pdev->msix_enabled) > pci_disable_msix(pdev); > else if (pdev->msi_enabled)
[PATCH v4 2/2] powerpc: hotplug driver bridge support
There is an issue with the hotplug operation when it's done on the bridge/switch slot. The bridge-port and devices behind the bridge, which become offline by hot-unplug operation, don't get hot-plugged/enabled by doing hot-plug operation on that slot. Only the first port of the bridge gets enabled and the remaining port/devices remain unplugged. The hot plug/unplug operation is done by the hotplug driver (drivers/pci/hotplug/pnv_php.c). Root Cause Analysis: This behavior is due to missing code for the switch/bridge. The existing driver depends on pci_hp_add_devices() function for device enablement. This function calls pci_scan_slot() on only one device-node/port of the bridge, not on all the siblings' device-node/port. The missing code needs to be added which will find all the sibling device-nodes/bridge-ports and will run explicit pci_scan_slot() on those. A new function has been added for this purpose which gets invoked from pci_hp_add_devices(). This new function pci_traverse_sibling_nodes_and_scan_slot() gets all the sibling bridge-ports by traversal and explicitly invokes pci_scan_slot on them. Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Cc: "Aneesh Kumar K.V" Cc: Bjorn Helgaas Cc: Gaurav Batra Cc: Nathan Lynch Cc: Brian King Tested-by: Shawn Anastasio Signed-off-by: Krishna Kumar --- arch/powerpc/include/asm/ppc-pci.h | 4 arch/powerpc/kernel/pci-hotplug.c | 5 ++--- arch/powerpc/kernel/pci_dn.c | 32 ++ 3 files changed, 38 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h index a8b7e8682f5b..83db8d0798ac 100644 --- a/arch/powerpc/include/asm/ppc-pci.h +++ b/arch/powerpc/include/asm/ppc-pci.h @@ -28,6 +28,10 @@ struct pci_dn; void *pci_traverse_device_nodes(struct device_node *start, void *(*fn)(struct device_node *, void *), void *data); + +void pci_traverse_sibling_nodes_and_scan_slot(struct device_node *start, + struct pci_bus *bus); + extern void pci_devs_phb_init_dynamic(struct pci_controller *phb); #if defined(CONFIG_IOMMU_API) && (defined(CONFIG_PPC_PSERIES) || \ diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c index 0fe251c6ac2c..639a3d592fe2 100644 --- a/arch/powerpc/kernel/pci-hotplug.c +++ b/arch/powerpc/kernel/pci-hotplug.c @@ -106,7 +106,7 @@ EXPORT_SYMBOL_GPL(pci_hp_remove_devices); */ void pci_hp_add_devices(struct pci_bus *bus) { - int slotno, mode, max; + int mode, max; struct pci_dev *dev; struct pci_controller *phb; struct device_node *dn = pci_bus_to_OF_node(bus); @@ -129,8 +129,7 @@ void pci_hp_add_devices(struct pci_bus *bus) * order for fully rescan all the way down to pick them up. * They can have been removed during partial hotplug. */ - slotno = PCI_SLOT(PCI_DN(dn->child)->devfn); - pci_scan_slot(bus, PCI_DEVFN(slotno, 0)); + pci_traverse_sibling_nodes_and_scan_slot(dn, bus); max = bus->busn_res.start; /* * Scan bridges that are already configured. We don't touch diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c index 38561d6a2079..bea612759832 100644 --- a/arch/powerpc/kernel/pci_dn.c +++ b/arch/powerpc/kernel/pci_dn.c @@ -493,4 +493,36 @@ static void pci_dev_pdn_setup(struct pci_dev *pdev) pdn = pci_get_pdn(pdev); pdev->dev.archdata.pci_data = pdn; } + +void pci_traverse_sibling_nodes_and_scan_slot(struct device_node *start, struct pci_bus *bus) +{ + struct device_node *dn; + int slotno; + + u32 class = 0; + + if (!of_property_read_u32(start->child, "class-code", &class)) { + /* Call of pci_scan_slot for non-bridge/EP case */ + if (!((class >> 8) == PCI_CLASS_BRIDGE_PCI)) { + slotno = PCI_SLOT(PCI_DN(start->child)->devfn); + pci_scan_slot(bus, PCI_DEVFN(slotno, 0)); + return; + } + } + + /* Iterate all siblings */ + for_each_child_of_node(start, dn) { + class = 0; + + if (!of_property_read_u32(start->child, "class-code", &class)) { + /* Call of pci_scan_slot on each sibling-nodes/bridge-ports */ + if ((class >> 8) == PCI_CLASS_BRIDGE_PCI) { + slotno = PCI_SLOT(PCI_DN(dn)->devfn); + pci_scan_slot(bus, PCI_DEVFN(slotno, 0)); + } + } + } + +} + DECLARE_PCI_FIXUP_EARLY(PCI_ANY_ID, PCI_ANY_ID, pci_dev_pdn_setup); -- 2.45.0
[PATCH v4 1/2] pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv
Description of the problem: The hotplug driver for powerpc (pci/hotplug/pnv_php.c) gives kernel crash when we try to hot-unplug/disable the PCIe switch/bridge from the PHB. Root Cause of Crash: The crash is due to the reason that, though the msi data structure has been released during disable/hot-unplug path and it has been assigned with NULL, still during unregistration the code was again trying to explicitly disable the MSI which causes the NULL pointer dereference and kernel crash. The patch fixes the check during unregistration path to prevent invoking pci_disable_msi/msix() since its data structure is already freed. Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Cc: "Aneesh Kumar K.V" Cc: Bjorn Helgaas Cc: Gaurav Batra Cc: Nathan Lynch Cc: Brian King Acked-by: Bjorn Helgaas Tested-by: Shawn Anastasio Signed-off-by: Krishna Kumar --- drivers/pci/hotplug/pnv_php.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c index 694349be9d0a..573a41869c15 100644 --- a/drivers/pci/hotplug/pnv_php.c +++ b/drivers/pci/hotplug/pnv_php.c @@ -40,7 +40,6 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot, bool disable_device) { struct pci_dev *pdev = php_slot->pdev; - int irq = php_slot->irq; u16 ctrl; if (php_slot->irq > 0) { @@ -59,7 +58,7 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot, php_slot->wq = NULL; } - if (disable_device || irq > 0) { + if (disable_device) { if (pdev->msix_enabled) pci_disable_msix(pdev); else if (pdev->msi_enabled) -- 2.45.0
[PATCH v4 0/2] PCI hotplug driver fixes
The fix of Powerpc hotplug driver (drivers/pci/hotplug/pnv_php.c) addresses below two issues. 1. Kernel Crash during hot unplug of bridge/switch slot. 2. Bridge Support Enablement - Previously, when we do a hot-unplug operation on a bridge slot, all the ports and devices behind the bridge-ports would be hot-unplugged/offline, but when we do a hot-plug operation on the same bridge slot, all the ports and devices behind the bridge would not get hot-plugged/online. In this case, Only the first port of the bridge gets enabled and the remaining port/devices remain unplugged/offline. After the fix, The hot-unplug and hot-plug operations on the slot associated with the bridge started behaving correctly and became in sync. Now, after the hot plug operation on the same slot, all the bridge ports and devices behind the bridge become hot-plugged/online/restored in the same manner as it was before the hot-unplug operation. Krishna Kumar (2): pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv powerpc: hotplug driver bridge support arch/powerpc/include/asm/ppc-pci.h | 4 arch/powerpc/kernel/pci-hotplug.c | 5 ++--- arch/powerpc/kernel/pci_dn.c | 32 ++ drivers/pci/hotplug/pnv_php.c | 3 +-- 4 files changed, 39 insertions(+), 5 deletions(-) Changelog: == v4: 1st July 2024 - Changed the description and corrected the typo in patch1 - Added Acked-by and Tested-by tag in patch1 - Added Tested-by tag in patch2 v3: 24 June 2024 - Removed the DPC keyword from description in cover letter and patch2 v2: 14 May 2024 - Used of_property_read_u32() in place of of_get_property() and of_read_number(). [patch2] - Removed some unnecessary variable and changed the function return type from void* to void. [patch2] - Removed the export declaration of pci_traverse_sibling_nodes_and_scan_slot() as its not needed. [patch2] -- 2.45.0
Re: [PATCH v2 2/2] powerpc: hotplug driver bridge support
On 5/31/24 7:22 PM, Krishna Kumar wrote: > On 5/31/24 16:14, Krishna Kumar wrote: >> On 5/23/24 20:22, Krishna Kumar wrote: >>> Hi Oliver, >>> >>> Thanks for your suggestions. Pls find my response: >>> >>> On 5/20/24 20:29, Oliver O'Halloran wrote: >>>> On Fri, May 17, 2024 at 9:15 PM krishna kumar >>>> wrote: >>>>>> Uh, if I'm reading this right it looks like your "slot" C5 is actually >>>>>> the PCIe switch's internal bus which is definitely not hot pluggable. >>>>> It's a hotplug slot. Please see the snippet below: >>>>> >>>>> :~$ sudo lspci -vvv -s 0004:02:00.0 | grep --color HotPlug >>>>> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ >>>>> Surprise- >>>>> :~$ >>>>> >>>>> :~$ sudo lspci -vvv -s 0004:02:01.0 | grep --color HotPlug >>>>> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ >>>>> Surprise- >>>>> :~$ >>>>> >>>>> :~$ sudo lspci -vvv -s 0004:02:02.0 | grep --color HotPlug >>>>> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ >>>>> Surprise- >>>>> :~$ >>>>> >>>>> :~$ sudo lspci -vvv -s 0004:02:03.0 | grep --color HotPlug >>>>> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ >>>>> Surprise- >>>>> :~$ >>>> All this is showing is that the switch downstream ports on bus 0004:02 >>>> have a slot capability. I already know that (see what I said >>>> previously about physical links). The fact the downstream ports have a >>>> slot capability also has absolutely nothing to do with anything I was >>>> saying. Look at the lspci output for 0004:01:00.0 which is the >>>> switch's upstream port. The upstream port device will not have a slot >>>> capability because it's a bridge into the virtual PCI bus that is >>>> internal to the switch. >>> Let me try to understand your suggestion and what needs to be done now: >>> >>> lspci -nn snippet in current scenario: >>> >>> 0004:01:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] >>> 0004:01:00.1 Memory controller [0580]: PMC-Sierra Inc. Device [11f8:4052] >>> 0004:02:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] >>> 0004:02:01.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] >>> 0004:02:02.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] >>> 0004:02:03.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] >>> >>> lspci -tv snippet in current scenario: >>> >>> +-[0001:00]---00.0-[01-0a]--+-00.0-[02-0a]--+-00.0-[03-07]-- >>> | | +-01.0-[08]--+-00.0 Broadcom >>> Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe >>> | | | +-00.1 Broadcom >>> Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe >>> | | | +-00.2 Broadcom >>> Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe >>> | | | \-00.3 Broadcom >>> Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe >>> | | +-02.0-[09]00.0 Broadcom >>> / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 >>> | | \-03.0-[0a]00.0 IBM PCI-E >>> IPR SAS Adapter (ASIC) >>> | \-00.1 PMC-Sierra Inc. Device 4052 >>> >>> C5 bus address: >>> >>> [root@ltczzess2-lp1 ~]# cat /sys/bus/pci/slots/C5/address >>> 0004:02:00 >>> [root@ltczzess2-lp1 ~]# >>> >>> 0004:01:00.0 doesn't have hotplug capability but 0004:02:00.0 does >>> have this capability. Below snippet tells about this: >>> >>> [root@ltczzess2-lp1 ~]# sudo lspci -vvv -s 0004:01:00.0 | grep --color >>> HotPlug >>> [root@ltczzess2-lp1 ~]# >>> [root@ltczzess2-lp1 ~]# sudo lspci -vvv -s 0004:02:00.0 | grep --color >>> HotPlug >>> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ >>> Surprise- >>> [root@ltczzess2-lp1 ~]# >>> >>> In Function - pnv_php_register_one() is responsible for slot creation from >>>
[PATCH v3 2/2] powerpc: hotplug driver bridge support
There is an issue with the hotplug operation when it's done on the bridge/switch slot. The bridge-port and devices behind the bridge, which become offline by hot-unplug operation, don't get hot-plugged/enabled by doing hot-plug operation on that slot. Only the first port of the bridge gets enabled and the remaining port/devices remain unplugged. The hot plug/unplug operation is done by the hotplug driver (drivers/pci/hotplug/pnv_php.c). Root Cause Analysis: This behavior is due to missing code for the switch/bridge. The existing driver depends on pci_hp_add_devices() function for device enablement. This function calls pci_scan_slot() on only one device-node/port of the bridge, not on all the siblings' device-node/port. The missing code needs to be added which will find all the sibling device-nodes/bridge-ports and will run explicit pci_scan_slot() on those. A new function has been added for this purpose which gets invoked from pci_hp_add_devices(). This new function pci_traverse_sibling_nodes_and_scan_slot() gets all the sibling bridge-ports by traversal and explicitly invokes pci_scan_slot on them. Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Cc: "Aneesh Kumar K.V" Cc: Bjorn Helgaas Cc: Gaurav Batra Cc: Nathan Lynch Cc: Brian King Signed-off-by: Krishna Kumar --- arch/powerpc/include/asm/ppc-pci.h | 4 arch/powerpc/kernel/pci-hotplug.c | 5 ++--- arch/powerpc/kernel/pci_dn.c | 32 ++ 3 files changed, 38 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h index a8b7e8682f5b..83db8d0798ac 100644 --- a/arch/powerpc/include/asm/ppc-pci.h +++ b/arch/powerpc/include/asm/ppc-pci.h @@ -28,6 +28,10 @@ struct pci_dn; void *pci_traverse_device_nodes(struct device_node *start, void *(*fn)(struct device_node *, void *), void *data); + +void pci_traverse_sibling_nodes_and_scan_slot(struct device_node *start, + struct pci_bus *bus); + extern void pci_devs_phb_init_dynamic(struct pci_controller *phb); #if defined(CONFIG_IOMMU_API) && (defined(CONFIG_PPC_PSERIES) || \ diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c index 0fe251c6ac2c..639a3d592fe2 100644 --- a/arch/powerpc/kernel/pci-hotplug.c +++ b/arch/powerpc/kernel/pci-hotplug.c @@ -106,7 +106,7 @@ EXPORT_SYMBOL_GPL(pci_hp_remove_devices); */ void pci_hp_add_devices(struct pci_bus *bus) { - int slotno, mode, max; + int mode, max; struct pci_dev *dev; struct pci_controller *phb; struct device_node *dn = pci_bus_to_OF_node(bus); @@ -129,8 +129,7 @@ void pci_hp_add_devices(struct pci_bus *bus) * order for fully rescan all the way down to pick them up. * They can have been removed during partial hotplug. */ - slotno = PCI_SLOT(PCI_DN(dn->child)->devfn); - pci_scan_slot(bus, PCI_DEVFN(slotno, 0)); + pci_traverse_sibling_nodes_and_scan_slot(dn, bus); max = bus->busn_res.start; /* * Scan bridges that are already configured. We don't touch diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c index 38561d6a2079..bea612759832 100644 --- a/arch/powerpc/kernel/pci_dn.c +++ b/arch/powerpc/kernel/pci_dn.c @@ -493,4 +493,36 @@ static void pci_dev_pdn_setup(struct pci_dev *pdev) pdn = pci_get_pdn(pdev); pdev->dev.archdata.pci_data = pdn; } + +void pci_traverse_sibling_nodes_and_scan_slot(struct device_node *start, struct pci_bus *bus) +{ + struct device_node *dn; + int slotno; + + u32 class = 0; + + if (!of_property_read_u32(start->child, "class-code", &class)) { + /* Call of pci_scan_slot for non-bridge/EP case */ + if (!((class >> 8) == PCI_CLASS_BRIDGE_PCI)) { + slotno = PCI_SLOT(PCI_DN(start->child)->devfn); + pci_scan_slot(bus, PCI_DEVFN(slotno, 0)); + return; + } + } + + /* Iterate all siblings */ + for_each_child_of_node(start, dn) { + class = 0; + + if (!of_property_read_u32(start->child, "class-code", &class)) { + /* Call of pci_scan_slot on each sibling-nodes/bridge-ports */ + if ((class >> 8) == PCI_CLASS_BRIDGE_PCI) { + slotno = PCI_SLOT(PCI_DN(dn)->devfn); + pci_scan_slot(bus, PCI_DEVFN(slotno, 0)); + } + } + } + +} + DECLARE_PCI_FIXUP_EARLY(PCI_ANY_ID, PCI_ANY_ID, pci_dev_pdn_setup); -- 2.45.0
[PATCH v3 1/2] pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv
Description of the problem: The hotplug driver for powerpc (pci/hotplug/pnv_php.c) gives kernel crash when we try to hot-unplug/disable the PCIe switch/bridge from the PHB. Root Cause of Crash: The crash is due to the reason that, though the msi data structure has been released during disable/hot-unplug path and it has been assigned with NULL, still during unregistartion the code was again trying to explicitly disable the msi which causes the Null pointer dereference and kernel crash. Proposed Fix : The fix is to correct the check during unregistration path so that the code should not try to invoke pci_disable_msi/msix() if its data structure is already freed. Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Cc: "Aneesh Kumar K.V" Cc: Bjorn Helgaas Cc: Gaurav Batra Cc: Nathan Lynch Cc: Brian King Signed-off-by: Krishna Kumar --- drivers/pci/hotplug/pnv_php.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c index 694349be9d0a..573a41869c15 100644 --- a/drivers/pci/hotplug/pnv_php.c +++ b/drivers/pci/hotplug/pnv_php.c @@ -40,7 +40,6 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot, bool disable_device) { struct pci_dev *pdev = php_slot->pdev; - int irq = php_slot->irq; u16 ctrl; if (php_slot->irq > 0) { @@ -59,7 +58,7 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot, php_slot->wq = NULL; } - if (disable_device || irq > 0) { + if (disable_device) { if (pdev->msix_enabled) pci_disable_msix(pdev); else if (pdev->msi_enabled) -- 2.45.0
[PATCH v3 0/2] PCI hotplug driver fixes
The fix of Powerpc hotplug driver (drivers/pci/hotplug/pnv_php.c) addresses below two issues. 1. Kernel Crash during hot unplug of bridge/switch slot. 2. Bridge Support Enablement - Previously, when we do a hot-unplug operation on a bridge slot, all the ports and devices behind the bridge-ports would be hot-unplugged/offline, but when we do a hot-plug operation on the same bridge slot,??all??the ports and devices behind??the bridge??would not get hot-plugged/online.??In this case, Only??the??first port of the bridge gets enabled??and??the remaining port/devices remain unplugged/offline. After the fix, The??hot-unplug and hot-plug operations on the slot associated with the bridge started behaving correctly and became in sync. Now, after the hot plug operation on the same slot, all the bridge ports and devices behind the bridge become hot-plugged/online/restored in the same manner as it was before the hot-unplug operation. Krishna Kumar (2): pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv powerpc: hotplug driver bridge support arch/powerpc/include/asm/ppc-pci.h | 4 arch/powerpc/kernel/pci-hotplug.c | 5 ++--- arch/powerpc/kernel/pci_dn.c | 32 ++ drivers/pci/hotplug/pnv_php.c | 3 +-- 4 files changed, 39 insertions(+), 5 deletions(-) Changelog: == v3: 24 June 2024 - Removed the DPC keyword from description in cover letter and patch2 v2: 14 May 2024 - Used of_property_read_u32() in place of of_get_property() and of_read_number(). [patch2] - Removed some unnecessary variable and changed the function return type from void* to void. [patch2] - Removed the export declaration of pci_traverse_sibling_nodes_and_scan_slot() as its not needed. [patch2] -- 2.45.0
Re: [PATCH v2 2/2] powerpc: hotplug driver bridge support
On 5/31/24 16:14, Krishna Kumar wrote: > On 5/23/24 20:22, Krishna Kumar wrote: >> >> Hi Oliver, >> >> Thanks for your suggestions. Pls find my response: >> >> On 5/20/24 20:29, Oliver O'Halloran wrote: >>> On Fri, May 17, 2024 at 9:15 PM krishna kumar >>> wrote: >>>>> Uh, if I'm reading this right it looks like your "slot" C5 is actually >>>>> the PCIe switch's internal bus which is definitely not hot pluggable. >>>> It's a hotplug slot. Please see the snippet below: >>>> >>>> :~$ sudo lspci -vvv -s 0004:02:00.0 | grep --color HotPlug >>>> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ >>>> Surprise- >>>> :~$ >>>> >>>> :~$ sudo lspci -vvv -s 0004:02:01.0 | grep --color HotPlug >>>> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ >>>> Surprise- >>>> :~$ >>>> >>>> :~$ sudo lspci -vvv -s 0004:02:02.0 | grep --color HotPlug >>>> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ >>>> Surprise- >>>> :~$ >>>> >>>> :~$ sudo lspci -vvv -s 0004:02:03.0 | grep --color HotPlug >>>> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ >>>> Surprise- >>>> :~$ >>> All this is showing is that the switch downstream ports on bus 0004:02 >>> have a slot capability. I already know that (see what I said >>> previously about physical links). The fact the downstream ports have a >>> slot capability also has absolutely nothing to do with anything I was >>> saying. Look at the lspci output for 0004:01:00.0 which is the >>> switch's upstream port. The upstream port device will not have a slot >>> capability because it's a bridge into the virtual PCI bus that is >>> internal to the switch. >> Let me try to understand your suggestion and what needs to be done now: >> >> lspci -nn snippet in current scenario: >> >> 0004:01:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] >> 0004:01:00.1 Memory controller [0580]: PMC-Sierra Inc. Device [11f8:4052] >> 0004:02:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] >> 0004:02:01.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] >> 0004:02:02.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] >> 0004:02:03.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] >> >> lspci -tv snippet in current scenario: >> >> +-[0001:00]---00.0-[01-0a]--+-00.0-[02-0a]--+-00.0-[03-07]-- >> | | +-01.0-[08]--+-00.0 Broadcom >> Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe >> | | | +-00.1 Broadcom >> Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe >> | | | +-00.2 Broadcom >> Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe >> | | | \-00.3 Broadcom >> Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe >> | | +-02.0-[09]00.0 Broadcom / >> LSI SAS3216 PCI-Express Fusion-MPT SAS-3 >> | | \-03.0-[0a]00.0 IBM PCI-E >> IPR SAS Adapter (ASIC) >> | \-00.1 PMC-Sierra Inc. Device 4052 >> >> C5 bus address: >> >> [root@ltczzess2-lp1 ~]# cat /sys/bus/pci/slots/C5/address >> 0004:02:00 >> [root@ltczzess2-lp1 ~]# >> >> 0004:01:00.0 doesn't have hotplug capability but 0004:02:00.0 does >> have this capability. Below snippet tells about this: >> >> [root@ltczzess2-lp1 ~]# sudo lspci -vvv -s 0004:01:00.0 | grep --color >> HotPlug >> [root@ltczzess2-lp1 ~]# >> [root@ltczzess2-lp1 ~]# sudo lspci -vvv -s 0004:02:00.0 | grep --color >> HotPlug >> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ Surprise- >> [root@ltczzess2-lp1 ~]# >> >> In Function - pnv_php_register_one() is responsible for slot creation from >> hotplug capable device node: >> >> Below is the current code that does check the device node for hot plug >> capability and takes the decision >> >> /* Check if it's hotpluggable slot */ >> ret = of_property_read_u32(dn, "ibm,slot-pluggable", &prop32); >> if (ret || !prop32){ >> return -ENXI
Re: [PATCH v2 2/2] powerpc: hotplug driver bridge support
On 5/23/24 20:22, Krishna Kumar wrote: > > > Hi Oliver, > > Thanks for your suggestions. Pls find my response: > > On 5/20/24 20:29, Oliver O'Halloran wrote: >> On Fri, May 17, 2024 at 9:15 PM krishna kumar wrote: >>>> Uh, if I'm reading this right it looks like your "slot" C5 is actually >>>> the PCIe switch's internal bus which is definitely not hot pluggable. >>> It's a hotplug slot. Please see the snippet below: >>> >>> :~$ sudo lspci -vvv -s 0004:02:00.0 | grep --color HotPlug >>> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ >>> Surprise- >>> :~$ >>> >>> :~$ sudo lspci -vvv -s 0004:02:01.0 | grep --color HotPlug >>> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ >>> Surprise- >>> :~$ >>> >>> :~$ sudo lspci -vvv -s 0004:02:02.0 | grep --color HotPlug >>> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ >>> Surprise- >>> :~$ >>> >>> :~$ sudo lspci -vvv -s 0004:02:03.0 | grep --color HotPlug >>> SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ >>> Surprise- >>> :~$ >> All this is showing is that the switch downstream ports on bus 0004:02 >> have a slot capability. I already know that (see what I said >> previously about physical links). The fact the downstream ports have a >> slot capability also has absolutely nothing to do with anything I was >> saying. Look at the lspci output for 0004:01:00.0 which is the >> switch's upstream port. The upstream port device will not have a slot >> capability because it's a bridge into the virtual PCI bus that is >> internal to the switch. > > Let me try to understand your suggestion and what needs to be done now: > > lspci -nn snippet in current scenario: > > 0004:01:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] > 0004:01:00.1 Memory controller [0580]: PMC-Sierra Inc. Device [11f8:4052] > 0004:02:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] > 0004:02:01.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] > 0004:02:02.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] > 0004:02:03.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] > > lspci -tv snippet in current scenario: > > +-[0001:00]---00.0-[01-0a]--+-00.0-[02-0a]--+-00.0-[03-07]-- > | | +-01.0-[08]--+-00.0 Broadcom > Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe > | | | +-00.1 Broadcom > Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe > | | | +-00.2 Broadcom > Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe > | | | \-00.3 Broadcom > Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe > | | +-02.0-[09]00.0 Broadcom / > LSI SAS3216 PCI-Express Fusion-MPT SAS-3 > | | \-03.0-[0a]00.0 IBM PCI-E > IPR SAS Adapter (ASIC) > | \-00.1 PMC-Sierra Inc. Device 4052 > > C5 bus address: > > [root@ltczzess2-lp1 ~]# cat /sys/bus/pci/slots/C5/address > 0004:02:00 > [root@ltczzess2-lp1 ~]# > > 0004:01:00.0 doesn't have hotplug capability but 0004:02:00.0 does > have this capability. Below snippet tells about this: > > [root@ltczzess2-lp1 ~]# sudo lspci -vvv -s 0004:01:00.0 | grep --color HotPlug > [root@ltczzess2-lp1 ~]# > [root@ltczzess2-lp1 ~]# sudo lspci -vvv -s 0004:02:00.0 | grep --color HotPlug > SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ Surprise- > [root@ltczzess2-lp1 ~]# > > In Function - pnv_php_register_one() is responsible for slot creation from > hotplug capable device node: > > Below is the current code that does check the device node for hot plug > capability and takes the decision > > /* Check if it's hotpluggable slot */ > ret = of_property_read_u32(dn, "ibm,slot-pluggable", &prop32); > if (ret || !prop32){ > return -ENXIO; > } > > Its obvious that 0004:01:00.0 does not get above criteria fulfilled but > 0004:02:00.0 does, so is the current behavior (Upstream port is not became > C5 slot but downstream port became C5 slot). > > I am summarizing your suggested changes. Please let > me know if I've got it right: > > 1. Do you want me to modify the code so that the C5 > device-bdf and bus-address becom
Re: [PATCH v2 1/1] powerpc/numa: Online a node if PHB is attached.
On 5/17/24 19:55, Nilay Shroff wrote: In the current design, a numa-node is made online only if that node is attached to cpu/memory. With this design, if any PCI/IO device is found to be attached to a numa-node which is not online then the numa-node id of the corresponding PCI/IO device is set to NUMA_NO_NODE(-1). This design may negatively impact the performance of PCIe device if the numa-node assigned to PCIe device is -1 because in such case we may not be able to accurately calculate the distance between two nodes. The multi-controller NVMe PCIe disk has an issue with calculating the node distance if the PCIe NVMe controller is attached to a PCI host bridge which has numa-node id value set to NUMA_NO_NODE. This patch helps fix this ensuring that a cpu/memory less numa node is made online if it's attached to PCI host bridge. Signed-off-by: Nilay Shroff Thanks for fixing this. Looks good to me. Reviewed-by: Krishna Kumar (krish...@linux.ibm.com)
Re: [PATCH v2 2/2] powerpc: hotplug driver bridge support
Hi Oliver, Thanks for your suggestions. Pls find my response: On 5/20/24 20:29, Oliver O'Halloran wrote: On Fri, May 17, 2024 at 9:15 PM krishna kumar wrote: Uh, if I'm reading this right it looks like your "slot" C5 is actually the PCIe switch's internal bus which is definitely not hot pluggable. It's a hotplug slot. Please see the snippet below: :~$ sudo lspci -vvv -s 0004:02:00.0 | grep --color HotPlug SltCap:AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ Surprise- :~$ :~$ sudo lspci -vvv -s 0004:02:01.0 | grep --color HotPlug SltCap:AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ Surprise- :~$ :~$ sudo lspci -vvv -s 0004:02:02.0 | grep --color HotPlug SltCap:AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ Surprise- :~$ :~$ sudo lspci -vvv -s 0004:02:03.0 | grep --color HotPlug SltCap:AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ Surprise- :~$ All this is showing is that the switch downstream ports on bus 0004:02 have a slot capability. I already know that (see what I said previously about physical links). The fact the downstream ports have a slot capability also has absolutely nothing to do with anything I was saying. Look at the lspci output for 0004:01:00.0 which is the switch's upstream port. The upstream port device will not have a slot capability because it's a bridge into the virtual PCI bus that is internal to the switch. Let me try to understand your suggestion and what needs to be done now: lspci -nn snippet in current scenario: 0004:01:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:01:00.1 Memory controller [0580]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:01.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:02.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:03.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] lspci -tv snippet in current scenario: +-[0001:00]---00.0-[01-0a]--+-00.0-[02-0a]--+-00.0-[03-07]-- | | +-01.0-[08]--+-00.0 Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe | | | +-00.1 Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe | | | +-00.2 Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe | | | \-00.3 Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe | | +-02.0-[09]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | \-03.0-[0a]00.0 IBM PCI-E IPR SAS Adapter (ASIC) | \-00.1 PMC-Sierra Inc. Device 4052 C5 bus address: [root@ltczzess2-lp1 ~]# cat /sys/bus/pci/slots/C5/address 0004:02:00 [root@ltczzess2-lp1 ~]# 0004:01:00.0 doesn't have hotplug capability but 0004:02:00.0 does have this capability. Below snippet tells about this: [root@ltczzess2-lp1 ~]# sudo lspci -vvv -s 0004:01:00.0 | grep --color HotPlug [root@ltczzess2-lp1 ~]# [root@ltczzess2-lp1 ~]# sudo lspci -vvv -s 0004:02:00.0 | grep --color HotPlug SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ Surprise- [root@ltczzess2-lp1 ~]# In Function - pnv_php_register_one() is responsible for slot creation from hotplug capable device node: Below is the current code that does check the device node for hot plug capability and takes the decision /* Check if it's hotpluggable slot */ ret = of_property_read_u32(dn, "ibm,slot-pluggable", &prop32); if (ret || !prop32){ return -ENXIO; } Its obvious that 0004:01:00.0 does not get above criteria fulfilled but 0004:02:00.0 does, so is the current behavior (Upstream port is not became C5 slot but downstream port became C5 slot). I am summarizing your suggested changes. Please let me know if I've got it right: 1. Do you want me to modify the code so that the C5 device-bdf and bus-address become 0004:01:00/0004:01 instead of 0004:02:00/0004:01? 2. When performing a hot-unplug operation on C5, should all devices from 0004:01 be removed? And should all devices from 0004:02 also be removed? I think the answer is yes, but please confirm. 3. When performing a hot-plug operation on C5, should all the devices removed earlier from 0004:01 and 0004:02 be re-attached? 4. Will there be any PCIe topology changes in this workflow? Once you confirm the above requirements, we can discuss how to proceed further. I have some follow up questions from your last mail and I am putting these questions in below paragraphs as inline statements. It will confirm me if we should do above things or not. It seems like your explanation about the missing 0004:01
Re: [PATCH v2 2/2] powerpc: hotplug driver bridge support
Hi Oliver, Thanks for review. Please find my response below - On 5/17/24 08:12, Oliver O'Halloran wrote: On Tue, May 14, 2024 at 11:54 PM Krishna Kumar wrote: There is an issue with the hotplug operation when it's done on the bridge/switch slot. The bridge-port and devices behind the bridge, which become offline by hot-unplug operation, don't get hot-plugged/enabled by doing hot-plug operation on that slot. Only the first port of the bridge gets enabled and the remaining port/devices remain unplugged. The hot plug/unplug operation is done by the hotplug driver (drivers/pci/hotplug/pnv_php.c). Root Cause Analysis: This behavior is due to missing code for the DPC switch/bridge. I don't see anything touching DPC in this series? I apologize for the confusion. When I mentioned DPC, I was referring to the downward bridge port (the sibling ones) that were not enabled after the hot plug operation. I didn't mean to refer to DPC events. It seems that this caused some confusion, and I will remove this word in my next version of the patch. Thank you! *snip* Command for reproducing the issue : For hot unplug/disable - echo 0 > /sys/bus/pci/slots/C5/power For hot plug/enable -echo 1 > /sys/bus/pci/slots/C5/power where C5 is slot associated with bridge. Scenario/Tests: Output of lspci -nn before test is given below. This snippet contains devices used for testing on Powernv machine. 0004:02:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:01.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:02.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:03.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:08:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 [1000:00c9] (rev 01) 0004:09:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 [1000:00c9] (rev 01) Output of lspci -tv before test is as follows: # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- | | +-01.0-[08]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | +-02.0-[09]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | \-03.0-[0a-0e]-- | \-00.1 PMC-Sierra Inc. Device 4052 C5(bridge) and C6(End Point) slot address are as below: # cat /sys/bus/pci/slots/C5/address 0004:02:00 # cat /sys/bus/pci/slots/C6/address 0004:09:00 Uh, if I'm reading this right it looks like your "slot" C5 is actually the PCIe switch's internal bus which is definitely not hot pluggable. It's a hotplug slot. Please see the snippet below: :~$ sudo lspci -vvv -s 0004:02:00.0 | grep --color HotPlug SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ Surprise- :~$ :~$ sudo lspci -vvv -s 0004:02:01.0 | grep --color HotPlug SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ Surprise- :~$ :~$ sudo lspci -vvv -s 0004:02:02.0 | grep --color HotPlug SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ Surprise- :~$ :~$ sudo lspci -vvv -s 0004:02:03.0 | grep --color HotPlug SltCap: AttnBtn- PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ Surprise- :~$ I find it helps to look at the PCI topology in terms of where the physical PCIe links are. Here we've got: - A link between the PHB (0004:00:00.0) and the switch upstream port (0004:01:00.0) - A link from switch downstream port 0 (0004:02:00.0) to nothing - A link from switch downstream port 1 (0004:02:01.0) to a SAS card - A link from switch downstream port 2 (0004:02:02.0) to a SAS card - A link from switch downstream port 2 (0004:02:03.0) to nothing Note that there's no PCIe link between the switch upstream port (0004:01:00.0) and the downstream ports on bus 0004:02. The connection between those is invisible to us because it's custom bus logic internal to the PCIe switch ASIC. What I think has happened here is that system firmware has supplied bad PCIe slot information to OPAL which has resulted in pnv_php advertising a slot in the wrong place. Assuming this following the usual IBM convention I'd expect the bridge device for C5 to be the PHB's root port and the bus should be 0004:01. It seems like your explanation about the missing 0004:01:00.0 may be correct and could be due to a firmware bug. However, the scope of this patch does not relate to this issue. Additionally, if it starts with 0004:01:00.0 to 0004:01:03.0, the behavior of hot-unplug and hot-plug operations will remain inconsistent. This patch aims to address the inconsistent behavior of hot-unplug and hot-plug. It might be worth adding some logic to pnv_php to verify the PCI bridge upstream of the slot actually has the PCIe slot capability to guard aga
[PATCH v2 2/2] powerpc: hotplug driver bridge support
There is an issue with the hotplug operation when it's done on the bridge/switch slot. The bridge-port and devices behind the bridge, which become offline by hot-unplug operation, don't get hot-plugged/enabled by doing hot-plug operation on that slot. Only the first port of the bridge gets enabled and the remaining port/devices remain unplugged. The hot plug/unplug operation is done by the hotplug driver (drivers/pci/hotplug/pnv_php.c). Root Cause Analysis: This behavior is due to missing code for the DPC switch/bridge. The existing driver depends on pci_hp_add_devices() function for device enablement. This function calls pci_scan_slot() on only one device-node/port of the bridge, not on all the siblings' device-node/port. The missing code needs to be added which will find all the sibling device-nodes/bridge-ports and will run explicit pci_scan_slot() on those. A new function has been added for this purpose which gets invoked from pci_hp_add_devices(). This new function pci_traverse_sibling_nodes_and_scan_slot() gets all the sibling bridge-ports by traversal and explicitly invokes pci_scan_slot on them. Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Cc: "Aneesh Kumar K.V" Cc: Bjorn Helgaas Cc: Gaurav Batra Cc: Nathan Lynch Cc: Brian King Signed-off-by: Krishna Kumar --- Command for reproducing the issue : For hot unplug/disable - echo 0 > /sys/bus/pci/slots/C5/power For hot plug/enable -echo 1 > /sys/bus/pci/slots/C5/power where C5 is slot associated with bridge. Scenario/Tests: Output of lspci -nn before test is given below. This snippet contains devices used for testing on Powernv machine. 0004:02:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:01.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:02.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:03.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:08:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 [1000:00c9] (rev 01) 0004:09:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 [1000:00c9] (rev 01) Output of lspci -tv before test is as follows: # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- | | +-01.0-[08]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | +-02.0-[09]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | \-03.0-[0a-0e]-- | \-00.1 PMC-Sierra Inc. Device 4052 C5(bridge) and C6(End Point) slot address are as below: # cat /sys/bus/pci/slots/C5/address 0004:02:00 # cat /sys/bus/pci/slots/C6/address 0004:09:00 Hot-unplug operation on slot associated with bridge: # echo 0 > /sys/bus/pci/slots/C5/power # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]-- | \-00.1 PMC-Sierra Inc. Device 4052 >From the above lspci -tv output, it can be observed that hot unplug operation has removed all the PMC-Sierra bridge ports like: 00.0-[03-07], 01.0-[08], 02.0-[09], 03.0-[0a-0e] and the SAS devices behind the bridge-port. Without the fix, when the hot plug operation is done on the same slot, it adds only the first bridge port and doesn't restore all the bridge-ports and devices that it unplugged earlier. Below snippet shows this. Hot-plug operation on the bridge slot without the fix: # echo 1 > /sys/bus/pci/slots/C5/power # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- | \-00.1 PMC-Sierra Inc. Device 4052 After the fix, it restores all the devices in the same manner how it unplugged them earlier during the hot unplug operation. The below snippet shows the same. Hot-plug operation on bridge slot with the fix: # echo 1 > /sys/bus/pci/slots/C5/power # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- | | +-01.0-[08]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | +-02.0-[09]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | \-03.0-[0a-0e]-- | \-00.1 PMC-Sierra Inc. Device 4052 Removal of End point device behind bridge are also intact and behaving correctly. Hot-unplug operation on Endpoint device C6: # echo 0 > /sys/bus/pci/slots/C6/power # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- | | +-01.0-[08]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | +-02.0-[09]-- | | \-03.0-[0a-0e]-- | \-00.1 PMC-Sierra Inc. Device 4052 Hot-plug opera
[PATCH v2 1/2] pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv
Description of the problem: The hotplug driver for powerpc (pci/hotplug/pnv_php.c) gives kernel crash when we try to hot-unplug/disable the PCIe switch/bridge from the PHB. Root Cause of Crash: The crash is due to the reason that, though the msi data structure has been released during disable/hot-unplug path and it has been assigned with NULL, still during unregistartion the code was again trying to explicitly disable the msi which causes the Null pointer dereference and kernel crash. Proposed Fix : The fix is to correct the check during unregistration path so that the code should not try to invoke pci_disable_msi/msix() if its data structure is already freed. Cc: Michael Ellerman Cc: Nicholas Piggin Cc: Christophe Leroy Cc: "Aneesh Kumar K.V" Cc: Bjorn Helgaas Cc: Gaurav Batra Cc: Nathan Lynch Cc: Brian King Signed-off-by: Krishna Kumar --- Command used for reproducing the bug: echo 0 > /sys/bus/pci/slots/C5/power where C5 is slot associated with bridge. Snippet of Crash: Kernel attempted to read user page (10) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x0010 Faulting instruction address: 0xc0fad7d4 Oops: Kernel access of bad area, sig: 11 [#1] Hardware name: 5105-22E POWER9 0x4e1203 opal:v7.0-39-g4660e63a PowerNV NIP [c0fad7d4] mutex_lock+0x34/0x88 LR [c0fad7c8] mutex_lock+0x28/0x88 Call Trace: [c0017075f940] [c0fad7c8] mutex_lock+0x28/0x88 (unreliable) [c0017075f970] [c0214464] msi_lock_descs+0x28/0x3c [c0017075f990] [c08e8be8] pci_disable_msi+0x68/0xa8 [c0017075f9c0] [c090f0a4] pnv_php_disable_irq+0x2a0/0x2b0 [c0017075fab0] [c090f128] pnv_php_free_slot+0x74/0xc4 [c0017075fb30] [c0912184] pnv_php_disable_slot.part.0+0x1b8/0x24c [c0017075fc00] [c0902df0] power_write_file+0xf8/0x18c [c0017075fc80] [c08f84d8] pci_slot_attr_store+0x40/0x5c [c0017075fca0] [c06b4834] sysfs_kf_write+0x64/0x78 [c0017075fcc0] [c06b3300] kernfs_fop_write_iter+0x1b8/0x2dc [c0017075fd10] [c05b3eb0] vfs_write+0x224/0x4e8 [c0017075fdc0] [c05b44b0] ksys_write+0x88/0x150 [c0017075fe10] [c0030864] system_call_exception+0x124/0x320 [c0017075fe50] [c000cedc] system_call_vectored_common+0x15c/0x2ec --- interrupt: 3000 at 0x7fffb9748774 Root-Cause: Its safe to call pci_disable_msi() if its associated data structre are not freed (during bailout path). But when the driver code disables the bridge during hot-unplug operation, its msi data structure becomes NULL (php_slot->pdev->dev.msi.data:). This happens before unregistration and during disable path in function msi_device_data_release(). In this case, its not safe to explicitly call pci_disable_msi/msix() due to NULL pointer dereference. But since the current code does so, the crash is happening at the line mutex_lock(&dev->msi.data->mutex). FIX: In the current code, there are two paths to invoke pci_disable_msi/msix(). In the error/bailout path, first argument of the check - if(disable_device || irq > 0), i.e. disable_device is true, so it will always invoke pci_disable_msi/msix(), it will never depend on second argument. In this path it's fine to call pci_disable_msi/msix(). During the slot releasing/disable/hot-unpug path the disable_device is false, irq is having old value which is making the overall check true and causing the crash. Of course, we should not choose the old/stale value of irq but should choose php_slot->irq for check. Also, since php_slot->irq value is always 0 before the check, so it does not matter if it will not be included into the check. So, the check can be formed with only one argument i.e. disable_device. Based on its value pci_disable_msi/msix() will be invoked and this is the fix for the crash. During the bailout path its value will be true and during the hot-unplug operation on the bridge slot, its value will be false. drivers/pci/hotplug/pnv_php.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c index 694349be9d0a..573a41869c15 100644 --- a/drivers/pci/hotplug/pnv_php.c +++ b/drivers/pci/hotplug/pnv_php.c @@ -40,7 +40,6 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot, bool disable_device) { struct pci_dev *pdev = php_slot->pdev; - int irq = php_slot->irq; u16 ctrl; if (php_slot->irq > 0) { @@ -59,7 +58,7 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot, php_slot->wq = NULL; } - if (disable_device || irq > 0) { + if (disable_device) { if (pdev->msix_enabled) pci_disable_msix(pdev); else if (pdev->msi_enabled) -- 2.45.0
[PATCH v2 0/2] PCI hotplug driver fixes
The fix of Powerpc hotplug driver (drivers/pci/hotplug/pnv_php.c) addresses below two issues. 1. Kernel Crash during hot unplug of bridge/switch slot. 2. DPC-Support Enablement - Previously, when we do a hot-unplug operation on a bridge slot, all the ports and devices behind the bridge-ports would be hot-unplugged/offline, but when we do a hot-plug operation on the same bridge slot, all the ports and devices behind the bridge would not get hot-plugged/online. In this case, Only the first port of the bridge gets enabled and the remaining port/devices remain unplugged/offline. After the fix, The hot-unplug and hot-plug operations on the slot associated with the bridge started behaving correctly and became in sync. Now, after the hot plug operation on the same slot, all the bridge ports and devices behind the bridge become hot-plugged/online/restored in the same manner as it was before the hot-unplug operation. Krishna Kumar (2): pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv powerpc: hotplug driver bridge support arch/powerpc/include/asm/ppc-pci.h | 4 arch/powerpc/kernel/pci-hotplug.c | 5 ++--- arch/powerpc/kernel/pci_dn.c | 32 ++ drivers/pci/hotplug/pnv_php.c | 3 +-- 4 files changed, 39 insertions(+), 5 deletions(-) Changelog: == v2: 14 May 2024 - Used of_property_read_u32() in place of of_get_property() and of_read_number(). [patch2] - Removed some unnecessary variable and changed the function return type from void* to void. [patch2] - Removed the export declaration of pci_traverse_sibling_nodes_and_scan_slot() as its not needed. [patch2] -- 2.45.0
Re: [PATCH 2/2] arch/powerpc: hotplug driver bridge support
Thanks Sourabh for review. I have incorporated your comments and will be sending my patch soon. Best Regards, Krishna On 5/13/24 15:15, Sourabh Jain wrote: Hello Krishna, Is "arch" in commit title really required? On 09/05/24 17:35, Krishna Kumar wrote: There is an issue with the hotplug operation when it's done on the bridge/switch slot. The bridge-port and devices behind the bridge, which become offline by hot-unplug operation, don't get hot-plugged/enabled by doing hot-plug operation on that slot. Only the first port of the bridge gets enabled and the remaining port/devices remain unplugged. The hot plug/unplug operation is done by the hotplug driver (drivers/pci/hotplug/pnv_php.c). Root Cause Analysis: This behavior is due to missing code for the DPC switch/bridge. The existing driver depends on pci_hp_add_devices() function for device enablement. This function calls pci_scan_slot() on only one device-node/port of the bridge, not on all the siblings' device-node/port. The missing code needs to be added which will find all the sibling device-nodes/bridge-ports and will run explicit pci_scan_slot() on those. A new function has been added for this purpose which gets invoked from pci_hp_add_devices(). This new function pci_traverse_sibling_nodes_and_scan_slot() gets all the sibling bridge-ports by traversal and explicitly invokes pci_scan_slot on them. Signed-off-by: Krishna Kumar --- Command for reproducing the issue : For hot unplug/disable - echo 0 > /sys/bus/pci/slots/C5/power For hot plug/enable - echo 1 > /sys/bus/pci/slots/C5/power where C5 is slot associated with bridge. Scenario/Tests: Output of lspci -nn before test is given below. This snippet contains devices used for testing on Powernv machine. 0004:02:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:01.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:02.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:03.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:08:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 [1000:00c9] (rev 01) 0004:09:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 [1000:00c9] (rev 01) Output of lspci -tv before test is as follows: # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- | | +-01.0-[08]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | +-02.0-[09]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | \-03.0-[0a-0e]-- | \-00.1 PMC-Sierra Inc. Device 4052 C5(bridge) and C6(End Point) slot address are as below: # cat /sys/bus/pci/slots/C5/address 0004:02:00 # cat /sys/bus/pci/slots/C6/address 0004:09:00 Hot-unplug operation on slot associated with bridge: # echo 0 > /sys/bus/pci/slots/C5/power # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]-- | \-00.1 PMC-Sierra Inc. Device 4052 From the above lspci -tv output, it can be observed that hot unplug operation has removed all the PMC-Sierra bridge ports like: 00.0-[03-07], 01.0-[08], 02.0-[09], 03.0-[0a-0e] and the SAS devices behind the bridge-port. Without the fix, when the hot plug operation is done on the same slot, it adds only the first bridge port and doesn't restore all the bridge-ports and devices that it unplugged earlier. Below snippet shows this. Hot-plug operation on the bridge slot without the fix: # echo 1 > /sys/bus/pci/slots/C5/power # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- | \-00.1 PMC-Sierra Inc. Device 4052 After the fix, it restores all the devices in the same manner how it unplugged them earlier during the hot unplug operation. The below snippet shows the same. Hot-plug operation on bridge slot with the fix: # echo 1 > /sys/bus/pci/slots/C5/power # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- | | +-01.0-[08]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | +-02.0-[09]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | \-03.0-[0a-0e]-- | \-00.1 PMC-Sierra Inc. Device 4052 Removal of End point device behind bridge are also intact and behaving correctly. Hot-unplug operation on Endpoint device C6: # echo 0 > /sys/bus/pci/slots/C6/power # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- | | +-01.0-[08]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | +-02.0-[09]-- | | \-03.0-[0a-0e]-- |
Re: [PATCH 2/2] arch/powerpc: hotplug driver bridge support
Thanks Andy for review. I have incorporated your comments and will be sending the patch soon. On 5/11/24 11:12, Andy Shevchenko wrote: Thu, May 09, 2024 at 05:35:54PM +0530, Krishna Kumar kirjoitti: There is an issue with the hotplug operation when it's done on the bridge/switch slot. The bridge-port and devices behind the bridge, which become offline by hot-unplug operation, don't get hot-plugged/enabled by doing hot-plug operation on that slot. Only the first port of the bridge gets enabled and the remaining port/devices remain unplugged. The hot plug/unplug operation is done by the hotplug driver (drivers/pci/hotplug/pnv_php.c). Root Cause Analysis: This behavior is due to missing code for the DPC switch/bridge. The existing driver depends on pci_hp_add_devices() function for device enablement. This function calls pci_scan_slot() on only one device-node/port of the bridge, not on all the siblings' device-node/port. The missing code needs to be added which will find all the sibling device-nodes/bridge-ports and will run explicit pci_scan_slot() on those. A new function has been added for this purpose which gets invoked from pci_hp_add_devices(). This new function pci_traverse_sibling_nodes_and_scan_slot() gets all the sibling bridge-ports by traversal and explicitly invokes pci_scan_slot on them. One blank line is enough here. I have incorporated this. Signed-off-by: Krishna Kumar ... +void *pci_traverse_sibling_nodes_and_scan_slot(struct device_node *start, struct pci_bus *bus) +{ + struct device_node *dn; + struct device_node *parent; + int slotno; + + const __be32 *classp1; + u32 class1 = 0; + classp1 = of_get_property(start->child, "class-code", NULL); + if (classp1) + class1 = of_read_number(classp1, 1); What's wrong with of_property_read_u32()? I have incorporated this. + /* Call of pci_scan_slot for non-bridge/EP case */ + if (!((class1 >> 8) == PCI_CLASS_BRIDGE_PCI)) { + slotno = PCI_SLOT(PCI_DN(start->child)->devfn); + pci_scan_slot(bus, PCI_DEVFN(slotno, 0)); + return NULL; + } + + /* Iterate all siblings */ + parent = start; + for_each_child_of_node(parent, dn) { + const __be32 *classp; + u32 class = 0; + + classp = of_get_property(dn, "class-code", NULL); + if (classp) + class = of_read_number(classp, 1); Ditto. + /* Call of pci_scan_slot on each sibling-nodes/bridge-ports */ + if ((class >> 8) == PCI_CLASS_BRIDGE_PCI) { + slotno = PCI_SLOT(PCI_DN(dn)->devfn); + pci_scan_slot(bus, PCI_DEVFN(slotno, 0)); + } + } + + return NULL; +} Best Regards, Krishna
[PATCH 2/2] arch/powerpc: hotplug driver bridge support
There is an issue with the hotplug operation when it's done on the bridge/switch slot. The bridge-port and devices behind the bridge, which become offline by hot-unplug operation, don't get hot-plugged/enabled by doing hot-plug operation on that slot. Only the first port of the bridge gets enabled and the remaining port/devices remain unplugged. The hot plug/unplug operation is done by the hotplug driver (drivers/pci/hotplug/pnv_php.c). Root Cause Analysis: This behavior is due to missing code for the DPC switch/bridge. The existing driver depends on pci_hp_add_devices() function for device enablement. This function calls pci_scan_slot() on only one device-node/port of the bridge, not on all the siblings' device-node/port. The missing code needs to be added which will find all the sibling device-nodes/bridge-ports and will run explicit pci_scan_slot() on those. A new function has been added for this purpose which gets invoked from pci_hp_add_devices(). This new function pci_traverse_sibling_nodes_and_scan_slot() gets all the sibling bridge-ports by traversal and explicitly invokes pci_scan_slot on them. Signed-off-by: Krishna Kumar --- Command for reproducing the issue : For hot unplug/disable - echo 0 > /sys/bus/pci/slots/C5/power For hot plug/enable -echo 1 > /sys/bus/pci/slots/C5/power where C5 is slot associated with bridge. Scenario/Tests: Output of lspci -nn before test is given below. This snippet contains devices used for testing on Powernv machine. 0004:02:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:01.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:02.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:02:03.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052] 0004:08:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 [1000:00c9] (rev 01) 0004:09:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 [1000:00c9] (rev 01) Output of lspci -tv before test is as follows: # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- | | +-01.0-[08]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | +-02.0-[09]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | \-03.0-[0a-0e]-- | \-00.1 PMC-Sierra Inc. Device 4052 C5(bridge) and C6(End Point) slot address are as below: # cat /sys/bus/pci/slots/C5/address 0004:02:00 # cat /sys/bus/pci/slots/C6/address 0004:09:00 Hot-unplug operation on slot associated with bridge: # echo 0 > /sys/bus/pci/slots/C5/power # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]-- | \-00.1 PMC-Sierra Inc. Device 4052 >From the above lspci -tv output, it can be observed that hot unplug operation has removed all the PMC-Sierra bridge ports like: 00.0-[03-07], 01.0-[08], 02.0-[09], 03.0-[0a-0e] and the SAS devices behind the bridge-port. Without the fix, when the hot plug operation is done on the same slot, it adds only the first bridge port and doesn't restore all the bridge-ports and devices that it unplugged earlier. Below snippet shows this. Hot-plug operation on the bridge slot without the fix: # echo 1 > /sys/bus/pci/slots/C5/power # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- | \-00.1 PMC-Sierra Inc. Device 4052 After the fix, it restores all the devices in the same manner how it unplugged them earlier during the hot unplug operation. The below snippet shows the same. Hot-plug operation on bridge slot with the fix: # echo 1 > /sys/bus/pci/slots/C5/power # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- | | +-01.0-[08]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | +-02.0-[09]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | \-03.0-[0a-0e]-- | \-00.1 PMC-Sierra Inc. Device 4052 Removal of End point device behind bridge are also intact and behaving correctly. Hot-unplug operation on Endpoint device C6: # echo 0 > /sys/bus/pci/slots/C6/power # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- | | +-01.0-[08]00.0 Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 | | +-02.0-[09]-- | | \-03.0-[0a-0e]-- | \-00.1 PMC-Sierra Inc. Device 4052 Hot-plug operation on Endpoint device C6: # echo 1 > /sys/bus/pci/slots/C6/power # lspci -tv +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]-- |
[PATCH 1/2] pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv
Description of the problem: The hotplug driver for powerpc (pci/hotplug/pnv_php.c) gives kernel crash when we try to hot-unplug/disable the PCIe switch/bridge from the PHB. Root Cause of Crash: The crash is due to the reason that, though the msi data structure has been released during disable/hot-unplug path and it has been assigned with NULL, still during unregistartion the code was again trying to explicitly disable the msi which causes the Null pointer dereference and kernel crash. Proposed Fix : The fix is to correct the check during unregistration path so that the code should not try to invoke pci_disable_msi/msix() if its data structure is already freed. Signed-off-by: Krishna Kumar --- Command used for reproducing the bug: echo 0 > /sys/bus/pci/slots/C5/power where C5 is slot associated with bridge. Snippet of Crash: Kernel attempted to read user page (10) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x0010 Faulting instruction address: 0xc0fad7d4 Oops: Kernel access of bad area, sig: 11 [#1] Hardware name: 5105-22E POWER9 0x4e1203 opal:v7.0-39-g4660e63a PowerNV NIP [c0fad7d4] mutex_lock+0x34/0x88 LR [c0fad7c8] mutex_lock+0x28/0x88 Call Trace: [c0017075f940] [c0fad7c8] mutex_lock+0x28/0x88 (unreliable) [c0017075f970] [c0214464] msi_lock_descs+0x28/0x3c [c0017075f990] [c08e8be8] pci_disable_msi+0x68/0xa8 [c0017075f9c0] [c090f0a4] pnv_php_disable_irq+0x2a0/0x2b0 [c0017075fab0] [c090f128] pnv_php_free_slot+0x74/0xc4 [c0017075fb30] [c0912184] pnv_php_disable_slot.part.0+0x1b8/0x24c [c0017075fc00] [c0902df0] power_write_file+0xf8/0x18c [c0017075fc80] [c08f84d8] pci_slot_attr_store+0x40/0x5c [c0017075fca0] [c06b4834] sysfs_kf_write+0x64/0x78 [c0017075fcc0] [c06b3300] kernfs_fop_write_iter+0x1b8/0x2dc [c0017075fd10] [c05b3eb0] vfs_write+0x224/0x4e8 [c0017075fdc0] [c05b44b0] ksys_write+0x88/0x150 [c0017075fe10] [c0030864] system_call_exception+0x124/0x320 [c0017075fe50] [c000cedc] system_call_vectored_common+0x15c/0x2ec --- interrupt: 3000 at 0x7fffb9748774 Root-Cause: Its safe to call pci_disable_msi() if its associated data structre are not freed (during bailout path). But when the driver code disables the bridge during hot-unplug operation, its msi data structure becomes NULL (php_slot->pdev->dev.msi.data:). This happens before unregistration and during disable path in function msi_device_data_release(). In this case, its not safe to explicitly call pci_disable_msi/msix() due to NULL pointer dereference. But since the current code does so, the crash is happening at the line mutex_lock(&dev->msi.data->mutex). FIX: In the current code, there are two paths to invoke pci_disable_msi/msix(). In the error/bailout path, first argument of the check - if(disable_device || irq > 0), i.e. disable_device is true, so it will always invoke pci_disable_msi/msix(), it will never depend on second argument. In this path it's fine to call pci_disable_msi/msix(). During the slot releasing/disable/hot-unpug path the disable_device is false, irq is having old value which is making the overall check true and causing the crash. Of course, we should not choose the old/stale value of irq but should choose php_slot->irq for check. Also, since php_slot->irq value is always 0 before the check, so it does not matter if it will not be included into the check. So, the check can be formed with only one argument i.e. disable_device. Based on its value pci_disable_msi/msix() will be invoked and this is the fix for the crash. During the bailout path its value will be true and during the hot-unplug operation on the bridge slot, its value will be false. drivers/pci/hotplug/pnv_php.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c index 694349be9d0a..573a41869c15 100644 --- a/drivers/pci/hotplug/pnv_php.c +++ b/drivers/pci/hotplug/pnv_php.c @@ -40,7 +40,6 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot, bool disable_device) { struct pci_dev *pdev = php_slot->pdev; - int irq = php_slot->irq; u16 ctrl; if (php_slot->irq > 0) { @@ -59,7 +58,7 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot, php_slot->wq = NULL; } - if (disable_device || irq > 0) { + if (disable_device) { if (pdev->msix_enabled) pci_disable_msix(pdev); else if (pdev->msi_enabled) -- 2.44.0
[PATCH 0/2] PCI hotplug driver fixes
The fix of Powerpc hotplug driver (drivers/pci/hotplug/pnv_php.c) addresses below two issues. 1. Kernel Crash during hot unplug of bridge/switch slot. 2. DPC-Support Enablement - Previously, when we do a hot-unplug operation on a bridge slot, all the ports and devices behind the bridge-ports would be hot-unplugged/offline, but when we do a hot-plug operation on the same bridge slot, all the ports and devices behind the bridge would not get hot-plugged/online. In this case, Only the first port of the bridge gets enabled and the remaining port/devices remain unplugged/offline. After the fix, The hot-unplug and hot-plug operations on the slot associated with the bridge started behaving correctly and became in sync. Now, after the hot plug operation on the same slot, all the bridge ports and devices behind the bridge become hot-plugged/online/restored in the same manner as it was before the hot-unplug operation. Krishna Kumar (2): pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv arch/powerpc: hotplug driver bridge support arch/powerpc/include/asm/ppc-pci.h | 4 +++ arch/powerpc/kernel/pci-hotplug.c | 5 ++-- arch/powerpc/kernel/pci_dn.c | 42 ++ drivers/pci/hotplug/pnv_php.c | 3 +-- 4 files changed, 49 insertions(+), 5 deletions(-) -- 2.44.0