A SURPRISE removal of a hotplug PCIe device, caused by a Link Down
event, will execute an orderly removal of the driver, which normally
includes releasing the IRQs with pci_free_irq(_vectors)():

 * SURPRISE removal event causes Link Down
 * pciehp_disable_slot()
 * pci_device_remove()
 * driver->remove()
 * pci_free_irq(_vectors)()
 * irq_chip->irq_mask()
 * pci_msi_mask_irq()

Eventually, msi_set_mask_bit() will attempt to do MMIO over the dead
link, usually resulting in an Unsupported Request error. This can
confuse the firmware on FFS machines, and lead to a system crash.

Since the channel will have been marked "pci_channel_io_perm_failure"
by the hotplug thread, we know we should avoid sending blind IO to a
dead link. When the device is disconnected, bail out of MSI teardown.
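
The guard can be illustrated with a small userspace model. This is only
a sketch, not kernel code: every model_* name below is a hypothetical
stand-in, and it assumes that pci_dev_is_disconnected() amounts to
comparing the device's error state against pci_channel_io_perm_failure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's channel states. */
enum model_channel_state {
	MODEL_IO_NORMAL,
	MODEL_IO_FROZEN,
	MODEL_IO_PERM_FAILURE,	/* set by the hotplug thread on removal */
};

/* Hypothetical stand-in for the relevant bits of a PCI device. */
struct model_dev {
	enum model_channel_state error_state;
	uint32_t mask_reg;		/* models the MSI mask register */
	unsigned int mmio_accesses;	/* accesses that would cross the link */
};

/* Assumed equivalent of pci_dev_is_disconnected(). */
bool model_dev_is_disconnected(const struct model_dev *dev)
{
	return dev->error_state == MODEL_IO_PERM_FAILURE;
}

/* Models the patched mask path: skip all device access on a dead link. */
void model_set_mask_bit(struct model_dev *dev, uint32_t flag)
{
	if (model_dev_is_disconnected(dev))
		return;

	dev->mask_reg = flag;
	dev->mmio_accesses++;
}
```

In this model, masking a device in the normal state updates mask_reg and
counts one access, while the same call after the state has been set to
MODEL_IO_PERM_FAILURE leaves both fields untouched.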

If device removal and Link Down are independent events, there exists a
race condition when the Link Down event occurs right after the
pci_dev_is_disconnected() check. This is outside the scope of this patch.

Signed-off-by: Alexandru Gagniuc <mr.nuke...@gmail.com>
---
Changes since v2:
 * Updated commit message

 drivers/pci/msi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 4c0b47867258..6b6541ab264f 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -227,6 +227,9 @@ static void msi_set_mask_bit(struct irq_data *data, u32 flag)
 {
        struct msi_desc *desc = irq_data_get_msi_desc(data);
 
+       if (pci_dev_is_disconnected(msi_desc_to_pci_dev(desc)))
+               return;
+
        if (desc->msi_attrib.is_msix) {
                msix_mask_irq(desc, flag);
                readl(desc->mask_base);         /* Flush write to device */
-- 
2.19.2
