On Sat, Jun 28, 2025 at 02:58:49PM -0400, Michael S. Tsirkin wrote: > At the moment, in case of a surprise removal, the regular > remove callback is invoked, exclusively. > This works well, because mostly, the cleanup would be the same. > > However, there's a race: imagine device removal was initiated by a user > action, such as driver unbind, and it in turn initiated some cleanup and > is now waiting for an interrupt from the device. If the device is now > surprise-removed, that never arrives and the remove callback hangs > forever. > > Drivers can artificially add timeouts to handle that, but it can be > flaky. > > Instead, let's add a way for the driver to be notified about the > disconnect. It can then do any necessary cleanup, knowing that the > device is inactive. [...] > --- a/drivers/pci/pci.h > +++ b/drivers/pci/pci.h > @@ -549,6 +549,15 @@ static inline int pci_dev_set_disconnected(struct > pci_dev *dev, void *unused) > pci_dev_set_io_state(dev, pci_channel_io_perm_failure); > pci_doe_disconnected(dev); > > + /* Notify driver of surprise removal */ > + device_lock(&dev->dev); > + > + if (dev->driver && dev->driver->err_handler && > + dev->driver->err_handler->disconnect) > + dev->driver->err_handler->disconnect(dev); > + > + device_unlock(&dev->dev); > + > return 0; > }
No, that's not good: 1/ The device_lock() will reintroduce the issues solved by 74ff8864cc84. 2/ pci_dev_set_disconnected() needs to be fast so that devices are marked unplugged as quickly as possible. We want to minimize the time window where MMIO and Config Space reads already return "all ones" and writes go to nirvana, but pci_dev_is_disconnected() still returns false. Hence invoking some driver callback which may take arbitrarily long or even sleeps is not an option. The driver is already notified of removal through invocation of the ->remove() callback. The use case you're describing is arguably a corner case. I do think that a timeout is a better approach than the one proposed here. How long does it take for the interrupt to arrive? If it's not just a few msec, consider polling the device and breaking out of the pool loop as soon as pci_dev_is_disconnected() returns true (or the MMIO read returns PCI_POSSIBLE_ERROR()). If/when respinning, please explain the use case in more detail, i.e. which driver, which device, pointers to code... Thanks! Lukas