On Sun, Jun 29, 2025 at 03:36:27PM +0200, Lukas Wunner wrote: > On Sat, Jun 28, 2025 at 02:58:49PM -0400, Michael S. Tsirkin wrote: > > At the moment, in case of a surprise removal, the regular > > remove callback is invoked, exclusively. > > This works well, because mostly, the cleanup would be the same. > > > > However, there's a race: imagine device removal was initiated by a user > > action, such as driver unbind, and it in turn initiated some cleanup and > > is now waiting for an interrupt from the device. If the device is now > > surprise-removed, that never arrives and the remove callback hangs > > forever. > > > > Drivers can artificially add timeouts to handle that, but it can be > > flaky. > > > > Instead, let's add a way for the driver to be notified about the > > disconnect. It can then do any necessary cleanup, knowing that the > > device is inactive. > [...] > > --- a/drivers/pci/pci.h > > +++ b/drivers/pci/pci.h > > @@ -549,6 +549,15 @@ static inline int pci_dev_set_disconnected(struct > > pci_dev *dev, void *unused) > > pci_dev_set_io_state(dev, pci_channel_io_perm_failure); > > pci_doe_disconnected(dev); > > > > + /* Notify driver of surprise removal */ > > + device_lock(&dev->dev); > > + > > + if (dev->driver && dev->driver->err_handler && > > + dev->driver->err_handler->disconnect) > > + dev->driver->err_handler->disconnect(dev); > > + > > + device_unlock(&dev->dev); > > + > > return 0; > > }
thanks for the feedback. Would appreciate a couple more hints: > No, that's not good: > > 1/ The device_lock() will reintroduce the issues solved by 74ff8864cc84. I see. What other way is there to prevent dev->driver from going away, though? I guess I can add a new spinlock and take it both here and when dev->driver changes? Acceptable? > 2/ pci_dev_set_disconnected() needs to be fast so that devices are marked > unplugged as quickly as possible. We want to minimize the time window > where MMIO and Config Space reads already return "all ones" and writes > go to nirvana, but pci_dev_is_disconnected() still returns false. > Hence invoking some driver callback which may take arbitrarily long or > even sleeps is not an option. Well, there's no plan to do that there - just to wake up some wq so things can be completed. I can add code comments. > The driver is already notified of removal through invocation of the > ->remove() callback. The use case you're describing is arguably > a corner case. I do think that a timeout is a better approach > than the one proposed here. How long does it take for the interrupt > to arrive? It's a virtual device - kind of unpredictable. > If it's not just a few msec, consider polling the device > and breaking out of the pool loop as soon as pci_dev_is_disconnected() > returns true (or the MMIO read returns PCI_POSSIBLE_ERROR()). Yes but with no callback, we don't know when to do it. The config reads in pci_dev_is_disconnected are also expensive on VMs... > If/when respinning, please explain the use case in more detail, > i.e. which driver, which device, pointers to code... > > Thanks! > > Lukas It's virtio-blk.