Re: [PATCH RFC] pci: report surprise removal events

Michael S. Tsirkin Mon, 30 Jun 2025 00:17:38 -0700

On Sun, Jun 29, 2025 at 05:39:58PM -0600, Keith Busch wrote:
> On Sun, Jun 29, 2025 at 01:28:08PM -0400, Michael S. Tsirkin wrote:
> > On Sun, Jun 29, 2025 at 03:36:27PM +0200, Lukas Wunner wrote:
> > > On Sat, Jun 28, 2025 at 02:58:49PM -0400, Michael S. Tsirkin wrote:
> > > 
> > > 1/ The device_lock() will reintroduce the issues solved by 74ff8864cc84.
> > 
> > I see. What other way is there to prevent dev->driver from going away,
> > though? I guess I can add a new spinlock and take it both here and when
> > dev->driver changes? Acceptable?
> 
> You're already holding the pci_bus_sem here, so the final device 'put'
> can't have been called yet, so the device is valid and thread safe in
> this context. I think maintaining the desired lifetime of the
> instantiated driver is just a matter of reference counting within your
> driver.
> 
> Just a thought on your patch, instead of introducing a new callback, you
> could call the existing '->error_detected()' callback with the
> previously set 'pci_channel_io_perm_failure' status. That would totally
> work for nvme to kick its cleanup much quicker than the blk_mq timeout
> handling we currently rely on for this scenario.


That's even easier, sure. However, Lukas raised the issue that
pci_dev_set_disconnected() must be fast, and drivers might do silly things
in their callbacks. So I was working on adding the ability to schedule work
on such an event, to prevent such misuse.

At the same time, it's somewhat hard to abstract it all away in
a driver-independent manner; a callback is certainly easier.
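
To make the trade-off concrete, here is a rough userspace model of the
"schedule work" variant. It is only a sketch: every model_* name is made up
for illustration and none of this is kernel API or taken from the patch. The
assumption is that the fast path only flags the device and marks a work item
pending, while the slow driver cleanup runs later from a worker.

```c
/* Userspace model only: all model_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_work {
	void (*fn)(struct model_work *w);
	bool pending;
};

struct model_dev {
	bool disconnected;		/* stands in for the perm-failure state */
	struct model_work removal_work;
	int cleanup_runs;		/* how often the slow cleanup ran */
};

/* Slow driver-side cleanup; runs later, outside the fast path. */
static void model_driver_cleanup(struct model_work *w)
{
	/* Recover the containing device from the embedded work item. */
	struct model_dev *dev = (struct model_dev *)
		((char *)w - offsetof(struct model_dev, removal_work));

	dev->cleanup_runs++;
}

static void model_dev_init(struct model_dev *dev)
{
	dev->disconnected = false;
	dev->removal_work.fn = model_driver_cleanup;
	dev->removal_work.pending = false;
	dev->cleanup_runs = 0;
}

/* Fast path: flag the device, mark the work pending, and return. */
static void model_set_disconnected(struct model_dev *dev)
{
	if (dev->disconnected)
		return;			/* already reported once */
	dev->disconnected = true;
	dev->removal_work.pending = true;
}

/* Stand-in for a worker thread picking up the queued item. */
static void model_run_pending(struct model_dev *dev)
{
	struct model_work *w = &dev->removal_work;

	if (!w->pending)
		return;
	assert(w->fn != NULL);
	w->pending = false;
	w->fn(w);
}
```

In this model no driver code runs inside model_set_disconnected(), so a slow
or misbehaving cleanup cannot stall the removal path. The cost is the extra
per-device state, which is the part that is hard to generalize across drivers.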

WDYT?

-- 
MST
