> From: Doug Ledford [mailto:dledf...@redhat.com]
> > I suppose that the main issue would be handling existing user memory
> > mappings, which cannot be just invalidated -- the user-space driver may
> > not be aware of the device removal and may access this memory
> > concurrently, and we don't want it to crash.
> 
> In this case, you are mapping it out of the device BAR space and into a
> random kernel page, yes?  So, if the driver doesn't catch the
> DRIVER_FATAL event and process that to mean "don't bother touching this
> RDMA device any more", it's going to write to a mailbox that no longer
> responds and have infinite timeouts, yes?  Essentially meaning all
> mailbox type operations just go into lala land from here on out, right?
> 

Pressed 'send' too early...

Kernel activity is asynchronous to user-space.
The device may be unplugged before the user-space driver has a chance to learn
that a DEVICE_FATAL event has occurred. In fact, in the current user-space
stack design, device drivers have no context of their own in which to read()
from file descriptors; they rely on the application for that.
But even if they did, you probably wouldn't want a driver to invoke a system
call in the fast path just to check this condition.

For devices that only write to the BAR space, an arbitrary kernel page would do.
Other devices might wish to populate this page first, so that the user-space
driver can detect the removal efficiently.
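
To make the second case concrete, here is a rough user-space sketch of the
idea, not an actual driver. Everything in it is an assumption made for
illustration: the sentinel value DEV_GONE_MAGIC, the helper names, and the
use of an anonymous mmap() page as a stand-in for the BAR mapping.
simulate_removal() only mimics what the kernel would do on hot-unplug, which
is to swap the mapping at the same user address for an ordinary page that has
been populated with the sentinel.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical sentinel: assumed to be a value that the live device
 * never exposes at offset 0 of the mapped page. */
#define DEV_GONE_MAGIC 0xDEADDEADu

/* Stand-in for the BAR mapping: one zero-filled anonymous page. */
static volatile uint32_t *map_fake_bar(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (volatile uint32_t *)p;
}

/* Mimics the kernel side of removal: replace the page at the same
 * address (MAP_FIXED) and populate it with the sentinel. */
static int simulate_removal(volatile uint32_t *bar)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap((void *)bar, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    for (size_t i = 0; i < len / sizeof(uint32_t); i++)
        bar[i] = DEV_GONE_MAGIC;
    return 0;
}

/* Fast-path check: a single load from mapped memory, no system call. */
static int device_gone(const volatile uint32_t *bar)
{
    return bar[0] == DEV_GONE_MAGIC;
}

/* Hypothetical doorbell write at word offset 1. If removal races with
 * the check, the store lands harmlessly in the replacement page. */
static int ring_doorbell(volatile uint32_t *bar, uint32_t val)
{
    if (device_gone(bar))
        return -1;
    bar[1] = val;
    return 0;
}
```

With this, ring_doorbell() succeeds on the page returned by map_fake_bar()
and starts returning -1 once simulate_removal() has run, without the driver
ever entering the kernel. A device that only writes to its BAR would not
need the check at all, since the stray stores are simply absorbed.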

--Liran
