On 4 Apr 2004, James Bottomley wrote:
> OK, your "problem" definition is that "there's a race", which I agree
> with, I just don't agree that it's a problem.
>
> Disconnections are fundamentally asynchronous events (a device may be
> disconnected by the user at any stage regardless of what any kernel
> internal state model is doing). Trying to impose synchronisation on
> asynchronous events is asking for trouble.
>
> In the open race scenario, either the open is refused or the user gets a
> fd that cannot do anything (because the device isn't there) and simply
> returns errors to all operations. Both cases are correct, so who wins
> the race is irrelevant.

Ah, you have left out the third, bad alternative: open succeeds, user gets
an fd that points to a deallocated device. More details below...

> Let me illustrate: the user may disconnect the device then open it. If
> they open it before even the USB subsystem gets notified of the
> disconnection then all the elaborate synchronisation in the world isn't
> going to be able to prevent that (the device was gone when they opened
> it, just nothing in the kernel knew that). Since we cannot solve that
> race, there's no reason to try to solve the "some parts of the kernel
> know but others don't" part of the race.

I agree with everything except your conclusion. :-) Just because some
outcomes of the race lead to a benign result no matter what, that doesn't
mean all outcomes will or that we can ignore the race.

Let's consider a simple example, one that doesn't have all the
complexities of the sr driver with its multiple driver layers. The
usb-skeleton program in drivers/usb makes a good illustration; I added a
semaphore to it some time ago to protect against exactly this sort of
race. Without that semaphore, here's what can happen:

Open process:                   Disconnect process:

Get minor number from inode
Lookup USB interface using
  minor number
Get device pointer from the
  interface's private data
  and check it's not NULL
                                Get device pointer from the
                                  interface's private data
                                Set the private data to NULL
                                Lock the device sem
                                Unregister the minor number
                                Terminate ongoing I/O operations
                                Clear the device->present flag
                                Unlock the device
                                Since the open count is 0,
                                  deallocate the device structure
Lock the device sem --> oops

The idea is that at some stage the open process has got far enough along
to believe the device exists, but not far enough to hold a reference to
it. (That's inevitable, since you can't try to acquire a reference until
you're sure the device does exist.) If the disconnect process deallocates
the device at that time, there will be trouble.
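
To make the fix concrete, here is a minimal userspace sketch of the idea,
not the actual usb-skeleton code. Every name in it (demo_dev, table_lock,
demo_open and so on) is made up for illustration, and pthread mutexes stand
in for kernel semaphores. The point it tries to show is that the lookup of
the private data and the increment of the open count sit inside one critical
section, and disconnect takes the same lock before it clears the pointer and
decides whether to free.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a driver's per-device structure. */
struct demo_dev {
    pthread_mutex_t sem;    /* plays the role of the "device sem" */
    int open_count;         /* number of open handles */
    int present;            /* cleared by disconnect */
};

/* Plays the role of the interface's private data pointer. */
static struct demo_dev *intf_private;

/* Guards intf_private AND the act of taking a reference through it. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

static void demo_free(struct demo_dev *dev)
{
    pthread_mutex_destroy(&dev->sem);
    free(dev);
}

struct demo_dev *demo_probe(void)
{
    struct demo_dev *dev = calloc(1, sizeof(*dev));

    if (!dev)
        return NULL;
    pthread_mutex_init(&dev->sem, NULL);
    dev->present = 1;
    pthread_mutex_lock(&table_lock);
    intf_private = dev;
    pthread_mutex_unlock(&table_lock);
    return dev;
}

/* Open: the lookup and the reference grab cannot be separated. */
struct demo_dev *demo_open(void)
{
    struct demo_dev *dev;

    pthread_mutex_lock(&table_lock);
    dev = intf_private;
    if (dev) {
        pthread_mutex_lock(&dev->sem);
        dev->open_count++;          /* reference held from here on */
        pthread_mutex_unlock(&dev->sem);
    }
    pthread_mutex_unlock(&table_lock);
    return dev;                     /* NULL means "device is gone" */
}

/* Release: returns 1 if this call freed the structure. */
int demo_release(struct demo_dev *dev)
{
    int last;

    pthread_mutex_lock(&dev->sem);
    assert(dev->open_count > 0);
    dev->open_count--;
    last = (dev->open_count == 0 && !dev->present);
    pthread_mutex_unlock(&dev->sem);
    if (last)
        demo_free(dev);
    return last;
}

/* Disconnect: returns 1 if the structure was freed immediately. */
int demo_disconnect(void)
{
    struct demo_dev *dev;
    int unused;

    pthread_mutex_lock(&table_lock);
    dev = intf_private;
    intf_private = NULL;            /* no new opens can find it */
    if (!dev) {
        pthread_mutex_unlock(&table_lock);
        return 0;
    }
    pthread_mutex_lock(&dev->sem);
    dev->present = 0;
    unused = (dev->open_count == 0);
    pthread_mutex_unlock(&dev->sem);
    pthread_mutex_unlock(&table_lock);
    if (unused)
        demo_free(dev);             /* safe: nobody holds a reference */
    return unused;
}
```

With this arrangement an open either completes before disconnect clears the
pointer (so the open count is nonzero and the free is deferred to the last
release) or runs after it (so the lookup returns NULL). The window in the
timeline above, where open holds a pointer but no reference, is closed.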

Alan Stern