I agree with Oliver on this.  Although the existing API provides enough
mechanism for drivers to make sure that all their urbs are unlinked and
all completion handlers are through, it's not easy to do correctly.  Since
practically every driver has to deal with this problem during unloading,
the functionality deserves to be centralized in the core.  Furthermore,
doing so turns out to be relatively easy and can even provide better error
checking.

Oliver, it turns out the advice I gave you yesterday about putting in the
spinlocking stuff in unlink_complete() was wrong.  That's probably what
was causing your kernel to hang.  I should know better than to make a
hasty reply on a complicated topic.

All right.  I spent a long time thinking about the issues last night, and
now I'm confident that I have a correct solution.  First, a few comments.

For ease of discussion, let's say that an urb is "finished" when the 
low-level HCD code is done with it.  This happens just before the 
completion handler is called.

The intended semantics for usb_unlink_urb() should be something like this:  
An ongoing urb is stopped reasonably quickly, and the completion handler
is not allowed to resubmit it.  If the urb has already finished and the
completion handler is running, but the handler has not yet resubmitted, a 
resubmission will fail.  If the handler has already resubmitted, then the 
unlink applies to the new incarnation of the urb.  In any case, if the 
unlink is synchronous, it will not return until after the corresponding 
completion handler has finished running.

It seems to me that at the point where the completion handler returns,
it's necessary to distinguish between urbs that have and have not been
resubmitted by the handler.  There are at least two reasons for this.  
First, state changes (i.e., going to IDLE) should not happen for
resubmitted urbs.  Second (and more subtle), there are differences between
resubmitting an urb from the handler as opposed to submitting it again
after the handler is done.  For example, any bandwidth reserved for the
urb is kept when the handler resubmits, but it is deallocated otherwise.  
(I don't know where or even if this really happens -- logically the
decision can only be made directly after the handler returns.)  At any
rate, this means that we need to guarantee, following the call to
urb->complete(), that the urb is in different states depending on whether
or not it was resubmitted.

Normally, at this point a resubmitted urb will be SUBMITTED, which is
okay.  But it might finish or get cancelled, in which case the state would
be COMPLETING or CANCELED, which is not good.  We can take care of the
COMPLETING possibility by only making the SUBMITTED -> COMPLETING
transition while the handler_lock is held.  The CANCELED possibility is a
bit harder.  I considered the idea of making usb_unlink_urb() spin if it
is called for a resubmitted urb still in its completion handler, but that
turned out to be neither desirable nor feasible.  In the end, the best
answer was to add a new state.  Let's agree that when a CANCELED urb's
handler is called, the state changes to TERMINATING, just as a SUBMITTED
urb's state changes to COMPLETING.  (The name TERMINATING is meant to
suggest that an indefinite handler-automatically-resubmits loop will be
broken.)  That takes care of everything: when the completion handler
returns, if the urb was resubmitted then its state will be either
SUBMITTED or CANCELED, otherwise its state will be either COMPLETING or
TERMINATING.

If there turns out to be a driver that really needs to resubmit a 
cancelled urb from within its completion handler, it will be easy to write 
an API that changes the state from TERMINATING back to COMPLETING.

Since there are now two spinlocks involved here, we have to take care to 
avoid deadlock.  Completion handlers run with handler_lock held, and they 
have to be able to call usb_submit_urb() which acquires lock (I think -- 
correct me if this is wrong), so the rule must be:  Whenever lock and 
handler_lock are both acquired, you have to acquire handler_lock first.

The state diagram is now a little too complicated for my ASCII art skills.  
I'll just present it as a table.

        IDLE: the initial state.  It is a BUG to destroy an urb that is 
        not in the IDLE state (maybe urb->count already takes care of 
        that).
                usb_submit_urb() --> SUBMITTED

        SUBMITTED: under control of the low-level HCD.
                HCD finishes --> COMPLETING
                usb_unlink_urb() --> CANCELED

        CANCELED: still under control of the HCD.
                HCD finishes --> TERMINATING

        COMPLETING: a non-cancelled urb passed to its completion handler.
                usb_unlink_urb() --> TERMINATING
                usb_submit_urb() --> SUBMITTED
                handler returns --> IDLE

        TERMINATING: a cancelled urb passed to its completion handler.
                handler returns --> IDLE


Adjustments needed with respect to the last patch:

Add the URB_TERMINATING state to usb.h.

Remove the spinlock calls from unlink_complete().  Since unlink_complete() 
is itself a completion handler, the spinlock has already been acquired.  
Likewise, don't change the state to IDLE.

In hcd_unlink_urb(): the CANCELED and TERMINATING cases are both BUGS (or
maybe they are both just oopses) -- someone has tried to unlink the same
urb twice.  In the COMPLETING case, the state should change to
TERMINATING.

In usb_hcd_giveback_urb(): move spin_lock(&urb->handler_lock) to the
start, immediately before spin_lock(&urb->lock).  If the state was
CANCELED, set it to TERMINATING.  At the end, release handler_lock only
after updating the state.  Set the state to IDLE if it was either
COMPLETING or TERMINATING.

In usb_submit_urb(): if the state was either SUBMITTED or CANCELED then 
it's a BUG or an oops.  If the state was TERMINATING, return -EINTR.

Update the appropriate kerneldoc comments and the error code listing in 
Documentation/usb/error-codes.txt.

Alan Stern



-------------------------------------------------------
This SF.NET email is sponsored by: Take your first step towards giving 
your online business a competitive advantage. Test-drive a Thawte SSL 
certificate - our easy online guide will show you how. Click here to get 
started: http://ads.sourceforge.net/cgi-bin/redirect.pl?thaw0027en
_______________________________________________
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel

Reply via email to