Am Mittwoch, 15. Januar 2003 17:04 schrieb Alan Stern: > I agree with Oliver on this. Although the existing API provides enough > mechanism for drivers to make sure that all their urbs are unlinked and > all completion handlers are through, it's not easy to do correctly. Since > practically every driver has to deal with this problem during unloading, > the functionality deserves to be centralized in the core. Furthermore, > doing so turns out to be relatively easy and can even provide better error > checking. > > Oliver, it turns out the advice I gave you yesterday about putting in the > spinlocking stuff in unlink_complete() was wrong. That's probably what > was causing your kernel to hang. I should know better than to make a > hasty reply on a complicated topic.
No problem. Easy to rip out again. > All right. I spent a long time thinking about the issues last night, and > now I'm confident that I have a correct solution. First, a few comments. > > For ease of discussion, let's say that an urb is "finished" when the > low-level HCD code is done with it. This happens just before the > completion handler is called. > > The intended semantics for usb_unlink_urb() should be something like this: > An ongoing urb is stopped reasonably quickly, and the completion handler > is not allowed to resubmit it. If the urb has already finished and the > completion handler is running, but the handler has not yet resubmitted, a > resubmission will fail. If the handler has already resubmitted, then the > unlink applies to the new incarnation of the urb. In any case, if the > unlink is synchronous, it will not return until after the corresponding > completion handler has finished running. I am with you so far. > It seems to me that at the point where the completion handler returns, > it's necessary to distinguish between urbs that have and have not been > resubmitted by the handler. There are at least two reasons for this. > First, state changes (i.e., going to IDLE) should not happen for Ack. > resubmitted urbs. Second (and more subtle), there are differences between > resubmitting an urb from the handler as opposed to submitting it again > after the handler is done. For example, any bandwidth reserved for the > urb is kept when the handler resubmits, but it is deallocated otherwise. Eh, how so? usb_hcd_giveback_urb() does call unlink_urb() unconditionally. > (I don't know where or even if this really happens -- logically the > decision can only be made directly after the handler returns.) At any > rate, this means that we need to guarantee, following the call to > urb->complete(), that the urb is in different states depending on whether > or not it was resubmitted. Ack. > Normally, at this point a resubmitted urb will be SUBMITTED, which is > okay. But it might finish or get cancelled, in which case the state would > be COMPLETING or CANCELED, which is not good. We can take care of the Where's the problem with it being CANCELED? The URB cannot finish regularly because that would be a hardware IRQ reentering itself. The generic IRQ code should prevent that. So the only cause of a status change at that point can be usb_unlink_urb(). In that case going to CANCELED seem to be correct and not causing a problem. > COMPLETING possibility by only making the SUBMITTED -> COMPLETING > transition while the handler_lock is held. The CANCELED possibility is a > bit harder. I considered the idea of making usb_unlink_urb() spin if it > is called for a resubmitted urb still in its completion handler, but that > turned out to be neither desirable nor feasible. In the end, the best Why? IMHO changing the status to CANCELED is no problem and neither is spinning on the handler_lock. Submitting and unlinking an URB from a completion handler is arcane anyway. > answer was to add a new state. Let's agree that when a CANCELED urb's > handler is called, the state changes to TERMINATING, just as a SUBMITTED > urb's state changes to COMPLETING. (The name TERMINATING is meant to > suggest that an indefinite handler-automatically-resubmits loop will be > broken.) That takes care of everything: when the completion handler > returns, if the urb was resubmitted then its state will be either > SUBMITTED or CANCELED, otherwise its state will be either COMPLETING or > TERMINATING. > > If there turns out to be a driver that really needs to resubmit a > cancelled urb from within its completion handler, it will be easy to write > an API that changes the state from TERMINATING back to COMPLETING. Ack. > Since there are now two spinlocks involved here, we have to take care to > avoid deadlock. Completion handlers run with handler_lock held, and they > have to be able to call usb_submit_urb() which acquires lock (I think -- > correct me if this is wrong), so the rule must be: Whenever lock and > handler_lock are both acquired, you have to acquire handler_lock first. > > The state diagram is now a little too complicated for my ASCII art skills. > I'll just present it as a table. > > IDLE: the initial state. It is a BUG to destroy an urb that is > not in the IDLE state (maybe urb->count already takes care of > that). > usb_submit_urb() --> SUBMITTED > > SUBMITTED: under control of the low-level HCD. > HCD finishes --> COMPLETING > usb_unlink_urb() --> CANCELED > > CANCELED: still under control of the HCD. > HCD finishes --> TERMINATING > > COMPLETING: a non-cancelled urb passed to its completion handler. > usb_unlink_urb() --> TERMINATING > usb_submit_urb() --> SUBMITTED > handler returns --> IDLE > > TERMINATING: a cancelled urb passed to its completion handler. > handler returns --> IDLE Could you explain why CANCELED and TERMINATING should be differentiated? It seems to me that both usb_unlink_urb() and usb_submit_urb() would fail with an error in both cases. > Adjustments needed with respect to the last patch: > > Add the URB_TERMINATING state to usb.h. > > Remove the spinlock calls from unlink_complete(). Since unlink_complete() > is itself a completion handler, the spinlock has already been acquired. > Likewise, don't change the state to IDLE. Done. > In hcd_unlink_urb(): the CANCELED and TERMINATING cases are both BUGS (or > maybe they are both just oopses) -- someone has tried to unlink the same > urb twice. In the COMPLETING case, the state should change to > TERMINATING. A double unlink can happen in some network drivers if timeout and disconnect race. Can we simply return an error? > In usb_hcd_giveback_urb(): move spin_lock(&urb->handler_lock) to the > start, immediately before spin_lock(&urb->lock). If the state was > CANCELED, set it to TERMINATING. At the end, release handler_lock only > after updating the state. Set the state to IDLE if it was either > COMPLETING or TERMINATING. Done. > In usb_submit_urb(): if the state was either SUBMITTED or CANCELED then > it's a BUG or an oops. If the state was TERMINATING, return -EINTR. > > Update the appropriate kerneldoc comments and the error code listing in > Documentation/usb/error-codes.txt. I'll do that as soon as we have a final version. I've followed your suggestions that seem obvious, but can we discuss the introduction of a TERMINATING state and double locking? Regards Oliver # This is a BitKeeper generated patch for the following project: # Project Name: greg k-h's linux 2.5 USB kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1270 -> 1.1273 # include/linux/usb.h 1.122 -> 1.123 # drivers/usb/core/urb.c 1.19 -> 1.20 # drivers/usb/core/hcd.c 1.86 -> 1.89 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/01/14 [EMAIL PROTECTED] 1.1271 # - explicit URB state # - improvement in unlinking semantics # -------------------------------------------- # 03/01/14 [EMAIL PROTECTED] 1.1272 # - spinlocks in unlink notification path # - fix state in case of resubmission # -------------------------------------------- # 03/01/15 [EMAIL PROTECTED] 1.1273 # - remove incorrect spinlocking # -------------------------------------------- # diff -Nru a/drivers/usb/core/hcd.c b/drivers/usb/core/hcd.c --- a/drivers/usb/core/hcd.c Wed Jan 15 20:16:16 2003 +++ b/drivers/usb/core/hcd.c Wed Jan 15 20:16:16 2003 @@ -1138,14 +1138,23 @@ goto done; } - /* Any status except -EINPROGRESS means something already started to - * unlink this URB from the hardware. So there's no more work to do. - * - * FIXME use better explicit urb state + /* + * We decide based upon the internal_state what's to be done */ - if (urb->status != -EINPROGRESS) { + switch (urb->internal_state) { + case URB_SUBMITTED: /* normal case */ + urb->internal_state = URB_CANCELED; + break; + default: + BUG(); + case URB_IDLE: + case URB_CANCELED: retval = -EBUSY; goto done; + case URB_COMPLETING: + retval = -EBUSY; + urb->internal_state = URB_CANCELED; + goto err_out_waiting; } /* maybe set up to block until the urb's completion fires. the @@ -1205,6 +1214,18 @@ if (retval && sys) dev_dbg (sys, "hcd_unlink_urb %p fail %d\n", urb, retval); return retval; + +err_out_waiting: + /* we are asked to unlink while the completing routine is already + running. We drop the locks and spin in case of synchronous unlink */ + spin_unlock (&hcd_data_lock); + spin_unlock_irqrestore (&urb->lock, flags); + if (!(urb->transfer_flags & URB_ASYNC_UNLINK)) { + spin_lock_irq(&urb->handler_lock); + spin_unlock_irq(&urb->handler_lock); + } + + return retval; } /*-------------------------------------------------------------------------*/ @@ -1287,6 +1308,10 @@ void usb_hcd_giveback_urb (struct usb_hcd *hcd, struct urb *urb, struct pt_regs *regs) { urb_unlink (urb); + spin_lock(&urb->lock); + if (urb->internal_state == URB_SUBMITTED) + urb->internal_state = URB_COMPLETING; + spin_unlock(&urb->lock); // NOTE: a generic device/urb monitoring hook would go here. // hcd_monitor_hook(MONITOR_URB_FINISH, urb, dev) @@ -1307,7 +1332,15 @@ } /* pass ownership to the completion handler */ + spin_lock(&urb->handler_lock); urb->complete (urb, regs); + spin_unlock(&urb->handler_lock); + /* we are now idle unless the urb was resubmitted */ + spin_lock(&urb->lock); + if (urb->internal_state == URB_COMPLETING) + urb->internal_state = URB_IDLE; + spin_unlock(&urb->lock); + usb_put_urb (urb); } EXPORT_SYMBOL (usb_hcd_giveback_urb); diff -Nru a/drivers/usb/core/urb.c b/drivers/usb/core/urb.c --- a/drivers/usb/core/urb.c Wed Jan 15 20:16:16 2003 +++ b/drivers/usb/core/urb.c Wed Jan 15 20:16:16 2003 @@ -44,6 +44,7 @@ memset(urb, 0, sizeof(*urb)); urb->count = (atomic_t)ATOMIC_INIT(1); spin_lock_init(&urb->lock); + spin_lock_init(&urb->handler_lock); return urb; } @@ -192,6 +193,8 @@ struct usb_device *dev; struct usb_operations *op; int is_out; + int r; + unsigned long flags; if (!urb || urb->hcpriv || !urb->complete) return -EINVAL; @@ -346,7 +349,28 @@ urb->interval = temp; } + spin_lock_irqsave(urb->lock, flags); + switch (urb->internal_state) { + case URB_IDLE: + case URB_COMPLETING: + urb->internal_state = URB_SUBMITTED; + break; + case URB_CANCELED: + r = -EINTR; + goto err_out; + case URB_SUBMITTED: + default: + BUG(); + r = -EBUSY; + goto err_out; + } + spin_unlock_irqrestore(urb->lock, flags); + return op->submit_urb (urb, mem_flags); + +err_out: + spin_unlock_irqrestore(urb->lock, flags); + return r; } /*-------------------------------------------------------------------*/ diff -Nru a/include/linux/usb.h b/include/linux/usb.h --- a/include/linux/usb.h Wed Jan 15 20:16:16 2003 +++ b/include/linux/usb.h Wed Jan 15 20:16:16 2003 @@ -728,6 +728,8 @@ { spinlock_t lock; /* lock for the URB */ atomic_t count; /* reference count of the URB */ + int internal_state; /* state in the URB's lifecycle */ + spinlock_t handler_lock; /* lock during completion routine */ void *hcpriv; /* private data for host controller */ struct list_head urb_list; /* list pointer to all active urbs */ struct usb_device *dev; /* (in) pointer to associated device */ @@ -750,6 +752,11 @@ usb_complete_t complete; /* (in) completion routine */ struct usb_iso_packet_descriptor iso_frame_desc[0]; /* (in) ISO ONLY */ }; + +#define URB_IDLE 0 +#define URB_SUBMITTED 1 +#define URB_COMPLETING 2 +#define URB_CANCELED 3 /* -------------------------------------------------------------------------- */ ------------------------------------------------------- This SF.NET email is sponsored by: A Thawte Code Signing Certificate is essential in establishing user confidence by providing assurance of authenticity and code integrity. Download our Free Code Signing guide: http://ads.sourceforge.net/cgi-bin/redirect.pl?thaw0028en _______________________________________________ [EMAIL PROTECTED] To unsubscribe, use the last form field at: https://lists.sourceforge.net/lists/listinfo/linux-usb-devel