I could imagine that "misc OHCI updates" patch of a few weeks back speeding up an OHCI implementation because it had less work to do (no periodic schedule dma), exposing such a race.
OK, see how this patch behaves for you. The "bad entry" stuff seems to have been caused by trying to have a fast-path (the same one 2.4 usb-ohci has) getting rid of a 1 msec unlink delay in some cases. It finally stopped working; I confess I've never quite trusted that shortcut!
- Dave
p.s. Alan, notice how the "unlink during submit" case works. It's got to giveback(), as promised by the unlink logic.
Fix two OHCI unlink issues.
* All EDs now get a 1 msec delay before re-linking, even those which were seemingly "clean" unlink cases. This gets rid of some list corruption issues ("bad entry") by getting rid of a fast-path carried over from 2.4 usb-ohci. * In case of unlink-during-submit, we must giveback() right away. This is a reasonably rare case. There have been recent reports of problems here. The "bad entry" showed up with usbtest tests #11 and #12, or "stir4200", and maybe in other cases. The unlink-during-submit shows up in usbtest. --- 1.56/drivers/usb/host/ohci-hcd.c Fri Feb 20 07:59:50 2004 +++ edited/drivers/usb/host/ohci-hcd.c Tue Mar 2 13:52:40 2004 @@ -229,11 +229,21 @@ goto fail; } + /* in case of unlink-during-submit */ + spin_lock (&urb->lock); + if (urb->status != -EINPROGRESS) { + spin_unlock (&urb->lock); + + finish_urb (ohci, urb, 0); + retval = 0; + goto fail; + } + /* schedule the ed if needed */ if (ed->state == ED_IDLE) { retval = ed_schedule (ohci, ed); if (retval < 0) - goto fail; + goto fail0; if (ed->type == PIPE_ISOCHRONOUS) { u16 frame = OHCI_FRAME_NO(ohci->hcca); @@ -257,6 +267,8 @@ urb->hcpriv = urb_priv; td_submit_urb (ohci, urb); +fail0: + spin_unlock (&urb->lock); fail: if (retval) urb_free_priv (ohci, urb_priv); --- 1.49/drivers/usb/host/ohci-q.c Fri Feb 20 07:59:50 2004 +++ edited/drivers/usb/host/ohci-q.c Tue Mar 2 13:52:46 2004 @@ -331,19 +331,6 @@ periodic_unlink (ohci, ed); break; } - - /* NOTE: Except for a couple of exceptionally clean unlink cases - * (like unlinking the only c/b ED, with no TDs) HCs may still be - * caching this operational ED (or its address). Safe unlinking - * involves not marking it ED_IDLE till INTR_SF; we always do that - * if td_list isn't empty. Otherwise the race is small; but ... - */ - if (ed->state == ED_OPER) { - ed->state = ED_IDLE; - ed->hwINFO &= ~(ED_SKIP | ED_DEQUEUE); - ed->hwHeadP &= ~ED_H; - wmb (); - } } @@ -665,6 +652,7 @@ /* start periodic dma if needed */ if (periodic) { + wmb (); ohci->hc_control |= OHCI_CTRL_PLE|OHCI_CTRL_IE; writel (ohci->hc_control, &ohci->regs->control); } @@ -1053,7 +1041,7 @@ /* clean schedule: unlink EDs that are no longer busy */ if (list_empty (&ed->td_list)) - ed_deschedule (ohci, ed); + start_ed_unlink (ohci, ed); /* ... reenabling halted EDs only after fault cleanup */ else if ((ed->hwINFO & (ED_SKIP | ED_DEQUEUE)) == ED_SKIP) { td = list_entry (ed->td_list.next, struct td, td_list);