I could imagine that "misc OHCI updates" patch of a few weeks back speeding up an OHCI implementation because it had less work to do (no periodic schedule dma), exposing such a race.
OK, see how this patch behaves for you. The "bad entry" stuff seems to have been caused by trying to have a fast-path (the same one 2.4 usb-ohci has) getting rid of a 1 msec unlink delay in some cases. It finally stopped working; I confess I've never quite trusted that shortcut!
- Dave
p.s. Alan, notice how the "unlink during submit" case works.
It's got to giveback(), as promised by the unlink logic.Fix two OHCI unlink issues.
* All EDs now get a 1 msec delay before re-linking, even those
which were seemingly "clean" unlink cases. This gets rid of
some list corruption issues ("bad entry") by getting rid of
a fast-path carried over from 2.4 usb-ohci.
* In case of unlink-during-submit, we must giveback() right away.
This is a reasonably rare case.
There have been recent reports of problems here. The "bad entry"
showed up with usbtest tests #11 and #12, or "stir4200", and maybe
in other cases. The unlink-during-submit shows up in usbtest.
--- 1.56/drivers/usb/host/ohci-hcd.c Fri Feb 20 07:59:50 2004
+++ edited/drivers/usb/host/ohci-hcd.c Tue Mar 2 13:52:40 2004
@@ -229,11 +229,21 @@
goto fail;
}
+ /* in case of unlink-during-submit */
+ spin_lock (&urb->lock);
+ if (urb->status != -EINPROGRESS) {
+ spin_unlock (&urb->lock);
+
+ finish_urb (ohci, urb, 0);
+ retval = 0;
+ goto fail;
+ }
+
/* schedule the ed if needed */
if (ed->state == ED_IDLE) {
retval = ed_schedule (ohci, ed);
if (retval < 0)
- goto fail;
+ goto fail0;
if (ed->type == PIPE_ISOCHRONOUS) {
u16 frame = OHCI_FRAME_NO(ohci->hcca);
@@ -257,6 +267,8 @@
urb->hcpriv = urb_priv;
td_submit_urb (ohci, urb);
+fail0:
+ spin_unlock (&urb->lock);
fail:
if (retval)
urb_free_priv (ohci, urb_priv);
--- 1.49/drivers/usb/host/ohci-q.c Fri Feb 20 07:59:50 2004
+++ edited/drivers/usb/host/ohci-q.c Tue Mar 2 13:52:46 2004
@@ -331,19 +331,6 @@
periodic_unlink (ohci, ed);
break;
}
-
- /* NOTE: Except for a couple of exceptionally clean unlink cases
- * (like unlinking the only c/b ED, with no TDs) HCs may still be
- * caching this operational ED (or its address). Safe unlinking
- * involves not marking it ED_IDLE till INTR_SF; we always do that
- * if td_list isn't empty. Otherwise the race is small; but ...
- */
- if (ed->state == ED_OPER) {
- ed->state = ED_IDLE;
- ed->hwINFO &= ~(ED_SKIP | ED_DEQUEUE);
- ed->hwHeadP &= ~ED_H;
- wmb ();
- }
}
@@ -665,6 +652,7 @@
/* start periodic dma if needed */
if (periodic) {
+ wmb ();
ohci->hc_control |= OHCI_CTRL_PLE|OHCI_CTRL_IE;
writel (ohci->hc_control, &ohci->regs->control);
}
@@ -1053,7 +1041,7 @@
/* clean schedule: unlink EDs that are no longer busy */
if (list_empty (&ed->td_list))
- ed_deschedule (ohci, ed);
+ start_ed_unlink (ohci, ed);
/* ... reenabling halted EDs only after fault cleanup */
else if ((ed->hwINFO & (ED_SKIP | ED_DEQUEUE)) == ED_SKIP) {
td = list_entry (ed->td_list.next, struct td, td_list);
