Benjamin Herrenschmidt wrote:
Hi !

I have very frequent errors of this type with the USB controller
found in the G5s. It's random: a given kernel will apparently always
die or never die, depending on how many flies you had in the room
when building it. The problem has been around since I got the G5
port up, and I don't know what's causing it.

The controller is an EHCI/OHCI, Apple-branded but I suspect it's
actually a NEC part.

ohci_hcd 0001:02:0b.0: OHCI Unrecoverable Error, disabled

For what it's worth, I've recently seen similar symptoms on one particular machine that seem related to PCI DMA problems.

 - Happens on _most_ of the 3 different types of OHCI
   controllers on that machine.  (Except a NEC controller!)

 - Happens when it's idling (no devices, not like your case).

 - In one case, didn't happen at all ... until one other
   driver started to use 10-20 MByte/sec DMA bandwidth.

Usually the "UE" interrupt indicates something on the order
of a PCI abort, so the controller can't complete a DMA access.
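To make the "UE" bit concrete, here is a minimal sketch of testing for it in a copy of the HcInterruptStatus register. The bit positions follow the OHCI register layout; the helper name ohci_saw_ue() is hypothetical and is not a function in the ohci_hcd driver, and the register is modelled as a plain variable rather than real MMIO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HcInterruptStatus bits, per the OHCI register layout. */
#define OHCI_INTR_SO   (1u << 0)   /* scheduling overrun */
#define OHCI_INTR_WDH  (1u << 1)   /* writeback of done head */
#define OHCI_INTR_UE   (1u << 4)   /* unrecoverable error */

/*
 * Hypothetical helper (not part of the real driver): returns 1 when the
 * interrupt status word reports an unrecoverable error, else 0.
 * The pointer stands in for a mapped controller register.
 */
static int ohci_saw_ue(const volatile uint32_t *intrstatus)
{
        assert(intrstatus != NULL);
        return (*intrstatus & OHCI_INTR_UE) != 0;
}
```

Once UE is seen the controller has stopped processing its lists, which is why the driver can only report the error and disable the HC.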

In my case it seems more indicative of a PCI problem; it's
certainly been a couple years now since that's turned up
any kind of driver problem (like freeing and poisoning data
structures the hardware is still using).


drivers/usb/input/hid-core.c: ctrl urb status -2 received
ohci_hcd 0001:02:0b.0: HC died; cleaning up
usb 4-2: USB disconnect, address 2
ohci_hcd 0001:02:0b.0: leak ed c0525040 (#2) state 0 (has tds)

Hmm, that one shouldn't happen. If you modify the top of ohci_endpoint_disable() by adding a call to finish_unlinks():

        if (!HCD_IS_RUNNING (ohci->hcd.state)) {
                ed->state = ED_IDLE;
                /* added call: finish pending unlinks now, since a
                 * dead HC will never do it for us */
                finish_unlinks (ohci, 0, 0);
        }
        switch (ed->state) {

does that remove that "leak ed" message?


Any clue ?

Well, a few clues in my case, some of which might be applicable. I've not yet spent time exploring these:

 (1) The way the fault happened almost immediately when
     one other device started to DMA, on a different
     PCI bus segment.

 (2) The observation that periodic schedule scanning is
     left on by default, even when there are no periodic
     transfers active.  OHCI_CTRL_PLE doesn't get turned
     off the way OHCI_CTRL_{BLE,CLE} are.

 (3) What seem like other PCI or BIOS issues on that mobo.

#2 has been true with OHCI forever.  It's probably worth
changing the OHCI driver to disable the periodic list;
that'd be friendlier to power management schemes too.
But since you have a HID device, that may not matter:
you'll need PLE enabled to poll that device.
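A rough sketch of what that change could look like, assuming the driver keeps a count of active transfers per schedule. The HcControl bit positions follow the OHCI register layout; struct sched_load and ohci_control_for_load() are hypothetical names for illustration only, not existing driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HcControl list-enable bits, per the OHCI register layout. */
#define OHCI_CTRL_PLE  (1u << 2)   /* periodic list enable */
#define OHCI_CTRL_CLE  (1u << 4)   /* control list enable */
#define OHCI_CTRL_BLE  (1u << 5)   /* bulk list enable */

/* Hypothetical bookkeeping: number of active endpoints per schedule. */
struct sched_load {
        unsigned periodic;   /* interrupt + isochronous */
        unsigned control;
        unsigned bulk;
};

/*
 * Hypothetical helper: compute a new HcControl value in which each list
 * is enabled only while it has work. All other bits are left untouched.
 */
static uint32_t ohci_control_for_load(uint32_t control,
                                      const struct sched_load *load)
{
        assert(load != NULL);
        control &= ~(OHCI_CTRL_PLE | OHCI_CTRL_CLE | OHCI_CTRL_BLE);
        if (load->periodic)
                control |= OHCI_CTRL_PLE;
        if (load->control)
                control |= OHCI_CTRL_CLE;
        if (load->bulk)
                control |= OHCI_CTRL_BLE;
        return control;
}
```

With a HID device attached the periodic count is nonzero, so PLE stays set and the controller keeps scanning the periodic schedule; only a fully idle controller would stop that DMA.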

I'd investigate what other PCI activity might be getting
in the way of the OHCI controller's DMA requests.

- Dave



(2.6.1 btw)

Ben.




