On Tuesday 21 December 2004 5:54 pm, Alan Stern wrote:
> On Tue, 21 Dec 2004, David Brownell wrote:
> 
> > Hmm, if this is after you get that "fatal error" out of EHCI
> > during init (which I've not had time to do much with; and it
> > doesn't happen on any of my hardware) I suspect it's just that
> > usbcore isn't cleaning up properly there.  All the HCDs, and
> > other USB drivers, need to be first idled, then removed; but
> > the hub driver, at least, doesn't get cleaned up.
> 
> It would be nice to find out what part of the cleanup isn't happening
> right. 

It's cleanup that has never happened right.  Basically what's needed is
to have the hub driver (a) stop the root hub timer, which is simple
enough, then (b) invoke hcd->stop(), which isn't.  The stop() has
to be called from khubd, otherwise deadlocks happen; for a while, keventd
was calling it (and I saw deadlocks).  Maybe a quick'n'dirty fix
is just to clear the config flag register as well as reset EHCI,
and then let "rmmod" do the stop().  That'd defer any disconnects
till very late, though.
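
The ordering described above can be modeled in plain userspace C. This is only a minimal sketch: `struct fake_hcd`, `fake_stop()` and `cleanup_dead_hcd()` are hypothetical names invented for illustration, not usbcore code, and the "am I running in khubd" test is reduced to a boolean argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a host controller; NOT the kernel's struct usb_hcd. */
struct fake_hcd {
    bool rh_timer_running;            /* root hub polling timer active? */
    bool stopped;                     /* has stop() run yet? */
    void (*stop)(struct fake_hcd *);  /* plays the role of hcd->stop() */
};

static void fake_stop(struct fake_hcd *hcd)
{
    hcd->stopped = true;
}

/*
 * Clean up after an "hc died" fault.  Returns true if stop() ran,
 * false if it was left for a later caller (for example rmmod).
 */
static bool cleanup_dead_hcd(struct fake_hcd *hcd, bool in_khubd)
{
    assert(hcd != NULL && hcd->stop != NULL);

    /* (a) the easy part: stop the root hub timer first */
    hcd->rh_timer_running = false;

    /* (b) only call stop() from the hub thread; any other context
     * (keventd, in this model) defers it instead of risking deadlock */
    if (!in_khubd)
        return false;

    hcd->stop(hcd);
    return true;
}
```

Calling `cleanup_dead_hcd(&hcd, false)` stops the timer but leaves `stopped` clear, which corresponds to the "let rmmod do the stop()" variant; calling it with `true` runs both steps.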

That is, there are two separate problems.  That cleanup after an
"hc died" fault has been a longstanding -- but rare -- problem.

But actually needing that cleanup much at all is newish.  It seems to
have been needed more often (EHCI only) starting with 2.6.6 or so, or
2.6.7, or 2.6.8 depending on who reports it.  And I didn't notice
any particularly relevant EHCI changes in those kernels either!

One VT6202 version of the failure is suggestive too:

http://marc.theaimsgroup.com/?l=linux-usb-devel&m=110316031705757&w=2

It's suggestive because that chip wasn't reporting that it could
switch port power, yet it went and disabled the power on several
ports.  And enabling the "should be a NOP" power switch logic seemed
to make a difference.  (The current code reports "ganged"
power switching, which according to the EHCI spec is wrong.
EHCI should do per-port switching, or none at all.)
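
To make the "per-port or none at all" point concrete, here is a small sketch of how a root hub descriptor's wHubCharacteristics could be derived from the controller's HCSPARAMS. The PPC bit position follows the EHCI spec and the field values follow the USB 2.0 hub descriptor; the helper name `root_hub_characteristics()` and the `HUB_CHAR_*` macro names are made up for this example, and this is not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* HCSPARAMS bit 4: Port Power Control (PPC) */
#define HCS_PPC(p)              ((p) & (1u << 4))

/* wHubCharacteristics, bits 1:0: logical power switching mode */
#define HUB_CHAR_LPSM_GANGED    0x0000  /* what EHCI should never report */
#define HUB_CHAR_LPSM_PER_PORT  0x0001
#define HUB_CHAR_LPSM_NONE      0x0002
/* wHubCharacteristics, bits 4:3: per-port overcurrent reporting */
#define HUB_CHAR_OCPM_PER_PORT  0x0008

/* Hypothetical helper: choose the power switching mode from HCSPARAMS. */
static uint16_t root_hub_characteristics(uint32_t hcs_params)
{
    uint16_t ch = HUB_CHAR_OCPM_PER_PORT;

    if (HCS_PPC(hcs_params))
        ch |= HUB_CHAR_LPSM_PER_PORT;  /* controller can switch each port */
    else
        ch |= HUB_CHAR_LPSM_NONE;      /* no power switching at all */

    /* invariant from the text: never "ganged" */
    assert((ch & 0x0003) != HUB_CHAR_LPSM_GANGED);
    return ch;
}
```

With PPC set this yields 0x0009, and with PPC clear it yields 0x000a rather than the "ganged" value 0x0008.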


> Do you know at what point that "fatal error" occurs?  Is it during 
> probe?  Can you post a short patch that will simulate the same effect?

Near the end of ehci_irq(), just OR STS_FATAL into the IRQ status
value before testing whether that bit is set.
That ought to do it.

It happens after probe, sometime after the root hub has
been enumerated and before the first "real IRQ" has been
fully processed.


> I will spend some time this week redoing the bugfix and cleanup parts of 
> the as424b-as426b patches, without the bus glue changes.  Maybe that will 
> make a difference.

Hold off on that for a bit.  I may yet be persuaded that your
current patches are just fine; like I said, I just wanted a
chance to think about them properly!

- Dave


> 
> Alan Stern
> 
> 

