Re: [linux-usb-devel] SuperH UDC problem with stall

David Brownell Tue, 28 Oct 2003 13:51:58 -0800

Julian Back wrote:

Thanks to all on this list I have got my USB Device Controller Driver for the SuperH working quite well. It works OK with the mass storage gadget and I've tested it with Linux 2.4, Linux 2.6, Windows 2000 and Windows XP hosts. We've also got the SuperH host controller at least partially working.

Excellent!

The biggest problem I have had with the UDC is stalling endpoints from the device. This happens quite a lot when using the mass storage device with Windows as a lot of unsupported (optional) commands are sent.


You might want to pull the latest BK code from my trees, there have
been some stall-related changes recently.  (Thanks to Pat LaVarre
and Alan Stern for helping sort this out, and I suspect they may
be able to provide further illumination...)  The updates weren't
in BK 24 hours ago.

Very briefly: BBB stalls IN endpoints in certain cases, for example:

  - Host issues MODE_SENSE_10 and requests 28 bytes
  - But the device only has 20 bytes, so it
       (a) writes 20 bytes
       (b) stalls
       (c) writes 13 byte CSW
  - Then the host
       (a) reads the 20 bytes (short transfer);
       (b) reads a stall
       (b') clears halt feature on that IN endpoint
       (c) reads 13 byte CSW
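
To make step (c) concrete, here is a rough sketch of how the 13-byte CSW for that exchange could be filled in, with the residue being the 28 bytes requested minus the 20 actually sent. The helper names (put_le32, build_csw) and the tag value used in any caller are made up for illustration; only the field layout (signature, tag, data residue, status) comes from the Bulk-Only Transport format.

```c
#include <assert.h>
#include <stdint.h>

#define CSW_LEN        13
#define CSW_SIGNATURE  0x53425355u  /* "USBS" when stored little-endian */
#define CSW_STATUS_OK  0

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
}

/*
 * Hypothetical helper: fill in a CSW for a transfer where the host
 * asked for `requested` bytes and the device sent only `sent`.
 */
static void build_csw(uint8_t csw[CSW_LEN], uint32_t tag,
                      uint32_t requested, uint32_t sent, uint8_t status)
{
    assert(sent <= requested);
    put_le32(csw + 0, CSW_SIGNATURE);     /* dCSWSignature */
    put_le32(csw + 4, tag);               /* dCSWTag, echoed from the CBW */
    put_le32(csw + 8, requested - sent);  /* dCSWDataResidue */
    csw[12] = status;                     /* bCSWStatus */
}
```

For the MODE_SENSE_10 case above, build_csw(csw, tag, 28, 20, CSW_STATUS_OK) leaves a residue of 8 in bytes 8..11.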

The issue is that hardware treats STALL as out-of-band messaging
and doesn't (as a rule) synchronize it at all with the fifo.  But
the mass storage protocol expects STALL at specific in-band
locations, so it requires synchronization to work right.

Without synch, either (a), (b), or (c) usually fails rudely.  In
fact, each of three controller drivers (pxa2xx_udc, goku_udc, and
net2280) had a different failure mode ... dropping the (a), (b),
or (c) packets, I forget which did which.

Given the choice between a complicated change in every controller
driver (queueing halts, for both PIO and DMA modes), and simpler
ones in the controller drivers and FSG, we chose the latter:

  - For IN endpoints, usb_ep_set_halt() returns -EAGAIN until
    the queue and FIFO are both empty.
  - When FSG gets that -EAGAIN, it retries.
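
The two rules above can be sketched in plain user-space C. This is only a model: struct mock_ep, mock_ep_set_halt(), mock_host_reads() and halt_with_retry() are invented stand-ins, not the kernel's struct usb_ep or the real FSG code, and the "host read" that drains the FIFO replaces whatever delay a real driver would use between retries.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a controller endpoint. */
struct mock_ep {
    bool is_in;           /* true for an IN (device-to-host) endpoint */
    unsigned queued;      /* requests still on the endpoint queue */
    unsigned fifo_bytes;  /* bytes still sitting in the hardware FIFO */
    bool halted;
};

/* Models the revised rule: an IN endpoint will not halt while data is pending. */
static int mock_ep_set_halt(struct mock_ep *ep)
{
    if (ep->is_in && (ep->queued > 0 || ep->fifo_bytes > 0))
        return -EAGAIN;
    ep->halted = true;
    return 0;
}

/* Pretend the host read one packet; a request completes once the FIFO is empty. */
static void mock_host_reads(struct mock_ep *ep, unsigned max_packet)
{
    if (ep->fifo_bytes > max_packet)
        ep->fifo_bytes -= max_packet;
    else
        ep->fifo_bytes = 0;
    if (ep->fifo_bytes == 0 && ep->queued > 0)
        ep->queued--;
}

/*
 * Models the gadget side: retry while the controller says -EAGAIN.
 * Returns the number of attempts on success, or a negative errno.
 */
static int halt_with_retry(struct mock_ep *ep, unsigned max_tries)
{
    unsigned tries;

    for (tries = 1; tries <= max_tries; tries++) {
        int rc = mock_ep_set_halt(ep);

        if (rc == 0)
            return (int)tries;
        if (rc != -EAGAIN)
            return rc;
        mock_host_reads(ep, 64);  /* stands in for waiting on the host */
    }
    return -EAGAIN;
}
```

With one queued request and 20 bytes in the FIFO, the first attempt is refused and the second succeeds once the host has drained the data, which is the ordering between (a) and (b) that the protocol needs.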

That handles the synchronization between (a) and (b).  Most of
the controllers seem to synch (b) and (c) automatically.

The problem is that the hardware is (again) trying to be too clever. I set a bit in a register to stall an endpoint. The hardware also has an internal stall bit for each endpoint (not accessible to software). When the host attempts a transaction on an endpoint, the hardware checks the internal stall bit and, if it is set, sends a stall. If the internal bit is not set, it checks the software-settable bit and, if that is set, latches the internal bit and sends a stall. Resetting the software stall bit does not affect the latched bit; the latched bit can only be cleared by a Clear Feature command from the host. But the Clear Feature is processed by the hardware and the software is not informed. If I


Does it clear that software settable bit when it's latched?  I'd hope so;
and likewise, I'd hope that the clear_feature isn't clearing anything
else except toggle.  (Like the FIFO holding the 13 bytes of CSW...)

leave the software stall bit set there will be another stall on the next transaction so I need to clear the stall bit. I do this by starting a timer to clear the stall bit sometime after I set it. This seems to work most of the time but it's not 100% reliable as I don't know whether the host has actually received the stall (or sent the Clear Feature).


I'm not quite clear what the problem is here -- latch-without-clear?
Can you detect whether a STALL was sent?  Does it matter whether there
was data in the FIFO before STALL was set ... or when it was cleared?
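
As a way of pinning down the behavior being described, here is a small state model of the two stall bits. It assumes (going by the description above, not by any SuperH documentation) that the software-settable bit is NOT cleared automatically when the internal bit latches; all names are invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one endpoint's stall state. */
struct stall_model {
    bool sw_stall;  /* register bit that software can set and clear */
    bool latched;   /* internal bit, not visible to software */
};

/* Host attempts a transaction; returns true if the device answers STALL. */
static bool host_transaction(struct stall_model *m)
{
    if (m->latched)
        return true;
    if (m->sw_stall) {
        m->latched = true;  /* assumption: sw_stall itself stays set */
        return true;
    }
    return false;
}

/* Host sends Clear Feature; hardware handles it and software never hears of it. */
static void host_clear_feature(struct stall_model *m)
{
    m->latched = false;
}
```

Under that assumption the timer has to clear sw_stall inside a window it cannot observe: after the first stalled transaction but before the first transaction that follows Clear Feature. Clearing too early means no stall is ever sent; clearing too late means the endpoint stalls a second time.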

The optimum time delay needed for resetting the stall seems to depend on the speed of the host system. I've tried it with a 2.2GHz P4 and a 300MHz Celeron, and it doesn't seem possible to find a value that works well with both systems.


Right, any host could take any amount of time.  It just depends on
how the host software is written and how busy it is.

Unless anyone can suggest an alternative that will make this more reliable it seems my only solution will be to implement all the optional mass storage commands that Windows sends so the gadget never has to stall an endpoint. I've already implemented a few of the missing commands and this has improved things. I've also added an option to the mass storage gadget to pad out short replies rather than sending the short reply and stalling (this seems to be a valid alternative according to the mass storage spec) and this also makes it more reliable.


Hmm, can you see if implementing that revised usb_ep_set_halt() behavior
affects this behavior?

Pat had some comments about the pad-out-replies strategies that I
won't try to summarize, beyond noting that not all hosts handle
the various protocol variants equally well.
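
For reference, the pad-out alternative mentioned above amounts to something like the following sketch. The function name and the choice of zero as the fill byte are assumptions made for illustration; nothing here is taken from the actual gadget option.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: the device has `valid` bytes of real data in buf
 * but the host asked for `requested`.  Fill the gap so the transfer is
 * not short, and return the number of bytes to send.
 */
static size_t pad_reply(unsigned char *buf, size_t buf_size,
                        size_t valid, size_t requested)
{
    assert(requested <= buf_size);
    if (valid >= requested)
        return requested;  /* nothing to pad; send what was asked for */
    memset(buf + valid, 0, requested - valid);  /* zero fill is an assumption */
    return requested;
}
```

For the MODE_SENSE_10 example, pad_reply(buf, sizeof buf, 20, 28) would send all 28 bytes, so the device never needs to stall the IN endpoint for that command.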

I'd be quite willing to believe that the SuperH UDC brings some
new halt/stall problems to the table.  The good news is that so
far this only seems to affect the mass storage class support,
since most other device protocols don't use that mechanism.

- Dave

Julian

