Re: XHCI Clear halt issue
Thanks for your help, Mathias! See my comments inline below:

Mathias Nyman wrote on 04/08/2014 10:26:43 AM:

> The issue we currently have is that the xHCI (both driver and hw)
> refuses to reset an endpoint if it's not halted.
> SetFeature(ENDPOINT_HALT) will set the device to halted state, but it
> requires some additional transfer that returns STALL until xHCI will see
> the endpoint as halted.
>
> So in this case the situation is:
> Abort pending urbs
> SetFeature(ENDPOINT_HALT)
>  - ep halted on device side, xHCI doesn't consider ep halted.
> usb_clear_halt()
>  - ClearFeature(ENDPOINT_HALT) -> device resets its ep toggle/sequence
>  - call hcd->driver->endpoint_reset(), but the xhci .endpoint_reset()
>    callback can't reset an endpoint it doesn't consider halted.
>    xhci host side toggle/sequence are not reset -> mismatch.

Ok. But there shouldn't be any way for user code to get the two out of sync, right? This is really a layer below what the user should be able to interact with. Maybe this is what you are saying?

> With dynamic debugging enabled for xhci you should probably see:
> "Endpoint x not halted, refusing to reset."

I'll try to get a kernel installed with this enabled. Right now it is a bit tricky to update kernels on our systems because there is a whole hierarchy of dependencies that need to be rebuilt with it. If there are specific things to test that I can lump together, I can rebuild it all at once.

> Discussion threads touching this topic:
> http://marc.info/?l=linux-usb&m=134922286125585&w=2
> http://marc.info/?l=linux-usb&m=134852269014614&w=2
> http://marc.info/?l=linux-usb&m=139025060301432&w=2

Thanks for consolidating those messages. Those were the ones I had seen previously, but I wasn't sure what to conclude from them.

> I'm focusing on this issue right now, and I appreciate if you are able
> to run some test with your setup once I get something ready.

Great! I can help as needed.
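The mismatch described in the quoted explanation above can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not kernel code, the class and parameter names (`DeviceEndpoint`, `HostEndpoint`, `reset_requires_halted`) are hypothetical, and a single toggle bit stands in for both the High Speed data toggle and the SuperSpeed sequence number.

```python
class DeviceEndpoint:
    """Device-side endpoint state (toy model, one toggle bit)."""

    def __init__(self):
        self.expected = 0
        self.halted = False

    def set_halt(self):
        # Effect of SetFeature(ENDPOINT_HALT) on the device.
        self.halted = True

    def clear_halt(self):
        # Effect of ClearFeature(ENDPOINT_HALT): un-halt and reset the toggle.
        self.halted = False
        self.expected = 0

    def receive(self, toggle):
        if self.halted:
            return "STALL"
        if toggle != self.expected:
            return "IGNORED"
        self.expected ^= 1
        return "ACK"


class HostEndpoint:
    """Host-controller-side endpoint state (toy model)."""

    def __init__(self, reset_requires_halted):
        # Hypothetical switch: True models the xHCI behavior described in
        # the thread, False models a host that always resets its toggle.
        self.reset_requires_halted = reset_requires_halted
        self.next_toggle = 0
        self.sees_halted = False

    def send(self, device_ep):
        result = device_ep.receive(self.next_toggle)
        if result == "ACK":
            self.next_toggle ^= 1
        elif result == "STALL":
            # The host only learns about the halt from a STALLed transfer.
            self.sees_halted = True
        return result

    def endpoint_reset(self):
        if self.reset_requires_halted and not self.sees_halted:
            return False  # "Endpoint x not halted, refusing to reset."
        self.next_toggle = 0
        self.sees_halted = False
        return True


def next_transfer_after_resync(reset_requires_halted, stalled_transfer=False):
    """Run the resync sequence and return the result of the next transfer."""
    dev = DeviceEndpoint()
    host = HostEndpoint(reset_requires_halted)
    host.send(dev)          # one good transfer: both sides now at toggle 1
    dev.set_halt()          # SetFeature(ENDPOINT_HALT)
    if stalled_transfer:
        host.send(dev)      # extra transfer that returns STALL
    dev.clear_halt()        # ClearFeature(ENDPOINT_HALT)
    host.endpoint_reset()   # host-side reset, possibly refused
    return host.send(dev)
```

In this model, `next_transfer_after_resync(True)` ends with the device ignoring the packet, while either a host that resets unconditionally or an intervening STALLed transfer keeps both sides in step.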
> The main thing that needs to be done is what xHCI specs states
> in an additional Note added to section 4.6.8 :
> " If software wishes reset the Data Toggle or Sequence Number of an
> endpoint that isn't in the Halted state, then software may issue a
> Configure Endpoint Command with the Drop and Add bits set for the
> target endpoint." But some other tweaking to how xhci driver handles
> STALL and clears halted endpoints is also needed.

Since the bus trace looks the same on Windows as on Linux (minus the incorrect sequence number and the failure), I assume this must be how it is done there?

Eric Gross

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
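As a rough illustration of the "Drop and Add bits set for the target endpoint" wording in the quoted note from section 4.6.8, the sketch below computes which bit that would be for a given endpoint address. It assumes the xHCI Device Context Index mapping (DCI = endpoint number * 2 + direction, with IN = 1 and the default control endpoint at DCI 1). The function names are hypothetical, and this is not the kernel's implementation.

```python
def device_context_index(ep_address):
    """Map a USB endpoint address (e.g. 0x81) to an xHCI Device Context Index."""
    number = ep_address & 0x0F
    is_in = 1 if (ep_address & 0x80) else 0
    if number == 0:
        return 1  # default control endpoint
    return number * 2 + is_in


def drop_add_flags(ep_address):
    """Return (drop_flags, add_flags) with only the target endpoint's bit set.

    Assumption: these are the values an Input Control Context would carry
    for a Configure Endpoint command that drops and re-adds one endpoint.
    """
    dci = device_context_index(ep_address)
    if dci < 2:
        # The drop flags for the slot context and the control endpoint
        # are reserved, so this approach does not apply to endpoint 0.
        raise ValueError("endpoint 0 cannot be dropped")
    bit = 1 << dci
    return bit, bit
```

For example, bulk IN endpoint 0x81 maps to DCI 3, so both flag words would be 0x8 in this sketch.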
XHCI Clear halt issue
Hi all,

I am implementing a driver (currently libusb-based, but may change to kernel-based eventually) for a USB standard class type that makes use of endpoint stalling as a synchronization mechanism to recover after error conditions between device and host (the reasons for needing it are a bit complex). The driver code I have been using works beautifully on Windows, some embedded OSes with proprietary USB stacks, and Linux via the EHCI driver. However, I ran into problems as soon as we started using this driver on XHCI systems (based off the 3.10 kernel).

The sequence the driver typically does when encountering an error (or thinking it needs to resync) is:
- Abort any pending URBs (may be several queued to the EP)
- Set Feature(HALT)
- Clear EP Stall
- Continue

What we saw with a bus analyzer was that, independent of host controller used (tested Intel and Renesas), the sequence number of the next outgoing packet (or toggle bit when in High Speed mode) was incorrect after clearing the stall. The device resets its expected sequence/toggle after un-stalling the EP and hence it ignores the next packet with the incorrect one. Interestingly, some devices are actually tolerant of this behavior and accept the incorrect sequence id, but any devices based on the Cypress FX3 (a large number of devices implementing this class type) fail.

When researching this issue I saw a number of previous posts hinting at known issues like this, but I have not seen a firm conclusion. It seems that some of the early responses by Sarah Sharp indicate that it is working this way by design (I admit I am not an expert in the XHCI spec). I see some newer posts referencing a "clear halt bug", but I have been unable to find what this definitively is referencing.
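For readers who want to reproduce it, the resync sequence listed above can be sketched roughly as follows. This is an illustration, not the actual driver code: it assumes a PyUSB-style device object exposing `ctrl_transfer()` and `clear_halt()`, the function name `resync_endpoint` is hypothetical, and aborting pending transfers is left to the caller.

```python
# Standard request values from chapter 9 of the USB 2.0 specification.
REQTYPE_OUT_STANDARD_ENDPOINT = 0x02  # host-to-device, standard, recipient endpoint
REQ_SET_FEATURE = 0x03
FEATURE_ENDPOINT_HALT = 0x00


def resync_endpoint(dev, ep_address):
    """Halt and then un-halt one endpoint so both sides restart in sync.

    Assumption: `dev` behaves like a PyUSB device, and the caller has
    already cancelled every transfer queued on `ep_address`.
    """
    # Set Feature(HALT): the device stalls the endpoint.
    dev.ctrl_transfer(
        REQTYPE_OUT_STANDARD_ENDPOINT,
        REQ_SET_FEATURE,
        wValue=FEATURE_ENDPOINT_HALT,
        wIndex=ep_address,
    )
    # Clear EP Stall: the device un-halts and resets its toggle/sequence,
    # and the host stack is expected to reset its own side as well.
    dev.clear_halt(ep_address)
```

The sketch only shows the order of requests. Whether the host-side toggle or sequence number is actually reset by the second step depends on the host controller driver, which is the open question in this thread.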
Based on my experience with how every other stack appears to work (including the Linux EHCI driver) and how the device is supposed to behave when it gets the clear stall request, I can't help but think that the current behavior is wrong.

I can provide any additional information (bus traces, testing results, etc.) as needed. If this is a known issue, I would also appreciate a pointer to the bugzilla entry (I have been unsuccessful in finding one) or to any previous discussion threads I may have missed.

Thanks,
Eric Gross
National Instruments