Hi Mathias,

 We have run into a problem with a USB printer which we're quite confident is a 
bug in the Linux xHCI driver. There is no problem when the same printer is 
plugged into a port managed by the EHCI driver.

 The core problem is that xhci_reset_endpoint() doesn't do anything, and more 
specifically does not reset the xHC's data toggle/sequence number. That is not 
normally an issue, because the reset does happen in response to a STALL; in our 
scenario, however, there is no STALL or any other error. The data toggle can 
then get out of sync, and the host drops a packet sent by the device.

 Now a detailed problem description. We have a USB printer passed through to a 
VM. The VM runs Windows 8.1 or 10 (other versions may be affected too), and 
uses Microsoft's standard usbprint.sys to talk to the printer. The vendor 
printer driver tries to query the printer's configuration, using the control 
endpoint, one OUT endpoint, and one IN endpoint. The query always times 
out/fails when the printer is plugged into a port managed by xHCI, yet works in 
EHCI ports.

 The usbprint.sys driver is a bit funny and in many cases (though not always) 
queues up URBs on the IN endpoint in advance, and once it decides that it has 
received the entire response, cancels the last URB and resets the IN endpoint 
(issuing ClearFeature(ENDPOINT_HALT)). After much head scratching, we realized, and 
later confirmed with a USB analyzer, that the next IN packet that the printer 
sends is not seen by the host's USB stack at all, let alone the guest OS. Other 
packets arrive just fine, but the guest OS keeps waiting for more data to 
arrive, eventually loses patience and fails.

 We cannot observe the data toggle state of the xHC, but we are fairly certain 
that things go wrong when the data toggle is set (on both ends) prior to the 
endpoint reset. ClearFeature(ENDPOINT_HALT) resets the toggle on the device, 
but not on the host. What we do know for a fact is that the device sends a 
packet (with data toggle 0) which the host USB stack never sees, and a data 
toggle mismatch explains that quite well.
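
 To make that reasoning concrete, here is a minimal model of the toggle 
handshake. It is only a sketch and assumes USB 2.0 bulk IN semantics, where 
the host ACKs a packet with an unexpected toggle but discards its data as a 
presumed retransmission. The struct and function names are made up for 
illustration and are not kernel code.

```c
#include <stdbool.h>

/* Hypothetical model of one bulk IN endpoint's toggle state. */
struct toggle_model {
    unsigned host;   /* toggle the host expects on the next IN packet */
    unsigned device; /* toggle the device will send on the next IN packet */
};

/*
 * Simulate one IN transaction. Returns true if the host accepts the data.
 * On a mismatch the host still ACKs, so the device advances its toggle
 * while the host keeps its own unchanged and throws the data away.
 */
bool in_transaction(struct toggle_model *m)
{
    bool accepted = (m->device == m->host);

    if (accepted)
        m->host ^= 1u;  /* host advances only when it accepts data */
    m->device ^= 1u;    /* device sees an ACK in both cases */
    return accepted;
}

/* Device side of an endpoint reset: the device's toggle goes back to 0. */
void device_clear_halt(struct toggle_model *m)
{
    m->device = 0;
}

/* Host side of an endpoint reset, which is the step we believe is missing. */
void host_reset_toggle(struct toggle_model *m)
{
    m->host = 0;
}
```

 Starting from {0, 0}, one accepted transaction leaves both toggles at 1. 
Calling only device_clear_halt() then makes the next in_transaction() return 
false (the packet is dropped) and the one after it return true, which matches 
the single lost packet we see. Calling host_reset_toggle() as well keeps every 
transaction accepted.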

 We are using USBFS to talk to the printer, but that shouldn't matter much. I 
will note that the available documentation<1> explicitly says that 
USBDEVFS_RESETEP and USBDEVFS_CLEAR_HALT both reset the data toggle. That is 
indeed the case for the Linux EHCI driver but not xHCI. Both of the USBFS 
IOCTLs call into xhci_reset_endpoint() which does nothing.

 We believe that xhci_reset_endpoint() needs to reset the data toggle/sequence 
number to match the documentation and for compatibility with the EHCI driver. 
We tried but failed to find a workaround which would reset the data toggle 
without side effects (e.g. USBDEVFS_SETINTERFACE does reset the toggle on the 
IN endpoint, but also resets it on the OUT endpoint and talks to the device, so 
that's no good).

 The data toggle management is not terribly well documented in the xHCI spec, 
so we hope you know more about it than we do. Based on our understanding of the 
xHCI specification, xhci_reset_endpoint() should issue either a Reset Endpoint 
command with TSP=0 or a dummy Configure Endpoint command dropping/re-adding the 
specified endpoint (as the xHCI 1.1 spec suggests at the end of 4.6.8). Please 
confirm whether that would solve the problem.
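
 For reference, this is how we read the control dword of a Reset Endpoint 
Command TRB. It is a sketch based on our understanding of the spec's TRB 
layout (TRB type 14, TSP in bit 9, Endpoint ID in bits 16-20, Slot ID in bits 
24-31). The helper names are hypothetical and this is not the driver's code. 
It also says nothing about whether the controller accepts the command for an 
endpoint that is not halted.

```c
#include <stdint.h>

enum { TRB_TYPE_RESET_ENDPOINT = 14 };

/* Device Context Index for an endpoint address, e.g. 0x81 (EP1 IN) -> 3. */
uint8_t ep_dci(uint8_t ep_addr)
{
    return (uint8_t)((ep_addr & 0x0f) * 2 + ((ep_addr & 0x80) ? 1 : 0));
}

/*
 * Build dword 3 of a Reset Endpoint Command TRB.
 * preserve_state != 0 sets TSP, which keeps the toggle/sequence state;
 * preserve_state == 0 (TSP=0) asks the xHC to reset it.
 */
uint32_t reset_ep_trb_control(uint8_t slot_id, uint8_t dci,
                              int preserve_state, int cycle)
{
    return ((uint32_t)slot_id << 24) |
           ((uint32_t)(dci & 0x1f) << 16) |
           ((uint32_t)TRB_TYPE_RESET_ENDPOINT << 10) |
           (preserve_state ? (1u << 9) : 0u) |
           (cycle ? 1u : 0u);
}
```

 For slot 1 and endpoint 0x81, TSP=0 with the cycle bit set would give 
reset_ep_trb_control(1, ep_dci(0x81), 0, 1).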

 We don't know how many devices this problem affects. We suspect it affects 
many USB printers and could in theory affect more or less any device, but few 
drivers reset endpoints when there are no errors. The problem scenario can 
probably be reproduced artificially with more or less any USB device: when the 
data toggle is set, issue USBDEVFS_CLEAR_HALT and see whether the next packet 
arrives at its destination.
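
 A rough sketch of such a reproduction through USBFS follows. It assumes that 
fd is an open /dev/bus/usb/BBB/DDD node with the interface already claimed, 
that the endpoint's toggle is currently 1 (an odd number of packets so far), 
and that the device has exactly one more packet to send. The function names 
are ours. Only the ioctls and struct usbdevfs_bulktransfer come from 
<linux/usbdevice_fs.h>.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/usbdevice_fs.h>

/* Issue USBDEVFS_CLEAR_HALT on endpoint address ep; 0 on success, -1 on error. */
int clear_halt(int fd, unsigned int ep)
{
    return ioctl(fd, USBDEVFS_CLEAR_HALT, &ep);
}

/* One synchronous bulk IN transfer; returns bytes read, or -1 with errno set. */
int bulk_in(int fd, unsigned int ep, void *buf, unsigned int len,
            unsigned int timeout_ms)
{
    struct usbdevfs_bulktransfer xfer = {
        .ep = ep,
        .len = len,
        .timeout = timeout_ms,
        .data = buf,
    };

    return ioctl(fd, USBDEVFS_BULK, &xfer);
}

/*
 * Returns 1 if the read after the endpoint reset timed out (packet presumably
 * dropped), 0 if data arrived, -1 on any other error.
 */
int next_packet_lost_after_clear_halt(int fd, unsigned int ep_in, void *buf,
                                      unsigned int len, unsigned int timeout_ms)
{
    if (clear_halt(fd, ep_in) < 0)
        return -1;
    if (bulk_in(fd, ep_in, buf, len, timeout_ms) >= 0)
        return 0;
    return errno == ETIMEDOUT ? 1 : -1;
}
```

 If the device keeps sending, the read would return the following packet 
instead of timing out, so the payload has to be compared against a USB 
analyzer trace to spot the missing one.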


      Regards,
         Michal


1:
  https://www.kernel.org/doc/htmldocs/usb/usbfs-ioctl.html