Hello, sorry for bringing back an old thread, but in all this excitement in the world right now, I kind of lost track of some stuff.

So the broad context is that I bought a USB scanner after choosing it from the list supported by SANE, hoping to use it on my OpenBSD home computer, and it worked only once after being plugged in.

on Thursday 13 February 2020 at 14:44, Natasha Kerensikova wrote:
> More details below, but TL;DR: the problem only happens on XHCI USB,
> everything is fine with the same code going through EHCI; and the
> problem is the loss of a short bulk xfer from the device, but it never
> happens on the first handle opened for the device file, only on
> subsequent uses of the same device.
>
> on Wednesday 12 February 2020 at 22:33, Natasha Kerensikova wrote:
> > Basically, if I use two pairs of libusb_bulk_transfer() in a row,
> > everything works fine, but if I release the interface and claim it
> > again between them, I reproduce the timeout problem.
> >
> > So at this point, I suspect that libusb_claim_interface() does not
> > exactly undo what libusb_release_interface() does.

Thanks to personal communications with mpi@ and patrick@, we found that there is indeed an XHCI-specific persistent state which is not reset. That state is the PID which alternates between DATA0 and DATA1 to detect packet loss, and when the OS and the device are out of sync, communication is no longer possible.

patrick@ suggested changing ugen(4) to reset the PID state recorded by the OS, using usbd_clear_endpoint_stall(), so that both OpenBSD and the device continue on DATA0. He advised calling it after opening the device, which did not work, but maybe I did it wrong, so I left it in the commented lines in the patch below.

I moved it to right before closing the device, which is the non-commented line in the patch below, and it fixes the scanner for me, without breaking my other ugen(4) device (a digital camera used with gphoto2 port). To be honest I don't really understand why one works and not the other.

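
A toy model of the DATA0/DATA1 toggle may help make the failure mode concrete. This is only a sketch of the mechanism as generally described for USB bulk pipes, not OpenBSD or xhci(4) code: the names toggle_end, transfer_one() and run_session() are made up for illustration, and which end actually gets reset when the device file is reopened is an assumption here, not something established above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: one end of a bulk pipe and the data PID it will use next. */
struct toggle_end {
	unsigned int next;	/* 0 = DATA0, 1 = DATA1 */
};

/*
 * Move one packet from sender to receiver.
 * On a PID match the payload is accepted and both ends flip their toggle.
 * On a mismatch the receiver assumes a retransmission: it acknowledges but
 * discards the payload and keeps its toggle, while the sender still flips.
 * Returns true if the payload was actually delivered.
 */
static bool
transfer_one(struct toggle_end *sender, struct toggle_end *receiver)
{
	bool delivered = (sender->next == receiver->next);

	sender->next ^= 1;
	if (delivered)
		receiver->next ^= 1;
	return delivered;
}

/* Send n packets and count how many payloads reach the receiver. */
static size_t
run_session(struct toggle_end *sender, struct toggle_end *receiver, size_t n)
{
	size_t i, delivered = 0;

	assert(sender != NULL && receiver != NULL);
	for (i = 0; i < n; i++)
		if (transfer_one(sender, receiver))
			delivered++;
	return delivered;
}
```

In this model, if both ends start on DATA0, a one-packet session leaves both on DATA1. If only one end is then put back to DATA0 (hypothetically, what a reopen might do), the first packet of the next session is acknowledged but dropped, which would look like a lost short bulk transfer. Putting both ends back to DATA0 avoids the loss.
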
I don't really have much idea of what I'm doing (whether in OpenBSD src or in XHCI stuff), but I can testify it does fix the scanner issue I had. So the patch below is intended more as the beginning of a discussion than a merge proposition.

In case it matters, most of my actual use was done on top of codebase https://github.com/openbsd/src/commit/1a69f90406bdc08a7da080e105fa608babda44ed but since then there are only 2 commits on ugen(4) and 2 commits on xhci(4) and none of them seem to affect the issue.

Thanks for your time,
Natasha

--- a/sys/dev/usb/ugen.c
+++ b/sys/dev/usb/ugen.c
@@ -309,6 +309,7 @@ ugenopen(dev_t dev, int flag, int mode, struct proc *p)
 				    edesc->bEndpointAddress, 0, &sce->pipeh);
 				if (err)
 					return (EIO);
+//				usbd_clear_endpoint_stall(sce->pipeh);
 				break;
 			}
 			isize = UGETW(edesc->wMaxPacketSize);
@@ -329,6 +330,7 @@ ugenopen(dev_t dev, int flag, int mode, struct proc *p)
 				clfree(&sce->q);
 				return (EIO);
 			}
+//			usbd_clear_endpoint_stall(sce->pipeh);
 			DPRINTFN(5, ("ugenopen: interrupt open done\n"));
 			break;
 		case UE_BULK:
@@ -336,6 +338,7 @@ ugenopen(dev_t dev, int flag, int mode, struct proc *p)
 				  edesc->bEndpointAddress, 0, &sce->pipeh);
 			if (err)
 				return (EIO);
+//			usbd_clear_endpoint_stall(sce->pipeh);
 			break;
 		case UE_ISOCHRONOUS:
 			if (dir == OUT)
@@ -387,6 +390,7 @@ ugenopen(dev_t dev, int flag, int mode, struct proc *p)
 			sce->timeout = USBD_DEFAULT_TIMEOUT;
 			return (EINVAL);
 		}
+//		usbd_clear_endpoint_stall(sce->pipeh);
 	}
 	sc->sc_is_open[endpt] = 1;
 	return (0);
@@ -441,6 +445,7 @@ ugen_do_close(struct ugen_softc *sc, int endpt, int flag)
 		DPRINTFN(5, ("ugenclose: endpt=%d dir=%d sce=%p\n",
 			     endpt, dir, sce));
+		usbd_clear_endpoint_stall(sce->pipeh);
 		usbd_close_pipe(sce->pipeh);
 		sce->pipeh = NULL;