On Fri, 10 Jul 2020 14:32:47 +0200 Marcus Glocker <mar...@nazgul.ch> wrote:
> On Fri, 10 Jul 2020 14:19:00 +0200 > Patrick Wildt <patr...@blueri.se> wrote: > > > On Fri, Jul 10, 2020 at 01:00:31PM +0200, Marcus Glocker wrote: > > > Hi All, > > > > > > Laurence Tratt was reporting about corrupted video images when > > > using uvideo(4) on xhci(4) with MJPEG and higher resolutions, > > > e.g. when using ffplay: > > > > > > $ ffplay -f v4l2 -input_format mjpeg -video_size 1280x720 -f > > > /dev/video1 > > > > > > When trying to re-produce the issue on my side, the video images > > > were still OK, but I also could see a lot of errors returned by > > > ffplay related to corrupted MJPEG data. > > > > > > When debugging the error I noticed that the corruption only > > > happens when TRBs are get chained due to hitting a 64k boundary, > > > and therefore require to queue up two TRBs. So, what happened? > > > > > > In xhci.c:xhci_event_xfer_isoc() we do the accounting of the > > > processed, chained TD: > > > > > > /* > > > * If we queued two TRBs for a frame and this is the second TRB, > > > * check if the first TRB needs accounting since it might not have > > > * raised an interrupt in case of full data received. > > > */ > > > > > > If there was a zero length transfer with such a TD, and the 2. TRB > > > receives the interrupt with len=0, it assumes that the 1. TRB was > > > processed (but it wasn't), and adds the requested 1. TRB length to > > > actlen, passing back that data has been received, which is > > > obviously wrong, resulting in the seen data corruptions. > > > > > > Instead, when the 2. TRB receives the IOC interrupt with len=0, we > > > need to ignore the 1. TRB, considering this was a zero length > > > transfer. > > > > > > With the below diff Laurent is getting a clear video image now > > > with high resolutions, and I could remove all the output errors > > > from ffplay in my case. I also tested with an uaudio(4) device. > > > > > > Feedback, more testers, OKs? > > > > > > Thanks, > > > Marcus > > > > > > > What if the first TRB was filled completely, but the second TRB > > transferred 0? Wouldn't we then need to add 1. TRB? I think your > > diff wouldn't do that. > > Right - That was the only case I was thinking about as well which > wouldn't get handled at the moment, but I wasn't sure if it can be a > case at all (according to your comment it can :-) > > > I think what we actually need is some "processed" flag. If you look > > on bugs@ or so, there's a diff that tries to fix the same issue by > > adding a "CHAINED" or so flag. I'm not sure if that's the right > > way. > > > > But, anyway, I think we need a way to say "this TRB got an > > interrupt" or "this TRB was processed". Sorry, I just did see the "xhci: zero length multi-TRB inbound xfer does not work" mail thread now. Honestly, at the moment I also don't know what's the right way to implement this. I need to understand the approach from sc.dyings diff a bit better first. > And I was thinking about this too. > > I think it should be more something like "this TRB was processed", > since at the moment we don't receive an interrupt for the 1st TRB. > > Two questions related to that: > > 1. Setting XHCI_TRB_IOC together with XHCI_TRB_CHAIN is probably not > something which would work with our current code nor supported with > the specs? > > 2. Is there already something, somewhere which sets a status flag when > the 1. TRB was touched, or is it something we really need to write > an own handler for from scratch? > > > > > > > Index: xhci.c > > > =================================================================== > > > RCS file: /cvs/src/sys/dev/usb/xhci.c,v > > > retrieving revision 1.116 > > > diff -u -p -u -p -r1.116 xhci.c > > > --- xhci.c 30 Jun 2020 10:21:59 -0000 1.116 > > > +++ xhci.c 10 Jul 2020 05:57:11 -0000 > > > @@ -932,6 +932,7 @@ xhci_event_xfer_isoc(struct usbd_xfer *x > > > struct usbd_xfer *skipxfer; > > > struct xhci_xfer *xx = (struct xhci_xfer *)xfer; > > > int trb0_idx, frame_idx = 0; > > > + uint32_t len; > > > > > > KASSERT(xx->index >= 0); > > > trb0_idx = > > > @@ -945,6 +946,7 @@ xhci_event_xfer_isoc(struct usbd_xfer *x > > > if (trb0_idx++ == (xp->ring.ntrb - 1)) > > > trb0_idx = 0; > > > } > > > + len = > > > XHCI_TRB_LEN(letoh32(xp->ring.trbs[trb_idx].trb_status)) - > > > remain; /* > > > * If we queued two TRBs for a frame and this is the > > > second TRB, @@ -954,18 +956,17 @@ xhci_event_xfer_isoc(struct > > > usbd_xfer *x if ((letoh32(xp->ring.trbs[trb_idx].trb_flags) & > > > XHCI_TRB_TYPE_MASK) == XHCI_TRB_TYPE_NORMAL) { > > > frame_idx--; > > > - if (trb_idx == 0) > > > - trb0_idx = xp->ring.ntrb - 2; > > > - else > > > - trb0_idx = trb_idx - 1; > > > - if (xfer->frlengths[frame_idx] == 0) { > > > + if (len > 0 && xfer->frlengths[frame_idx] == 0) { > > > + if (trb_idx == 0) > > > + trb0_idx = xp->ring.ntrb - 2; > > > + else > > > + trb0_idx = trb_idx - 1; > > > xfer->frlengths[frame_idx] = > > > XHCI_TRB_LEN(letoh32( xp->ring.trbs[trb0_idx].trb_status)); > > > } > > > } > > > > > > - xfer->frlengths[frame_idx] += > > > - > > > XHCI_TRB_LEN(letoh32(xp->ring.trbs[trb_idx].trb_status)) - remain; > > > + xfer->frlengths[frame_idx] += len; > > > xfer->actlen += xfer->frlengths[frame_idx]; > > > > > > if (xx->index != trb_idx) > > > >