Re: xhci(4) isoc: fix bogus handling of chained TRBs

Marcus Glocker Sat, 11 Jul 2020 02:58:14 -0700

On Fri, 10 Jul 2020 14:32:47 +0200
Marcus Glocker <mar...@nazgul.ch> wrote:


> On Fri, 10 Jul 2020 14:19:00 +0200
> Patrick Wildt <patr...@blueri.se> wrote:
> 
> > On Fri, Jul 10, 2020 at 01:00:31PM +0200, Marcus Glocker wrote:  
> > > Hi All,
> > > 
> > > Laurence Tratt was reporting about corrupted video images when
> > > using uvideo(4) on xhci(4) with MJPEG and higher resolutions,
> > > e.g. when using ffplay:
> > > 
> > > $ ffplay -f v4l2 -input_format mjpeg -video_size 1280x720 -f
> > > /dev/video1
> > > 
> > > When trying to re-produce the issue on my side, the video images
> > > were still OK, but I also could see a lot of errors returned by
> > > ffplay related to corrupted MJPEG data.
> > > 
> > > When debugging the error I noticed that the corruption only
> > > happens when TRBs are get chained due to hitting a 64k boundary,
> > > and therefore require to queue up two TRBs.  So, what happened?
> > > 
> > > In xhci.c:xhci_event_xfer_isoc() we do the accounting of the
> > > processed, chained TD:
> > > 
> > > /*
> > >  * If we queued two TRBs for a frame and this is the second TRB,
> > >  * check if the first TRB needs accounting since it might not have
> > >  * raised an interrupt in case of full data received.
> > >  */
> > > 
> > > If there was a zero length transfer with such a TD, and the 2. TRB
> > > receives the interrupt with len=0, it assumes that the 1. TRB was
> > > processed (but it wasn't), and adds the requested 1. TRB length to
> > > actlen, passing back that data has been received, which is
> > > obviously wrong, resulting in the seen data corruptions.
> > > 
> > > Instead, when the 2. TRB receives the IOC interrupt with len=0, we
> > > need to ignore the 1. TRB, considering this was a zero length
> > > transfer.
> > > 
> > > With the below diff Laurent is getting a clear video image now
> > > with high resolutions, and I could remove all the output errors
> > > from ffplay in my case.  I also tested with an uaudio(4) device.
> > > 
> > > Feedback, more testers, OKs?
> > > 
> > > Thanks,
> > > Marcus
> > >     
> > 
> > What if the first TRB was filled completely, but the second TRB
> > transferred 0?  Wouldn't we then need to add 1. TRB?  I think your
> > diff wouldn't do that.  
> 
> Right - That was the only case I was thinking about as well which
> wouldn't get handled at the moment, but I wasn't sure if it can be a
> case at all (according to your comment it can :-)
> 
> > I think what we actually need is some "processed" flag.  If you look
> > on bugs@ or so, there's a diff that tries to fix the same issue by
> > adding a "CHAINED" or so flag.  I'm not sure if that's the right
> > way.
> > 
> > But, anyway, I think we need a way to say "this TRB got an
> > interrupt" or "this TRB was processed".  

Sorry, I just did see the "xhci: zero length multi-TRB inbound xfer does
not work" mail thread now.

Honestly, at the moment I also don't know what's the right way to
implement this.  I need to understand the approach from sc.dyings diff
a bit better first.

> And I was thinking about this too.
> 
> I think it should be more something like "this TRB was processed",
> since at the moment we don't receive an interrupt for the 1st TRB.
> 
> Two questions related to that:
> 
> 1. Setting XHCI_TRB_IOC together with XHCI_TRB_CHAIN is probably not
>    something which would work with our current code nor supported with
>    the specs?
> 
> 2. Is there already something, somewhere which sets a status flag when
>    the 1. TRB was touched, or is it something we really need to write
>    an own handler for from scratch?
> 
> > > 
> > > Index: xhci.c
> > > ===================================================================
> > > RCS file: /cvs/src/sys/dev/usb/xhci.c,v
> > > retrieving revision 1.116
> > > diff -u -p -u -p -r1.116 xhci.c
> > > --- xhci.c        30 Jun 2020 10:21:59 -0000      1.116
> > > +++ xhci.c        10 Jul 2020 05:57:11 -0000
> > > @@ -932,6 +932,7 @@ xhci_event_xfer_isoc(struct usbd_xfer *x
> > >   struct usbd_xfer *skipxfer;
> > >   struct xhci_xfer *xx = (struct xhci_xfer *)xfer;
> > >   int trb0_idx, frame_idx = 0;
> > > + uint32_t len;
> > >  
> > >   KASSERT(xx->index >= 0);
> > >   trb0_idx =
> > > @@ -945,6 +946,7 @@ xhci_event_xfer_isoc(struct usbd_xfer *x
> > >           if (trb0_idx++ == (xp->ring.ntrb - 1))
> > >                   trb0_idx = 0;
> > >   }
> > > + len =
> > > XHCI_TRB_LEN(letoh32(xp->ring.trbs[trb_idx].trb_status)) -
> > > remain; /*
> > >    * If we queued two TRBs for a frame and this is the
> > > second TRB, @@ -954,18 +956,17 @@ xhci_event_xfer_isoc(struct
> > > usbd_xfer *x if ((letoh32(xp->ring.trbs[trb_idx].trb_flags) &
> > > XHCI_TRB_TYPE_MASK) == XHCI_TRB_TYPE_NORMAL) {
> > >           frame_idx--;
> > > -         if (trb_idx == 0)
> > > -                 trb0_idx = xp->ring.ntrb - 2;
> > > -         else
> > > -                 trb0_idx = trb_idx - 1;
> > > -         if (xfer->frlengths[frame_idx] == 0) {
> > > +         if (len > 0 && xfer->frlengths[frame_idx] == 0) {
> > > +                 if (trb_idx == 0)
> > > +                         trb0_idx = xp->ring.ntrb - 2;
> > > +                 else
> > > +                         trb0_idx = trb_idx - 1;
> > >                   xfer->frlengths[frame_idx] =
> > > XHCI_TRB_LEN(letoh32( xp->ring.trbs[trb0_idx].trb_status));
> > >           }
> > >   }
> > >  
> > > - xfer->frlengths[frame_idx] +=
> > > -
> > > XHCI_TRB_LEN(letoh32(xp->ring.trbs[trb_idx].trb_status)) - remain;
> > > + xfer->frlengths[frame_idx] += len;
> > >   xfer->actlen += xfer->frlengths[frame_idx];
> > >  
> > >   if (xx->index != trb_idx)
> > >     
>

Re: xhci(4) isoc: fix bogus handling of chained TRBs

Reply via email to