Re: [linux-usb-devel] [PATCH 15/15] usbaudio retries EL2NSYNC

Alan Stern Mon, 09 Oct 2006 20:41:49 -0700

On Mon, 9 Oct 2006, David Brownell wrote:

> On Saturday 07 October 2006 12:47 pm, Alan Stern wrote:
> > On Sat, 7 Oct 2006, Christopher "Monty" Montgomery wrote:
> > 
> > > Let's establish how to report missed slots, I'll update the patches,
> > > and then try to figure out what's going on with usbaudio.
> > 
> > Okay, let's move on to discuss this.  Part of the problem with our earlier
> > discussion has been that there are actually 3 queues, any of which can dry
> > up:
> > 
> >     Queue A is the flow of data from the application to snd-usb-audio
> >     or some other high-level driver.  If this queue drains it is an
> >     xrun and quite probably loss-of-sync.  The driver is free to
> >     report the error to the application any way it wants to.  This is 
> >     where application latency matters.
> > 
> >     Queue B is the flow of URBs from the driver to ehci-hcd.  If this
> 
> Such issues are not unique to EHCI of course ... so in this email, it's
> safe to just read "HCD" unless something specific to that driver is
> being discussed.


Yes.

> >     queue drains then the bandwidth is deallocated, something you
> >     desperately want to avoid.  It's up to the higher-level driver to
> >     keep the queue non-empty, even if that means submitting URBs with
> >     dummy data.  Latency has no effect here.
> > 
> >     Queue C is the flow of packets from the host controller to the
> >     device.  If this queue drains it is a loss-of-sync. 
> 
> And queue C would never drain unless queue B drains first... since they
> are coupled one-to-one.  No packet goes to/from the peripheral (C) unless
> it's been told to do so by the driver (B).

Not true at all.  Queue C drains (i.e., the device is left with a gap in
its data stream) whenever a slot isn't filled in time.  This can be caused
by excessive kernel latency, even if queue B remains full.

> One issue being that the driver (ALSA usb, V4L2, etc) normally uses
> URB completions (from the HCD) to drive motion through queue B.  And
> those completions are IRQ-driven, so IRQ latency is a factor in being
> able to notice that B has actually emptied.

?  It's a factor in noticing that _C_ has actually emptied.  Queue B has 
nothing directly to do with the passage of time; it will remain full 
provided the completion handler always resubmits, regardless of how 
quickly or slowly the handler is called.


> > So we only need to consider errors caused by Queue C draining.  Currently 
> > there is no standardized way to report these errors back to the 
> > higher-level driver.  Looking through Documentation/usb/error-codes.txt,
> > the closest thing we see are these wonderful entries:
> > 
> > -EXDEV                      ISO transfer only partially completed
> >                     look at individual frame status for details
> 
> ... that can itself be the individual frame status though!

Well yes, that's the idea.  When a slot is missed, the frame status is set 
to -EXDEV.
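
To make that concrete, here is a minimal userspace sketch of how a driver's
completion handler might count missed slots by walking the per-frame status
values.  The struct and function names are hypothetical stand-ins, not the
kernel's own definitions; only the use of -EXDEV as the per-frame status for a
missed slot comes from the discussion above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one per-frame descriptor of an ISO URB. */
struct mock_iso_frame {
    int status;   /* 0 on success, negative errno on failure */
};

/* Count the frames whose status marks a missed slot (-EXDEV). */
static size_t count_missed_slots(const struct mock_iso_frame *frames, size_t n)
{
    size_t missed = 0;

    assert(frames != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (frames[i].status == -EXDEV)
            missed++;
    }
    return missed;
}
```

A driver could use a count like this to decide how far its stream has slipped
before it resubmits.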

> > -EINVAL                     ISO madness, if this happens: Log off and go home
> 
> ... and I'm not sure anyone reports that any more, at least as ISO
> frame status.

Then we can remove it from the documentation.  I don't know what it ever 
was supposed to mean.


> > To be definite, let's suppose a periodic stream has been allocated a
> > specific series of slots, and up until slot N-1 everything has been okay.
> 
> Where "series" == (u)frames number BASE + X * PERIOD, for all X.

Yes.
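
As a small illustration of that arithmetic, here is a sketch with hypothetical
helper names.  It ignores frame-counter wraparound, which a real HCD would
have to handle.

```c
#include <assert.h>

/* (u)frame number of slot X in the series BASE + X * PERIOD. */
static unsigned slot_uframe(unsigned base, unsigned period, unsigned x)
{
    assert(period > 0);
    return base + x * period;
}

/* Index X of the first slot whose (u)frame number is >= now. */
static unsigned next_slot_index(unsigned base, unsigned period, unsigned now)
{
    assert(period > 0);
    if (now <= base)
        return 0;
    /* Round (now - base) / period up to the next whole slot. */
    return (now - base + period - 1) / period;
}
```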

> > Now URB U is submitted, supposedly starting in slot N.  But something has
> > gone wrong, and ehci-hcd isn't able to add U to the hardware schedule in
> > time for slot N to be filled.  What should happen?
> > 
> > Sometimes it will be apparent at submission time that U is already too 
> > late.  For instance, slot N's microframe might already be over.  In such 
> > cases it is possible to return a submission error.  Let's call this option 
> > #1.
> 
> Right, and that's the intent of the current reporting of EL2NSYNC.

Which is undocumented and hence a new addition to the API.

> > Sometimes it won't be apparent until later that the slot was missed.  
> > When this happens, ehci-hcd is unable to report a submission error for U.
> > Another possible approach is to report a submission error for the
> > following URB, U+1.  Let's call this option #2.
> 
> Don't much like that one.  Requires HCDs to record history that would
> otherwise not be needed, and interrogate it.

I don't like it either; even less than #1.  However Monty seemed to 
mention it, so I brought it up for discussion.

> > The only other reasonable option, #3, is to report an error upon the 
> > completion of U.  These two events (submission and completion) are the 
> > only chances ehci-hcd has to communicate with a higher-level driver.
> 
> This #3 is when -EXDEV gets reported, and/or noticing the start_frame
> hiccup.  I see no way around having these.

It seems clear that #3 is unavoidable, because there are circumstances in 
which the HCD is unable to use #1 or #2.  The real question is whether #1 
should be used at all.

> > As Dave has already mentioned, options #1 and #2 share certain practical
> > problems.  Returning an error for a submission when in fact the submission
> > was accepted is _not_ a good policy.  The return code could be a positive
> > value, not a negative -Exxx code.  Then it would be necessary to audit all
> > the USB drivers that use ISO endpoints, because there could easily be
> > cases where the driver checks only for 0 or nonzero.
> 
> > That audit should be easy enough for the in-core drivers.

Auditing isn't enough; the drivers have to be fixed up to handle these 
faults in a reasonable way.  Unless ignoring them _is_ reasonable -- in 
which case why bother to report them?

For example, suppose the URB contains multiple slots and some of them have
already been missed while the rest are still okay.  Then the driver
doesn't have to do anything at all to catch back up.

The issue then becomes, what if all the slots in the URB have been missed 
(or might be missed, since the HCD can't tell in cases where there's a 
close call)?  Okay, I admit, in this situation it makes sense to reject 
the submission entirely.  Maybe also set urb->start_frame to the next 
available slot.
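
Roughly, the decision at submission time could look like the sketch below.
Everything here is hypothetical: the names, the rule that a slot counts as
missed once its (u)frame number is <= the current one, and the absence of
wraparound handling are all assumptions made for illustration, not what any
HCD does today.

```c
#include <assert.h>

enum submit_verdict { SUBMIT_OK, SUBMIT_PARTIAL, SUBMIT_REJECT };

struct submit_result {
    enum submit_verdict verdict;
    unsigned missed;      /* slots of this URB already lost */
    unsigned next_slot;   /* first slot in the series still available */
};

/*
 * The URB covers nslots slots at start, start + period, ...
 * A slot is assumed missed if its (u)frame number is <= now.
 */
static struct submit_result classify_submission(unsigned start, unsigned period,
                                                unsigned nslots, unsigned now)
{
    struct submit_result r;
    unsigned k = 0;   /* slots of the series at or before now */

    assert(period > 0 && nslots > 0);
    if (now >= start)
        k = (now - start) / period + 1;

    r.missed = k < nslots ? k : nslots;
    r.next_slot = start + k * period;

    if (r.missed == 0)
        r.verdict = SUBMIT_OK;        /* nothing lost, accept as-is */
    else if (r.missed == nslots)
        r.verdict = SUBMIT_REJECT;    /* every slot gone, reject entirely */
    else
        r.verdict = SUBMIT_PARTIAL;   /* accept; the rest is still okay */
    return r;
}
```

In the reject case next_slot plays the role suggested above for
urb->start_frame: it tells the driver where the stream can pick up again.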

> > We could in fact return an error code and _not_ accept the submission,
> > counting on the driver to realize what went wrong and make a new
> > submission.  I think this is overly complicated.  And it provides no hint
> > as to exactly how many slots were missed.  Finally, it increases the
> > amount of work performed by both the higher-level driver and ehci-hcd --
> > exactly the sort of thing you _don't_ want to do when a stream is lagging
> > behind.
> 
> I don't know how ALSA works just now, but I suspect that it'd be better
> to return a status code meaning "I couldn't queue this, your driver is
> N uframes behind" so that the driver could retry intelligently by just
> skipping those uframes.
> 
> So this question is best answered by the drivers who would be using
> that recovery mechanism ... notably ALSA and V4L2.

We don't currently have any way for a driver to tell the HCD it wants to
skip N (u)frames.  I suppose urb->start_frame could be used for this
purpose.

Alan Stern

