Re: [linux-usb-devel] [PATCH 15/15] usbaudio retries EL2NSYNC

David Brownell Sun, 15 Oct 2006 15:46:24 -0700

> > >   Queue B is the flow of URBs from the driver to ehci-hcd.  If this
> > >   queue drains then the bandwidth is deallocated, something you
> > >   desperately want to avoid.  It's up to the higher-level driver to
> > >   keep the queue non-empty, even if that means submitting URBs with
> > >   dummy data.  Latency has no effect here.
> > > 
> > >   Queue C is the flow of packets from the host controller to the
> > >   device.  If this queue drains it is a loss-of-sync. 
> > 
> > And queue C would never drain unless queue B drains first... since they
> > are coupled one-to-one.  No packet goes to/from the peripheral (C) unless
> > it's been told to do so by the driver (B).
> 
> Not true at all.  Queue C drains (i.e., the device is left with a gap in
> its data stream) whenever a slot isn't filled in time.  This can be caused
> by excessive kernel latency, even if queue B remains full.


I guess you're defining things differently than I am then.  Neither OHCI
nor EHCI supports the steady-state notion of an URB that's been handed to
the HCD which has NOT been handed to the hardware.  And even the spinlocked
code paths where the driver is adding to the HCD queue B can't realistically
be said to complete before adding to the HC queue C.

That is, the notion of an URB that's on queue B (handed to HCD) yet is
not on queue C (handed to HC) seems nonsensical to me.


> > One issue being that the driver (ALSA usb, V4L2, etc) normally uses
> > URB completions (from the HCD) to drive motion through queue B.  And
> > those completions are IRQ-driven, so IRQ latency is a factor in being
> > able to notice that B has actually emptied.
> 
> ?  It's a factor in noticing that _C_ has actually emptied.  Queue B has 
> nothing directly to do with the passage of time; it will remain full 
> provided the completion handler always resubmits, regardless of how 
> quickly or slowly the handler is called.

See above.  If C is emptied, then B is emptied.


> > > So we only need to consider errors caused by Queue C draining.  Currently 
> > > there is no standardized way to report these errors back to the 
> > > higher-level driver.  Looking through Documentation/usb/error-codes.txt, 
> > > the closest thing we see are these wonderful entries:
> > > 
> > > -EXDEV     ISO transfer only partially completed
> > >            look at individual frame status for details
> > 
> > ... that can itself be the individual frame status though!
> 
> Well yes, that's the idea.  When a slot is missed, the frame status is set 
> to -EXDEV.

Hence "look at individual frame status" is going to get you back to -EXDEV;
maybe that's some of the wonderfulness you were hinting at.  :)
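
To make the "look at individual frame status" step concrete, here is a
minimal userspace sketch of walking the per-packet status values and
counting the ones marked -EXDEV.  The names struct iso_pkt and
count_missed() are made up for illustration; they only stand in for the
per-packet descriptors an ISO URB carries and are not kernel API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one per-packet descriptor of an ISO URB.
 * Only the status field matters for this sketch. */
struct iso_pkt {
    int status;   /* 0 on success, negative errno on failure */
};

/* Count the packets whose status reports a missed slot (-EXDEV). */
static size_t count_missed(const struct iso_pkt *pkts, size_t n)
{
    size_t missed = 0;

    for (size_t i = 0; i < n; i++) {
        if (pkts[i].status == -EXDEV)
            missed++;
    }
    return missed;
}
```

A completion handler could use a count like this to decide whether the
URB had a gap at all before doing anything more expensive.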


> > > -EINVAL    ISO madness, if this happens: Log off and go home
> > 
> > ... and I'm not sure anyone reports that any more, at least as ISO
> > frame status.
> 
> Then we can remove it from the documentation.  I don't know what it ever 
> was supposed to mean.

Pavel was up way too late one night! 


> > > To be definite, let's suppose a periodic stream has been allocated a
> > > specific series of slots, and up until slot N-1 everything has been okay.
> > 
> > Where "series" == (u)frames number BASE + X * PERIOD, for all X.
> 
> Yes.
> 
> > > Now URB U is submitted, supposedly starting in slot N.  But something has
> > > gone wrong, and ehci-hcd isn't able to add U to the hardware schedule in
> > > time for slot N to be filled.  What should happen?
> > > 
> > > Sometimes it will be apparent at submission time that U is already too 
> > > late.  For instance, slot N's microframe might already be over.  In such 
> > > cases it is possible to return a submission error.  Let's call this 
> > > option #1.
> > 
> > Right, and that's the intent of the current reporting of EL2NSYNC.
> 
> Which is undocumented and hence a new addition to the API.

Well, "new" ~= 3+ years by now.  You have to know that specific documentation
has never been complete/accurate.

 
> > > Sometimes it won't be apparent until later that the slot was missed.  
> > > When this happens, ehci-hcd is unable to report a submission error for U.  
> > > Another possible approach is to report a submission error for the
> > > following URB, U+1.  Let's call this option #2.
> > 
> > Don't much like that one.  Requires HCDs to record history that would
> > otherwise not be needed, and interrogate it.
> 
> I don't like it either; even less than #1.  However Monty seemed to 
> mention it, so I brought it up for discussion.

OK.


> > > The only other reasonable option, #3, is to report an error upon the 
> > > completion of U.  These two events (submission and completion) are the 
> > > only chances ehci-hcd has to communicate with a higher-level driver.
> > 
> > This #3 is when -EXDEV gets reported, and/or noticing the start_frame
> > hiccup.  I see no way around having these.
> 
> It seems clear that #3 is unavoidable, because there are circumstances in 
> which the HCD is unable to use #1 or #2.  The real question is whether #1 
> should be used at all.

No; I'd say that #1 and #3 are significantly different faults.

Which makes the question different:  whether two such faults should
be combined into one report, thereby discarding information that
some drivers would be able to use.


 
> > > As Dave has already mentioned, options #1 and #2 share certain practical
> > > problems.  Returning an error for a submission when in fact the submission
> > > was accepted is _not_ a good policy.  The return code could be a positive
> > > value, not a negative -Exxx code.  Then it would be necessary to audit all
> > > the USB drivers that use ISO endpoints, because there could easily be
> > > cases where the driver checks only for 0 or nonzero.
> > 
> > That audit should be easy enough for the in-core drivers.
> 
> Auditing isn't enough; the drivers have to be fixed up to handle these 
> faults in a reasonable way.  Unless ignoring them _is_ reasonable -- in 
> which case why bother to report them?

As Monty said:  one issue is just "no regressions".

Another is that in _some_ cases ignoring is reasonable; but that does
not mean it's always going to be reasonable.   That's the reason to "bother"
reporting them even if ALSA, or near-term tweaks to ALSA, don't make
effective use of them.

 
> For example, suppose the URB contains multiple slots and some of them have
> already been missed while the rest are still okay.  Then the driver
> doesn't have to do anything at all to catch back up.

Erm, URBs don't contain slots.  They contain packets.  Packets get put
into slots by the HCD.

I'm catching up on some of this email, but this is that case where I had
pointed out you were assuming a new/different scheduling policy.  If the
policy is the existing "ASAP", then there *WILL* be a gap ...

If there's any catching-up to be done, something has to do it, and "ASAP"
policy inside an HCD does not (so far as I've ever understood it) do that.

 
> The issue then becomes, what if all the slots in the URB have been missed 
> (or might be missed, since the HCD can't tell in cases where there's a 
> close call)?  Okay, I admit, in this situation it makes sense to reject 
> the submission entirely.  Maybe also set urb->start_frame to the next 
> available slot.

Hmm, again we have different interpretations of what ASAP means.  In my
book, this XRUN case could easily be scheduled ... because ASAP would
clearly mean (as in the case right above!!) "starting right now".
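
As a rough illustration of the arithmetic involved, assuming the stream's
slots are BASE + X * PERIOD as described earlier in the thread: the sketch
below finds the first slot at or after "now" and how many slots were
skipped to get there.  The function names are hypothetical, and frame
counter wraparound (which a real HCD has to deal with) is ignored.

```c
#include <stdint.h>

/* First slot in the series base + x * period that is >= now.
 * Assumes period > 0; ignores frame-counter wraparound. */
static uint32_t next_slot(uint32_t base, uint32_t period, uint32_t now)
{
    uint32_t x;

    if (now <= base)
        return base;
    x = (now - base + period - 1) / period;   /* round up */
    return base + x * period;
}

/* Number of slots, starting at 'expected', that are already in the past.
 * 'expected' must itself be a slot of the series. */
static uint32_t slots_missed(uint32_t expected, uint32_t period, uint32_t now)
{
    return (next_slot(expected, period, now) - expected) / period;
}
```

With slots at 3, 11, 19, 27 and the current frame at 20, next_slot()
gives 27, and a stream that expected to start at 11 has missed two slots.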


> > > We could in fact return an error code and _not_ accept the submission,
> > > counting on the driver to realize what went wrong and make a new
> > > submission.  I think this is overly complicated.  And it provides no hint
> > > as to exactly how many slots were missed.  Finally, it increases the
> > > amount of work performed by both the higher-level driver and ehci-hcd --
> > > exactly the sort of thing you _don't_ want to do when a stream is lagging
> > > behind.
> > 
> > I don't know how ALSA works just now, but I suspect that it'd be better
> > to return a status code meaning "I couldn't queue this, your driver is
> > N uframes behind" so that the driver could retry intelligently by just
> > skipping those uframes.
> > 
> > So this question is best answered by the drivers who would be using
> > that recovery mechanism ... notably ALSA and V4L2.
> 
> We don't currently have any way for a driver to tell the HCD it wants to
> skip N (u)frames.  I suppose urb->start_frame could be used for this
> purpose.

Heck, last I looked we didn't even have drivers that looked at the USB
frame counter; nobody was even thinking about these issues.

I hope we agree that on URB completion, start_frame indicates when that
frame started.

My understanding of start_frame on urb submission was that originally
the idea was:  if ISO_ASAP wasn't set, that would specify the start frame.
Now, I know that OHCI never implemented that; and EHCI didn't either.
Maybe one of the UHCIs did/does.

If we expect those semantics -- start_frame without ISO_ASAP set -- then
we wouldn't need a "skip" mechanism.
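
A sketch of what those semantics could look like, purely as an
assumption for discussion: per the above, neither OHCI nor EHCI
implemented this.  struct iso_req, resolve_start() and skip_ahead() are
hypothetical names, not existing kernel interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request: the idea of start_frame plus an "ASAP" flag. */
struct iso_req {
    bool     asap;          /* let the HCD choose the starting slot */
    uint32_t start_frame;   /* used only when asap is false */
};

/* HCD side: with asap set, start at the next free slot; otherwise
 * honor the caller's start_frame as given. */
static uint32_t resolve_start(const struct iso_req *req, uint32_t next_free)
{
    return req->asap ? next_free : req->start_frame;
}

/* Driver side: skip 'behind' slots past the slot it originally wanted,
 * by naming the new start frame explicitly instead of asking for ASAP. */
static void skip_ahead(struct iso_req *req, uint32_t wanted,
                       uint32_t period, uint32_t behind)
{
    req->asap = false;
    req->start_frame = wanted + behind * period;
}
```

The point of the sketch is only that an explicit start frame already
expresses "skip N", so no separate call would be needed.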

- Dave


