On 10/9/06, Alan Stern <[EMAIL PROTECTED]> wrote:
> It's amusing watching the two of you talk past each other...  especially
> in regard to issues of latency.

I'd have called it 'frustrating', but the discussion is at least still
moving forward.

> Let's consider a simple case.  Suppose we're using a full-speed controller
> instead of EHCI.  It's easier to analyze and more demanding.  Here's a
> best-case scenario:
>
>     A.  During frame 0 a pressure wave strikes the microphone.  The
>         hardware digitizes the signal and stores the data internally.
>
>     B.  During frame 1 the data collected in A is sent to the host.
>
>     C.  At the end of frame 1 the host gets a transfer-completed IRQ.
>
>     D.  During frame 2 the HCD invokes the read URB's callback.  The
>         audio driver and application massage and re-buffer the data for
>         retransmission.
>
>     E.  Then the HCD invokes the callback for an old write URB.  The
>         audio driver resubmits, using the data generated in step D.
>
>     F.  The HCD takes the submitted URB and schedules it for frame 3.

I will point out that a tighter sequence is possible (I know that's not how Linux does it):

The read completes in frame 1 and the URB callback is called before
frame 1 ends.  The processing of the audio happens immediately enough
to make it available for a slot in frame 2.  This would assume more
aggressive completion handling, coupled read and playback URB handling,
or both... which low-latency software would be doing in userspace
anyway, since read and write run in lockstep.  (I don't know of an OS
that does it that way internally, though.  I believe MacOS just
special-cases scheduling and completion handling.)

>     G.  During frame 3 the data is sent to the speaker device.
>
>     H.  During frame 4 the speaker plays the data out into the air.

Given isoch and the design of low-latency devices, I wouldn't be
surprised if devices designed for low latency are already playing
the data in frame 3 of your scheme, as the samples pour in.

But yes, I see this example as a clear illustration of how Linux is
doing things.

> Even with no problems and 0 interrupt latency, there's still a 4-ms delay
> between the microphone and the speaker.  With EHCI this can be reduced
> somewhat; you could ask the controller to generate IRQs every 0.5 ms
> instead of every ms and thereby reduce the delay to 2 ms.  This quickly
> leads to diminishing returns, however, because of all the overhead
> involved in processing IRQs, invoking callbacks, etc.

I will point out that low-latency applications generally consider the
machine to be an embedded appliance.  They own the box.  This is
accepted.
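
The arithmetic behind the 4-ms and 2-ms figures quoted above can be tallied with a rough sketch. It assumes every stage of the A-H walkthrough costs exactly one service interval; the stage labels, the function name, and the `stages_saved` parameter are illustrative, not kernel identifiers.

```python
# Rough model: every stage of the A-H walkthrough costs one service interval.
# Stage labels and names are illustrative, not kernel identifiers.
BASELINE_STAGES = (
    "A: device digitizes the sound",
    "B/C: data sent to host, completion IRQ",
    "D/E/F: callbacks run, write URB resubmitted",
    "G: data sent to the speaker device",
)


def mic_to_speaker_ms(interval_ms, stages_saved=0):
    """Time elapsed before playback starts (step H), in milliseconds."""
    return (len(BASELINE_STAGES) - stages_saved) * interval_ms


full_speed_ms = mic_to_speaker_ms(1.0)   # 1 ms frames
ehci_ms = mic_to_speaker_ms(0.5)         # IRQ every 0.5 ms
# Assumption: completing and resubmitting within frame 1 saves one stage.
same_frame_ms = mic_to_speaker_ms(1.0, stages_saved=1)
print(full_speed_ms, ehci_ms, same_frame_ms)
```

The `stages_saved=1` case corresponds to the tighter sequence described earlier, where the read completes and the processed data is queued in time for a slot in frame 2. IRQ and callback overhead are ignored, so shrinking the interval looks better here than it would in practice.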

> In addition, there are two important problems not mentioned above:
>
> The first has to do with the order of the callbacks.  Although I wrote
> step E after step D, there's no reason they can't occur in the opposite
> order.  When that happens the driver will be in trouble because the data
> for E's submission won't be available yet.  The driver could submit dummy
> data, but that's foolish given that step D will happen in the very near
> future.  Or the driver could submit nothing until step D occurs, which
> would cause the endpoint queue to dry up and the bandwidth to be
> deallocated.  This is precisely why Monty wants to allow allocations to
> survive for some period even with no URBs in the queue.

Yes.
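
A toy model of the callback-ordering problem, as a sketch only. `EndpointModel` and `grace_frames` are hypothetical: `grace_frames` stands in for the idea of letting an allocation survive for some period with no URBs queued, and is not an existing kernel parameter. Real HCD bookkeeping is more involved.

```python
class EndpointModel:
    """Toy model of an isochronous endpoint's bandwidth reservation."""

    def __init__(self, grace_frames=0):
        self.grace_frames = grace_frames  # hypothetical keep-alive window
        self.queued = 0        # URBs waiting in the endpoint queue
        self.idle_frames = 0   # consecutive frames with an empty queue
        self.reserved = False  # is bandwidth currently allocated?
        self.drops = 0         # times the reservation was released

    def submit(self):
        """Queue one URB; in this model that also (re)allocates bandwidth."""
        self.queued += 1
        self.idle_frames = 0
        self.reserved = True

    def tick(self):
        """Advance one frame: transmit a URB, or count an idle frame."""
        if self.queued:
            self.queued -= 1
            return
        self.idle_frames += 1
        if self.reserved and self.idle_frames > self.grace_frames:
            self.reserved = False
            self.drops += 1


def late_data_scenario(grace_frames):
    ep = EndpointModel(grace_frames)
    ep.submit()  # one write URB in flight
    ep.tick()    # that URB is transmitted
    ep.tick()    # E ran before D, so nothing was resubmitted this frame
    ep.submit()  # D has now produced data; the driver resubmits
    return ep.drops


print(late_data_scenario(0), late_data_scenario(1))
```

With no grace period the reservation is released once and has to be reacquired; with a one-frame grace period it survives the gap.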

> The second difficulty has to do with kernel latency.  Suppose for one
> reason or another the system is busy, and consequently step E doesn't
> occur until after frame 3 has already begun.  It wouldn't take much of a
> delay, and according to Dave a 1-ms IRQ latency is to be expected every
> now and then.  The only way to survive without a dropout is to have an
> extra URB in the write queue.  That will give an additional millisecond of
> breathing space, at the cost of increasing the microphone-to-speaker delay
> by 1 ms.
>
> Perhaps not coincidentally, it turns out that the solution to the second
> problem also solves the first problem.  If you're working with low-latency
> EHCI then the overall delay goes up from 2 ms to 3 ms.  Isn't that still
> low enough to be usable?

Usable?  Oh, yes.  But I'm striving for 'doing as good a job as other
systems that have been around for a while'.  I'm not striving for
mediocrity.  I accept it will not happen all at once (and settling the
XRUN reporting channel is higher priority and has greater practical
significance).  I would also take a reliable 5ms over '2ms that
hiccups once a day' without thinking twice.
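
The trade-off in the quoted paragraphs (one extra write URB buys slack against late callbacks at the cost of added delay) can be put in numbers. This is a sketch under a stated assumption: each extra URB carries 1 ms of audio, which is inferred from the 2 ms to 3 ms figure rather than stated outright. Function names are illustrative.

```python
def delay_with_slack_ms(base_delay_ms, extra_urbs, urb_ms=1.0):
    """Mic-to-speaker delay once extra write URBs are kept queued."""
    # Assumption: each extra URB holds urb_ms of audio.
    return base_delay_ms + extra_urbs * urb_ms


def survives(irq_latency_ms, extra_urbs, urb_ms=1.0):
    """True if the queued slack covers a late callback (no dropout)."""
    return irq_latency_ms <= extra_urbs * urb_ms


full_speed = delay_with_slack_ms(4.0, extra_urbs=1)
ehci = delay_with_slack_ms(2.0, extra_urbs=1)
print(full_speed, ehci, survives(1.0, extra_urbs=1), survives(1.0, extra_urbs=0))
```

In this simplified model one extra URB absorbs the occasional 1-ms IRQ latency, while zero extra URBs leaves no margin for it.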

> (It's worth pointing out that this analysis doesn't mention the depth of
> the read queue.  It doesn't matter how many URBs are waiting in that
> queue; all that matters is how much data each URB can receive.)

You have a point, and I had been neglecting to notice that.

Monty
