On 10/9/06, Alan Stern <[EMAIL PROTECTED]> wrote:
> It's amusing watching the two of you talk past each other... especially
> in regard to issues of latency.
I'd have called it 'frustrating', but the discussion is at least still
moving forward.

> Let's consider a simple case. Suppose we're using a full-speed controller
> instead of EHCI. It's easier to analyze and more demanding. Here's a
> best-case scenario:
>
>       A. During frame 0 a pressure wave strikes the microphone. The
>       hardware digitizes the signal and stores the data internally.
>
>       B. During frame 1 the data collected in A is sent to the host.
>
>       C. At the end of frame 1 the host gets a transfer-completed IRQ.
>
>       D. During frame 2 the HCD invokes the read URB's callback. The
>       audio driver and application massage and re-buffer the data for
>       retransmission.
>
>       E. Then the HCD invokes the callback for an old write URB. The
>       audio driver resubmits, using the data generated in step D.
>
>       F. The HCD takes the submitted URB and schedules it for frame 3.

I will point out that another sequence is possible (I know that's not how
Linux does it): the read completes in frame 1 and the URB callback is
called before frame 1 ends, and the audio is processed quickly enough to
make the data available for a slot in frame 2. That assumes more
aggressive completion handling, coupling of the read and playback URB
handling, or both... which low-latency software would be doing in
userspace anyway; read and write run in lockstep. (I don't know of an OS
that does it that way internally, though. I believe MacOS just
special-cases scheduling and completion handling.)

>       G. During frame 3 the data is sent to the speaker device.
>
>       H. During frame 4 the speaker plays the data out into the air.

Given isochronous transfer and the way low-latency devices are designed, I
wouldn't be surprised if such devices are already playing the data during
frame 3 of your scheme, as the samples pour in. But yes, I see this
example as a clear illustration of how Linux is doing things.

> Even with no problems and 0 interrupt latency, there's still a 4-ms delay
> between the microphone and the speaker. With EHCI this can be reduced
> somewhat; you could ask the controller to generate IRQs every 0.5 ms
> instead of every ms and thereby reduce the delay to 2 ms. This quickly
> leads to diminishing returns, however, because of all the overhead
> involved in processing IRQs, invoking callbacks, etc.

I will point out that low-latency applications generally consider the
machine to be an embedded appliance. They own the box. This is accepted.

> In addition, there are two important problems not mentioned above:
>
> The first has to do with the order of the callbacks. Although I wrote
> step E after step D, there's no reason they can't occur in the opposite
> order. When that happens the driver will be in trouble because the data
> for E's submission won't be available yet. The driver could submit dummy
> data, but that's foolish given that step D will happen in the very near
> future. Or the driver could submit nothing until step D occurs, which
> would cause the endpoint queue to dry up and the bandwidth to be
> deallocated. This is precisely why Monty wants to allow allocations to
> survive for some period even with no URBs in the queue.

Yes.
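For what it's worth, the standard way a driver keeps that queue from
drying up is to keep several write URBs in flight and have each completion
callback refill and resubmit its own URB. Here's a rough sketch of the
pattern (illustrative only, not the actual snd-usb-audio code; the queue
depth, packet size, and the fill_from_ring() helper are made-up
placeholders):

#include <linux/usb.h>
#include <linux/slab.h>
#include <linux/string.h>

#define NURBS           3       /* queue depth: each extra URB adds ~1 ms of slack */
#define PACKETS_PER_URB 1       /* one 1-ms frame per URB on a full-speed link */
#define PACKET_BYTES    192     /* e.g. 48 kHz * 2 ch * 16 bit = 192 bytes/frame */

struct playback_ctx {
        struct usb_device *udev;
        unsigned int       ep;          /* isochronous OUT endpoint number */
        struct urb        *urbs[NURBS];
};

/* Hypothetical helper: a real driver would copy the next chunk of audio
 * from its ring buffer; filling with silence keeps the sketch
 * self-contained. */
static void fill_from_ring(void *buf, unsigned int len)
{
        memset(buf, 0, len);
}

static void playback_complete(struct urb *urb)
{
        /* Don't resubmit if the URB was killed or the device went away. */
        if (urb->status == -ENOENT || urb->status == -ECONNRESET ||
            urb->status == -ESHUTDOWN)
                return;

        /*
         * Refill and resubmit right away. Because the other queued URBs
         * are still scheduled at the host controller, the stream survives
         * even if this callback runs a frame late.
         */
        fill_from_ring(urb->transfer_buffer, urb->transfer_buffer_length);
        usb_submit_urb(urb, GFP_ATOMIC);
}

static int playback_start(struct playback_ctx *ctx)
{
        int i, ret;

        for (i = 0; i < NURBS; i++) {
                /* Error unwinding omitted to keep the sketch short. */
                struct urb *urb = usb_alloc_urb(PACKETS_PER_URB, GFP_KERNEL);
                void *buf = kmalloc(PACKET_BYTES * PACKETS_PER_URB, GFP_KERNEL);

                if (!urb || !buf)
                        return -ENOMEM;

                urb->dev = ctx->udev;
                urb->pipe = usb_sndisocpipe(ctx->udev, ctx->ep);
                urb->transfer_flags = URB_ISO_ASAP;
                urb->transfer_buffer = buf;
                urb->transfer_buffer_length = PACKET_BYTES * PACKETS_PER_URB;
                urb->number_of_packets = PACKETS_PER_URB;
                urb->interval = 1;                      /* every frame */
                urb->complete = playback_complete;
                urb->context = ctx;
                urb->iso_frame_desc[0].offset = 0;
                urb->iso_frame_desc[0].length = PACKET_BYTES;

                fill_from_ring(buf, PACKET_BYTES);
                ctx->urbs[i] = urb;

                ret = usb_submit_urb(urb, GFP_KERNEL);
                if (ret)
                        return ret;
        }
        return 0;
}

With NURBS URBs queued, one late callback costs slack rather than a
dropout, at the price of NURBS-1 extra milliseconds of output delay; and
as long as at least one URB stays queued, the endpoint's bandwidth
reservation stays allocated as well.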
> The second difficulty has to do with kernel latency. Suppose for one
> reason or another the system is busy, and consequently step E doesn't
> occur until after frame 3 has already begun. It wouldn't take much of a
> delay, and according to Dave a 1-ms IRQ latency is to be expected every
> now and then. The only way to survive without a dropout is to have an
> extra URB in the write queue. That will give an additional millisecond of
> breathing space, at the cost of increasing the microphone-to-speaker delay
> by 1 ms.
>
> Perhaps not coincidentally, it turns out that the solution to the second
> problem also solves the first problem. If you're working with low-latency
> EHCI then the overall delay goes up from 2 ms to 3 ms. Isn't that still
> low enough to be usable?

Usable? Oh, yes. But I'm striving for 'doing as good a job as other
systems that have been around for a while', not for mediocrity. I accept
it will not happen all at once (and settling the XRUN reporting channel is
a higher priority with greater practical significance). I would also take
a reliable 5 ms over '2 ms that hiccups once a day' without thinking
twice.

> (It's worth pointing out that this analysis doesn't mention the depth of
> the read queue. It doesn't matter how many URBs are waiting in that
> queue; all that matters is how much data each URB can receive.)

You have a point; I had been overlooking that.

Monty