On Sat, 7 Oct 2006, Christopher "Monty" Montgomery wrote:

> Let's establish how to report missed slots, I'll update the patches,
> and then try to figure out what's going on with usbaudio.

Okay, let's move on to discuss this.

Part of the problem with our earlier discussion has been that there are
actually 3 queues, any of which can dry up:

    Queue A is the flow of data from the application to snd-usb-audio
    or some other high-level driver.  If this queue drains it is an
    xrun and quite probably loss-of-sync.  The driver is free to
    report the error to the application any way it wants to.  This
    is where application latency matters.

    Queue B is the flow of URBs from the driver to ehci-hcd.  If this
    queue drains then the bandwidth is deallocated, something you
    desperately want to avoid.  It's up to the higher-level driver
    to keep the queue non-empty, even if that means submitting URBs
    with dummy data.  Latency has no effect here.

    Queue C is the flow of packets from the host controller to the
    device.  If this queue drains it is a loss-of-sync.  The
    higher-level driver can find out about these occurrences only by
    checking various return status codes from ehci-hcd, and then it
    has to decide how to relay the information to the application.
    This is where kernel and IRQ latency matter.

First a word of warning.  ISO transfers are UNRELIABLE!  A data-out
packet sent by the host might not be received by the device, and the
host would have no way to know.  Obviously such errors can't be
reported since ehci-hcd never realizes they occur.  Of course, this
doesn't reduce our obligation to report accurately the errors we _do_
know about.

Errors caused by Queue A draining presumably are already handled in a
satisfactory manner.  If they aren't, it's a matter for the
higher-level driver; ehci-hcd has nothing to do with it.

Errors caused by Queue B draining have a fixed meaning according to the
API, and they are catastrophic.  Fortunately they are easily avoided if
the higher-level driver is properly configured.

So we only need to consider errors caused by Queue C draining.
Currently there is no standardized way to report these errors back to
the higher-level driver.
Looking through Documentation/usb/error-codes.txt, the closest things
we see are these wonderful entries:

-EXDEV			ISO transfer only partially completed
			look at individual frame status for details

-EINVAL			ISO madness, if this happens: Log off and go home

That's why I chose EXDEV in uhci-hcd; it seemed to be the closest
match.  This might be a good time to settle the matter once and for
all.  (Incidentally, the descriptions for EAGAIN and EFBIG are
confusing and possibly overlapping.  We should straighten them out as
well.)

To be definite, let's suppose a periodic stream has been allocated a
specific series of slots, and up until slot N-1 everything has been
okay.  Now URB U is submitted, supposedly starting in slot N.  But
something has gone wrong, and ehci-hcd isn't able to add U to the
hardware schedule in time for slot N to be filled.  What should happen?

Sometimes it will be apparent at submission time that U is already too
late.  For instance, slot N's microframe might already be over.  In
such cases it is possible to return a submission error.  Let's call
this option #1.

Sometimes it won't be apparent until later that the slot was missed.
When this happens, ehci-hcd is unable to report a submission error for
U.  Another possible approach is to report a submission error for the
following URB, U+1.  Let's call this option #2.

The only other reasonable option, #3, is to report an error upon the
completion of U.  These two events (submission and completion) are the
only chances ehci-hcd has to communicate with a higher-level driver.

So which option should we use?  As Dave has already mentioned, options
#1 and #2 share certain practical problems.  Returning an error for a
submission when in fact the submission was accepted is _not_ a good
policy.  The return code could be a positive value, not a negative
-Exxx code.  Then it would be necessary to audit all the USB drivers
that use ISO endpoints, because there could easily be cases where the
driver checks only for 0 or nonzero.

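To make the audit concern concrete, here is a minimal sketch in
userspace C.  Everything in it is hypothetical: the helper names are
invented for illustration, and SUBMIT_ACCEPTED_LATE is an assumed
positive status that does not exist in the real USB API.  It only
contrasts two common styles of checking a submission result.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical positive status meaning "URB accepted, but a slot was
 * already missed".  This value is an assumption made for illustration;
 * it is not part of the real USB API.
 */
#define SUBMIT_ACCEPTED_LATE 1

/* Checking style 1: anything nonzero is treated as a failure. */
static bool failed_if_nonzero(int ret)
{
	return ret != 0;
}

/* Checking style 2: only negative -Exxx codes are treated as failures. */
static bool failed_if_negative(int ret)
{
	return ret < 0;
}
```

The two styles agree on 0 and on a code such as -EINVAL, but they
disagree on the positive value.  A driver written in the first style
would treat an accepted URB as rejected, which is the kind of case an
audit would have to find.
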
We could in fact return an error code and _not_ accept the submission,
counting on the driver to realize what went wrong and make a new
submission.  I think this is overly complicated.  And it provides no
hint as to exactly how many slots were missed.  It also increases the
amount of work performed by both the higher-level driver and ehci-hcd
-- exactly the sort of thing you _don't_ want to do when a stream is
lagging behind.

Lastly, there is the objection that sometimes option #1 can't be used
because ehci-hcd doesn't know about the error until later.  The same
objection applies to option #2: If the driver submits several URBs in
quick succession, ehci-hcd might not realize until after all of them
are submitted that the first one was submitted too late.

That's why I chose to use option #3 in uhci-hcd.  It has the advantage
of being reliable and not messing up any existing code by changing an
accepted API.  It also indicates exactly which slots were missed, by
the error codes in urb->iso_frame_desc[n].status.

It has the disadvantage of delaying the notification for some number of
milliseconds after sync was lost.  I don't know just how bad that
disadvantage is.  No doubt it depends on the nature of the application.
However, the fact remains: once sync has been lost it's already too
late to recover fully.  All you can do is recover as much as possible,
as quickly as possible.  Delaying the error notification until U
completes won't slow down recovery significantly.

There's one final matter I want to bring up, having to do with
urb->start_frame.  The API doesn't say much about how start_frame
should be interpreted once a stream is already established.  The
assumption is that drivers won't use it, specifying URB_ISO_ASAP
instead to cause each URB to fill the slot immediately following the
end of the previous URB.  I don't know how ohci-hcd and ehci-hcd treat
start_frame in an established stream.

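As a point of reference for "the slot immediately following the end of
the previous URB", here is a minimal sketch of how that slot number
could be computed.  The struct is a hypothetical userspace model that
only borrows three field names from struct urb.  It assumes a
full-speed stream where interval is counted in frames, and an 11-bit
frame counter as carried in USB SOF packets; a real host controller
driver may wrap at a different modulus.

```c
/* Assumption: frame numbers wrap at 11 bits, as in USB SOF packets. */
#define FRAME_MASK 0x7ffu

/* Hypothetical model of one ISO URB; not the kernel's struct urb. */
struct iso_urb_model {
	unsigned int start_frame;        /* frame of the first packet */
	unsigned int number_of_packets;  /* slots this URB occupies */
	unsigned int interval;           /* frames between slots */
};

/* Frame number of the first slot after the end of the given URB. */
static unsigned int next_slot(const struct iso_urb_model *prev)
{
	return (prev->start_frame +
		prev->number_of_packets * prev->interval) & FRAME_MASK;
}
```

For example, a URB that starts in frame 100 and carries 8 packets at
interval 1 is followed by slot 108; one that starts in frame 2040 with
8 packets at interval 2 wraps around to slot 8.
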
Here's what uhci-hcd does: It compares start_frame with the actual
frame number of the next slot (the one immediately following the end of
the previous URB).  If the values agree, well and good.  If they
disagree, the submission is rejected with -EINVAL.

A different and completely reasonable policy would be to check that
start_frame does indeed match one of the allocated future slots and to
start the URB in that slot, leaving the intervening slots empty.  This
would make it a little easier for higher-level drivers to retain their
bandwidth reservation when Queue A runs dry.  However, that's not how
uhci-hcd currently works.

Thoughts?

Alan Stern

_______________________________________________
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel