On Wednesday 13 April 2005 3:08 pm, Alan Stern wrote:
> On Wed, 13 Apr 2005, David Brownell wrote:
> 
> > A tasklet for the IRQ handling code wouldn't bother me the way even
> > the concept of one in the submit path does!
> 
> I sense a common theme in these comments!  You're clearly concerned about 
> undue delays during submission.

Among other things.  As I've pointed out, (non-PIO) HCDs basically do
only two things:  they package URBs for HC DMA hardware, then they process
notifications from the HC to issue completions for those URBs.  Anything
else is, statistically, noise where correctness (including lack of any
data stream corruption) is all that matters.

Or said differently, there are only two critical paths:  putting requests
from driver submit onto the DMA queue; then later taking completed requests
off the remnant of that queue (HC's forgotten them) and giving them back.


I just had a thought:  maybe one of the reasons Microsoft has such big
per-request latencies is that they're using something analogous to tasklets.
It's always puzzled me why they go to such effort to batch networking
requests, for example, especially when the handful of Linux-vs-Windows
comparisons I've seen on the same hardware show that Linux gets better
throughput _without_ needing batching.  It could just be more evidence
that TCP-offload architectures aren't wins ... or it could be just a
not-so-good consequence of a particular USB design tradeoff.


> I don't see any way around it, though.  Yes, I would like to take a lot of 
> the stuff now done with interrupts disabled (which isn't quite the same as 
> under the spinlock) and let them run with interrupts enabled.  Remember, 
> however, that ->enqueue() can be called with interrupts already disabled!  

So?  We've talked about other ways to reduce the amount of time spent
with IRQs disabled.  (Including not using memory poisoning, and other
reductions of TD allocation costs.  As well as less invasive ways to
reduce TD allocations.)


> The necessary conclusion is that it mustn't do very much.  Certainly it 
> mustn't try to allocate and prepare a whole lot of TDs.  Note that the 
> controller is capable of going through up to 19 max-size bulk packets per 
> frame -- that's more than I would like to prepare with interrupts 
> disabled.

That conclusion isn't "necessary".  Deferring that TD preparation to a
tasklet would increase the latency for getting requests onto the DMA queue.
At full (or low!) speeds that's unlikely to affect throughput very much ...
but it would negatively affect resubmit paths for periodic transfers, as
well as normal submit paths.

So at any rate, interrupt transfers would be incurring TWO additional
tasklet-induced latencies, which could be driver-visible.  One between
IRQ and tasklet running, for the completion.  Another between submit
and tasklet running, for the submit.  Maybe some merging can happen,
but the basic point is that this is a troublesome kind of latency, and
it would increase, not decrease, if submit/enqueue gets pushed into
tasklet code.

Although I've always recommended that periodic-transfer drivers maintain
queues on the endpoint if they're going to defend against increased
latencies, usually only ISO drivers do that.  Any driver that uses
interrupt transfers with short polling intervals could notice the
difference if both completion and submit processing move into tasklets,
since that increases software-visible latencies.


> > > It should be possible to deliver pt_regs even with a top-half/bottom-half 
> > > split. 
> > 
> > The correct pt_regs -- current as of the IRQ?  I thought they
> > were specific to that IRQ context.
> 
> I'm not sure about this.  My understanding, which could be wrong, is that
> pt_regs contains nothing more than the register contents at the time the
> interrupt was fielded. 

Last I looked, it was a pointer onto an IRQ stack.  So it'd be
invalid during a tasklet; you can't usefully save it, even for
the same CPU.


> > Not stopping "until all the preceding URBs have been completed"
> > still seems like a _needless_ change, and I don't much like those
> > in areas where I've put effort into coming up with as simple a
> > formulation as possible.  And it wasn't how I read your text ...
> 
> Why not?  The exact text of the patch was: "... if an URB is unlinked
> before the hardware has started to execute it, then its queue is not
> guaranteed to stop until all the preceding URBs have completed".  How else 
> could you read it?

That was more or less a "needless" part.  It was the other change
that bothered me more, since it would change semantics.  The two
parts didn't add up in a way I liked.


> The point of my question was this.  Suppose the queue is currently 20
> deep, with the hardware working on the first URB, when a driver unlinks
> the fifteenth.  If the HCD takes that URB off the schedule and calls the
> completion routine while the hardware is still way back on the first or
> second URB, why should the driver care if the queue is still running?  
> How could it even tell? 

By data stream corruption when the hardware keeps processing the queued
requests.  Yes, that's tricky to detect directly, and not common ... but
the point of the current (pre-patch) API was to preclude such trouble.

> All it cares about is that the queue should stop 
> before the hardware reaches URB 16.  If the HCD can make that guarantee
> without stopping the queue, why shouldn't it?

"If the queue can be stopped without stopping the queue..." ????


> > More like holding back all fault completions until the queue stops.
> > For hardware faults ("errors") that should already be happening,
> > at least for non-ISO endpoint queues.  Doing that for software
> > faults (unlink/kill/cancel) would be the new behavior.
> 
> That's what I said: Don't report the fault completion right away when an
> URB is unlinked; hold it back until the queue stops and all the unlinked
> URBs are returned.  Only the "don't report right away" part is new.  The
> existing code already takes care of completion reports when the queue
> stops; no additional mechanism is needed there (contrary to what you
> said).

But there _is_ a new mechanism you want to require, as you said.
Holding back all fault completions iff an unlink is pending.

- Dave

