On Sat, Sep 28, 2013 at 9:38 PM, Alan Stern <st...@rowland.harvard.edu> wrote:
>
> Very few non-xHCI controllers can do DMA above the 4 GB limit.

Yes, but I wonder whether non-xHCI controllers need this kind of
zero-copy optimization, since very few user-space drivers complain or
care about performance or CPU utilization when devices are attached to
non-xHCI controllers.

>
>> > make sure this will happen?
>>
>> That can't be guaranteed but we can handle it with page bounce, just like
>> block device.
>
> Obviously.  But if we have to bounce the pages, it isn't zero-copy any
> more.

If the optimization is mainly for xHCI, there should be no such
problem. The problem only exists when a non-xHCI controller is used and
the system has more than 4GB of memory, which does not look like a
mainstream configuration.

I propose the idea only to compare the two approaches. Each one has its
own advantages and disadvantages, and maybe the two can coexist.

mmap approach:
- interface is a bit complicated: usbfs needs to allocate one buffer for
each URB
- does not scale well if the buffer needs to be very big to obtain good
performance

direct I/O approach:
- interface is simple, maybe passing O_DIRECT to open() would be enough
- if the HCD can't DMA to memory above 4GB, the pages that lie above 4GB
need to be bounced.
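
To make the bounce cost concrete, here is a rough user-space sketch (not
kernel code) of the check I have in mind: given the physical address of
each page in a buffer, count how many pages fall outside the
controller's DMA mask and would have to be bounced. The names
page_needs_bounce() and count_bounced_pages(), the 4096-byte page size
and the 32-bit mask are only assumptions for illustration; a real
implementation would rely on the kernel's DMA mapping code.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u          /* assumed page size */
#define DMA_MASK_32BIT  0xFFFFFFFFull  /* assumed mask of a non-xHCI HCD */

/* Hypothetical helper: returns 1 if the last byte of the page that
 * starts at 'phys' lies above 'dma_mask', i.e. the HCD cannot reach it. */
static int page_needs_bounce(uint64_t phys, uint64_t dma_mask)
{
    return phys + PAGE_SIZE_BYTES - 1 > dma_mask;
}

/* Hypothetical helper: counts the pages in 'page_phys' (page-aligned
 * physical addresses) that would need a bounce buffer. */
static size_t count_bounced_pages(const uint64_t *page_phys, size_t npages,
                                  uint64_t dma_mask)
{
    size_t bounced = 0;

    for (size_t i = 0; i < npages; i++)
        if (page_needs_bounce(page_phys[i], dma_mask))
            bounced++;

    return bounced;
}
```

With a full 64-bit mask, as one would assume for a 64-bit capable xHCI
controller, the count is always zero and the transfer stays zero-copy.
With the 32-bit mask only the pages above 4GB are counted, so the copy
cost is proportional to that part of the buffer.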

>
>> Actually I observed both throughput and cpu utilization can be improved
>> with the 4GB of DMA limit on either 32bit arch or 64bit arch, wrt. direct I/O
>> over usb mass storage block device.
>
> This may depend more on the host controller capabilities than on the
> CPU architecture.

Yes, but in most cases a 32-bit CPU is seldom used with more than 4GB
of RAM.


Thanks,
-- 
Ming Lei
