Re: [Discuss-gnuradio] USB throughput numbers for NetBSD (and Linux)

2006-08-09 Thread Joanne M Mikkelson
Hi, sorry for my long delay, I was on vacation and then playing
catch-up.

> I'm not sure I followed the explanation for why on NetBSD the
> unidirectional case isn't equal to the sum of the bidirectional case.
> Could you try explaining again?  On second thought, is the problem
> that there's only one request in the h/w endpoint queue for a given
> endpoint and direction?  If so, I think you could get the completion
> interrupt service time out of the critical path by ensuring that there
> are always two requests queued in each direction, not just one.

Yes, as the driver is currently implemented, there is only one
request queued for a given endpoint at a time.  You're correct that
having more than one would reduce the interrupt service time's effect
on performance, but doing this will require changes to more than just
ugen.  The ehci driver needs changes before it can properly handle
more than one bulk request queued at a time.  We haven't changed
the ehci driver, so until that happens, the ugen driver will have to
use just a single request at a time.

> I'd also be interested in seeing how the throughput holds up with
> smaller transfer sizes and smaller total amount of buffer space.

Because we only have one request at a time, the throughput will
suffer as request sizes get smaller.  In my experience the total
buffer space need not be more than a few requests' worth (and the
numbers showed that having the buffers too large hurts performance),
but this testing wasn't with much computation load.  At least the
latency should still be improved over what we had with ugen before.
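The effect of queue depth and request size described above can be
illustrated with a rough model.  This is only a sketch: the function
name, the 40 MB/s bus rate and the 200 microsecond completion-interrupt
service time are made-up values for illustration, not measurements
from the driver.

```python
# Rough model of bulk throughput versus request size and queue depth.
# All numbers below are hypothetical; they are not taken from the ugen
# or ehci drivers.

def throughput_mb_s(request_bytes, bus_mb_s, service_us, queued):
    """Estimate sustained throughput in MB/s (1 MB = 1e6 bytes).

    With one request queued, the bus sits idle while the completion
    interrupt is serviced, so the two times add up.  With two or more
    queued, the service time overlaps the next transfer, so only the
    longer of the two limits the rate.
    """
    transfer_us = request_bytes / bus_mb_s  # MB/s equals bytes per microsecond
    if queued >= 2:
        per_request_us = max(transfer_us, service_us)
    else:
        per_request_us = transfer_us + service_us
    return request_bytes / per_request_us


BUS_MB_S = 40.0      # assumed usable bus rate
SERVICE_US = 200.0   # assumed interrupt service time per request

for size in (1024, 4096, 16384, 65536):
    one = throughput_mb_s(size, BUS_MB_S, SERVICE_US, queued=1)
    two = throughput_mb_s(size, BUS_MB_S, SERVICE_US, queued=2)
    print(f"{size:6d} bytes: 1 queued {one:5.1f} MB/s, 2 queued {two:5.1f} MB/s")
```

Under these assumptions the single-request case falls off quickly as
requests shrink, which is the behaviour described above; the real
per-request cost would have to be measured.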

Using test_usrp_standard_rx and _tx, a block size of 1024 only works
with decimation 64/interpolation 128 (4 MB/s) and a block size of
4096 works with decimation 16/interpolation 32 (16 MB/s).  This is
without real-time scheduling, which isn't working.

Joanne


___
Discuss-gnuradio mailing list
Discuss-gnuradio@gnu.org
http://lists.gnu.org/mailman/listinfo/discuss-gnuradio


[Discuss-gnuradio] USB throughput numbers for NetBSD (and Linux)

2006-07-21 Thread Joanne M Mikkelson
Hi,

We collected some data comparing the USB throughput we're getting now
on NetBSD against the throughput on Linux.  For those who are
interested in the current performance on NetBSD, I've included a
summary.  The full set of measurements taken (along with the summary)
is available at:

http://acert.ir.bbn.com/viewvc/adroitgrdevel/adroitgrdevel/radio_test/usb/test-results?view=co

Summary
=======

The following USB throughput results were collected on two systems
with the same hardware, running NetBSD-current with our ugen changes
and SuSE Linux.

The ugen changes allow specifying the length of the transfer to
request from the host controller, and here the fusb_netbsd testing
code was recompiled with the different sizes.  The fusb_linux code
uses 16k requests (and says that this is the largest request
possible).  In both cases the USRP library's default buffer size of 2
MB was used.  The ugen driver could also be changed to avoid a copy to
the buffer in the driver, and these tests investigate how much
performance is improved in that case.


For reference, here is how interpolation/decimation relates to the
intended data rate:

data rate   | decimation | interpolation
----------------------------------------
16 MB/s       16           32
18.3 MB/s     14           28
21.3 MB/s     12           24
25.6 MB/s     10           20
32 MB/s       8            16
42.6 MB/s     6            12

benchmark_usb.py (bidirectional test)

driver        | xfer size | maximum (read+write)
------------------------------------------------
NetBSD          16k         32 MB/s
Linux           16k         36.57 MB/s
NetBSD          64k         32 MB/s (usually gets 36.57)
NetBSD          128k        32 MB/s
NetBSD -copy    16k         32 MB/s
NetBSD -copy    64k         42.6 MB/s
NetBSD -copy    128k        42.6 MB/s


test_usrp_standard_rx

driver        | xfer size | maximum (MB/s)
------------------------------------------
NetBSD          16k         21.3
Linux           16k         32
NetBSD          64k         25.6
NetBSD          128k        21.3
NetBSD -copy    16k         25.6
NetBSD -copy    64k         25.6
NetBSD -copy    128k        25.6

test_usrp_standard_tx

driver        | xfer size | maximum (MB/s)
------------------------------------------
NetBSD          16k         21.3
Linux           16k         32
NetBSD          64k         25.6
NetBSD          128k        21.3
NetBSD -copy    16k         21.3
NetBSD -copy    64k         25.6
NetBSD -copy    128k        25.6


The Linux numbers suggest that there is about 36 MB/s bandwidth
available total (maybe more but less than 42), and it must be divided
between transmit and receive.  So 32 can be done one-way, but as soon
as one needs bidirectional traffic, neither direction can do 32.
Probably the USRP could be set up to use, say, 25.6 and 8 between read
and write instead of 16 and 16, but not 25.6 and 16.
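
The split described above can be checked with simple arithmetic.  In
this sketch the 36.57 MB/s budget is the highest bidirectional total
measured on Linux; treating it as a hard ceiling is an assumption,
since the real limit may be somewhat higher (but below 42).  The
function name is made up for illustration.

```python
# Does a given read/write split fit inside the shared USB budget?
# The budget is the best Linux bidirectional figure from the table;
# using it as a hard limit is an assumption.

TOTAL_BUDGET_MB_S = 36.57


def fits(read_mb_s, write_mb_s, budget=TOTAL_BUDGET_MB_S):
    """Return True if the combined rate stays within the budget."""
    return read_mb_s + write_mb_s <= budget


for read, write in ((16, 16), (25.6, 8), (25.6, 16), (32, 0)):
    verdict = "fits" if fits(read, write) else "exceeds the budget"
    print(f"read {read} + write {write} = {read + write:.1f} MB/s: {verdict}")
```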

This follows fairly well from the implementation.  On Linux, USRP
reads and writes are all done via a generic request mechanism funneled
through the control endpoint.  So the sum of reads and writes in
aggregate seems to be constrained by how fast data can be pushed
through this system.

With our NetBSD implementation, unless the transactions go in
lock-step and thus one of read and write has to wait while the other's
completion interrupt is being handled, read and write are handled
independently all the way down until you get to the host controller
driver.  Therefore the bidirectional numbers are more related to the
sum of the two unidirectional numbers, instead of bidirectional being
essentially equal to unidirectional as we're seeing with Linux.

The NetBSD numbers demonstrate that 128k transfers perform worse than
64k.  As would be expected, 128k transfers aren't worse with the extra
copy removed but they also aren't notably better.  So while there is
clearly too much cost copying 128k at a time vs. copying 64k, there is
still a lot of cost that's not in the copy at all, because the numbers
don't get vastly better when the copy is removed.  The latter cost is
what's preventing us from getting unidirectional rates comparable to
Linux.

Copying to/from user space does not appear to be the bottleneck; the
kernel debug logs clearly show that user space consumes and writes
data faster than the bus moves it in these tests.


Choosing a Good Buffer Size
===========================

The previous results are all using a buffer size of 2 MB (which is 2
MB for each of read and write with fusb_netbsd).  Also, all reads and
writes from user space were 16k.  The following tests indicate that
the read and write length does not matter very much.  However,
reducing the buffer size from 2 MB demonstrably helps with the
bidirectional throughput.

Because the highest rate reached is not always the same, these results
include several runs of benchmark_usb.py.  The maximum rate is based
on what benchmark_usb.py claimed for five runs, trying to take into
account that all the higher transfer rates report underruns or
overruns occasionally.

driver   | xfer | buffer | maxim

Re: [Discuss-gnuradio] NetBSD USB progress

2006-07-16 Thread Joanne M Mikkelson
I forgot to mention that I've put files for a change to the USRP fusb
driver to take advantage of the new ugen driver up at:
  
http://acert.ir.bbn.com/viewvc/adroitgrdevel/adroitgrdevel/radio_test/usb/fusb_netbsd/
for now. The files go in the corresponding directories in the usrp
source tree.

This fusb_netbsd code was developed for testing the driver work, and
won't work terribly well with the unmodified ugen. But hopefully
it'll help out anyone trying out the ugen changes at this point.

Joanne




Re: [Discuss-gnuradio] NetBSD USB progress

2006-07-16 Thread Joanne M Mikkelson
Many apologies for the delay in responding; somehow I missed this
message.

> Is there a consolidated patch file that would make it easier to apply against 
> the current NetBSD source tree?

Attached is a patch file for the top level of the NetBSD source,
including all the files changed. config(8) will automatically create
the enigmatic opt_ugen_bulk_ra_wb.h file. To actually include the
changes in your kernel compile (they are not enabled by default), add
"options UGEN_BULK_RA_WB" to your kernel config.

Joanne
Index: share/man/man4/ugen.4
===
RCS file: /cvs/netbsd/netbsd/src/share/man/man4/ugen.4,v
retrieving revision 1.1.1.1
retrieving revision 1.2
diff -u -r1.1.1.1 -r1.2
--- share/man/man4/ugen.4   23 Nov 2005 08:56:08 -  1.1.1.1
+++ share/man/man4/ugen.4   26 Jun 2006 21:02:02 -  1.2
@@ -77,14 +77,14 @@
 If an endpoint address is used both for input and output the device
 can be opened for both read or write.
 .Pp
-To find out what endpoints that exist there are a series of
+To find out what endpoints exist there are a series of
 .Xr ioctl 2
-operation on the control endpoint that returns the USB descriptors
+operations on the control endpoint that return the USB descriptors
 of the device, configurations, interfaces, and endpoints.
 .Pp
 The control transfer mode can only happen on the control endpoint
-which is always endpoint 0.  The control endpoint accepts request
-and may respond with an answer to such request.  Control request
+which is always endpoint 0.  The control endpoint accepts requests
+and may respond with an answer to such requests.  Control requests
 are issued by
 .Xr ioctl 2
 calls.
@@ -108,7 +108,15 @@
 and
 .Xr write 2
 should be used.
-All IO operations on a bulk endpoint are unbuffered.
+All IO operations on a bulk endpoint are normally unbuffered.
+Buffering can be enabled using a
+.Dv USB_SET_BULK_RA
+or
+.Dv USB_SET_BULK_WB
+.Xr ioctl 2
+call, to enable read-ahead and write-behind respectively.  When
+read-ahead or write-behind are enabled, the file descriptor may
+be set to use non-blocking IO.
 .Pp
 The interrupt transfer mode can be in or out depending on the
 endpoint.
@@ -272,6 +280,57 @@
 issue any USB transactions.
 .El
 .Pp
+Bulk endpoints handle the following
+.Xr ioctl 2
+calls:
+.Pp
+.Bl -tag -width indent -compact
+.It Dv USB_SET_BULK_RA (int)
+Enable or disable bulk read-ahead.  When enabled, the driver will
+begin to read data from the device into a buffer.  The 
+.Xr read 2
+call will read data from this buffer, blocking if necessary until
+there is enough data to read the length of data requested.  The
+buffer size and the read request length can be set by the
+.Dv USB_SET_BULK_RA_OPT
+.Xr ioctl 2
+call.
+.It Dv USB_SET_BULK_WB (int)
+Enable or disable bulk write-behind.  When enabled, the driver will
+buffer data from the
+.Xr write 2
+call before writing it to the device.  
+.Xr write 2
+will block if there is not enough room in the buffer for all
+the data.  The buffer size and the write request length can be set
+by the
+.Dv USB_SET_BULK_WB_OPT
+.Xr ioctl 2
+call.
+.It Dv USB_SET_BULK_RA_OPT (struct usb_bulk_ra_wb_opt)
+Set the size of the buffer and the length of the read requests used by
+the driver when bulk read-ahead is enabled.  The changes do not take
+effect until the next time bulk read-ahead is enabled.  Read requests
+are made for the length specified, and the host controller driver
+(i.e.,
+.Xr ehci 4 ,
+.Xr ohci 4 , and
+.Xr uhci 4 ) will perform as many bus transfers as required.  If
+transfers from the device should be smaller than the maximum length,
+.Dv ra_wb_request_size
+must be set to the required length.
+.Bd -literal
+struct usb_bulk_ra_wb_opt {
+   int ra_wb_buffer_size;
+   int ra_wb_request_size;
+};
+.Ed
+.It Dv USB_SET_BULK_WB_OPT (struct usb_bulk_ra_wb_opt)
+Set the size of the buffer and the length of the write requests used
+by the driver when bulk write-behind is enabled.  The changes do not
+take effect until the next time bulk write-behind is enabled.
+.El
+.Pp
 Note that there are two different ways of addressing configurations, interfaces,
 alternatives, and endpoints: by index or by number.
 The index is the ordinal number (starting from 0) of the descriptor
Index: sys/dev/usb/files.usb
===
RCS file: /cvs/netbsd/netbsd/src/sys/dev/usb/files.usb,v
retrieving revision 1.1.1.2
retrieving revision 1.3
diff -u -r1.1.1.2 -r1.3
--- sys/dev/usb/files.usb   19 Jun 2006 15:44:45 -  1.1.1.2
+++ sys/dev/usb/files.usb   26 Jun 2006 20:17:32 -  1.3
@@ -48,6 +48,7 @@
 
 
 # Generic devices
+defflag UGEN_BULK_RA_WB
 device ugen
 attach ugen at uhub
 file   dev/usb/ugen.c  ugen    needs-flag
Index: sys/dev/usb/ugen.c
===
RCS file: /cvs/netbsd/netbsd/s

Re: [Discuss-gnuradio] NetBSD USB progress

2006-06-28 Thread Joanne M Mikkelson
> There is a long outstanding bug in benchmark_usb that has it be
> unreliable.  It's been a long time since I looked at it.  The problem
> could be in the lfsr synchronization.

Yeah, I saw the comment in the file. What I find interesting about it
is that it's only failing for the slowest transfer rate, and the
others are fine. So they are getting past that same point in the
sequence with no trouble, and they're also not all failing ~60k
samples before the end...

It's probably indeed not very critical to figure it out, but it does
strike me as odd that it's unreliable in this particular way.

Joanne




[Discuss-gnuradio] NetBSD USB progress

2006-06-26 Thread Joanne M Mikkelson
Hi all,

As was discussed here earlier (starting from
http://lists.gnu.org/archive/html/discuss-gnuradio/2006-05/msg00045.html
in the list archive), BBN has been working on improving the ugen(4)
driver for NetBSD.

We've now implemented the changes to the driver and it's handling
transfer rates of at least 16 MB/s well.  According to benchmark_usb.py
(and test_digital_loopback*) in gnuradio-examples, we are getting 32
MB/s throughput (16 in each direction). Also, the test_usrp_standard_rx
and test_usrp_standard_tx programs indicate we're almost getting 25.6
MB/s one-way (decimation 10 and interpolation 20), typically with 0-2
overruns or underruns.

This is much improved over ~4 MB/s but the next step is to work on
optimizing what's needed to reach 32 MB/s reading or writing.  If
you're interested, the current driver work can be found at:
http://acert.ir.bbn.com/cvs/?group=netbsd
primarily in:
http://acert.ir.bbn.com/viewvc/netbsd/netbsd/src/sys/dev/usb/


Also, interestingly, benchmark_usb always fails for 2 MB/s even
though the other rates are fine.  I don't know yet why that might be,
but it always looks about like this, always around 940k samples:
gr_check_lfsr_32k: enter_SEARCHING at offset0 (0x)
gr_check_lfsr_32k: enter_LOCKED at offset 1452 (0x05ac)
gr_check_lfsr_32k: enter_SEARCHING at offset   947028 (0x000e7354)
usb_throughput = 2M
ntotal= 100
nright= 945576
runlength = 0
delta = 100
FAILED
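
One thing the output above does show: the nright figure matches the
number of samples between the point where the checker locked and the
point where it went back to searching.  The following is only a
consistency check on the numbers printed above, under the assumption
that nright counts samples verified while locked; the variable names
are made up.

```python
# Consistency check on the gr_check_lfsr_32k output above.
# Assumption: nright counts the samples verified while locked.

locked_at = 1452         # enter_LOCKED offset (0x05ac)
lost_lock_at = 947028    # second enter_SEARCHING offset (0x000e7354)
reported_nright = 945576

samples_in_lock = lost_lock_at - locked_at
print(samples_in_lock, samples_in_lock == reported_nright)
```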

Joanne

