Re: [Discuss-gnuradio] USB throughput numbers for NetBSD (and Linux)
Hi, sorry for my long delay, I was on vacation and then playing
catch-up.

> I'm not sure I followed the explanation for why on NetBSD the
> unidirectional case isn't equal to the sum of the bidirectional case.
> Could you try explaining again? On second thought, is the problem
> that there's only one request in the h/w endpoint queue for a given
> endpoint and direction? If so, I think you could get the completion
> interrupt service time out of the critical path by ensuring that there
> are always two requests queued in each direction, not just one.

Yes, as the driver is currently implemented, there is only one request
queued for a given endpoint at a time. You're correct that having more
than one would reduce the interrupt service time's effect on
performance, but doing this will require changes to more than just
ugen. The ehci driver will need some work in order to work properly
with more than one bulk request queued at a time. We haven't changed
the ehci driver, so until that happens, the ugen driver will have to
use just a single request at a time.

> I'd also be interested in seeing how the throughput holds up with
> smaller transfer sizes and smaller total amount of buffer space.

Because we only have one request at a time, the throughput will suffer
as request sizes get smaller. In my experience the total buffer space
need not be more than a few requests' worth (and the numbers showed
that having the buffers too large hurts performance), but this testing
wasn't with much computation load. At least the latency should still
be improved over what we had with ugen before.

Using test_usrp_standard_rx and _tx, a block size of 1024 only works
with decimation 64/interpolation 128 (4 MB/s) and a block size of 4096
works with decimation 16/interpolation 32 (16 MB/s). This is without
real-time scheduling, which isn't working.

Joanne

___
Discuss-gnuradio mailing list
Discuss-gnuradio@gnu.org
http://lists.gnu.org/mailman/listinfo/discuss-gnuradio
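A quick arithmetic check on the two working configurations reported
above: dividing each data rate by its block size gives the number of
requests per second that the single-request path has to complete. This
is only a sketch. It assumes 1 MB/s means 1,000,000 bytes/s (the thread
does not say), and the function name is made up for illustration.

```python
def requests_per_second(rate_mb_per_s, block_size_bytes):
    """Requests that must complete each second to sustain a data rate.

    Assumption: 1 MB/s = 1,000,000 bytes/s. The thread does not state
    whether MB is decimal or binary.
    """
    return rate_mb_per_s * 1_000_000 / block_size_bytes


# (data rate in MB/s, block size in bytes) pairs reported as working
working = [(4, 1024), (16, 4096)]
rates = [requests_per_second(rate, block) for rate, block in working]

for (rate, block), rps in zip(working, rates):
    per_request_us = 1_000_000 / rps  # time budget per request
    print(f"{rate} MB/s with {block}-byte blocks: "
          f"{rps} requests/s, {per_request_us} us per request")
```

Both configurations come out to the same request rate (3906.25
requests/s, or 256 microseconds per request). That would be consistent
with a fixed per-request cost being the limit while only one request is
queued at a time, though two data points do not prove it.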
Re: [Discuss-gnuradio] USB throughput numbers for NetBSD (and Linux)
On Saturday 22 July 2006 23:56, Greg Troxel wrote:
> > I'm sorry to say that I can't share the enthusiasm as it didn't make
> > much difference on a Centrino Duo system running NetBSD-3.99.21.
> >
> > Does it require GNU Radio current? I'm currently running GNU Radio
> > release 2.8 as found in the pkgsrc release!
>
> You need not only the patched NetBSD with the kernel option defined,
> but also the modified fusb code that Joanne pointed out recently
> (which enables read-ahead and write-behind).
>
> It lives here at the moment:
> https://acert.ir.bbn.com/projects/adroitgrdevel
>
> I can certainly believe that you'd get different results, but I would
> expect a big improvement.

Thanks, I followed your instructions and now have stutter-free sampling
at max speed, even with GUIs running concurrently. AWESOME.

cheerio Berndt
Re: [Discuss-gnuradio] USB throughput numbers for NetBSD (and Linux)
> I'm sorry to say that I can't share the enthusiasm as it didn't make
> much difference on a Centrino Duo system running NetBSD-3.99.21.
>
> Does it require GNU Radio current? I'm currently running GNU Radio
> release 2.8 as found in the pkgsrc release!

You need not only the patched NetBSD with the kernel option defined,
but also the modified fusb code that Joanne pointed out recently
(which enables read-ahead and write-behind).

It lives here at the moment:
https://acert.ir.bbn.com/projects/adroitgrdevel

I can certainly believe that you'd get different results, but I would
expect a big improvement.

-- Greg Troxel
Re: [Discuss-gnuradio] USB throughput numbers for NetBSD (and Linux)
Hi,

I'm sorry to say that I can't share the enthusiasm, as it didn't make
much difference on a Centrino Duo system running NetBSD-3.99.21.

Does it require GNU Radio current? I'm currently running GNU Radio
release 2.8 as found in the pkgsrc release!

cheerio Berndt

On Saturday 22 July 2006 15:26, Joanne M Mikkelson wrote:
> Hi,
>
> We collected some data comparing the USB throughput we're getting now
> on NetBSD against the throughput on Linux. For those who are
> interested in the current performance on NetBSD, I've included a
> summary. The full set of measurements taken (along with the summary)
> is available at:
>
> http://acert.ir.bbn.com/viewvc/adroitgrdevel/adroitgrdevel/radio_test/usb/test-results?view=co
>
> Summary
> =======
>
> The following USB throughput results were collected on two systems
> with the same hardware, running NetBSD-current with our ugen changes
> and SuSE Linux.
>
> The ugen changes allow specifying the length of the transfer to
> request from the host controller, and here the fusb_netbsd testing
> code was recompiled with the different sizes. The fusb_linux code
> uses 16k requests (and says that this is the largest request
> possible). In both cases the USRP library's default buffer size of 2
> MB was used. The ugen driver could also be changed to avoid a copy to
> the buffer in the driver, and these tests investigate how much
> performance is improved in that case.
Re: [Discuss-gnuradio] USB throughput numbers for NetBSD (and Linux)
On Sat, Jul 22, 2006 at 01:56:15AM -0400, Joanne M Mikkelson wrote:
> Hi,
>
> We collected some data comparing the USB throughput we're getting now
> on NetBSD against the throughput on Linux. For those who are
> interested in the current performance on NetBSD, I've included a
> summary. The full set of measurements taken (along with the summary)
> is available at:
>
> http://acert.ir.bbn.com/viewvc/adroitgrdevel/adroitgrdevel/radio_test/usb/test-results?view=co

Great report!

I'm not sure I followed the explanation for why on NetBSD the
unidirectional case isn't equal to the sum of the bidirectional case.
Could you try explaining again?

On second thought, is the problem that there's only one request in the
h/w endpoint queue for a given endpoint and direction? If so, I think
you could get the completion interrupt service time out of the critical
path by ensuring that there are always two requests queued in each
direction, not just one.

I'd also be interested in seeing how the throughput holds up with
smaller transfer sizes and smaller total amount of buffer space. For
example, in gnuradio-examples/python/gmsk2/tunnel.py (ethernet over GNU
Radio using CSMA MAC) we're currently running with:

  1024 byte transfers, with a total of 16 blocks (16kB) in each direction.

If we can't enable real-time scheduling, then we run with:

  4096 byte transfers, with a total of 16 blocks (64kB) in each direction.

As you can see, we've cut the total buffer allocated *way* down in
order to minimize the maximum round-trip latency as seen by a software
MAC.

These numbers were empirically chosen as "close to the smallest values
that work on Eric's laptops." They can be overridden on the command
line, and the defaults can be set in the user prefs file,
~/.gnuradio/config.conf:

  [fusb]
  rt_block_size = 1024
  rt_nblocks = 16
  block_size = 4096
  nblocks = 16

Currently only tunnel.py observes these settings.
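To make the buffer arithmetic above concrete, here is a small
standalone sketch that reads an [fusb] section like the one shown and
computes the total buffer per direction. It uses Python's standard
configparser module. It is not how tunnel.py or GNU Radio's own
preferences code reads the file. The helper names and the 4 MB/s rate
used for the drain-time figure are illustrative assumptions.

```python
import configparser

# Same keys and values as the [fusb] section quoted above.
SAMPLE = """
[fusb]
rt_block_size = 1024
rt_nblocks = 16
block_size = 4096
nblocks = 16
"""


def fusb_buffer_bytes(config_text, realtime):
    """Total fusb buffer per direction: block size times block count.

    Uses the rt_* keys when realtime is True, the plain keys otherwise.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["fusb"]
    prefix = "rt_" if realtime else ""
    block_size = section.getint(prefix + "block_size")
    nblocks = section.getint(prefix + "nblocks")
    return block_size * nblocks


def drain_time_ms(buffer_bytes, rate_mb_per_s):
    """Time to move a full buffer at a given rate (1 MB = 10**6 bytes).

    This bounds the latency the buffer alone can add in one direction.
    """
    return buffer_bytes / (rate_mb_per_s * 1_000_000) * 1000


rt_bytes = fusb_buffer_bytes(SAMPLE, realtime=True)       # 16 blocks of 1024
normal_bytes = fusb_buffer_bytes(SAMPLE, realtime=False)  # 16 blocks of 4096
print(rt_bytes, normal_bytes)
# Hypothetical rate, chosen only to show the calculation:
print(drain_time_ms(rt_bytes, 4))
```

The two totals are 16384 and 65536 bytes, matching the 16kB and 64kB
figures above. Smaller totals mean less data can sit queued ahead of a
new sample, which is the latency trade-off described for the software
MAC.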
FYI, test_usrp_standard_{tx,rx} support similar command line options:

  fprintf (stderr, " [-B ] set fast usb block_size\n");
  fprintf (stderr, " [-N ] set fast usb nblocks\n");
  fprintf (stderr, " [-R] set real time scheduling: SCHED_FIFO; pri = midpoint\n");

Thanks again for your efforts and the report!

Eric
[Discuss-gnuradio] USB throughput numbers for NetBSD (and Linux)
Hi,

We collected some data comparing the USB throughput we're getting now
on NetBSD against the throughput on Linux. For those who are
interested in the current performance on NetBSD, I've included a
summary. The full set of measurements taken (along with the summary)
is available at:

http://acert.ir.bbn.com/viewvc/adroitgrdevel/adroitgrdevel/radio_test/usb/test-results?view=co

Summary
=======

The following USB throughput results were collected on two systems
with the same hardware, running NetBSD-current with our ugen changes
and SuSE Linux.

The ugen changes allow specifying the length of the transfer to
request from the host controller, and here the fusb_netbsd testing
code was recompiled with the different sizes. The fusb_linux code
uses 16k requests (and says that this is the largest request
possible). In both cases the USRP library's default buffer size of 2
MB was used. The ugen driver could also be changed to avoid a copy to
the buffer in the driver, and these tests investigate how much
performance is improved in that case.

For reference, here is how interpolation/decimation relates to the
intended data rate:

data rate  | decimation | interpolation
-----------+------------+--------------
16 MB/s    | 16         | 32
18.3 MB/s  | 14         | 28
21.3 MB/s  | 12         | 24
25.6 MB/s  | 10         | 20
32 MB/s    | 8          | 16
42.6 MB/s  | 6          | 12


benchmark_usb.py (bidirectional test)

driver       | xfer size | maximum (read+write)
-------------+-----------+------------------------------
NetBSD       | 16k       | 32 MB/s
Linux        | 16k       | 36.57 MB/s
NetBSD       | 64k       | 32 MB/s (usually gets 36.57)
NetBSD       | 128k      | 32 MB/s
NetBSD -copy | 16k       | 32 MB/s
NetBSD -copy | 64k       | 42.6 MB/s
NetBSD -copy | 128k      | 42.6 MB/s


test_usrp_standard_rx

driver       | xfer size | maximum (MB/s)
-------------+-----------+---------------
NetBSD       | 16k       | 21.3
Linux        | 16k       | 32
NetBSD       | 64k       | 25.6
NetBSD       | 128k      | 21.3
NetBSD -copy | 16k       | 25.6
NetBSD -copy | 64k       | 25.6
NetBSD -copy | 128k      | 25.6

test_usrp_standard_tx

driver       | xfer size | maximum (MB/s)
-------------+-----------+---------------
NetBSD       | 16k       | 21.3
Linux        | 16k       | 32
NetBSD       | 64k       | 25.6
NetBSD       | 128k      | 21.3
NetBSD -copy | 16k       | 21.3
NetBSD -copy | 64k       | 25.6
NetBSD -copy | 128k      | 25.6


The Linux numbers suggest that there is about 36 MB/s bandwidth
available total (maybe more but less than 42), and it must be divided
between transmit and receive. So 32 can be done one-way, but as soon
as one needs bidirectional traffic, neither direction can do 32.
Probably the USRP could be set up to use, say, 25.6 and 8 between read
and write instead of 16 and 16, but not 25.6 and 16.

This follows fairly well from the implementation. On Linux, USRP
reads and writes are all done via a generic request mechanism funneled
through the control endpoint. So the sum of reads and writes in
aggregate seems to be constrained by how fast data can be pushed
through this system.

With our NetBSD implementation, unless the transactions go in
lock-step and thus one of read and write has to wait while the other's
completion interrupt is being handled, read and write are handled
independently all the way down until you get to the host controller
driver. Therefore the bidirectional numbers are more related to the
sum of the two unidirectional numbers, instead of bidirectional being
essentially equal to unidirectional as we're seeing with Linux.
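
The rate table above is consistent with a simple rule: rate in MB/s =
256 / decimation = 512 / interpolation. The sketch below reproduces it.
The hardware figures it uses (64 MS/s receive clock, 128 MS/s transmit
clock, 4 bytes per complex sample) are assumptions that happen to fit
the table. They are not stated in this message, and the function names
are illustrative.

```python
BYTES_PER_SAMPLE = 4      # assumed: 16-bit I plus 16-bit Q
RX_CLOCK = 64_000_000     # assumed receive sample clock, samples/s
TX_CLOCK = 128_000_000    # assumed transmit sample clock, samples/s


def rx_rate_mb(decimation):
    """Receive data rate in MB/s (1 MB = 10**6 bytes) for a decimation."""
    return RX_CLOCK / decimation * BYTES_PER_SAMPLE / 1e6


def tx_rate_mb(interpolation):
    """Transmit data rate in MB/s for an interpolation factor."""
    return TX_CLOCK / interpolation * BYTES_PER_SAMPLE / 1e6


# (decimation, interpolation) pairs taken from the table above
pairs = [(16, 32), (14, 28), (12, 24), (10, 20), (8, 16), (6, 12)]
for decim, interp in pairs:
    print(f"decimation {decim:2d} / interpolation {interp:2d}: "
          f"rx {rx_rate_mb(decim):.2f} MB/s, tx {tx_rate_mb(interp):.2f} MB/s")
```

Under these assumptions, decimation 6 gives about 42.67 MB/s, which the
table lists as 42.6, and decimation 64 gives the 4 MB/s figure
mentioned elsewhere in the thread.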

The NetBSD numbers demonstrate that 128k transfers perform worse than
64k. As would be expected, 128k transfers aren't worse with the extra
copy removed but they also aren't notably better. So while there is
clearly too much cost copying 128k at a time vs. copying 64k, there is
still a lot of cost that's not in the copy at all, because the numbers
don't get vastly better when the copy is removed. The latter cost is
what's preventing us from getting unidirectional rates comparable to
Linux.

Copying to/from user space does not appear to be the bottleneck; the
kernel debug logs clearly show that user space consumes and writes
faster than the bus in these tests.


Choosing a Good Buffer Size
===========================

The previous results are all using a buffer size of 2 MB (which is 2
MB for each of read and write with fusb_netbsd). Also, all reads and
writes from user space were 16k. The following tests indicated the
read and write length does not matter very much. However, reducing
the buffer size from 2 MB demonstrably helps with the bidirectional
throughput.

Because the highest rate reached is not always the same, these results
include several runs of benchmark_usb.py. The maximum rate is based
on what benchmark_usb.py claimed for five runs, trying to take into
account that all the higher transfer rates report underruns or
overruns occasionally.

driver | xfer | buffer | maxim