Thanks Marcus,

Actually, the only filtering I did in the C++ version is for the M&M clock recovery, i.e. interpolating to get the symbols from imperfectly timed samples. In the GRC example I also had an RRC filter with 11*samples_per_symbol taps, but this didn't appear to be the bottleneck. In both applications, the Costas loop and the M&M timing recovery tend to be the problem. I think the C++ application will benefit from multithreading, but I am not sure it can be split into more than possibly 3 threads, since the Costas loop and the M&M loop are both recursive in nature.
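To illustrate the "recursive" point, here is a minimal sketch of a second-order BPSK Costas loop in plain Python. It is not GNU Radio's implementation; the function name costas_bpsk, the gains alpha and beta, and the test signal are assumptions chosen for illustration. The phase estimate used for each sample is produced by the previous iteration, so the loop body cannot simply be divided among threads sample by sample.

```python
import cmath


def costas_bpsk(samples, alpha=0.1, beta=0.005):
    """Illustrative second-order BPSK Costas loop (hypothetical gains).

    Returns the derotated samples plus the final phase and frequency estimates.
    """
    phase = 0.0  # current phase estimate, radians
    freq = 0.0   # current frequency estimate, radians per sample
    corrected = []
    for x in samples:
        # Derotate with the estimate left behind by the PREVIOUS iteration.
        y = x * cmath.exp(-1j * phase)
        # BPSK phase detector: I*Q is zero when the constellation is aligned.
        error = y.real * y.imag
        # Loop filter: these updates feed the next iteration, which is the
        # loop-carried dependency that prevents per-sample parallelism.
        freq += beta * error
        phase += freq + alpha * error
        corrected.append(y)
    return corrected, phase, freq


# Assumed test signal: BPSK symbols with a fixed 0.5 rad carrier phase offset.
symbols = [1.0 if (n // 3) % 2 == 0 else -1.0 for n in range(2000)]
rx = [s * cmath.exp(1j * 0.5) for s in symbols]
corrected, phase, freq = costas_bpsk(rx)
print("final phase estimate:", round(phase, 3))
```

What can still be run in parallel is a pipeline of whole stages (for example filtering, carrier recovery and timing recovery each in its own thread), which is consistent with the estimate of about 3 threads.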
By the way, FFTs don't seem to be such a problem; I can even run at lower decimation rates for those. It is the Costas/M&M processing that seems to be the big killer.

Cheers,
Ian

-----Original Message-----
From: discuss-gnuradio-bounces+ian.holland=rlmgroup.com...@gnu.org [mailto:discuss-gnuradio-bounces+ian.holland=rlmgroup.com...@gnu.org] On Behalf Of Marcus D. Leech
Sent: Friday, 23 April 2010 1:48 PM
To: discuss-gnuradio@gnu.org
Subject: Re: [Discuss-gnuradio] Large number of overflows...

On 04/22/2010 07:56 PM, Matt Ettus wrote:
>
> I am pretty sure that what you are seeing is that your application is
> not keeping up. The USRP2 keeps sending data to the computer as fast
> as it generates it. The ethernet card DMAs it into some buffer in
> memory. Your app uses it and the driver then frees the buffer. If at
> some point the driver receives a frame and there is no buffer free for
> it then the packet will be dropped, and you'll see an "S". S stands
> for sequence number error, which is how the system can tell that there
> is a dropped packet. It is an overrun occurring in the computer, not
> in the hardware. The hardware will not overrun.
>
> The best way to test what is happening is to run usrp2_fft.py. If you
> can run that at the same or higher sample rates than you are using in
> your application, then the driver is not the issue. My guess is that
> your computer will run without problem at decimation of 6 at worst,
> and more likely all the way down to 4. Your app is running at a
> decimation of around 12 or 16, so it is your app that can't keep up.
>
>
> Think of it this way -- the fastest Core i7 machines are 3.2 GHz. For
> a 2 Mbps signal, you only have 1600 cycles per data bit. Assuming you
> are using at least 2 samples per bit, you only have 800 cycles per
> sample to process them. This is certainly possible, but you will need
> to optimize your code.
>
> How long are your filters? Are you using FFT-based filters instead of
> convolution based?
> Is too much memory getting copied around?
>
>
For some perspective, based on USRP1 data: my radio astronomy application runs fairly well at 10.6Msps on a Core 2 Quad 9XXX (9770?) machine with 8G of memory, clocked at about 3.2GHz. My application does a 1Hz-resolution FFT over the data (that's a 10.6M point FFT!), computes the total power, and also does interference notch filtering using an FFT filter, plus SETI analysis, pulsar folding, and transient detection. It can keep up, but all 4 cores are pretty busy!

I think Matt's analysis is pretty close to the mark. One of the mistakes people make (that I've also made) is to specify FIR filters with very narrow transition widths, which causes a very long filter to be created. Relaxing the "skirts" on the filter can dramatically reduce CPU consumption. I typically use filter "skirts" that are roughly 20-25% of the total bandwidth of the filter. In many applications, very tight filtering isn't a requirement for decent performance of the downstream demodulation, particularly when link margins are reasonably good anyway.

--
Marcus Leech
Principal Investigator
Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org

_______________________________________________
Discuss-gnuradio mailing list
Discuss-gnuradio@gnu.org
http://lists.gnu.org/mailman/listinfo/discuss-gnuradio
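
As a rough illustration of the filter-skirt point in Marcus's message, the sketch below estimates FIR length from transition width using fred harris's rule of thumb, N ~= (fs / transition_width) * (attenuation_dB / 22). The helper name estimate_fir_taps, the 1 Msps sample rate, the 100 kHz filter bandwidth and the 60 dB stopband attenuation are all assumptions for illustration, not figures from the thread.

```python
import math


def estimate_fir_taps(sample_rate_hz, transition_width_hz, stopband_atten_db=60.0):
    """Rule-of-thumb FIR length: N ~= (fs / transition width) * (atten_dB / 22)."""
    return math.ceil(sample_rate_hz / transition_width_hz * stopband_atten_db / 22.0)


fs = 1_000_000.0       # assumed sample rate, Hz
bandwidth = 100_000.0  # assumed filter bandwidth, Hz

# Very tight skirt: transition width is 1% of the filter bandwidth.
narrow = estimate_fir_taps(fs, 0.01 * bandwidth)
# Relaxed skirt: transition width is 25% of the filter bandwidth.
relaxed = estimate_fir_taps(fs, 0.25 * bandwidth)

print("taps with 1% skirt: ", narrow)
print("taps with 25% skirt:", relaxed)
```

For a direct-form (convolution) FIR the multiply-accumulate count per output sample scales with the number of taps, so under these assumptions the relaxed filter costs roughly 25 times less CPU.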