On 2022-02-04 15:26, Richard Bell wrote:
Hello,
I know that when I see O's (overruns) in the terminal it means my host
processing is not keeping up with the sample stream coming in from the
USRP. Samples are being dropped because the host is too slow to keep up.
I'm wondering if there is a test I can run that would reveal the cause
of the O's on my server. What on my server is the bottleneck? Do O's
mean buffers are overrunning within the NIC itself? After the CPU?
While filling up RAM?
I am using a 2-port QSFP+ 100G NIC with both ports attached via QSFP+
cables to a 100G switch. From the switch I connect 5 USRP N310s via
their SFP+ ports and SFP+ cables. Each N310's dual SFP+ ports are both
connected to the 100G switch, and in this configuration I am able to
use 2 of the 5 N310s simultaneously with 2 receive antennas per radio
sampling at 125 MHz without any O's. When I increase the number of
radios above this, I start seeing O's. The server is a 64-core machine
with 200 GB of RAM.
I calculate the total throughput required to keep up with 5 N310s
sampling at 125 MHz from 2 antennas each, with 16-bit I and 16-bit Q
coming off the wire at the server, as:
(5 radios) × (2 antennas) × (125 Msamples/s) × (32 bits per complex
sample) = 40 Gbit/s, or just 5 GByte/s. This is well below the
capability of the network and, I assume, of a high-end 64-core server,
unless I'm overlooking something?
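The same arithmetic as a quick Python sanity check (figures taken
directly from the setup described above):

```python
# Sanity check of the aggregate throughput calculation.
radios = 5
antennas_per_radio = 2
sample_rate = 125e6        # samples per second per antenna
bits_per_sample = 32       # 16-bit I + 16-bit Q per complex sample

bits_per_second = radios * antennas_per_radio * sample_rate * bits_per_sample
print(f"{bits_per_second / 1e9:.0f} Gbit/s")       # → 40 Gbit/s
print(f"{bits_per_second / 8 / 1e9:.0f} GByte/s")  # → 5 GByte/s
```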
Any help or feedback is appreciated.
Richard
Don't forget that, in general, a single CPU core handles the packets
coming from your NIC. Distributing that load over multiple cores is
exceedingly difficult to do in a way that actually improves overall
performance.
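One quick way to see whether that's happening is to look at
/proc/interrupts (Linux) and check whether a single core is absorbing
all of the NIC's receive interrupts. A minimal sketch; the "mlx5" label
and the sample text are illustrative, so substitute whatever your NIC's
IRQ lines are actually named:

```python
# Sum per-CPU interrupt counts for IRQ lines whose label matches
# `pattern`, from /proc/interrupts-style text. A single large entry
# means one core is servicing (nearly) all of the NIC's interrupts.
def irq_counts_per_cpu(text, pattern):
    lines = text.strip().splitlines()
    n_cpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    totals = [0] * n_cpus
    for line in lines[1:]:
        if pattern not in line:
            continue
        fields = line.split()
        # fields[0] is the IRQ number, then one count per CPU
        for i, count in enumerate(fields[1:1 + n_cpus]):
            if count.isdigit():
                totals[i] += int(count)
    return totals

# Illustrative sample; on a real host use open("/proc/interrupts").read()
sample = """\
       CPU0   CPU1
 42:  90000    100   PCI-MSI  mlx5_comp0
 43:      5      7   PCI-MSI  i8042
"""
print(irq_counts_per_cpu(sample, "mlx5"))  # → [90000, 100]
```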
Once your samples are "inside the system", there's a LOT of "bits and
pieces" at play, and it's usually hard to point to a single thing and
say "there it is, there's
the performance bottleneck".
Quite apart from CPU considerations (and you can start running into
overrun situations long before your CPU is close to saturated), there
are memory-bandwidth issues, I/O-bandwidth issues, etc., etc.
For 100 Msps-scale sample rates, you should probably increase the
net.core.rmem_max sysctl parameter beyond even what UHD recommends by
default, just to make sure that transient insufficiencies in moving
samples up to user space don't cause you issues.
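For example, you can check the current ceiling before raising it with
`sysctl -w net.core.rmem_max=...`. The helper below and its target
value are just an illustrative sketch, not a UHD recommendation:

```python
# Illustrative check of the kernel's socket receive-buffer ceiling.
RMEM_MAX_PATH = "/proc/sys/net/core/rmem_max"

def rmem_max_at_least(target, path=RMEM_MAX_PATH):
    """True/False if net.core.rmem_max is at least `target` bytes;
    None if the file is unreadable (e.g. not Linux)."""
    try:
        with open(path) as f:
            return int(f.read().strip()) >= target
    except OSError:
        return None

# e.g.: rmem_max_at_least(500_000_000)  # arbitrary example target
```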
The overall take-away, though, is that adding buffering to a
signal-processing pathway that cannot, on average, keep up does
nothing except delay the point at which samples start getting dropped.
That's just basic producer-consumer theory in computer science, and
not unique to DSP flows...
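A toy model makes the point concrete (the numbers are arbitrary): if
the consumer is even slightly slower on average, doubling the buffer
merely doubles the time until the first drop; it never prevents it.

```python
# Producer adds `produce` units per step, consumer drains `consume`;
# return the step at which the buffer first overflows, else None.
def first_drop_step(produce, consume, buffer_size, max_steps=1_000_000):
    level = 0
    for step in range(1, max_steps + 1):
        level = max(level + produce - consume, 0)
        if level > buffer_size:
            return step
    return None  # consumer kept up for the whole run

print(first_drop_step(10, 9, 1000))   # buffer 1000 → drop at step 1001
print(first_drop_step(10, 9, 2000))   # buffer doubled → step 2001
print(first_drop_step(10, 10, 1000))  # consumer keeps up → None
```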
Rob Kossler has already mentioned this guide, but here's a pointer:
https://kb.ettus.com/USRP_Host_Performance_Tuning_Tips_and_Tricks
_______________________________________________
USRP-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]