[USRP-users] X300 recovery after LATE_COMMAND or OVERFLOW_ERROR

Martin Guski via USRP-users Tue, 25 Jul 2017 12:36:00 -0700

Hi!

We are using 8 X300s which are each connected to a computer via dedicated
10 GigE ports. Our application is transmitting and receiving for bursts of
1 second (@10 Msps) on both slots of the USRP. After that we process the
data, transmit/receive again, and so on.
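
As a rough sanity check on the link load, the payload of one burst can be
estimated as below. This is only a sketch: the 4 bytes per sample is an
assumption (16-bit I plus 16-bit Q on the wire), which the setup above does
not state, and packet headers are ignored.

```python
# Rough per-device payload for one burst, in one direction.
SAMPLE_RATE_SPS = 10e6   # from the post: 10 Msps
CHANNELS = 2             # from the post: both slots of the X300
BURST_SECONDS = 1.0      # from the post: 1-second bursts
BYTES_PER_SAMPLE = 4     # ASSUMPTION: 16-bit I + 16-bit Q wire format


def burst_bytes(rate_sps, channels, seconds, bytes_per_sample):
    """Payload bytes moved in one direction during one burst."""
    return int(rate_sps * channels * seconds * bytes_per_sample)


rx_bytes = burst_bytes(SAMPLE_RATE_SPS, CHANNELS, BURST_SECONDS, BYTES_PER_SAMPLE)
rate_mbit = rx_bytes * 8 / BURST_SECONDS / 1e6

print(f"{rx_bytes / 1e6:.0f} MB per burst, {rate_mbit:.0f} Mbit/s while streaming")
```

Under that assumption the payload rate per direction is far below what a
10 GigE link can carry, so raw link capacity alone is unlikely to explain
the overflows.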


After running without errors for some time (a few hours), the rx_streamer
returns errors for the next four bursts in the following pattern:

1) ERROR_CODE_LATE_COMMAND
2) ERROR_CODE_LATE_COMMAND
3) ERROR_CODE_OVERFLOW
4) ERROR_CODE_TIMEOUT

After that it usually continues running without problems for about 20 to
60 min before it happens again. All 8 USRPs report the errors at the same
time. (Each USRP is controlled by a separate multi_usrp running in a
dedicated process.)
I took a closer look at when I start the streaming: the first LATE_COMMAND
really is too late (the time needed to prepare the stream varies). The
following command is definitely not too late, but the LATE_COMMAND error
is raised nevertheless.
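
The "too late or not" check described above can be made explicit by logging
how far ahead of the device clock each stream command is scheduled. The
sketch below makes no UHD calls; the function names and the 50 ms margin are
hypothetical. In a real program the two inputs would presumably be the
time_spec of the stream command and the value of multi_usrp::get_time_now(),
both converted to seconds.

```python
def command_lead_time(start_time_s, device_time_s):
    """Seconds between 'now' on the device and the scheduled start."""
    return start_time_s - device_time_s


def is_probably_late(start_time_s, device_time_s, min_lead_s=0.05):
    """True if the command is scheduled with less than min_lead_s of margin.

    min_lead_s is a hypothetical safety margin, not a value from UHD.
    """
    return command_lead_time(start_time_s, device_time_s) < min_lead_s


# Scheduled 0.5 s ahead of the device clock: enough margin.
print(is_probably_late(start_time_s=10.5, device_time_s=10.0))
# Scheduled in the past relative to the device clock: late.
print(is_probably_late(start_time_s=10.0, device_time_s=10.25))
```

Logging the lead time for every burst would show whether the second
LATE_COMMAND really arrives with a positive margin.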

Sometimes one USRP doesn't recover after returning the ERROR_CODE_TIMEOUT
error and reports LATE_COMMAND and OVERFLOW_ERRORS (with and without
out_of_sequence flag) for all following transmissions. After restarting the
usrp program everything works again.

And from time to time I also get this error:

> UHD Error:
>     The receive packet handler failed to time-align packets.
>     1002 received packets were processed by the handler.
>     However, a timestamp match could not be determined.


So my question is: Is there a way to recover the USRP or the streamer after
I detect an extended sequence of LATE_COMMAND and OVERFLOW_ERRORS?
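
Detecting an "extended sequence" could be as simple as counting consecutive
failed bursts, as in the sketch below. The class name, the string labels and
the threshold of 4 (chosen because the usual pattern above is four bad
bursts) are all hypothetical. The labels mirror the error names in this post
and are not UHD constants. What to do once the detector fires is exactly the
open question and is not shown.

```python
from dataclasses import dataclass


@dataclass
class ErrorRunDetector:
    """Counts consecutive failed bursts and flags an unusually long run."""

    threshold: int = 4   # hypothetical: longest run still considered normal
    run_length: int = 0

    def update(self, error_code: str) -> bool:
        """Record one burst result; return True once the run exceeds threshold."""
        if error_code == "NONE":
            self.run_length = 0      # a clean burst ends the run
        else:
            self.run_length += 1     # any error extends the run
        return self.run_length > self.threshold


detector = ErrorRunDetector(threshold=4)
bursts = ["NONE", "LATE_COMMAND", "LATE_COMMAND", "OVERFLOW", "TIMEOUT",
          "LATE_COMMAND"]
flags = [detector.update(code) for code in bursts]
print(flags)
```

With this input the usual four-burst pattern does not trigger the detector;
only the fifth consecutive error does.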


Thanks
Martin

*Maybe some interesting further information:*
- The error only occurs when using both slots of the USRP; with only one
side, everything works.
- Reducing the sampling rate to 5 Msps doesn't help.
- The ERROR_CODE_TIMEOUT is new with the UHD 3.9.7 release, and it looks
like it sometimes resets the USRP.
- With the older versions of UHD 3.9 it never recovered from the first
occurring error.
- With UHD 3.10 (maint branch), after (I guess) the first error occurred,
the driver process had a CPU usage of 200% until the process was killed.
Also, there were underflows for nearly every transmission (when using both
sides/channels and 10 Msps).


*More information about our setup:*

Ubuntu 16.04.1 LTS
UHD_003.009.007 release

X300
- Hardware Versions 5, 6
Frontends: 2x LFRX and 2x LFTX2x

Intel i7 (i7-5930K @3.50GHz, 6 Cores / 12 threads), 12 GB Ram
- disabled CPU power management

Network: Intel X710 for 10GbE SFP+ (quad-port)
Each X300 (port1) connected directly to NIC port, all have separate netmasks
- MTU set to 9000
- increased the maximum size of the socket buffers
- Flow Control disabled for rx and tx
- ifconfig shows 0 errors, dropped packets or overruns

_______________________________________________
USRP-users mailing list
USRP-users@lists.ettus.com
http://lists.ettus.com/mailman/listinfo/usrp-users_lists.ettus.com
