Hi Jeon,

speed depends on your hardware and the implementation of the decoder.
As a rule of thumb: the more "generally applicable" a decoder is, the
slower it gets.
Jan wrote a set of highly SIMD-optimized decoders, but those cover
(pretty common) special cases, so they don't span everything gr-trellis
handles, let alone the application range of the IT++ decoders. I'd
assume you could get a significant speed boost by replacing the IT++
implementation with your own, highly specialized decoder if you know
what you're doing, but honestly, implementing, let alone optimizing,
a decoder is a non-trivial task, and one should definitely not start a
project like gr-ieee802-11 by writing one's own decoder when there's
an existing, usable decoder out there (even though IT++ can be a pain
to use).
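
If you want to see how much of those 5 to 30 ms actually goes into the
IT++ decoder itself (as opposed to the rest of the flow graph), you can
time it in isolation. The following is only a rough sketch of what I
mean, not the code gr-ieee802-11 uses: I'm assuming the 802.11a/g K=7
code (generators 0133/0171, octal), tail-terminated encoding/decoding,
and a block size roughly matching your 9,000 samples; adjust all of
that to what the module really configures.

// Minimal sketch: time IT++'s convolutional decoder outside of GNU Radio.
// Assumed parameters: 802.11a/g K=7 code, rate 1/2, ~9000 coded samples.
#include <chrono>
#include <iostream>
#include <itpp/itcomm.h>

int main()
{
    itpp::Convolutional_Code code;
    itpp::ivec generators(2);
    generators(0) = 0133;   // octal generator polynomials from the standard
    generators(1) = 0171;
    code.set_generator_polynomials(generators, 7);

    // Encode a random block and map it to +/-1 values, roughly matching
    // the 9,000-sample input you measured (rate 1/2 plus the tail bits).
    itpp::bvec tx_bits = itpp::randb(4500);
    itpp::bvec coded, decoded;
    code.encode_tail(tx_bits, coded);
    itpp::BPSK bpsk;
    itpp::vec rx = bpsk.modulate_bits(coded);   // noiseless soft values

    auto start = std::chrono::high_resolution_clock::now();
    code.decode_tail(rx, decoded);
    auto stop = std::chrono::high_resolution_clock::now();

    std::cout << "decode_tail took "
              << std::chrono::duration_cast<std::chrono::microseconds>(
                     stop - start).count()
              << " us" << std::endl;
    return 0;
}

Build it against IT++ (itpp-config gives you the compiler and linker
flags) and compare the result with the 5 to 30 ms you see inside the
flow graph; the difference tells you how much is really the decoder and
how much is everything around it.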

Generally, I'd frown upon using a VM to benchmark decoders: good
decoders might make substantial use of advanced SIMD instructions, and
these might not be enabled in your virtualizer. Furthermore, if you want
to do real-world gr-ieee802-11 usage, *don't* work in a VM unless
you're very knowledgeable about how to configure VMs: latency and CPU
overhead are critical, so the default "NAT" network configuration will
not work well for network-attached USRPs, and USB3 support in VMs
ranges from bad to horrible, so B2x0s aren't really a good choice in
VMs, either.

Run "volk-config-info --avail-machines" and check whether the output
contains:
generic_orc;sse2_64_mmx_orc;sse3_64_orc;ssse3_64_orc;sse4_1_64_orc;sse4_2_64_orc;avx_64_mmx_orc;

If that's the case, your VMware setup does expose AVX/SSE4 inside the VM.
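
If you want a second data point that's independent of VOLK, a tiny
program using the GCC/Clang builtin __builtin_cpu_supports will tell
you which SIMD extensions the (virtualized) CPU actually exposes; this
is just a quick sanity-check sketch, nothing VMware-specific:

// Quick check: which SIMD extensions does the guest CPU advertise?
#include <cstdio>

int main()
{
    __builtin_cpu_init();
    std::printf("SSE2:   %s\n", __builtin_cpu_supports("sse2")   ? "yes" : "no");
    std::printf("SSSE3:  %s\n", __builtin_cpu_supports("ssse3")  ? "yes" : "no");
    std::printf("SSE4.1: %s\n", __builtin_cpu_supports("sse4.1") ? "yes" : "no");
    std::printf("SSE4.2: %s\n", __builtin_cpu_supports("sse4.2") ? "yes" : "no");
    std::printf("AVX:    %s\n", __builtin_cpu_supports("avx")    ? "yes" : "no");
    return 0;
}

Your host i7-3770 definitely has AVX, so if that prints "no" inside the
guest, the VM is hiding it from you.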

Best regards,
Marcus

On 15.09.2015 09:47, Jeon wrote:
> I've measured time taken by convolutional decoding in gr-ieee802-11.
> The module is using Punctured Convolutional Code class from IT++
> library
> (http://itpp.sourceforge.net/4.3.0/classitpp_1_1Punctured__Convolutional__Code.html)
>
> I've used chrono (chrono.h, chrono) to measure time taken. You can see
> how I made it from the following page
> (https://gist.github.com/gsongsong/7c4081f44e88a7f4407a#file-ofdm_decode_mac-cc-L252-L257)
>
> I've measured time with a loopback flow graph (w/o USRP;
> examples/wifi_loopback.grc)
>
> The result says that it takes from 5,000 to 30,000 us, which is 5 to
> 30 ms, to decode a signal with a length of 9,000 samples (samples are
> either 1 or -1).
>
> * Test environment: Ubuntu 14.04 on VMWare, 2 CPUs and 4 GB RAM allocated
> * Host environment: Windows 7 with i7-3770 3.7 GHz
>
> Since I am not familiar with error correcting codes, I have no idea
> whether this order of magnitude is reasonable. But I think that one of
> the most efficient decoding algorithms is Viterbi and that IT++ must use it.
>
> Then I can deduce that CC decoding takes quite a long time even
> though the algorithm (Viterbi) is very efficient. Is this a natural
> limitation of software decoding and SDR?
>
> Another question: commercial off-the-shelf (COTS) Wi-Fi devices
> achieve really high throughput, which must be based on super-fast
> CC decoding. Is that because COTS devices use heavily optimized
> FPGAs and dedicated decoding chips?
>
> Regards,
> Jeon.
>
>
> _______________________________________________
> Discuss-gnuradio mailing list
> Discuss-gnuradio@gnu.org
> https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
