Hi Jeon,

speed depends on your hardware and on the implementation of the decoder. As a rule of thumb: the more "generally applicable" a decoder is, the slower it gets. Jan wrote a set of highly SIMD-optimized decoders, but those cover (pretty common) special cases, so they don't handle everything gr-trellis works on, let alone the even more general application range of the IT++ decoders.

I'd assume you could get a significant speed boost if you replaced the IT++ implementation with your own, highly specialized decoder, if you know what you're doing. Honestly, though, implementing (let alone optimizing) a decoder is a non-trivial task, and one should definitely not start a project like gr-ieee802-11 by writing one's own decoder when there's an existing, usable decoder out there (even if IT++ can be a pain to use).
Generally, I'd frown upon using a VM to benchmark decoders: good decoders might make substantial use of advanced SIMD instructions, but these might not be enabled in your virtualizer. Furthermore, if you want to do real-world gr-ieee802-11 usage, *don't* work in a VM unless you're very knowledgeable about how to configure VMs: latency and CPU overhead are critical, so the default "NAT" network configuration will not work well for network-attached USRPs, and USB3 support in VMs ranges between bad and horrible, so B2x0 devices aren't really well suited to VMs, either.

Run "volk-config-info --avail-machines" and check whether the output contains:

generic_orc;sse2_64_mmx_orc;sse3_64_orc;ssse3_64_orc;sse4_1_64_orc;sse4_2_64_orc;avx_64_mmx_orc;

If that's the case, your VMware does allow AVX/SSE4 inside your VM.

Best regards,
Marcus

On 15.09.2015 09:47, Jeon wrote:
> I've measured the time taken by convolutional decoding in gr-ieee802-11.
> The module uses the Punctured Convolutional Code class from the IT++
> library
> (http://itpp.sourceforge.net/4.3.0/classitpp_1_1Punctured__Convolutional__Code.html)
>
> I've used chrono (chrono.h, chrono) to measure the time taken. You can see
> how I did it on the following page
> (https://gist.github.com/gsongsong/7c4081f44e88a7f4407a#file-ofdm_decode_mac-cc-L252-L257)
>
> I measured the time with a loopback flow graph (w/o USRP;
> examples/wifi_loopback.grc).
>
> The result says that it takes from 5,000 to 30,000 us, which is 5 to
> 30 ms, to decode a signal with a length of 9,000 samples (samples are
> either 1 or -1).
>
> * Test environment: Ubuntu 14.04 on VMware, 2 CPUs and 4 GB RAM allocated
> * Host environment: Windows 7 with i7-3770 3.7 GHz
>
> Since I am not familiar with error-correcting codes, I have no idea
> how long decoding should take. But I think that one of the most
> efficient decoding algorithms is Viterbi, and that IT++ must use it.
>
> From that I can deduce that CC decoding takes quite a long time even
> though the algorithm (Viterbi) is very efficient. Is that a natural
> limitation of software decoding and SDR?
>
> Another question: commercial off-the-shelf (COTS) Wi-Fi devices
> achieve really high throughput, which must be based on much faster
> CC decoding. Is that because COTS devices use heavily optimized
> FPGAs and dedicated decoding chips?
>
> Regards,
> Jeon.
>
> _______________________________________________
> Discuss-gnuradio mailing list
> Discuss-gnuradio@gnu.org
> https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
_______________________________________________
Discuss-gnuradio mailing list
Discuss-gnuradio@gnu.org
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio