Hi all,

I would like to use the volk library in a C++ program that uses
gnuradio-core and currently builds under Linux and Mac OS X. On Mac OS X
10.6.8 (Snow Leopard, fully updated), I installed gnuradio-core with
MacPorts (version 3.3, which is enough for my app). Since, in
my understanding (please correct me if I'm wrong), volk is a library
that can live independently of the gnuradio version, I did the
following:

$  git clone git://gnuradio.org/gnuradio
$  cd gnuradio/volk
$  cmake .
$  make
...
[100%] Built target volk_profile
$  sudo make install

Then I ran the tests:

$ lib/test_all

All tests but one passed. I also see that for some functions the
generic architecture comes out as the best one, which is beyond my
understanding. The test that failed is:

...
volk_32fc_32f_multiply_32fc_a: fail on arch sse
Best arch: sse
/Users/carlesfernandez/Documents/workspace/gnuradio/volk/lib/testqa.cc:25:
error in "volk_32fc_32f_multiply_32fc_a_test": check
run_volk_tests(volk_32fc_32f_multiply_32fc_a_get_func_desc(), (void
(*)())volk_32fc_32f_multiply_32fc_a_manual,
std::string("volk_32fc_32f_multiply_32fc_a"), 1e-4, 0, 20460, 1, 0) ==
0 failed [true != 0]
...
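
For reference, this is what I understand the failing kernel is meant to
compute, judging only from its name: a vector of complex floats (32fc)
multiplied element by element by a vector of real floats (32f). The
sketch below is not volk code. The function names are hypothetical, and
the absolute tolerance check is my assumption about what the 1e-4
argument in the test call means.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical reference: out[i] = in1[i] * in2[i] (complex times real).
std::vector<std::complex<float>> multiply_complex_by_real(
    const std::vector<std::complex<float>>& in1,
    const std::vector<float>& in2)
{
    assert(in1.size() == in2.size());
    std::vector<std::complex<float>> out(in1.size());
    for (std::size_t i = 0; i < in1.size(); ++i) {
        out[i] = in1[i] * in2[i];  // scales both real and imaginary parts
    }
    return out;
}

// Hypothetical check: count the entries whose real or imaginary parts
// differ by more than tol (absolute difference, an assumption).
std::size_t count_mismatches(const std::vector<std::complex<float>>& a,
                             const std::vector<std::complex<float>>& b,
                             float tol)
{
    assert(a.size() == b.size());
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const bool real_off = std::fabs(a[i].real() - b[i].real()) > tol;
        const bool imag_off = std::fabs(a[i].imag() - b[i].imag()) > tol;
        if (real_off || imag_off) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Comparing the output of the sse version against a reference like this
one with count_mismatches(reference, sse_output, 1e-4f) would show how
many elements disagree.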


I'm quite happy because I see dramatic improvements in some of the
functions I'm interested in (basically I want to implement correlators
and mixers, so I'm sensitive precisely to this function, bad luck), but
this "generic" superiority in some cases intrigues me. I would
appreciate it if anyone could shed some light on the internals of volk,
or tell me whether I have to configure or install something else.
Anyway, thanks to the developers for releasing such interesting stuff :-)
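
Many of the timings in the test output are only a few microseconds, so
some of the cases where generic wins might be measurement noise rather
than a real difference. That is a guess on my side, and it assumes each
figure comes from a single pass. A minimal sketch of how I could check
it myself is to repeat each call many times and keep the fastest run.
The helper name is hypothetical, it needs a C++11 compiler, and it is
not part of volk.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <limits>

// Hypothetical helper: call fn `repeats` times and return the duration
// of the fastest call, in seconds.
double best_time_seconds(const std::function<void()>& fn, std::size_t repeats)
{
    assert(repeats > 0);
    double best = std::numeric_limits<double>::infinity();
    for (std::size_t r = 0; r < repeats; ++r) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        const double elapsed =
            std::chrono::duration<double>(stop - start).count();
        if (elapsed < best) {
            best = elapsed;  // keep the least disturbed measurement
        }
    }
    return best;
}
```

Wrapping the call to each implementation in a lambda and passing it to
best_time_seconds with, say, 1000 repeats should give more stable
numbers to compare.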




This is the complete output, for the record:


volk carlesfernandez$ cmake .
-- The C compiler identification is GNU
-- The CXX compiler identification is GNU
-- Checking whether C compiler has -isysroot
-- Checking whether C compiler has -isysroot - yes
-- Checking whether C compiler supports OSX deployment target flag
-- Checking whether C compiler supports OSX deployment target flag - yes
-- Check for working C compiler: /usr/local/bin/gcc
-- Check for working C compiler: /usr/local/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Checking whether CXX compiler has -isysroot
-- Checking whether CXX compiler has -isysroot - yes
-- Checking whether CXX compiler supports OSX deployment target flag
-- Checking whether CXX compiler supports OSX deployment target flag - yes
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Found PythonInterp: /opt/local/bin/python (found version "2.6.7")
-- Boost version: 1.48.0
-- Found the following Boost libraries:
--   unit_test_framework
-- checking for module 'orc-0.4'
--   package 'orc-0.4' not found
-- orc files (missing:  ORC_LIBRARY ORC_INCLUDE_DIR ORCC_EXECUTABLE)
-- Check size of void*
-- Check size of void* - done
-- Performing Test have_maltivec
-- Performing Test have_maltivec - Failed
-- Performing Test have_mfpu=neon
-- Performing Test have_mfpu=neon - Failed
-- Performing Test have_mfloat-abi=softfp
-- Performing Test have_mfloat-abi=softfp - Failed
-- Performing Test have_funsafe-math-optimizations
-- Performing Test have_funsafe-math-optimizations - Success
-- 32 overruled
-- Performing Test have_m64
-- Performing Test have_m64 - Success
-- Performing Test have_m3dnow
-- Performing Test have_m3dnow - Success
-- Performing Test have_msse4.2
-- Performing Test have_msse4.2 - Success
-- Performing Test have_mpopcnt
-- Performing Test have_mpopcnt - Failed
-- Performing Test have_mmmx
-- Performing Test have_mmmx - Success
-- Performing Test have_msse
-- Performing Test have_msse - Success
-- Performing Test have_msse2
-- Performing Test have_msse2 - Success
-- orc overruled
-- Performing Test have_msse3
-- Performing Test have_msse3 - Success
-- Performing Test have_mssse3
-- Performing Test have_mssse3 - Success
-- Performing Test have_msse4a
-- Performing Test have_msse4a - Success
-- Performing Test have_msse4.1
-- Performing Test have_msse4.1 - Success
-- Performing Test have_mavx
-- Performing Test have_mavx - Failed
-- Available arches: generic;64;3dnow;abm;mmx;sse;sse2;sse3;ssse3;sse4_a;sse4_1;sse4_2
-- Available machines: generic;sse2_only;sse2_64;sse3_64;ssse3_64;sse4_1_64
-- Did not find liborc and orcc, disabling orc support...
-- Using install prefix: /usr/local
-- Configuring done
-- Generating done


Tests output:



Running 77 test cases...
Using Volk machine: sse4_1_64
RUN_VOLK_TESTS: volk_16ic_s32f_deinterleave_real_32f_a
sse4_1 completed in 1.5e-05s
sse completed in 5.5e-05s
generic completed in 1.4e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_16ic_deinterleave_real_8i_a
ssse3 completed in 7e-06s
generic completed in 8e-06s
Best arch: ssse3
RUN_VOLK_TESTS: volk_16ic_deinterleave_16i_x2_a
ssse3 completed in 1.7e-05s
sse2 completed in 1.1e-05s
generic completed in 2.1e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_16ic_s32f_deinterleave_32f_x2_a
sse completed in 7.4e-05s
generic completed in 2.1e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_16ic_deinterleave_real_16i_a
ssse3 completed in 6e-06s
sse2 completed in 8e-06s
generic completed in 9e-06s
Best arch: ssse3
RUN_VOLK_TESTS: volk_16ic_magnitude_16i_a
sse3 completed in 0.000132s
sse completed in 0.00015s
generic completed in 0.000218s
Best arch: sse3
RUN_VOLK_TESTS: volk_16ic_s32f_magnitude_32f_a
sse3 completed in 0.000113s
sse completed in 0.000107s
generic completed in 2.7e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_16i_s32f_convert_32f_a
sse4_1 completed in 1.2e-05s
sse completed in 2e-05s
generic completed in 1.1e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_16i_s32f_convert_32f_u
sse4_1 completed in 1.2e-05s
sse completed in 2.1e-05s
generic completed in 1.1e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_16i_convert_8i_a
sse2 completed in 4e-06s
generic completed in 6e-06s
Best arch: sse2
RUN_VOLK_TESTS: volk_16i_convert_8i_u
sse2 completed in 6e-06s
generic completed in 6e-06s
Best arch: sse2
RUN_VOLK_TESTS: volk_16u_byteswap_a
sse2 completed in 6e-06s
generic completed in 1.5e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_32f_accumulator_s32f_a
sse completed in 2.5e-05s
generic completed in 2.1e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_32f_x2_add_32f_a
sse completed in 1.9e-05s
generic completed in 2.4e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32fc_32f_multiply_32fc_a
sse completed in 5.5e-05s
generic completed in 7.2e-05s
offset 4 in1: 0.387495 in2: 0.103868
offset 6 in1: 0.201248 in2: -0.203787
offset 8 in1: 0.549574 in2: 0.499452
offset 12 in1: 0.00829957 in2: 0.00535752
offset 14 in1: 0.139478 in2: 0.0225341
offset 23 in1: 0.440276 in2: 0.620457
offset 24 in1: 0.103921 in2: 0.238003
offset 25 in1: 0.126775 in2: 0.290342
offset 29 in1: 0.135211 in2: -0.115313
offset 30 in1: 0.375913 in2: 0.478058
volk_32fc_32f_multiply_32fc_a: fail on arch sse
Best arch: sse
/Users/carlesfernandez/Documents/workspace/gnuradio/volk/lib/testqa.cc:25:
error in "volk_32fc_32f_multiply_32fc_a_test": check
run_volk_tests(volk_32fc_32f_multiply_32fc_a_get_func_desc(), (void
(*)())volk_32fc_32f_multiply_32fc_a_manual,
std::string("volk_32fc_32f_multiply_32fc_a"), 1e-4, 0, 20460, 1, 0) ==
0 failed [true != 0]
RUN_VOLK_TESTS: volk_32fc_s32f_power_32fc_a
sse completed in 0.000989s
generic completed in 0.000985s
Best arch: generic
RUN_VOLK_TESTS: volk_32f_s32f_calc_spectral_noise_floor_32f_a
sse completed in 1.8e-05s
generic completed in 4.2e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32fc_s32f_atan2_32f_a
sse4_1 completed in 0.000503s
sse completed in 0.000503s
generic completed in 0.000503s
Best arch: sse4_1
RUN_VOLK_TESTS: volk_32fc_x2_conjugate_dot_prod_32fc_u
generic completed in 1.6e-05s
sse3 completed in 1.5e-05s
Best arch: sse3
RUN_VOLK_TESTS: volk_32fc_deinterleave_32f_x2_a
sse completed in 1.8e-05s
generic completed in 2.3e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32fc_deinterleave_64f_x2_a
sse2 completed in 4.4e-05s
generic completed in 3.8e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_32fc_s32f_deinterleave_real_16i_a
sse completed in 2.7e-05s
generic completed in 2e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_32fc_deinterleave_real_32f_a
sse completed in 1.1e-05s
generic completed in 1.5e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32fc_deinterleave_real_64f_a
sse2 completed in 1.5e-05s
generic completed in 1.9e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_32fc_x2_dot_prod_32fc_a
generic completed in 8.8e-05s
sse_64 completed in 2e-05s
sse3 completed in 2.5e-05s
sse4_1 completed in 2.6e-05s
Best arch: sse_64
RUN_VOLK_TESTS: volk_32fc_index_max_16u_a
sse3 completed in 5e-06s
generic completed in 1e-05s
Best arch: sse3
RUN_VOLK_TESTS: volk_32fc_s32f_magnitude_16i_a
sse3 completed in 3.3e-05s
sse completed in 3.1e-05s
generic completed in 8.1e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32fc_magnitude_32f_a
sse3 completed in 2.2e-05s
sse completed in 2.1e-05s
generic completed in 2.2e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32fc_x2_multiply_32fc_a
sse3 completed in 2.4e-05s
generic completed in 0.000201s
Best arch: sse3
RUN_VOLK_TESTS: volk_32f_s32f_convert_16i_a
sse2 completed in 7e-06s
sse completed in 2.3e-05s
generic completed in 1.9e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_32f_s32f_convert_16i_u
sse2 completed in 1e-05s
sse completed in 2.3e-05s
generic completed in 1.8e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_32f_s32f_convert_32i_a
sse2 completed in 8e-06s
sse completed in 2e-05s
generic completed in 1.4e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_32f_s32f_convert_32i_u
sse2 completed in 1.5e-05s
sse completed in 2.3e-05s
generic completed in 1.5e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_32f_convert_64f_a
sse2 completed in 1.4e-05s
generic completed in 1.6e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_32f_convert_64f_u
sse2 completed in 2.1e-05s
generic completed in 1.6e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_32f_s32f_convert_8i_a
sse2 completed in 7e-06s
sse completed in 2.1e-05s
generic completed in 2e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_32f_s32f_convert_8i_u
sse2 completed in 9e-06s
sse completed in 2.5e-05s
generic completed in 2e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_32fc_s32f_power_spectrum_32f_a
sse3 completed in 1.8e-05s
generic completed in 1.5e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_32fc_x2_square_dist_32f_a
sse3 completed in 3e-06s
generic completed in 4e-06s
Best arch: sse3
RUN_VOLK_TESTS: volk_32fc_x2_s32f_square_dist_scalar_mult_32f_a
sse3 completed in 6e-06s
generic completed in 6e-06s
Best arch: sse3
RUN_VOLK_TESTS: volk_32f_x2_divide_32f_a
sse completed in 2.3e-05s
generic completed in 2.1e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_32f_x2_dot_prod_32f_a
generic completed in 0.000351s
sse completed in 0.000112s
sse3 completed in 0.000121s
sse4_1 completed in 7.5e-05s
Best arch: sse4_1
RUN_VOLK_TESTS: volk_32f_x2_dot_prod_32f_u
generic completed in 0.000942s
sse completed in 0.000477s
sse3 completed in 0.000267s
sse4_1 completed in 0.000395s
Best arch: sse3
RUN_VOLK_TESTS: volk_32f_index_max_16u_a
sse4_1 completed in 1.6e-05s
sse completed in 2e-05s
generic completed in 7e-05s
Best arch: sse4_1
RUN_VOLK_TESTS: volk_32f_x2_s32f_interleave_16ic_a
sse2 completed in 1.2e-05s
sse completed in 3.6e-05s
generic completed in 2.7e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_32f_x2_interleave_32fc_a
sse completed in 1.4e-05s
generic completed in 1.9e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32f_x2_max_32f_a
sse completed in 1.1e-05s
generic completed in 1.8e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32f_x2_min_32f_a
sse completed in 1.8e-05s
generic completed in 2e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32f_x2_multiply_32f_a
sse completed in 1.4e-05s
generic completed in 1.3e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_32f_s32f_normalize_a
sse completed in 6e-06s
generic completed in 5e-06s
Best arch: generic
RUN_VOLK_TESTS: volk_32f_s32f_power_32f_a
sse4_1 completed in 0.000523s
sse completed in 0.000521s
generic completed in 0.000521s
Best arch: sse
RUN_VOLK_TESTS: volk_32f_sqrt_32f_a
sse completed in 2.5e-05s
generic completed in 2.1e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_32f_s32f_stddev_32f_a
sse4_1 completed in 8e-06s
sse completed in 6e-06s
generic completed in 2.2e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32f_stddev_and_mean_32f_x2_a
sse4_1 completed in 9e-06s
sse completed in 6e-06s
generic completed in 2.1e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32f_x2_subtract_32f_a
sse completed in 1.2e-05s
generic completed in 1.3e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32f_x3_sum_of_poly_32f_a
sse3 completed in 6e-06s
generic completed in 1.7e-05s
Best arch: sse3
RUN_VOLK_TESTS: volk_32i_x2_and_32i_a
sse completed in 1.2e-05s
generic completed in 1.4e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32i_s32f_convert_32f_a
sse2 completed in 7e-06s
generic completed in 1e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_32i_s32f_convert_32f_u
sse2 completed in 1.1e-05s
generic completed in 1e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_32i_x2_or_32i_a
sse completed in 1.2e-05s
generic completed in 1.4e-05s
Best arch: sse
RUN_VOLK_TESTS: volk_32u_byteswap_a
sse2 completed in 1.3e-05s
generic completed in 2.2e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_64f_convert_32f_a
sse2 completed in 1.1e-05s
generic completed in 1.5e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_64f_convert_32f_u
sse2 completed in 1.9e-05s
generic completed in 1.6e-05s
Best arch: generic
RUN_VOLK_TESTS: volk_64f_x2_max_64f_a
sse2 completed in 2.4e-05s
generic completed in 2.7e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_64f_x2_min_64f_a
sse2 completed in 2.2e-05s
generic completed in 2.5e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_64u_byteswap_a
sse2 completed in 2.7e-05s
generic completed in 2.9e-05s
Best arch: sse2
RUN_VOLK_TESTS: volk_8ic_deinterleave_16i_x2_a
sse4_1 completed in 9e-06s
generic completed in 0.000114s
Best arch: sse4_1
RUN_VOLK_TESTS: volk_8ic_s32f_deinterleave_32f_x2_a
sse4_1 completed in 1.4e-05s
sse completed in 7.2e-05s
generic completed in 9.5e-05s
Best arch: sse4_1
RUN_VOLK_TESTS: volk_8ic_deinterleave_real_16i_a
sse4_1 completed in 5e-06s
generic completed in 3e-05s
Best arch: sse4_1
RUN_VOLK_TESTS: volk_8ic_s32f_deinterleave_real_32f_a
sse4_1 completed in 8e-06s
sse completed in 5.3e-05s
generic completed in 4.8e-05s
Best arch: sse4_1
RUN_VOLK_TESTS: volk_8ic_deinterleave_real_8i_a
ssse3 completed in 5e-06s
generic completed in 5e-06s
Best arch: ssse3
RUN_VOLK_TESTS: volk_8ic_x2_multiply_conjugate_16ic_a
sse4_1 completed in 1.9e-05s
generic completed in 0.000318s
Best arch: sse4_1
RUN_VOLK_TESTS: volk_8ic_x2_s32f_multiply_conjugate_32fc_a
sse4_1 completed in 2.2e-05s
generic completed in 0.000356s
Best arch: sse4_1
RUN_VOLK_TESTS: volk_8i_convert_16i_a
sse4_1 completed in 5e-06s
generic completed in 3.3e-05s
Best arch: sse4_1
RUN_VOLK_TESTS: volk_8i_convert_16i_u
sse4_1 completed in 6e-06s
generic completed in 3.3e-05s
Best arch: sse4_1
RUN_VOLK_TESTS: volk_8i_s32f_convert_32f_a
sse4_1 completed in 7e-06s
generic completed in 4.8e-05s
Best arch: sse4_1
RUN_VOLK_TESTS: volk_8i_s32f_convert_32f_u
sse4_1 completed in 1.3e-05s
generic completed in 4.9e-05s
Best arch: sse4_1

*** 1 failure detected in test suite "Master Test Suite"


Best regards,
Carles
