"Du, Frank" <frank...@intel.com> writes:

> The PR I committed provides basic support for runtime dispatching. I
> agree that the compiler should generate good vectorized code for the
> non-null data part, but in fact it didn't. jedbrown pointed out that we
> can force the compiler to use SIMD with additional pragmas, something
> like "#pragma omp simd reduction(+:sum)". I will try this pragma later,
> but need to figure out whether it requires linking against OpenMP.

It does not require linking OpenMP.  You just compile with -fopenmp-simd
(gcc/clang) or -qopenmp-simd (icc) so that it interprets the "omp simd"
pragmas.  (These can be captured in macros using _Pragma.)
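
As a minimal sketch of what that could look like (the macro names
DO_PRAGMA and SIMD_SUM_REDUCTION and the function sum_array are
hypothetical, not taken from the PR), the pragma can be wrapped with
_Pragma and applied to a plain summation loop:

```c
#include <stddef.h>

/* Hypothetical helper: _Pragma takes a single string literal, so the
   directive is stringified through one level of macro expansion. */
#define DO_PRAGMA(x) _Pragma(#x)

/* Hypothetical wrapper around "#pragma omp simd reduction(+:var)". */
#define SIMD_SUM_REDUCTION(var) DO_PRAGMA(omp simd reduction(+:var))

/* Sum n doubles. The reduction clause tells the compiler it may keep
   several partial sums (one per SIMD lane) and combine them at the end,
   which is the reordering it will not do on its own under strict
   floating-point rules. */
double sum_array(const double *values, size_t n)
{
    double sum = 0.0;
    SIMD_SUM_REDUCTION(sum)
    for (size_t i = 0; i < n; i++)
        sum += values[i];
    return sum;
}
```

Something like "gcc -O3 -fopenmp-simd -c sum.c" should be enough to have
the pragma honored, with no OpenMP runtime on the link line.  Without the
flag the pragma is simply ignored and the loop stays scalar.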

Note that you get automatic vectorization for this sort of thing without
any OpenMP if you add -funsafe-math-optimizations (included in
-ffast-math).

  https://gcc.godbolt.org/z/8thgru

Many projects don't want -funsafe-math-optimizations because there are
places where it can hurt numerical stability.  ICC includes unsafe math
in normal optimization levels while GCC and Clang are more conservative.
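
To illustrate why reordering a sum is considered "unsafe", here is a
small sketch (function names are hypothetical) that compares a strict
left-to-right sum with a sum split into two interleaved partial sums,
roughly what a 2-wide vectorized reduction would compute:

```c
/* Strict left-to-right summation, the order the C source specifies. */
float sum_in_order(const float *x, int n)
{
    float sum = 0.0f;
    for (int i = 0; i < n; i++)
        sum += x[i];
    return sum;
}

/* Two interleaved partial sums (even and odd indices), combined at the
   end.  This mimics the association order of a 2-lane SIMD reduction. */
float sum_two_lanes(const float *x, int n)
{
    float lane0 = 0.0f, lane1 = 0.0f;
    int i = 0;
    for (; i + 1 < n; i += 2) {
        lane0 += x[i];
        lane1 += x[i + 1];
    }
    if (i < n)          /* leftover element when n is odd */
        lane0 += x[i];
    return lane0 + lane1;
}
```

For the input {1e8f, 1.0f, -1e8f, 1.0f}, and assuming IEEE single
precision with no excess intermediate precision, the in-order sum loses
the first 1.0f when it is added to 1e8f, while the two-lane version
cancels the large terms first and keeps both small ones.  The two
functions therefore return different values for the same data.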
