On Thu, Feb 06, 2014 at 05:21:00PM -0500, Tim Prince wrote:
> I'm seeing vectorization but no output from
> -ftree-vectorizer-verbose, and no dot product vectorization inside
> omp parallel regions, with gcc g++ or gfortran 4.9.  Primary targets
> are cygwin64 and linux x86_64.
> I've been unable to use -O3 vectorization with gcc, although it
> works with gfortran and g++, so use gcc -O2 -ftree-vectorize
> together with additional optimization flags which don't break.

Can you file a GCC bugzilla PR with minimal testcases for this (or point us
at already filed bugreports)?

> I've made source code changes to take advantage of the new
> vectorization with merge() and ? operators; while it's useful for
> -march=core-avx2, it's sometimes a loss for -msse4.1.
> gcc vectorization with #pragma omp parallel for simd is reasonably
> effective in my tests only on 12 or more cores.

Likewise.

> #pragma omp simd reduction(max: ) is giving correct results but poor
> performance in my tests.

Likewise.

	Jakub