https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68128

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Can't reproduce, at least not on an i7-5960X (thus OMP_NUM_THREADS=16).
cutcp built with gcc -Ofast -fopenmp performs roughly the same with all of 4.6,
4.8, 5.1 and 6. The only thing that reliably helps (but only by something like
3-4%) is defining __INTEL_COMPILER, as the benchmark uses different code for
ICC and for other compilers: the non-ICC code uses atomics that the ICC code
does not.
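
For illustration only, here is a minimal sketch of the kind of compiler-specific split described above. It is not the cutcp source: the function name scatter_add, its parameters and the loop body are hypothetical, and the sketch assumes the ICC-only branch is safe only when no two threads update the same grid cell.

```c
#include <stddef.h>

/* Hypothetical scatter-add kernel, NOT taken from the benchmark.
 * Each input value val[i] is accumulated into grid[idx[i]]. */
void scatter_add(float *grid, const int *idx, const float *val, int n)
{
    int i;

    /* Ignored (loop runs serially) when built without -fopenmp. */
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
#ifdef __INTEL_COMPILER
        /* ICC-only path: plain update, no atomic.
         * Assumption of this sketch: indices never collide across threads. */
        grid[idx[i]] += val[i];
#else
        /* Path for every other compiler: the update is made atomic,
         * so colliding indices are handled correctly at some cost. */
        #pragma omp atomic
        grid[idx[i]] += val[i];
#endif
    }
}
```

Building such a file with gcc and -D__INTEL_COMPILER selects the first branch, which is the effect the measurement above relies on.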
