https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111612
--- Comment #2 from Tobias Burnus <burnus at gcc dot gnu.org> ---
> To clarify, the numbers here are using mainline,
> and not devel/omp/gcc-13 with -fopenmp-target=acc, right?

The presentation, i.e. everything quoted before "* * *", is with OG13. But I only quoted the result for MPI without any OpenMP/OpenACC/-fopenmp-target=acc. If you are interested in those, go to the presentation.

* * *

However, as far as this PR is concerned, it is about plain single-thread host execution. And everything below the "* * *" is run with GCC mainline (GCC 14 mainline as of yesterday, on an Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz system) and with clang 14.0.0-1ubuntu1.1.

[Note: Contrary to SPEC HPC 2021, the GitHub version supports neither OpenACC nor OpenMP offloading. It does, however, support MPI and OpenMP (using on the host: parallel for / parallel / task / taskgroup) - but as used (see "cmake" line), it uses neither.]

Hence: This PR is really only about single-thread and vectorization (missed) optimization.