https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589
--- Comment #25 from Toon Moene <toon at moene dot org> ---
BTW, the speed difference between the native and the OpenMPI-based program is staggering. For a 936x770x60 grid, the native run takes around 14 seconds of elapsed time, while the OpenMPI-based one takes 2 minutes on my Skylake machine (10 cores x 2 hyperthreads).