https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52473
--- Comment #18 from Jürgen Reuter <juergen.reuter at desy dot de> ---
The example posted on c.l.f. on May 20, 2017, as improved by Stefano Zaghi
(below), now shows a factor of 10-20 improvement in gfortran 9.0.0, which
includes the work by Thomas Koenig.

$ ./a.out
 Elapsed CPU time =   0.33764499999999997
 Elapsed CPU time =   0.29224100000000003
 Elapsed CPU time =   0.26565400000000006

Here is the code:

program testme
  use, intrinsic :: iso_fortran_env
  implicit none
  integer(int32), parameter :: n = 200
  real(real32) :: a(n,n,n), b(n,n,n)
  integer(int32) :: j, k
  real(real64) :: t1, t2

  call random_number(a)
  ! Time 100 circular shifts along each of the three dimensions in turn.
  do k = 1, 3
    call cpu_time ( t1 )
    do j = 1, 100
      b = cshift(a, shift=1, DIM=k)
    end do
    call cpu_time ( t2 )
    write ( *, * ) 'Elapsed CPU time = ', t2-t1
  end do
end program testme