https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69710
--- Comment #1 from Doug Gilmore <doug.gilmore at imgtec dot com> ---
Created attachment 37615
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37615&action=edit
daxpy for DP (previous was for SP)

Compilation example:

arm-linux-gnueabihf-gcc -O3 -save-temps daxpy.c saxpy.c -c -mfpu=neon -c \
  -fdump-tree-{vect,ivopts}-{verbose,details} -fdump-tree-{slp1,optimized} \
  -fsched-verbose=9 -fdump-rtl-sched{1,2} -marm \
  -funsafe-math-optimizations -funroll-all-loops

Note that Neon does not support DP, thus daxpy.s won't contain autovectorized
code.

I haven't built a ToT compiler for aarch64-linux-gnu, but I suspect that you
will see autovectorized code in daxpy.s in which reasonable schedules are being
produced (loads are being moved above stores).
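
For readers without the attachments at hand, a daxpy kernel generally has the shape sketched below. This is a minimal illustration, not the contents of attachment 37615: the function name, the parameter order, and the use of `restrict` are all assumptions. The `restrict` qualifiers are one way to tell the compiler that `x` and `y` do not overlap, which is the kind of information a scheduler needs before it can hoist loads above earlier stores.

```c
#include <stddef.h>

/* Hypothetical daxpy kernel (signature assumed, not taken from the
   attachment): computes y[i] = a * x[i] + y[i] for i in [0, n). */
void daxpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];   /* two loads, one multiply-add, one store */
}
```

Replacing `double` with `float` gives the corresponding single-precision (saxpy) form.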