https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #76 from Richard Biener <rguenth at gcc dot gnu.org> ---
x86_64 has

  <bb 3> [local count: 536870800]:
  # ivtmp.13_3 = PHI <ivtmp.13_9(3), 0(2)>
  vect__1.6_12 = MEM <vector(2) double> [(double *)&a + ivtmp.13_3 * 1];
  MEM <vector(2) double> [(double *)&c + ivtmp.13_3 * 1] = vect__1.6_12;
  ivtmp.13_9 = ivtmp.13_3 + 16;
  if (ivtmp.13_9 != 16000000)

and

.L2:
        movapd  a(%rax), %xmm0
        addq    $16, %rax
        movaps  %xmm0, c-16(%rax)
        cmpq    $16000000, %rax
        jne     .L2

which I think is optimal.  With -fPIC we get

.L2:
        movapd  (%rax,%rdx), %xmm0
        addq    $16, %rax
        movaps  %xmm0, -16(%rax,%rcx)
        cmpq    $16000000, %rax
        jne     .L2

let's close this.
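
For readers without the testcase at hand, here is a minimal sketch of the kind of C source that could produce a loop of this shape. It is inferred from the dump, not copied from the bug's attachment: the array names `a` and `c` come from the listing, the element count is derived from the 16000000-byte bound divided by 8 bytes per double, and the function name `copy_arrays` is hypothetical. Compiling with vectorization enabled (for example `-O3`) is likewise an assumption.

```c
#include <stddef.h>

/* Assumed element count: 16000000 bytes / 8 bytes per double = 2000000. */
#define N 2000000

/* Global arrays named after the symbols that appear in the dump. */
double a[N], c[N];

/* Hypothetical function name: plain element-wise copy of a into c. */
void copy_arrays(void)
{
    for (size_t j = 0; j < N; j++)
        c[j] = a[j];
}
```

With a 16-byte step, each iteration of the vectorized loop moves two doubles, so a loop over 2000000 elements would run 1000000 times, using a single induction variable as both the byte offset and the exit test.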