https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98673
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to fail|                            |10.2.1
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70359
      Known to work|                            |11.0

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
So with your testcase on trunk I see for RISCV

        ble     a1,zero,.L2
        li      a6,4
        li      a5,0
        sub     a6,a6,a2
.L5:
        lw      a4,4(a2)
        slli    a7,a5,2
        add     t1,a6,a2
        addi    a5,a5,1
        ble     a4,a0,.L3
        lw      t3,0(a2)
        ble     t3,a0,.L9
.L3:
        addi    a2,a2,4
        bne     a1,a5,.L5

which is fine, same for x86.  This is usually a SSA coalescing issue where
a failed coalesce ends up splitting the backedge and emitting a move there.

I can see the issue on the branch where the problematic one is

;;   basic block 4, loop depth 1
;;    pred:       3
;;                7
  # i_57 = PHI <0(3), i_41(7)>
...
;;   basic block 7, loop depth 1
;;    pred:       4
;;                5
  i_41 = i_57 + 1;
  ivtmp.14_90 = ivtmp.14_91 + 4;
  if (_6 != i_41)
    goto <bb 4>; [94.50%]
  else
    goto <bb 8>; [5.50%]
;;    succ:       4
;;                8

;;   basic block 8, loop depth 0
;;    pred:       7
  _87 = (sizetype) i_57;
  _146 = _87 + 2;

which is a use of the pre-increment i_57 on the loop exit edge.  This
inhibits coalescing of i_57 and i_41, causing the copy.  That's exactly the
issue noted in the cited PRs.

There have been patches floating around re-materializing i_41 + 1 at the
point of i_57 to make the coalescing possible but I think nobody developed
them in full.  See the thread starting at
https://gcc.gnu.org/pipermail/gcc-patches/2018-March/495843.html
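
For readers following along, here is a rough source-level analogy of the pattern described above. It is not the testcase from this PR; the function names and the `old` variable are hypothetical and chosen only for illustration. The first function keeps the pre-increment value alive across the loop exit (like i_57 being used in bb 8), the second recomputes that value from the post-increment one after the loop, which is roughly the idea behind the re-materialization patches mentioned. Both assume n >= 1. What GCC actually produces for either form depends on the version and on which passes run, so treat this as a sketch of the live-range situation rather than a reliable reproducer.

```c
#include <stddef.h>

/* Hypothetical illustration only, not the PR's testcase.
   Requires n >= 1 because the loop body runs at least once. */

/* The pre-increment value (old, playing the role of i_57) is used after
   the loop, so it is live at the same time as the post-increment value
   (i, playing the role of i_41). */
size_t exit_uses_old(const int *a, size_t n, long *sum)
{
    size_t i = 0, old;
    long s = 0;
    do {
        old = i;          /* pre-increment value */
        s += a[old];
        i = old + 1;      /* post-increment value */
    } while (i != n);
    *sum = s;
    return old + 2;       /* use of the old value on the exit path */
}

/* Same result, but the old value is recomputed from the post-increment
   value after the loop, so only one induction value is live at the exit. */
size_t exit_uses_new(const int *a, size_t n, long *sum)
{
    size_t i = 0;
    long s = 0;
    do {
        s += a[i];
        i = i + 1;
    } while (i != n);
    *sum = s;
    return (i - 1) + 2;   /* old value rebuilt as i - 1 */
}
```

Comparing the output of -fdump-tree-optimized and the generated assembly for the two functions is one way to check whether a given GCC version emits the extra move on the backedge.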