https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86270
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed|2025-02-04 00:00:00 |2025-2-12
--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
The inner loop is still:
.L3:
movq %rax, %rdx
movl %eax, (%rsi,%rax,4)
addq $1, %rax
cmpq %rcx, %rdx
jne .L3
We RTL expand from
# ivtmp.7_7 = PHI <ivtmp.7_8(4), 0(3)>
i_16 = (int) ivtmp.7_7;
MEM[(int *)a.0_1 + ivtmp.7_7 * 4] = i_16;
ivtmp.7_14 = ivtmp.7_7;
ivtmp.7_8 = ivtmp.7_7 + 1;
if (ivtmp.7_14 != _12)
goto <bb 4>; [89.00%]
I'll note that IVOPTs did the right thing and transformed the loop to
_12 = (unsigned long) len.1_15;
_14 = _12 + 1;
<bb 4> [local count: 955630224]:
# ivtmp.7_7 = PHI <ivtmp.7_8(6), 0(3)>
_6 = (unsigned int) ivtmp.7_7;
i_16 = (int) _6;
MEM[(int *)a.0_1 + ivtmp.7_7 * 4] = i_16;
ivtmp.7_8 = ivtmp.7_7 + 1;
if (ivtmp.7_8 != _14)
but we wreck that again later, during forwprop.
I think we can pattern match this at RTL expansion time.
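
For illustration, the two exit-test shapes above can be sketched in C roughly as follows. This is a hand-written sketch, not the testcase from this PR: the function names fill_cmp_old and fill_cmp_new, the size_t parameter and the local variable names are assumptions made for the example. The first form compares the pre-increment IV value against the bound, so the old and new values are both live across the increment (the extra movq). The second form compares the incremented value against a hoisted bound + 1, which is what IVOPTs produced.

```c
#include <stddef.h>

/* Shape that reaches RTL expansion: the exit test uses the
   pre-increment IV value, so a copy of it must be kept.  */
static void
fill_cmp_old (int *a, size_t len)
{
  size_t iv = 0;
  for (;;)
    {
      size_t old = iv;          /* ivtmp.7_14 = ivtmp.7_7 */
      a[iv] = (int) iv;         /* MEM[a + iv * 4] = i */
      iv = iv + 1;              /* ivtmp.7_8 = ivtmp.7_7 + 1 */
      if (old == len)           /* exit when the old value hits the bound */
        break;
    }
}

/* Shape IVOPTs produced: the bound is adjusted once outside the
   loop and the exit test uses the incremented IV directly.  */
static void
fill_cmp_new (int *a, size_t len)
{
  size_t bound = len + 1;       /* _14 = _12 + 1 */
  size_t iv = 0;
  do
    {
      a[iv] = (int) iv;
      iv = iv + 1;
    }
  while (iv != bound);          /* if (ivtmp.7_8 != _14) */
}
```

Both functions store to a[0] through a[len] inclusive; only the form of the exit test differs.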