https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116445
ktkachov at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ktkachov at gcc dot gnu.org
--- Comment #3 from ktkachov at gcc dot gnu.org ---
Perhaps the better comparison here is against -mcpu=cortex-m55 -Os (rather than
-O):
foo:
        movs    r3, #8
        push    {lr}
        dls     lr, r3
.L2:
        and     r0, r1, r0, lsr #1
        le      lr, .L2
        ldr     pc, [sp], #4
It manages to avoid decrementing r3 in the loop altogether (dls/le keep the
iteration count in lr), so it should be better for both code size and speed.
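
For reference, a C loop of roughly this shape would correspond to the assembly
above. This is only a sketch reconstructed from the instructions shown (eight
iterations of r0 = r1 & (r0 >> 1)); the parameter names and types are
assumptions, and the actual testcase in the report may differ:

```c
#include <stdint.h>

/* Hypothetical reconstruction, not the testcase from the report.
   a corresponds to r0, b to r1; the trip count 8 matches "movs r3, #8". */
uint32_t foo(uint32_t a, uint32_t b)
{
    /* The counter i has no other use in the body, so the dls/le pair can
       hold the trip count in lr and no explicit decrement is needed. */
    for (int i = 0; i < 8; i++)
        a = b & (a >> 1);   /* and r0, r1, r0, lsr #1 */
    return a;
}
```

The shift is on an unsigned value to match the logical shift (lsr) in the
generated code.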