https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102018
Torbjorn SVENSSON <azoff at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |azoff at gcc dot gnu.org
--- Comment #3 from Torbjorn SVENSSON <azoff at gcc dot gnu.org> ---
As far as I can tell, the failure occurs because vcmpe.f64 is used at -O1,
-O2 and -O3 (-Ofast behaves the same way, but there is no test for it). At -O0
and -Os, vcmp.f64 is used instead.
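For context on why the choice of instruction matters: per the Arm
architecture, vcmpe.f64 raises the Invalid Operation exception when either
operand is any kind of NaN, while vcmp.f64 raises it only for a signaling NaN.
The C sketch below models just that rule on raw binary64 bit patterns. The
function names are made up for illustration, and the NZCV flag results are
not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Field masks for an IEEE 754 binary64 value. */
#define EXP_MASK   0x7ff0000000000000ULL
#define FRAC_MASK  0x000fffffffffffffULL
#define QUIET_BIT  0x0008000000000000ULL  /* top fraction bit: set = quiet NaN */

/* Hypothetical helper: raw bit pattern of a double. */
uint64_t double_bits(double d)
{
    uint64_t b;
    assert(sizeof d == sizeof b);
    memcpy(&b, &d, sizeof b);
    return b;
}

/* NaN: exponent all ones and a non-zero fraction. */
bool is_nan_bits(uint64_t b)
{
    return (b & EXP_MASK) == EXP_MASK && (b & FRAC_MASK) != 0;
}

/* Signaling NaN: a NaN whose quiet bit is clear. */
bool is_snan_bits(uint64_t b)
{
    return is_nan_bits(b) && (b & QUIET_BIT) == 0;
}

/* Model of vcmp.f64: Invalid Operation only for signaling NaN operands. */
bool vcmp_sets_invalid(uint64_t a, uint64_t b)
{
    return is_snan_bits(a) || is_snan_bits(b);
}

/* Model of vcmpe.f64: Invalid Operation for any NaN operand. */
bool vcmpe_sets_invalid(uint64_t a, uint64_t b)
{
    return is_nan_bits(a) || is_nan_bits(b);
}
```

With a quiet NaN input (bit pattern 0x7ff8000000000000), only the vcmpe model
reports the exception, which would be consistent with a test that checks
FE_INVALID failing only at the levels where vcmpe.f64 is emitted.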
Assembly for -O1:
foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        vcmpe.f64 d0, #0 @ 43 [c=4 l=4] *cmpdf_trap_vfp/1
        vmrs APSR_nzcv, FPSCR @ 44 [c=4 l=4] *movcc_vfp
        bls .L5 @ 11 [c=16 l=2] arm_cond_branch
        vmov.f64 d7, #1.0e+0 @ 45 [c=4 l=4] *thumb2_movdf_vfp/2
        vcmp.f64 d0, d7 @ 41 [c=4 l=4] *cmpdf_vfp/0
        vmrs APSR_nzcv, FPSCR @ 42 [c=4 l=4] *movcc_vfp
        bgt .L5 @ 18 [c=16 l=2] arm_cond_branch
        vmul.f64 d0, d0, d0 @ 26 [c=24 l=4] *muldf3_vfp
        bx lr @ 49 [c=8 l=4] *thumb2_return
.L5:
        vadd.f64 d0, d0, d0 @ 21 [c=16 l=4] *adddf3_vfp
        bx lr @ 39 [c=8 l=4] *thumb2_return
Assembly for -Os:
foo:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        vcmp.f64 d0, #0 @ 64 [c=4 l=4] *cmpdf_vfp/1
        vmrs APSR_nzcv, FPSCR @ 65 [c=4 l=4] *movcc_vfp
        bls .L2 @ 8 [c=16 l=2] arm_cond_branch
        vmov.f64 d7, #1.0e+0 @ 13 [c=4 l=4] *thumb2_movdf_vfp/2
        vcmp.f64 d0, d7 @ 62 [c=4 l=4] *cmpdf_vfp/0
        vmrs APSR_nzcv, FPSCR @ 63 [c=4 l=4] *movcc_vfp
        ble .L4 @ 15 [c=16 l=2] arm_cond_branch
.L2:
        vadd.f64 d0, d0, d0 @ 18 [c=4 l=4] *adddf3_vfp
        bx lr @ 60 [c=8 l=4] *thumb2_return
.L4:
        vmul.f64 d0, d0, d0 @ 23 [c=4 l=4] *muldf3_vfp
        bx lr @ 69 [c=8 l=4] *thumb2_return
The assembly above was produced by compiling with:
arm-none-eabi-gcc pr82692.c -mthumb -march=armv7e-m+fp.dp -mtune=cortex-m7 \
    -mfloat-abi=hard -mfpu=auto -S -o - -Os
The bug only appears when -mtune=cortex-m7 or -mcpu=cortex-m7 is used. I
suspect it is related to the cost model for Cortex-M7; otherwise, Cortex-M4
would likely be affected too.