Hi Richard,
> The patch below is what I meant. It passes bootstrap & regression-test
> on aarch64-linux-gnu (and so produces the same results for the tests
> that you changed). Do you see any problems with this version?
> If not, I think we should go with it.

Thanks for the detailed example. Unfortunately there are issues with it.
Expanding early means RTL has more instructions to deal with and fewer
optimization opportunities. It even affects inlining: I see more
calls/returns in the instruction frequencies.
Worse, this change completely disables rematerialization of FP immediates,
which means extra spilling. A basic example goes like this:

void g(void);

double bad_remat (double x)
{
  x += 5.347897294;
  g();
  x *= 5.347897294;
  return x;
}

which with -O2 -fomit-frame-pointer -ffixed-d8 -ffixed-d9 -ffixed-d10
-ffixed-d11 -ffixed-d12 -ffixed-d13 -ffixed-d14 now compiles to:

	adrp	x0, .LC0
	str	x30, [sp, -32]!
	ldr	d31, [x0, #:lo12:.LC0]
	str	d15, [sp, 8]
	fadd	d15, d0, d31
	str	d31, [sp, 24]
	bl	g
	ldr	d31, [sp, 24]
	fmul	d0, d15, d31
	ldr	d15, [sp, 8]
	ldr	x30, [sp], 32
	ret

Note that the constant in d31 is spilled to [sp, 24] before the call and
reloaded from the stack afterwards, rather than being rematerialized
from .LC0.

Recent changes have been moving in the opposite direction: keeping
high-level constructs (like GOT accesses) as a single operation works out
better for register allocation and allows more optimization.
So keeping FP immediates as standard move instructions until regalloc
is best. Supporting MOV/FMOV in regalloc would require another secondary
reload (and would then allow rematerialization of these constants).
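
As a rough illustration of what a MOV/FMOV sequence would have to
materialize, here is a minimal sketch in C. It assumes "MOV/FMOV" means
building the constant's bit pattern in a general register and then moving
it to an FP register, and that the non-zero 16-bit chunk count is a fair
proxy for the length of a naive MOVZ/MOVK sequence. The helper names
(double_bits, bits_to_double, nonzero_chunks) are hypothetical and not
taken from GCC or from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the IEEE-754 bit pattern of a double,
   i.e. the value a general register would need to hold before an
   FMOV to an FP register.  */
static uint64_t double_bits (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  return bits;
}

/* Hypothetical helper: rebuild the double from its bit pattern,
   modelling the register-to-register FMOV.  */
static double bits_to_double (uint64_t bits)
{
  double d;
  memcpy (&d, &bits, sizeof d);
  return d;
}

/* Count the non-zero 16-bit chunks of the pattern.  Assumption: this
   approximates the length of a naive MOVZ/MOVK sequence and ignores
   cheaper encodings such as MOVN or bitmask immediates.  */
static int nonzero_chunks (uint64_t bits)
{
  int n = 0;
  for (int shift = 0; shift < 64; shift += 16)
    if ((bits >> shift) & 0xffff)
      n++;
  return n;
}
```

Calling nonzero_chunks (double_bits (5.347897294)) gives a feel for how
many integer moves such a constant would cost compared with a single
literal load. The scratch general register this needs is why a secondary
reload comes into it.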
Cheers,
Wilco