On Tue, May 15, 2018 at 08:00:49AM -0500, Wilco Dijkstra wrote:
>
> ping

This seems like a fairly horrible hack around the register allocator
behaviour. But, OK.

James

> This patch improves register allocation of fma by preferring to update the
> accumulator register. This is done by adding fma insns with operand 1 as the
> accumulator. The register allocator considers copy preferences only in operand
> order, so if the first operand is dead, it has the highest chance of being
> reused as the destination. As a result code using fma often has a better
> register allocation. Performance of SPECFP2017 improves by over 0.5% on some
> implementations, while it had no effect on other implementations. Fma is more
> readable too, in a simple example we now generate:
>
>         fmadd   s16, s2, s1, s16
>         fmadd   s7, s17, s16, s7
>         fmadd   s6, s16, s7, s6
>         fmadd   s5, s7, s6, s5
>
> instead of:
>
>         fmadd   s16, s16, s2, s1
>         fmadd   s7, s7, s16, s6
>         fmadd   s6, s6, s7, s5
>         fmadd   s5, s5, s6, s4
>
> Bootstrap OK. OK for commit?
>
> ChangeLog:
> 2018-01-04 Wilco Dijkstra <wdijk...@arm.com>
>
>     gcc/
>     * config/aarch64/aarch64.md (fma<mode>4): Change into expand pattern.
>     (fnma<mode>4): Likewise.
>     (fms<mode>4): Likewise.
>     (fnms<mode>4): Likewise.
>     (aarch64_fma<mode>4): Rename insn, reorder accumulator operand.
>     (aarch64_fnma<mode>4): Likewise.
>     (aarch64_fms<mode>4): Likewise.
>     (aarch64_fnms<mode>4): Likewise.
>     (aarch64_fnmadd<mode>4): Likewise.
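
For readers who want to see the kind of source this affects, here is a minimal C sketch of an accumulate chain. It is an illustration only and is not taken from the patch or its testsuite; the function name dot_acc and its parameters are hypothetical. On AArch64, fmadd Sd, Sn, Sm, Sa computes Sd = Sa + Sn * Sm, so each step below has acc as the addend, and acc is dead once the step has used it. That is the case in which reusing the accumulator register as the destination is attractive. GCC may contract the multiply-add into a single fmadd under its default -ffp-contract=fast setting, but the exact instructions and registers depend on the compiler version and flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: accumulates x[i] * y[i]
   into acc.  Each iteration is a multiply-add whose addend (acc) is not
   needed afterwards, so the result can overwrite the accumulator.  */
static float dot_acc(const float *x, const float *y, size_t n, float acc)
{
    assert(x != NULL && y != NULL);

    for (size_t i = 0; i < n; i++)
        acc = x[i] * y[i] + acc;   /* candidate for one fmadd per step */

    return acc;
}
```

Compiling something like this with an AArch64 GCC at -O2 and inspecting the assembly is one way to check whether the destination and the fourth (addend) operand of each fmadd end up in the same register.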