Hi Andrew,

> I am trying to compare this to what Eikansh implemented last year:
> https://patchwork.sourceware.org/project/gcc/patch/[email protected]/
> He is in the process of updating his patch (other things came up internally).
> One thing he had noticed while working on it was that putting it too
> early meant that the 2 instruction mov/movk form was not being
> generated  when they still should be.
> mov/movk form is fused on some micro-arch (AARCH64_FUSE_MOV_MOVK) so
> we had wanted to keep that output around.
> For an example his f4 (0xffffffff0001fedc) would be better if done
> using mov/movk rather than mov/sub.

The existing implementation always checks for the 2-instruction MOV/MOVK
form before trying more elaborate sequences, using guards like
"if (zero_match < 2 && one_match < 2)".
We could clean up the code so that it handles 2-instruction MOV+MOVK
first and only then deals with the more complex cases.
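
To illustrate the idea behind that check, here is a rough standalone sketch
(not the actual aarch64 backend code; count_chunks and mov_movk_count are
made-up names, and it ignores bitmask immediates and 32-bit forms). It
counts the 16-bit chunks that are all-zeros or all-ones and derives how
many MOVZ/MOVN + MOVK instructions a 64-bit immediate would need:

```c
#include <stdint.h>

/* Hypothetical helper: count the 16-bit chunks of VAL that equal PATTERN
   (0x0000 for a MOVZ base, 0xffff for a MOVN base).  */
static int count_chunks(uint64_t val, uint16_t pattern)
{
    int n = 0;
    for (int shift = 0; shift < 64; shift += 16)
        if (((val >> shift) & 0xffff) == pattern)
            n++;
    return n;
}

/* Hypothetical helper: number of instructions needed when using only
   MOVZ/MOVN followed by MOVKs.  The first instruction sets one chunk and
   fills the rest with the base pattern; each remaining non-matching chunk
   costs one MOVK.  */
static int mov_movk_count(uint64_t val)
{
    int zero_match = count_chunks(val, 0x0000);
    int one_match = count_chunks(val, 0xffff);
    int best = zero_match > one_match ? zero_match : one_match;
    /* If all four chunks match, a single MOVZ/MOVN is still required.  */
    return best == 4 ? 1 : 4 - best;
}
```

For the f4 constant 0xffffffff0001fedc this gives one_match == 2, so the
"zero_match < 2 && one_match < 2" condition is false and the value is
covered by a 2-instruction MOVN + MOVK sequence.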

However, it would be interesting to investigate whether any of the
sequences mentioned in the LLVM document are worth considering.
We are well into diminishing returns - this patch only finds 13 new
2-instruction sequences in all of SPEC2017...

Also, on modern cores with lots of load/store units, loading complex
immediates may be better overall, especially after PR121240 where it would
typically be a single LDR.

Cheers,
Wilco
