On Tue, Jan 13, 2026 at 3:55 AM Wilco Dijkstra <[email protected]> wrote:
>
> Hi Andrew,
>
> > I am trying to compare this to what Eikansh implemented last year:
> > https://patchwork.sourceware.org/project/gcc/patch/[email protected]/
> > He is in the process of updating his patch (other things came up
> > internally).
> > One thing he had noticed while working on it was that putting it too
> > early meant that the 2 instruction mov/movk form was not being
> > generated when they still should be.
> > mov/movk form is fused on some micro-arch (AARCH64_FUSE_MOV_MOVK) so
> > we had wanted to keep that output around.
> > For an example his f4 (0xffffffff0001fedc) would be better if done
> > using mov/movk rather than mov/sub.
>
> The existing implementation always checks for that before trying more
> elaborate sequences, like "if (zero_match < 2 && one_match < 2)".
> We could clean up the code and handle 2-instruction MOV+MOVK
> first and only then deal with more complex cases.
>
> However what would be interesting is investigating whether any of the
> sequences mentioned in the LLVM document are worth considering.
> We are well into diminishing returns - this patch only finds 13 new
> 2-instruction sequences in all of SPEC2017...
>
> Also on modern cores with lots of L/S units, loading complex immediates
> may be better overall, especially after PR121240 where it would typically
> be a single LDR.
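
As a rough illustration of the check quoted above (a sketch only, not
the GCC implementation): the helper names count_matching_chunks and
mov_movk_length are made up here, and the model assumes a plain
MOVZ/MOVN + MOVK sequence, ignoring bitmask immediates and 32-bit
forms. It counts the 16-bit chunks that are all zeros or all ones and
derives how many instructions the MOV/MOVK form would need.

```c
#include <stdint.h>

/* Hypothetical helper, not GCC code: count the 16-bit chunks of VAL
   that are equal to PATTERN (0x0000 or 0xffff).  */
static int count_matching_chunks(uint64_t val, uint16_t pattern)
{
    int matches = 0;
    for (int shift = 0; shift < 64; shift += 16)
        if (((val >> shift) & 0xffff) == pattern)
            matches++;
    return matches;
}

/* Hypothetical helper: length of a plain MOVZ/MOVN + MOVK sequence.
   MOVZ leaves the other chunks zero and MOVN leaves them all ones, so
   each chunk that differs from the chosen background costs one
   instruction, with a minimum of one.  */
static int mov_movk_length(uint64_t val)
{
    int zero_match = count_matching_chunks(val, 0x0000);
    int one_match = count_matching_chunks(val, 0xffff);
    int best = zero_match > one_match ? zero_match : one_match;
    int len = 4 - best;
    return len < 1 ? 1 : len;
}
```

For f4 (0xffffffff0001fedc) two chunks are 0xffff, so one_match is 2
and the sketch reports a 2-instruction sequence. Under this model,
"zero_match < 2 && one_match < 2" is the case where MOV/MOVK alone
would need three or more instructions.
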

I wanted to make sure I understood the trade-off behind implementing it
this way (and to have it documented). The reason was not obvious from
the original patch, but now it makes sense.

So your patch is OK.

Thanks,
Andrew

>
> Cheers,
> Wilco
>
