https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63724
            Bug ID: 63724
           Summary: [AArch64] Inefficient immediate expansion and hoisting.
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ramana at gcc dot gnu.org

For some cases, such as hmmer in SPEC2k6, we currently generate pretty
rubbish code on AArch64.

float P7Viterbi(int **mmx, int L, int M, int **imx, int **dmx)
{
  int k;

  for (k = 0; k <= M; k++)
    mmx[0][k] = imx[0][k] = dmx[0][k] = -987654321;
}

At -O2 this ends up rebuilding the constant inside the loop for each of
the three stores:

        tbnz    w2, #31, .L4
        ldr     x5, [x3]
        ldr     x4, [x4]
        ldr     x6, [x0]
        mov     x0, 0
.L3:
        mov     w1, 38735
        mov     w3, w1
        movk    w1, 0xc521, lsl 16
        str     w1, [x4, x0, lsl 2]
        movk    w3, 0xc521, lsl 16
        mov     w1, 38735
        str     w3, [x5, x0, lsl 2]
        movk    w1, 0xc521, lsl 16
        str     w1, [x6, x0, lsl 2]
        add     x0, x0, 1
        cmp     w2, w0
        bge     .L3
.L4:
        fmov    s0, wzr
        ret
        .size   P7Viterbi, .-P7Viterbi

and could well be

P7Viterbi:
        tbnz    w2, #31, .L4
        ldr     x5, [x3]
        mov     w1, 38735
        ldr     x3, [x4]
        movk    w1, 0xc521, lsl 16
        ldr     x6, [x0]
        mov     x0, 0
.L3:
        str     w1, [x3, x0, lsl 2]
        str     w1, [x5, x0, lsl 2]
        str     w1, [x6, x0, lsl 2]
        add     x0, x0, 1
        cmp     w2, w0
        bge     .L3
.L4:
        fmov    s0, wzr
        ret
        .size   P7Viterbi, .-P7Viterbi

The hoisting is missed because we expand const_ints too early in the
AArch64 backend. Given that we don't have an "uncse" in the mid-end, it's
quite hard to recover once we've expanded to this form so early in the
compiler.

The simple solution is just to move the logic out into a separate
splitter function. We should also investigate what happens if we start
doing the same for our address computations, but that's the subject of a
separate patch.

Mine.
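
For reference, the mov/movk pair in the listings is simply -987654321 split into two 16-bit halves. The following is only an illustrative sketch of that arithmetic, not the backend's expansion code; the helper names (imm_lo16, imm_hi16, imm_combine) are made up for this note.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only; they are not GCC functions. */

/* Low 16 bits: the immediate loaded by "mov w1, <imm16>". */
static inline uint16_t imm_lo16(int32_t value)
{
    return (uint16_t)((uint32_t)value & 0xFFFFu);
}

/* High 16 bits: the immediate inserted by "movk w1, <imm16>, lsl 16". */
static inline uint16_t imm_hi16(int32_t value)
{
    return (uint16_t)((uint32_t)value >> 16);
}

/* Recombine the halves the way the mov/movk pair does.  The final
   conversion to int32_t assumes the usual two's-complement wraparound,
   which GCC provides.  */
static inline int32_t imm_combine(uint16_t lo, uint16_t hi)
{
    return (int32_t)(((uint32_t)hi << 16) | (uint32_t)lo);
}
```

For -987654321 (0xc521974f as a 32-bit pattern), imm_lo16 gives 38735 (0x974f) and imm_hi16 gives 0xc521, matching the immediates in the assembly. Since the value never changes across iterations, building it once before .L3 is sufficient.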