Here's my attempt to fix the movk regression in bz 87763. I still wonder whether addressing some of these issues in combine is a better long-term solution, but in the immediate term I think backend patterns are going to have to be the way to go.
This introduces a new insn_and_split that matches a movk via the ior..and form. We rewrite it back into the zero_extract form once operands[0] and operands[1] match; this allows insn fusion in the scheduler to work, as it expects the zero_extract form.

While I have bootstrapped this on aarch64 and aarch64_be, I haven't done anything with ILP32. On aarch64 I have also run this through a regression test cycle, where it fixes the movk regression identified in bz 87763.

Thoughts? If we're generally happy with this direction I can look to tackle the insv_1 and insv_2 regressions in a similar manner.

Jeff
	* config/aarch64/aarch64.md: Add new pattern matching movk
	field insertion via (ior (and ...)).

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index ab8786a933e..109694f9ef0 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1161,6 +1161,54 @@
   [(set_attr "type" "mov_imm")]
 )
 
+;; This is for the combiner to use to encourage creation of
+;; bitfield insertions using movk.
+;;
+;; We rewrite back into a movk bitfield insertion to make sched
+;; fusion happy the first chance we get where the appropriate
+;; operands match.  After LRA they should always match.
+(define_insn_and_split ""
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+	(ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "0")
+			  (match_operand:GPI 2 "const_int_operand" "n"))
+		 (match_operand:GPI 3 "const_int_operand" "n")))]
+  "((UINTVAL (operands[2]) == 0xffffffffffff0000
+     || UINTVAL (operands[2]) == 0xffffffff0000ffff
+     || UINTVAL (operands[2]) == 0xffff0000ffffffff
+     || UINTVAL (operands[2]) == 0x0000ffffffffffff)
+    && (UINTVAL (operands[2]) & UINTVAL (operands[3])) == 0)"
+  "#"
+  "&& rtx_equal_p (operands[0], operands[1])"
+  [(set (zero_extract:<MODE> (match_dup 0)
+			     (const_int 16)
+			     (match_dup 2))
+	(match_dup 3))]
+  "{
+    if (UINTVAL (operands[2]) == 0xffffffffffff0000)
+      {
+	operands[2] = GEN_INT (0);
+	operands[3] = GEN_INT (UINTVAL (operands[3]) & 0xffff);
+      }
+    else if (UINTVAL (operands[2]) == 0xffffffff0000ffff)
+      {
+	operands[2] = GEN_INT (16);
+	operands[3] = GEN_INT ((UINTVAL (operands[3]) >> 16) & 0xffff);
+      }
+    else if (UINTVAL (operands[2]) == 0xffff0000ffffffff)
+      {
+	operands[2] = GEN_INT (32);
+	operands[3] = GEN_INT ((UINTVAL (operands[3]) >> 32) & 0xffff);
+      }
+    else if (UINTVAL (operands[2]) == 0x0000ffffffffffff)
+      {
+	operands[2] = GEN_INT (48);
+	operands[3] = GEN_INT ((UINTVAL (operands[3]) >> 48) & 0xffff);
+      }
+    else
+      gcc_unreachable ();
+  }"
+)
+
 (define_expand "movti"
   [(set (match_operand:TI 0 "nonimmediate_operand" "")
	(match_operand:TI 1 "general_operand" ""))]