Here's my attempt to fix the movk regression on bz 87763.

I still wonder whether addressing some of these issues in combine is a
better long-term solution, but in the short term I think backend
patterns are going to have to be the way to go.

This introduces a new insn_and_split that matches a movk via the
ior..and form.

We rewrite it back into the zero-extract form once operands[0] and
operands[1] match.  This allows insn fusion in the scheduler to work,
as it expects the zero-extract form.

While I have bootstrapped this on aarch64 and aarch64_be, I haven't done
anything with ILP32.

On aarch64 I have also run this through a regression test cycle, where
it fixes the movk regression identified in bz 87763.


Thoughts?  If we're generally happy with this direction I can look to
tackle the insv_1 and insv_2 regressions in a similar manner.

Jeff


        * config/aarch64/aarch64.md: Add new pattern matching movk field
        insertion via (ior (and ...) ...).
        
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index ab8786a933e..109694f9ef0 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1161,6 +1161,54 @@
   [(set_attr "type" "mov_imm")]
 )
 
+;; This is for the combiner to use to encourage creation of
+;; bitfield insertions using movk.
+;;
+;; We rewrite back into a movk bitfield insertion to make sched
+;; fusion happy the first chance we get where the appropriate
+;; operands match.  After LRA they should always match.
+(define_insn_and_split ""
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+       (ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "0")
+                         (match_operand:GPI 2 "const_int_operand" "n"))
+                (match_operand:GPI 3 "const_int_operand" "n")))]
+  "((UINTVAL (operands[2]) == 0xffffffffffff0000
+     || UINTVAL (operands[2]) == 0xffffffff0000ffff
+     || UINTVAL (operands[2]) == 0xffff0000ffffffff
+     || UINTVAL (operands[2]) == 0x0000ffffffffffff)
+    && (UINTVAL (operands[2]) & UINTVAL (operands[3])) == 0)"
+  "#"
+  "&& rtx_equal_p (operands[0], operands[1])"
+  [(set (zero_extract:<MODE> (match_dup 0)
+                            (const_int 16)
+                            (match_dup 2))
+       (match_dup 3))]
+  "{
+     if (UINTVAL (operands[2]) == 0xffffffffffff0000)
+       {
+         operands[2] = GEN_INT (0);
+         operands[3] = GEN_INT (UINTVAL (operands[3]) & 0xffff);
+       }
+     else if (UINTVAL (operands[2]) == 0xffffffff0000ffff)
+       {
+         operands[2] = GEN_INT (16);
+         operands[3] = GEN_INT ((UINTVAL (operands[3]) >> 16) & 0xffff);
+       }
+     else if (UINTVAL (operands[2]) == 0xffff0000ffffffff)
+       {
+         operands[2] = GEN_INT (32);
+         operands[3] = GEN_INT ((UINTVAL (operands[3]) >> 32) & 0xffff);
+       }
+     else if (UINTVAL (operands[2]) == 0x0000ffffffffffff)
+       {
+         operands[2] = GEN_INT (48);
+         operands[3] = GEN_INT ((UINTVAL (operands[3]) >> 48) & 0xffff);
+       }
+     else
+       gcc_unreachable ();
+   }"
+)
+
 (define_expand "movti"
   [(set (match_operand:TI 0 "nonimmediate_operand" "")
        (match_operand:TI 1 "general_operand" ""))]
