On 22/01/2024 12:18, Matthieu Longo wrote:
> rev16 pattern was not recognised anymore as a change in the bswap tree
> pass was introducing a new GIMPLE form, not recognized by the assembly
> final transformation pass.
> 
> More details in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108933
> 
> gcc/ChangeLog:
> 
>         PR target/108933
>         * config/arm/arm.md (*arm_rev16si2_alt3): new pattern to convert
>           a bswap + rotate by 16 bits into rev16

ChangeLog entries need to be written as sentences, so start with a capital 
letter and end with a full stop; continuation lines should start in column 8 
(one hard tab, don't use spaces).  But in this case, "New pattern." is 
sufficient.

> 
> gcc/testsuite/ChangeLog:
> 
>         PR target/108933
>         * gcc.target/arm/rev16.c: Moved to...
>         * gcc.target/arm/rev16_1.c: ...here.
>         * gcc.target/arm/rev16_2.c: New test to check that rev16 is
>           emitted.


+;; Similar pattern to match (rotate (bswap) 16)
+(define_insn "*arm_rev16si2_alt3"
+  [(set (match_operand:SI 0 "register_operand" "=l,r")
+        (rotate:SI (bswap:SI (match_operand:SI 1 "register_operand" "l,r"))
+                     (const_int 16)))]
+  "arm_arch6"
+  "rev16\\t%0, %1"
+  [(set_attr "arch" "t,32")
+   (set_attr "length" "2,4")
+   (set_attr "type" "rev")]
+)
+

Unfortunately, this is insufficient.  When generating Arm or Thumb2 code (but 
not Thumb1) we also have to handle conditional execution: we need to have '%?' 
in the output template at the point where a condition code might be needed.  
That means we need three alternatives, each with its own output template (as we 
need a 16-bit variant for Thumb2 that's conditional and a 16-bit variant for 
Thumb1 that isn't).  See the output of arm_rev16 for a guide to what is really 
needed.
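
As a rough, untested sketch of that shape: the constraint letters, the 
"t1,t2,32" arch split and the lengths below are assumed by analogy with 
arm_rev16 and should be checked against arm.md rather than taken as given.  
The point is only that the Thumb1 alternative has no '%?' while the other two 
do:

```
;; Sketch only: three alternatives (Thumb1, Thumb2, 32-bit).
(define_insn "*arm_rev16si2_alt3"
  [(set (match_operand:SI 0 "register_operand" "=l,l,r")
        (rotate:SI (bswap:SI (match_operand:SI 1 "register_operand" "l,l,r"))
                   (const_int 16)))]
  "arm_arch6"
  "@
  rev16\\t%0, %1
  rev16%?\\t%0, %1
  rev16%?\\t%0, %1"
  [(set_attr "arch" "t1,t2,32")
   (set_attr "length" "2,2,4")
   (set_attr "type" "rev")]
)
```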

I note that the arm_rev16si2_alt1 and arm_rev16si2_alt2 patterns are incorrect 
in this regard as well; that will need fixing.

I also see that arm_rev16si2 currently expands to the alt1 variant above; given 
that the preferred canonical form would now appear to use bswap + rotate, we 
should change that as well.  In fact, we can merge your new pattern with the 
expand entirely and eliminate the need to call gen_arm_rev16si2_alt1.  
Something like:

(define_insn "arm_rev16si2"
  [(set (match_operand:SI 0 "s_register_operand")
        (rotate:SI (bswap:SI (match_operand:SI 1 "s_register_operand"))
                   (const_int 16)))]
  "arm_arch6"
  "@
  rev16...
  ...
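
As a sanity check that bswap + rotate by 16 really is the rev16 operation, 
here is a small C sketch comparing the two forms.  The function names are 
made up for illustration, and __builtin_bswap32 is the GCC/Clang builtin, so 
this assumes one of those compilers:

```c
#include <stdint.h>

/* rev16 written directly: swap the two bytes within each 16-bit half.  */
static uint32_t rev16_bytes(uint32_t x)
{
    return ((x & 0x00ff00ffu) << 8) | ((x & 0xff00ff00u) >> 8);
}

/* The form the bswap pass now produces: full byte swap, then rotate by 16.  */
static uint32_t rev16_bswap_rotate(uint32_t x)
{
    uint32_t b = __builtin_bswap32(x);
    return (b << 16) | (b >> 16);
}
```

For example, both functions map 0x11223344 to 0x22114433.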


R.
