The first form has a lower latency (due to the special handling of
"move" in LA464 and LA664) despite being longer.

gcc/ChangeLog:

        * config/loongarch/loongarch.md (define_peephole2): Require
        optimize_insn_for_size_p () for move/move/bstrins =>
        srai/bstrins transform.
---

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?
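
For reviewers who want to convince themselves that the two sequences
from the comment compute the same value, here is a minimal C sketch.
It is not part of the patch: bstrins_d, form_move, form_shift and
forms_agree are hypothetical helper names, and bstrins_d is only a
model of my understanding of the instruction (replace bits [msb:lsb]
of rd with the low bits of rj).  Latency is not modelled at all.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "bstrins.d rd, rj, msb, lsb": replace bits [msb:lsb] of rd
   with bits [msb-lsb:0] of rj.  Hypothetical helper, for illustration.  */
static uint64_t
bstrins_d (uint64_t rd, uint64_t rj, int msb, int lsb)
{
  int width = msb - lsb + 1;
  uint64_t field = width == 64 ? ~UINT64_C (0)
                               : (UINT64_C (1) << width) - 1;
  return (rd & ~(field << lsb)) | ((rj & field) << lsb);
}

/* move t0, a0; move a0, a1; bstrins.d a0, t0, 42, 0  */
static uint64_t
form_move (uint64_t a0, uint64_t a1)
{
  uint64_t t0 = a0;
  a0 = a1;
  return bstrins_d (a0, t0, 42, 0);
}

/* srai.d t0, a1, 43; bstrins.d a0, t0, 63, 43
   Only the low 21 bits of t0 are inserted, so a logical shift on an
   unsigned value gives the same result as the arithmetic shift.  */
static uint64_t
form_shift (uint64_t a0, uint64_t a1)
{
  uint64_t t0 = a1 >> 43;
  return bstrins_d (a0, t0, 63, 43);
}

/* Return nonzero if both sequences produce the same a0.  */
int
forms_agree (uint64_t a0, uint64_t a1)
{
  return form_move (a0, a1) == form_shift (a0, a1);
}
```

Both forms keep the low 43 bits of the original a0 and take the high
21 bits from a1, so the only differences are instruction count and
latency.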

 gcc/config/loongarch/loongarch.md | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 25c1d323ba0..e4434c3bd4e 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -1617,20 +1617,23 @@ (define_insn_and_split "*bstrins_<mode>_for_ior_mask"
   })
 
 ;; We always avoid the shift operation in bstrins_<mode>_for_ior_mask
-;; if possible, but the result may be sub-optimal when one of the masks
+;; if possible, but the result may be larger when one of the masks
 ;; is (1 << N) - 1 and one of the src register is the dest register.
 ;; For example:
 ;;     move            t0, a0
 ;;     move            a0, a1
 ;;     bstrins.d       a0, t0, 42, 0
 ;;     ret
-;; using a shift operation would be better:
+;; using a shift operation would be smaller:
 ;;     srai.d          t0, a1, 43
 ;;     bstrins.d       a0, t0, 63, 43
 ;;     ret
 ;; unfortunately we cannot figure it out in split1: before reload we cannot
 ;; know if the dest register is one of the src register.  Fix it up in
 ;; peephole2.
+;;
+;; Note that the first form has a lower latency, so this transform
+;; should only be done when optimizing for size.
 (define_peephole2
   [(set (match_operand:GPR 0 "register_operand")
        (match_operand:GPR 1 "register_operand"))
@@ -1639,7 +1642,7 @@ (define_peephole2
                          (match_operand:SI 3 "const_int_operand")
                          (const_int 0))
        (match_dup 0))]
-  "peep2_reg_dead_p (3, operands[0])"
+  "peep2_reg_dead_p (3, operands[0]) && optimize_insn_for_size_p ()"
   [(const_int 0)]
   {
     int len = GET_MODE_BITSIZE (<MODE>mode) - INTVAL (operands[3]);
-- 
2.45.2
