Hi!

On Wed, Dec 09, 2020 at 05:49:53PM +0800, Kewen.Lin wrote:
> This patch is to treat those new pseudo-to-pseudo copies
> after hard-reg-to-pseudo-copy as zero costs.  The
> justification is that these new copies are closely after
> the corresponding hard-reg-to-pseudo-copy insns, register
> allocation should be able to coalesce them and get them
> eliminated.

Costing things that are not free as cost zero is very problematic.
Cost zero is problematic in combine anyway (it means unknown cost, not
no cost).

> Now these copies follow the normal costing scheme, the
> below case dump shows the unexpected combination:
> 
> ``` dump
> 
> Trying 3, 2 -> 13:
>     3: r119:DI=r132:DI
>       REG_DEAD r132:DI
>     2: r118:DI=r131:DI
>       REG_DEAD r131:DI
>    13: r128:DI=r118:DI&0xffffffff|r119:DI<<0x20
>       REG_DEAD r119:DI
>       REG_DEAD r118:DI

This should not combine if 2+13 and 3+13 do not combine already.  Why
did those not combine?

> Failed to match this instruction:
> (set (reg:DI 128)
>     (ior:DI (ashift:DI (reg:DI 132)
>             (const_int 32 [0x20]))
>         (reg:DI 131)))

Likely because it results in this, and this insn isn't recognised.  So
this can be fixed by adding a pattern for it (it needs to make sure all
but the bottom 32 bits of reg 131 are zero; it can use nonzero_bits for
that).

Long ago I had the following patch for this.  Not sure why I never
submitted it, maybe there is something wronmg with it?


Segher

=
>From 04c44ad71941310c84b376744cfbcc87c93a8d68 Mon Sep 17 00:00:00 2001
Message-Id: 
<04c44ad71941310c84b376744cfbcc87c93a8d68.1528751010.git.seg...@kernel.crashing.org>
From: Segher Boessenkool <seg...@kernel.crashing.org>
Date: Mon, 11 Jun 2018 20:46:31 +0000
Subject: [PATCH] rs6000: Add a splitter for a rl*imi case

An rl*imi is usually written as an IOR of an ASHIFT or similar, and an
AND of a register with a constant mask.  In some cases combine knows
that that AND doesn't do anything (because all zero bits in that mask
correspond to bits known to be already zero), and then no pattern
matches.  This patch adds a define_split for such cases.  It uses
nonzero_bits in the condition of the splitter, but does not need it
afterwards for the instruction to be recognised.  This is necessary
because later passes can see fewer nonzero_bits.

Because it is a splitter, combine will only use it when starting with
three insns (or more), even though the result is just one.  This isn't
a huge problem in practice, but some possible combinations still won't
happen.

---
 gcc/config/rs6000/rs6000.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 38555f5..bc9781b 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -3920,6 +3920,24 @@ (define_insn "*rotl<mode>3_insert_3"
 }
   [(set_attr "type" "insert")])
 
+(define_code_iterator plus_ior_xor [plus ior xor])
+
+(define_split
+  [(set (match_operand:GPR 0 "gpc_reg_operand")
+       (plus_ior_xor:GPR (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand")
+                                     (match_operand:SI 2 "const_int_operand"))
+                         (match_operand:GPR 3 "gpc_reg_operand")))]
+  "nonzero_bits (operands[3], <MODE>mode)
+   < HOST_WIDE_INT_1U << INTVAL (operands[2])"
+  [(set (match_dup 0)
+       (ior:GPR (and:GPR (match_dup 3)
+                         (match_dup 4))
+                (ashift:GPR (match_dup 1)
+                            (match_dup 2))))]
+{
+  operands[4] = GEN_INT ((HOST_WIDE_INT_1U << INTVAL (operands[2])) - 1);
+})
+
 (define_insn "*rotl<mode>3_insert_4"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
        (ior:GPR (and:GPR (match_operand:GPR 3 "gpc_reg_operand" "0")
-- 
1.8.3.1

Reply via email to