Stepping through a gdb session to inspect the costs that cause
gcc.dg/tree-ssa/slsr-13.c to fail showed that, before this
patch, cris_rtx_costs reported that shifting a register by 1
costs 5, while adding two registers costs 4.

Making the cost of a quick-immediate constant equal to using
a register (default 0) reflects actual performance and
size-cost better.  It also happens to make
gcc.dg/tree-ssa/slsr-13.c pass with what looks like better
code being generated, and improves coremark performance by
0.4%.

But blindly doing this for *all* valid operands that fit
the "quick-immediate" addressing mode trips over interactions
with other factors*, with mixed results at best.  So, for the
time being, do this only for MINUS, shifts, AND and IOR, and
only for modes that fit in one register.
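
As a rough illustration of the rule just described, here is a
stand-alone toy model of the CONST_INT case.  It is not the real
cris_rtx_costs: the toy_* names are made up, and the word size
of 4 bytes is an assumption.

```c
/* Toy model of the quick-immediate cost rule; not the real code.
   All toy_* / TOY_* names are hypothetical.  */
enum toy_code { TOY_SET, TOY_PLUS, TOY_COMPARE, TOY_MINUS, TOY_ASHIFT,
                TOY_LSHIFTRT, TOY_ASHIFTRT, TOY_AND, TOY_IOR };

#define TOY_UNITS_PER_WORD 4   /* Assumed: 32-bit registers.  */

/* Cost of constant VAL used as an operand of OUTER_CODE in a mode
   that is MODE_SIZE bytes wide.  Returns -1 for values outside the
   -32..31 range, which this sketch does not model.  */
static int
toy_quick_const_cost (long val, enum toy_code outer_code, int mode_size)
{
  if (val == 0)
    return 0;
  if (val >= 32 || val < -32)
    return -1;

  switch (outer_code)
    {
    case TOY_MINUS: case TOY_ASHIFT: case TOY_LSHIFTRT:
    case TOY_ASHIFTRT: case TOY_AND: case TOY_IOR:
      if (mode_size <= TOY_UNITS_PER_WORD)
        return 0;   /* Same cost as a register operand.  */
      /* FALL THROUGH.  */
    default:
      return 1;     /* Unchanged from before the patch.  */
    }
}
```

In this model a shift count of 1 in a 4-byte mode costs 0, while
the same constant under PLUS, or in an 8-byte mode, still costs 1.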

*) Examples of "other factors":

- A bad default implementation of insn_cost, or rather
pattern_cost, which looks only at set_src_cost and
furthermore treats a cost of 0 as invalid.  (Compare to
the more sane set_rtx_cost.)  This naturally tripped up
combine and ifcvt, causing all sorts of changes, good and
bad.

- Having the same cost for comparing a register with 0 as
with -31..31 means a compare insn of an eliminable form no
longer looks preferable.
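
The first factor can be sketched with a toy model.  This assumes
that a source cost of 0 is replaced by the cost of one insn, and
that one insn is scaled to 4 cost units; both are assumptions of
this sketch, and the toy_* names are hypothetical.

```c
/* Toy model of a pattern cost that treats 0 as "unknown".
   Names and the scale factor are assumptions, not GCC's code.  */
#define TOY_COSTS_N_INSNS(n) ((n) * 4)

/* SRC_COST stands in for the set_src_cost of the insn's source.  */
static int
toy_pattern_cost (int src_cost)
{
  return src_cost > 0 ? src_cost : TOY_COSTS_N_INSNS (1);
}
```

Under these assumptions a source costing 0 ends up looking more
expensive (4) than one costing 1, which is the kind of inversion
that would disturb passes comparing such costs.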

        * config/cris/cris.cc (cris_rtx_costs) [CONST_INT]: Return 0
        for many quick operands, for register-sized modes.
---
 gcc/config/cris/cris.cc | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/gcc/config/cris/cris.cc b/gcc/config/cris/cris.cc
index 641e7ea25fb1..05dead9c0778 100644
--- a/gcc/config/cris/cris.cc
+++ b/gcc/config/cris/cris.cc
@@ -1884,7 +1884,28 @@ cris_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno,
        if (val == 0)
          *total = 0;
        else if (val < 32 && val >= -32)
-         *total = 1;
+         switch (outer_code)
+           {
+             /* For modes that fit in one register we say they cost
+                the same as with register operands.  DImode operations
+                need careful consideration for more basic reasons:
+                shifting by a non-word-size amount needs more
+                operations than an addition by a register pair.
+                Deliberately excluding SET, PLUS and comparisons and
+                also not including the full -64..63 range for (PLUS
+                and) MINUS.  */
+           case MINUS: case ASHIFT: case LSHIFTRT:
+           case ASHIFTRT: case AND: case IOR:
+             if (GET_MODE_SIZE (mode) <= UNITS_PER_WORD)
+               {
+                 *total = 0;
+                 break;
+               }
+             /* FALL THROUGH.  */
+           default:
+             *total = 1;
+             break;
+           }
        /* Eight or 16 bits are a word and cycle more expensive.  */
        else if (val <= 32767 && val >= -32768)
          *total = 2;
-- 
2.30.2
