On 06/19/14 03:44, Zhenqiang Chen wrote:

ChangeLog:
2014-06-17  Zhenqiang Chen  <zhenqiang.c...@linaro.org>

         * cprop.c (try_replace_reg): Check cost for constants.

diff --git a/gcc/cprop.c b/gcc/cprop.c
index aef3ee8..c9cf02a 100644
--- a/gcc/cprop.c
+++ b/gcc/cprop.c
@@ -733,6 +733,14 @@ try_replace_reg (rtx from, rtx to, rtx insn)
    rtx src = 0;
    int success = 0;
    rtx set = single_set (insn);
+  int old_cost = 0;
+  bool copy_p = false;
+  bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn));
+
+  if (set && SET_SRC (set) && REG_P (SET_SRC (set)))
+    copy_p = true;
+  else
+    old_cost = set_rtx_cost (set, speed);

Looks bogus for set == NULL?

set_rtx_cost has checked it. If it is NULL, the function will return 0;

Also what about register pressure?

Do you think it has big register pressure impact? I think it does not
increase register pressure.
I would expect a small impact on register pressure. In general anytime we avoid propagating a constant to its uses and instead hold the value in a register register pressure will increase.



Here is a summary for performance result on X86-64 and ARM.

For X86-64, I run SPEC2000 INT and FP (-O3). There is no improvement
or regression. As tests, I moved the code segment to end of function
try_replace_reg and check insns which meet "success && new_cost >
old_cost". Logs show only 52 occurrences for all SPEC2000 build and
the only one instruction pattern: *adddi_1 is impacted. For *adddi_1,
rtx_cost increases from 8 to 10 when changing a register operand to a
constant.

For ARM Cortex-M4, minimal changes for Coremark, Dhrystone and EEMBC.
For ARM Chrome book (Cortex-A15), some wave in SPEC2000 INT test. But
the final result does not show improvement or regression.

The patch is updated to remove the "bogus" code and keep more constants.

Bootstrap and no make check regression on X86-64, i686 and ARM.
So with no notable improvements, do you still want this patch to go forward?

You certainly need a testcase. It's fine if it's ARM specific, though obviously you get wider testing if you've got an x86_64 test.


diff --git a/gcc/cprop.c b/gcc/cprop.c
index aef3ee8..6ea6be0 100644
--- a/gcc/cprop.c
+++ b/gcc/cprop.c
@@ -733,6 +733,28 @@ try_replace_reg (rtx from, rtx to, rtx insn)
    rtx src = 0;
    int success = 0;
    rtx set = single_set (insn);
+  int old_cost = 0;
+  bool const_p = false;
+  bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn));
+
+  if (set && SET_SRC (set))
+    {
+      rtx src = SET_SRC (set);
+      if (REG_P (src) || GET_CODE (src) == SUBREG)
+        const_p = true;
+      else
+       {
+         if (note != 0
+             && REG_NOTE_KIND (note) == REG_EQUAL
+             && (GET_CODE (XEXP (note, 0)) == CONST
+                 || CONSTANT_P (XEXP (note, 0))))
+           {
+             const_p = true;
+           }
+         else
+           old_cost = set_rtx_cost (set, speed);
+       }
+    }
Can you come up with a better name than "const_p". I see that and my first though is "that variable indicates if some object is a constant". But that's not how its used here.



    /* Usually we substitute easy stuff, so we won't copy everything.
       We however need to take care to not duplicate non-trivial CONST
@@ -740,6 +762,20 @@ try_replace_reg (rtx from, rtx to, rtx insn)
    to = copy_rtx (to);

    validate_replace_src_group (from, to, insn);
+
+  /* For CONSTANT_P (TO), loop2_invariant pass might hoist it out the loop.
+     And it can be shared by different references.  So skip propagation if
+     it makes INSN's rtx cost higher.  */
+  if (set && SET_SRC (set) && !const_p && CONSTANT_P (to))
+    {
+      if (!CONSTANT_P (SET_SRC (set))
+         && (set_rtx_cost (set, speed) > old_cost))
+       {
+         cancel_changes (0);
+         return false;
+       }
+    }
+
    if (num_changes_pending () && apply_change_group ())
      success = 1;
I guess this needs to be after replacement so that you can easily compute the new cost. Another case where our costing API is lame :( I'm not going to have you fix that.

Why the !CONSTANT_P check? Why not just let the rtx costing alone determine if we want to avoid this propagation?

jeff

Reply via email to