The following patch fixes PR59466
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59466
LRA on PPC sometimes generates inefficient code such as
addis 9,2,.LC77@toc@ha
addi 9,9,.LC77@toc@l
ld 9,0(9)
instead of
addis 9,2,.LC77@toc@ha
ld 9,.LC77@toc@l(9)
I could not create a small test for this.  The smallest file where I
found it is bzip2.c from SPEC2000.
LRA generates a move insn with an invalid memory operand
[unspec[`*.LC29',%2:DI] 45], but it can handle it (make it valid) very
efficiently afterwards by trying numerous different transformations.
The PPC target machinery, through validize_mem, instead just puts the
whole address into a pseudo.  I could prevent the use of validize_mem
in rs6000.c, but I prefer to do it in change_address_1, as other
targets might have the same problem and it is better to have one
solution for all of them.
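To illustrate the intent, here is a minimal standalone sketch in plain
C (not GCC source; address_is_valid and copy_address_to_pseudo are
hypothetical stand-ins for memory_address_addr_space_p and the
legitimization that validize_mem ends up doing):

#include <stdbool.h>

static bool lra_in_progress;   /* models GCC's lra_in_progress flag */

static bool address_is_valid (const void *addr) { (void) addr; return false; }
static void *copy_address_to_pseudo (void *addr) { return addr; }

static void *
model_change_address (void *addr, bool validate)
{
  /* Before the patch, an invalid address was forced into a pseudo right
     away, hiding the @toc@ha/@toc@l structure from LRA.  With the patch,
     the address is left alone while LRA is in progress, so LRA can reload
     only the high part and fold the low part into the memory access.  */
  if (validate && !lra_in_progress && !address_is_valid (addr))
    addr = copy_address_to_pseudo (addr);
  return addr;
}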
Still, this does not fully solve the problem, as the insn
r257:DI=[unspec[`*.LC29',%2:DI] 45] can't be recognized: the
*movdi... pattern has operand predicates that reject the memory because
of its invalid address.  To fix this, a change in general_operand is
made.  As LRA cannot do anything with a regular insn that fails
recognition, I added an assert for this in lra_set_insn_recog_data to
catch such a situation earlier.
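As a minimal standalone model of that assert (not GCC source; the enum
and struct below are hypothetical stand-ins for GCC's insn
representation), the condition is: an insn whose pattern is not
recognized and which is not an asm must be one of the special codes,
because LRA can do nothing useful with an unrecognizable regular insn.

#include <assert.h>

enum pattern_code_model { CODE_USE, CODE_CLOBBER, CODE_ASM_INPUT, CODE_OTHER };

struct insn_model
{
  int icode;             /* -1 when no insn pattern was recognized */
  int asm_noperands;     /* negative when the insn is not an asm */
  enum pattern_code_model pattern_code;
};

static void
model_set_insn_recog_data (const struct insn_model *insn)
{
  /* Catch unrecognizable regular insns as early as possible.  */
  if (insn->icode < 0 && insn->asm_noperands < 0)
    assert (insn->pattern_code == CODE_USE
            || insn->pattern_code == CODE_CLOBBER
            || insn->pattern_code == CODE_ASM_INPUT);
}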
Again, LRA has very good code for legitimizing addresses by itself, and
it is better to use it.
After applying the patch I see a code size reduction on SPEC2000.
Before the patch (code sizes are given relative to reload-generated code):
----------------CFP2000-----------------
-0.471% 27171 27043 168.wupwise
-0.457% 14006 13942 171.swim
-0.392% 24515 24419 172.mgrid
0.226% 85079 85271 173.applu
0.751% 728891 734363 177.mesa
0.194% 214357 214773 178.galgel
-0.295% 21683 21619 179.art
0.412% 31089 31217 183.equake
-0.520% 79976 79560 187.facerec
0.000% 152504 152504 188.ammp
0.000% 43758 43758 189.lucas
-0.181% 1062265 1060337 191.fma3d
1.035% 1041684 1052468 200.sixtrack
-0.105% 151944 151784 301.apsi
Average = 0.0139775%
----------------CINT2000-----------------
0.252% 76242 76434 164.gzip
0.172% 186152 186472 175.vpr
-0.215% 2084612 2080132 176.gcc
0.000% 16716 16716 181.mcf
0.085% 225316 225508 186.crafty
-0.015% 210100 210068 197.parser
0.622% 433635 436332 252.eon
-0.298% 762014 759742 253.perlbmk
0.000% 904784 904784 254.gap
-0.285% 706432 704416 255.vortex
0.220% 58297 58425 256.bzip2
0.314% 265334 266166 300.twolf
Average = 0.070863%
After the patch:
----------------CFP2000-----------------
-0.589% 27171 27011 168.wupwise
-0.457% 14006 13942 171.swim
-0.392% 24515 24419 172.mgrid
-0.113% 85079 84983 173.applu
0.654% 728891 733659 177.mesa
0.060% 214357 214485 178.galgel
-0.295% 21683 21619 179.art
-0.412% 31089 30961 183.equake
-0.520% 79976 79560 187.facerec
0.000% 152504 152504 188.ammp
0.000% 43758 43758 189.lucas
-0.317% 1062265 1058897 191.fma3d
0.356% 1041684 1045396 200.sixtrack
-0.105% 151944 151784 301.apsi
Average = -0.152103%
----------------CINT2000-----------------
0.084% 76242 76306 164.gzip
-0.052% 186152 186056 175.vpr
-0.284% 2084612 2078692 176.gcc
0.000% 16716 16716 181.mcf
-0.312% 225316 224612 186.crafty
-0.091% 210100 209908 197.parser
0.622% 433635 436332 252.eon
-0.340% 762014 759422 253.perlbmk
0.000% 904784 904784 254.gap
-0.335% 706432 704064 255.vortex
0.110% 58297 58361 256.bzip2
-0.241% 265334 264694 300.twolf
Average = -0.070023%
Code size reduction for PPC in this case also means faster generated
code.  I see it, but I cannot provide a reliable SPEC2000 rate change.
The patch was successfully bootstrapped and tested on i686, x86_64,
and PPC64.
Ok to commit?
2013-12-11 Vladimir Makarov <vmaka...@redhat.com>
PR rtl-optimization/59466
* emit-rtl.c (change_address_1): Don't validate address for LRA.
* recog.c (general_operand): Accept any memory for LRA.
* lra.c (lra_set_insn_recog_data): Add an assert.
Index: emit-rtl.c
===================================================================
--- emit-rtl.c (revision 205870)
+++ emit-rtl.c (working copy)
@@ -1951,7 +1951,9 @@ change_address_1 (rtx memref, enum machi
&& (!validate || memory_address_addr_space_p (mode, addr, as)))
return memref;
- if (validate)
+ /* Don't validate the address for LRA.  LRA can make the address
+ valid by itself in the most efficient way.  */
+ if (validate && !lra_in_progress)
{
if (reload_in_progress || reload_completed)
gcc_assert (memory_address_addr_space_p (mode, addr, as));
Index: lra.c
===================================================================
--- lra.c (revision 205870)
+++ lra.c (working copy)
@@ -1072,9 +1072,16 @@ lra_set_insn_recog_data (rtx insn)
nop = asm_noperands (PATTERN (insn));
data->operand_loc = data->dup_loc = NULL;
if (nop < 0)
- /* Its is a special insn like USE or CLOBBER. */
- data->insn_static_data = insn_static_data
- = get_static_insn_data (-1, 0, 0, 1);
+ {
+ /* It is a special insn like USE or CLOBBER.  We should
+ recognize any regular insn, as otherwise LRA can do nothing
+ with this insn.  */
+ gcc_assert (GET_CODE (PATTERN (insn)) == USE
+ || GET_CODE (PATTERN (insn)) == CLOBBER
+ || GET_CODE (PATTERN (insn)) == ASM_INPUT);
+ data->insn_static_data = insn_static_data
+ = get_static_insn_data (-1, 0, 0, 1);
+ }
else
{
/* expand_asm_operands makes sure there aren't too many
Index: recog.c
===================================================================
--- recog.c (revision 205870)
+++ recog.c (working copy)
@@ -1021,8 +1021,12 @@ general_operand (rtx op, enum machine_mo
if (! volatile_ok && MEM_VOLATILE_P (op))
return 0;
- /* Use the mem's mode, since it will be reloaded thus. */
- if (memory_address_addr_space_p (GET_MODE (op), y, MEM_ADDR_SPACE (op)))
+ /* Use the mem's mode, since it will be reloaded thus.  LRA can
+ generate move insns with invalid addresses, which are made
+ valid and efficiently calculated by LRA through numerous
+ further transformations.  */
+ if (lra_in_progress
+ || memory_address_addr_space_p (GET_MODE (op), y, MEM_ADDR_SPACE (op)))
return 1;
}