[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2008-04-09 Thread bergner at gcc dot gnu dot org


--- Comment #53 from bergner at gcc dot gnu dot org  2008-04-09 15:38 
---
Author: bergner
Date: Wed Apr  9 13:42:43 2008
New Revision: 134139

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=134139
Log:
PR middle-end/PR28690
* explow.c (break_out_memory_refs): Use simplify_gen_binary rather
than gen_rtx_fmt_ee to perform more canonicalizations.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/explow.c


-- 

bergner at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|REOPENED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2008-04-08 Thread bonzini at gnu dot org


--- Comment #52 from bonzini at gnu dot org  2008-04-08 19:07 ---
Subject: Re:  [4.2 Regression] Performace problem with
 indexed load/stores on powerpc


> Index: explow.c
> ===
> --- explow.c(revision 134095)
> +++ explow.c(working copy)
> @@ -305,7 +305,7 @@ break_out_memory_refs (rtx x)
>rtx op1 = break_out_memory_refs (XEXP (x, 1));
> 
>if (op0 != XEXP (x, 0) || op1 != XEXP (x, 1))
> -   x = gen_rtx_fmt_ee (GET_CODE (x), Pmode, op0, op1);
> +   x = simplify_gen_binary (GET_CODE (x), Pmode, op0, op1);
>  }

Definitely a good idea.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2008-04-08 Thread bergner at gcc dot gnu dot org


--- Comment #51 from bergner at gcc dot gnu dot org  2008-04-08 18:50 
---
Ok, I dug into this a little deeper.  For the following test case:

  int array[1024];
  void
  clear_table (unsigned int n)
  {
unsigned int i;
for (i = 0; i < n; i++)
  array[i] = 0;
  }

compiling this with -O1 (it's ok with -O2 or above) on powerpc{,64}-linux,
during expand, we call swap_commutative_operands_p with a SYMBOL_REF and a REG
which currently prefers the REG first.  Later, break_out_memory_refs forces the
SYMBOL_REF into a register (with the REG_POINTER attribute set), but we're
already done swapping, so we get the wrong operand ordering.  Paolo, I wonder
if this patch instead of the rtlanal.c hunk might be better.  It does fix my
problem:

Index: explow.c
===
--- explow.c(revision 134095)
+++ explow.c(working copy)
@@ -305,7 +305,7 @@ break_out_memory_refs (rtx x)
   rtx op1 = break_out_memory_refs (XEXP (x, 1));

   if (op0 != XEXP (x, 0) || op1 != XEXP (x, 1))
-   x = gen_rtx_fmt_ee (GET_CODE (x), Pmode, op0, op1);
+   x = simplify_gen_binary (GET_CODE (x), Pmode, op0, op1);
 }

   return x;


-- 

bergner at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2008-04-08 Thread bonzini at gnu dot org


--- Comment #50 from bonzini at gnu dot org  2008-04-08 15:00 ---
I guess that you had modified the precedences in order to allow additional
simplifications.  Can you report here what is missed using the current values?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2008-04-08 Thread bergner at gcc dot gnu dot org


--- Comment #49 from bergner at gcc dot gnu dot org  2008-04-08 14:49 
---
The offending hunk has been reverted in revision 134095.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2008-04-07 Thread ubizjak at gmail dot com


--- Comment #48 from ubizjak at gmail dot com  2008-04-08 06:43 ---
(In reply to comment #47)

>* rtlanal.c: Update copyright years.
>(commutative_operand_precedence): Give SYMBOL_REF's the same precedence

This change causes regression in i686-pc-linux-gnu testsuite:

FAIL: gcc.target/i386/addr-sel-1.c scan-assembler a\\+1
FAIL: gcc.target/i386/addr-sel-1.c scan-assembler b\\+1

Tracked in PR 35867.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2008-04-07 Thread ubizjak at gmail dot com


--- Comment #47 from ubizjak at gmail dot com  2008-04-08 06:39 ---
Author: bergner
Date: Mon Apr  7 17:36:59 2008
New Revision: 133985

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=133985
Log:
PR middle-end/PR28690
* rtlanal.c: Update copyright years.
(commutative_operand_precedence): Give SYMBOL_REF's the same precedence
as REG_POINTER and MEM_POINTER operands.
* emit-rtl.c (gen_reg_rtx_and_attrs): New function.
(set_reg_attrs_from_value): Call mark_reg_pointer as appropriate.
* rtl.h (gen_reg_rtx_and_attrs): Add prototype for new function.
* gcse.c: Update copyright years.
(pre_delete): Call gen_reg_rtx_and_attrs.
(hoist_code): Likewise.
(build_store_vectors): Likewise.
(delete_store): Likewise.
* loop-invariant.c (move_invariant_reg): Likewise.
Update copyright years.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/emit-rtl.c
trunk/gcc/gcse.c
trunk/gcc/loop-invariant.c
trunk/gcc/rtl.h
trunk/gcc/rtlanal.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2008-01-15 Thread bergner at gcc dot gnu dot org


--- Comment #46 from bergner at gcc dot gnu dot org  2008-01-16 01:51 
---
This is fixed on mainline and we're not going to backport it to 4.2, so I'm
changing the target milestone.


-- 

bergner at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED
   Target Milestone|4.2.3   |4.3.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2008-01-08 Thread steven at gcc dot gnu dot org


-- 

steven at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|WAITING |NEW
   Last reconfirmed|2006-08-11 13:29:43 |2008-01-08 15:59:22
   date||
Summary|[4.2/4.3 Regression]|[4.2 Regression] Performace
   |Performace problem with |problem with indexed
   |indexed load/stores on  |load/stores on powerpc
   |powerpc |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-10-12 Thread janis at gcc dot gnu dot org


--- Comment #23 from janis at gcc dot gnu dot org  2006-10-12 17:23 ---
Subject: Bug 28690

Author: janis
Date: Thu Oct 12 17:23:10 2006
New Revision: 117668

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=117668
Log:
PR middle-end/28690
* explow.c (force_reg): Set REG_POINTER flag according to
MEM_POINTER flag.

Modified:
branches/ibm/gcc-4_1-branch/gcc/ChangeLog
branches/ibm/gcc-4_1-branch/gcc/explow.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-10-03 Thread bonzini at gnu dot org


--- Comment #21 from bonzini at gnu dot org  2006-10-03 18:07 ---
Note that I don't oppose at all fixing the problem in
swap_commutative_operands_p.  At the very least, you have to change at the very
least simplify-rtx.c's uses of commutative_operand_precedence to use s_c_o_p
instead, but that's a minor problem.

I'm also worried of the interaction between this change to
swap_commutative_operands_p and swap_commutative_operands_with_target which
(even though I refactored it quite recently to happen very early, in expand) is
an optimization that CSE has performed for years and has big impact on x86, for
example on crafty.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-10-03 Thread dje at watson dot ibm dot com


--- Comment #22 from dje at watson dot ibm dot com  2006-10-03 18:09 ---
Subject: Re:  [4.2 Regression] Performace problem with indexed load/stores on
powerpc 

I am not suggesting that the problem has to be solved in
swap_commutative_operands, etc.  I would think that GCC should be able to
create commutative addresses where they are formed.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-10-03 Thread dje at gcc dot gnu dot org


--- Comment #20 from dje at gcc dot gnu dot org  2006-10-03 17:58 ---
Paolo, forcing all addresses through legitimize_address should not be the goal.
 The wrong ordering has performance effects, but is not an invalid address. 
While the performance effects on POWER-specific, canonicalizing addresses is a
general GCC issue.  GCC appears to want REG_POINTER first, but does not enforce
it.

I am willing to consider target-specific fixes as a last resort, but I do not
see any reason that GCC should not create and maintain a canonical address
ordering of REG_POINTER first.  Trying to correct this in the rs6000 backend is
a kludge.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-10-03 Thread bergner at vnet dot ibm dot com


--- Comment #19 from bergner at vnet dot ibm dot com  2006-10-03 15:51 
---
David has already said offline that he would reject any patch that would cause
us to view a non-REG_POINTER + REG_POINTER expression an not legitimate.  I
agree with that.

Sorry, but I'm slowly learning the machine description format.  How exactly
would adding a splitter help us?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-10-02 Thread paolo dot bonzini at lu dot unisi dot ch


--- Comment #18 from paolo dot bonzini at lu dot unisi dot ch  2006-10-03 
05:20 ---
Subject: Re:  [4.2 Regression] Performace problem with
 indexed load/stores on powerpc


> * rtlanal.c (swap_commutative_operands_p): Preference a REG_POINTER
> over a non REG_POINTER.
> * tree-ssa-address.c (gen_addr_rtx): Force a REG_POINTER to be
> the first operand of a PLUS.
This is more gentle indeed.  Be careful however as functions calling 
commutative_operand_precedence directly may have a problem with that.  
Can you try making an address illegitimate if it is non-REG_POINTER + 
REG_POINTER?  Or set up splitters to do the transformation just before 
reload?

Paolo


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-10-02 Thread bergner at vnet dot ibm dot com


--- Comment #17 from bergner at vnet dot ibm dot com  2006-10-03 03:30 
---
Created an attachment (id=12375)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12375&action=view)
Patch to swap_commutative_operands_p and gen_addr_rtx to force base pointers
into rA position of indexed load/store instructions.

We propagated the MEM_POINTER/REG_POINTER flags fine.  The problem is that the
memory reference we're handed is a REG + REG which looks legitimate to us, so
we never call LEGITIMIZE_ADDRESS, so we never have a chance to swap the
operands.

Since we cannot fixup the latest test case in LEGITIMIZE_ADDRESS, I've decided
to attempt another swap_commutative_operands_p() /
commutative_operand_precedence() change.  However, I'm a little more selective
on when we change swap_commutative_operands_p()'s return value.  With this
patch, I'm able to transform each of the test cases so that the base address if
the first operand of the indexed load.

* rtlanal.c (swap_commutative_operands_p): Preference a REG_POINTER
over a non REG_POINTER.
* tree-ssa-address.c (gen_addr_rtx): Force a REG_POINTER to be
the first operand of a PLUS.


-- 

bergner at vnet dot ibm dot com changed:

   What|Removed |Added

  Attachment #12305|0   |1
is obsolete||
  Attachment #12306|0   |1
is obsolete||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-22 Thread sabre at nondot dot org


--- Comment #16 from sabre at nondot dot org  2006-09-22 17:27 ---
Out of curiosity, which powerpc processors are affected by this?

-Chris


-- 

sabre at nondot dot org changed:

   What|Removed |Added

 CC||sabre at nondot dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-22 Thread pinskia at physics dot uc dot edu


--- Comment #15 from pinskia at physics dot uc dot edu  2006-09-22 17:09 
---
Subject: Re:  [4.2 Regression] Performace problem
with indexed load/stores on powerpc

On Fri, 2006-09-22 at 17:05 +, pinskia at gcc dot gnu dot org wrote:
> 
> --- Comment #14 from pinskia at gcc dot gnu dot org  2006-09-22 17:05 
> ---
> (In reply to comment #13)
> > Yet another test case from Anton we don't catch.  Will they never end?!?! ;)
> I bet a beer or a shot of vodka, that this is caused by MEM_REF not expanding
> with a REG_POINTER.

And I lost because we have:
;; sum = sum + MEM[base: base, index: (int *) i * 4B]
(insn 29 27 30 (set (reg:SI 128)
(ashift:SI (reg/v:SI 123 [ i ])
(const_int 2 [0x2]))) -1 (nil)
(nil))

(insn 30 29 31 (set (reg:SI 129)
(mem:SI (plus:SI (reg:SI 128)
(reg/v/f:SI 125 [ base ])) [0 S4 A32])) -1 (nil)
(nil))

(insn 31 30 0 (set (reg/v:SI 122 [ sum ])
(plus:SI (reg/v:SI 122 [ sum ])
(reg:SI 129))) -1 (nil)
(nil))



-- Pinski


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



Re: [Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-22 Thread Andrew Pinski
On Fri, 2006-09-22 at 17:05 +, pinskia at gcc dot gnu dot org wrote:
> 
> --- Comment #14 from pinskia at gcc dot gnu dot org  2006-09-22 17:05 
> ---
> (In reply to comment #13)
> > Yet another test case from Anton we don't catch.  Will they never end?!?! ;)
> I bet a beer or a shot of vodka, that this is caused by MEM_REF not expanding
> with a REG_POINTER.

And I lost because we have:
;; sum = sum + MEM[base: base, index: (int *) i * 4B]
(insn 29 27 30 (set (reg:SI 128)
(ashift:SI (reg/v:SI 123 [ i ])
(const_int 2 [0x2]))) -1 (nil)
(nil))

(insn 30 29 31 (set (reg:SI 129)
(mem:SI (plus:SI (reg:SI 128)
(reg/v/f:SI 125 [ base ])) [0 S4 A32])) -1 (nil)
(nil))

(insn 31 30 0 (set (reg/v:SI 122 [ sum ])
(plus:SI (reg/v:SI 122 [ sum ])
(reg:SI 129))) -1 (nil)
(nil))



-- Pinski



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-22 Thread pinskia at gcc dot gnu dot org


--- Comment #14 from pinskia at gcc dot gnu dot org  2006-09-22 17:05 
---
(In reply to comment #13)
> Yet another test case from Anton we don't catch.  Will they never end?!?! ;)
I bet a beer or a shot of vodka, that this is caused by MEM_REF not expanding
with a REG_POINTER.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-22 Thread bergner at vnet dot ibm dot com


--- Comment #13 from bergner at vnet dot ibm dot com  2006-09-22 16:56 
---
Yet another test case from Anton we don't catch.  Will they never end?!?! ;)

int indexedload(int *base, int len)
{
  int i, sum = 0;
  for (i=0; i < len; i++)
sum += base[i];
  return sum;
}

In this case, LEGITIMIZE_ADDRESS cannot help, because it is never passed an
operand that includes the base pointer.  Instead, we're passed a pseudo
register that was set previously to calculation using the base pointer, so in
this case, we can't propagate the REG_POINTER flag.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-22 Thread bergner at vnet dot ibm dot com


--- Comment #12 from bergner at vnet dot ibm dot com  2006-09-22 16:30 
---
Anton dicovered that we don't get multiple dimensioned arrays like the
following test case:

int indexedload(int ***base, int idx0, int idx1, int idx2)
{
  return base[idx0][idx1][idx2];
}

This one leads to 3 indexed loads.  We transform the first indexed load ok, but
the other two we don't.  I tracked that down to force_reg (called from
break_out_memory_refs) doesn't propagate the MEM_POINTER flag to a REG_POINTER
flag on the reg it creates.  I posted/commited a fix which was approved:

http://gcc.gnu.org/ml/gcc-patches/2006-09/msg00941.html

We now successfully transform all of the indexed loads in this test case now.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-21 Thread bergner at vnet dot ibm dot com


--- Comment #11 from bergner at vnet dot ibm dot com  2006-09-21 18:19 
---
Created an attachment (id=12306)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12306&action=view)
Alternate patch to rs6000_legitimize_address to force base pointers into rA
position of indexed load/store instructions.

Same as the other patch, except we don't call force_reg() on constants.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-21 Thread bergner at vnet dot ibm dot com


--- Comment #10 from bergner at vnet dot ibm dot com  2006-09-21 18:16 
---
(From update of attachment 12190)
Forgot to obsolete this patch by the updated patch.


-- 

bergner at vnet dot ibm dot com changed:

   What|Removed |Added

  Attachment #12190|0   |1
is obsolete||


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-21 Thread bergner at vnet dot ibm dot com


--- Comment #9 from bergner at vnet dot ibm dot com  2006-09-21 18:14 
---
Created an attachment (id=12305)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12305&action=view)
Patch to rs6000_legitimize_address to force base pointers into rA position of
indexed load/store instructions.

It seems -msoft-float doesn't like the operand swapping, so this patch disables
it when we specify -msoft-float on the compile command.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-06 Thread bergner at vnet dot ibm dot com


--- Comment #8 from bergner at vnet dot ibm dot com  2006-09-07 05:14 
---
Ok, this also passed regression tests on powerpc64-linux (32-bit and 64-bit
testsuite runs) for c, c++, fortran, objc, obj-c++ and java.

Does the attached patch look reasonable to everyone?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-05 Thread bergner at vnet dot ibm dot com


--- Comment #7 from bergner at vnet dot ibm dot com  2006-09-05 20:01 
---
Well, to get REG_POINTER regs to be the first operand, we'd need to increase
their commutative_operand_precedence.  I tried that change already, but that
led to an infinite recursion loop while attempting to simplify the rtl. As you
said, the current ordering seems to be relied upon by the code for
simplifications.

My rs6000_legitimize_address change is still running through the testsuite.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-05 Thread bonzini at gnu dot org


--- Comment #6 from bonzini at gnu dot org  2006-09-05 19:25 ---
To clarify, I make this suggestion because I think that we were getting it
right pre-4.2 just out of luck.

I also thought about having a lower commutative_operand_precedence for
REG_POINTER regs than normal regs, but in fact regs need to have the same
precedence as other rtx's of class RTX_OBJ, or you get pessimization on x86
(e.g. on crafty).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-05 Thread bergner at vnet dot ibm dot com


--- Comment #5 from bergner at vnet dot ibm dot com  2006-09-05 18:43 
---
Created an attachment (id=12190)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12190&action=view)
Patch to rs6000_legitimize_address to force base pointers into rA position of
indexed load/store instructions.

Ok, taking Paolo's suggestion of moving the change into
rs6000_legititmize_address, I'm trying the attached patch which bootstraps fine
and fixes the base pointer order for us.  I'm running the testsuite now and
will report back when it's done.

* config/rs6000/rs6000.c (rs6000_legitimize_address): For performance
reasons, force PLUS register operands that are base pointers to be the
first operand.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-03 Thread bonzini at gnu dot org


--- Comment #4 from bonzini at gnu dot org  2006-09-03 13:51 ---
> which ends up being "-1 < 4", so we swap the operands.  For powerpc, we'd
> prefer the base pointer remain the first operand for performance reasons. I'd
> like other people familar with this code to comment on how we can fix this. 
> One could simply bump up the priority of base pointers (ie, "case RTX_OBJ:"),
> but I personally don't know how that would affect other platforms.

Very much.  The canonical form enforced by swap_commutative_operands_p is
relied upon by all the code for simplifications, that expects for example a
(plus (mult A B) C) and not a (plus C (mult A B)).

If one took care to fix all of them, it could work, but it's no easy feat. :-(

I think the best solution (if it works) is to put this transformation in
rs6000's legitimize_address. Given a (plus (mult A B) C), force the mult into a
pseudo (let's call it D) and then return (plus C D) with the operands swapped.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-09-01 Thread mmitchel at gcc dot gnu dot org


-- 

mmitchel at gcc dot gnu dot org changed:

   What|Removed |Added

   Priority|P3  |P2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-08-25 Thread bergner at vnet dot ibm dot com


--- Comment #3 from bergner at vnet dot ibm dot com  2006-08-26 04:24 
---
Ok, I tracked down where the expander is swapping the operands. It's occuring
at simplify-rtx.c:simplify_binary_operation() at line 1459:

  /* Make sure the constant is second.  */
  if (GET_RTX_CLASS (code) == RTX_COMM_ARITH
  && swap_commutative_operands_p (op0, op1))
{
  tem = op0, op0 = op1, op1 = tem;
}

In this particular case, op0 = (reg/v/f:SI 120 [ base ]) and
op1 = (mult:SI (reg/v:SI 121 [ offset ])
(const_int 4 [0x4]))

[src being: int indexedload (int *base, int offset) { return base[offset]; }]

swap_commutative_operands_p(op0,op1) simply returns:
commutative_operand_precedence (op0) < commutative_operand_precedence (op1),
which ends up being "-1 < 4", so we swap the operands.  For powerpc, we'd
prefer the base pointer remain the first operand for performance reasons. I'd
like other people familar with this code to comment on how we can fix this. 
One could simply bump up the priority of base pointers (ie, "case RTX_OBJ:"),
but I personally don't know how that would affect other platforms.


-- 

bergner at vnet dot ibm dot com changed:

   What|Removed |Added

 CC||bonzini at gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-08-25 Thread pinskia at gcc dot gnu dot org


-- 

pinskia at gcc dot gnu dot org changed:

   What|Removed |Added

 CC||pinskia at gcc dot gnu dot
   ||org
   Target Milestone|--- |4.2.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-08-11 Thread dberlin at dberlin dot org


--- Comment #2 from dberlin at gcc dot gnu dot org  2006-08-11 14:33 ---
Subject: Re:  [4.2 Regression] Performace problem with
 indexed load/stores on powerpc


Here is the reassoc patch that puts them in the right order at the tree
level.

Index: tree-ssa-reassoc.c
===
--- tree-ssa-reassoc.c  (revision 115962)
+++ tree-ssa-reassoc.c  (working copy)
@@ -356,6 +356,13 @@ sort_by_operand_rank (const void *pa, co
   && TREE_CODE (oeb->op) == SSA_NAME)
 return SSA_NAME_VERSION (oeb->op) - SSA_NAME_VERSION (oea->op);

+  /* For pointers, most things want the *base* pointer to go first to
+ try indexed loads. The base pointer is the one with the *lesser*
+ rank.  For everything else, put them in order from greatest rank
+ to least.  */
+  if (POINTER_TYPE_P (TREE_TYPE (oea->op)))
+return oea->rank - oeb->rank;
+
   return oeb->rank - oea->rank;
 }

@@ -1309,7 +1316,9 @@ reassociate_bb (basic_block bb)
   if (TREE_CODE (stmt) == MODIFY_EXPR)
{
  tree lhs = TREE_OPERAND (stmt, 0);
+ tree lhst = TREE_TYPE (lhs);
  tree rhs = TREE_OPERAND (stmt, 1);
+ tree rhst = TREE_TYPE (rhs);

  /* If this was part of an already processed tree, we don't
 need to touch it again. */
@@ -1318,10 +1327,10 @@ reassociate_bb (basic_block bb)

  /* If unsafe math optimizations we can do reassociation for
 non-integral types.  */
- if ((!INTEGRAL_TYPE_P (TREE_TYPE (lhs))
-  || !INTEGRAL_TYPE_P (TREE_TYPE (rhs)))
- && (!SCALAR_FLOAT_TYPE_P (TREE_TYPE (rhs))
- || !SCALAR_FLOAT_TYPE_P (TREE_TYPE(lhs))
+ if (((!INTEGRAL_TYPE_P (lhst) & !POINTER_TYPE_P (lhst))
+  || (!INTEGRAL_TYPE_P (rhst) && !POINTER_TYPE_P (rhst)))
+ && (!SCALAR_FLOAT_TYPE_P (rhst)
+ || !SCALAR_FLOAT_TYPE_P (lhst)
  || !flag_unsafe_math_optimizations))
continue;



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690



[Bug middle-end/28690] [4.2 Regression] Performace problem with indexed load/stores on powerpc

2006-08-11 Thread dje at gcc dot gnu dot org


--- Comment #1 from dje at gcc dot gnu dot org  2006-08-11 13:29 ---
Confirmed


-- 

dje at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever Confirmed|0   |1
   Last reconfirmed|-00-00 00:00:00 |2006-08-11 13:29:43
   date||
Summary|Performace problem with |[4.2 Regression] Performace
   |indexed load/stores on  |problem with indexed
   |powerpc |load/stores on powerpc


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28690