Re: PR53914, rs6000 constraints and reload queries

2012-10-29 Thread David Edelsohn
On Wed, Aug 1, 2012 at 9:52 AM, Alan Modra amo...@gmail.com wrote:
 Hi David,
 This is the last of my vendetta against the o constraint in the
 rs6000 backend.  Remaining are a few places in rs6000.md where the
 operand predicate is offsettable_mem_operand, and one instance in
 spe.md.  I believe none of these can run into mode_dependent_address
 causing insn does not statisfy its constraints.  Also, with Y
 replacing o everywhere a 64-bit gpr is loaded or stored, we can
 remove the +0 wrapper you added for pr13674.

 I also add some testcases with this patch.  dfmode_off and tfmode_off
 will fail on e500, but the generated code at -O2 is so awful that I'm
 inclined to think it should remain that way until someone fixes it!

 Bootstrapped and regression tested powerpc64-linux.  OK to apply?

 gcc/
 * config/rs6000/rs6000.c (legitimize_reload_address): Remove code
 handling non-aligned ld/std.
 * config/rs6000/paired.md (movv2sf_paired): Use 'Y' instead of 'o'.
 * config/rs6000/vsx.md (vsx_mov, vsx_movti): Likewise.
 * config/rs6000/altivec.md (altivec_mov, altivec_movti): Likewise.
 * config/rs6000/dfp.md (movtd_internal): Use 'm' instead of 'o'.
 gcc/testsuite/
 * gcc.target/powerpc/dimode_off.c: New.
 * gcc.target/powerpc/timode_off.c: New.
 * gcc.target/powerpc/dfmode_off.c: New.
 * gcc.target/powerpc/tfmode_off.c: New.

This is okay.

Thanks, David


Re: PR53914, rs6000 constraints and reload queries

2012-08-01 Thread Alan Modra
Hi David,
This is the last of my vendetta against the o constraint in the
rs6000 backend.  Remaining are a few places in rs6000.md where the
operand predicate is offsettable_mem_operand, and one instance in
spe.md.  I believe none of these can run into mode_dependent_address
causing insn does not statisfy its constraints.  Also, with Y
replacing o everywhere a 64-bit gpr is loaded or stored, we can
remove the +0 wrapper you added for pr13674.

I also add some testcases with this patch.  dfmode_off and tfmode_off
will fail on e500, but the generated code at -O2 is so awful that I'm
inclined to think it should remain that way until someone fixes it!

Bootstrapped and regression tested powerpc64-linux.  OK to apply?

gcc/
* config/rs6000/rs6000.c (legitimize_reload_address): Remove code
handling non-aligned ld/std.
* config/rs6000/paired.md (movv2sf_paired): Use 'Y' instead of 'o'.
* config/rs6000/vsx.md (vsx_mov, vsx_movti): Likewise.
* config/rs6000/altivec.md (altivec_mov, altivec_movti): Likewise.
* config/rs6000/dfp.md (movtd_internal): Use 'm' instead of 'o'.
gcc/testsuite/
* gcc.target/powerpc/dimode_off.c: New.
* gcc.target/powerpc/timode_off.c: New.
* gcc.target/powerpc/dfmode_off.c: New.
* gcc.target/powerpc/tfmode_off.c: New.

Index: gcc/testsuite/gcc.target/powerpc/dimode_off.c
===
--- gcc/testsuite/gcc.target/powerpc/dimode_off.c   (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/dimode_off.c   (revision 0)
@@ -0,0 +1,50 @@
+/* { dg-do assemble } */
+/* { dg-options -O2 -fno-align-functions -mtraceback=no -save-temps } */
+
+void w1 (void *x, long long y) { *(long long *) (x + 32767) = y; }
+void w2 (void *x, long long y) { *(long long *) (x + 32766) = y; }
+void w3 (void *x, long long y) { *(long long *) (x + 32765) = y; }
+void w4 (void *x, long long y) { *(long long *) (x + 32764) = y; }
+void w5 (void *x, long long y) { *(long long *) (x + 32763) = y; }
+void w6 (void *x, long long y) { *(long long *) (x + 32762) = y; }
+void w7 (void *x, long long y) { *(long long *) (x + 32761) = y; }
+void w8 (void *x, long long y) { *(long long *) (x + 32760) = y; }
+void w9 (void *x, long long y) { *(long long *) (x + 32759) = y; }
+void w10 (void *x, long long y) { *(long long *) (x + 32758) = y; }
+void w11 (void *x, long long y) { *(long long *) (x + 32757) = y; }
+void w12 (void *x, long long y) { *(long long *) (x + 32756) = y; }
+void w13 (void *x, long long y) { *(long long *) (x + 32755) = y; }
+void w14 (void *x, long long y) { *(long long *) (x + 32754) = y; }
+void w15 (void *x, long long y) { *(long long *) (x + 32753) = y; }
+void w16 (void *x, long long y) { *(long long *) (x + 32752) = y; }
+void w17 (void *x, long long y) { *(long long *) (x + 32751) = y; }
+void w18 (void *x, long long y) { *(long long *) (x + 32750) = y; }
+void w19 (void *x, long long y) { *(long long *) (x + 32749) = y; }
+void w20 (void *x, long long y) { *(long long *) (x + 32748) = y; }
+
+long long r1 (void *x) { return *(long long *) (x + 32767); }
+long long r2 (void *x) { return *(long long *) (x + 32766); }
+long long r3 (void *x) { return *(long long *) (x + 32765); }
+long long r4 (void *x) { return *(long long *) (x + 32764); }
+long long r5 (void *x) { return *(long long *) (x + 32763); }
+long long r6 (void *x) { return *(long long *) (x + 32762); }
+long long r7 (void *x) { return *(long long *) (x + 32761); }
+long long r8 (void *x) { return *(long long *) (x + 32760); }
+long long r9 (void *x) { return *(long long *) (x + 32759); }
+long long r10 (void *x) { return *(long long *) (x + 32758); }
+long long r11 (void *x) { return *(long long *) (x + 32757); }
+long long r12 (void *x) { return *(long long *) (x + 32756); }
+long long r13 (void *x) { return *(long long *) (x + 32755); }
+long long r14 (void *x) { return *(long long *) (x + 32754); }
+long long r15 (void *x) { return *(long long *) (x + 32753); }
+long long r16 (void *x) { return *(long long *) (x + 32752); }
+long long r17 (void *x) { return *(long long *) (x + 32751); }
+long long r18 (void *x) { return *(long long *) (x + 32750); }
+long long r19 (void *x) { return *(long long *) (x + 32749); }
+long long r20 (void *x) { return *(long long *) (x + 32748); }
+
+/* { dg-final { object-size text == 440 { target { lp64 } } } } */
+/* 32-bit test should really be == 512 bytes, see pr54110 */
+/* { dg-final { object-size text = 640 { target { ilp32 } } } } */
+/* { dg-final { scan-assembler-not (st|l)fd } } */
+/* { dg-final { cleanup-saved-temps dimode_off } } */
Index: gcc/testsuite/gcc.target/powerpc/timode_off.c
===
--- gcc/testsuite/gcc.target/powerpc/timode_off.c   (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/timode_off.c   (revision 0)
@@ -0,0 +1,51 @@
+/* { dg-do assemble { target { lp64 } } } */
+/* { 

Re: PR53914, rs6000 constraints and reload queries

2012-07-24 Thread Alan Modra
On Mon, Jul 23, 2012 at 07:30:23PM -0400, David Edelsohn wrote:
 This is okay

Committed revision 189801.

PR target/53914
PR target/54009
* config/rs6000/constraints.md (Y): Use mem_operand_gpr.
* config/rs6000/predicates.md (word_offset_memref_operand): Delete.
Adjust all rs6000_legitimate_offset_address_p calls.
* config/rs6000/rs6000-protos.h (mem_operand_gpr): Declare.
(rs6000_secondary_reload_gpr): Declare.
(rs6000_legitimate_offset_address_p): Update prototype.
(rs6000_offsettable_memref_p): Delete.
(rs6000_secondary_reload_ppc64): Delete.
* config/rs6000/rs6000.c (address_offset): New function.
(mem_operand_gpr): Likewise.
(rs6000_legitimate_offset_address_p): Add worst_case param.  When
not worst_case assume class of regs with least restrictive offsets.
Adjust all calls.
(legitimate_lo_sum_address_p): Simplify register mode tests.
(rs6000_legitimize_address): Likewise.  Assume best case offset
addressing.  Combine ELF and MACHO lo_sum code.
(rs6000_mode_dependent_address): Correct offset addressing limits.
(rs6000_offsettable_memref_p): Make static, add reg_mode param.
Use reg_mode to help rs6000_legitimate_offset_address_p.
(rs6000_secondary_reload): Use address_offset.  Handle 32-bit multi
gpr load/store when offset too large.
(rs6000_secondary_reload_gpr): Renamed rs6000_secondary_reload_ppc64.
(rs6000_split_multireg_move): Adjust rs6000_offsettable_memref_p calls.
* config/rs6000/rs6000.md (movdf_hardfloat32): Use 'Y' constraint
for gpr load/store.  Order alternatives as r-Y,Y-r,r-r and
d-m,m-d,d-d.  Correct size of gpr load/store.
(movdf_softfloat32): Use 'Y' constraint for gpr load/store.  Order
alternatives.
(movti_ppc64): Likewise.
(movdi_internal32): Likewise.  Also disparage fprs.
(movdi_mfpgpr, movdi_internal64): Likewise.
(movtf_internal): Use 'm' for fpr load/store.  Order alternatives.
(movtf_softfloat): Order alternatives.
(extenddftf2_internal): Use 'm' and 'Y' for store.
(movti_power, movti_string): Use 'Y' for gpr load/store.  Order.
(stack_protect_setdi, stack_protect_testdi): Likewise.
(movdf_hardfloat64_mfpgpr, movdf_hardfloat64): Order alternatives.
(movdf_softfloat64): Likewise.
(reload_mode_store): Adjust reload_di_store to provide
reload_si_store as well.
(reload_mode_load): Likewise.

Index: gcc/config/rs6000/constraints.md
===
--- gcc/config/rs6000/constraints.md(revision 189800)
+++ gcc/config/rs6000/constraints.md(working copy)
@@ -150,8 +150,9 @@ to use @samp{m} or @samp{es} in @code{asm} stateme
(match_test GET_CODE (XEXP (op, 0)) == REG)))
 
 (define_memory_constraint Y
-  Indexed or word-aligned displacement memory operand
-  (match_operand 0 word_offset_memref_operand))
+  memory operand for 8 byte and 16 byte gpr load/store
+  (and (match_code mem)
+   (match_operand 0 mem_operand_gpr)))
 
 (define_memory_constraint Z
   Memory operand that is an indexed or indirect from a register (it is
Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 189800)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -432,29 +432,6 @@
   (and (match_operand 0 memory_operand)
(match_test offsettable_nonstrict_memref_p (op
 
-;; Return 1 if the operand is a memory operand with an address divisible by 4
-(define_predicate word_offset_memref_operand
-  (match_operand 0 memory_operand)
-{
-  /* Address inside MEM.  */
-  op = XEXP (op, 0);
-
-  /* Extract address from auto-inc/dec.  */
-  if (GET_CODE (op) == PRE_INC
-  || GET_CODE (op) == PRE_DEC)
-op = XEXP (op, 0);
-  else if (GET_CODE (op) == PRE_MODIFY)
-op = XEXP (op, 1);
-  else if (GET_CODE (op) == LO_SUM
-   GET_CODE (XEXP (op, 0)) == REG
-   GET_CODE (XEXP (op, 1)) == CONST)
-op = XEXP (XEXP (op, 1), 0);
-
-  return (GET_CODE (op) != PLUS
- || GET_CODE (XEXP (op, 1)) != CONST_INT
- || INTVAL (XEXP (op, 1)) % 4 == 0);
-})
-
 ;; Return 1 if the operand is an indexed or indirect memory operand.
 (define_predicate indexed_or_indirect_operand
   (match_code mem)
@@ -892,7 +869,8 @@
   return input_operand (op, mode);
 })
 
-;; Return true if OP is an invalid SUBREG operation on the e500.
+;; Return true if OP is a non-immediate operand and not an invalid
+;; SUBREG operation on the e500.
 (define_predicate rs6000_nonimmediate_operand
   (match_code reg,subreg,mem)
 {
@@ -1325,7 +1303,7 @@
   if (base_regno == 0)
return 0;
 }
-  else if (rs6000_legitimate_offset_address_p (SImode, src_addr, 0))
+  else if 

Re: PR53914, rs6000 constraints and reload queries

2012-07-21 Thread David Edelsohn
On Thu, Jul 19, 2012 at 8:45 PM, Alan Modra amo...@gmail.com wrote:

 This on the other hand works.  Please consider the patch amended to
 remove mem_operand_fpr, the 'wY' constraint, and uses thereof replaced
 with m.

What, exactly, is the current proposed patch? Remove offsettable
address constraint from movXX patterns but not from LO_SUM, and not
adding wY constraint?

Thanks, David


Re: PR53914, rs6000 constraints and reload queries

2012-07-19 Thread Alan Modra
On Thu, Jul 19, 2012 at 12:34:25PM +0930, Alan Modra wrote:
 and fixes pr54009.

David, in looking over this today, I realised that this bug isn't
completely fixed.  I stopped gcc emitting an offset of 32768, but that
isn't enough.  lo_sum addresses can't be offset, except when you know
the alignment of the object being addressed guarantees that no part
crosses a 32k boundary.  For example, given lis 9,x@ha; lwz 3,x@l(9);
lwz 4,x+4@l(9); we run into trouble if x happens to reside at
n*64k + 32764.  The final address matters, not just the offset in the
insn.  So I have some changes to mem_operand_gpr and
rs6000_secondary_reload.  I'll post a patch on top of the previous
one.

Another thing.  I reckon we can do without the 'wY' constraint.  I
implemented it because movtf_internal currently uses an o
constraint, but it seems to me that rs6000_legitimate_address_p
already prevents non-offsettable TFmode addresses for mems accessed by
fprs.  Geoff introduced the o on movtf here:
http://gcc.gnu.org/ml/gcc-patches/2003-12/msg00803.html
Thoughts?

-- 
Alan Modra
Australia Development Lab, IBM


Re: PR53914, rs6000 constraints and reload queries

2012-07-19 Thread Alan Modra
On Fri, Jul 20, 2012 at 12:05:28AM +0930, Alan Modra wrote:
 On Thu, Jul 19, 2012 at 12:34:25PM +0930, Alan Modra wrote:
  and fixes pr54009.
 
 David, in looking over this today, I realised that this bug isn't
 completely fixed.  I stopped gcc emitting an offset of 32768, but that
 isn't enough.  lo_sum addresses can't be offset, except when you know
 the alignment of the object being addressed guarantees that no part
 crosses a 32k boundary.  For example, given lis 9,x@ha; lwz 3,x@l(9);
 lwz 4,x+4@l(9); we run into trouble if x happens to reside at
 n*64k + 32764.  The final address matters, not just the offset in the
 insn.  So I have some changes to mem_operand_gpr and
 rs6000_secondary_reload.  I'll post a patch on top of the previous
 one.

Actually, maybe I won't.  When I disabled offsetting LO_SUMs, except
for tocrefs (which we know are aligned), I ran into an insn does not
satisfy its constraints ICE on gcc.dg/pr28795-2.  The cause: code in
rs6000_legitimze_reload_address near the comment starting /* Don't do
this for TFmode.  There we have a list of modes from which we make
assumptions about the regs used, and get it wrong for ppc32.  Reload
decided to use 2 gprs for a DFmode.

We've been doing this forever, and almost always the mem is aligned.
So fixing an esoteric testcase like the one in pr54009 would come at
the expense of normal code quality.  Unless I can reliably figure out
regs used and alignment of syms in rs6000_legitimize_reload_address,
and as far as I can tell that isn't possible.

 Another thing.  I reckon we can do without the 'wY' constraint.  I
 implemented it because movtf_internal currently uses an o
 constraint, but it seems to me that rs6000_legitimate_address_p
 already prevents non-offsettable TFmode addresses for mems accessed by
 fprs.  Geoff introduced the o on movtf here:
 http://gcc.gnu.org/ml/gcc-patches/2003-12/msg00803.html
 Thoughts?

This on the other hand works.  Please consider the patch amended to
remove mem_operand_fpr, the 'wY' constraint, and uses thereof replaced
with m.

-- 
Alan Modra
Australia Development Lab, IBM


Re: PR53914, rs6000 constraints and reload queries

2012-07-18 Thread Alan Modra
Thanks very much Uli for verifying my conclusions about reload,
operand predicates and constraints, and particularly the general
unusability of the o constraint.

Re http://gcc.gnu.org/ml/gcc/2012-07/msg00142.html, this patch adds
the missing secondary reload patterns, corrects constraints I got
wrong (?*d, not *?d), and fixes pr54009.

Uli said:
 An address involving pseudos should be
 considered legitimate if there exists an assignment of hard
 registers that makes it strictly legitimate (not if *any* such
 assignment would be strictly legitimate).  [ It might make sense
 in some cases to make the check stricter; for example if we know
 that an address would nearly always require a reload, we might
 choose to completely reject it if that actually increases performance.
 But that would be just performance tuning, not required for
 correctness ... ]
So there is quite a bit more work in rs6000.c to fully implement this.
See ??? comments that I added on code handling lo_sum, and I'll admit
to not even trying to relax rs6000_legitimate_offset_address_p
conditions for e500.  That can wait for another day.  The patch is
large enough already.

Some notes:
- word_offset_memref_operand isn't used as a predicate and as both Uli
and I noted, constraints calling predicates lead to trouble with
reload_legitimize_address output.  So move it out of predicates.md to
rs6000.c (renamed as mem_operand_gpr and without checks more suited to
predicates).
- where I changed a bunch of mode tests to GET_MODE_SIZE checks, the
original mode list missing TImode is irrelevant for 32-bit, since
TImode isn't supported on 32-bit (Why do we have 32-bit TImode insns?)
- reordering insn alternatives in some cases is cosmetic.  As the
comments say, putting r-Y and Y-r before r-r is necessary, but
reordering d-m,m-d,d-d isn't strictly necessary.  I did that for
consistency, and future proofing should the m constraint need to be
changed.  Putting r-Y before Y-r is also cosmetic but I prefer it
that way for insns that land in reload as pseudo-pseudo ie. mem-mem,
where both load and store alternatives match with reloading.  I think
it's nicer to choose input reloads rather than output reloads, so put
the store first.
- I haven't actually seen the 32-bit gpr secondary reload patterns
trigger (it's hard to make a testcase), so that code is largely
untested.  Fortunately the code is very similar to the 64-bit gpr
secondary reload code.
- movdf_hardfloat32 insn lengths looked wrong to me, so I fixed that.
gpr load and store ought to be just two insns, not four.  I also took
out the ?? kludge since the offsettable address problem is now fixed.
- I don't really like disparaging fprs in a number of DImode insns,
but without that reload prefers to reload inputs.  So you get code
like stw 10,xxx(1); stw 11,xxx+4(1); lfd 0,xxx(1); stfd 0,32764(9);
rather than addi 9,9,32764; stw 10,0(9); stw 11,4(9);  The former is
slower and requires a stack frame.

Bootstrapped and regression tested powerpc-linux.  OK to apply?

PR target/53914
PR target/54009
* config/rs6000/constraints.md (Y): Use mem_operand_gpr.
(wY): New constraint using mem_operand_fpr.
* config/rs6000/predicates.md (word_offset_memref_operand): Delete.
Adjust all rs6000_legitimate_offset_address_p calls.
* config/rs6000/rs6000-protos.h (mem_operand_gpr): Declare.
(mem_operand_fpr, rs6000_secondary_reload_gpr): Declare.
(rs6000_legitimate_offset_address_p): Update prototype.
(rs6000_offsettable_memref_p): Delete.
(rs6000_secondary_reload_ppc64): Delete.
* config/rs6000/rs6000.c (address_offset): New function.
(mem_operand_gpr, mem_operand_fpr): Likewise.
(rs6000_legitimate_offset_address_p): Add worst_case param.  When
not worst_case assume class of regs with least restrictive offsets.
Adjust all calls.
(legitimate_lo_sum_address_p): Simplify register mode tests.
(rs6000_legitimize_address): Likewise.  Assume best case offset
addressing.  Combine ELF and MACHO lo_sum code.
(rs6000_mode_dependent_address): Correct offset addressing limits.
(rs6000_offsettable_memref_p): Make static, add reg_mode param.
Use reg_mode to help rs6000_legitimate_offset_address_p.
(rs6000_secondary_reload): Use address_offset.  Handle 32-bit multi
gpr load/store when offset too large.
(rs6000_secondary_reload_gpr): Renamed rs6000_secondary_reload_ppc64.
(rs6000_split_multireg_move): Adjust rs6000_offsettable_memref_p calls.
* config/rs6000/rs6000.md (movdf_hardfloat32): Use 'Y' constraint
for gpr load/store.  Order alternatives as r-Y,Y-r,r-r and
d-m,m-d,d-d.  Correct size of gpr load/store.
(movdf_softfloat32): Use 'Y' constraint for gpr load/store.  Order
alternatives.
(movti_ppc64): Likewise.
(movdi_internal32): Likewise.  Also disparage fprs.