[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-12 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

Uros Bizjak ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x32
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.7.0

--- Comment #48 from Uros Bizjak ubizjak at gmail dot com 2011-08-12 06:31:31 
UTC ---
The remaining LEAs are due to:
- a CSE'd partial address; these can be distributed into their uses, based on
some cost model, etc.
- a reloaded non-offsettable address, needed to satisfy the "o" constraint;
this one is necessary.

Let's close this bug and open new ones on a case-by-case basis if needed.

So, fixed.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-11 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #46 from H.J. Lu hjl.tools at gmail dot com 2011-08-11 12:55:11 
UTC ---
(In reply to comment #45)
 (In reply to comment #44)
  Created attachment 24973 [details]
  Patch that recognizes addresses, zero-extended with AND, v2.
  
  (In reply to comment #42)
  
   It also caused:
   
   internal compiler error: output_operand: invalid expression as operand
   Please submit a full bug report,
  
  Oh, nice - we can just discard paradoxical subregs in this case.  Please 
  find
  attached updated revision of the patch that avoids ICE.
 
 I am testing it now.

No regressions in GCC, glibc and SPEC CPU 2K/2006.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-11 Thread uros at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #47 from uros at gcc dot gnu.org 2011-08-11 20:03:34 UTC ---
Author: uros
Date: Thu Aug 11 20:03:29 2011
New Revision: 177683

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=177683
Log:
PR target/49781
* config/i386/i386.md (*lea_5_zext): New.
(*lea_6_zext): Ditto.
* config/i386/predicates.md (const_32bit_mask): New predicate.
(lea_address_operand): Reject AND.
* config/i386/i386.c (ix86_decompose_address): Allow DImode AND with
const_32bit_mask immediate.
(ix86_print_operand_address): Handle AND.
(memory_address_length): Ditto.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/i386.md
trunk/gcc/config/i386/predicates.md


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-10 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #43 from Uros Bizjak ubizjak at gmail dot com 2011-08-10 18:19:05 
UTC ---
(In reply to comment #41)

  Patch that recognizes addresses, zero-extended with AND

 It seems to generate more leal for gcc.dg/torture/pr47744-2.c

No, the extra leas are due to the reload-nonoffsetable-address patch (the
previous one). These instructions avoid an ICE by reloading the non-offsettable
address of a double-word operand to a temporary.
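
As an illustration of why a double-word operand needs an offsettable address: a double-word move through general registers is carried out as two word-sized accesses, one at the address and one at the address plus the word size, so the address has to stay valid after a small constant is added. The following is a minimal C sketch of that split, not GCC code; `dword_t` and `load_double_word` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 16-byte value held in two 64-bit registers.  */
typedef struct { uint64_t lo, hi; } dword_t;

/* Load 16 bytes as two 8-byte accesses: one at ADDR, one at ADDR + 8.
   The second access is why the address must tolerate a constant offset.  */
static dword_t load_double_word(const unsigned char *addr)
{
    dword_t v;

    assert(addr != NULL);
    memcpy(&v.lo, addr, sizeof v.lo);
    memcpy(&v.hi, addr + sizeof v.lo, sizeof v.hi);
    return v;
}
```

When the address cannot take the extra offset, copying it to a temporary register first (the extra lea) gives the two accesses a plain register base to offset from.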


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-10 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

Uros Bizjak ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #24967|0                           |1
        is obsolete|                            |

--- Comment #44 from Uros Bizjak ubizjak at gmail dot com 2011-08-10 18:29:34 
UTC ---
Created attachment 24973
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=24973
Patch that recognizes addresses, zero-extended with AND, v2.

(In reply to comment #42)

 It also caused:
 
 internal compiler error: output_operand: invalid expression as operand
 Please submit a full bug report,

Oh, nice - we can just discard paradoxical subregs in this case.  Please find
attached updated revision of the patch that avoids ICE.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-10 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #45 from H.J. Lu hjl.tools at gmail dot com 2011-08-10 18:44:19 
UTC ---
(In reply to comment #44)
 Created attachment 24973 [details]
 Patch that recognizes addresses, zero-extended with AND, v2.
 
 (In reply to comment #42)
 
  It also caused:
  
  internal compiler error: output_operand: invalid expression as operand
  Please submit a full bug report,
 
 Oh, nice - we can just discard paradoxical subregs in this case.  Please find
 attached updated revision of the patch that avoids ICE.

I am testing it now.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-09 Thread uros at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #38 from uros at gcc dot gnu.org 2011-08-09 07:38:07 UTC ---
Author: uros
Date: Tue Aug  9 07:38:02 2011
New Revision: 177583

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=177583
Log:
PR target/49781
* config/i386/i386.md (reload_noff_load): New.
(reload_noff_store): Ditto.
* config/i386/i386.c (ix86_secondary_reload): Use
CODE_FOR_reload_noff_load and CODE_FOR_reload_noff_store to handle
double-word moves from/to non-offsetable addresses instead of
generating XMM temporary.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/i386.md


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-09 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #39 from Uros Bizjak ubizjak at gmail dot com 2011-08-09 18:37:37 
UTC ---
Created attachment 24967
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=24967
Patch that recognizes addresses, zero-extended with AND

The attached patch adds recognition of addresses zero-extended with AND.

The patch fixes the gcc.target/i386/pr43766.c testcase.

H.J., can you please test it on x32?


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-09 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #40 from H.J. Lu hjl.tools at gmail dot com 2011-08-09 18:50:10 
UTC ---
(In reply to comment #39)
 Created attachment 24967 [details]
 Patch that recognizes addresses, zero-extended with AND
 
 The attached patch adds recognition of addresses zero-extended with AND.
 
 The patch fixes the gcc.target/i386/pr43766.c testcase.
 
 H.J., can you please test it on x32?

I am testing it now.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-09 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #41 from H.J. Lu hjl.tools at gmail dot com 2011-08-09 18:59:46 
UTC ---
(In reply to comment #39)
 Created attachment 24967 [details]
 Patch that recognizes addresses, zero-extended with AND
 
 The attached patch adds recognition of addresses zero-extended with AND.
 
 The patch fixes the gcc.target/i386/pr43766.c testcase.
 
 H.J., can you please test it on x32?

It seems to generate more leal for gcc.dg/torture/pr47744-2.c
compiled with

-mx32 -O3 -std=gnu99 -ftree-vectorize -funroll-loops


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-09 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #42 from H.J. Lu hjl.tools at gmail dot com 2011-08-09 21:21:35 
UTC ---
(In reply to comment #41)
 (In reply to comment #39)
  Created attachment 24967 [details]
  Patch that recognizes addresses, zero-extended with AND
  
  The attached patch adds recognition of addresses zero-extended with AND.
  
  The patch fixes the gcc.target/i386/pr43766.c testcase.
  
  H.J., can you please test it on x32?
 
 It seems to generate more leal for gcc.dg/torture/pr47744-2.c
 compiled with
 
 -mx32 -O3 -std=gnu99 -ftree-vectorize -funroll-loops

It also caused:

spawn -ignore SIGHUP /export/build/gnu/gcc-x32/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/gcc-x32/build-x86_64-linux/gcc/
/export/gnu/import/git/gcc-x32/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c
-w -O2 -lm -mx32 -o
/export/build/gnu/gcc-x32/build-x86_64-linux/gcc/testsuite/gcc2/builtin-prefetch-4.x2
/export/gnu/import/git/gcc-x32/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c:
In function ‘preinc_glob_ptr’:
/export/gnu/import/git/gcc-x32/gcc/testsuite/gcc.c-torture/execute/builtin-prefetch-4.c:67:1:
internal compiler error: output_operand: invalid expression as operand
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
compiler exited with status 1


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-08 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #33 from H.J. Lu hjl.tools at gmail dot com 2011-08-08 13:34:47 
UTC ---
(In reply to comment #32)
 (In reply to comment #31)
  (In reply to comment #29)
   Created attachment 24938 [details]
   WIP patch that exploits addr32.
   
   New version of patch for testing. Survives bootstrap + regtest on
   x86_64-pc-linux-gnu.
  
  Oh, I forgot to add MEM_P ..., so please change the condition to:
  
    if (TARGET_64BIT
        && MEM_P (x)
        && GET_MODE_SIZE (mode) > UNITS_PER_WORD
        && rclass == GENERAL_REGS
        && !offsettable_memref_p (x))
  
  Patch still works OK.
 
 I am testing it now.

It works.  There are no regressions in GCC, glibc and SPEC CPU
2000/2006.  Thanks.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-08 Thread uros at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #34 from uros at gcc dot gnu.org 2011-08-08 14:59:22 UTC ---
Author: uros
Date: Mon Aug  8 14:59:19 2011
New Revision: 177566

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=177566
Log:
PR target/49781
* config/i386/i386.c (ix86_decompose_address): Allow zero-extended
SImode addresses.
(ix86_print_operand_address): Handle zero-extended addresses.
(memory_address_length): Add length of addr32 prefix for
zero-extended addresses.
(ix86_secondary_reload): Handle moves to/from double-word general
registers from/to zero-extended addresses.
* config/i386/predicates.md (lea_address_operand): Reject
zero-extended operands.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/predicates.md


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-08 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #35 from H.J. Lu hjl.tools at gmail dot com 2011-08-08 16:28:58 
UTC ---
It works much better now. But gcc.dg/torture/pr47744-2.c compiled with

-mx32 -O3 -std=gnu99 -ftree-vectorize -funroll-loops

still generates those leal:

leal    (%rsi,%r9), %ebp
leal    (%r10,%r9), %r12d
leal    (%r10,%r9), %r12d


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-08 Thread hjl at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #36 from hjl at gcc dot gnu.org hjl at gcc dot gnu.org 2011-08-08 
16:33:10 UTC ---
Author: hjl
Date: Mon Aug  8 16:33:06 2011
New Revision: 177569

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=177569
Log:
Add a testcase for PR target/49781.

2011-08-08  H.J. Lu  hongjiu...@intel.com

PR target/49781
* gcc.target/i386/pr49781-1.c: New.

Added:
trunk/gcc/testsuite/gcc.target/i386/pr49781-1.c
Modified:
trunk/gcc/testsuite/ChangeLog


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-08 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #37 from Uros Bizjak ubizjak at gmail dot com 2011-08-08 17:16:44 
UTC ---
(In reply to comment #35)
 It works much better now. But gcc.dg/torture/pr47744-2.c compiled with
 
 -mx32 -O3 -std=gnu99 -ftree-vectorize -funroll-loops
 
 still generates those leal:
 
 leal    (%rsi,%r9), %ebp
 leal    (%r10,%r9), %r12d
 leal    (%r10,%r9), %r12d

This partial address won't combine due to multiple uses.

IIRC, there was some discussion to distribute partial addresses back to their
use sites, but OTOH it all depends on how hard you want to hammer your address
generation unit.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-07 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #28 from Uros Bizjak ubizjak at gmail dot com 2011-08-07 07:31:06 
UTC ---
Reduced testcase:

--cut here--
void test (__int128 *array, int idx, int off)
{
  __int128 *dest = &array [idx];

  dest[0] += 1;
  dest[off] = 0;
}
--cut here--

$ cc1 -O2 -mx32 -quiet t.c
t.c: In function ‘test’:
t.c:7:1: error: insn does not satisfy its constraints:
(insn 20 10 11 2 (set (reg:TI 2 cx)
(mem:TI (zero_extend:DI (reg:SI 4 si [71])) [2 *dest_5+0 S16 A128]))
t.c:5 60 {*movti_internal_rex64}
 (nil))
t.c:7:1: internal compiler error: in reload_cse_simplify_operands, at
postreload.c:403
Please submit a full bug report,
...


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-07 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

Uros Bizjak ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #24918|0                           |1
        is obsolete|                            |
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2011.08.07 12:53:50
         AssignedTo|unassigned at gcc dot       |ubizjak at gmail dot com
                   |gnu.org                     |
     Ever Confirmed|0                           |1

--- Comment #29 from Uros Bizjak ubizjak at gmail dot com 2011-08-07 12:53:50 
UTC ---
Created attachment 24938
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=24938
WIP patch that exploits addr32.

New version of patch for testing. Survives bootstrap + regtest on
x86_64-pc-linux-gnu.

2011-08-07  Uros Bizjak  ubiz...@gmail.com

PR target/49781
* config/i386/i386.c (ix86_decompose_address): Allow zero-extended
SImode addresses.
(ix86_print_operand_address): Handle zero-extended addresses.
(memory_address_length): Add length of addr32 prefix for
zero-extended addresses.
(ix86_secondary_reload): Handle moves to/from double-word general
registers from/to zero-extended addresses.
* config/i386/predicates.md (lea_address_operand): Reject
zero-extended operands.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-07 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #30 from Uros Bizjak ubizjak at gmail dot com 2011-08-07 13:09:23 
UTC ---
(In reply to comment #29)
 Created attachment 24938 [details]
 WIP patch that exploits addr32.

BTW: This patch also fixes the following FAIL in [1]:

FAIL: gcc.target/i386/pr39543-3.c (test for excess errors)

and exposes an interesting problem in combine in

FAIL: gcc.target/i386/pr43766.c scan-assembler-not lea[lq]?[ \\t]

where combine converts the zero_extended address on the fly to an address
involving AND.  We can also recognize this as a valid address.

Trying 9 -> 10:
Failed to match this instruction:
(prefetch (and:DI (subreg:DI (plus:SI (ashift:SI (reg/v:SI 63 [ i ])
                    (const_int 2 [0x2]))
                (subreg:SI (reg/v/f:DI 62 [ a ]) 0)) 0)
        (const_int 4294967295 [0xffffffff]))
    (const_int 0 [0])
    (const_int 3 [0x3]))

[1] http://gcc.gnu.org/ml/gcc-testresults/2011-08/msg00601.html
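
The equivalence that combine relies on here can be checked in plain C: zero-extending the low 32 bits of a value gives the same 64-bit result as masking the value with 0xffffffff. This is only a sketch of the arithmetic, not of the RTL matching itself, and the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extension spelled as a narrowing cast followed by a widening one,
   the C analogue of (zero_extend:DI (subreg:SI x)).  */
static uint64_t zext_by_cast(uint64_t x)
{
    return (uint64_t)(uint32_t)x;
}

/* The same value spelled as a 64-bit AND with the 32-bit mask,
   the C analogue of (and:DI x (const_int 0xffffffff)).  */
static uint64_t zext_by_and(uint64_t x)
{
    return x & UINT64_C(0xffffffff);
}

/* Return the zero-extended value, checking that both spellings agree.  */
static uint64_t zext_checked(uint64_t x)
{
    uint64_t r = zext_by_and(x);

    assert(r == zext_by_cast(x));
    return r;
}
```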


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-07 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #31 from Uros Bizjak ubizjak at gmail dot com 2011-08-07 13:50:17 
UTC ---
(In reply to comment #29)
 Created attachment 24938 [details]
 WIP patch that exploits addr32.
 
 New version of patch for testing. Survives bootstrap + regtest on
 x86_64-pc-linux-gnu.

Oh, I forgot to add MEM_P ..., so please change the condition to:

  if (TARGET_64BIT
      && MEM_P (x)
      && GET_MODE_SIZE (mode) > UNITS_PER_WORD
      && rclass == GENERAL_REGS
      && !offsettable_memref_p (x))

Patch still works OK.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-07 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

H.J. Lu hjl.tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |UNCONFIRMED
     Ever Confirmed|1                           |0

--- Comment #32 from H.J. Lu hjl.tools at gmail dot com 2011-08-07 20:23:13 
UTC ---
(In reply to comment #31)
 (In reply to comment #29)
  Created attachment 24938 [details]
  WIP patch that exploits addr32.
  
  New version of patch for testing. Survives bootstrap + regtest on
  x86_64-pc-linux-gnu.
 
 Oh, I forgot to add MEM_P ..., so please change the condition to:
 
   if (TARGET_64BIT
       && MEM_P (x)
       && GET_MODE_SIZE (mode) > UNITS_PER_WORD
       && rclass == GENERAL_REGS
       && !offsettable_memref_p (x))
 
 Patch still works OK.

I am testing it now.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-05 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #24 from Uros Bizjak ubizjak at gmail dot com 2011-08-05 06:25:55 
UTC ---
(In reply to comment #23)
 I still got
 
 FAIL: gcc.c-torture/execute/builtins/strcat.c compilation,  -O1  (internal
 compiler error)

movdi_internal_rex64 has the wrong constraint on operand 0 for moving a 64-bit
immediate. This move needs offsettable memory, so alternative 4 should use !o
instead of !m.

Nice, how many problems -mx32 has uncovered.  I will first prepare a patch for
mainline that fixes all these inconsistencies.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-05 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #25 from Uros Bizjak ubizjak at gmail dot com 2011-08-05 19:01:44 
UTC ---
New revision of patch at [1].

[1] http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00642.html


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-05 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #26 from H.J. Lu hjl.tools at gmail dot com 2011-08-05 21:58:21 
UTC ---
(In reply to comment #25)
 New revision of patch at [1].
 
 [1] http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00642.html

It failed gcc.dg/torture/pr47744-2.c the same way as comment 8.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-05 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #27 from Uros Bizjak ubizjak at gmail dot com 2011-08-05 22:01:31 
UTC ---
(In reply to comment #26)

  [1] http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00642.html
 
 It failed gcc.dg/torture/pr47744-2.c the same way as comment 8.

Expected, please see the above link for explanation.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-04 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #16 from Uros Bizjak ubizjak at gmail dot com 2011-08-04 18:49:32 
UTC ---
(In reply to comment #14)

 /export/build/gnu/gcc-x32-test/release/usr/gcc-4.7.0-x32/bin/gcc -mx32  
 -std=gnu99 -fgnu89-inline -O2 -S testcase.c
 
 testcase.c:634:2: internal compiler error: Segmentation fault
 Please submit a full bug report,
 with preprocessed source if appropriate.
 See http://gcc.gnu.org/bugs.html for instructions.

Sometimes a subreg is returned as the base or index register.  This patch strips
subregs and fixes the ICE (I wonder why the ICE didn't show up earlier):

Index: i386.c
===================================================================
--- i386.c  (revision 177411)
+++ i386.c  (working copy)
@@ -14088,6 +14101,20 @@

   gcc_assert (ok);

+  if (parts.base && GET_CODE (parts.base) == SUBREG)
+    {
+      rtx tmp = SUBREG_REG (parts.base);
+      parts.base = simplify_subreg (GET_MODE (parts.base),
+                                    tmp, GET_MODE (tmp), 0);
+    }
+
+  if (parts.index && GET_CODE (parts.index) == SUBREG)
+    {
+      rtx tmp = SUBREG_REG (parts.index);
+      parts.index = simplify_subreg (GET_MODE (parts.index),
+                                     tmp, GET_MODE (tmp), 0);
+    }
+
   base = parts.base;
   index = parts.index;
   disp = parts.disp;

However, I think we should handle (plus (zero_extend (...addr...)) (const_int))
offsettable operands, too. Otherwise there will be no benefit for SSE
operands...


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-04 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

Uros Bizjak ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #24899|0                           |1
        is obsolete|                            |

--- Comment #17 from Uros Bizjak ubizjak at gmail dot com 2011-08-04 18:53:06 
UTC ---
Created attachment 24918
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=24918
WIP patch that exploits addr32.

The patch ATM limits the addr32 optimization to MODE_SIZE (op) <= WORD_SIZE due
to its inability to handle offsettable operands.

But please test it on x32 anyway.
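
For readers unfamiliar with the prefix: with 32-bit address size in 64-bit mode, the effective address is computed modulo 2^32 and then zero-extended, which is what lets the hardware do the zero-extension that otherwise needs a separate lea. A minimal C sketch of that arithmetic follows; `ea_addr32` and `ea_addr64` are hypothetical helpers for illustration, and the 64-bit variant assumes the registers hold zero-extended 32-bit values.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address with 32-bit address size: the sum wraps modulo 2^32
   and the result is zero-extended to 64 bits.  */
static uint64_t ea_addr32(uint32_t base, uint32_t index, uint32_t scale,
                          int32_t disp)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    uint32_t sum = base + index * scale + (uint32_t)disp;
    return (uint64_t)sum;
}

/* The same components added with 64-bit address size: no wrap at 2^32,
   so the result can point past the 4 GiB x32 address space.  */
static uint64_t ea_addr64(uint32_t base, uint32_t index, uint32_t scale,
                          int32_t disp)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return (uint64_t)base + (uint64_t)index * scale + (uint64_t)(int64_t)disp;
}
```

The two functions agree whenever the 32-bit sum does not wrap; they differ exactly in the cases where pointer arithmetic in x32 overflows 32 bits.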


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-04 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #18 from H.J. Lu hjl.tools at gmail dot com 2011-08-04 19:40:38 
UTC ---
(In reply to comment #17)
 Created attachment 24918 [details]
 WIP patch that exploits addr32.
 
 The patch ATM limits the addr32 optimization to MODE_SIZE (op) <= WORD_SIZE due
 to its inability to handle offsettable operands.
 
 But please test it on x32 anyway.

I am testing it now.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-04 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #19 from Uros Bizjak ubizjak at gmail dot com 2011-08-04 19:59:59 
UTC ---
(In reply to comment #18)
 (In reply to comment #17)
  Created attachment 24918 [details]
  WIP patch that exploits addr32.
  
  The patch ATM limits the addr32 optimization to MODE_SIZE (op) <= WORD_SIZE
  due to its inability to handle offsettable operands.
  
  But please test it on x32 anyway.
 
 I am testing it now.

Uh, I forgot to add this code to predicates.md, aligned_operand (and perhaps
cmpxchg8b_pic_memory_operand)...


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-04 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #20 from Uros Bizjak ubizjak at gmail dot com 2011-08-04 20:09:48 
UTC ---
This is the missing part of predicates.md:

Index: predicates.md
===================================================================
--- predicates.md  (revision 177411)
+++ predicates.md  (working copy)
@@ -840,6 +844,12 @@
   ok = ix86_decompose_address (op, &parts);
   gcc_assert (ok);

+  if (GET_CODE (parts.base) == SUBREG)
+    parts.base = SUBREG_REG (parts.base);
+
+  if (GET_CODE (parts.index) == SUBREG)
+    parts.index = SUBREG_REG (parts.index);
+
   /* Look for some component that isn't known to be aligned.  */
   if (parts.index)
     {
@@ -903,6 +913,10 @@

   ok = ix86_decompose_address (XEXP (op, 0), &parts);
   gcc_assert (ok);
+
+  if (GET_CODE (parts.base) == SUBREG)
+    parts.base = SUBREG_REG (parts.base);
+
   if (parts.base == NULL_RTX
       || parts.base == arg_pointer_rtx
       || parts.base == frame_pointer_rtx
@@ -910,6 +924,9 @@
       || parts.base == stack_pointer_rtx)
     return true;

+  if (GET_CODE (parts.index) == SUBREG)
+    parts.index = SUBREG_REG (parts.index);
+
   if (parts.index == NULL_RTX
       || parts.index == arg_pointer_rtx
       || parts.index == frame_pointer_rtx


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-04 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #21 from H.J. Lu hjl.tools at gmail dot com 2011-08-04 21:00:11 
UTC ---
(In reply to comment #20)
 This is the missing part of predicates.md:
 

It failed on 64bit bootstrap:

[hjl@gnu-33 libgcc]$ cat /tmp/doit 
/export/build/gnu/gcc-x32-test/build-x86_64-linux/./gcc/xgcc
-B/export/build/gnu/gcc-x32-test/build-x86_64-linux/./gcc/
-B/usr/gcc-4.7.0-x32/x86_64-unknown-linux-gnu/bin/
-B/usr/gcc-4.7.0-x32/x86_64-unknown-linux-gnu/lib/ -isystem
/usr/gcc-4.7.0-x32/x86_64-unknown-linux-gnu/include -isystem
/usr/gcc-4.7.0-x32/x86_64-unknown-linux-gnu/sys-include -O2 -g -O2  -O2 -g
-DIN_GCC   -W -Wall -Wwrite-strings -Wcast-qual -Wstrict-prototypes
-Wmissing-prototypes -Wold-style-definition  -isystem ./include  -fPIC -g
-DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector   -I.
-I. -I../.././gcc -I/export/gnu/import/git/gcc-x32/libgcc
-I/export/gnu/import/git/gcc-x32/libgcc/.
-I/export/gnu/import/git/gcc-x32/libgcc/../gcc
-I/export/gnu/import/git/gcc-x32/libgcc/../include
-I/export/gnu/import/git/gcc-x32/libgcc/config/libbid
-DENABLE_DECIMAL_BID_FORMAT -DHAVE_CC_TLS  -DUSE_TLS -o bid64_div.o -MT
bid64_div.o -MD -MP -MF bid64_div.dep -c
/export/gnu/import/git/gcc-x32/libgcc/config/libbid/bid64_div.c
[hjl@gnu-33 libgcc]$
/export/build/gnu/gcc-x32-test/build-x86_64-linux/./gcc/xgcc
-B/export/build/gnu/gcc-x32-test/build-x86_64-linux/./gcc/
-B/usr/gcc-4.7.0-x32/x86_64-unknown-linux-gnu/bin/
-B/usr/gcc-4.7.0-x32/x86_64-unknown-linux-gnu/lib/ -isystem
/usr/gcc-4.7.0-x32/x86_64-unknown-linux-gnu/include -isystem
/usr/gcc-4.7.0-x32/x86_64-unknown-linux-gnu/sys-include -O2 -g -O2  -O2 -g
-DIN_GCC   -W -Wall -Wwrite-strings -Wcast-qual -Wstrict-prototypes
-Wmissing-prototypes -Wold-style-definition  -isystem ./include  -fPIC -g
-DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector   -I.
-I. -I../.././gcc -I/export/gnu/import/git/gcc-x32/libgcc
-I/export/gnu/import/git/gcc-x32/libgcc/.
-I/export/gnu/import/git/gcc-x32/libgcc/../gcc
-I/export/gnu/import/git/gcc-x32/libgcc/../include
-I/export/gnu/import/git/gcc-x32/libgcc/config/libbid
-DENABLE_DECIMAL_BID_FORMAT -DHAVE_CC_TLS  -DUSE_TLS -o bid64_div.o -MT
bid64_div.o -MD -MP -MF bid64_div.dep -c
/export/gnu/import/git/gcc-x32/libgcc/config/libbid/bid64_div.c
/export/gnu/import/git/gcc-x32/libgcc/config/libbid/bid64_div.c: In function
‘__bid64dq_div’:
/export/gnu/import/git/gcc-x32/libgcc/config/libbid/bid64_div.c:523:51:
warning: variable ‘Ql’ set but not used [-Wunused-but-set-variable]
/export/gnu/import/git/gcc-x32/libgcc/config/libbid/bid64_div.c: In function
‘__bid64qd_div’:
/export/gnu/import/git/gcc-x32/libgcc/config/libbid/bid64_div.c:937:51:
warning: variable ‘Ql’ set but not used [-Wunused-but-set-variable]
/export/gnu/import/git/gcc-x32/libgcc/config/libbid/bid64_div.c: In function
‘__bid64qq_div’:
/export/gnu/import/git/gcc-x32/libgcc/config/libbid/bid64_div.c:1374:51:
warning: variable ‘Ql’ set but not used [-Wunused-but-set-variable]
/export/gnu/import/git/gcc-x32/libgcc/config/libbid/bid64_div.c: In function
‘__bid64_div’:
/export/gnu/import/git/gcc-x32/libgcc/config/libbid/bid64_div.c:516:1: internal
compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-04 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #22 from H.J. Lu hjl.tools at gmail dot com 2011-08-04 21:03:07 
UTC ---
I am testing:

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index ae1fa74..373e74b 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -844,10 +844,10 @@
   ok = ix86_decompose_address (op, &parts);
   gcc_assert (ok);

-  if (GET_CODE (parts.base) == SUBREG)
+  if (parts.base && GET_CODE (parts.base) == SUBREG)
     parts.base = SUBREG_REG (parts.base);

-  if (GET_CODE (parts.index) == SUBREG)
+  if (parts.index && GET_CODE (parts.index) == SUBREG)
     parts.index = SUBREG_REG (parts.index);

   /* Look for some component that isn't known to be aligned.  */
@@ -914,7 +914,7 @@
   ok = ix86_decompose_address (XEXP (op, 0), &parts);
   gcc_assert (ok);

-  if (GET_CODE (parts.base) == SUBREG)
+  if (parts.base && GET_CODE (parts.base) == SUBREG)
     parts.base = SUBREG_REG (parts.base);

   if (parts.base == NULL_RTX
@@ -924,7 +924,7 @@
       || parts.base == stack_pointer_rtx)
     return true;

-  if (GET_CODE (parts.index) == SUBREG)
+  if (parts.index && GET_CODE (parts.index) == SUBREG)
     parts.index = SUBREG_REG (parts.index);

   if (parts.index == NULL_RTX
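
The point of the change above is that an address may have no base or no index, so the component must be tested for NULL before its code is inspected. A toy C model of that guard follows; `toy_rtx`, `toy_code` and `strip_subreg` are hypothetical names for illustration and are not GCC's rtx API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy expression node: either a plain register or a SUBREG wrapper.  */
enum toy_code { TOY_REG, TOY_SUBREG };

struct toy_rtx {
    enum toy_code code;
    struct toy_rtx *inner;   /* wrapped expression when code == TOY_SUBREG */
};

/* Return the expression inside a SUBREG wrapper, or X unchanged.
   X may be NULL, mirroring an address that has no base or no index;
   the NULL test must come before X->code is read.  */
static struct toy_rtx *strip_subreg(struct toy_rtx *x)
{
    if (x && x->code == TOY_SUBREG)
    {
        assert(x->inner != NULL);
        return x->inner;
    }
    return x;
}
```

Dropping the `x &&` test reproduces the failure mode seen in the 64-bit bootstrap: a NULL component is dereferenced and the compiler segfaults.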


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-04 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #23 from H.J. Lu hjl.tools at gmail dot com 2011-08-04 22:24:58 
UTC ---
I still got

FAIL: gcc.c-torture/execute/builtins/strcat.c compilation,  -O1  (internal
compiler error)
FAIL: gcc.c-torture/unsorted/xdi.c,  -O1   (internal compiler error)
FAIL: gcc.dg/torture/pr45636.c  -O1  (internal compiler error)
FAIL: gcc.dg/torture/pr45636.c  -O1  (test for excess errors)
FAIL: gfortran.dg/bounds_check_array_ctor_1.f90  -O1  (internal compiler error)
FAIL: gfortran.dg/bounds_check_array_ctor_1.f90  -O1  (test for excess errors)
FAIL: gfortran.dg/bounds_check_array_ctor_6.f90  -O1  (internal compiler error)
FAIL: gfortran.dg/bounds_check_array_ctor_6.f90  -O1  (test for excess errors)
FAIL: gfortran.dg/bounds_check_array_ctor_8.f90  -O1  (internal compiler error)
FAIL: gfortran.dg/bounds_check_array_ctor_8.f90  -O1  (test for excess errors)
FAIL: gfortran.dg/character_array_constructor_1.f90  -O1  (internal compiler
error)
FAIL: gfortran.dg/character_array_constructor_1.f90  -O1  (test for excess
errors)
FAIL: gfortran.dg/char_expr_3.f90  -O1  (internal compiler error)
FAIL: gfortran.dg/char_expr_3.f90  -O1  (test for excess errors)
FAIL: gfortran.dg/char_length_3.f90  -O  (internal compiler error)
FAIL: gfortran.dg/char_length_3.f90  -O  (test for excess errors)
FAIL: gfortran.dg/extends_1.f03  -O1  (internal compiler error)
FAIL: gfortran.dg/extends_1.f03  -O1  (test for excess errors)
FAIL: gfortran.dg/g77/7388.f  -O1  (internal compiler error)
FAIL: gfortran.dg/g77/7388.f  -O1  (test for excess errors)
FAIL: gfortran.dg/gomp/crayptr5.f90  -O  (internal compiler error)
FAIL: gfortran.dg/gomp/crayptr5.f90  -O  (test for excess errors)
FAIL: gfortran.dg/namelist_66.f90  -O2  (internal compiler error)
FAIL: gfortran.dg/namelist_66.f90  -O2  (test for excess errors)
FAIL: gfortran.dg/proc_ptr_22.f90  -O2  (internal compiler error)
FAIL: gfortran.dg/proc_ptr_22.f90  -O2  (test for excess errors)
FAIL: gfortran.dg/proc_ptr_22.f90  -O3 -fomit-frame-pointer  (internal compiler
error)
FAIL: gfortran.dg/proc_ptr_22.f90  -O3 -fomit-frame-pointer  (test for excess
errors)
FAIL: gfortran.dg/proc_ptr_22.f90  -O3 -g  (internal compiler error)
FAIL: gfortran.dg/proc_ptr_22.f90  -O3 -g  (test for excess errors)
FAIL: gfortran.dg/proc_ptr_comp_12.f90  -O2  (internal compiler error)
FAIL: gfortran.dg/proc_ptr_comp_12.f90  -O2  (test for excess errors)
FAIL: gfortran.dg/proc_ptr_comp_12.f90  -O3 -fomit-frame-pointer  (internal
compiler error)
FAIL: gfortran.dg/proc_ptr_comp_12.f90  -O3 -fomit-frame-pointer  (test for
excess errors)
FAIL: gfortran.dg/proc_ptr_comp_12.f90  -O3 -g  (internal compiler error)
FAIL: gfortran.dg/proc_ptr_comp_12.f90  -O3 -g  (test for excess errors)


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-03 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #8 from H.J. Lu hjl.tools at gmail dot com 2011-08-03 14:10:25 
UTC ---
(In reply to comment #6)
 Created attachment 24899 [details]
 Proposed patch that exploits addr32.
 
 H.J., can you please test this patch on mx32.
 
 The patch bootstraps and regression tests OK on x86_64-pc-linux-gnu {,-m32}.

It failed the testcase for PR 47744, which I just checked in:

[hjl@gnu-33 ilp32-24]$
/export/build/gnu/gcc-x32-test/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/gcc-x32-test/build-x86_64-linux/gcc/ -S -o x.s -mx32 -O3
-std=gnu99
/export/gnu/import/git/gcc-x32/gcc/testsuite/gcc.dg/torture/pr47744-2.c
/export/gnu/import/git/gcc-x32/gcc/testsuite/gcc.dg/torture/pr47744-2.c: In
function ‘matmul_i16’:
/export/gnu/import/git/gcc-x32/gcc/testsuite/gcc.dg/torture/pr47744-2.c:40:1:
error: insn does not satisfy its constraints:
(insn 146 66 67 4 (set (reg:TI 0 ax)
(mem:TI (zero_extend:DI (plus:SI (reg:SI 4 si [orig:119 ivtmp.30 ]
[119])
(reg:SI 5 di [orig:102 dest_y ] [102]))) [6 MEM[base:
dest_y_18, index: ivtmp.30_63, offset: 0B]+0 S16 A128]))
/export/gnu/import/git/gcc-x32/gcc/testsuite/gcc.dg/torture/pr47744-2.c:34 60
{*movti_internal_rex64}
 (nil))
/export/gnu/import/git/gcc-x32/gcc/testsuite/gcc.dg/torture/pr47744-2.c:40:1:
internal compiler error: in reload_cse_simplify_operands, at postreload.c:403
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
[hjl@gnu-33 ilp32-24]$


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-03 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #9 from Uros Bizjak ubizjak at gmail dot com 2011-08-03 14:45:40 
UTC ---
(In reply to comment #8)

Hm, offsetable operand ...


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-03 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #10 from Uros Bizjak ubizjak at gmail dot com 2011-08-03 15:01:10 
UTC ---
(In reply to comment #9)
> Hm, offsetable operand ...

This additional patch prevents zero_extend when we deal with
wider-than-word-size moves.  These moves need offsetable_operand, which
zero_extend (...) isn't.

Index: i386.c
===================================================================
--- i386.c	(revision 177281)
+++ i386.c	(working copy)
@@ -11681,6 +11689,10 @@ ix86_legitimate_address_p (enum machine_
   rtx base, index, disp;
   HOST_WIDE_INT scale;
 
+  if (GET_CODE (addr) == ZERO_EXTEND
+      && GET_MODE_SIZE (mode) > UNITS_PER_WORD)
+    return false;
+
   if (ix86_decompose_address (addr, &parts) <= 0)
     /* Decomposition failed.  */
     return false;


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-03 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #11 from H.J. Lu hjl.tools at gmail dot com 2011-08-03 15:44:59 
UTC ---
(In reply to comment #10)

gcc.dg/torture/pr47744-2.c compiled with

-mx32 -O3 -std=gnu99 -ftree-vectorize -funroll-loops

generates code like

	leal	(%rax,%r9), %r12d
	leal	(%rax,%rdi), %r10d
	mov	%r12d, %edx
	movq	(%r12d), %rbp
	movq	8(%rdx), %rdx
	movq	(%r12d), %rax

Many of the leal instructions aren't necessary.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-03 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #12 from Uros Bizjak ubizjak at gmail dot com 2011-08-03 16:08:47 
UTC ---
(In reply to comment #11)

> Many leal aren't necessary.

This is the trade-off for using an offsetable address for DWI operands.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-03 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #13 from H.J. Lu hjl.tools at gmail dot com 2011-08-03 16:18:45 
UTC ---
(In reply to comment #10)
 
Doesn't work. I got

FAIL: gcc.dg/graphite/pr35356-2.c (internal compiler error)
FAIL: gcc.dg/graphite/pr35356-2.c (test for excess errors)
FAIL: libgomp.fortran/omp_parse4.f90  -Os  (internal compiler error)
FAIL: libgomp.fortran/omp_parse4.f90  -Os  (test for excess errors)
FAIL: gfortran.dg/gomp/crayptr5.f90  -O  (internal compiler error)
FAIL: gfortran.dg/gomp/crayptr5.f90  -O  (test for excess errors)

so far:

spawn -ignore SIGHUP /export/build/gnu/gcc-x32/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/gcc-x32/build-x86_64-linux/gcc/
/export/gnu/import/git/gcc-x32/gcc/testsuite/gcc.dg/graphite/pr35356-2.c -O2
-fgraphite-identity -fdump-tree-graphite-all -S -mx32 -o pr35356-2.s
/export/gnu/import/git/gcc-x32/gcc/testsuite/gcc.dg/graphite/pr35356-2.c: In
function 'foo':
/export/gnu/import/git/gcc-x32/gcc/testsuite/gcc.dg/graphite/pr35356-2.c:17:1:
internal compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.
compiler exited with status 1


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-03 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #14 from H.J. Lu hjl.tools at gmail dot com 2011-08-03 16:47:09 
UTC ---
Created attachment 24907
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24907
A testcase

[hjl@gnu-33 delta]$
/export/build/gnu/gcc-x32-test/release/usr/gcc-4.7.0-x32/bin/gcc -mx32  
-std=gnu99 -fgnu89-inline -O2 -S testcase.c

testcase.c:634:2: internal compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See http://gcc.gnu.org/bugs.html for instructions.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-03 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #15 from H.J. Lu hjl.tools at gmail dot com 2011-08-03 16:49:12 
UTC ---
(In reply to comment #13)

Total regressions:

FAIL: gcc.c-torture/compile/2717-1.c  -Os  (internal compiler error)
FAIL: gcc.c-torture/compile/2717-1.c  -Os  (test for excess errors)
FAIL: gcc.c-torture/execute/builtins/strcat.c compilation,  -O1  (internal
compiler error)
FAIL: gcc.c-torture/unsorted/xdi.c,  -O1   (internal compiler error)
FAIL: gcc.dg/graphite/pr35356-2.c (internal compiler error)
FAIL: gcc.dg/graphite/pr35356-2.c (test for excess errors)
FAIL: gcc.dg/torture/pr45636.c  -O1  (internal compiler error)
FAIL: gcc.dg/torture/pr45636.c  -O1  (test for excess errors)
FAIL: gfortran.dg/alloc_comp_assign_4.f90  -O1  (internal compiler error)
FAIL: gfortran.dg/alloc_comp_assign_4.f90  -O1  (test for excess errors)
FAIL: gfortran.dg/alloc_comp_assign_4.f90  -O2  (internal compiler error)
FAIL: gfortran.dg/alloc_comp_assign_4.f90  -O2  (test for excess errors)
FAIL: gfortran.dg/alloc_comp_assign_4.f90  -Os  (internal compiler error)
FAIL: gfortran.dg/alloc_comp_assign_4.f90  -Os  (test for excess errors)
FAIL: gfortran.dg/bounds_check_array_ctor_1.f90  -O1  (internal compiler error)
FAIL: gfortran.dg/bounds_check_array_ctor_1.f90  -O1  (test for excess errors)
FAIL: gfortran.dg/bounds_check_array_ctor_6.f90  -O1  (internal compiler error)
FAIL: gfortran.dg/bounds_check_array_ctor_6.f90  -O1  (test for excess errors)
FAIL: gfortran.dg/bounds_check_array_ctor_8.f90  -O1  (internal compiler error)
FAIL: gfortran.dg/bounds_check_array_ctor_8.f90  -O1  (test for excess errors)
FAIL: gfortran.dg/character_array_constructor_1.f90  -O1  (internal compiler
error)
FAIL: gfortran.dg/character_array_constructor_1.f90  -O1  (test for excess
errors)
FAIL: gfortran.dg/char_expr_3.f90  -O1  (internal compiler error)
FAIL: gfortran.dg/char_expr_3.f90  -O1  (test for excess errors)
FAIL: gfortran.dg/char_length_3.f90  -O  (internal compiler error)
FAIL: gfortran.dg/char_length_3.f90  -O  (test for excess errors)
FAIL: gfortran.dg/extends_1.f03  -O1  (internal compiler error)
FAIL: gfortran.dg/extends_1.f03  -O1  (test for excess errors)
FAIL: gfortran.dg/g77/7388.f  -O1  (internal compiler error)
FAIL: gfortran.dg/g77/7388.f  -O1  (test for excess errors)
FAIL: gfortran.dg/gomp/crayptr5.f90  -O  (internal compiler error)
FAIL: gfortran.dg/gomp/crayptr5.f90  -O  (test for excess errors)
FAIL: gfortran.dg/namelist_66.f90  -O2  (internal compiler error)
FAIL: gfortran.dg/namelist_66.f90  -O2  (test for excess errors)
FAIL: gfortran.dg/proc_ptr_22.f90  -O2  (internal compiler error)
FAIL: gfortran.dg/proc_ptr_22.f90  -O2  (test for excess errors)
FAIL: gfortran.dg/proc_ptr_22.f90  -O3 -fomit-frame-pointer  (internal compiler
error)
FAIL: gfortran.dg/proc_ptr_22.f90  -O3 -fomit-frame-pointer  (test for excess
errors)
FAIL: gfortran.dg/proc_ptr_22.f90  -O3 -g  (internal compiler error)
FAIL: gfortran.dg/proc_ptr_22.f90  -O3 -g  (test for excess errors)
FAIL: gfortran.dg/proc_ptr_comp_12.f90  -O2  (internal compiler error)
FAIL: gfortran.dg/proc_ptr_comp_12.f90  -O2  (test for excess errors)
FAIL: gfortran.dg/proc_ptr_comp_12.f90  -O3 -fomit-frame-pointer  (internal
compiler error)
FAIL: gfortran.dg/proc_ptr_comp_12.f90  -O3 -fomit-frame-pointer  (test for
excess errors)
FAIL: gfortran.dg/proc_ptr_comp_12.f90  -O3 -g  (internal compiler error)
FAIL: gfortran.dg/proc_ptr_comp_12.f90  -O3 -g  (test for excess errors)
FAIL: gfortran.fortran-torture/execute/where_11.f90,  -Os  (internal compiler
error)
FAIL: gfortran.fortran-torture/execute/where_2.f90,  -Os  (internal compiler
error)
FAIL: libgomp.fortran/omp_parse4.f90  -Os  (internal compiler error)
FAIL: libgomp.fortran/omp_parse4.f90  -Os  (test for excess errors)


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-03 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #3 from Uros Bizjak ubizjak at gmail dot com 2011-08-03 07:00:12 
UTC ---
Basically, we should allow ZERO_EXTEND in address:

Trying 117 -> 118:
Failed to match this instruction:
(set (mem:SI (zero_extend:DI (plus:SI (mult:SI (reg/v:SI 150 [ n ])
(const_int 4 [0x4]))
(reg/f:SI 189))) [2 MEM[symbol: heap, index: D.2768_17, offset:
4B]+0 S4 A32])
(reg/v:SI 150 [ n ]))

We should emit addr32 in this case.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-03 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #4 from Uros Bizjak ubizjak at gmail dot com 2011-08-03 10:24:17 
UTC ---
(In reply to comment #0)

> Many lea insns can be combined with the load/store insn followed.

I have a patch that generates addr32 prefix. The result:

	.file	"pr49781.c"
	.text
	.p2align 4,,15
	.globl	foo
	.type	foo, @function
foo:
.LFB0:
	.cfi_startproc
	testl	%edi, %edi
	jle	.L9
	subl	$1, %edi
	leaq	4+heap(%rip), %rdx
	xorl	%eax, %eax
	mov	%edi, %ecx
	addq	$1, %rcx
	.p2align 4,,10
	.p2align 3
.L4:
	movl	%eax, (%rdx,%rax,4)
	addq	$1, %rax
	cmpq	%rcx, %rax
	jne	.L4
	movl	%edi, %ecx
.L3:
	leaq	2288+heap(%rip), %rax
	movl	$571, %edx
	jmp	.L6
	.p2align 4,,10
	.p2align 3
.L10:
	leaq	heap(%rip), %rcx
	movslq	%edi, %rsi
	subl	$1, %edi
	movl	(%rcx,%rsi,4), %ecx
.L6:
	movl	4+heap(%rip), %esi
	movl	%ecx, 4+heap(%rip)
	movl	%ecx, -4(%rax)
	movl	%esi, (%rax)
	movl	%edx, %esi
	subq	$8, %rax
	subl	$2, %edx
	cmpl	$1, %edi
	jg	.L10
	movl	%edi, heap_len(%rip)
	movl	%esi, heap_max(%rip)
	ret
.L9:
	movl	heap(%rip), %ecx
	movl	$-1, %edi
	jmp	.L3
	.cfi_endproc
.LFE0:
	.size	foo, .-foo
	.section	.text.startup,"ax",@progbits
	.p2align 4,,15
	.globl	main
	.type	main, @function
main:
.LFB1:
	.cfi_startproc
	subq	$8, %rsp
	.cfi_def_cfa_offset 16
	movl	$286, %edi
	call	foo@PLT
	xorl	%eax, %eax
	addq	$8, %rsp
	.cfi_def_cfa_offset 8
	ret
	.cfi_endproc
.LFE1:
	.size	main, .-main
	.local	heap_len
	.comm	heap_len,4,4
	.local	heap_max
	.comm	heap_max,4,4
	.local	heap
	.comm	heap,2292,32
	.ident	"GCC: (GNU) 4.7.0 20110801 (experimental) [trunk revision 176998]"
	.section	.note.GNU-stack,"",@progbits


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-03 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #5 from Uros Bizjak ubizjak at gmail dot com 2011-08-03 10:27:35 
UTC ---
(In reply to comment #4)
> (In reply to comment #0)
>
> > Many lea insns can be combined with the load/store insn followed.
>
> I have a patch that generates addr32 prefix. The result:

Er, wrong one (that was -m64). This one is -mx32:

	.file	"pr49781.c"
	.text
	.p2align 4,,15
	.globl	foo
	.type	foo, @function
foo:
.LFB0:
	.cfi_startproc
	xorl	%eax, %eax
	testl	%edi, %edi
	leal	4+heap(%rip), %edx
	jle	.L10
	.p2align 4,,10
	.p2align 3
.L7:
	movl	%eax, (%edx,%eax,4)
	addl	$1, %eax
	cmpl	%edi, %eax
	jne	.L7
	leal	-1(%rdi), %eax
	movl	%eax, %ecx
.L3:
	movl	$573, %esi
	leal	2296+heap(%rip), %r10d
	leal	2292+heap(%rip), %r9d
	leal	heap(%rip), %r11d
	jmp	.L5
	.p2align 4,,10
	.p2align 3
.L11:
	movl	(%r11d,%eax,4), %ecx
	subl	$1, %eax
.L5:
	movl	4+heap(%rip), %r8d
	movl	%eax, %edx
	subl	$2, %esi
	subl	%edi, %edx
	movl	%ecx, 4+heap(%rip)
	sall	$3, %edx
	cmpl	$1, %eax
	movl	%r8d, (%r10d,%edx)
	movl	%ecx, (%r9d,%edx)
	jg	.L11
	movl	%eax, heap_len(%rip)
	movl	%esi, heap_max(%rip)
	ret
.L10:
	movl	heap(%rip), %ecx
	xorl	%edi, %edi
	movl	$-1, %eax
	jmp	.L3
	.cfi_endproc
.LFE0:
	.size	foo, .-foo
	.section	.text.startup,"ax",@progbits
	.p2align 4,,15
	.globl	main
	.type	main, @function
main:
.LFB1:
	.cfi_startproc
	subq	$8, %rsp
	.cfi_def_cfa_offset 16
	movl	$286, %edi
	call	foo@PLT
	xorl	%eax, %eax
	addq	$8, %rsp
	.cfi_def_cfa_offset 8
	ret
	.cfi_endproc
.LFE1:
	.size	main, .-main
	.local	heap_len
	.comm	heap_len,4,4
	.local	heap_max
	.comm	heap_max,4,4
	.local	heap
	.comm	heap,2292,32
	.ident	"GCC: (GNU) 4.7.0 20110803 (experimental) [trunk revision 177229]"
	.section	.note.GNU-stack,"",@progbits


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-03 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #6 from Uros Bizjak ubizjak at gmail dot com 2011-08-03 10:30:39 
UTC ---
Created attachment 24899
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24899
Proposed patch that exploits addr32.

H.J., can you please test this patch on mx32.

The patch bootstraps and regression tests OK on x86_64-pc-linux-gnu {,-m32}.


[Bug target/49781] [x32] Unnecessary lea in x32 mode

2011-08-03 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

--- Comment #7 from H.J. Lu hjl.tools at gmail dot com 2011-08-03 13:26:20 
UTC ---
(In reply to comment #6)
> Created attachment 24899 [details]
> Proposed patch that exploits addr32.
>
> H.J., can you please test this patch on mx32.
>
> The patch bootstraps and regression tests OK on x86_64-pc-linux-gnu {,-m32}.

I started it now. Thanks.