[Bug middle-end/45312] [4.4 Regression] GCC 4.4.4 miscompiles the Linux kernel

2010-09-07 Thread uweigand at gcc dot gnu dot org


--- Comment #18 from uweigand at gcc dot gnu dot org  2010-09-07 19:18 
---
(In reply to comment #17)
> I am thinking in the same direction.  merge_assigned_reloads dates from 1993.
> Since then it has practically not been changed.  I guess postreload can remove
> unnecessary loads if they are generated without merge_assigned_reloads.
> 
> I've tried to compile SPEC2000 by gcc-4.4 with and without
> merge_assigned_reloads.  I did not find any code difference.  I've tried a lot
> of other programs with the same result.  The single difference in code I found
> exists on this test case.

Thanks, that's certainly good to know!

> So I'd remove merge_assigned_reloads altogether as it became obsolete long ago.

I agree, this seems the best way forward.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45312



[Bug middle-end/45312] [4.4 Regression] GCC 4.4.4 miscompiles the Linux kernel

2010-09-06 Thread uweigand at gcc dot gnu dot org


--- Comment #16 from uweigand at gcc dot gnu dot org  2010-09-06 16:57 
---
(In reply to comment #15)
> Ulrich, I've just wanted to post the following when I found that you already
> posted analogous conclusion.  I should have been on CC to see your comment
> right away.  The problem is really fundamental.  Code for
> merge_assigned_reloads ignores inheritance (and dependencies between reloads
> because of inheritance) at all.  Here is my post wanted to add.

I just noticed that even in the complete absence of reload inheritance, the
allocate_reload_reg routine performs free_for_value_p checks, and therefore
implicitly takes reload ordering into account.  This seems to imply that even
if we'd do merge_assigned_reloads only if no inheritance has taken place, we'd
still have a problem.

Does anybody have any idea how much merge_assigned_reloads actually contributes
to performance on i386, in particular now that we have a bit more post-reload
optimizers that potentially clear up duplicate code of the type generated by
unmerged reloads?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45312



[Bug middle-end/45312] [4.4 Regression] GCC 4.4.4 miscompiles the Linux kernel

2010-09-03 Thread uweigand at gcc dot gnu dot org


--- Comment #14 from uweigand at gcc dot gnu dot org  2010-09-03 18:30 
---
(In reply to comment #12)
> Yes, it would but I think the reload should still generate the right code in
> this particular order of insns.  IMHO, fixing the order of insn is not the
> right thing to do because there might be situation of cycle (e.g. value of
> p600 is inherited from 2 but should be reloaded into 3 and p356 is inherited
> from 3 and should be reloaded into 2).
> 
> The problem is definitely in reload inheritance.  Reg_last_reload_reg is not
> invalidated by insn #1407 which is generated by another reload of insn #675.

Actually, I think this really is a reload insn ordering problem. Note that
reload inheritance tracking via reg_last_reload_reg works on the whole reloaded
insn including all the generated reload insns as a whole. Conflicts between
different reloads of the same original insn should be treated specially, taking
into account reload insn ordering (this is done in free_for_value_p).

In this particular case, what happens is this.  After initial reload register
selection, we have this set of reloads:

Reload 0: reload_in (SI) = (reg/v/f:SI 132 [ kpte ])
        GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 0)
        reload_in_reg: (reg/v/f:SI 132 [ kpte ])
        reload_reg_rtx: (reg:SI 5 di)
Reload 1: reload_in (SI) = (reg/v/f:SI 132 [ kpte ])
        DIREG, RELOAD_FOR_INPUT (opnum = 1)
        reload_in_reg: (reg/v/f:SI 132 [ kpte ])
        reload_reg_rtx: (reg:SI 5 di)
Reload 2: reload_in (SI) = (reg:SI 588 [ D.29684 ])
        BREG, RELOAD_FOR_INPUT (opnum = 2)
        reload_in_reg: (reg:SI 588 [ D.29684 ])
        reload_reg_rtx: (reg:SI 3 bx)
Reload 3: reload_in (SI) = (reg:SI 356)
        CREG, RELOAD_FOR_INPUT (opnum = 3)
        reload_in_reg: (reg:SI 356)
        reload_reg_rtx: (reg:SI 2 cx)

Reload inheritance notes that reload_in of reload 2 (reg:SI 588) is already
available in CX at this point.  While this cannot be used as reload register
due to the BREG class, it is chosen as "override input", i.e. reload_in is set
to (reg:SI 2 cx) instead of (reg:SI 588).

Code near the end of choose_reload_regs then verifies whether this causes any
problems due to re-use of the CX register, and comes to the (correct) decision
that it does not, since it knows the reload insn for reload 3 (a
RELOAD_FOR_INPUT for operand 3) which clobbers CX will certainly be generated
*after* the override-input reload insn for reload 2 (a RELOAD_FOR_INPUT for
operand 2).

Unfortunately, some time *after* this decision has been made, the when_needed
field of reload 3 is changed from RELOAD_FOR_INPUT to RELOAD_OTHER.  This
causes the sequence of generated reload insns to change (RELOAD_OTHER reloads
are emitted before RELOAD_FOR_INPUT reloads) -- this breaks the assumptions the
reload inheritance code made, and causes wrong code to be generated.
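
The ordering hazard can be reproduced in a small toy model.  The sketch below
is not GCC code: the struct, the two-register machine and every toy_* name are
hypothetical.  The only behavior taken from the description above is that
RELOAD_OTHER reloads are emitted before RELOAD_FOR_INPUT reloads, and that
reload 2 reads CX while reload 3 overwrites it.

```c
#include <stdlib.h>

/* Toy model only; none of these names exist in GCC.  Enumerators are
   listed in emission order: RELOAD_OTHER reloads come out first.  */
enum toy_when { TOY_RELOAD_OTHER, TOY_RELOAD_FOR_INPUT };

enum toy_reg { TOY_BX, TOY_CX, TOY_NREGS };

struct toy_reload
{
  int opnum;             /* operand number, orders the input reloads */
  enum toy_when when;    /* plays the role of when_needed */
  enum toy_reg dest;     /* reload register */
  int from_reg;          /* nonzero: copy from src_reg (override input) */
  enum toy_reg src_reg;
  int src_value;         /* otherwise: value loaded from the stack slot */
};

static int
toy_compare (const void *a, const void *b)
{
  const struct toy_reload *x = a, *y = b;
  if (x->when != y->when)
    return (int) x->when - (int) y->when;
  return x->opnum - y->opnum;
}

/* Emit both reloads in order and return what ends up in BX.  CX starts
   out holding pseudo 588 (modelled as the value 588).  */
int
toy_bx_after_reloads (enum toy_when when_for_reload_3)
{
  int regs[TOY_NREGS] = { 0, 588 };
  struct toy_reload rld[2] = {
    /* Reload 2: BX <- CX, the override input chosen by inheritance.  */
    { 2, TOY_RELOAD_FOR_INPUT, TOY_BX, 1, TOY_CX, 0 },
    /* Reload 3: CX <- pseudo 356 (value 356), clobbering CX.  */
    { 3, when_for_reload_3, TOY_CX, 0, TOY_BX, 356 },
  };

  qsort (rld, 2, sizeof rld[0], toy_compare);
  for (int i = 0; i < 2; i++)
    regs[rld[i].dest] = rld[i].from_reg ? regs[rld[i].src_reg]
                                        : rld[i].src_value;
  return regs[TOY_BX];
}
```

Calling toy_bx_after_reloads (TOY_RELOAD_FOR_INPUT) leaves 588 in BX, while
toy_bx_after_reloads (TOY_RELOAD_OTHER) leaves 356 there, which is the toy
analogue of the wrong code.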

Now the question is, why does this change to RELOAD_OTHER happen.  This is done
in merge_assigned_reloads, which is called because i386 is a target with
SMALL_REGISTER_CLASSES.  This routine notices that reloads 0 and 1 share a
reload register DI, and decides to merge them, making reload 0 a RELOAD_OTHER
reload (and cancelling reload 1).  It then goes through all other reloads:

  /* If this is now RELOAD_OTHER, look for any reloads that
     load parts of this operand and set them to
     RELOAD_FOR_OTHER_ADDRESS if they were for inputs,
     RELOAD_OTHER for outputs.  Note that this test is
     equivalent to looking for reloads for this operand
     number.

This loop now appears to detect reload 3 as a reload that "loads part of" the
operand 0.  This happens because 
   reg_overlap_mentioned_for_reload_p (rld[j].in, rld[i].in)
returns true since both reload-in values are stack slots, and
reg_overlap_mentioned_for_reload_p conservatively assumes that all memory
accesses conflict.
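
To make the effect of that conservative assumption concrete, here is a toy
predicate in the same spirit.  It is only an illustration: the real
reg_overlap_mentioned_for_reload_p operates on rtx values, and the toy_* types
and the slot numbering below are made up for this sketch.

```c
/* Toy stand-in for the conservative check; not GCC code.  */
enum toy_kind { TOY_HARD_REG, TOY_STACK_SLOT };

struct toy_value
{
  enum toy_kind kind;
  int id;   /* hard register number, or a made-up stack slot number */
};

/* Return nonzero if A and B may overlap.  Two memory references are
   always reported as overlapping, even when the slots differ.
   Simplification: a register and a stack slot never overlap here.  */
int
toy_overlap_p (struct toy_value a, struct toy_value b)
{
  if (a.kind == TOY_STACK_SLOT && b.kind == TOY_STACK_SLOT)
    return 1;
  if (a.kind != b.kind)
    return 0;
  return a.id == b.id;
}
```

With this predicate, two unrelated stack slots (such as the ones holding
pseudos 132 and 356 in the reloads above) are reported as overlapping, whereas
two distinct hard registers are not.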

Therefore, when_needed for reload 3 is also set to RELOAD_OTHER.  This is not
only unnecessary, but in turn causes the breakage.

Now in this particular case, it might be possible to fix the problem by using a
better detection of which additional reloads need to be modified.  (The comment
says, "Note that this test is equivalent to looking for reloads for this
operand number." -- which is not true, but could be implemented instead ...)

However, the problem seems to be more fundamental.  If *any* change, even
necessary changes, to when_needed flags are made at this point, *any* decision
on reload inheritance or input overrides that was based on reload insn ordering
may now be incorrect.

I guess a conservative fix could be to check whether any reload inheritance or
input override was indeed performed on this insn, and if so, refuse to perform
the merge ...
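
A guard of that kind might look roughly like the following sketch.  The struct
and its fields are hypothetical and do not correspond to GCC's reload data
structures; the code only illustrates the shape of the proposed check and says
nothing about whether such a guard would be sufficient.

```c
/* Toy sketch of the conservative guard; not GCC code.  */
#define TOY_MAX_RELOADS 8

struct toy_insn_reloads
{
  int n_reloads;
  int inherited[TOY_MAX_RELOADS];       /* reload i used inheritance */
  int override_input[TOY_MAX_RELOADS];  /* reload i got an override input */
};

/* Return nonzero if merging reloads for this insn may go ahead, i.e.
   no earlier decision depended on the reload insn ordering.  */
int
toy_merge_allowed_p (const struct toy_insn_reloads *r)
{
  for (int i = 0; i < r->n_reloads && i < TOY_MAX_RELOADS; i++)
    if (r->inherited[i] || r->override_input[i])
      return 0;
  return 1;
}
```

For the insn discussed here, reload 2 received an override input, so a check
like this would refuse the merge.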


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45312



[Bug target/31850] gcc.c-torture/compile/limits-fnargs.c is slow at compiling for spu-elf

2010-08-02 Thread uweigand at gcc dot gnu dot org


--- Comment #18 from uweigand at gcc dot gnu dot org  2010-08-02 19:25 
---
(In reply to comment #17)
> Someone might want to try this again after the fix for PR 38582.

It's a lot better, but still not real good.  I'm now seeing on a QS22 (ppu ->
spu cross compiler):

-O0:  0m9.983s
-O1:  4m7.801s
-O2: 35m10.236s
-O3: 36m7.059s

However, the culprit clearly is no longer register renaming, which is now down
to 5 seconds in the worst case.  For -O1, the by far slowest pass is dead store
elimination:
 dead store elim1  : 101.02 (41%) usr   0.62 (35%) sys 101.65 (41%) wall    4307 kB ( 7%) ggc
 dead store elim2  : 105.03 (43%) usr   0.65 (37%) sys 105.69 (43%) wall    3028 kB ( 5%) ggc

For -O2 and -O3, the by far slowest pass is register allocation:
 integrated RA     :1485.83 (71%) usr  15.86 (68%) sys1501.87 (71%) wall    2486 kB ( 2%) ggc
 reload            : 157.93 ( 8%) usr   1.97 ( 8%) sys 159.92 ( 8%) wall   30178 kB (19%) ggc
 reload CSE regs   : 100.05 ( 5%) usr   1.45 ( 6%) sys 101.51 ( 5%) wall   12556 kB ( 8%) ggc

Scheduling only takes about 2 min in either case.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31850



[Bug c++/45112] [4.5 regression] Aligned attribute on static class member definition ignored

2010-07-31 Thread uweigand at gcc dot gnu dot org


--- Comment #7 from uweigand at gcc dot gnu dot org  2010-07-31 17:44 
---
Subject: Bug 45112

Author: uweigand
Date: Sat Jul 31 17:43:59 2010
New Revision: 162786

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162786
Log:
Move PR c++/45112 ChangeLog entry to correct location.

Modified:
branches/gcc-4_5-branch/gcc/ChangeLog
branches/gcc-4_5-branch/gcc/cp/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45112



[Bug c++/45112] [4.5 regression] Aligned attribute on static class member definition ignored

2010-07-31 Thread uweigand at gcc dot gnu dot org


--- Comment #6 from uweigand at gcc dot gnu dot org  2010-07-31 17:43 
---
Subject: Bug 45112

Author: uweigand
Date: Sat Jul 31 17:42:48 2010
New Revision: 162785

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162785
Log:
Move PR c++/45112 ChangeLog entry to correct location.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/cp/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45112



[Bug c++/45112] [4.5 regression] Aligned attribute on static class member definition ignored

2010-07-31 Thread uweigand at gcc dot gnu dot org


--- Comment #5 from uweigand at gcc dot gnu dot org  2010-07-31 15:48 
---
Fixed in 4.5 branch (for 4.5.2) as well.


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45112



[Bug c++/45112] [4.5 regression] Aligned attribute on static class member definition ignored

2010-07-31 Thread uweigand at gcc dot gnu dot org


--- Comment #4 from uweigand at gcc dot gnu dot org  2010-07-31 15:46 
---
Subject: Bug 45112

Author: uweigand
Date: Sat Jul 31 15:46:15 2010
New Revision: 162783

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162783
Log:
gcc/
PR c++/45112
* cp/decl.c (duplicate_decls): Merge DECL_USER_ALIGN and DECL_PACKED.

gcc/testsuite/
PR c++/45112
* testsuite/g++.dg/pr45112.C: New test.

Added:
branches/gcc-4_5-branch/gcc/testsuite/g++.dg/pr45112.C
Modified:
branches/gcc-4_5-branch/gcc/ChangeLog
branches/gcc-4_5-branch/gcc/cp/decl.c
branches/gcc-4_5-branch/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45112



[Bug c++/45112] [4.5 regression] Aligned attribute on static class member definition ignored

2010-07-30 Thread uweigand at gcc dot gnu dot org


--- Comment #3 from uweigand at gcc dot gnu dot org  2010-07-30 16:19 
---
Fixed in mainline.  Will check in to 4.5 after 4.5.1 release.


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.5/4.6 regression] Aligned|[4.5 regression] Aligned
                   |attribute on static class   |attribute on static class
                   |member definition ignored   |member definition ignored
   Target Milestone|4.5.1                       |4.5.2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45112



[Bug c++/45112] [4.5/4.6 regression] Aligned attribute on static class member definition ignored

2010-07-30 Thread uweigand at gcc dot gnu dot org


--- Comment #2 from uweigand at gcc dot gnu dot org  2010-07-30 15:50 
---
Subject: Bug 45112

Author: uweigand
Date: Fri Jul 30 15:49:34 2010
New Revision: 162716

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162716
Log:
gcc/
PR c++/45112
* cp/decl.c (duplicate_decls): Merge DECL_USER_ALIGN and DECL_PACKED.

gcc/testsuite/
PR c++/45112
* testsuite/g++.dg/pr45112.C: New test.

Added:
trunk/gcc/testsuite/g++.dg/pr45112.C
Modified:
trunk/gcc/ChangeLog
trunk/gcc/cp/decl.c
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45112



[Bug c++/45112] [4.5/4.6 regression] Aligned attribute on static class member definition ignored

2010-07-28 Thread uweigand at gcc dot gnu dot org


--- Comment #1 from uweigand at gcc dot gnu dot org  2010-07-28 21:47 
---
Proposed fix posted here:
http://gcc.gnu.org/ml/gcc-patches/2010-07/msg02223.html


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |uweigand at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2010-07-28 21:47:11
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45112



[Bug c++/45112] New: [4.5/4.6 regression] Aligned attribute on static class member definition ignored

2010-07-28 Thread uweigand at gcc dot gnu dot org

Building the following testcase fails with G++ 4.5 and later.
G++ 4.4 works fine.


struct JSString
{
  unsigned char mLength;
  static JSString unitStringTable[];
};

JSString JSString::unitStringTable[] __attribute__ ((aligned (8))) = { 1 };

int bug [__alignof__ (JSString::unitStringTable) >= 8 ? 1 : -1];


The test case is reduced from Mozilla, where the bug sometimes causes the
JavaScript interpreter to crash.  See also:
https://bugzilla.mozilla.org/show_bug.cgi?id=582593

The problem appears to be that cp/decl.c:duplicate_decls fails to merge
the DECL_USER_ALIGN flag from the definition into the declaration.

This bug was introduced by the following patch:
http://gcc.gnu.org/ml/gcc-patches/2009-06/msg00763.html

Before that patch, the DECL_USER_ALIGN flag was part of a block copied in whole
via memcpy by duplicate_decls.  The patch moved that flag to another location
outside that block, so it is no longer copied ...
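
The mechanism can be shown with a toy structure in plain C.  This is only a
sketch under stated assumptions: the layout, the field names and the toy_*
functions are hypothetical and are not the real tree node layout or the real
duplicate_decls code.  It merely shows how a flag that lives outside a
memcpy'd block is silently left behind unless it is merged explicitly.

```c
#include <string.h>

/* Toy declaration node; hypothetical layout, not GCC's.  */
struct toy_decl
{
  unsigned user_align : 1;   /* flag that lives outside the block */
  struct
  {
    unsigned align;
    unsigned size;
  } block;                   /* the part that is copied wholesale */
};

/* Merge DEFINITION into DECLARATION by copying only the block, as a
   memcpy-based merge does once the flag has moved out of the block.  */
void
toy_merge_block_only (struct toy_decl *declaration,
                      const struct toy_decl *definition)
{
  memcpy (&declaration->block, &definition->block,
          sizeof declaration->block);
}

/* Same merge, followed by an explicit copy of the flag.  */
void
toy_merge_with_flag (struct toy_decl *declaration,
                     const struct toy_decl *definition)
{
  toy_merge_block_only (declaration, definition);
  declaration->user_align = definition->user_align;
}
```

After toy_merge_block_only the declaration has the new alignment value in its
block but still has user_align clear; toy_merge_with_flag carries the flag
over as well.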


-- 
   Summary: [4.5/4.6 regression] Aligned attribute on static class
member definition ignored
   Product: gcc
   Version: 4.5.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: uweigand at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45112



[Bug middle-end/42509] [4.4 Regression] nonoverlapping_memrefs_p misinterprets NULL MEM_OFFSET as const0_rtx

2010-07-28 Thread uweigand at gcc dot gnu dot org


--- Comment #30 from uweigand at gcc dot gnu dot org  2010-07-28 18:01 
---
Backported fix to 4.4 branch as well.  The bug should now be fixed everywhere.


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42509



[Bug middle-end/42509] [4.4 Regression] nonoverlapping_memrefs_p misinterprets NULL MEM_OFFSET as const0_rtx

2010-07-28 Thread uweigand at gcc dot gnu dot org


--- Comment #29 from uweigand at gcc dot gnu dot org  2010-07-28 18:00 
---
Subject: Bug 42509

Author: uweigand
Date: Wed Jul 28 18:00:08 2010
New Revision: 162650

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162650
Log:
Backport from mainline:
2010-04-03  Richard Guenther  

PR middle-end/42509
* alias.c (nonoverlapping_memrefs_p): For spill-slot accesses
require a non-NULL MEM_OFFSET.

Modified:
branches/gcc-4_4-branch/gcc/ChangeLog
branches/gcc-4_4-branch/gcc/alias.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42509



[Bug target/44877] C++ compiler can no longer compile dealII for VSX/Altivec vectorization

2010-07-15 Thread uweigand at gcc dot gnu dot org


--- Comment #7 from uweigand at gcc dot gnu dot org  2010-07-15 12:38 
---
Subject: Bug 44877

Author: uweigand
Date: Thu Jul 15 12:37:03 2010
New Revision: 162220

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162220
Log:
PR target/44877
* config/spu/spu.c (spu_expand_builtin_1): Allow references
(as well as pointers) as argument to mask_for_load builtins.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/spu/spu.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44877



[Bug middle-end/44738] c-c++-common/uninit-17.c failed

2010-07-13 Thread uweigand at gcc dot gnu dot org


--- Comment #4 from uweigand at gcc dot gnu dot org  2010-07-13 16:35 
---
Also fails on spu-elf.


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uweigand at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44738



[Bug c++/44810] [4.6 Regression] FAIL: g++.dg/torture/pr36745.C

2010-07-13 Thread uweigand at gcc dot gnu dot org


--- Comment #3 from uweigand at gcc dot gnu dot org  2010-07-13 15:15 
---
Also fails on spu-elf.


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uweigand at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44810



[Bug target/44707] operand requires impossible reload

2010-07-02 Thread uweigand at gcc dot gnu dot org


--- Comment #5 from uweigand at gcc dot gnu dot org  2010-07-02 11:50 
---
Fixed.


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44707



[Bug target/44707] operand requires impossible reload

2010-07-02 Thread uweigand at gcc dot gnu dot org


--- Comment #4 from uweigand at gcc dot gnu dot org  2010-07-02 11:48 
---
Subject: Bug 44707

Author: uweigand
Date: Fri Jul  2 11:48:30 2010
New Revision: 161703

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=161703
Log:
ChangeLog:

PR target/44707
* config/rs6000/rs6000.c (rs6000_legitimize_reload_address): Recognize
(lo_sum (high ...) ...) patterns generated by earlier passes.

testsuite/ChangeLog:

PR target/44707
* gcc.c-torture/compile/pr44707.c: New test.

Added:
trunk/gcc/testsuite/gcc.c-torture/compile/pr44707.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000.c
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44707



[Bug target/44707] operand requires impossible reload

2010-07-01 Thread uweigand at gcc dot gnu dot org


--- Comment #3 from uweigand at gcc dot gnu dot org  2010-07-01 19:14 
---
Patch posted here:
http://gcc.gnu.org/ml/gcc-patches/2010-07/msg00082.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44707



[Bug target/44707] operand requires impossible reload

2010-07-01 Thread uweigand at gcc dot gnu dot org


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |uweigand at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2010-06-29 16:56:47         |2010-07-01 19:07:33
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44707



[Bug target/44707] operand requires impossible reload

2010-06-29 Thread uweigand at gcc dot gnu dot org


--- Comment #2 from uweigand at gcc dot gnu dot org  2010-06-29 17:03 
---
Created an attachment (id=21041)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=21041&action=view)
Recognize (lo_sum (high ...) ...) in rs6000_legitimize_reload_address

> It seems to me that simply extending rs6000_legitimize_reload_address to
> handle this case as well should fix the bug.

And indeed, this (otherwise untested) patch fixes the bug for me.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44707



[Bug target/44707] operand requires impossible reload

2010-06-29 Thread uweigand at gcc dot gnu dot org


--- Comment #1 from uweigand at gcc dot gnu dot org  2010-06-29 16:56 
---
I agree, this looks like a longstanding bug in
rs6000_legitimize_reload_address.

What happens here is that find_reloads is called on this insn:

(insn 15 8 18 2 pr44707.c:13 (asm_operands/v ("/* %0 %1 %2 %3 %4 */") ("") 0 [
            (mem/s/c:SI (symbol_ref:SI ("v") [flags 0xc0]) [3 v.a+0 S4 A32])
            (mem/c/i:SI (symbol_ref:SI ("w") [flags 0xc4]) [3 w+0 S4 A32])
            (mem/s/c:SI (const:SI (plus:SI (symbol_ref:SI ("v") [flags 0xc0])
                        (const_int 4 [0x4]))) [3 v.b+0 S4 A32])
            (mem/s/c:SI (const:SI (plus:SI (symbol_ref:SI ("v") [flags 0xc0])
                        (const_int 8 [0x8]))) [3 v.c+0 S4 A32])
            (mem/s/c:SI (const:SI (plus:SI (symbol_ref:SI ("v") [flags 0xc0])
                        (const_int 12 [0xc]))) [3 v.d+0 S4 A32])
        ]
         [
            (asm_input:SI ("nro") (null):0)
            (asm_input:SI ("nro") (null):0)
            (asm_input:SI ("nro") (null):0)
            (asm_input:SI ("nro") (null):0)
            (asm_input:SI ("nro") (null):0)
        ]
         [] pr44707.c:14) -1 (nil))

rs6000_legitimize_reload_address notices that it can rewrite
  (symbol_ref:SI "v")
to
  (lo_sum:SI (high:SI (symbol_ref:SI "v")) (symbol_ref:SI "v"))
(and place a reload on the (high:SI) subexpression) and does so.  This change
remains in the insn, and when in the next iteration find_reloads is called
again, the insn now looks like:

(insn 15 8 18 2 pr44707.c:13 (asm_operands/v ("/* %0 %1 %2 %3 %4 */") ("") 0 [
            (mem/s/c:SI (lo_sum:SI (high:SI (symbol_ref:SI ("v") [flags 0xc0]))
                    (symbol_ref:SI ("v") [flags 0xc0])) [3 v.a+0 S4 A32])
            (mem/c/i:SI (lo_sum:SI (high:SI (symbol_ref:SI ("w") [flags 0xc4]))
                    (symbol_ref:SI ("w") [flags 0xc4])) [3 w+0 S4 A32])
            (mem/s/c:SI (const:SI (plus:SI (symbol_ref:SI ("v") [flags 0xc0])
                        (const_int 4 [0x4]))) [3 v.b+0 S4 A32])
            (mem/s/c:SI (const:SI (plus:SI (symbol_ref:SI ("v") [flags 0xc0])
                        (const_int 8 [0x8]))) [3 v.c+0 S4 A32])
            (mem/s/c:SI (const:SI (plus:SI (symbol_ref:SI ("v") [flags 0xc0])
                        (const_int 12 [0xc]))) [3 v.d+0 S4 A32])
        ]
         [
            (asm_input:SI ("nro") (null):0)
            (asm_input:SI ("nro") (null):0)
            (asm_input:SI ("nro") (null):0)
            (asm_input:SI ("nro") (null):0)
            (asm_input:SI ("nro") (null):0)
        ]
         [] pr44707.c:14) -1 (nil))

However, this expression is now no longer recognized by
rs6000_legitimize_reload_address, and therefore no reload on (high:SI) is
pushed.

Thus, when the reload is finally processed, a reload insn like this is
generated:

(insn 26 8 27 2 pr44707.c:13 (set (reg:SI 10 10)
        (lo_sum:SI (high:SI (symbol_ref:SI ("v") [flags 0xc0]))
            (symbol_ref:SI ("v") [flags 0xc0]))) -1
     (nil))

As this does not actually correspond to any valid pattern, an assertion is
triggered.

The underlying problem is that with the current reload setup, an implementation
of LEGITIMIZE_RELOAD_ADDRESS must always recognize expressions it has itself
generated in an earlier call.  And indeed, that's what a comment in
rs6000_legitimize_reload_address says:

static rtx
rs6000_legitimize_reload_address (rtx x, enum machine_mode mode,
                                  int opnum, int type,
                                  int ind_levels ATTRIBUTE_UNUSED, int *win)
{
  bool reg_offset_p = reg_offset_addressing_ok_p (mode);

  /* We must recognize output that we have already generated ourselves.  */
  if (GET_CODE (x) == PLUS
      && GET_CODE (XEXP (x, 0)) == PLUS
      && GET_CODE (XEXP (XEXP (x, 0), 0)) == REG
      && GET_CODE (XEXP (XEXP (x, 0), 1)) == CONST_INT
      && GET_CODE (XEXP (x, 1)) == CONST_INT)


However, this recognizes only certain types of such output, and in particular
not the one that shows up in this test case.

It seems to me that simply extending rs6000_legitimize_reload_address to handle
this case as well should fix the bug.
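
The requirement that the hook recognize its own earlier output can be
illustrated with a toy model.  Everything below is hypothetical (the toy_*
names and the three address forms are invented for this sketch and are not rtx
codes); it only contrasts a hook that loses the reload on the second call with
one that pushes it again.

```c
/* Toy address forms; these are not GCC rtx codes.  */
enum toy_addr { TOY_SYMBOL, TOY_LO_SUM_HIGH, TOY_OTHER };

struct toy_result
{
  enum toy_addr addr;   /* address form after the call */
  int pushed_reload;    /* nonzero if a reload on the high part was pushed */
};

/* Narrow version: rewrites a bare symbol into the lo_sum/high form,
   but does not recognize that form when it is called again.  */
struct toy_result
toy_legitimize_narrow (enum toy_addr x)
{
  struct toy_result r = { x, 0 };
  if (x == TOY_SYMBOL)
    {
      r.addr = TOY_LO_SUM_HIGH;
      r.pushed_reload = 1;
    }
  return r;
}

/* Extended version: also recognizes the lo_sum/high form produced by
   an earlier call and pushes the reload on the high part again.  */
struct toy_result
toy_legitimize_extended (enum toy_addr x)
{
  struct toy_result r = toy_legitimize_narrow (x);
  if (x == TOY_LO_SUM_HIGH)
    r.pushed_reload = 1;
  return r;
}
```

Feeding the output of the first call back in, the narrow version pushes no
reload on the second pass, while the extended version does.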


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2010-06-29 16:56:47
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44707



[Bug rtl-optimization/41064] [4.4 Regression]: build breakage for cris-elf building newlib, ICE in extract_insn, from r150726

2010-05-11 Thread uweigand at gcc dot gnu dot org


--- Comment #8 from uweigand at gcc dot gnu dot org  2010-05-11 13:57 
---
(In reply to comment #7)
> Not sure what's the state here.  Is 4.4 broken now?

Here's the status as far as I know.  I had checked in a patch:
http://gcc.gnu.org/ml/gcc-patches/2009-08/msg00254.html
to fix the problem PR 37053.  This patch introduced the regression
described in the current problem, fixed by Hans-Peter's patch above.

Now, *both* my PR 37053 patch and Hans-Peter's patch were checked in
only to mainline (i.e. GCC 4.5); GCC 4.4 should not be affected either
way.  However, this means the original problem in PR 37053 is still
present in GCC 4.4.  And in fact, it may well be the case that the
problem described in PR 40414 is actually a duplicate of PR 37053.

In this case, I guess the way to go would be to apply *both* my patch
and Hans-Peter's patch to the 4.4 branch ...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41064



[Bug middle-end/43292] Bogus TYPE_ADDR_SPACE access

2010-03-08 Thread uweigand at gcc dot gnu dot org


--- Comment #1 from uweigand at gcc dot gnu dot org  2010-03-08 16:11 
---
Why doesn't this make sense? The address space is a property of the pointed-to
type, not the pointer type itself (just like const/volatile-ness) ...
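
The const analogy can be checked in plain C.  The function below is just an
illustration with a made-up name; it uses const rather than a named address
space, since the latter needs a target that provides one.

```c
/* The qualifier in "const int *" belongs to the pointed-to type, so
   the pointer variable itself can still be reassigned.  */
int
toy_read_second (const int *p)
{
  p = p + 1;   /* allowed: the pointer object is not const */
  return *p;   /* reading is fine; "*p = 0" would be rejected */
}
```

Passing a two-element array returns its second element, showing that only the
pointed-to object is affected by the qualifier.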


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43292



[Bug target/41176] [4.4/4.5 Regression] ICE in reload_cse_simplify_operands at postreload.c:396

2010-03-02 Thread uweigand at gcc dot gnu dot org


--- Comment #11 from uweigand at gcc dot gnu dot org  2010-03-02 19:56 
---
(In reply to comment #10)
> I don't see where reload is creating the whole instruction; maybe I am
> misunderstanding that statement.

Well, after reload you have insn 624, which presumably didn't exist before. 
This was inserted by reload before the (original) insn 218 -- you didn't show
the fixed-up version of insn 218 after reload, but I'm assuming it's probably a
register-to-register (or -to-memory) move from the reload register (reg:DF 21)
into whatever the register allocator has chosen to hold (reg/v:DF 203).

The new insn 624 is not in any way a "fixed up" version of insn 218.  Instead,
it is a reload insn that was generated by reload to load some value (in this
case the (mem:DF ...)) into some reload register.  (That this happens to look
similar to insn 218 before reload is just a coincidence.)  As I mentioned,
reload by default assumes that any move of any legitimate operand into any
register is always valid and can be performed by a simple set.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41176



[Bug c/31499] rejects vector int a[] = {1,1,1,1,1};

2009-12-07 Thread uweigand at gcc dot gnu dot org


--- Comment #4 from uweigand at gcc dot gnu dot org  2009-12-07 22:20 
---
Subject: Bug 31499

Author: uweigand
Date: Mon Dec  7 22:20:06 2009
New Revision: 155055

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155055
Log:
2008-12-07  Ulrich Weigand  

Backport from mainline:

gcc/
2009-05-19  Andrew Pinski  

* c-typeck.c (build_binary_op): Allow % on integal vectors.
* doc/extend.texi (Vector Extension): Document that % is allowed too.

gcc/cp/
2009-05-19  Andrew Pinski  

* typeck.c (build_binary_op): Allow % on integal vectors.

gcc/testsuite/
2009-05-19  Andrew Pinski  

* gcc.dg/vector-4.c: New testcase.
* gcc.dg/simd-1b.c: % is now allowed for integer vectors.
* g++.dg/ext/vector16.C: New testcase.

2008-12-07  Ulrich Weigand  

Backport from mainline:

gcc/
2009-04-22  Andrew Pinski  

PR C/31499
* c-typeck.c (process_init_element): Treat VECTOR_TYPE like ARRAY_TYPE
and RECORD_TYPE/UNION_TYPE.  When outputing the actual element and the
value is a VECTOR_CST, the element type is the element type of the
vector.

gcc/testsuite/
2009-04-22  Andrew Pinski  

PR C/31499
* gcc.dg/vector-init-1.c: New testcase.
* gcc.dg/vector-init-2.c: New testcase.

2008-12-07  Ulrich Weigand  

Update to gcc-4_4-branch revision 155038.

Added:
branches/cell-4_4-branch/gcc/testsuite/g++.dg/ext/vector16.C
branches/cell-4_4-branch/gcc/testsuite/gcc.dg/vector-4.c
branches/cell-4_4-branch/gcc/testsuite/gcc.dg/vector-init-1.c
branches/cell-4_4-branch/gcc/testsuite/gcc.dg/vector-init-2.c
Modified:
branches/cell-4_4-branch/   (props changed)
branches/cell-4_4-branch/ChangeLog.cell
branches/cell-4_4-branch/gcc/ChangeLog
branches/cell-4_4-branch/gcc/DATESTAMP
branches/cell-4_4-branch/gcc/c-typeck.c
branches/cell-4_4-branch/gcc/config/i386/i386.md
branches/cell-4_4-branch/gcc/cp/typeck.c
branches/cell-4_4-branch/gcc/doc/extend.texi
branches/cell-4_4-branch/gcc/testsuite/gcc.dg/simd-1b.c
branches/cell-4_4-branch/gcc/tree-ssa-dom.c

Propchange: branches/cell-4_4-branch/
('svnmerge-integrated' modified)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31499



[Bug middle-end/42224] [4.5 Regression] 32bit pointers to 32bit pointers abort on 64bit VMS and S390X

2009-12-04 Thread uweigand at gcc dot gnu dot org


--- Comment #7 from uweigand at gcc dot gnu dot org  2009-12-05 00:12 
---
Subject: Bug 42224

Author: uweigand
Date: Sat Dec  5 00:11:29 2009
New Revision: 155003

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155003
Log:
2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/
2009-12-03  Ken Werner  

* config/spu/spu-elf.h (STARTFILE_SPEC): Add support for gprof
startup files.
* config/spu/spu-protos.h (spu_function_profiler): Add prototype.
* config/spu/spu.c (spu_function_profiler): New function.
* config/spu/spu.h (FUNCTION_PROFILER): Invoke
spu_function_profiler.
(NO_PROFILE_COUNTERS): Define.
(PROFILE_BEFORE_PROLOGUE): Likewise.

2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/
2009-12-02  Ulrich Weigand  

PR middle-end/42224
* tree.h (int_or_pointer_precision): Remove.
* tree.c (int_or_pointer_precision): Remove.
(integer_pow2p): Use TYPE_PRECISION instead.
(tree_log2): Likewise.
(tree_floor_log2): Likewise.
(signed_or_unsigned_type_for): Likewise.
* fold-const.c (fit_double_type): Likewise.
* varasm.c (initializer_constant_valid_p): Likewise.

2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/
2009-11-17  Ulrich Weigand  

PR tree-optimization/41857
* tree-ssa-address.c (move_hint_to_base): Use void pointer to
TYPE's address space instead of pointer to TYPE.

2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/
2009-11-17  Ulrich Weigand  

* reload.c (find_reloads_address): Fix typo.

2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/
2009-11-02  Ulrich Weigand  

PR tree-optimization/41857
* tree-flow.h (rewrite_use_address): Add BASE_HINT argument.
* tree-ssa-loop-ivopts.c (rewrite_use_address): Pass base hint
to create_mem_ref.
* tree-ssa-address.c (move_hint_to_base): New function.
(most_expensive_mult_to_index): Add TYPE argument.  Use mode and
address space associated with TYPE.
(addr_to_parts): Add TYPE and BASE_HINT arguments.  Pass TYPE to
most_expensive_mult_to_index.  Call move_hint_to_base.
(create_mem_ref): Add BASE_HINT argument.  Pass BASE_HINT and
TYPE to addr_to_parts.

gcc/testsuite/
2009-11-02  Ulrich Weigand  

PR tree-optimization/41857
* gcc.target/spu/ea/pr41857.c: New file.

2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/testsuite/
2009-10-26  Ben Elliston  
Michael Meissner  
Ulrich Weigand  

* gcc.target/spu/ea/ea.exp: New file.
* gcc.target/spu/ea/cache1.c: Likewise.
* gcc.target/spu/ea/cast1.c: Likewise.
* gcc.target/spu/ea/cast2.c: Likewise.
* gcc.target/spu/ea/compile1.c: Likewise.
* gcc.target/spu/ea/compile2.c: Likewise.
* gcc.target/spu/ea/cppdefine.c: Likewise.
* gcc.target/spu/ea/errors1.c: Likewise.
* gcc.target/spu/ea/errors2.c: Likewise.
* gcc.target/spu/ea/execute1.c: Likewise.
* gcc.target/spu/ea/execute2.c: Likewise.
* gcc.target/spu/ea/execute3.c: Likewise.
* gcc.target/spu/ea/ops1.c: Likewise.
* gcc.target/spu/ea/ops2.c: Likewise.
* gcc.target/spu/ea/options1.c: Likewise.
* gcc.target/spu/ea/test-sizes.c: Likewise.

2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/
2009-10-26  Ben Elliston  
Michael Meissner  
Ulrich Weigand  

* config.gcc (spu-*-elf*): Add spu_cache.h to extra_headers.
* config/spu/spu_cache.h: New file.

* config/spu/cachemgr.c: New file.
* config/spu/cache.S: New file.

* config/spu/spu.h (ASM_OUTPUT_SYMBOL_REF): Define.
(ADDR_SPACE_EA): Define.
(TARGET_ADDR_SPACE_KEYWORDS): Define.
* config/spu/spu.c (EAmode): New macro.
(TARGET_ADDR_SPACE_POINTER_MODE): Define.
(TARGET_ADDR_SPACE_ADDRESS_MODE): Likewise.
(TARGET_ADDR_SPACE_LEGITIMATE_ADDRESS_P): Likewise.
(TARGET_ADDR_SPACE_LEGITIMIZE_ADDRESS): Likewise.
(TARGET_ADDR_SPACE_SUBSET_P): Likewise.
(TARGET_ADDR_SPACE_CONVERT): Likewise.
(TARGET_ASM_SELECT_SECTION): Likewise.
(TARGET_ASM_UNIQUE_SECTION): Likewise.
(TARGET_ASM_UNALIGNED_SI_OP): Likewise.
(TARGET_ASM_ALIGNED_DI_OP): Likewise.
(ea_symbol_ref): New function.
(spu_legitimate_constant_p): Handle __ea qualified addresses.
(spu_legitimate_address): Likewise.
(spu_addr_space_legitimate_address_p): New function.
(spu_addr_space_legitimize_address): Likewise.
(cache_fetch): New global.
(cache_fetch_dirty): Likewise.
(ea_alias_s

[Bug tree-optimization/41857] Loop optimizer breaks __ea pointers with -mea64

2009-12-04 Thread uweigand at gcc dot gnu dot org


--- Comment #5 from uweigand at gcc dot gnu dot org  2009-12-05 00:12 
---
Subject: Bug 41857

Author: uweigand
Date: Sat Dec  5 00:11:29 2009
New Revision: 155003

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=155003
Log:
2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/
2009-12-03  Ken Werner  

* config/spu/spu-elf.h (STARTFILE_SPEC): Add support for gprof
startup files.
* config/spu/spu-protos.h (spu_function_profiler): Add prototype.
* config/spu/spu.c (spu_function_profiler): New function.
* config/spu/spu.h (FUNCTION_PROFILER): Invoke
spu_function_profiler.
(NO_PROFILE_COUNTERS): Define.
(PROFILE_BEFORE_PROLOGUE): Likewise.

2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/
2009-12-02  Ulrich Weigand  

PR middle-end/42224
* tree.h (int_or_pointer_precision): Remove.
* tree.c (int_or_pointer_precision): Remove.
(integer_pow2p): Use TYPE_PRECISION instead.
(tree_log2): Likewise.
(tree_floor_log2): Likewise.
(signed_or_unsigned_type_for): Likewise.
* fold-const.c (fit_double_type): Likewise.
* varasm.c (initializer_constant_valid_p): Likewise.

2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/
2009-11-17  Ulrich Weigand  

PR tree-optimization/41857
* tree-ssa-address.c (move_hint_to_base): Use void pointer to
TYPE's address space instead of pointer to TYPE.

2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/
2009-11-17  Ulrich Weigand  

* reload.c (find_reloads_address): Fix typo.

2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/
2009-11-02  Ulrich Weigand  

PR tree-optimization/41857
* tree-flow.h (rewrite_use_address): Add BASE_HINT argument.
* tree-ssa-loop-ivopts.c (rewrite_use_address): Pass base hint
to create_mem_ref.
* tree-ssa-address.c (move_hint_to_base): New function.
(most_expensive_mult_to_index): Add TYPE argument.  Use mode and
address space associated with TYPE.
(addr_to_parts): Add TYPE and BASE_HINT arguments.  Pass TYPE to
most_expensive_mult_to_index.  Call move_hint_to_base.
(create_mem_ref): Add BASE_HINT argument.  Pass BASE_HINT and
TYPE to addr_to_parts.

gcc/testsuite/
2009-11-02  Ulrich Weigand  

PR tree-optimization/41857
* gcc.target/spu/ea/pr41857.c: New file.

2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/testsuite/
2009-10-26  Ben Elliston  
Michael Meissner  
Ulrich Weigand  

* gcc.target/spu/ea/ea.exp: New file.
* gcc.target/spu/ea/cache1.c: Likewise.
* gcc.target/spu/ea/cast1.c: Likewise.
* gcc.target/spu/ea/cast2.c: Likewise.
* gcc.target/spu/ea/compile1.c: Likewise.
* gcc.target/spu/ea/compile2.c: Likewise.
* gcc.target/spu/ea/cppdefine.c: Likewise.
* gcc.target/spu/ea/errors1.c: Likewise.
* gcc.target/spu/ea/errors2.c: Likewise.
* gcc.target/spu/ea/execute1.c: Likewise.
* gcc.target/spu/ea/execute2.c: Likewise.
* gcc.target/spu/ea/execute3.c: Likewise.
* gcc.target/spu/ea/ops1.c: Likewise.
* gcc.target/spu/ea/ops2.c: Likewise.
* gcc.target/spu/ea/options1.c: Likewise.
* gcc.target/spu/ea/test-sizes.c: Likewise.

2008-12-04  Ulrich Weigand  

Backport from mainline:

gcc/
2009-10-26  Ben Elliston  
Michael Meissner  
Ulrich Weigand  

* config.gcc (spu-*-elf*): Add spu_cache.h to extra_headers.
* config/spu/spu_cache.h: New file.

* config/spu/cachemgr.c: New file.
* config/spu/cache.S: New file.

* config/spu/spu.h (ASM_OUTPUT_SYMBOL_REF): Define.
(ADDR_SPACE_EA): Define.
(TARGET_ADDR_SPACE_KEYWORDS): Define.
* config/spu/spu.c (EAmode): New macro.
(TARGET_ADDR_SPACE_POINTER_MODE): Define.
(TARGET_ADDR_SPACE_ADDRESS_MODE): Likewise.
(TARGET_ADDR_SPACE_LEGITIMATE_ADDRESS_P): Likewise.
(TARGET_ADDR_SPACE_LEGITIMIZE_ADDRESS): Likewise.
(TARGET_ADDR_SPACE_SUBSET_P): Likewise.
(TARGET_ADDR_SPACE_CONVERT): Likewise.
(TARGET_ASM_SELECT_SECTION): Likewise.
(TARGET_ASM_UNIQUE_SECTION): Likewise.
(TARGET_ASM_UNALIGNED_SI_OP): Likewise.
(TARGET_ASM_ALIGNED_DI_OP): Likewise.
(ea_symbol_ref): New function.
(spu_legitimate_constant_p): Handle __ea qualified addresses.
(spu_legitimate_address): Likewise.
(spu_addr_space_legitimate_address_p): New function.
(spu_addr_space_legitimize_address): Likewise.
(cache_fetch): New global.
(cache_fetch_dirty): Likewise.
(ea_alias_s

[Bug middle-end/42224] [4.5 Regression] 32bit pointers to 32bit pointers abort on 64bit VMS and S390X

2009-12-02 Thread uweigand at gcc dot gnu dot org


--- Comment #6 from uweigand at gcc dot gnu dot org  2009-12-02 13:52 
---
Fixed.


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42224



[Bug middle-end/42224] [4.5 Regression] 32bit pointers to 32bit pointers abort on 64bit VMS and S390X

2009-12-02 Thread uweigand at gcc dot gnu dot org


--- Comment #5 from uweigand at gcc dot gnu dot org  2009-12-02 13:51 
---
Subject: Bug 42224

Author: uweigand
Date: Wed Dec  2 13:50:52 2009
New Revision: 154908

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=154908
Log:
gcc/
PR middle-end/42224
* tree.h (int_or_pointer_precision): Remove.
* tree.c (int_or_pointer_precision): Remove.
(integer_pow2p): Use TYPE_PRECISION instead.
(tree_log2): Likewise.
(tree_floor_log2): Likewise.
(signed_or_unsigned_type_for): Likewise.
* fold-const.c (fit_double_type): Likewise.
* varasm.c (initializer_constant_valid_p): Likewise.

gcc/testsuite/
PR middle-end/42224
* gcc.target/s390/pr42224.c: New test.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/fold-const.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree.c
trunk/gcc/tree.h
trunk/gcc/varasm.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42224



[Bug middle-end/42224] [4.5 Regression] 32bit pointers to 32bit pointers abort on 64bit VMS and S390X

2009-11-30 Thread uweigand at gcc dot gnu dot org


--- Comment #3 from uweigand at gcc dot gnu dot org  2009-11-30 15:17 
---
OK, I've reproduced the problem.  It seems int_or_pointer_precision
is fundamentally wrong for pointers using a non-standard size
(i.e. pointer variables defined using a mode attribute).

The history of this is that there used to be code e.g. in
fold-const.c:fit_double_type that hard-coded a precision
of POINTER_SIZE for pointer types:

  if (POINTER_TYPE_P (type)
      || TREE_CODE (type) == OFFSET_TYPE)
    prec = POINTER_SIZE;
  else
    prec = TYPE_PRECISION (type);

This showed up as a bug in the presence of named-address-space
pointers of a different size than POINTER_SIZE.  The initial
thought to fix this was to just always use TYPE_PRECISION.

This turned out to break C++, as OFFSET_TYPEs were generated that
did not correctly set TYPE_PRECISION.   Mike fixed this, and added
the int_or_pointer_precision routine to verify that TYPE_PRECISION
of pointer and offset types is set correctly.

However, this now breaks on targets that allow pointers of different
size even for the same address space, because the verification in
int_or_pointer_precision fails.  I think we need to relax the check
to allow any "valid_pointer_mode" for the given address space as
mode of the pointer, as long as the precision matches the mode size
of that mode.

I'm testing a patch to that effect.


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |uweigand at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-11-30 15:17:10
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42224



[Bug tree-optimization/41857] Loop optimizer breaks __ea pointers with -mea64

2009-11-17 Thread uweigand at gcc dot gnu dot org


--- Comment #4 from uweigand at gcc dot gnu dot org  2009-11-17 16:22 
---
Subject: Bug 41857

Author: uweigand
Date: Tue Nov 17 16:21:56 2009
New Revision: 154255

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=154255
Log:
PR tree-optimization/41857
* tree-ssa-address.c (move_hint_to_base): Use void pointer to
TYPE's address space instead of pointer to TYPE.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-ssa-address.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41857



[Bug tree-optimization/41857] Loop optimizer breaks __ea pointers with -mea64

2009-11-02 Thread uweigand at gcc dot gnu dot org


--- Comment #3 from uweigand at gcc dot gnu dot org  2009-11-02 14:35 
---
Fixed.


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41857



[Bug tree-optimization/41857] Loop optimizer breaks __ea pointers with -mea64

2009-11-02 Thread uweigand at gcc dot gnu dot org


--- Comment #2 from uweigand at gcc dot gnu dot org  2009-11-02 14:30 
---
Subject: Bug 41857

Author: uweigand
Date: Mon Nov  2 14:30:39 2009
New Revision: 153810

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=153810
Log:
gcc/
PR tree-optimization/41857
* tree-flow.h (rewrite_use_address): Add BASE_HINT argument.
* tree-ssa-loop-ivopts.c (rewrite_use_address): Pass base hint
to create_mem_ref.
* tree-ssa-address.c (move_hint_to_base): New function.
(most_expensive_mult_to_index): Add TYPE argument.  Use mode and
address space associated with TYPE.
(addr_to_parts): Add TYPE and BASE_HINT arguments.  Pass TYPE to
most_expensive_mult_to_index.  Call move_hint_to_base.
(create_mem_ref): Add BASE_HINT argument.  Pass BASE_HINT and
TYPE to addr_to_parts.

gcc/testsuite/
PR tree-optimization/41857
* gcc.target/spu/ea/pr41857.c: New file.

Added:
trunk/gcc/testsuite/gcc.target/spu/ea/pr41857.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-flow.h
trunk/gcc/tree-ssa-address.c
trunk/gcc/tree-ssa-loop-ivopts.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41857



[Bug tree-optimization/41857] Loop optimizer breaks __ea pointers with -mea64

2009-10-29 Thread uweigand at gcc dot gnu dot org


--- Comment #1 from uweigand at gcc dot gnu dot org  2009-10-29 18:49 
---
Proposed fix: http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01757.html


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-10-29 18:49:20
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41857



[Bug tree-optimization/41857] New: Loop optimizer breaks __ea pointers with -mea64

2009-10-28 Thread uweigand at gcc dot gnu dot org
The following test case

__ea char *strchr_ea (__ea const char *s, int c);
__ea char *foo (__ea char *s)
{
  __ea char *ret = s;
  int i;

  for (i = 0; i < 3; i++)
    ret = strchr_ea (ret, s[i]);

  return ret;
}

results in an ICE when compiled with -O -mea64.

The reason is that the loop optimizers use an induction variable
of type "long long int" to represent s+i, instead of using the
appropriate pointer type.

This causes rewrite_use_address to call create_mem_ref with an
affine expression none of whose subexpressions is of pointer type.
Therefore, the induction variable is assigned as the "index" of
a TARGET_MEM_REF, which means it gets converted to sizetype.

As sizetype is smaller than the __ea pointer type in the -mea64 case,
this means that value would be truncated.  This is later caught by
an assertion in convert_memory_address, which causes the ICE.
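
To make the truncation concrete, here is a minimal sketch in plain C.  It
is not GCC code: the helper names are made up for illustration, and it
simply assumes a 64-bit __ea address being forced through a 32-bit
sizetype.

```c
#include <stdint.h>

/* Hypothetical stand-in for converting a 64-bit __ea address to a
   32-bit sizetype; not taken from GCC.  */
static uint32_t to_sizetype32 (uint64_t ea_addr)
{
  return (uint32_t) ea_addr;   /* drops the upper 32 bits */
}

/* Nonzero if the conversion would lose information, which is the kind
   of condition an assertion could catch.  */
static int truncation_loses_bits (uint64_t ea_addr)
{
  return (uint64_t) to_sizetype32 (ea_addr) != ea_addr;
}
```

Any address with bits set above bit 31 fails the round trip, so using
such an induction variable as a sizetype index cannot be correct.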

Note that use of an integral induction variable was introduced as
part of the fix to PR tree-optimization/27865:
http://gcc.gnu.org/ml/gcc-patches/2006-08/msg00198.html

It seems to me it would be preferable to keep using pointer variables
where possible, even on platforms where sizetype is the same size as
pointers, in order to properly identify address base registers where
this makes a performance difference.


-- 
           Summary: Loop optimizer breaks __ea pointers with -mea64
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: uweigand at gcc dot gnu dot org
        ReportedBy: uweigand at gcc dot gnu dot org
GCC target triplet: spu-unknown-elf


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41857



[Bug target/41176] [4.4/4.5 Regression] ICE in reload_cse_simplify_operands at postreload.c:396

2009-10-08 Thread uweigand at gcc dot gnu dot org


--- Comment #9 from uweigand at gcc dot gnu dot org  2009-10-08 18:39 
---
(In reply to comment #8)
> This is on (set (reg:DF X) (mem:DF ((plus:DI (reg:DI Y) (const_int 3.
> When X is still a pseudo, this is considered valid, as lfd accept any offset,
> but when RA chooses to assign X to a GPR register, the address doesn't match
> the Y constraint in movdf_hardfloat64.  Is this a bug in reload that it 
> doesn't
> attempt to force the address into a register, or in target description that it
> should tell reload to do so somehow?  Or does the backend need to be able to
> handle these, perhaps by forcing splitting of it?

If reload were fixing up this insn, it would indeed have to make sure that
the Y -> r constraints are respected, e.g. by reloading the address.

However, this whole insn is *generated* by reload, in order to load a
value into a reload register.   Unfortunately, for such reload insns
(which are simple moves), reload will simply assume they must be supported
by the target, unless there is a secondary reload for this case.

To fix this, I guess the rs6000 backend either has to accept the insn
and implement it via splitting, or else register a secondary reload for
this case (which is also able to request scratch registers).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41176



[Bug middle-end/37053] [4.3/4.4/4.5 regression] ICE in reload_cse_simplify_operands, at postreload.c:395

2009-08-10 Thread uweigand at gcc dot gnu dot org


--- Comment #19 from uweigand at gcc dot gnu dot org  2009-08-10 15:34 
---
Subject: Bug 37053

Author: uweigand
Date: Mon Aug 10 15:34:09 2009
New Revision: 150626

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=150626
Log:
PR target/37053
* reload1.c (reload_as_needed): Use cancel_changes to completely
undo a failed replacement attempt.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/reload1.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37053



[Bug middle-end/37053] [4.3/4.4/4.5 regression] ICE in reload_cse_simplify_operands, at postreload.c:395

2009-08-05 Thread uweigand at gcc dot gnu dot org


--- Comment #18 from uweigand at gcc dot gnu dot org  2009-08-05 14:59 
---
(In reply to comment #16)
> Uli, can you please have a look at Richard's and Paolo's patches and does one
> or the other seem like a "better" fix?

I've yet another suggestion :-)   See my message at:
http://gcc.gnu.org/ml/gcc-patches/2009-08/msg00254.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37053



[Bug fortran/39795] New: Support round-to-zero in Fortran front-end

2009-04-17 Thread uweigand at gcc dot gnu dot org
On the SPU, all single-precision floating-point arithmetic always
takes place in round-to-zero rounding mode.  The Fortran front-end
always assumes round-to-nearest mode.  This causes a number of issues:

- Both real->string and string->real transformations (e.g. printf, scanf)
  operate in round-to-zero mode.  This means that a round-trip transform
  will often not yield an identical result; this causes e.g. the
  default_format_1.f90 test case to fail.

  It seems this cannot be fixed as the behaviour of printf and scanf
  is specified to follow round-to-zero on the SPU ...

- As a special case of the real->string->real round-trip transform problem,
  the value of GFC_REAL_4_HUGE from the (generated) kinds.h does not convert
  to the largest real when read back in, but its immediate predecessor.
  This causes the scalar_mask_2.f90 test case to fail.

  This can be fixed by using rounding away from zero when generating the
  string constant that is written to the kinds.h file.

- Compile-time operations performed by the Fortran front-end are always
  done in round-to-nearest mode.  This can yield results that differ from
  those obtained by executing the corresponding operations at run-time,
  which causes e.g. the integer_exponentiation_3.F90 test case to fail.

  This can be fixed by having the Fortran front-end check the rounding
  mode of the target floating-point format, and perform compile-time
  operations in that mode.
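
As a rough illustration of how the two rounding modes diverge, the
following C sketch converts a double to single precision with rounding
toward zero and lets one compare that against the usual
round-to-nearest cast.  It is not front-end code: the function names are
invented for this example, and it assumes IEEE binary32/binary64, a
finite input within float range, and a host whose double-to-float cast
rounds to nearest.

```c
#include <stdint.h>
#include <string.h>

/* Convert X to float, rounding toward zero instead of to nearest.  */
static float float_round_to_zero (double x)
{
  float f = (float) x;            /* assumed round-to-nearest */
  double back = (double) f;
  uint32_t bits;

  /* If rounding moved the value away from zero, step back by one ulp.
     Decrementing the bit pattern reduces the magnitude for either
     sign.  */
  if ((x > 0 && back > x) || (x < 0 && back < x))
    {
      memcpy (&bits, &f, sizeof bits);
      bits -= 1;
      memcpy (&f, &bits, sizeof f);
    }
  return f;
}

/* Return the raw bit pattern of F, for comparing results exactly.  */
static uint32_t float_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}
```

For 1.0/3.0 the two conversions differ in the last bit of the
significand, which is the kind of discrepancy that shows up when
constants are folded in one mode and computed at run-time in the other.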


-- 
           Summary: Support round-to-zero in Fortran front-end
           Product: gcc
           Version: 4.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: uweigand at gcc dot gnu dot org
GCC target triplet: spu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39795



[Bug middle-end/38028] [4.4 Regression] eh failures on spu-elf

2009-03-12 Thread uweigand at gcc dot gnu dot org


--- Comment #2 from uweigand at gcc dot gnu dot org  2009-03-12 14:02 
---
Fixed.


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38028



[Bug target/39181] [4.4 Regression] complex int arguments cause ICE

2009-03-12 Thread uweigand at gcc dot gnu dot org


--- Comment #3 from uweigand at gcc dot gnu dot org  2009-03-12 14:01 
---
Fixed.


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39181



[Bug target/39181] [4.4 Regression] complex int arguments cause ICE

2009-03-12 Thread uweigand at gcc dot gnu dot org


--- Comment #2 from uweigand at gcc dot gnu dot org  2009-03-12 14:00 
---
Subject: Bug 39181

Author: uweigand
Date: Thu Mar 12 14:00:21 2009
New Revision: 144811

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144811
Log:
PR target/39181
* config/spu/spu.c (spu_expand_mov): Handle invalid subregs
of non-integer mode as well.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/spu/spu.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39181



[Bug testsuite/39422] New: [4.4 regression] Failing SPU vectorizer testcases

2009-03-10 Thread uweigand at gcc dot gnu dot org
The following two SPU test cases now fail on mainline (they pass on 4.3):

FAIL: gcc.dg/vect/costmodel/spu/costmodel-vect-76b.c scan-tree-dump-times vect
"vectorized 1 loops" 1
FAIL: gcc.dg/vect/costmodel/spu/costmodel-vect-76c.c scan-tree-dump-times vect
"vectorized 1 loops" 1


-- 
           Summary: [4.4 regression] Failing SPU vectorizer testcases
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: testsuite
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: uweigand at gcc dot gnu dot org
GCC target triplet: spu-elf


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39422



[Bug middle-end/38028] [4.4 Regression] eh failures on spu-elf

2009-03-07 Thread uweigand at gcc dot gnu dot org


--- Comment #1 from uweigand at gcc dot gnu dot org  2009-03-07 16:02 
---
Subject: Bug 38028

Author: uweigand
Date: Sat Mar  7 16:02:30 2009
New Revision: 144696

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144696
Log:
PR middle-end/38028
* function.c (assign_parm_setup_stack): Use STACK_SLOT_ALIGNMENT to
determine alignment passed to assign_stack_local.
(assign_parms_unsplit_complex): Likewise.
* except.c (sjlj_build_landing_pads): Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/except.c
trunk/gcc/function.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38028



[Bug target/38025] gcc.target/spu/intrinsics-1.c test fails

2008-11-05 Thread uweigand at gcc dot gnu dot org


--- Comment #1 from uweigand at gcc dot gnu dot org  2008-11-05 18:04 
---
The test case tests for expected failures.  It seems there is now an additional
message being output:

/home/meissner/fsf-src/trunk/gcc/testsuite/gcc.target/spu/intrinsics-1.c:13:
warning: passing argument 2 of ‘__builtin_spu_cmpgt_11’ makes integer from
pointer without a cast
/home/meissner/fsf-src/trunk/gcc/testsuite/gcc.target/spu/intrinsics-1.c:13:
note: expected ‘int’ but argument is of type ‘int *’

The testcase checks for the "makes integer from pointer" error, but does not
expect the additional "note".


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38025



[Bug bootstrap/37097] [4.4 Regression]: Revision 139014 failed to bootstrap

2008-08-12 Thread uweigand at gcc dot gnu dot org


--- Comment #2 from uweigand at gcc dot gnu dot org  2008-08-12 14:45 
---
Should be fixed now ...


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37097



[Bug bootstrap/37097] [4.4 Regression]: Revision 139014 failed to bootstrap

2008-08-12 Thread uweigand at gcc dot gnu dot org


--- Comment #1 from uweigand at gcc dot gnu dot org  2008-08-12 14:37 
---
Subject: Bug 37097

Author: uweigand
Date: Tue Aug 12 14:35:54 2008
New Revision: 139019

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=139019
Log:
PR bootstrap/37097
* builtins.c (do_mpfr_bessel_n): Fix copy-and-paste bug introduced
by last change.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/builtins.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37097



[Bug target/36613] [4.2/4.3 Regression] likely codegen bug

2008-08-11 Thread uweigand at gcc dot gnu dot org


--- Comment #15 from uweigand at gcc dot gnu dot org  2008-08-11 15:12 
---
(In reply to comment #14)
> Ulrich asked for some time on the trunk (we have built all of our
> packages against a patched 4.3 tree now with no appearant problems as 
> well).

OK, in that case I have no further concern.  I'll leave it up to you as
release manager to decide when you want it to go into 4.3 ...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36613



[Bug target/36613] [4.2/4.3/4.4 Regression] likely codegen bug

2008-07-31 Thread uweigand at gcc dot gnu dot org


--- Comment #11 from uweigand at gcc dot gnu dot org  2008-07-31 19:31 
---
I'll have a look tomorrow ...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36613



[Bug target/36698] gcc.c-torture/compile/20001226-1.c exceeds SPU local store size with -O0

2008-07-02 Thread uweigand at gcc dot gnu dot org


--- Comment #2 from uweigand at gcc dot gnu dot org  2008-07-02 15:59 
---
Subject: Bug 36698

Author: uweigand
Date: Wed Jul  2 15:58:09 2008
New Revision: 137368

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=137368
Log:
PR target/36698
* gcc.c-torture/compile/20001226-1.c: XFAIL -O0 case on SPU.

* gcc.dg/pr27095.c: Provide target-specific regexp for SPU.

Modified:
branches/gcc-4_3-branch/gcc/testsuite/ChangeLog
branches/gcc-4_3-branch/gcc/testsuite/gcc.c-torture/compile/20001226-1.c
branches/gcc-4_3-branch/gcc/testsuite/gcc.dg/pr27095.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36698



[Bug target/36698] gcc.c-torture/compile/20001226-1.c exceeds SPU local store size with -O0

2008-07-02 Thread uweigand at gcc dot gnu dot org


--- Comment #1 from uweigand at gcc dot gnu dot org  2008-07-02 15:57 
---
Subject: Bug 36698

Author: uweigand
Date: Wed Jul  2 15:56:31 2008
New Revision: 137367

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=137367
Log:
PR target/36698
* gcc.c-torture/compile/20001226-1.c: XFAIL -O0 case on SPU.

* gcc.dg/pr27095.c: Provide target-specific regexp for SPU.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.c-torture/compile/20001226-1.c
trunk/gcc/testsuite/gcc.dg/pr27095.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36698



[Bug target/36698] New: gcc.c-torture/compile/20001226-1.c exceeds SPU local store size with -O0

2008-07-02 Thread uweigand at gcc dot gnu dot org
The gcc.c-torture/compile/20001226-1.c test case, when compiled with -O0
on the spu-elf target, results in a single function that exceeds local store
size (256 KB) on the SPU.  This cannot run (cannot even be linked!); and 
because it is a single function, overlay support does not help either.

This PR is opened to document the XFAIL reason for that test case.
In the future, it is hoped that enhancements to overlay support (by
splitting up large functions into multiple sections) will enable this
test case to pass as well.


-- 
           Summary: gcc.c-torture/compile/20001226-1.c exceeds SPU local
                    store size with -O0
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: uweigand at gcc dot gnu dot org
GCC target triplet: spu-unknown-elf


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36698



[Bug target/34856] [4.2/4.3/4.4 Regression] ICE with some constant vectors

2008-06-28 Thread uweigand at gcc dot gnu dot org


--- Comment #29 from uweigand at gcc dot gnu dot org  2008-06-28 10:49 
---
Subject: Bug 34856

Author: uweigand
Date: Sat Jun 28 10:48:33 2008
New Revision: 137219

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=137219
Log:
PR target/34856
* config/spu/spu.c (spu_builtin_splats): Do not generate
invalid CONST_VECTOR expressions.
(spu_expand_vector_init): Likewise.

Modified:
branches/gcc-4_3-branch/gcc/ChangeLog
branches/gcc-4_3-branch/gcc/config/spu/spu.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34856



[Bug target/34856] [4.2/4.3/4.4 Regression] ICE with some constant vectors

2008-06-28 Thread uweigand at gcc dot gnu dot org


--- Comment #28 from uweigand at gcc dot gnu dot org  2008-06-28 10:48 
---
Subject: Bug 34856

Author: uweigand
Date: Sat Jun 28 10:47:36 2008
New Revision: 137218

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=137218
Log:
PR target/34856
* config/spu/spu.c (spu_builtin_splats): Do not generate
invalid CONST_VECTOR expressions.
(spu_expand_vector_init): Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/spu/spu.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34856



[Bug target/36222] x86 fails to optimize out __v4si -> __m128i move

2008-05-18 Thread uweigand at gcc dot gnu dot org


--- Comment #8 from uweigand at gcc dot gnu dot org  2008-05-18 15:58 
---
That special case in find_reloads is really about a different situation.
We do not have a simple move here.

The problem is also not really related to vector instructions in particular;
reload doesn't at all care what the instructions actually do ...

There are two problems involved in this particular case, each of which
suffices to prevent optimal register allocation.

Before reload, we have

(insn:HI 10 27 11 2 d.c:7 (set (reg:V2SI 66)
        (vec_concat:V2SI (mem/c/i:SI (reg/f:SI 16 argp) [2 x1+0 S4 A32])
            (reg/v:SI 60 [ x2 ]))) 1338 {*vec_concatv2si_sse2}
     (expr_list:REG_DEAD (reg/v:SI 60 [ x2 ])
        (nil)))

where local register allocation has already selected hard registers
for *both* operands 0 and 2:

;; Register 60 in 21.
;; Register 66 in 21.

Now, the insn pattern offers those alternatives:

(define_insn "*vec_concatv2si_sse2"
  [(set (match_operand:V2SI 0 "register_operand" "=x,x ,*y,*y")
        (vec_concat:V2SI
          (match_operand:SI 1 "nonimmediate_operand" " 0,rm, 0,rm")
          (match_operand:SI 2 "reg_or_0_operand" " x,C ,*y, C")))]

As operand 2 is not zero ("C" constraint), reload must choose
the first alternative, which has a matching constraint between
operands 0 and 1.

This means that the choices selected by local-alloc (both operand
0 and 2 in the same register) *force* a reload of operand 0 here.

Why does local-alloc choose the same register for the two operands?
This happens because in general, there is no conflict between a
register that is used and set in the same insn, because usually the
same hard register *can* be used for both.

In this case this is not true, but local-alloc does not recognize
this.  There is indeed code in block_alloc that tries to handle
matching constraints, but this only recognizes the more typical
scenario where *every* alternative requires a match.

Here, we seemingly have alternatives that do not require a match
-- of course, that doesn't help because in these alternatives
operand 2 is extremely constrained ("C" will only accept a constant
zero) and so they aren't actually usable ...
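
The reasoning above can be restated as a small toy model in C.  This is
only a sketch: the struct, the table and the function names are
assumptions made up for illustration and do not correspond to any GCC
data structure.  It encodes, for each of the four alternatives, whether
operand 1 must match operand 0 and whether operand 2 may be a register.

```c
/* Toy model of the four alternatives of *vec_concatv2si_sse2 quoted
   above.  Names are illustrative, not GCC's.  */
struct alt_sketch
{
  int op1_must_match_op0;   /* operand 1 uses the "0" constraint */
  int op2_accepts_reg;      /* operand 2 is "x" or "*y", not "C" */
};

static const struct alt_sketch alts[4] = {
  { 1, 1 },   /* =x  <-  0, x  */
  { 0, 0 },   /* =x  <- rm, C  */
  { 1, 1 },   /* =*y <-  0, *y */
  { 0, 0 },   /* =*y <- rm, C  */
};

/* Nonzero if every alternative usable for the given kind of operand 2
   requires operand 1 to match operand 0.  */
static int match_is_unavoidable (int op2_is_reg)
{
  int i;
  for (i = 0; i < 4; i++)
    {
      int usable = op2_is_reg ? alts[i].op2_accepts_reg
                              : !alts[i].op2_accepts_reg;
      if (usable && !alts[i].op1_must_match_op0)
        return 0;
    }
  return 1;
}

/* With the match unavoidable, operand 1 has to end up in operand 0's
   register, so operand 2 cannot share that register.  */
static int forces_reload_of_op0 (int op0_hardreg, int op2_hardreg)
{
  return match_is_unavoidable (1) && op0_hardreg == op2_hardreg;
}
```

With both operand 0 and operand 2 assigned to hard register 21, as in
the dump above, the model reports that a reload cannot be avoided.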


Even assuming local-alloc had made better choices, reload would still
generate an output reload.  This second problem really comes down to
use of matching constraints between operands of different modes / sizes.

Once operand 0 was assigned to a hard register by local-alloc, reload
would generally attempt to also assign operand 1 to the same register,
in order to fulfill the matching constraint without requiring an
output reload.

This is done by the routine find_dummy_reload.  However, in this
particular case, that routine immediately fails due to:

  /* If operands exceed a word, we can't use either of them
     unless they have the same size.  */
  if (GET_MODE_SIZE (outmode) != GET_MODE_SIZE (inmode)
      && (GET_MODE_SIZE (outmode) > UNITS_PER_WORD
          || GET_MODE_SIZE (inmode) > UNITS_PER_WORD))
    return 0;

because operand 0 is two words in size, while operand 1 is just
a single word in size.  I'm not completely sure this check (which
has been in SVN forever) is still required today ...
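
For reference, the same test can be written over plain byte counts.
This is a sketch only; UNITS_PER_WORD_SKETCH and the function name are
assumptions for illustration, with the word size set to 4 bytes as on a
32-bit target.

```c
/* Toy restatement of the size test quoted above, using byte counts
   instead of machine modes.  The value 4 is an assumption matching a
   32-bit target; it is not read from GCC.  */
#define UNITS_PER_WORD_SKETCH 4

/* Returns nonzero if the quoted check would reject the operand pair.  */
static int size_check_rejects (int outsize, int insize)
{
  return outsize != insize
         && (outsize > UNITS_PER_WORD_SKETCH
             || insize > UNITS_PER_WORD_SKETCH);
}
```

An 8-byte output paired with a 4-byte input is rejected, while pairs of
equal size, or pairs that both fit in a word, pass.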


In any case, the simplest work-around might be to write that pattern
in a way that is easier to handle by local-alloc / reload:  the two
cases x <- 0, x  and x <- rm, C are nearly completely unrelated; why
not split them into two different insn patterns?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36222



[Bug rtl-optimization/34999] Fallthru crossing edges in partition_hot_cold_basic_blocks are not been fixed when the section ends with call insn

2008-03-04 Thread uweigand at gcc dot gnu dot org


--- Comment #16 from uweigand at gcc dot gnu dot org  2008-03-04 14:51 
---
Hi Jakub,

we need the same changes in both .eh_frame and .dwarf_frame;
does the gas .cfi_ support both sections?

I'm wondering how "save & restore" should work across two
different FDEs -- in the new FDE, we'd have to emit the full
set of CFA instructions to get to the "base-line" state ...


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uweigand at de dot ibm dot
                   |                            |com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34999



[Bug target/35311] ICE at postreload.c:392 while building webkit on s390

2008-02-25 Thread uweigand at gcc dot gnu dot org


--- Comment #4 from uweigand at gcc dot gnu dot org  2008-02-25 22:15 
---
(In reply to comment #3)
> This problem has already been fixed for GCC 4.3 (#34641). The testcase from
> that PR didn't fail for GCC 4.2 so I didn't apply the patch on 4.2 as well. 
> But
> now the patch should be fine for 4.2. I've verified that it fixes your
> testcase.

I agree this patch should go into 4.2 as well.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35311



[Bug target/34529] [4.1/4.2/4.3 Regression] Wrong code with altivec stores and offsets

2008-01-21 Thread uweigand at gcc dot gnu dot org


--- Comment #11 from uweigand at gcc dot gnu dot org  2008-01-21 18:54 
---
The secondary reload hook does not need to make the decision whether or
not indexed addresses are allowed; that decision has already been taken.

The purpose of the secondary reload hook is simply to do whatever it takes
to load an indexed address into a base register (after reload has already
decided that this address needs to be loaded).

Reload does not always allocate scratch registers, since it assumes that
targets provide a "load address" instruction that is able to perform
this operation without a scratch register.  You need the secondary reload
to let common code know that this doesn't work on your target.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34529



[Bug rtl-optimization/34529] [4.1/4.2/4.3 Regression] Wrong code with altivec stores and offsets

2008-01-09 Thread uweigand at gcc dot gnu dot org


--- Comment #8 from uweigand at gcc dot gnu dot org  2008-01-09 19:23 
---
This is a long-standing problem in gen_reload.  This routine fundamentally
assumes that every PLUS expression that describes a legitimate address can
be reloaded into a register without requiring any additional scratch registers.

It first attempts to generate a simple (set (...) (plus ...)) pattern.  If
the target does not accept this pattern, it tries as a fall-back to generate
a two-insn sequence to perform the load.

This code looks fundamentally broken to me, in that it does not check whether
the first of those two insns clobbers a register still required by the second.
However, even if such a check were added, at this point we could not actually
solve the problem, because it's far too late to allocate another register.
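
To illustrate the hazard described above, here is a toy model in C.  It is
not GCC code: the register file and the helper names are hypothetical.  It
emulates a two-insn fall-back of the form "dst = a; dst = dst + b" and shows
that the result is wrong when dst is the same register as b, because the
first insn has already overwritten the value the second one needs.

```c
#include <stdint.h>

/* Toy register file; purely illustrative, not a GCC data structure.  */
enum { NUM_REGS = 4 };

/* Emulate the two-insn fall-back: copy A into DST, then add B to DST.
   If DST and B are the same register, the copy clobbers B before the
   add reads it, so the sum is wrong.  */
static int32_t
two_insn_add (int32_t regs[NUM_REGS], int dst, int a, int b)
{
  regs[dst] = regs[a];          /* insn 1: dst = a */
  regs[dst] += regs[b];         /* insn 2: dst = dst + b */
  return regs[dst];
}

/* The kind of check the comment says is missing: does insn 1 clobber
   a register that insn 2 still has to read?  */
static int
first_insn_clobbers_input (int dst, int b)
{
  return dst == b;
}
```

With registers holding 5 and 7, the sequence yields 12 when dst is a third
register, but 10 when dst and b coincide.  Detecting the overlap is easy;
the point made above is that by then no further register can be allocated.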

I ran into the same problem in the s390 back-end a long time ago, and solved
it by providing a back-end specific secondary reload rule (requiring a scratch
register) to reload PLUS patterns that require multiple instructions.

I think you'll have to do the same for rs6000.

(As a separate issue, the broken code in gen_reload should be replaced by an
abort -- at least back-ends where the secondary reload is missing would no
longer generate wrong code in that case.)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34529



[Bug target/34250] ICE in find_constant_pool_ref

2007-11-28 Thread uweigand at gcc dot gnu dot org


--- Comment #7 from uweigand at gcc dot gnu dot org  2007-11-28 17:11 
---
(In reply to comment #4)
> For reference, our hacky approach to enforce liveness of arguments is by
> using them as operands of an inline asm, which we insert as first instruction
> in every function.  When those are inlined and arguments seen as constant
> (e.g. function names, __func__) it quickly happens that there are more than
> one constant pool ref in one inline asm.

I'm not sure I see what the point of this is ...

> But I see what you are saying regarding the possibility of overflowing the
> pool inside one instruction.  Will the compiler ICE in that situation or
> will it silently generate wrong code?  If the former I'm willing to accept
> that risk for now, after all split constant pools are relatively new anyway,
> IIRC.

Not really, they've always been required on s390.  (Note that on s390x,
and even on s390 with -march=z900 or higher, split constant pools are no
longer necessary.)

I expect that what will happen when the pool overflows is that you get a
linker error because the 12-bit relocation in the displacement field
overflows.  (But maybe this is only a warning, I'm not completely sure
this hasn't changed across different linker versions ...)

Note that if I understand your usage correctly, this should in fact never
occur as you do not generate any code that actually references those
literal pool entries.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34250



[Bug target/34250] ICE in find_constant_pool_ref

2007-11-28 Thread uweigand at gcc dot gnu dot org


--- Comment #3 from uweigand at gcc dot gnu dot org  2007-11-28 13:36 
---
Hi Michael,

the problem is that there is an implicit assumption throughout the code
that you can have at most one pool constant per instruction.  For example,
the pool size / splitting heuristics assume that.  I think with your patch
as is, you can find examples where it will attempt to add 2 constants to
the current pool chunk even though it only has room for 1 left.

This could probably be fixed by reworking some of the heuristics (e.g. 
check *first* how many constants an insn will require, and start up a
new pool early if required).  But that can be a bit tricky ...

What fundamentally cannot be fixed is the extreme case where the single
instruction uses so many constants that they don't fit into a pool chunk
even by themselves.  We can only reload the base register to point to a
different chunk once before every insn.
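
A minimal sketch of the "check first, then start a new pool early" idea,
written as a toy model in C.  The struct, the function name and the notion
of counting constants rather than bytes are all assumptions made for
illustration; this is not the s390 back-end code.

```c
/* Toy model of literal pool chunks; names and units are hypothetical.  */
struct pool_chunk
{
  int capacity;   /* number of constants one chunk can hold */
  int used;       /* constants already placed in the current chunk */
  int chunks;     /* number of chunks started so far */
};

/* Place the NCONSTS constants needed by a single insn.
   Returns 0 on success, or -1 if the insn alone needs more constants
   than a whole chunk can hold (the case that cannot be fixed).  */
static int
place_insn_constants (struct pool_chunk *p, int nconsts)
{
  if (nconsts > p->capacity)
    return -1;

  /* Check *first* whether everything fits; if not, start a new chunk
     before this insn instead of overflowing the current one.  */
  if (p->used + nconsts > p->capacity)
    {
      p->chunks++;
      p->used = 0;
    }

  p->used += nconsts;
  return 0;
}
```

With a capacity of 4 and 3 slots already used, an insn needing 2 constants
opens a second chunk, while an insn needing 5 is rejected outright.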

Can you elaborate why this occurs in "normal" code (without inline asm)?


-- 

uweigand at gcc dot gnu dot org changed:

   What|Removed |Added

 CC|        |uweigand at gcc dot gnu dot
   |        |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34250



[Bug middle-end/32970] [4.3 Regression] C++ frontend can not handle vector pointer constant parameter

2007-08-12 Thread uweigand at gcc dot gnu dot org


--- Comment #7 from uweigand at gcc dot gnu dot org  2007-08-12 23:43 
---
Sa's patch isn't quite correct as it ignores the result of
the build_qualified_type call.  The following patch should
fix that:

diff -urNp toolchain/gcc.orig/gcc/tree.c toolchain/gcc/gcc/tree.c
--- toolchain/gcc.orig/gcc/tree.c   2007-08-12 15:57:05.442520932 +0200
+++ toolchain/gcc/gcc/tree.c   2007-08-12 16:07:42.516093968 +0200
@@ -6554,10 +6554,7 @@ reconstruct_complex_type (tree type, tre
   else
 return bottom;

-  TYPE_READONLY (outer) = TYPE_READONLY (type);
-  TYPE_VOLATILE (outer) = TYPE_VOLATILE (type);
-
-  return outer;
+  return build_qualified_type (outer, TYPE_QUALS (type));
 }

 /* Returns a vector tree node given a mode (integer, vector, or BLKmode) and


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32970



[Bug middle-end/32970] [4.3 Regression] C++ frontend can not handle vector pointer constant parameter

2007-08-12 Thread uweigand at gcc dot gnu dot org


--- Comment #6 from uweigand at gcc dot gnu dot org  2007-08-12 23:35 
---
Changing component to middle-end as the problem is not actually in the C++
front-end.


-- 

uweigand at gcc dot gnu dot org changed:

   What|Removed |Added

  Component|c++ |middle-end


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32970



[Bug middle-end/30761] [4.1/4.2 regression] Error: unsupported relocation against sfp

2007-04-27 Thread uweigand at gcc dot gnu dot org


--- Comment #11 from uweigand at gcc dot gnu dot org  2007-04-27 15:03 
---
(In reply to comment #8)
> Ulrich, in response to your question in Comment #6, yes, this bug appears in
> 4.1 and 4.2, not just in 4.3.  So, if you think it's safe to backport the
> reload patch, it would be nice to have the fix there as well.

I've back-ported the fix to 4.2 and 4.1 now.


-- 

uweigand at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30761



[Bug middle-end/30761] [4.1/4.2 regression] Error: unsupported relocation against sfp

2007-04-27 Thread uweigand at gcc dot gnu dot org


--- Comment #10 from uweigand at gcc dot gnu dot org  2007-04-27 14:59 
---
Subject: Bug 30761

Author: uweigand
Date: Fri Apr 27 14:59:21 2007
New Revision: 124219

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=124219
Log:
PR middle-end/30761
* reload1.c (eliminate_regs_in_insn): In the single_set special
case, attempt to re-recognize the insn before falling back to
having reload fix it up.

Modified:
branches/gcc-4_1-branch/gcc/ChangeLog
branches/gcc-4_1-branch/gcc/reload1.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30761



[Bug middle-end/30761] [4.1/4.2 regression] Error: unsupported relocation against sfp

2007-04-26 Thread uweigand at gcc dot gnu dot org


--- Comment #9 from uweigand at gcc dot gnu dot org  2007-04-26 22:10 
---
Subject: Bug 30761

Author: uweigand
Date: Thu Apr 26 22:10:09 2007
New Revision: 124199

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=124199
Log:
PR middle-end/30761
* reload1.c (eliminate_regs_in_insn): In the single_set special
case, attempt to re-recognize the insn before falling back to
having reload fix it up.

Modified:
branches/gcc-4_2-branch/gcc/ChangeLog
branches/gcc-4_2-branch/gcc/reload1.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30761



[Bug target/31641] [4.1/4.2/4.3 Regression] ICE in s390_expand_setmem, at config/s390/s390.c:3618

2007-04-23 Thread uweigand at gcc dot gnu dot org


--- Comment #3 from uweigand at gcc dot gnu dot org  2007-04-23 14:51 
---
I don't think the patch is correct; according to the C standard,
the third argument of memset is of type size_t, which must be
an *unsigned* type, so it cannot in fact be negative.

What apparently happens is that the argument (after conversion to
size_t) is so big that it appears to be negative in its representation
as CONST_INT, so the assert in s390.c triggers.
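
As a rough illustration of how an unsigned length can "appear negative",
here is a small C sketch.  It assumes a 64-bit size_t and a 64-bit signed
host integer, and the helper names are made up for this example; it does
not reproduce the s390.c code.

```c
#include <stdint.h>
#include <string.h>

/* View the bits of an unsigned 64-bit length through a signed 64-bit
   integer, roughly what happens when a huge size_t value ends up in a
   signed host representation.  */
static int64_t
length_as_signed (uint64_t len)
{
  int64_t s;
  memcpy (&s, &len, sizeof s);
  return s;
}

/* Nonzero if LEN, although unsigned, shows up as a negative number.  */
static int
length_looks_negative (uint64_t len)
{
  return length_as_signed (len) < 0;
}
```

A call such as memset (p, 0, -1) converts -1 to the largest size_t value,
whose bit pattern reads back as -1 in the signed view.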

A proper fix would probably be to remove the assert in s390_expand_setmem
and at the same time make sure those big sizes are handled correctly.

(In any case, the testcase certainly is broken anyway.)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31641



[Bug tree-optimization/30590] [4.1/4.2/4.3 Regression] tree-nrv optimization clobbers return variable

2007-03-14 Thread uweigand at gcc dot gnu dot org


--- Comment #12 from uweigand at gcc dot gnu dot org  2007-03-14 15:26 
---
This does fix my testcase on mainline.  Thanks!


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30590



[Bug middle-end/30761] [4.1/4.2 regression] Error: unsupported relocation against sfp

2007-03-12 Thread uweigand at gcc dot gnu dot org


--- Comment #6 from uweigand at gcc dot gnu dot org  2007-03-12 19:34 
---
I haven't verified that this problem is fixed -- the patch was originally
intended to fix another bug uncovered by Peter Bergner, and I just added
this PR number to the check-in due to Andrew's comment #3 on this bug.

Andrew, have you verified that the problem is fixed now?

I could backport the reload1.c change to 4.1/4.2 -- I haven't done so as
there's always some risk associated with such backports, and it appeared
from this bugzilla record that the bug was exposed only with a 4.3 change
anyway.  Was this impression mistaken?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30761



[Bug middle-end/30761] [4.1/4.2/4.3 regression] Error: unsupported relocation against sfp

2007-02-21 Thread uweigand at gcc dot gnu dot org


--- Comment #4 from uweigand at gcc dot gnu dot org  2007-02-21 15:05 
---
Subject: Bug 30761

Author: uweigand
Date: Wed Feb 21 15:05:01 2007
New Revision: 122199

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=122199
Log:
PR middle-end/30761
* reload1.c (eliminate_regs_in_insn): In the single_set special
case, attempt to re-recognize the insn before falling back to
having reload fix it up.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/reload1.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30761



[Bug tree-optimization/30590] New: tree-nrv optimization clobbers return variable

2007-01-25 Thread uweigand at gcc dot gnu dot org
The following test case, when compiled with g++ -O, has a return
value of 1 (instead of the correct value of 0):

struct test
{
  int type;
  char buffer[4242]; /* should trigger pass-by-reference */
};

int flag = 0;

struct test
reset (void)
{
  struct test retval;
  retval.type = 1;
  return retval;
}

struct test
test (void)
{
  struct test result;
  result.type = 0;

  for (int i = 0; i < 2; ++i)
    {
      struct test candidate = reset ();
      if (flag)
        result = candidate;
    }

  return result;
}

int
main (void)
{
  struct test result = test ();
  return result.type;
}


The reason for this appears to be a bug in the tree-nrv (named
return value) optimization pass.   Before tree-nrv, the function
test looks like:

  struct test candidate;
  int i;
  int flag.0;

:
  .type = 0;
  flag.0 = flag;
  i = 0;

:;
  candidate = reset () [return slot optimization];
  if (flag.0 != 0) goto ; else goto ;

:;
   = candidate;

:;
  i = i + 1;
  if (i != 2) goto ; else goto ;

:;
  return ;


After tree-nrv, we have:

  struct test candidate;
  int i;
  int flag.0;

:
  .type = 0;
  flag.0 = flag;
  i = 0;

:;
   = reset () [return slot optimization];
  if (flag.0 != 0) goto ; else goto ;

:;

:;
  i = i + 1;
  if (i != 2) goto ; else goto ;

:;
  return ;


The return value of reset has been redirected directly
into the return value slot of test, instead of the local
variable candidate.   tree-nrv.c has some code that is
apparently intended to prevent this type of thing; I'm
not sure why that didn't work here.
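
The effect can be modelled in plain C by passing the return slot
explicitly.  This is only a sketch: reset_into and the two test_* functions
are hypothetical names, and the code mimics what the dumps above describe
rather than what the compiler literally emits.

```c
struct test
{
  int type;
  char buffer[4242];
};

static int flag = 0;

/* Stand-in for reset () with an explicit return slot.  */
static void
reset_into (struct test *slot)
{
  slot->type = 1;
}

/* Intended behaviour: reset writes into the local "candidate".  */
static int
test_expected (void)
{
  struct test result;
  result.type = 0;
  for (int i = 0; i < 2; ++i)
    {
      struct test candidate;
      reset_into (&candidate);
      if (flag)
        result = candidate;
    }
  return result.type;
}

/* Behaviour after the faulty transformation: reset writes straight
   into the result slot, whether or not flag is set.  */
static int
test_after_bad_nrv (void)
{
  struct test result;
  result.type = 0;
  for (int i = 0; i < 2; ++i)
    reset_into (&result);
  return result.type;
}
```

With flag equal to 0, the first variant returns 0 and the second returns 1,
matching the wrong return value reported above.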

The bug occurs (at least) in GCC 4.1.1 and current mainline.


-- 
           Summary: tree-nrv optimization clobbers return variable
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: uweigand at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30590



[Bug target/29319] ICE unrecognizable insn: offset too large for larl (breaks glibc)

2006-10-24 Thread uweigand at gcc dot gnu dot org


--- Comment #3 from uweigand at gcc dot gnu dot org  2006-10-24 19:03 
---
Sorry for missing that bug.  The proposed patch is OK -- thanks for 
catching this.

As to the general problem, I think you're right that we need to further
constrain the range of accepted offsets.  However, DISP_IN_RANGE is not
the right solution, we can do a lot better.

I think the right fix would be to accept any offset in the +- 2 GB
range (*not* 4 GB) as today.  Since we restrict executable / shared
object sizes to 2 GB right now, the delta between the symbol and the
pc is in the range +- 2GB.  Adding an offset in the +- 2 GB range 
will result in a total delta in the +- 4 GB range -- which is just
what larl allows.

The +- 2 GB range is also big enough to accept any (reasonable)
offset on a 31-bit system.
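
The range argument can be checked with a few lines of C.  This is a sketch
under stated assumptions: the half-open bounds are my own choice, the
function names are hypothetical, and the requirement that larl targets be
even is not modelled.

```c
#include <stdint.h>

#define TWO_GB  (INT64_C (1) << 31)
#define FOUR_GB (INT64_C (1) << 32)

/* Nonzero if V lies in [-LIMIT, LIMIT).  */
static int
within (int64_t v, int64_t limit)
{
  return v >= -limit && v < limit;
}

/* Proposed rule: accept any constant offset in the +- 2 GB range.  */
static int
offset_acceptable (int64_t offset)
{
  return within (offset, TWO_GB);
}

/* The symbol is at most +- 2 GB from the pc, so symbol delta plus an
   accepted offset stays inside the +- 4 GB range larl can encode.  */
static int
larl_reachable (int64_t symbol_delta, int64_t offset)
{
  return within (symbol_delta + offset, FOUR_GB);
}
```

The extreme cases (both values at the same end of their 2 GB range) still
pass larl_reachable, which is why a 4 GB offset range would be too wide.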


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29319



[Bug middle-end/28862] [4.0/4.1/4.2 Regression] attribute ((aligned)) ignored on vector variables

2006-09-05 Thread uweigand at gcc dot gnu dot org


--- Comment #7 from uweigand at gcc dot gnu dot org  2006-09-05 12:47 
---
(In reply to comment #5)
> Is this also supposed to fix the problem I posted in comment #2? I applied
> that patch to my gcc but it didn't fix the generated code for me. It's just
> weird because the bug only appears if the code is complex enough. If it's
> just a rather simple function, the generated code is correct.

No, your problem is certainly something completely different.  In fact I've
never seen GCC (common code) do anything even remotely like:
>GCC reserves an area big enough to hold the structure plus padding,
>so it can align the structure dynamically at runtime. It stores a
>pointer to the reserved area and a pointer to the structure within
>the area. 

Normally, attribute ((aligned)) does not cause any code to be
generated that attempts to dynamically adjust alignment at runtime,
it simply allows a variable to be aligned up to whatever default
stack frame alignment the platform ABI provides for.

It appears that the i386 back-end has some special code related to
the -mstackrealign option that may be involved here.  In any case,
this would be something for an i386 back-end person to look into.

Since this is a completely unrelated problem, I recommend you open
a separate bugzilla for it.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28862



[Bug middle-end/28862] [4.0/4.1/4.2 Regression] attribute ((aligned)) ignored on vector variables

2006-09-05 Thread uweigand at gcc dot gnu dot org


--- Comment #6 from uweigand at gcc dot gnu dot org  2006-09-05 12:41 
---
(In reply to comment #4)
> Anyways I am going to test the obvious fix unless you (Ulrich) want to do it.

Please go ahead, thanks!


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28862



[Bug c/28862] New: attribute ((aligned)) ignored on vector variables

2006-08-26 Thread uweigand at gcc dot gnu dot org
The following test case

__attribute__ ((vector_size (16))) unsigned int foo[128/16]
__attribute__((aligned (128)));

[ and analogously

vector unsigned int foo[128/16] __attribute__((aligned (128)));

on ppc (where "vector" is defined to __attribute__((altivec(vector__))))
or spu (where "vector" is defined to __attribute__((spu_vector))) ]

compiles to

.comm   foo,128,16

Note that the user-specified alignment is ignored, and the default
alignment of 16 for this vector type is used instead.

The reason appears to be a problem in decl_attributes (attribs.c).
For this declaration, first the "aligned" attribute is processed,
and sets DECL_ALIGN to 128 bytes, as well as the DECL_USER_ALIGN
flag.  However, subsequently the "vector_size" attribute is
processed, and since this is marked as "type_required", the following
piece of code in decl_attributes:

  /* Layout the decl in case anything changed.  */
  if (spec->type_required && DECL_P (*node)
      && (TREE_CODE (*node) == VAR_DECL
          || TREE_CODE (*node) == PARM_DECL
          || TREE_CODE (*node) == RESULT_DECL))
    relayout_decl (*node);

thinks it needs to recompute the decl properties.  In particular,

void
relayout_decl (tree decl)
{
  DECL_SIZE (decl) = DECL_SIZE_UNIT (decl) = 0;
  DECL_MODE (decl) = VOIDmode;
  DECL_ALIGN (decl) = 0;
  SET_DECL_RTL (decl, 0);

  layout_decl (decl, 0);
}

relayout_decl resets DECL_ALIGN without consideration of the
DECL_USER_ALIGN flag, and layout_decl then fills back in the
default alignment for the vector type.

The problem does not occur in 3.4, since decl_attributes there
works like this:

  /* Layout the decl in case anything changed.  */
  if (spec->type_required && DECL_P (*node)
      && (TREE_CODE (*node) == VAR_DECL
          || TREE_CODE (*node) == PARM_DECL
          || TREE_CODE (*node) == RESULT_DECL))
    {
      /* Force a recalculation of mode and size.  */
      DECL_MODE (*node) = VOIDmode;
      DECL_SIZE (*node) = 0;
      if (!DECL_USER_ALIGN (*node))
        DECL_ALIGN (*node) = 0;

      layout_decl (*node, 0);
    }

and specifically keeps user-requested alignments.
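
The difference between the two variants can be shown with a toy model in
C.  Everything here is a simplification: struct toy_decl, the default of
16 and the toy_* helpers are hypothetical stand-ins, and toy_layout_decl
only mimics the "fill in a default when alignment is unset" behaviour
described above.

```c
/* Toy stand-in for the relevant decl fields; not GCC's tree type.  */
struct toy_decl
{
  unsigned align;   /* current alignment, 0 means "not yet computed" */
  int user_align;   /* nonzero if the user requested the alignment */
};

#define TOY_DEFAULT_VECTOR_ALIGN 16u

/* Simplified layout step: only fills in a default if none is set.  */
static void
toy_layout_decl (struct toy_decl *d)
{
  if (d->align == 0)
    d->align = TOY_DEFAULT_VECTOR_ALIGN;
}

/* 4.0-style: reset the alignment unconditionally.  */
static void
toy_relayout_4_0 (struct toy_decl *d)
{
  d->align = 0;
  toy_layout_decl (d);
}

/* 3.4-style: keep an alignment the user asked for.  */
static void
toy_relayout_3_4 (struct toy_decl *d)
{
  if (!d->user_align)
    d->align = 0;
  toy_layout_decl (d);
}
```

Starting from a user-requested alignment of 128, the 4.0-style variant ends
up at the default of 16, while the 3.4-style variant keeps 128.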


Now, I'm not quite sure what the correct fix for 4.0 and mainline is.
Should we not call relayout_decl (as in 3.4)?  Or should we add the
DECL_USER_ALIGN check to relayout_decl (what about other callers of
this function)?

Richard, it appears you added both the DECL_USER_ALIGN check in
3.4, and the relayout_decl call in 4.0, see PR 18282.  Any opinion?


-- 
           Summary: attribute ((aligned)) ignored on vector variables
           Product: gcc
           Version: 4.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: uweigand at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28862



[Bug c++/18182] Incorrect processing of __attribute__ by the C++ parser

2006-07-14 Thread uweigand at gcc dot gnu dot org


--- Comment #3 from uweigand at gcc dot gnu dot org  2006-07-14 19:27 
---
Yes, looks like this is long fixed.  Closing bug now.


-- 

uweigand at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18182



[Bug target/27842] Miscompile of Altivec vec_abs (float) inside loop

2006-06-06 Thread uweigand at gcc dot gnu dot org


--- Comment #9 from uweigand at gcc dot gnu dot org  2006-06-06 17:10 
---
Fixed on 4.1 branch and mainline.


-- 

uweigand at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27842



[Bug target/27842] Miscompile of Altivec vec_abs (float) inside loop

2006-06-06 Thread uweigand at gcc dot gnu dot org


--- Comment #8 from uweigand at gcc dot gnu dot org  2006-06-06 17:05 
---
Subject: Bug 27842

Author: uweigand
Date: Tue Jun  6 17:04:56 2006
New Revision: 114439

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=114439
Log:
PR target/27842
* config/rs6000/altivec.md (UNSPEC_VSLW): Remove.
("altivec_vspltisw_v4sf", "altivec_vslw_v4sf"): Remove.
("mulv4sf3", "absv4sf3", "negv4sf3"): Adapt users to use
V4SImode temporaries and operations instead.

PR target/27842
* gcc.dg/vmx/pr27842.c: New test.

Added:
branches/gcc-4_1-branch/gcc/testsuite/gcc.dg/vmx/pr27842.c
Modified:
branches/gcc-4_1-branch/gcc/ChangeLog
branches/gcc-4_1-branch/gcc/config/rs6000/altivec.md
branches/gcc-4_1-branch/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27842



[Bug target/27842] Miscompile of Altivec vec_abs (float) inside loop

2006-06-06 Thread uweigand at gcc dot gnu dot org


--- Comment #7 from uweigand at gcc dot gnu dot org  2006-06-06 17:01 
---
Subject: Bug 27842

Author: uweigand
Date: Tue Jun  6 17:01:27 2006
New Revision: 114438

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=114438
Log:
PR target/27842
* config/rs6000/altivec.md (UNSPEC_VSLW): Remove.
("altivec_vspltisw_v4sf", "altivec_vslw_v4sf"): Remove.
("mulv4sf3", "absv4sf3", "negv4sf3"): Adapt users to use
V4SImode temporaries and operations instead.

PR target/27842
* gcc.dg/vmx/pr27842.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/vmx/pr27842.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/altivec.md
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27842



[Bug target/27842] Miscompile of Altivec vec_abs (float) inside loop

2006-06-01 Thread uweigand at gcc dot gnu dot org


--- Comment #4 from uweigand at gcc dot gnu dot org  2006-06-01 21:30 
---
Yes, that makes sense -- in fact, it looks like altivec_vslw_v4sf can then be
removed as well.  I'm currently testing a patch to that effect ...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27842



[Bug target/27842] Miscompile of Altivec vec_abs (float) inside loop

2006-05-31 Thread uweigand at gcc dot gnu dot org


--- Comment #2 from uweigand at gcc dot gnu dot org  2006-05-31 16:59 
---
I'm not sure (subreg:SF (const_int)) is canonical RTL, I haven't seen
subregs of anything but REG or MEM.

In any case, I don't really see what this would buy us over an UNSPEC -- will
the generic simplifier be able to evaluate this by re-interpreting the bit
pattern as float according to the target representation?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27842



[Bug target/27842] New: Miscompile of Altivec vec_abs (float) inside loop

2006-05-31 Thread uweigand at gcc dot gnu dot org
The following test case gets miscompiled and fails when built
with "-O -maltivec -mabi=altivec -include altivec.h" on GCC 4.1:

extern void abort (void);

void test (vector float *p, int n)
{
  int i;
  for (i = 0; i < n; i++)
    p[i] = vec_abs (p[i]);
}

int
main (void)
{
  vector float p = (vector float){ 0.5, 0.5, 0.5, 0.5 };
  vector float q = p;

  test (&p, 1);

  if (memcmp (&p, &q, sizeof (p)))
    abort ();

  return 0;
}


The reason for this appears to be an abuse of RTL semantics by
the "altivec_vspltisw_v4sf" pattern:

(define_insn "altivec_vspltisw_v4sf"
  [(set (match_operand:V4SF 0 "register_operand" "=v")
        (vec_duplicate:V4SF
         (float:SF (match_operand:QI 1 "s5bit_cint_operand" "i"))))]
  "TARGET_ALTIVEC"
  "vspltisw %0,%1"
  [(set_attr "type" "vecperm")])

What this instruction does is to load an immediate *integer*
value into a vector register, which happens to be re-interpreted
as a floating point value (without changing the bit pattern).

What the RTL associated with the pattern *says*, however, is
to load the integer, *converted to floating point*, into the
register, which is a quite different semantics.
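
The two readings give different bit patterns, as this small C sketch shows.
It assumes IEEE 754 single precision and models one 32-bit vector element;
the helper names are made up for the example.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a float.  */
static uint32_t
float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* What the instruction does per word: the integer's bits are kept
   unchanged and merely viewed as a float later on.  */
static uint32_t
reinterpreted_word_bits (int32_t imm)
{
  return (uint32_t) imm;
}

/* What the RTL (float:SF ...) literally says: convert the integer
   value to floating point first.  */
static uint32_t
converted_word_bits (int32_t imm)
{
  return float_bits ((float) imm);
}
```

For the immediate -1, the first reading yields 0xFFFFFFFF (all bits set),
while the second yields 0xBF800000, the encoding of -1.0f.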

Now, since the pattern is only explicitly generated from within
other expanders inside altivec.md (apparently), all of which
expect the semantics of the actual Altivec instruction, not the
semantics as literally specified in the RTL, those misinterpretations
generally cancel each other and generated code behaves as expected.

However, as soon as the middle-end gets an opportunity to run the
RTL through the simplifier, everything breaks.  This happens in
particular when the load is being hoisted out of a loop due to
being loop-invariant, as in the above test case: the vec_abs
pattern expands via this expander

;; Generate
;;vspltisw SCRATCH1,-1
;;vslw SCRATCH2,SCRATCH1,SCRATCH1
;;vandc %0,%1,SCRATCH2
(define_expand "absv4sf2"
  [(set (match_dup 2)
        (vec_duplicate:V4SF (float:SF (const_int -1))))
   (set (match_dup 3)
        (unspec:V4SF [(match_dup 2) (match_dup 2)] UNSPEC_VSLW))
   (set (match_operand:V4SF 0 "register_operand" "=v")
        (and:V4SF (not:V4SF (match_dup 3))
                  (match_operand:V4SF 1 "register_operand" "v")))]
  "TARGET_ALTIVEC"
{
  operands[2] = gen_reg_rtx (V4SFmode);
  operands[3] = gen_reg_rtx (V4SFmode);
})

and the first two insns are in fact loop invariant.
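
For readers unfamiliar with the three-instruction sequence, here is a
one-lane model in C of what it computes.  This is a sketch, assuming IEEE
754 single precision and that vslw shifts each word by the low five bits
of the corresponding word of its second operand; the function names are
hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* One lane of vslw: shift A left by the low 5 bits of B.  */
static uint32_t
vslw_word (uint32_t a, uint32_t b)
{
  return a << (b & 31);
}

/* One lane of the abs sequence.  */
static float
abs_via_mask (float x)
{
  uint32_t scratch1 = 0xFFFFFFFFu;                       /* vspltisw SCRATCH1,-1 */
  uint32_t scratch2 = vslw_word (scratch1, scratch1);    /* vslw: 0x80000000 */
  uint32_t bits;
  float result;

  memcpy (&bits, &x, sizeof bits);
  bits &= ~scratch2;                                     /* vandc: clear sign bit */
  memcpy (&result, &bits, sizeof result);
  return result;
}
```

The sequence only works because SCRATCH1 holds the integer bit pattern of
-1; the mask 0x80000000 depends on all 32 bits being set.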

The problem in this particular test case is a regression in GCC 4.1,
introduced by this patch:

[PATCH] Improve scheduling of Altivec absolute value patterns
http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01195.html

-(define_insn "absv4sf2"
-  [(set (match_operand:V4SF 0 "register_operand" "=v")
-(abs:V4SF (match_operand:V4SF 1 "register_operand" "v")))
-   (clobber (match_scratch:V4SF 2 "=&v"))
-   (clobber (match_scratch:V4SF 3 "=&v"))]
+;; Generate
+;;vspltisw SCRATCH1,-1
+;;vslw SCRATCH2,SCRATCH1,SCRATCH1
+;;vandc %0,%1,SCRATCH2
+(define_expand "absv4sf2"
+  [(set (match_dup 2)
+   (vec_duplicate:V4SF (float:SF (const_int -1))))
+   (set (match_dup 3)
+(unspec:V4SF [(match_dup 2) (match_dup 2)] UNSPEC_VSLW))
+   (set (match_operand:V4SF 0 "register_operand" "=v")
+(and:V4SF (not:V4SF (match_dup 3))
+  (match_operand:V4SF 1 "register_operand" "v")))]
   "TARGET_ALTIVEC"
-  "vspltisw %2,-1\;vslw %3,%2,%2\;vandc %0,%1,%3"
-  [(set_attr "type" "vecsimple")
-   (set_attr "length" "12")])
+{
+  operands[2] = gen_reg_rtx (V4SFmode);
+  operands[3] = gen_reg_rtx (V4SFmode);
+})

However, the underlying abuse of RTL semantics when describing the
vspltisw instruction in V4SFmode apparently pre-dates this patch.


The easiest way to fix this would appear to use an UNSPEC to
describe the insn semantics.  Any better idea?


-- 
           Summary: Miscompile of Altivec vec_abs (float) inside loop
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: uweigand at gcc dot gnu dot org
GCC target triplet: powerpc*-*-*


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27842



[Bug rtl-optimization/27661] ICE in subst_reloads

2006-05-26 Thread uweigand at gcc dot gnu dot org


--- Comment #5 from uweigand at gcc dot gnu dot org  2006-05-26 20:22 
---
Subject: Bug 27661

Author: uweigand
Date: Fri May 26 20:21:53 2006
New Revision: 114141

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=114141
Log:
PR rtl-optimization/27661
* reload.c (find_reloads): When reloading a VOIDmode constant
as address due to an EXTRA_MEMORY_CONSTRAINT or 'o' constraint,
use Pmode as mode of the reload register.

PR rtl-optimization/27661
* gcc.dg/pr27661.c: New test case.

Added:
trunk/gcc/testsuite/gcc.dg/pr27661.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/reload.c
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27661



[Bug target/27772] mr instruction with odd-numbered register created

2006-05-26 Thread uweigand at gcc dot gnu dot org


--- Comment #2 from uweigand at gcc dot gnu dot org  2006-05-26 12:58 
---
This looks like a source-code problem.  The assembler instruction

 union {DItype __ll; struct {USItype __h, __l;} __i; } __x;
 __asm__ ("lr %N0,%1\n\tmr %0,%2" : "=&r" (__x.__ll)
  : "r" (__xm0), "r" (__xm1));

fundamentally assumes __ll is in fact of mode DImode, as the type name
DItype suggests -- that's (on 32-bit) what causes reload to allocate a
register *pair* for the %0 operand.

However, in your mul.i file, that type is defined as:

typedef long int DItype;

which happens to be in fact SImode ...
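
A source-level guard against this kind of mismatch could look like the
following C sketch.  It assumes a 32-bit word size for the target in
question; DItype_64 and is_double_word are hypothetical names, not part of
the code under discussion.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* A type that is 64 bits wide regardless of how wide "long" is.  */
typedef int64_t DItype_64;

/* Nonzero if an object of SIZE_IN_BYTES occupies exactly two machine
   words of WORD_BITS bits each, i.e. needs a register pair.  */
static int
is_double_word (size_t size_in_bytes, size_t word_bits)
{
  return size_in_bytes * CHAR_BIT == 2 * word_bits;
}
```

On a 32-bit target, is_double_word (sizeof (DItype_64), 32) holds, whereas
a 4-byte "long int" fails the check and would get a single register.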


-- 

uweigand at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||INVALID


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27772



[Bug rtl-optimization/27661] ICE in subst_reloads

2006-05-22 Thread uweigand at gcc dot gnu dot org


--- Comment #3 from uweigand at gcc dot gnu dot org  2006-05-22 13:27 
---
Looking somewhat more into this problem, there are other places where
reload decides to reload a CONST_INT as address.  Where this happens,
it usually uses Pmode as the mode to do the reload in (which makes
sense as Pmode should always be valid as the mode of an address).

Look e.g. at the various call sites of find_reloads_address_part.

However, at the place where the decision to reload a valid address
due to an EXTRA_MEMORY_CONSTRAINT or 'o' constraint is made, this
special treatment of VOIDmode constants is missing.  It looks to me
as though this was simply an oversight.

Thus, I'd propose something like the following patch (as of right
now completely untested) to replace VOIDmode by Pmode at that place
too.  This should fix the problem.

Index: gcc/reload.c
===
*** gcc/reload.c        (revision 113828)
--- gcc/reload.c        (working copy)
*** find_reloads (rtx insn, int replace, int
*** 3854,3864 
 && goal_alternative_offmemok[i]
 && MEM_P (recog_data.operand[i]))
  {
operand_reloadnum[i]
  = push_reload (XEXP (recog_data.operand[i], 0), NULL_RTX,
 &XEXP (recog_data.operand[i], 0), (rtx*) 0,
 base_reg_class (VOIDmode, MEM, SCRATCH),
!GET_MODE (XEXP (recog_data.operand[i], 0)),
 VOIDmode, 0, 0, i, RELOAD_FOR_INPUT);
rld[operand_reloadnum[i]].inc
  = GET_MODE_SIZE (GET_MODE (recog_data.operand[i]));
--- 3854,3872 
 && goal_alternative_offmemok[i]
 && MEM_P (recog_data.operand[i]))
  {
+   /* If the address to be reloaded is a VOIDmode constant,
+  use Pmode as mode of the reload register, as would have
+  been done by find_reloads_address.  */
+   enum machine_mode address_mode;
+   address_mode = GET_MODE (XEXP (recog_data.operand[i], 0));
+   if (address_mode == VOIDmode)
+ address_mode = Pmode;
+
operand_reloadnum[i]
  = push_reload (XEXP (recog_data.operand[i], 0), NULL_RTX,
 &XEXP (recog_data.operand[i], 0), (rtx*) 0,
 base_reg_class (VOIDmode, MEM, SCRATCH),
!address_mode,
 VOIDmode, 0, 0, i, RELOAD_FOR_INPUT);
rld[operand_reloadnum[i]].inc
  = GET_MODE_SIZE (GET_MODE (recog_data.operand[i]));


I'll test the patch and propose it on gcc-patches if it works out.
Could you verify this patch helps with your original testcase?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27661



[Bug target/27006] [4.1/4.2 Regression] Invalid altivec constant loading code

2006-04-13 Thread uweigand at gcc dot gnu dot org


--- Comment #10 from uweigand at gcc dot gnu dot org  2006-04-13 20:35 
---
Fixed for 4.1 and mainline.


-- 

uweigand at gcc dot gnu dot org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27006



[Bug target/27006] [4.1/4.2 Regression] Invalid altivec constant loading code

2006-04-13 Thread uweigand at gcc dot gnu dot org


--- Comment #9 from uweigand at gcc dot gnu dot org  2006-04-13 20:33 
---
Subject: Bug 27006

Author: uweigand
Date: Thu Apr 13 20:33:51 2006
New Revision: 112924

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=112924
Log:
2006-04-13  Paolo Bonzini  <[EMAIL PROTECTED]>
Ulrich Weigand  <[EMAIL PROTECTED]>

PR target/27006
* config/rs6000/rs6000.h (EASY_VECTOR_15_ADD_SELF): Require n
to be even.

PR target/27006
* gcc.dg/vmx/pr27006.c: New testcase.

Added:
branches/gcc-4_1-branch/gcc/testsuite/gcc.dg/vmx/pr27006.c
Modified:
branches/gcc-4_1-branch/gcc/ChangeLog
branches/gcc-4_1-branch/gcc/config/rs6000/rs6000.h
branches/gcc-4_1-branch/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27006



[Bug target/27006] [4.1/4.2 Regression] Invalid altivec constant loading code

2006-04-13 Thread uweigand at gcc dot gnu dot org


--- Comment #8 from uweigand at gcc dot gnu dot org  2006-04-13 20:27 
---
Subject: Bug 27006

Author: uweigand
Date: Thu Apr 13 20:26:59 2006
New Revision: 112923

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=112923
Log:
2006-04-13  Paolo Bonzini  <[EMAIL PROTECTED]>
Ulrich Weigand  <[EMAIL PROTECTED]>

PR target/27006
* config/rs6000/rs6000.h (EASY_VECTOR_15_ADD_SELF): Require n
to be even.

PR target/27006
* gcc.dg/vmx/pr27006.c: New testcase.


Added:
trunk/gcc/testsuite/gcc.dg/vmx/pr27006.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000.h
trunk/gcc/testsuite/ChangeLog


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27006



[Bug target/27006] [4.1/4.2 Regression] Invalid altivec constant loading code

2006-04-13 Thread uweigand at gcc dot gnu dot org


--- Comment #6 from uweigand at gcc dot gnu dot org  2006-04-13 11:47 
---
I've now tested and submitted the patch, thanks.


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added

                URL|                            |http://gcc.gnu.org/ml/gcc-patches/2006-04/msg00490.html
           Keywords|                            |patch


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27006



[Bug target/27006] [4.1/4.2 Regression] Invalid altivec constant loading code

2006-04-06 Thread uweigand at gcc dot gnu dot org


--- Comment #4 from uweigand at gcc dot gnu dot org  2006-04-06 14:03 
---
(In reply to comment #3)
> Ulrich, can you prepare a patch or should I do so?

It would be great if you could do that, I don't yet
have a proper setup for ppc testing ...


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27006



[Bug target/27006] New: Invalid altivec constant loading code

2006-04-03 Thread uweigand at gcc dot gnu dot org
When compiling the following code with -O0 -maltivec:

#include <stdio.h>

typedef union
{
  int i[4];
  __attribute__((altivec(vector__))) int v;
} vec_int4;

int main (void)
{
   vec_int4 i1;

   i1.v = (__attribute__((altivec(vector__))) int){31, 31, 31, 31};
   printf ("%d\n", i1.i[0]);

   return 0;
}

the output printed is 30, not 31.

The load of the vector constant is done by the following pair
of instructions:

vspltisw 0,15
vadduwm 0,0,0

which are generated by this splitter in altivec.md:

(define_split
  [(set (match_operand:VI 0 "altivec_register_operand" "")
        (match_operand:VI 1 "easy_vector_constant_add_self" ""))]
  "TARGET_ALTIVEC && reload_completed"
  [(set (match_dup 0) (match_dup 3))
   (set (match_dup 0) (plus:VI (match_dup 0)
                               (match_dup 0)))]
{
  rtx dup = gen_easy_altivec_constant (operands[1]);
  rtx const_vec;

  /* Divide the operand of the resulting VEC_DUPLICATE, and use
     simplify_rtx to make a CONST_VECTOR.  */
  XEXP (dup, 0) = simplify_const_binary_operation (ASHIFTRT, QImode,
                                                   XEXP (dup, 0), const1_rtx);
  const_vec = simplify_rtx (dup);

  if (GET_MODE (const_vec) == <MODE>mode)
    operands[3] = const_vec;
  else
    operands[3] = gen_lowpart (<MODE>mode, const_vec);
})

Now, easy_vector_constant_add_self accepts all constants between
16 and 31, where I think it should really only be accepting *even*
constants.

The test is really implemented in rs6000.h:

#define EASY_VECTOR_15_ADD_SELF(n) (!EASY_VECTOR_15((n))\
&& EASY_VECTOR_15((n) >> 1))

and adding a condition ((n) & 1) == 0 here fixes the problem.

Is this the proper solution?
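
To make the arithmetic concrete, here is a small host-side C sketch of
the predicate.  It assumes that EASY_VECTOR_15(n) means "n fits the
5-bit signed immediate of vspltisw", i.e. -16 <= n <= 15; that
definition is not quoted above.  The names ADD_SELF_OLD,
ADD_SELF_FIXED, split_result and split_is_exact are made up for
illustration and do not exist in rs6000.h.

```c
#include <stdbool.h>

/* Assumption: n fits the 5-bit signed immediate of vspltisw.  */
#define EASY_VECTOR_15(n) ((n) >= -16 && (n) <= 15)

/* Predicate as quoted from rs6000.h above.  */
#define ADD_SELF_OLD(n) (!EASY_VECTOR_15 ((n)) \
                         && EASY_VECTOR_15 ((n) >> 1))

/* Same predicate with the proposed evenness check added.  */
#define ADD_SELF_FIXED(n) (ADD_SELF_OLD (n) && ((n) & 1) == 0)

/* Element value produced by the split sequence: splat n >> 1
   (vspltisw), then add the register to itself (vadduwm).  */
static int
split_result (int n)
{
  int half = n >> 1;
  return half + half;
}

/* True if the split sequence reproduces the requested constant.  */
static bool
split_is_exact (int n)
{
  return split_result (n) == n;
}
```

For n = 31 the old predicate accepts the constant, yet
split_result (31) is 30, which matches the wrong output reported
above.  With the evenness check, 31 is rejected and 30 is still
accepted.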


-- 
   Summary: Invalid altivec constant loading code
   Product: gcc
   Version: 4.2.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: uweigand at gcc dot gnu dot org
GCC target triplet: powerpc-*-linux*


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27006



[Bug other/26208] Serious problem with unwinding through signal frames

2006-02-22 Thread uweigand at gcc dot gnu dot org


--- Comment #18 from uweigand at gcc dot gnu dot org  2006-02-22 09:57 
---
(In reply to comment #17)

> (e.g. s390/linux-unwind.h was doing that, although just for 2 selected
> signals, which  wasn't good enough, as e.g. all async signals need to be
> handled the same).

We've actually taken quite a bit of care to ensure that the position of
the PC on s390 can be understood deterministically in all cases.  The
synchronous signals come in two flavours:
- SIGSEGV and SIGBUS: here the PC points back to the instruction 
that we attempted and failed to execute -- returning from the signal
handler would by default re-execute the failed instruction
- SIGILL, SIGFPE, and SIGTRAP: here the PC points after the instruction
that caused the signal condition to be raised -- returning from the
signal handler would by default simply continue after that instruction

For all asynchronous signals, the PC points to the first instruction we
have not executed yet -- returning from the signal handler thus continues
execution with that instruction.

So you *need* to handle signals differently depending on what signal
it is -- I'm not sure I understand why you want to remove that.  What
we currently have definitely works correctly for all synchronous signals
on s390.

As for asynchronous signals, I guess it depends on what you want to see
happen here -- I'm not sure what it means to throw an exception from 
within an asynchronous signal handler.   For unwinding purposes, I guess
I can see why you would want the next instruction to show up in the
backtrace, so I wouldn't mind changing the 
  if (signal == SIGBUS || signal == SIGSEGV)
to
  if (signal != SIGILL && signal != SIGFPE && signal != SIGTRAP)
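
As a rough sketch of what that change would mean, the two conditions
can be written as stand-alone predicates.  The function names are
hypothetical and are not taken from the s390 unwinder; only the two
conditions themselves come from the text above.

```c
#include <signal.h>
#include <stdbool.h>

/* Current condition: only SIGBUS and SIGSEGV are treated as
   "the PC points at an instruction that has not completed".  */
static bool
pc_at_unexecuted_insn_current (int signo)
{
  return signo == SIGBUS || signo == SIGSEGV;
}

/* Suggested condition: every signal except the three synchronous
   ones whose PC already points after the instruction.  */
static bool
pc_at_unexecuted_insn_proposed (int signo)
{
  return signo != SIGILL && signo != SIGFPE && signo != SIGTRAP;
}
```

The two predicates agree on all five synchronous signals and differ
only for asynchronous ones such as SIGINT or SIGALRM, which the
suggested condition would treat like SIGSEGV and SIGBUS.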


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26208



[Bug other/26208] Serious problem with unwinding through signal frames

2006-02-10 Thread uweigand at gcc dot gnu dot org


--- Comment #5 from uweigand at gcc dot gnu dot org  2006-02-10 20:34 
---
(In reply to comment #4)
> Not all the targets have the luxury of spare register slots.
I guess we were lucky here ;-)

> So the current proposal is to add a new CIE augmentation that will signify
> a signal frame.
OK, I see.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26208



[Bug other/26208] Serious problem with unwinding through signal frames

2006-02-10 Thread uweigand at gcc dot gnu dot org


--- Comment #3 from uweigand at gcc dot gnu dot org  2006-02-10 20:00 
---
Yup.  See how this is handled in config/s390/linux-unwind.h:

  /* If we got a SIGSEGV or a SIGBUS, the PSW address points *to*
     the faulting instruction, not after it.  This causes the logic
     in unwind-dw2.c that decrements the RA to determine the correct
     CFI region to get confused.  To fix that, we *increment* the RA
     here in that case.  Note that we cannot modify the RA in place,
     and the frame state wants a *pointer*, not a value; thus we put
     the modified RA value into the unused register 33 slot of FS and
     have the register 32 save address point to that slot.

     Unfortunately, for regular signals on old kernels, we don't know
     the signal number.  We default to not fiddling with the RA;
     that can fail in rare cases.  Upgrade your kernel.  */

  if (signo && (*signo == 11 || *signo == 7))
    {
      fs->regs.reg[33].loc.exp =
        (unsigned char *)regs->psw_addr + 1;
      fs->regs.reg[32].loc.offset =
        (long)&fs->regs.reg[33].loc.exp - new_cfa;
    }
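
The reason the increment matters can be shown with a toy model of the
region lookup.  This is only a sketch: struct cfi_region and
region_covers_ra are hypothetical stand-ins, not the data structures
used by unwind-dw2.c; the only behaviour taken from the comment above
is that the lookup is done with the RA decremented by one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one CFI region: the half-open
   address range [start, end).  */
struct cfi_region
{
  uintptr_t start;
  uintptr_t end;
};

/* Model of the lookup described above: the search uses RA - 1, so
   that a return address just past a call at the very end of a
   region is still attributed to that region.  */
static bool
region_covers_ra (const struct cfi_region *r, uintptr_t ra)
{
  uintptr_t lookup = ra - 1;
  return lookup >= r->start && lookup < r->end;
}
```

If the faulting instruction is the first one in a region, passing its
own address as the RA makes the lookup land one byte before the
region; passing the address plus one, as the code above does, lands
inside it.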


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added

                 CC|                            |uweigand at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26208



[Bug ada/26096] [4.2 Regression] Ada bootstrap fails in g-alleve.adb

2006-02-08 Thread uweigand at gcc dot gnu dot org


--- Comment #10 from uweigand at gcc dot gnu dot org  2006-02-08 22:36 
---
(In reply to comment #9)
> The first 3 are so well-understood as to be fixed on my machine. :-)  We are
> working on the 4th.

Excellent!

> > Will you be committing the patch, or is this not the proper fix?
> 
> It's the fix.  Sorry for the delay in applying it, we've been a bit busy
> lately.

No problem -- thanks a lot for providing the fix so quickly!


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26096



[Bug ada/26096] [4.2 Regression] Ada bootstrap fails in g-alleve.adb

2006-02-08 Thread uweigand at gcc dot gnu dot org


--- Comment #8 from uweigand at gcc dot gnu dot org  2006-02-08 21:44 
---
The spurious failures are always in different test cases for me as well ...

In fact, I now did a re-test and only see the four well-understood failures:
FAIL:   c32001e
FAIL:   c64105b
FAIL:   c95086b
FAIL:   ce3810b

Will you be committing the patch, or is this not the proper fix?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26096



[Bug tree-optimization/26169] [4.2 Regression] ICE in duplicate_ssa_name

2006-02-08 Thread uweigand at gcc dot gnu dot org


--- Comment #5 from uweigand at gcc dot gnu dot org  2006-02-08 16:10 
---
FYI -- this also breaks bootstrap on s390-ibm-linux and s390x-ibm-linux:

../../../gcc-head/libgfortran/io/unit.c: In function 'find_unit_1':
../../../gcc-head/libgfortran/io/unit.c:269: internal compiler error: tree
check: expected ssa_name, have struct_field_tag in duplicate_ssa_name, at
tree-ssanames.c:247
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.


-- 

uweigand at gcc dot gnu dot org changed:

           What    |Removed                     |Added

                 CC|                            |uweigand at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26169



[Bug ada/26096] [4.2 Regression] Ada bootstrap fails in g-alleve.adb

2006-02-04 Thread uweigand at gcc dot gnu dot org


--- Comment #5 from uweigand at gcc dot gnu dot org  2006-02-04 20:16 
---
(In reply to comment #4)

> Thanks.  ce3107b is new to me but all the others are fully understood.

It looks like ce3107b is one of those spurious failures I'm getting from
time to time -- I've never quite understood what's going on here, but it
looks like a test suite issue:


 CE3107A PASSED .
PASS:   ce3107a
splitting
/home/uweigand/fsf/gcc-head-build/gcc/testsuite/ada/acats/tests/ce/ce3107b.ada
into:
   ce3107b.adb
BUILD
FAIL:   ce3107b
splitting
/home/uweigand/fsf/gcc-head-build/gcc/testsuite/ada/acats/tests/ce/ce3108a.ada
into:
   ce3108a.adb
BUILD ce3108a.adb
gnatmake --GCC="/home/uweigand/fsf/gcc-head-build/gcc/xgcc
-B/home/uweigand/fsf/gcc-head-build/gcc/" -gnatws -O2
-I/home/uweigand/fsf/gcc-head-build/gcc/testsuite/ada/acats/support ce3108a.adb
-largs --GCC="/home/uweigand/fsf/gcc-head-build/gcc/xgcc
-B/home/uweigand/fsf/gcc-head-build/gcc/"


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26096


