[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-30 Thread chaoyingfu at gcc dot gnu dot org


--- Comment #39 from chaoyingfu at gcc dot gnu dot org  2006-12-01 01:04 
---
Subject: Bug 29680

Author: chaoyingfu
Date: Fri Dec  1 01:01:21 2006
New Revision: 119392

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=119392
Log:
Merged revisions 118654-118785 via svnmerge from 
svn+ssh://[EMAIL PROTECTED]/svn/gcc/trunk


  r118654 | jakub | 2006-11-10 07:50:39 -0800 (Fri, 10 Nov 2006) | 3 lines

* config/locale/gnu/c_locale.cc (__convert_to_v): Prefer
strtold_l over __strtold_l if available.

  r118659 | pault | 2006-11-10 09:21:57 -0800 (Fri, 10 Nov 2006) | 12 lines

  2006-11-10 Paul Thomas [EMAIL PROTECTED]

PR fortran/29315
* trans-expr.c (is_aliased_array): Treat correctly the case where the
component is itself an array or array reference.


  2006-11-10 Paul Thomas [EMAIL PROTECTED]

PR fortran/29315
* gfortran.dg/aliasing_dummy_4.f90: New test.

  r118661 | burnus | 2006-11-10 10:15:39 -0800 (Fri, 10 Nov 2006) | 6 lines

  2006-11-10  Tobias Burnus  [EMAIL PROTECTED]

 PR fortran/29454
 * resolve.c (gfc_resolve_blocks): Fix error message.

  r118662 | fche | 2006-11-10 10:42:28 -0800 (Fri, 10 Nov 2006) | 14 lines

  2006-11-10  Frank Ch. Eigler  [EMAIL PROTECTED]

PR libmudflap/28578
* mf-hooks1.c (__mf_0fn_malloc): Make the bootstrap buffers
static but not function scope static.
(free): Skip deallocation attempts for objects placed into
bootstrap buffers.
* testsuite/libmudflap.cth/pass59-frag.c: New test.


  M libmudflap/mf-hooks1.c
  M libmudflap/ChangeLog
  A libmudflap/testsuite/libmudflap.cth/pass59-frag.c

  r118664 | pault | 2006-11-10 13:06:42 -0800 (Fri, 10 Nov 2006) | 16 lines

  2006-11-10 Paul Thomas [EMAIL PROTECTED]

 PR fortran/29758
 * check.c (gfc_check_reshape): Check that there are enough
 elements in the source array as to be able to fill an array
 defined by shape, when pad is absent.


  2006-11-10 Paul Thomas [EMAIL PROTECTED]

 PR fortran/29758
 * gfortran.dg/reshape_source_size_1.f90: New test.

  r118665 | hubicka | 2006-11-10 13:42:04 -0800 (Fri, 10 Nov 2006) | 9 lines

* cse.c (cse_process_notes): Copy the propagated value.
* local-alloc.c (update_equiv_regs): Copy the memory RTX to be used
in REG_EQUIV notes.
* gcse.c (try_replace_reg): Copy the replacement.
* i386.c (emit_i387_cw_initialization): Copy stored_mode
(assign_386_stack_local): Always return copied memory expression
* function.c (instantiate_virtual_regs_in_insn): Copy the operand
duplicates.

  r118668 | brooks | 2006-11-10 14:34:26 -0800 (Fri, 10 Nov 2006) | 9 lines

  * lang.opt (-fmodule-private): Remove option.
  * gfortran.h (gfc_option_t): Remove module_access_private flag.
  * options.c (gfc_init_options): Remove initialization for it.
  (gfc_process_option): Remove handling for -fmodule-private.
  * module.c (gfc_check_access): Add comments, remove check for
  gfc_option.flag_module_access_private.

  (Also fixed tab-damage in preceding changelog entry.)

  r118670 | brooks | 2006-11-10 15:43:05 -0800 (Fri, 10 Nov 2006) | 3 lines

  Corrected gfc_process_option to gfc_handle_option in my last
  ChangeLog entry.

  r118676 | gccadmin | 2006-11-10 16:17:31 -0800 (Fri, 10 Nov 2006) | 1 line

  Daily bump.

  r118678 | sayle | 2006-11-10 17:47:18 -0800 (Fri, 10 Nov 2006) | 7 lines


* tree.c (build_int_cst_wide): Add an assertion (gcc_unreachable)
when attempting to build INTEGER_CSTs of non-integral types.
* expmed.c (make_tree): Use the correct type, i.e. the inner
type, when constructing the individual elements of a CONST_VECTOR.

  r118682 | ghazi | 2006-11-10 20:01:42 -0800 (Fri, 10 Nov 2006) | 6 lines

* fold-const.c (negate_mathfn_p): Add BUILT_IN_ERF.

  testsuite:
* gcc.dg/torture/builtin-symmetric-1.c: New test.

  r118683 | ghazi | 2006-11-10 20:05:14 -0800 (Fri, 10 Nov 2006) | 8 lines

* builtins.c (fold_builtin_cos): Use fold_strip_sign_ops().
(fold_builtin_hypot): Likewise.
* fold-const.c (fold_strip_sign_ops): Handle odd builtins.

  testsuite:
* gcc.dg/builtins-20.c: Add more cases for stripping sign ops.

  r118684 | bergner | 2006-11-10 20:20:37 -0800 (Fri, 10 Nov 2006) | 3 lines

* rtl.h (MEM_COPY_ATTRIBUTES): Copy MEM_POINTER.

  r118685 | sayle | 2006-11-10 21:00:10 -0800 (Fri, 10 Nov 2006) | 5 lines


* fold-const.c (operand_equal_p) INTEGER_CST, REAL_CST, VECTOR_CST:
Don't check for TREE_CONSTANT_OVERFLOW when comparing constants.

  r118686 | jiez | 2006-11-10 23:48:33 -0800 (Fri, 10 Nov 2006) | 3 lines

* config/bfin/bfin.h (FUNCTION_PROFILER): Don't use LABELNO.
(NO_PROFILE_COUNTERS): Define as 1.

  

[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-13 Thread rakdver at gcc dot gnu dot org


--- Comment #37 from rakdver at gcc dot gnu dot org  2006-11-13 12:37 
---
Subject: Bug 29680

Author: rakdver
Date: Mon Nov 13 12:37:29 2006
New Revision: 118754

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=118754
Log:
PR tree-optimization/29680
* tree-ssa-operands.c (access_can_touch_variable): Revert fix for
PR 14784.

* gcc.dg/alias-11.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/alias-11.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-operands.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-13 Thread pinskia at gcc dot gnu dot org


--- Comment #38 from pinskia at gcc dot gnu dot org  2006-11-13 13:54 
---
Fixed.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-12 Thread rakdver at gcc dot gnu dot org


--- Comment #36 from rakdver at gcc dot gnu dot org  2006-11-12 17:33 
---
(In reply to comment #19)
> Created an attachment (id=12574)
> --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12574&action=view) [edit]
> A patch
>
> This reverts the patch which triggers the problem and adds a testcase. I
> am running SPEC CPU 2006 now.

I am going to commit this patch for now (once it passes bootstrap &
regtesting).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread rakdver at gcc dot gnu dot org


--- Comment #20 from rakdver at gcc dot gnu dot org  2006-11-09 11:16 
---
> I am playing with some ideas how to fix this, unless I come up with something
> soon, I will revert the patch (except for the testcase that I would like to
> remain in the testsuite).

The best I was able to do is the following patch.  Virtual operand pruning
removes all the symbols that the SMTs have in common, which causes this PR.
The patch adds artificial conflict symbols to all pairs of aliasing SMTs, to
avoid this.  Just looking at the dump of the testcase for this PR, it appears
quite expensive (there are a lot of new virtual operands); I will check what
the memory behavior is on larger testcases.
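
To make the failure mode and the proposed workaround concrete, here is a minimal sketch in Python (it is not GCC code). The tag names, the alias sets, and the `can_touch` predicate are all hypothetical, and the assumption that pruning never removes a conflict symbol is an assumption of this sketch, not something the patch below is confirmed to guarantee.

```python
from itertools import combinations

def prune(operands, can_touch):
    """Keep only the symbols the access is judged able to touch."""
    return {sym for sym in operands if can_touch(sym)}

def add_conflict_symbols(alias_sets):
    """Add one artificial symbol to every pair of tags whose alias sets intersect."""
    result = {smt: set(syms) for smt, syms in alias_sets.items()}
    for a, b in combinations(sorted(alias_sets), 2):
        if alias_sets[a] & alias_sets[b]:
            tag = f"CONFLICT.{a}.{b}"
            result[a].add(tag)
            result[b].add(tag)
    return result

# Hypothetical alias sets: both memory tags alias the symbol "g".
alias_sets = {"SMT.1": {"g", "x"}, "SMT.2": {"g", "y"}}

# Hypothetical pruning decision: neither access is judged able to touch "g".
def can_touch(sym):
    return sym != "g"

pruned_before = {smt: prune(syms, can_touch) for smt, syms in alias_sets.items()}
with_tags = add_conflict_symbols(alias_sets)
pruned_after = {smt: prune(syms, can_touch) for smt, syms in with_tags.items()}

# Without conflict symbols the two accesses no longer share any operand.
shared_before = pruned_before["SMT.1"] & pruned_before["SMT.2"]
# With them, the artificial symbol survives pruning and still links the accesses.
shared_after = pruned_after["SMT.1"] & pruned_after["SMT.2"]
print(shared_before, shared_after)
```

With n mutually aliasing tags this creates n(n-1)/2 extra symbols, which is consistent with the concern about cost expressed above.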

Index: tree-dump.c
===
*** tree-dump.c (revision 118619)
--- tree-dump.c (working copy)
*** dequeue_and_dump (dump_info_p di)
*** 495,500 
--- 495,501 
  case SYMBOL_MEMORY_TAG:
  case NAME_MEMORY_TAG:
  case STRUCT_FIELD_TAG:
+ case CONFLICT_TAG:
break;

  case VAR_DECL:
Index: tree-pretty-print.c
===
*** tree-pretty-print.c (revision 118619)
--- tree-pretty-print.c (working copy)
*** dump_generic_node (pretty_printer *buffe
*** 849,854 
--- 849,855 
break;

  case SYMBOL_MEMORY_TAG:
+ case CONFLICT_TAG:
  case NAME_MEMORY_TAG:
  case STRUCT_FIELD_TAG:
  case VAR_DECL:
Index: tree.c
===
*** tree.c  (revision 118619)
--- tree.c  (working copy)
*** init_ttree (void)
*** 270,279 
--- 270,281 
tree_contains_struct[STRUCT_FIELD_TAG][TS_DECL_MINIMAL] = 1;
tree_contains_struct[NAME_MEMORY_TAG][TS_DECL_MINIMAL] = 1;
tree_contains_struct[SYMBOL_MEMORY_TAG][TS_DECL_MINIMAL] = 1;
+   tree_contains_struct[CONFLICT_TAG][TS_DECL_MINIMAL] = 1;

tree_contains_struct[STRUCT_FIELD_TAG][TS_MEMORY_TAG] = 1;
tree_contains_struct[NAME_MEMORY_TAG][TS_MEMORY_TAG] = 1;
tree_contains_struct[SYMBOL_MEMORY_TAG][TS_MEMORY_TAG] = 1;
+   tree_contains_struct[CONFLICT_TAG][TS_MEMORY_TAG] = 1;

tree_contains_struct[STRUCT_FIELD_TAG][TS_STRUCT_FIELD_TAG] = 1;

*** tree_code_size (enum tree_code code)
*** 336,341 
--- 338,344 
return sizeof (struct tree_function_decl);
  case NAME_MEMORY_TAG:
  case SYMBOL_MEMORY_TAG:
+ case CONFLICT_TAG:
return sizeof (struct tree_memory_tag);
  case STRUCT_FIELD_TAG:
return sizeof (struct tree_struct_field_tag);
*** tree_node_structure (tree t)
*** 2119,2124 
--- 2122,2128 
  case SYMBOL_MEMORY_TAG:
  case NAME_MEMORY_TAG:
  case STRUCT_FIELD_TAG:
+ case CONFLICT_TAG:
return TS_MEMORY_TAG;
  default:
return TS_DECL_NON_COMMON;
Index: tree.h
===
*** tree.h  (revision 118619)
--- tree.h  (working copy)
*** extern const enum tree_code_class tree_c
*** 106,112 
  #define MTAG_P(CODE) \
(TREE_CODE (CODE) == STRUCT_FIELD_TAG   \
 || TREE_CODE (CODE) == NAME_MEMORY_TAG \
!|| TREE_CODE (CODE) == SYMBOL_MEMORY_TAG)


  /* Nonzero if DECL represents a VAR_DECL or FUNCTION_DECL.  */
--- 106,113 
  #define MTAG_P(CODE) \
(TREE_CODE (CODE) == STRUCT_FIELD_TAG   \
 || TREE_CODE (CODE) == NAME_MEMORY_TAG \
!|| TREE_CODE (CODE) == SYMBOL_MEMORY_TAG   \
!|| TREE_CODE (CODE) == CONFLICT_TAG)


  /* Nonzero if DECL represents a VAR_DECL or FUNCTION_DECL.  */
Index: tree-ssa-alias.c
===
*** tree-ssa-alias.c(revision 118619)
--- tree-ssa-alias.c(working copy)
*** init_alias_info (void)
*** 915,920 
--- 915,925 
  || TREE_CODE (var) == SYMBOL_MEMORY_TAG
  || !is_global_var (var))
clear_call_clobbered (var);
+ 
+ /* Mark all old conflict symbols for renaming, so that they go
+away.  */
+ if (TREE_CODE (var) == CONFLICT_TAG)
+   mark_sym_for_renaming (var);
}

/* Clear flow-sensitive points-to information from each SSA name.  */
*** compute_flow_sensitive_aliasing (struct 
*** 1149,1154 
--- 1154,1265 
  }
  }

+ /* Element of the conflicts hashtable.  */
+ 
+ typedef struct
+ {
+   hashval_t hash;
+   tree smt1, smt2;
+ } smt_pair;
+ 
+ typedef smt_pair *smt_pair_p;
+ 
+ /* Return true if the smt pairs are equal.  */
+ 
+ static int
+ smt_pair_eq (const void *va, const void *vb)
+ {
+   const smt_pair_p a = (const smt_pair_p) va;
+   const smt_pair_p b = (const smt_pair_p) vb;
+   return (a->smt1 == b->smt1 && a->smt2 == b->smt2);
+ }
+ 
+ /* Hash for a smt pair.  */
+ 
+ static unsigned 

[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread dberlin at dberlin dot org


--- Comment #21 from dberlin at gcc dot gnu dot org  2006-11-09 15:06 
---
Subject: Re:  [4.3 Regression] Misscompilation of spec2006 gcc

On 9 Nov 2006 11:16:12 -, rakdver at gcc dot gnu dot org
[EMAIL PROTECTED] wrote:


> --- Comment #20 from rakdver at gcc dot gnu dot org  2006-11-09 11:16 ---
> > I am playing with some ideas how to fix this, unless I come up with something
> > soon, I will revert the patch (except for the testcase that I would like to
> > remain in the testsuite).
>
> The best I was able to do is the following patch.  Virtual operand pruning
> removes all the symbols that the SMTs have in common, which causes this PR.
> The patch adds artificial conflict symbols to all pairs of aliasing SMTs, to
> avoid this.

This is what I was trying to do originally with multiple NONLOCAL
symbols (SMT's are going to go away) in the other testcase.

It is just too expensive generally

One thing i'm going to try later is to try to partition all the
stores/load statements and figure out how many symbols it takes to
represent the results exactly (IE one symbol for each set of
statements that must interfere with each other, where each statement
can be in multiple partitions).

IE if you had

load/store statements a, b, c, d

a interferes with c and d but not b

b interferes with d

You get partitions:

part1: {a, c, d}
part2: {b, d}

We then just create two symbols, and use those as the vdef/vuse syms.

This scheme is N^2 worst case, but you can choose to unify partitions
to cut down the number of symbols.
Partitions that have no shared members can also share symbols.
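
The trade-off behind unifying partitions can be sketched as follows. This is an illustration in Python under stated assumptions, not an implementation of anything in GCC: the helper name `linked` and the merged symbol name are made up, and the partitions are the two from the example above.

```python
def linked(partitions, x, y):
    """True if some partition symbol is attached to both statements."""
    return any(x in members and y in members for members in partitions.values())

# The precise partitions from the example above.
precise = {"part1": {"a", "c", "d"}, "part2": {"b", "d"}}

# Unifying both partitions leaves a single symbol to rename ...
unified = {"part1+part2": precise["part1"] | precise["part2"]}

# ... but a and b, which do not interfere, now appear to.
a_b_precise = linked(precise, "a", "b")
a_b_unified = linked(unified, "a", "b")
print(a_b_precise, a_b_unified)
```

Fewer symbols means fewer virtual operands per statement, at the price of dependencies that the precise partitioning had ruled out.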

This would unify all of our points-to/access_can_touch_var/etc results
into one nice framework that gets very good results, and avoid the
virtual operand explosion, i think.

The real thing is that this is probably too expensive to compute 5
times per function.

We'll see.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread dnovillo at redhat dot com


--- Comment #22 from dnovillo at redhat dot com  2006-11-09 15:08 ---
Subject: Re:  [4.3 Regression] Misscompilation
 of spec2006 gcc

Daniel Berlin wrote on 11/09/06 10:05:

> One thing i'm going to try later is to try to partition all the
> stores/load statements and figure out how many symbols it takes to
> represent the results exactly (IE one symbol for each set of
> statements that must interfere with each other, where each statement
> can be in multiple partitions).
 
This is what I'm doing in memory SSA.  More details later this week 
after I'm done testing and such.  The difference is that partitioning is 
embedded in the actual SSA form and the partitioning heuristic can be 
changed independently of the renamer.  This will let us play with a 
slider-style throttling for precision/compile-time.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread hjl at lucon dot org


--- Comment #23 from hjl at lucon dot org  2006-11-09 15:47 ---
(In reply to comment #20)
> > I am playing with some ideas how to fix this, unless I come up with something
> > soon, I will revert the patch (except for the testcase that I would like to
> > remain in the testsuite).
>
> The best I was able to do is the following patch.  Virtual operand pruning
> removes all the symbols that the SMTs have in common, which causes this PR.
> The patch adds artificial conflict symbols to all pairs of aliasing SMTs, to
> avoid this.  Just looking at the dump of the testcase for this PR, it appears
> quite expensive (there are a lot of new virtual operands); I will check what
> the memory behavior is on larger testcases.
 

It failed during bootstrap:

/net/gnu-13/export/gnu/src/gcc/gcc/gcc/tree-dfa.c: In function
‘find_referenced_vars’:
/net/gnu-13/export/gnu/src/gcc/gcc/gcc/tree-dfa.c:99: internal compiler error:
in mark_operand_necessary, at tree-ssa-dce.c:261
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
make[5]: *** [tree-dfa.o] Error 1


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread dberlin at dberlin dot org


--- Comment #24 from dberlin at gcc dot gnu dot org  2006-11-09 17:22 
---
Subject: Re:  [4.3 Regression] Misscompilation of spec2006 gcc

On 11/9/06, Diego Novillo [EMAIL PROTECTED] wrote:
> Daniel Berlin wrote on 11/09/06 10:05:
>
> > One thing i'm going to try later is to try to partition all the
> > stores/load statements and figure out how many symbols it takes to
> > represent the results exactly (IE one symbol for each set of
> > statements that must interfere with each other, where each statement
> > can be in multiple partitions).
>
> This is what I'm doing in memory SSA.  More details later this week
> after I'm done testing and such.  The difference is that partitioning is
> embedded in the actual SSA form and the partitioning heuristic can be
> changed independently of the renamer.  This will let us play with a
> slider-style throttling for precision/compile-time.

Right, but the difference is, In the scheme i propose, you'd never
have overlapping live ranges of vuse/vdefs, and in mem-ssa, you do.
IE we wouldn't run into all the problems mem-ssa is going to bring in
this regard.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread dnovillo at redhat dot com


--- Comment #25 from dnovillo at redhat dot com  2006-11-09 17:38 ---
Subject: Re:  [4.3 Regression] Misscompilation
 of spec2006 gcc

Daniel Berlin wrote on 11/09/06 12:22:

> Right, but the difference is, In the scheme i propose, you'd never
> have overlapping live ranges of vuse/vdefs, and in mem-ssa, you do.
> IE we wouldn't run into all the problems mem-ssa is going to bring in
> this regard.

No, that's not right.  Overlapping live-ranges are not a problem until
you hit a PHI node.  That's where mem-ssa is currently having
difficulties.

We can use those static partitions at key points during SSA renaming.
Since the partitions are completely unrelated to the renamer, we can
experiment with different partitioning schemes.  It's actually even
possible to arrive at a perfect partitioning scheme that doesn't
introduce false positive dependencies.

More details to follow.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread rakdver at atrey dot karlin dot mff dot cuni dot cz


--- Comment #26 from rakdver at atrey dot karlin dot mff dot cuni dot cz  
2006-11-09 18:03 ---
Subject: Re:  [4.3 Regression] Misscompilation of spec2006 gcc

> > Right, but the difference is, In the scheme i propose, you'd never
> > have overlapping live ranges of vuse/vdefs, and in mem-ssa, you do.
> > IE we wouldn't run into all the problems mem-ssa is going to bring in
> > this regard.
>
> No, that's not right.  Overlapping live-ranges are not a problem until
> you hit a PHI node.  That's where currently mem-ssa is having
> difficulties with.

well, in any case, Daniel's proposal has the advantage that it is much less
intrusive than mem-ssa -- it does not need to change ssa renaming at all,
probably needs far fewer changes to operand scanning, and does not need
any changes to optimizations that assume vops are in FUD form (i.e.,
that the live ranges of vops do not overlap).  If he could create (or help
someone create) a working prototype in a reasonable time (a few weeks?),
I would very much like to see it compared with mem-ssa before the mem-ssa
branch is merged.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



Re: [Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread Daniel Berlin

A detailed proposal:

So here is what i was thinking of.  When i say symbols below, I mean
some VAR_DECL or structure that has a name (like our memory tags
do).  A symbol is *not* a real variable that occurred in the user
program.  When I say variable i mean a variable that occurred in the
user program.

The real problem with our alias system in terms of precision, and
often in terms of number of useless vops, is that we are trying to use
real, existing, variables, to approximate the portions of the heap a
statement accesses.

When things access portions of the heap we can't see (nonlocal
variables), we fall down badly in terms of precision because we can
eliminate every single local variable as an alias, and need to then
just say it accesses some nonlocal variable.  This causes precision
problems because it means that statements accessing nonlocal variables
that we can *prove* don't interfere, still currently share a VUSE
between them.

We also have useless vops whenever we have points-to sets that
intersect between all statements that interfere, because we end up
adding aliases for you can eliminate the members of the alias set

We also currently rely on operand-scan time pruning, which is very ugly.

There is a way to get the minimal number of vuses/vdefs necessary to
represent completely precise (in terms of info we have) aliasing,
while still being able to degrade the precision gracefully in order to
bring the number of vuses/vdefs down.

The scheme i propose *never* has overlapping live ranges of the
individual symbols, even though the symbols may represent pieces of
the same heap.

In other words, you can rely on the fact that once an individual
symbol has a new version, there will never be a vuse of an old version
of that symbol.



The current vdef/vuse scheme consists of creating memory tags to
represent portions of the heap.  When a memory tag has aliases, we use
its alias list to generate virtual operands.  When a memory tag does
not have aliases, we generate a virtual operand of the base symbol.

The basic idea in the new scheme is to never have a list of aliases
for a symbol representing portions of the heap.  The symbols
representing portions of the heap are themselves always the target of
a vuse/vdef.  The aliases they represent are immaterial (though we can
keep a list if something wants it).

This enables us to have a smaller number of vops, and have something
else generate the set of symbols in a precise manner, rather than have
things like the operand scanner try to post process it.

The symbols are also attached to the load/store statements, and not to
the variables.

The operand renamer only has to add vuses/vdefs for all the symbols
attached to a statement, and it is done.

In the simplest, dumb, non-precise version of this scheme, this means
you only have one symbol, called MEM, and generate vuse/vdefs
linking every load/store together.
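
As a rough illustration of the single-symbol case, the sketch below versions one symbol, MEM, across a straight-line list of statements. It is Python, not GCC code; the statement names, the `rename` helper, and the textual operand notation are invented for this example, and control flow (and therefore PHI nodes) is deliberately left out.

```python
def rename(statements, symbols_of):
    """Attach versioned virtual operands to a straight-line list of statements.

    statements: list of (name, kind) pairs, where kind is "load" or "store".
    symbols_of: maps a statement name to the set of symbols attached to it.
    """
    version = {}  # current version number of each symbol
    out = []
    for name, kind in statements:
        for sym in sorted(symbols_of[name]):
            cur = version.get(sym, 0)
            if kind == "store":
                # A store defines a new version and kills the old one.
                version[sym] = cur + 1
                out.append(f"{name}: {sym}_{cur + 1} = VDEF <{sym}_{cur}>")
            else:
                # A load always uses the current (latest) version.
                out.append(f"{name}: VUSE <{sym}_{cur}>")
    return out

stmts = [("s1", "store"), ("s2", "load"), ("s3", "store"), ("s4", "load")]
one_symbol = {name: {"MEM"} for name, _ in stmts}
lines = rename(stmts, one_symbol)
print("\n".join(lines))
```

Every load is chained to the most recent store, whether or not the two can actually touch the same memory, and no use ever refers to a version older than the latest definition.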

In the absolute most-precise version of this scheme, you partition the
loads/store conflicts in statements into symbols that represent
statement conflictingness.

In a completely naive, O(N^3) version, the following algorithm will
work and generate completely precise results:

Collect all loads and stores into a list (lslist)
for each statement in lslist (x):
  for each statement in lslist (y):
    if x conflicts with y:
      if there is no partition for x, y, create a new one containing x and y.
      otherwise
        for every partition y belongs to:
          if all members of this partition have memory access that
          conflicts with x:
            add x to this partition
          otherwise
            create a new partition containing all members of the
            partition except the ones x does not conflict with.
            add x to this partition


This is a very very slow way to do it, but it should be clear (there
are much much much faster ways to do this).
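
The property stated in the next paragraph, that all members of a partition conflict with each other, is that of a clique in the conflict graph. As an illustration only, the sketch below swaps in a different, standard technique (Bron-Kerbosch enumeration of maximal cliques) rather than transcribing the loop above. It is Python, the conflict relation is a made-up symmetric one rather than the example from this message, and since the number of maximal cliques can grow exponentially it is only meant for small inputs.

```python
def maximal_cliques(conflicts):
    """Return all maximal sets of mutually conflicting statements.

    conflicts: dict mapping each statement to the set of statements it
    conflicts with (assumed symmetric, with no self-conflicts).
    """
    cliques = []

    def expand(clique, candidates, excluded):
        # No candidates and nothing excluded: the clique cannot be extended.
        if not candidates and not excluded:
            cliques.append(frozenset(clique))
            return
        for v in sorted(candidates):
            expand(clique | {v}, candidates & conflicts[v], excluded & conflicts[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(conflicts), set())
    return sorted(cliques, key=sorted)

# Hypothetical conflict relation: a-c, a-d, c-d and b-d conflict.
conflicts = {
    "a": {"c", "d"},
    "b": {"d"},
    "c": {"a", "d"},
    "d": {"a", "b", "c"},
}
parts = maximal_cliques(conflicts)
print([sorted(p) for p in parts])
```

Each resulting set would get one symbol, and a statement such as d that appears in two sets would carry two symbols.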

Basically, a single load/store statement can belong to multiple
partitions.  All members of a given partition conflict with each
other.

Given the following set of memory access statements:

a, b, c, d

where:
a conflicts with b and c
b conflicts with c and d
c conflicts with a and b
d conflicts with a and c

you will end up with 3 partitions:
part1: {a, b, c}
part2: {b, c, d}
part3: {d, a, c}

statement c will conflict with every member of partition 1 and thus
get partition 1, rather than a new partition.

You now create symbols for each partition, and for each statement in
the partition, add the symbol to its list.

Thus, in the above example we get

statement a - symbols: MEM.PART1, MEM.PART3
statement b - symbols: MEM.PART1, MEM.PART2
statement c - symbols: MEM.PART1, MEM.PART2, MEM.PART3
statement d - symbols: MEM.PART2, MEM.PART3

As mentioned before, the operand renamer simply adds a vdef/vuse for
each symbol in the statement list.
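
The per-statement symbol lists above follow mechanically from the three partitions. A minimal sketch in Python (the function name `symbol_lists` is made up, and the partitions are taken as given from the example rather than computed):

```python
def symbol_lists(partitions):
    """Invert partition membership: statement -> ordered list of symbols."""
    lists = {}
    for symbol in sorted(partitions):
        for stmt in partitions[symbol]:
            lists.setdefault(stmt, []).append(symbol)
    return lists

# The three partitions from the example, named after their symbols.
partitions = {
    "MEM.PART1": {"a", "b", "c"},
    "MEM.PART2": {"b", "c", "d"},
    "MEM.PART3": {"d", "a", "c"},
}
lists = symbol_lists(partitions)
for stmt in sorted(lists):
    print(f"statement {stmt} - symbols: {', '.join(lists[stmt])}")
```

The length of each list is the number of virtual operands the renamer would add to that statement, so statement c is the most expensive one here.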

Note that this is the minimal number of symbols necessary to precisely
represent the conflicting accesses.

If the number of partitions grows 

[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread dberlin at dberlin dot org


--- Comment #27 from dberlin at gcc dot gnu dot org  2006-11-09 18:21 
---
Subject: Re:  [4.3 Regression] Misscompilation of spec2006 gcc


[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread dnovillo at gcc dot gnu dot org


--- Comment #28 from dnovillo at gcc dot gnu dot org  2006-11-09 19:15 
---
(In reply to comment #26)

> I would very much like to see it compared with mem-ssa before mem-ssa
> branch is merged.
 
Notice that the two approaches do not negate each other.  Dan's proposal is a
smarter partitioning than what the current alias grouping mechanism  tries to
do.  We can actually have memory SSA on top of Dan's partitioning scheme. 
Memory SSA will use that partitioning scheme when placing memory PHI nodes.

The two approaches are orthogonal.  Memory SSA simply adds a new degree of
factoring that gives you sparser UD chains.  It also gives you exactly one name
per store, without reducing precision.


-- 

dnovillo at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dnovillo at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread rakdver at atrey dot karlin dot mff dot cuni dot cz


--- Comment #29 from rakdver at atrey dot karlin dot mff dot cuni dot cz  
2006-11-09 19:41 ---
Subject: Re:  [4.3 Regression] Misscompilation of spec2006 gcc

> > I would very much like to see it compared with mem-ssa before mem-ssa
> > branch is merged.
>
> Notice that the two approaches do not negate each other.  Dan's proposal is a
> smarter partitioning than what the current alias grouping mechanism tries to
> do.  We can actually have memory SSA on top of Dan's partitioning scheme.
> Memory SSA will use that partitioning scheme when placing memory PHI nodes.
>
> The two approaches are orthogonal.  Memory SSA simply adds a new degree of
> factoring that gives you sparser UD chains.  It also gives you exactly one name
> per store, without reducing precision.

nevertheless, it is not obvious to me whether using mem-ssa over Daniel's
proposal would bring any significant gains, which I would like to have
verified before we introduce a million new bugs with mem-ssa (nothing
personal, it simply is too large and too intrusive a change not to bring
any).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread dnovillo at gcc dot gnu dot org


--- Comment #30 from dnovillo at gcc dot gnu dot org  2006-11-09 19:48 
---
(In reply to comment #29)

> nevertheless, it is not obvious to me whether using mem-ssa over Daniel's
> proposal would bring any significant gains, which I would like to have
 
Of course.  If you are interested in the compile time benefits of a
partitioning scheme, you can actually try the one we already have by forcing
alias grouping more aggressively (--param max-aliased-vops).

The current grouping is very dumb and will create tons of false positives. 
Daniel's approach will try to reduce false positives while bringing down the
number of virtual operators per memory statement.

Memory SSA brings down the number of virtual operators to exactly one per
statement.


> verified before we introduce a million new bugs with mem-ssa (nothing
> personal, it simply is too large and too intrusive change not to bring
> any).

Intrusive?  Well, the only pass that was wired to the previous virtual operator
scheme was PRE.  DSE is also wired but to a lesser extent.  No other
optimization had to be changed for mem-ssa.  It's obviously intrusive in the
renamer, but that's it.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread dberlin at dberlin dot org


--- Comment #31 from dberlin at gcc dot gnu dot org  2006-11-09 21:28 
---
Subject: Re:  [4.3 Regression] Misscompilation of spec2006 gcc


 Memory SSA brings down the number of virtual operators to exactly one per
 statement.

However, it does so in a way that makes things harder for the traditional
passes that actually want to do cool memory optimizations.

I'm still on the fence over whether it's a good idea or not.


  verified before we introduce a million new bugs with mem-ssa (nothing
  personal, it simply is too large and too intrusive a change not to bring
  any).
 
 Intrusive?  Well, the only pass that was wired to the previous virtual
 operator scheme was PRE.  DSE is also wired but to a lesser extent.  No
 other optimization had to be changed for mem-ssa.  It's obviously intrusive
 in the renamer, but that's it.

Uh, LIM and store sinking are too.  Roughly all of our memory optimizations
are.

The basic problem in mem-ssa is that vdefs and vuses no longer accurately
reflect what symbols are being defined and used.  They represent the
factoring of a use and definition of a whole bunch of symbols.

Things like PRE and DSE break not because they are wired to the
previous virtual operator scheme so much, but because they rely on
the virtual use/def chains accurately representing where a symbol
representing a memory access dies.  In mem-ssa, you have VDEF's of the
same symbol all over the place.

The change I have to make to PRE (and to the other things) to account
for this is actually to rebuild the non-mem-ssa-factored (i.e. the
current factored) form out of the chains by seeing what symbols they
really affect.

This is going to be expensive, and IMHO, is what almost all of our SSA
memory optimizations are going to have to do.

So while mem-ssa doesn't affect *precision*, it does affect how you
can use the chains in a very significant way.

For at least all the opts i see us doing, it makes them more or less
useless without doing things (like reexpanding them) first. Because
this is true, I'm not sure it's a good idea at all, which is why i'm
still on the fence.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread dberlin at dberlin dot org


--- Comment #32 from dberlin at gcc dot gnu dot org  2006-11-09 21:29 
---
Subject: Re:  [4.3 Regression] Misscompilation of spec2006 gcc

 In mem-ssa, you have VDEF's of the
 same symbol all over the place.

 version of a symbol


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread dnovillo at redhat dot com


--- Comment #33 from dnovillo at redhat dot com  2006-11-09 21:48 ---
Subject: Re:  [4.3 Regression] Misscompilation
 of spec2006 gcc

dberlin at dberlin dot org wrote on 11/09/06 16:28:

 Uh, LIM and store sinking are too.  Roughly all of our memory
 optimizations are.
 
They are?  Really?  Can you show me where exactly?

 The changes i have to make to PRE (and to the other things) to
 account for this is actually to rebuild the non-mem-ssa-factored (IE
 the current factored) form out of the chains by seeing what symbols
 they really affect.
 
OK, so how come you were so positive about the new interface?  I need to
understand what was the great difficulty you ran into that made you 
change your mind.  I need to see a specific example.

See, the UD chains you get in mem-ssa are neither incomplete nor wrong.
The symbols affected are right there in plain sight, so there is no
loss of any information.

 For at least all the opts i see us doing, it makes them more or less 
 useless without doing things (like reexpanding them) first. Because 
 this is true, I'm not sure it's a good idea at all, which is why i'm 
 still on the fence.
 
But you still haven't *shown* me where the hardness or slowness comes
in.  Granted, the work is still unfinished so we can't really do actual
measurements.  But I need to understand where the difficulties will be
so that I can accommodate the infrastructure.  It's obviously not my
intent to make things harder to use.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread dberlin at dberlin dot org


--- Comment #34 from dberlin at gcc dot gnu dot org  2006-11-10 00:03 
---
Subject: Re:  [4.3 Regression] Misscompilation of spec2006 gcc

On 9 Nov 2006 21:48:25 -, dnovillo at redhat dot com
[EMAIL PROTECTED] wrote:


 --- Comment #33 from dnovillo at redhat dot com  2006-11-09 21:48 ---
 Subject: Re:  [4.3 Regression] Misscompilation
  of spec2006 gcc

 dberlin at dberlin dot org wrote on 11/09/06 16:28:

  Uh, LIM and store sinking are too.  Roughly all of our memory
  optimizations are.
 
 They are?  Really?  Can you show me where exactly?

They won't get incorrect results. They will get less good results than
they do now.

Take LIM and store motion (sorry, I meant SM, not store sinking):
determine_max_movement relies on the VIRTUAL_KILL information to
determine what must be moved together.  Because things that would
previously kill different symbol versions will now kill the same
symbol version (due to being factored), it will add false dependencies
unless someone changes it.

Take the following example:
int e,f,g,h;
int main(int argc)
{
  int *a, *b, *d;
  int i;
  a = &e;
  if (argc)
    a = &f;
  b = &g;
  if (argc)
    b = &h;
  d = &e;
  if (argc)
    d = &h;
  for (i = 0; i < 50; i++)
    {
      *a = 1;
      *b = 2;
      *d = 3;
    }
}

previously, you would have ...
  #   e_22 = V_MAY_DEF <e_14>;
  #   f_23 = V_MAY_DEF <f_15>;
  *a_1 = 1;
  #   g_24 = V_MAY_DEF <g_16>;
  #   h_25 = V_MAY_DEF <h_17>;
  *b_2 = 2;
  #   e_26 = V_MAY_DEF <e_22>;
  #   h_27 = V_MAY_DEF <h_25>;
  *d_3 = 3;

Note that *a and *b do not vdef any symbol that is the same.

mem-ssa gives you (as of today):

  # .MEM_16 = VDEF <.MEM_14, f_20>
  *a_1 = 1;
  # .MEM_17 = VDEF <.MEM_14, g_21>
  *b_2 = 2;
  # .MEM_18 = VDEF <.MEM_16, .MEM_17>
  *d_3 = 3;

Note that now *a and *b both vdef .MEM_14.

SM is going to say these stores are dependent on each other because
they kill the same version of the same variable unless you teach it
to look at the get_loads_and_stores bitmap.
Previously, it would not.
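
To make the false dependency concrete, here is a minimal C sketch.  It is
not GCC code: the struct, the symbol bits and both function names are
made up for illustration.  It models the stores *a and *b from the example
by the memory version their VDEF kills (.MEM_14 in both cases) and by the
set of symbols each may really write.  The second check only stands in for
what a pass would have to do with the get_loads_and_stores bitmap.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical symbol bits for the globals e, f, g, h in the example. */
enum { SYM_E = 1u << 0, SYM_F = 1u << 1, SYM_G = 1u << 2, SYM_H = 1u << 3 };

/* Toy model of a store: the .MEM version its VDEF kills under mem-ssa,
   and the symbols it may really write (stand-in for the symbol bitmap). */
struct toy_store {
  int killed_mem_version;   /* 14 means .MEM_14 */
  unsigned may_write;       /* bitmask of SYM_* values */
};

/* What a pass concludes if it only compares the killed versions. */
static bool depends_by_version(const struct toy_store *x,
                               const struct toy_store *y)
{
  assert(x && y);
  return x->killed_mem_version == y->killed_mem_version;
}

/* What it concludes after looking at the symbols actually affected. */
static bool depends_by_symbols(const struct toy_store *x,
                               const struct toy_store *y)
{
  assert(x && y);
  return (x->may_write & y->may_write) != 0;
}

/* *a may write e or f; *b may write g or h; both kill .MEM_14. */
static const struct toy_store store_a = { 14, SYM_E | SYM_F };
static const struct toy_store store_b = { 14, SYM_G | SYM_H };
```

With these two stores, depends_by_version reports a dependency while
depends_by_symbols does not, which is the extra work (and the lost
shortcut) being discussed.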


  The changes i have to make to PRE (and to the other things) to
  account for this is actually to rebuild the non-mem-ssa-factored (IE
  the current factored) form out of the chains by seeing what symbols
  they really affect.
 
 OK, so how come you were so positive about the new interface?

When have i been overabundantly positive?
I said I'd deal with it.  I'm neither here nor there.  I've relied on
your statements that you believe it will make things significantly
better without loss of precision.  We are going to pay a cost in time
for passes to make use of this information. I believed you were aware
of this.

  I need to
 understand what was the great difficulty you ran into that made you
 change your mind.  I need to see a specific example.

 See, the UD chains you get in mem-ssa are neither incomplete nor wrong.
Nobody said they are wrong, but I would actually argue they are no
longer really the same as SSA in a way that matters, if you want to
pick nits.
SSA variables have not only the property that they are *assigned to*
once, but the property that they are *defined* by a single operation.
Our current virtual operand scheme has this property.
Yours does not, because variables may be defined multiple times,
although they are still singly assigned.
You can argue this is not a requirement of SSA.  Honestly, it makes no
difference to me.  The upshot is that by losing this property, you
make them less useful than they could be.

 The symbols affected are right there in plain sight, so there is no
 loss of any information.

Errr, last time i looked, you had to call a function to get the
*actual* list of loads or stores affected by a statement.
Why does this matter?

All of our memory optimizations are trying to figure out three things:

1. Will two memory accesses return the same results (PRE, DSE)?
2. Do these two memory accesses have a dependency (PRE, SM, DSE, LIM)?
3. If I hoist this memory to block X, will it still have the same
value as it does in block Y (PRE, SM, LIM)?

Your stuff has no real effect on #2, but it makes #1 and #3 harder
(not less precise, mind you), at the cost of reducing the number of
operands.

It does so by factoring stores in a way that disables the ability to
see where loads are not *currently*, but *could*, validly be live.  In
other words, it was previously possible to simply determine whether
two stores touched the same piece of memory by looking at the
versions.  This is no longer true.

PRE does #1 and #3 through value numbering memory and tracking the
distinct live ranges of virtual variables.
SM and LIM do #3 by simply using dependency information to determine
what needs to be moved together and grouping operations that vdef the
same variable together.

SM and LIM will thus still get *correct* information in the above
example, but they are going to get false dependencies due to defs of the
same mem version, unless something disambiguates between those store
dependencies by looking at the underlying loads and 

[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-09 Thread dberlin at dberlin dot org


--- Comment #35 from dberlin at gcc dot gnu dot org  2006-11-10 00:12 
---
Subject: Re:  [4.3 Regression] Misscompilation of spec2006 gcc

 Take the above case.
 If we simply use virtual variable versions to value number memory, we
 will believe that *a and *b are possible stores to the same piece of
 memory, even though they are not.

In case it's not clear, this means while previously you would have
determined you can move *b above *a simply by looking at the RHS, you
can't anymore.


My guess:
Any memory optimization that wants to just eliminate redundant loads
will be just fine with mem-ssa.
Any memory optimization that wants to eliminate redundant stores will
have to be redesigned to not use chains to get maximum precision.
Any memory optimization that wants to hoist either loads or stores
will have to be redesigned to not use chains to determine where things
can be moved to, in order to get maximum precision.

At least, that's the way it appears to me.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-08 Thread hjl at lucon dot org


--- Comment #19 from hjl at lucon dot org  2006-11-09 01:53 ---
Created an attachment (id=12574)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12574&action=view)
A patch

This reverts the patch which triggers the problem and adds a testcase. I
am running SPEC CPU 2006 now.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-07 Thread rakdver at gcc dot gnu dot org


--- Comment #17 from rakdver at gcc dot gnu dot org  2006-11-07 13:19 
---
 Zdenek, can you revert your patch  until we fix this?
 It might be a month or two before i get back to it.
 
 (Yeah, i know it sucks to have to do this, but)

I am not sure whether that would be helpful, since the problem does not seem to
be directly caused by the patch (it would be a bit more difficult to construct
the testcase for the problem with my patch reverted, of course).

I am playing with some ideas for how to fix this; unless I come up with
something soon, I will revert the patch (except for the testcase, which I
would like to remain in the testsuite).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-07 Thread dberlin at gcc dot gnu dot org


--- Comment #18 from dberlin at gcc dot gnu dot org  2006-11-07 15:22 
---
(In reply to comment #17)
  Zdenek, can you revert your patch  until we fix this?
  It might be a month or two before i get back to it.
  
  (Yeah, i know it sucks to have to do this, but)
 
 I am not sure whether that would be helpful, since the problem does not
 seem to be directly caused by the patch (it would be a bit more difficult
 to construct the testcase for the problem with my patch reverted, of
 course).
 

 I am playing with some ideas how to fix this, unless I come up with something
 soon, I will revert the patch (except for the testcase that I would like to
 remain in the testsuite).

I agree with your reasoning, but hiding the bug is better than nothing until a
proper fix can be made.  I hope you come up with something.  Like I said, I can
fix it in a month or two, but I know some people want to use Spec2K6 before
then.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-06 Thread hjl at lucon dot org


--- Comment #14 from hjl at lucon dot org  2006-11-06 15:12 ---
I checked gcc 4.3. The same source code, which is miscompiled in gcc from
SPEC CPU 2006, is there. It is most likely that gcc 4.3 is also miscompiled
and now generating wrong unwind/debug info, if not wrong instructions.


-- 

hjl at lucon dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|-00-00 00:00:00             |2006-11-06 15:12:29
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-06 Thread dberlin at dberlin dot org


--- Comment #15 from dberlin at gcc dot gnu dot org  2006-11-06 16:28 
---
Subject: Re:  [4.3 Regression] Misscompilation of spec2006 gcc

Zdenek, can you revert your patch  until we fix this?
It might be a month or two before i get back to it.

(Yeah, i know it sucks to have to do this, but)

On 6 Nov 2006 15:12:30 -, hjl at lucon dot org
[EMAIL PROTECTED] wrote:


 --- Comment #14 from hjl at lucon dot org  2006-11-06 15:12 ---
 I checked gcc 4.3. The same source code, which is miscompiled in gcc from
 SPEC CPU 2006, is there. It is most likely that gcc 4.3 is also miscompiled
 and now generating wrong unwind/debug info, if not wrong instructions.


 --

 hjl at lucon dot org changed:

What|Removed |Added
 
  Status|UNCONFIRMED |NEW
  Ever Confirmed|0   |1
Last reconfirmed|-00-00 00:00:00 |2006-11-06 15:12:29
date||


 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680




-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-06 Thread hjl at lucon dot org


--- Comment #16 from hjl at lucon dot org  2006-11-06 17:19 ---
I think we should add the testcase when the patch is reverted to prevent it
from happening again.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-04 Thread hjl at lucon dot org


--- Comment #12 from hjl at lucon dot org  2006-11-04 16:53 ---
Created an attachment (id=12547)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12547&action=view)
A testcase to show array reference is ok

GCC doesn't have a problem with array references. That is, if I change the
declaration from

extern dw_fde_node *fde_table;

to

extern dw_fde_node fde_table [];

I get the correct result.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-04 Thread pinskia at gcc dot gnu dot org


--- Comment #13 from pinskia at gcc dot gnu dot org  2006-11-04 17:28 
---
(In reply to comment #12)
 Created an attachment (id=12547)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12547&action=view)
 A testcase to show array reference is ok
 
 Gcc doesn't have a problem with array reference. That is if I change it
 from

Yes, because for arrays we keep ARRAY_REF around instead of lowering it to
pointer arithmetic.  See PR 29708 for another testcase which goes funny because
of casts in the IR.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-01 Thread pinskia at gcc dot gnu dot org


--- Comment #8 from pinskia at gcc dot gnu dot org  2006-11-01 18:18 ---
This is more reason why we need a POINTER_PLUS_EXPR.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu dot
                   |                            |org
            Summary|Misscompilation of spec2006 |[4.3 Regression]
                   |gcc                         |Misscompilation of spec2006
                   |                            |gcc
   Target Milestone|---                         |4.3.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-01 Thread hjl at lucon dot org


--- Comment #9 from hjl at lucon dot org  2006-11-01 20:03 ---
Created an attachment (id=12529)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12529&action=view)
A run-time testcase

Here is a run-time testcase:

[EMAIL PROTECTED] yyy]$  /usr/gcc-bad/bin/gcc -O2 bad.c
[EMAIL PROTECTED] yyy]$ ./a.out 
Aborted
[EMAIL PROTECTED] yyy]$  /usr/gcc-bad/bin/gcc -O bad.c
[EMAIL PROTECTED] yyy]$ ./a.out 
[EMAIL PROTECTED] yyy]$  /usr/gcc-good/bin/gcc -O2 bad.c
[EMAIL PROTECTED] yyy]$ ./a.out 
[EMAIL PROTECTED] yyy]$ 


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-01 Thread rakdver at atrey dot karlin dot mff dot cuni dot cz


--- Comment #10 from rakdver at atrey dot karlin dot mff dot cuni dot cz  
2006-11-01 20:26 ---
Subject: Re:  Misscompilation of spec2006 gcc

   I will work around this problem by teaching PTA about casts from
   nonpointers to pointers, which will cause it to end up with a nonlocal
   var in the set.
 
  ??? There is no cast from non-pointer to pointer in this testcase.
 
 Actually, there is.
 That's why you end up with SMT's in the first place.
   #   VUSE <fde_table_in_useD.1609_6>;
   fde_table_in_use.0D.1617_7 = fde_table_in_useD.1609;
   D.1618_8 = fde_table_in_use.0D.1617_7 * 24;
   D.1619_9 = (struct dw_fde_struct *) D.1618_8;

You mean that whenever there is pointer arithmetic other than
adding constants, we end up with points-to anything?  :-(
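
For readers unfamiliar with the lowered form, the sketch below spells out
in C roughly what the quoted dump computes: the index is first scaled by
the element size as a plain integer, and only afterwards turned into an
address.  This is an illustration under assumptions, not the testcase from
the attachment: the struct layout and the function names are hypothetical,
and the 24 in the dump is simply the element size on the reporter's target.
In the IR the scaled integer is itself cast to the pointer type, which C
cannot express directly, so the sketch adds it as a byte offset instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dw_fde_struct. */
struct toy_fde {
  const char *begin;
  const char *end;
  const char *info;
};

/* Source-level form: plain indexing through a pointer. */
static struct toy_fde *entry_indexed(struct toy_fde *table, size_t n)
{
  assert(table != NULL);
  return &table[n];
}

/* Lowered form: scale the index as an integer (the "* 24" step in the
   dump), then add the result to the base address as a byte offset. */
static struct toy_fde *entry_lowered(struct toy_fde *table, size_t n)
{
  assert(table != NULL);
  size_t byte_offset = n * sizeof(struct toy_fde);
  return (struct toy_fde *)((char *)table + byte_offset);
}
```

Both functions return the same address; the point is only that in the
second form the pointer is derived from an integer computation, which is
where the analysis loses track of what it points to.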


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680



[Bug tree-optimization/29680] [4.3 Regression] Misscompilation of spec2006 gcc

2006-11-01 Thread hjl at lucon dot org


--- Comment #11 from hjl at lucon dot org  2006-11-01 21:26 ---
Created an attachment (id=12530)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12530&action=view)
An updated run-time testcase

This is smaller.


-- 

hjl at lucon dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #12523|0                           |1
        is obsolete|                            |
  Attachment #12529|0                           |1
        is obsolete|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29680