[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free

2012-01-23 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED

--- Comment #9 from Jakub Jelinek  2012-01-23 
09:28:26 UTC ---
Fixed.


[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free

2012-01-23 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933

--- Comment #8 from Jakub Jelinek  2012-01-23 
09:25:55 UTC ---
Author: jakub
Date: Mon Jan 23 09:25:52 2012
New Revision: 183416

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183416
Log:
PR rtl-optimization/51933
* ree.c (transform_ifelse): Return true right away if dstreg is
already wider or equal to cand->mode.
(enum ext_modified_kind, struct ext_modified, ext_state): New types.
(make_defs_and_copies_lists): Remove defs_list and copies_list
arguments, add state argument, just truncate state->work_list
instead of always allocating and freeing the vector.  Assert that
get_defs succeeds instead of returning 2.  Changed return type to
bool.
(merge_def_and_ext): Add state argument.  If SET_DEST doesn't
have ext_src_mode, see if it has been modified already with the
right kind of extension and has been extended before from the
ext_src_mode.  If SET_DEST is already wider or equal to cand->mode,
just return true.  Remember the original mode in state->modified
array.
(combine_reaching_defs): Add state argument.  Don't allocate and
free here def_list, copied_list and vec vectors, instead just
VEC_truncate the vectors in *state.  Don't handle outcome == 2
here.
(find_and_remove_re): Set DF_DEFER_INSN_RESCAN df flag.
Add state variable, clear vectors in it, initialize state.modified
if needed.  Free all the vectors at the end and state.modified too.
Don't skip a candidate if the extension expression has been modified.

* gcc.c-torture/execute/pr51933.c: New test.

Added:
trunk/gcc/testsuite/gcc.c-torture/execute/pr51933.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/ree.c
trunk/gcc/testsuite/ChangeLog


[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free

2012-01-22 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #26410|0   |1
is obsolete||
 Status|NEW |ASSIGNED
 AssignedTo|unassigned at gcc dot   |jakub at gcc dot gnu.org
   |gnu.org |

--- Comment #7 from Jakub Jelinek  2012-01-22 
18:59:43 UTC ---
Created attachment 26414
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26414
gcc47-pr51933.patch

Untested fix with deferred insn rescanning.  Reverts the PR51924
find_and_remove_re fix as it is no longer needed.


[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free

2012-01-22 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933

--- Comment #6 from Jakub Jelinek  2012-01-22 
09:42:05 UTC ---
Created attachment 26410
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26410
gcc47-pr51933.patch

Patch I wrote last night.  Passed bootstrap/regtest, the only cases where it
prevented something was in the new testcase and in jcf-parse.c
(rewrite_reflection_indexes) with -m64, where it fixed an miscompilation.  What
happened there is that some earlier bb (but in a loop) was doing a
SImode->DImode zero extension, and later in the loop there was a SImode ior,
followed by HImode->SImode zero extension of that, which fed the earlier
SImode->DImode zero extension.  The SImode->DImode insn was processed first,
removed and changed the HImode->SImode zero extension into HImode->DImode zero
extension.  But as the HImode->SImode extension was changed, its df links were
gone.  When the HImode-> zero extension is later processed as candidate,
get_defs returns NULL and we assume all the dependencies have been adjusted.

Which leads me to thinking that this patch is a wrong approach, and perhaps we
should instead DF_DEFER_INSN_RESCAN in ree.c and adjust, so that we can cope
with the adjusted insns.


[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free

2012-01-21 Thread ebotcazou at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933

Eric Botcazou  changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
 AssignedTo|ebotcazou at gcc dot|unassigned at gcc dot
   |gnu.org |gnu.org

--- Comment #5 from Eric Botcazou  2012-01-21 
21:27:40 UTC ---
> But then when make_defs_and_copies_lists is called on the QImode -> DImode
> extension, we reach:
>   /* Initialize the work list.  */
>   if (!get_defs (extend_insn, src_reg, &work_list))
> {
>   VEC_free (rtx, heap, work_list);
>   /* The number of defs being equal to zero can only mean that all the
>  definitions have been previously merged.  */
>   return 2;
> }
> and because the definition has been changed already.  But nothing performs the
> ext_src_mode check that used to be performed otherwise.
> So we either need to do the src mode checking earlier when we haven't started
> modifying insns (in add_removable_extension?), or we'd need to look even at 
> the
> newly added defs and see what mode they extend from.

OK, I didn't realize that there could be an implicit truncation hidden within
an extension.  And I think that your change to add_removable_extension is fine,
as it still leaves the new flavor of the pass more powerful than the old one.

The other PR is a different problem.


[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free

2012-01-21 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933

--- Comment #4 from Jakub Jelinek  2012-01-21 
18:40:23 UTC ---
That unfortunately slightly pessimizes it.  E.g. on i686-linux bootstrap
on:
../../gcc/config/i386/i386.c ix86_memory_move_cost
../../gcc/config/i386/i386.c ix86_register_move_cost
../../../../libgfortran/io/write.c write_float
../../../libgfortran/io/write.c write_float
s-regpat.adb system__regpat__dump_until
/usr/src/gcc/gcc/testsuite/gcc.target/i386/avx-vpackuswb-1.c do_test
/usr/src/gcc/gcc/testsuite/gcc.target/i386/sse2-packuswb-1.c do_test
/usr/src/gcc/libstdc++-v3/testsuite/ext/rope/1.cc _S_tree_concat
/usr/src/gcc/libstdc++-v3/testsuite/ext/rope/2.cc _S_tree_concat
/usr/src/gcc/libstdc++-v3/testsuite/ext/rope/3.cc _S_tree_concat
/usr/src/gcc/libstdc++-v3/testsuite/ext/rope/44963.cc _S_tree_concat
/usr/src/gcc/libstdc++-v3/testsuite/ext/rope/4.cc _S_tree_concat
/usr/src/gcc/libstdc++-v3/testsuite/ext/rope/pthread7-rope.cc _S_tree_concat
and on x86_64-linux bootstrap sans regtest:
../../gcc/ada/par.adb par__prag__process_restrictions_or_restriction_warnings
../../gcc/config/i386/i386.c ix86_memory_move_cost
../../gcc/config/i386/i386.c ix86_register_move_cost
../../gcc/java/jcf-parse.c rewrite_reflection_indexes
../../../libgfortran/io/write.c write_float
s-regpat.adb system__regpat__dump_until

The problem is if the def insn satisfies is_cond_copy_insn, then it can be in a
wider mode, yet all definitions could still be in a narrower mode.
So, perhaps we'd need to have a recursive function that for is_cond_copy_insn
defs calls get_defs on both args and recurses, checking mode of real
definitions only.  Not sure if we'd need also this is_insn_visited array for it
to avoid recursing infinitely.

BTW, for 4.8, I guess a couple of ree.c cleanups will be needed, e.g. several
functions allocate and free way too many heap allocated vectors, I guess if we
allocate them in the caller (or caller's caller) and just pass addresses of
those vectors around, we could just VEC_truncate the vectors and avoid the
permanent malloc/free.


[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free

2012-01-21 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933

--- Comment #3 from Jakub Jelinek  2012-01-21 
15:57:23 UTC ---
Completely untested fix:
--- gcc/ree.c.jj2011-12-28 10:52:44.0 +0100
+++ gcc/ree.c2012-01-21 16:55:27.290731139 +0100
@@ -776,6 +776,23 @@ add_removable_extension (rtx x ATTRIBUTE
   }
 return;
   }
+else
+  {
+rtx set = single_set (DF_REF_INSN (def->ref));
+if (set == NULL_RTX
+|| !REG_P (SET_DEST (set))
+|| GET_MODE (SET_DEST (set)) != GET_MODE (XEXP (src, 0)))
+  {
+if (dump_file)
+  {
+fprintf (dump_file, "Cannot eliminate extension:\n");
+print_rtl_single (dump_file, rei->insn);
+fprintf (dump_file,
+ " because def insn has different mode\n");
+  }
+return;
+  }
+  }

   /* Then add the candidate to the list and insert the reaching
definitions
  into the definition map.  */


[Bug rtl-optimization/51933] [4.7 Regression] Wrong-code due to -free

2012-01-21 Thread ebotcazou at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933

Eric Botcazou  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2012-01-21
 CC||ebotcazou at gcc dot
   ||gnu.org
 AssignedTo|unassigned at gcc dot   |ebotcazou at gcc dot
   |gnu.org |gnu.org
 Ever Confirmed|0   |1

--- Comment #2 from Eric Botcazou  2012-01-21 
15:44:46 UTC ---
> But then when make_defs_and_copies_lists is called on the QImode -> DImode
> extension, we reach:
>   /* Initialize the work list.  */
>   if (!get_defs (extend_insn, src_reg, &work_list))
> {
>   VEC_free (rtx, heap, work_list);
>   /* The number of defs being equal to zero can only mean that all the
>  definitions have been previously merged.  */
>   return 2;
> }
> and because the definition has been changed already.

Yes, this particular return embodies some implicit assumptions...  Will fix.


[Bug rtl-optimization/51933] [4.7 Regression] Wrong-code due to -free

2012-01-21 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933

--- Comment #1 from Jakub Jelinek  2012-01-21 
15:24:26 UTC ---
Consider following 3 routines:

unsigned long
foo (unsigned short *s, int i)
{
  unsigned short x = s[i];
  if (x < 0x211)
return (unsigned char) x;
  asm volatile ("");
  return 6;
}
unsigned long
bar (unsigned short *s, int i)
{
  unsigned short x = s[i];
  if (x < 0x211)
return (unsigned char) x;
  asm volatile ("");
  return x;
}
unsigned long
baz (unsigned short *s, int i)
{
  unsigned short x = s[i];
  if (x < 0x211)
return x;
  asm volatile ("");
  return (unsigned char) x;
}

In foo ree doesn't optimize away the QImode -> DImode zero extension, because
merge_def_and_ext checks:
  && GET_MODE (SET_DEST (*sub_rtx)) == ext_src_mode
and in this case ext_src_mode is QImode, but SET_DEST has HImode.
In bar ree doesn't optimize it away either, first merge_def_and_ext_checks is
called with the QImode ext_src_mode (which doesn't match) and only afterwards
on the HImode -> DImode extension, which is optimized (that is correct).
In baz (which is the same as bar in #c0 testcase) unfortunately
merge_def_and_ext is first called on the HImode -> DImode extension, which is
optimized (that is fine) by turning the HImode load into a DImode zero_extend
load from HImode and removing the HImode -> DImode extension.
But then when make_defs_and_copies_lists is called on the QImode -> DImode
extension, we reach:
  /* Initialize the work list.  */
  if (!get_defs (extend_insn, src_reg, &work_list))
{
  VEC_free (rtx, heap, work_list);
  /* The number of defs being equal to zero can only mean that all the
 definitions have been previously merged.  */
  return 2;
}
and because the definition has been changed already.  But nothing performs the
ext_src_mode check that used to be performed otherwise.
So we either need to do the src mode checking earlier when we haven't started
modifying insns (in add_removable_extension?), or we'd need to look even at the
newly added defs and see what mode they extend from.


[Bug rtl-optimization/51933] [4.7 Regression] Wrong-code due to -free

2012-01-21 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933

Jakub Jelinek  changed:

   What|Removed |Added

   Priority|P3  |P1
  Known to work||4.6.3
   Target Milestone|--- |4.7.0
  Known to fail||4.7.0