[Bug tree-optimization/71347] [7 regression] Performance drop after r235513 on x86-64 in 32-bit mode.

2016-07-15 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71347

amker at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from amker at gcc dot gnu.org ---
Fixed.

2016-07-15 Thread amker at gcc dot gnu.org

--- Comment #6 from amker at gcc dot gnu.org ---
Author: amker
Date: Fri Jul 15 08:53:48 2016
New Revision: 238366

URL: https://gcc.gnu.org/viewcvs?rev=238366&root=gcc&view=rev
Log:
gcc/testsuite
PR tree-optimization/71347
* gcc.dg/tree-ssa/pr71347.c: XFAIL on ia64, arm, m68k and sparc.

Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c

2016-06-22 Thread amker at gcc dot gnu.org

--- Comment #5 from amker at gcc dot gnu.org ---
(In reply to Rainer Orth from comment #4)
> Created attachment 38744 [details]
> pr71347.c.214t.optimized
> 
> The new testcase FAILs on sparc*-sun-solaris2.* (both 32 and 64-bit):
> 
> FAIL: gcc.dg/tree-ssa/pr71347.c scan-tree-dump-not optimized ".* = MEM.*;"
> 
> Dump attached.
> 
>   Rainer

Thanks for reporting.  I am going to disable this on various targets according
to https://gcc.gnu.org/ml/gcc-patches/2016-06/msg01377.html

2016-06-22 Thread ro at gcc dot gnu.org

Rainer Orth changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ro at gcc dot gnu.org

--- Comment #4 from Rainer Orth ---
Created attachment 38744
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38744&action=edit
pr71347.c.214t.optimized

The new testcase FAILs on sparc*-sun-solaris2.* (both 32 and 64-bit):

FAIL: gcc.dg/tree-ssa/pr71347.c scan-tree-dump-not optimized ".* = MEM.*;"

Dump attached.

  Rainer

2016-06-17 Thread amker at gcc dot gnu.org

--- Comment #3 from amker at gcc dot gnu.org ---
Author: amker
Date: Fri Jun 17 09:26:05 2016
New Revision: 237552

URL: https://gcc.gnu.org/viewcvs?rev=237552&root=gcc&view=rev
Log:
PR tree-optimization/71347
* tree-ssa-loop-ivopts.c (determine_group_iv_cost_address): Compute
cost for all uses in group.

PR tree-optimization/71347
* gcc.dg/tree-ssa/pr71347.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr71347.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop-ivopts.c

2016-05-31 Thread amker at gcc dot gnu.org

--- Comment #2 from amker at gcc dot gnu.org ---
Thanks for reporting this.

The dump after IVOPT now is:


  :
  # prephitmp_21 = PHI 
  # prephitmp_23 = PHI 
  # ivtmp.17_16 = PHI 
  _6 = prephitmp_21 * prephitmp_23;
  _4 = (void *) ivtmp.17_16;
  MEM[base: _4, offset: 0B] = _6;
  ivtmp.17_9 = ivtmp.17_16 + 8;
  if (ivtmp.17_9 != _26)
    goto ;
  else
    goto ;

  :
  _5 = (void *) ivtmp.17_9;
  pretmp_20 = MEM[base: _5, offset: 4294967288B];
  pretmp_22 = X[1];
  goto ;
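
The offset 4294967288B in the second block is the unsigned 32-bit encoding of -8 (2^32 - 8 = 4294967288), so that load reads the 8 bytes just behind the already-incremented pointer. A minimal C sketch of the arithmetic follows; the helper names signed_displacement and check_dump_offset are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reinterpret a 32-bit unsigned offset, as printed
   in the dump, as a signed byte displacement.  Done in 64-bit arithmetic
   so no implementation-defined narrowing conversion is involved.  */
int64_t signed_displacement(uint32_t off)
{
  if (off > INT32_MAX)
    return (int64_t) off - ((int64_t) 1 << 32);
  return (int64_t) off;
}

/* The offset from the dump is the same displacement as -8.  */
void check_dump_offset(void)
{
  assert(signed_displacement(4294967288u) == -8);
  /* Unsigned conversion is modular, so -8 maps back to the dump value.  */
  assert((uint32_t) -8 == 4294967288u);
}
```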

That patch (r235513) skips computing the cost for sub iv_uses in a group:
Group 0:
  Type: ADDRESS
  Use 0.0:
    At stmt: X[i_18] = _6;
    At pos:  X[i_18]
    IV struct:
      Type:   double *
      Base:   (double *) (&X + 16)
      Step:   8
      Object: (void *) &X
      Biv:    N
      Overflowness wrto loop niter: Overflow
  Use 0.1:
    At stmt: pretmp_20 = X[_15];
    At pos:  X[_15]
    IV struct:
      Type:   double *
      Base:   (double *) (&X + 16)
      Step:   8
      Object: (void *) &X
      Biv:    N
      Overflowness wrto loop niter: Overflow

Although uses 0.0 and 0.1 have the same {base, step}, they are at different
program points.  If the iv_cand is incremented between them (as in the dump
above, after use 0.0 and before use 0.1), use 0.0 is transformed into
MEM[var_before, 0] and use 0.1 into MEM[var_after, -8].  The two memory
references now have different expressions even though they access the same
object, and DOM subsequently fails to CSE them.
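
To make the aliasing concrete, here is a small C sketch.  The loop in chain is only a guess at the shape of the source, inferred from the dump (the actual attachment is not reproduced in this thread), and forms_alias is a hand-written analogue of the two address forms, not compiler output.  It assumes an 8-byte double, matching Step: 8.

```c
#include <assert.h>

double X[8];

/* ASSUMPTION: a loop of roughly this shape would produce uses like
   X[i_18] (store) and X[_15] (reload of the value just stored).  */
void chain(int n)
{
  for (int i = 2; i < n; i++)
    X[i] = X[i - 1] * X[1];
}

/* Mimic the two address forms chosen around the IV increment and report
   whether they name the same object.  */
int forms_alias(void)
{
  assert(sizeof(double) == 8);             /* Step: 8 in the IV dump */

  char *iv = (char *) X + 16;              /* Base: &X + 16 */
  double *use_0_0 = (double *) (iv + 0);   /* MEM[var_before, 0] */
  iv += 8;                                 /* IV increment */
  double *use_0_1 = (double *) (iv - 8);   /* MEM[var_after, -8] */

  return use_0_0 == use_0_1;
}
```

forms_alias returns nonzero: both forms point at X[2], yet syntactically they are different base/offset pairs, which is what a purely expression-based CSE cannot see through.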

It is hard to decide which sub iv_uses need their cost computed and which can
be skipped.  Maybe I should revert this part of the code, which was introduced
to save a small amount of compilation time.
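
As a rough illustration of why skipping sub-uses can mislead the cost model, consider the toy sketch below.  Everything in it is hypothetical: struct toy_use, the cost numbers, and the function names are made up for the example and do not correspond to GCC's iv_use structure or to determine_group_iv_cost_address.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for one address use in a group (NOT GCC's struct iv_use).  */
struct toy_use {
  int after_increment;  /* nonzero if the use sits after the IV increment */
};

/* Made-up cost: a use after the increment needs a "-step" displacement,
   which this toy model charges one extra unit for.  */
unsigned toy_use_cost(const struct toy_use *u)
{
  return u->after_increment ? 2u : 1u;
}

/* Shortcut: price only the first use of the group.  */
unsigned group_cost_first_only(const struct toy_use *uses, size_t n)
{
  return n ? toy_use_cost(&uses[0]) : 0u;
}

/* Price every use of the group.  */
unsigned group_cost_all(const struct toy_use *uses, size_t n)
{
  unsigned total = 0;
  for (size_t i = 0; i < n; i++)
    total += toy_use_cost(&uses[i]);
  return total;
}

/* Two uses on either side of the increment, as in Group 0 above.  */
void toy_demo(void)
{
  struct toy_use group[2] = { { 0 }, { 1 } };
  assert(group_cost_first_only(group, 2) < group_cost_all(group, 2));
}
```

In this toy model the shortcut never sees the second use's cost, so a candidate that is cheap for use 0.0 but awkward for use 0.1 looks better than it is.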

2016-05-31 Thread rguenth at gcc dot gnu.org

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amker.cheng at gmail dot com
   Target Milestone|---                         |7.0

2016-05-30 Thread ysrumyan at gmail dot com

--- Comment #1 from Yuri Rumyantsev ---
Created attachment 38600
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38600&action=edit
test-case to reproduce

Needs to be compiled with the options -O2 -m32 -march=slm -ffast-math on x86-64.