[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations

2015-03-12 Thread vmakarov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

--- Comment #7 from Vladimir Makarov vmakarov at gcc dot gnu.org ---
(In reply to amker from comment #6)
 Since it works on gcc 3.4, so I consider this as a regression and applied
 the patch.  Should be fixed now.
 
 Hi Vlad, could you please help me verify that the original benchmark is
 fixed too?  Thanks very much!

Yes, it was fixed.  Thanks for working on this.


[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations

2015-03-11 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

amker at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from amker at gcc dot gnu.org ---
Fixed according to Vlad's input.


[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations

2015-02-12 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

--- Comment #5 from amker at gcc dot gnu.org ---
Author: amker
Date: Fri Feb 13 05:44:46 2015
New Revision: 220676

URL: https://gcc.gnu.org/viewcvs?rev=220676root=gccview=rev
Log:

PR tree-optimization/64705
* tree-ssa-loop-niter.h (expand_simple_operations): New parameter.
* tree-ssa-loop-niter.c (expand_simple_operations): New parameter.
* tree-ssa-loop-ivopts.c (extract_single_var_from_expr): New.
(find_bivs, find_givs_in_stmt_scev): Pass new argument to
expand_simple_operations.

testsuite
PR tree-optimization/64705
* gcc.dg/tree-ssa/pr64705.c: New test.


Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/pr64705.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-loop-ivopts.c
trunk/gcc/tree-ssa-loop-niter.c
trunk/gcc/tree-ssa-loop-niter.h


[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations

2015-02-12 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

--- Comment #6 from amker at gcc dot gnu.org ---
Since it works on gcc 3.4, so I consider this as a regression and applied the
patch.  Should be fixed now.

Hi Vlad, could you please help me verify that the original benchmark is fixed
too?  Thanks very much!


[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations

2015-02-04 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

--- Comment #4 from amker at gcc dot gnu.org ---
I had a patch.


[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations

2015-01-22 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

amker at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2015-01-23
 CC||amker at gcc dot gnu.org
   Assignee|unassigned at gcc dot gnu.org  |amker at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from amker at gcc dot gnu.org ---
Confirmed.  I shall have a look.


[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations

2015-01-22 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

--- Comment #2 from amker at gcc dot gnu.org ---
Loop dump before IVOPT is like below:

Loop 4, basic blocks 28/30;

  bb 26:
  count_54 = count_172 + 1;
  _55 = i_161 + i_161;
  prime_56 = _55 + 3;
  k_57 = prime_56 + i_161;
  if (size_26 = k_57)
goto bb 27;
  else
goto bb 31;

  bb 27:

  bb 28:
  # k_167 = PHI k_57(27), k_62(30)
  # ci_168 = PHI ci_169(27), ci_58(30)
  ci_58 = ci_168 + 1;
  k.19_59 = (sizetype) k_167;
  _60 = flags_30 + k.19_59;
  *_60 = 0;
  k_62 = prime_56 + k_167;
  if (size_26 = k_62)
goto bb 30;
  else
goto bb 29;

  bb 29:
  # ci_154 = PHI ci_58(28)
  goto bb 31;

  bb 30:
  goto bb 28;


The IV uses found by IVOPT is like below:

use 0
  address
  defined in statement 
  used in statement *_60 = 0;

  at position *_60
  type char *
  base flags_30 + (sizetype) k_57
  step (sizetype) prime_56
  base object (void *) flags_30
  related candidates 
use 1
  compare
  defined in statement 
  used in statement if (size_26 = k_62)

  at position 
  type long int
  base (_55 + 3) + k_57
  step prime_56
  is a biv
  related candidates 
use 2
  generic (computed on exit edge)
  defined in statement ci_58 = ci_168 + 1;
  used in statement ci_154 = PHI ci_58(28)

  at position 
  type long int
  base ci_169 + 1
  step 1
  is a biv
  related candidates 

Root cause is IVOPT expands use 1 from {prime_56 + k_57, prime_56}_loop to
{(_55 + _3) + k_57, prime_56}_loop.  Thus information of iv.step == prime_56
== (_55+_3) is lost during costs computation and uses rewrting, resulting in
wrong candidate selected and bloated loop after IVOPT.

The related code is in function find_givs_in_stmt_scev, specifically,

  if (!simple_iv (loop, loop_containing_stmt (stmt), lhs, iv, true))
return false;
  iv-base = expand_simple_operations (iv-base);   // --- expansion

I will see how to fix the issue by skipping expansion in case like this.

Thanks,
bin


[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations

2015-01-22 Thread amker at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

amker at gcc dot gnu.org changed:

   What|Removed |Added

 Target|x86_64-*-*  |x86_64-*-*, aarch64

--- Comment #3 from amker at gcc dot gnu.org ---
Also it's a target independent issue.
Though IVOPT chooses base+index addressing mode, it needs one more instruction
to calculate the condition.

LLVM's assembly:
.LBB34_7:
addx26, x26, #1
strb wzr, [x22, x9]
add x9, x9, x24
cmp x9, x28
b.le.LBB34_7

GCC's assembly:
.L71:
strbwzr, [x27, x0]
addx0, x0, x2
addx19, x19, 1
addx1, x4, x0
cmpx21, x1
bge.L71