[Bug tree-optimization/38785] [4.5/4.6/4.7/4.8 Regression] huge performance regression on EEMBC bitmnp01

2012-11-22 Thread hubicka at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at redhat dot com



--- Comment #37 from Jan Hubicka <hubicka at gcc dot gnu.org> 2012-11-22 09:50:52 UTC ---

Yes, agreed. It is a general problem of SSA form that we assume reg-reg copies in
PHIs will be optimized away by a smart regalloc.  Moreover, we assume the same for
constants.

This case is hard to fix later since the values are path sensitive...
Vladimir, I guess there is not much to do on the regalloc side, right?

Why does the problem not reproduce on a simplified testcase?

void
f (int i, long *a, long *b)
{
  int sum = 0;
  b[i] = 0;
#define PART(I) if (t()) sum++;
  PART (1);
  PART (2);
  PART (3);
  PART (4);
  PART (5);
  PART (6);
  tt (sum);
}

Here we somehow do not consider the partial redundancies on sum...


[Bug tree-optimization/38785] [4.5/4.6/4.7/4.8 Regression] huge performance regression on EEMBC bitmnp01

2012-11-22 Thread hubicka at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



--- Comment #38 from Jan Hubicka <hubicka at gcc dot gnu.org> 2012-11-22 13:05:38 UTC ---

Yet another variant...

void
f (int i, long *a, long *b)
{
  int sum = 0;
  for (; --i >= 0; a++, b++)
    {
      b[i] = 0;
#define PART(I) if (t()) sum+=100+I;
      PART (1);
      PART (2);
      PART (3);
      PART (4);
      PART (5);
      PART (6);
      tt (sum);
    }
}

leads to...

Starting insert iteration 1
Could not find SSA_NAME representative for expression:{plus_expr,sum_8,101}
Created SSA_NAME representative pretmp_98 for expression:{plus_expr,sum_8,101}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_98,102}
Created SSA_NAME representative pretmp_99 for expression:{plus_expr,pretmp_98,102}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_99,103}
Created SSA_NAME representative pretmp_100 for expression:{plus_expr,pretmp_99,103}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_100,104}
Created SSA_NAME representative pretmp_101 for expression:{plus_expr,pretmp_100,104}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_101,105}
Created SSA_NAME representative pretmp_102 for expression:{plus_expr,pretmp_101,105}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_100,105}
Created SSA_NAME representative pretmp_103 for expression:{plus_expr,pretmp_100,105}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_99,104}
Created SSA_NAME representative pretmp_104 for expression:{plus_expr,pretmp_99,104}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_104,105}
Created SSA_NAME representative pretmp_105 for expression:{plus_expr,pretmp_104,105}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_99,105}
Created SSA_NAME representative pretmp_106 for expression:{plus_expr,pretmp_99,105}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_98,103}
Created SSA_NAME representative pretmp_107 for expression:{plus_expr,pretmp_98,103}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_107,104}
Created SSA_NAME representative pretmp_108 for expression:{plus_expr,pretmp_107,104}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_108,105}
Created SSA_NAME representative pretmp_109 for expression:{plus_expr,pretmp_108,105}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_107,105}
Created SSA_NAME representative pretmp_110 for expression:{plus_expr,pretmp_107,105}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_98,104}
Created SSA_NAME representative pretmp_111 for expression:{plus_expr,pretmp_98,104}
Could not find SSA_NAME representative for expression:{plus_expr,pretmp_111,105}



This eventually leads to a lot of unused pretmp variables.
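
The growth in temporaries can be illustrated with a small brute-force sketch. This is not GCC code: the helpers count_paths and count_distinct_sums are hypothetical names, and the model only assumes that each PART (I) either adds 100+I to sum or leaves it alone. Every combination of taken and not-taken conditionals is a separate path, and each distinct partial sum is a value PRE might want a temporary for.

```c
#include <assert.h>

/* Hypothetical model of the testcase: PART (I) adds 100 + I to sum
   when t () returns nonzero, for I = 1 .. n.  */

/* Number of paths through n independent conditionals.  */
static unsigned
count_paths (int n)
{
  assert (n >= 0 && n <= 16);
  return 1u << n;
}

/* Number of distinct values sum can hold after the n conditionals,
   found by enumerating every taken/not-taken combination.  */
static unsigned
count_distinct_sums (int n)
{
  enum { MAX_SUM = 16 * 116 + 1 };   /* safe upper bound for n <= 16 */
  unsigned char seen[MAX_SUM] = { 0 };
  unsigned distinct = 0;

  assert (n >= 0 && n <= 16);
  for (unsigned mask = 0; mask < (1u << n); mask++)
    {
      int sum = 0;
      for (int i = 1; i <= n; i++)
        if (mask & (1u << (i - 1)))
          sum += 100 + i;
      if (!seen[sum])
        {
          seen[sum] = 1;
          distinct++;
        }
    }
  return distinct;
}
```

With n = 6 the sketch reports 64 paths and 42 distinct sums, so the number of candidate values grows much faster than the number of PART statements.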


[Bug tree-optimization/38785] [4.5/4.6/4.7/4.8 Regression] huge performance regression on EEMBC bitmnp01

2012-11-21 Thread hubicka at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |REOPENED



--- Comment #35 from Jan Hubicka <hubicka at gcc dot gnu.org> 2012-11-21 17:56:04 UTC ---

Too bad, we really need to make some model of how many PHI copies we
introduce... I agree with Richard's comment that Joern's patch is rather bad
with respect to optimization opportunities.  This is more or less a register
pressure problem. I will try to think about it a bit more ;)


[Bug tree-optimization/38785] [4.5/4.6/4.7/4.8 Regression] huge performance regression on EEMBC bitmnp01

2012-11-21 Thread amylaar at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



--- Comment #36 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> 2012-11-21 17:59:10 UTC ---

(In reply to comment #35)
> Too bad, we really need to make some model of how many PHI copies we
> introduce... I agree with Richard's comment that Joern's patch is rather bad
> with respect to optimization opportunities.  This is more or less a register
> pressure problem. I will try to think about it a bit more ;)

This is not just register pressure; these constant loads and register-register
copies do not come free, either.
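
A minimal sketch of that cost, assuming a single PHI of the form x_3 = PHI <x_1(then), 7(else)> whose operands the register allocator does not manage to coalesce. The function merge_lowered and the struct cost counters are hypothetical, written only to make the per-edge work visible; they do not correspond to anything in GCC.

```c
#include <assert.h>

/* Counts the moves executed when entering the merge block.  */
struct cost
{
  int reg_copies;    /* register-register copies */
  int const_loads;   /* constant materializations */
};

/* Hand-lowered form of  x_3 = PHI <x_1(then), 7(else)>.
   Without coalescing, each incoming edge must set x_3 itself.  */
static long
merge_lowered (int take_then, long x_1, struct cost *c)
{
  long x_3;

  assert (c != 0);
  if (take_then)
    {
      x_3 = x_1;          /* reg-reg copy on the 'then' edge */
      c->reg_copies++;
    }
  else
    {
      x_3 = 7;            /* constant load on the 'else' edge */
      c->const_loads++;
    }
  return x_3;
}
```

One such move is paid per PHI on every trip through the merge, so a block with many PHIs executes that many extra instructions on each incoming edge, regardless of whether registers are scarce.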


[Bug tree-optimization/38785] [4.5/4.6/4.7/4.8 Regression] huge performance regression on EEMBC bitmnp01

2012-11-20 Thread izamyatin at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



--- Comment #34 from Igor Zamyatin <izamyatin at gmail dot com> 2012-11-20 21:00:39 UTC ---

(In reply to comment #32)
> Would it be possible to double check if this problem is still fixed after the fix
> to the tree-ssa-pre patch?

Unfortunately the regression happened after the fix...


[Bug tree-optimization/38785] [4.5/4.6/4.7/4.8 Regression] huge performance regression on EEMBC bitmnp01

2012-11-16 Thread hubicka at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



Jan Hubicka <hubicka at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |WAITING
                 CC|                            |hubicka at gcc dot gnu.org
         Resolution|FIXED                       |



--- Comment #32 from Jan Hubicka <hubicka at gcc dot gnu.org> 2012-11-16 17:50:28 UTC ---

Would it be possible to double check if this problem is still fixed after the fix
to the tree-ssa-pre patch? I do not see any cold edges involved here, so
perhaps we will need a better heuristic.

We now again find some partial redundancies:

Found partial redundancy for expression {mem_ref<0B>,_20}@.MEM_5 (0024)
Inserted pretmp_57 = *_20;
 in predecessor 6
Created phi prephitmp_56 = PHI <_24(20), pretmp_57(6)>
 in block 7
Found partial redundancy for expression {mem_ref<0B>,_20}@.MEM_6 (0029)
Inserted pretmp_55 = *_20;
 in predecessor 8
Created phi prephitmp_63 = PHI <_29(21), pretmp_55(8)>
 in block 9
Found partial redundancy for expression {mem_ref<0B>,_20}@.MEM_7 (0034)
Inserted pretmp_64 = *_20;
 in predecessor 10
Created phi prephitmp_65 = PHI <_34(22), pretmp_64(10)>
 in block 11
Found partial redundancy for expression {mem_ref<0B>,_20}@.MEM_8 (0039)
Inserted pretmp_66 = *_20;
 in predecessor 12
Created phi prephitmp_67 = PHI <_39(23), pretmp_66(12)>
 in block 13
Starting insert iteration 2
Replaced *_20 with prephitmp_56 in _29 = *_20;
Replaced *_20 with prephitmp_63 in _34 = *_20;
Replaced *_20 with prephitmp_65 in _39 = *_20;
Replaced *_20 with prephitmp_67 in _44 = *_20;
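
What one of these "Inserted ... / Created phi ..." pairs amounts to can be sketched by hand in C. This is only an illustration of partial redundancy elimination on a load, not the bitmnp01 code; the functions load_before_pre and load_after_pre are hypothetical, and tmp stands in for the prephitmp PHI result.

```c
#include <assert.h>

/* Before: *p is loaded on the 'c' path and again at the join, so the
   second load is redundant only when c is nonzero.  */
static long
load_before_pre (const long *p, int c)
{
  long x = 0;
  if (c)
    x = *p;
  return x + *p;
}

/* After (hand-applied): a load is inserted on the path that lacked
   one, and tmp merges both values the way the PHI does.  */
static long
load_after_pre (const long *p, int c)
{
  long x = 0;
  long tmp;
  if (c)
    {
      tmp = *p;
      x = tmp;
    }
  else
    tmp = *p;   /* the inserted "pretmp" load */
  return x + tmp;
}
```

Both versions compute the same result; the transformed one executes one load fewer when c is nonzero, at the price of an extra value that has to be merged at the join.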


[Bug tree-optimization/38785] [4.5/4.6/4.7/4.8 Regression] huge performance regression on EEMBC bitmnp01

2012-11-16 Thread hubicka at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785



--- Comment #33 from Jan Hubicka <hubicka at gcc dot gnu.org> 2012-11-16 18:00:54 UTC ---

And indeed, at -O3 the testcase does not look good:

  <bb 7>:
  # cstore_51 = PHI <0(5), 2147483647(6)>
  # prephitmp_82 = PHI <1073741823(5), 3221225470(6)>
  # prephitmp_83 = PHI <1789569705(5), 3937053352(6)>
  # prephitmp_84 = PHI <2326440616(5), 4473924263(6)>
  # prephitmp_85 = PHI <2755937345(5), 4903420992(6)>
  # prephitmp_86 = PHI <3113851286(5), 5261334933(6)>
  # prephitmp_87 = PHI <2684354557(5), 4831838204(6)>
  # prephitmp_88 = PHI <2219066434(5), 4366550081(6)>
  # prephitmp_89 = PHI <2576980375(5), 4724464022(6)>
  # prephitmp_90 = PHI <2147483646(5), 4294967293(6)>
  # prephitmp_91 = PHI <1610612734(5), 3758096381(6)>
  # prephitmp_92 = PHI <2040109463(5), 4187593110(6)>
  # prephitmp_93 = PHI <2398023404(5), 4545507051(6)>
  # prephitmp_94 = PHI <1968526675(5), 4116010322(6)>
  # prephitmp_95 = PHI <1503238552(5), 3650722199(6)>
  # prephitmp_96 = PHI <1861152493(5), 4008636140(6)>
  # prephitmp_97 = PHI <1431655764(5), 3579139411(6)>
  # prephitmp_98 = PHI <715827882(5), 2863311529(6)>
  # prephitmp_99 = PHI <1252698793(5), 3400182440(6)>
  # prephitmp_100 = PHI <1682195522(5), 3829679169(6)>
  # prephitmp_103 = PHI <1145324611(5), 3292808258(6)>
  # prephitmp_106 = PHI <536870911(5), 2684354558(6)>
  # prephitmp_107 = PHI <966367640(5), 3113851287(6)>
  # prephitmp_108 = PHI <1324281581(5), 3471765228(6)>
  # prephitmp_109 = PHI <894784852(5), 3042268499(6)>
  # prephitmp_110 = PHI <429496729(5), 2576980376(6)>
  # prephitmp_111 = PHI <787410670(5), 2934894317(6)>
  # prephitmp_112 = PHI <357913941(5), 2505397588(6)>
  *_18 = cstore_51;
  _24 = *_20;
  _25 = _24  2;
  if (_25 = -14)
    goto <bb 8>;
  else
    goto <bb 9>;



The catch is that the patch disabled partial PRE by accident. No cold
edges are involved here, since we predict all the branches as roughly even :(
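
One thing the numbers in the dump suggest (an observation from the values, not something stated in the thread): in every PHI of bb 7 the two arguments differ by 2147483647, the same difference as between the arguments of cstore_51. The sketch below checks this for a subset of pairs copied from the dump; the struct phi_pair and the function all_pairs_share_delta are hypothetical names.

```c
#include <assert.h>

/* PHI argument pairs copied from the bb 7 dump: value on the edge
   from block 5, value on the edge from block 6.  */
struct phi_pair
{
  unsigned long long from5;
  unsigned long long from6;
};

static const struct phi_pair pairs[] = {
  { 0ULL,          2147483647ULL },   /* cstore_51 */
  { 1073741823ULL, 3221225470ULL },   /* prephitmp_82 */
  { 1789569705ULL, 3937053352ULL },   /* prephitmp_83 */
  { 2326440616ULL, 4473924263ULL },   /* prephitmp_84 */
  { 3113851286ULL, 5261334933ULL },   /* prephitmp_86 */
  { 715827882ULL,  2863311529ULL },   /* prephitmp_98 */
  { 357913941ULL,  2505397588ULL },   /* prephitmp_112 */
};

/* Returns 1 if every pair differs by the same amount as pairs[0].  */
static int
all_pairs_share_delta (void)
{
  unsigned long long delta = pairs[0].from6 - pairs[0].from5;

  for (unsigned i = 1; i < sizeof pairs / sizeof pairs[0]; i++)
    if (pairs[i].from6 - pairs[i].from5 != delta)
      return 0;
  return 1;
}
```

If each prephitmp is indeed cstore_51 plus a constant, the 28 PHIs trade one addition per use for one constant load per incoming edge, which is the cost discussed in this bug.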


[Bug tree-optimization/38785] [4.5/4.6/4.7/4.8 Regression] huge performance regression on EEMBC bitmnp01

2012-05-02 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.5.4                       |4.8.0


[Bug tree-optimization/38785] [4.5/4.6/4.7/4.8 Regression] huge performance regression on EEMBC bitmnp01

2012-04-27 Thread mkuvyrkov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785

--- Comment #30 from Maxim Kuvyrkov <mkuvyrkov at gcc dot gnu.org> 2012-04-28 01:56:59 UTC ---
Author: mkuvyrkov
Date: Sat Apr 28 01:56:54 2012
New Revision: 186928

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=186928
Log:
PR tree-optimization/38785
* common.opt (ftree-partial-pre): New option.
* doc/invoke.texi: Document it.
* opts.c (default_options_table): Initialize flag_tree_partial_pre.
* tree-ssa-pre.c (do_partial_partial_insertion): Insert only if it will
benefit speed path.
(execute_pre): Use flag_tree_partial_pre.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/common.opt
trunk/gcc/doc/invoke.texi
trunk/gcc/opts.c
trunk/gcc/tree-ssa-pre.c


[Bug tree-optimization/38785] [4.5/4.6/4.7/4.8 Regression] huge performance regression on EEMBC bitmnp01

2012-04-27 Thread mkuvyrkov at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785

Maxim Kuvyrkov <mkuvyrkov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #31 from Maxim Kuvyrkov <mkuvyrkov at gcc dot gnu.org> 2012-04-28 02:09:38 UTC ---
Fixed by the above reworked version of Joern's and Steven's patches.


[Bug tree-optimization/38785] [4.5/4.6/4.7/4.8 Regression] huge performance regression on EEMBC bitmnp01

2012-03-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38785

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.4.7                       |4.5.4

--- Comment #29 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-03-13 12:46:14 UTC ---
The 4.4 branch is being closed; moving to the 4.5.4 target.