[Bug tree-optimization/18687] [4.0/4.1/4.2 Regression] ~50% compile time regression

2006-09-03 Thread steven at gcc dot gnu dot org


--- Comment #35 from steven at gcc dot gnu dot org  2006-09-03 17:28 ---
Even if we did not hash SCEV data a lot, it would not buy you >50%.
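
A rough Amdahl's-law check of that claim, as a sketch only. It assumes that
entry [68] of the call graph in comment #34 reads as htab_find_slot taking 2.3%
of total -O2 compile time (self plus children, all callers), and that the
profile is representative of the test cases timed in comment #32. The function
and variable names are illustrative.

```python
# Upper bound on the gain from removing one hot spot (Amdahl's law).
# Assumption: htab_find_slot accounts for 2.3% of total compile time,
# read from entry [68] of the gprof call graph in comment #34.

def max_speedup(fraction_removed: float) -> float:
    """Best-case speedup if `fraction_removed` of the run time vanished."""
    return 1.0 / (1.0 - fraction_removed)

htab_find_slot_share = 0.023          # all callers, not only the SCEV ones
observed_slowdown = 7.719 / 4.686     # -O2 hashes100.c, 4.2-svn vs 3.4.6 (comment #32)

best_case = max_speedup(htab_find_slot_share)
print(f"best-case speedup from dropping htab_find_slot: {best_case:.3f}x")
print(f"observed slowdown to recover:                   {observed_slowdown:.3f}x")
```

Under these assumptions, removing every htab_find_slot call would recover only
a few percent, far short of the slowdown measured against 3.4.6.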


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18687




2006-09-03 Thread rguenth at gcc dot gnu dot org


--- Comment #34 from rguenth at gcc dot gnu dot org  2006-09-03 13:22 ---
FYI, the profile (-O2) looks like

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
  2.04  0.62 0.62  5210670 0.00 0.00  htab_find_slot_with_hash
  2.01  1.23 0.61   135824 0.00 0.00  find_reloads
  1.78  1.77 0.54  1181450 0.00 0.00  constrain_operands
  1.71  2.29 0.52  1667653 0.00 0.00  walk_tree
  1.64  2.79 0.50   113076 0.00 0.00  record_reg_classes
  1.18  3.15 0.36  1374013 0.00 0.00  for_each_rtx_1
  1.15  3.50 0.35  2188275 0.00 0.00  iterative_hash_expr
  1.02  3.81 0.31 13803214 0.00 0.00  bitmap_bit_p
  0.99  4.11 0.30   146294 0.00 0.00  reload_cse_simplify_operands
  0.92  4.39 0.28  8238693 0.00 0.00  bitmap_set_bit
  0.89  4.66 0.27 13524760 0.00 0.00  is_gimple_min_invariant
  0.85  4.92 0.26  1944894 0.00 0.00  extract_insn
  0.76  5.15 0.23  3242786 0.00 0.00  note_stores
  0.76  5.38 0.23  1848575 0.00 0.00  mark_set_1
  0.76  5.61 0.23  1073359 0.00 0.00  fold_binary
  0.69  5.82 0.21  3530376 0.00 0.00  ix86_decompose_address
  0.66  6.02 0.20  2801807 0.00 0.00  is_gimple_reg
  0.66  6.22 0.20  101 0.00 0.02  reload
  0.62  6.41 0.19  6596295 0.00 0.00  find_reg_note
  0.62  6.60 0.19  3748843 0.00 0.00  ggc_alloc_stat
  0.62  6.79 0.19  1543768 0.00 0.00  force_fit_type
  0.59  6.97 0.18  5160059 0.00 0.00  pool_alloc
  0.59  7.15 0.18   937999 0.00 0.00  make_node_stat
  0.59  7.33 0.18 1915 0.00 0.00  cleanup_cfg
  0.56  7.50 0.17  2132681 0.00 0.00  mark_all_vars_used_1
  0.56  7.67 0.17  1015399 0.00 0.00  get_expr_operands
  0.56  7.84 0.17   263419 0.00 0.00  cse_insn
  0.53  8.00 0.16  2162651 0.00 0.00  cselib_lookup
  0.53  8.16 0.16  1176845 0.00 0.00  mark_used_regs
  0.53  8.32 0.16  101 0.00 0.01  reload_as_needed
  0.49  8.47 0.15  1748131 0.00 0.00  operand_equal_p
  0.49  8.62 0.15  1160005 0.00 0.00  propagate_one_insn
  0.49  8.77 0.15  1086424 0.00 0.00  et_splay
  0.49  8.92 0.15  1030222 0.00 0.00  rtx_cost
  0.49  9.07 0.15   523805 0.00 0.00  count_reg_usage
  0.46  9.21 0.14  3222872 0.00 0.00  memory_operand
  0.46  9.35 0.14  1651283 0.00 0.00  htab_find_with_hash
  0.46  9.49 0.14   693604 0.00 0.00  rewrite_update_stmt
  0.46  9.63 0.14   457856 0.00 0.00  mul_double
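
One way to quantify how flat this listing is, as a sketch: the three inputs are
copied from the table above, while the total run time is not reported there and
is estimated from the first row (0.62 s of self time shown as 2.04%). The
variable names are illustrative.

```python
# Rough flatness check for the flat profile above.
# The total run time is an estimate derived from the first row,
# not a figure stated in the profile itself.

top_self_seconds = 0.62      # htab_find_slot_with_hash, self seconds
top_percent = 2.04           # its share of total time, in percent
cumulative_listed = 9.63     # cumulative seconds at the last listed row (mul_double)

estimated_total = top_self_seconds / (top_percent / 100.0)
listed_share = cumulative_listed / estimated_total

print(f"estimated total run time: {estimated_total:.1f} s")
print(f"share covered by the listed rows: {listed_share:.0%}")
```

With the hottest function at about 2% and all listed rows together covering
roughly a third of the estimated total, no single function dominates.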

We're hashing SCEV data a lot:

                0.01    0.07  285600/2683522     set_instantiated_value [395]
                0.01    0.09  398929/2683522     build_int_cst_wide [214]
                0.02    0.20  864500/2683522     find_var_scev_info [184]
[68]     2.3    0.06    0.63 2683522         htab_find_slot [68]
                0.34    0.16 2683522/11019933     htab_find_slot_with_hash [53]
                0.03    0.00 1150100/1603800     hash_scev_info [589]


                0.06    0.03  125694/1181450     reload_cse_simplify_operands [94]
                0.09    0.05  201473/1181450     find_matches [200]
                0.32    0.16  698486/1181450     extract_constrain_insn_cached [67]
[61]     2.7    0.54    0.28 1181450+14400   constrain_operands [61]
                0.01    0.15  806498/873917      strict_memory_address_p [236]
                0.05    0.00  272106/402044      operands_match_p [415]


                0.05    0.04  100911/1588625     create_ssa_var_map [329]
                0.07    0.06  144529/1588625     walk_tree_without_duplicates [268]
                0.14    0.13  292781/1588625     count_uses_and_derefs [149]
                0.21    0.20  450984/1588625     remove_unused_locals [111]
[44]     3.9    0.52    0.68 1667653+3246362 walk_tree [44]
                0.06    0.18  337362/337362      find_used_portions [189]
                0.08    0.12 1659486/1659486     pointer_set_insert [209]
                0.00    0.06   97327/97327       scan_for_refs [487]
                0.05    0.00 1267779/1267779     count_ptr_derefs [525]


                0.01    0.56  143418/143418      regclass [71]
[80]     1.9    0.01    0.56  143418         scan_one_insn [80]
                0.50    0.04  113076/113076      record_reg_classes [85]
                0.01    0.01  100486/1944894     extract_insn [121]


                0.00    0.07  209960/1775362     x86_extended_reg_mentioned_p [364]
                0.02    0.26  747868/1775362     approx


2006-09-03 Thread steven at gcc dot gnu dot org


--- Comment #33 from steven at gcc dot gnu dot org  2006-09-03 11:41 ---
FWIW, the oprofile for both test cases is basically flat, nothing stands out.
We just do _so_ much more work (many more passes without removing anything) and
that hurts apparently (not surprising of course).






2006-09-03 Thread steven at gcc dot gnu dot org


--- Comment #32 from steven at gcc dot gnu dot org  2006-09-03 11:37 ---
Just to be sure that between 7/24 and today we didn't speed up significantly:

"real" times for hashes100.c (x86_64-linux, Intel Xeon 3.2 GHz, 1GB RAM):

      3.4.6      4.2-svn20060903  delta
-O0   0m1.618s   0m1.634s         +1%
-O1   0m2.743s   0m5.175s         +88%
-O2   0m4.686s   0m7.719s         +65%

"real" times for infcodes100.c:
      3.4        4.2-svn20060903  delta
-O0   0m3.040s   0m3.526s         +16%
-O1   0m4.989s   0m8.871s         +77%
-O2   0m8.375s   0m13.334s        +59%


Given these numbers, I would stick with gcc3 if I were a kernel developer.
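
The delta column can be re-derived from the "real" times with a short script.
This is a sketch: the helper names `seconds` and `delta_percent` are
illustrative, and only the hashes100.c rows are reproduced here.

```python
import re

def seconds(t: str) -> float:
    """Convert a `time`-style string such as '0m13.334s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", t)
    if m is None:
        raise ValueError(f"unrecognised time: {t!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

def delta_percent(old: str, new: str) -> float:
    """Slowdown of `new` relative to `old`, in percent."""
    return (seconds(new) / seconds(old) - 1.0) * 100.0

# (3.4.6, 4.2-svn20060903) "real" times for hashes100.c, copied from the table
hashes100 = {
    "-O0": ("0m1.618s", "0m1.634s"),
    "-O1": ("0m2.743s", "0m5.175s"),
    "-O2": ("0m4.686s", "0m7.719s"),
}

for level, (old, new) in hashes100.items():
    print(f"{level}: {delta_percent(old, new):+.1f}%")
```

The computed values agree with the delta column to within one percentage point;
the small differences come down to rounding versus truncation.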






2006-09-03 Thread steven at gcc dot gnu dot org


--- Comment #31 from steven at gcc dot gnu dot org  2006-09-03 11:05 ---
"real" times for hashes100.c (x86_64-linux, Intel Xeon 3.2 GHz, 1GB RAM):

      3.4.6      4.0.4      4.1.2      4.2-svn20060724
-O0   0m1.618s   0m1.762s   0m1.661s   0m1.645s
-O1   0m2.743s   0m4.646s   0m4.984s   0m4.936s
-O2   0m4.686s   0m6.814s   0m7.140s   0m7.603s


"real" times for infcodes100.c:

      3.4.6      4.0.4      4.1.2      4.2-svn20060724
-O0   0m3.040s   0m3.643s   0m3.555s   0m3.575s
-O1   0m4.989s   0m7.694s   0m8.809s   0m8.943s
-O2   0m8.375s   0m10.622s  0m12.136s  0m13.285s






2006-07-05 Thread pinskia at gcc dot gnu dot org


--- Comment #30 from pinskia at gcc dot gnu dot org  2006-07-05 09:14 ---
Can you redo these timings on mainline? It looks like Richard G.'s memory
patches also improved compile time for C, at least on the CSiBE benchmark.






2006-05-24 Thread mmitchel at gcc dot gnu dot org


--- Comment #29 from mmitchel at gcc dot gnu dot org  2006-05-25 02:32 ---
Will not be fixed in 4.1.1; adjust target milestone to 4.1.2.


-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.1.1                       |4.1.2






2006-02-23 Thread mmitchel at gcc dot gnu dot org


--- Comment #28 from mmitchel at gcc dot gnu dot org  2006-02-24 00:25 ---
This issue will not be resolved in GCC 4.1.0; retargeted at GCC 4.1.1.


-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.0.3                       |4.1.1

