[Bug middle-end/80960] [5/6/7/8 Regression] Huge memory use when compiling a very large test case

2017-06-06 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80960

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mliska at suse dot cz,
                   |                            |segher at gcc dot gnu.org

--- Comment #8 from Richard Biener  ---
The first "bisection" possibly points at r190594, which limited the work FRE
does (with the effect of leaving some things unoptimized).

More precise bisection would be appreciated.

With GCC 7.1 and -O1 (recommended for machine-generated code) I see 3.7GB
of RAM in use.

The code contains a very large basic-block.

I do remember compile-time/memory-hog PRs for this code style.

compile-time analysis using perf highlights:

Samples: 680K of event 'cycles:pp', Event count (approx.): 607299351135
Overhead  Command  Shared Object  Symbol
  26.58%  f951     f951           [.] refers_to_regno_p
   9.64%  f951     f951           [.] reg_overlap_mentioned_p
   7.08%  f951     f951           [.] find_hard_regno_for_1
   4.42%  f951     f951           [.] reg_used_between_p
   1.90%  f951     f951           [.] get_last_value_validate
which probably means we're doing some quadratic amount of work on use->def
chains inside the BB.  With call traces:

+   48.54%     1.49%  f951  f951  [.] try_combine
-   25.59%    25.55%  f951  f951  [.] refers_to_regno_p
   - refers_to_regno_p
      - 16.98% reg_overlap_mentioned_p
         - 16.00% reg_used_between_p
              can_combine_p
              try_combine
...
      - 4.69% refers_to_regno_p
         - 4.66% reg_overlap_mentioned_p
            - 4.36% reg_used_between_p
                 can_combine_p
                 try_combine


So it's combine (at least at -O1), and I can imagine it is also what uses up
the memory: its attempts to simplify and match things up involve a lot of
copying, which IIRC all goes through GC memory.  Segher?

/* Nonzero if register REG is used in an insn between
   FROM_INSN and TO_INSN (exclusive of those two).  */

int
reg_used_between_p (const_rtx reg, const rtx_insn *from_insn,
                    const rtx_insn *to_insn)
{
  rtx_insn *insn;

  if (from_insn == to_insn)
    return 0;

  for (insn = NEXT_INSN (from_insn); insn != to_insn; insn = NEXT_INSN (insn))
    if (NONDEBUG_INSN_P (insn)
        && (reg_overlap_mentioned_p (reg, PATTERN (insn))
            || (CALL_P (insn) && find_reg_fusage (insn, USE, reg))))
      return 1;
  return 0;
}

so that just walks the BB instead of, say, using DF uses (if available during
combine), or somehow recording "distance" between two rtx_insns to be able
to cap the amount of work done (and conservatively return true).  After all
it's going to end up combining very "distant" instructions here (remember,
gigantic basic-block).
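
To make the "cap the work and conservatively return true" idea concrete, here
is a minimal standalone sketch.  It is not GCC code and not a proposed patch:
struct insn, MAX_WALK, build_block and reg_used_between_capped are all
hypothetical names invented for illustration, and each toy insn reads exactly
one register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an rtx_insn: each insn reads one register.  */
struct insn
{
  int used_reg;        /* register this insn reads */
  struct insn *next;   /* next insn in the basic block, or NULL */
};

/* Hypothetical cap on the number of insns a single query may visit.  */
#define MAX_WALK 1000

/* Link the N insns in ARR into one chain; insn K reads register K.  */
static void
build_block (struct insn *arr, size_t n)
{
  assert (n > 0);
  for (size_t k = 0; k < n; k++)
    {
      arr[k].used_reg = (int) k;
      arr[k].next = (k + 1 < n) ? &arr[k + 1] : NULL;
    }
}

/* Return true if REG is read by an insn strictly between FROM and TO.
   If more than MAX_WALK insns would have to be visited, give up and
   return true as the conservative answer.  *STEPS is incremented once
   per insn actually inspected.  */
static bool
reg_used_between_capped (int reg, const struct insn *from,
                         const struct insn *to, size_t *steps)
{
  size_t budget = MAX_WALK;

  if (from == to)
    return false;

  for (const struct insn *i = from->next; i != NULL && i != to; i = i->next)
    {
      if (budget-- == 0)
        return true;   /* too distant: assume the register is used */
      ++*steps;
      if (i->used_reg == reg)
        return true;
    }
  return false;
}
```

With a cap like this, one query costs at most MAX_WALK steps no matter how
large the basic block is, so the total work stays linear in the number of
queries.  The price is that combinations of very distant instructions are
refused even when they would have been valid.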

[Bug middle-end/80960] [5/6/7/8 Regression] Huge memory use when compiling a very large test case

2017-06-03 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80960

--- Comment #7 from Dominique d'Humieres  ---
The second change in memory size occurred between revisions r201916 (1996 MB)
and r202560 (6196 MB).

[Bug middle-end/80960] [5/6/7/8 Regression] Huge memory use when compiling a very large test case

2017-06-03 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80960

--- Comment #6 from Dominique d'Humieres  ---
Timings and memory use for various releases. They show a timing regression for
gcc-7 and trunk compared to gcc-6.

% time gfortran-4.8.5 pr80960_db.f90 -fdefault-integer-8 -O2 -ftime-report
...
 phase opt and generate : 349.98 (100%) usr   5.90 ( 99%) sys 356.90 (100%) wall 1986145 kB (100%) ggc
...
 combiner               : 194.33 ( 55%) usr   3.10 ( 52%) sys 197.71 ( 55%) wall 1754287 kB ( 88%) ggc
...
 TOTAL                  : 350.37              5.93            357.33             1995723 kB
% time gfortran-4.9.3 pr80960_db.f90 -fdefault-integer-8 -O2 -ftime-report
...
 phase opt and generate : 264.34 (100%) usr   6.55 ( 99%) sys 272.05 (100%) wall 6179333 kB (100%) ggc
...
 combiner               : 204.75 ( 77%) usr   3.75 ( 57%) sys 209.44 ( 77%) wall 5948591 kB ( 96%) ggc
...
 TOTAL                  : 264.72              6.59            272.52             6189174 kB
% time gfortran-5.4 pr80960_db.f90 -fdefault-integer-8 -O2 -ftime-report
...
 phase opt and generate : 235.60 (100%) usr   3.54 ( 93%) sys 239.57 (100%) wall 4455669 kB (100%) ggc
...
 combiner               : 160.31 ( 68%) usr   1.40 ( 37%) sys 161.98 ( 67%) wall 1624073 kB ( 36%) ggc
...
 TOTAL                  : 236.00              3.79            240.26             4465708 kB
% time gfortran-6.3 pr80960_db.f90 -fdefault-integer-8 -O2 -ftime-report
...
 phase opt and generate : 138.71 (100%) usr   2.42 ( 92%) sys 141.23 ( 99%) wall 4493675 kB (100%) ggc
...
 combiner               :  74.35 ( 53%) usr   1.11 ( 42%) sys  75.45 ( 53%) wall 2700700 kB ( 60%) ggc
...
 TOTAL                  : 139.16              2.64            141.97             4503966 kB
% time gfortran-7.1 pr80960_db.f90 -fdefault-integer-8 -O2 -ftime-report
...
 phase opt and generate : 233.01 (100%) usr   3.72 ( 90%) sys 237.31 ( 99%) wall 4498167 kB (100%) ggc
...
 combiner               : 158.53 ( 68%) usr   1.97 ( 48%) sys 160.90 ( 67%) wall 2700700 kB ( 60%) ggc
...
 TOTAL                  : 233.47              4.13            238.51             4508597 kB
% time gfortran-8.0 pr80960_db.f90 -fdefault-integer-8 -O2 -ftime-report
...
 phase opt and generate : 235.01 (100%) usr   3.44 ( 93%) sys 238.88 (100%) wall 4498191 kB (100%) ggc
...
 combiner               : 159.18 ( 68%) usr   1.79 ( 48%) sys 161.19 ( 67%) wall 2700700 kB ( 60%) ggc
...
 TOTAL                  : 235.47              3.70            239.66             4508355 kB

[Bug middle-end/80960] [5/6/7/8 Regression] Huge memory use when compiling a very large test case

2017-06-02 Thread dominiq at lps dot ens.fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80960

Dominique d'Humieres  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #5 from Dominique d'Humieres  ---
The most important change (a factor of about 9) occurred between revisions
r190586 (229 MB) and r190619 (1988 MB).

% time /opt/gcc/gcc4.8p-190586/bin/gfortran -c pr80960.f90 -fdefault-integer-8 -O2 -ftime-report

Execution times (seconds)
 phase setup            :   0.03 (  0%) usr   0.02 (  1%) sys   0.27 (  0%) wall     192 kB (  0%) ggc
 phase parsing          :   0.53 (  0%) usr   0.06 (  3%) sys   0.74 (  0%) wall    9890 kB (  4%) ggc
 phase opt and generate : 156.19 (100%) usr   1.80 ( 95%) sys 158.35 ( 99%) wall  218850 kB ( 95%) ggc
...
 alias analysis         :   0.15 (  0%) usr   0.00 (  0%) sys   0.16 (  0%) wall   11577 kB (  5%) ggc
 alias stmt walking     :  70.08 ( 45%) usr   0.24 ( 13%) sys  70.61 ( 44%) wall       0 kB (  0%) ggc
...
 tree PRE               :   0.74 (  0%) usr   0.04 (  2%) sys   0.66 (  0%) wall    6401 kB (  3%) ggc
 tree FRE               :  12.30 (  8%) usr   0.26 ( 14%) sys  12.33 (  8%) wall   20878 kB (  9%) ggc
...
 dead store elim1       :   1.61 (  1%) usr   0.03 (  2%) sys   1.63 (  1%) wall    2235 kB (  1%) ggc
 dead store elim2       :  16.73 ( 11%) usr   0.01 (  1%) sys  16.74 ( 11%) wall    3902 kB (  2%) ggc
...
 integrated RA          :  24.05 ( 15%) usr   0.06 (  3%) sys  24.11 ( 15%) wall   32565 kB ( 14%) ggc
 reload                 :   7.82 (  5%) usr   0.22 ( 12%) sys   8.04 (  5%) wall   11106 kB (  5%) ggc
 reload CSE regs        :   5.41 (  3%) usr   0.01 (  1%) sys   5.42 (  3%) wall    6071 kB (  3%) ggc
...
 TOTAL                  : 156.78              1.90            159.42              229270 kB


% time /opt/gcc/gcc4.8a-190619/bin/gfortran -c pr80960.f90 -fdefault-integer-8 -O2 -ftime-report

Execution times (seconds)
 phase setup            :   0.03 (  0%) usr   0.02 (  0%) sys   0.30 (  0%) wall     192 kB (  0%) ggc
 phase parsing          :   0.53 (  0%) usr   0.06 (  0%) sys   0.77 (  0%) wall    9866 kB (  0%) ggc
 phase opt and generate : 446.51 (100%) usr  14.64 ( 99%) sys 697.47 (100%) wall 1978039 kB ( 99%) ggc
...
 alias analysis         :   0.20 (  0%) usr   0.00 (  0%) sys   0.25 (  0%) wall   11577 kB (  1%) ggc
 alias stmt walking     :  23.37 (  5%) usr   0.40 (  3%) sys  22.98 (  3%) wall       0 kB (  0%) ggc
...
 tree PRE               :   8.44 (  2%) usr   0.06 (  0%) sys   8.84 (  1%) wall   10424 kB (  1%) ggc
 tree FRE               :  17.99 (  4%) usr   0.25 (  2%) sys  18.79 (  3%) wall   26063 kB (  1%) ggc
...
 dead code elimination  :   0.12 (  0%) usr   0.00 (  0%) sys   0.12 (  0%) wall       0 kB (  0%) ggc
 dead store elim1       :   2.25 (  1%) usr   0.01 (  0%) sys   2.25 (  0%) wall    2630 kB (  0%) ggc
 dead store elim2       :  15.55 (  3%) usr   0.02 (  0%) sys  15.56 (  2%) wall    3329 kB (  0%) ggc
 CPROP                  :   0.00 (  0%) usr   0.00 (  0%) sys   0.01 (  0%) wall       4 kB (  0%) ggc
 CSE 2                  :   0.84 (  0%) usr   0.01 (  0%) sys   0.83 (  0%) wall       1 kB (  0%) ggc
 branch prediction      :   0.01 (  0%) usr   0.00 (  0%) sys   0.01 (  0%) wall      10 kB (  0%) ggc
 combiner               : 328.61 ( 73%) usr   9.14 ( 62%) sys 519.94 ( 74%) wall 1754285 kB ( 88%) ggc
 regmove                :  16.52 (  4%) usr   0.01 (  0%) sys  16.54 (  2%) wall       0 kB (  0%) ggc
 integrated RA          :  10.49 (  2%) usr   0.21 (  1%) sys  15.92 (  2%) wall   32785 kB (  2%) ggc
 reload                 :   5.88 (  1%) usr   0.40 (  3%) sys  22.38 (  3%) wall    9546 kB (  0%) ggc
 reload CSE regs        :   4.90 (  1%) usr   0.00 (  0%) sys   4.90 (  1%) wall    6073 kB (  0%) ggc
...
 TOTAL                  : 447.10             14.81            698.71             1988307 kB

[Bug middle-end/80960] [5/6/7/8 Regression] Huge memory use when compiling a very large test case

2017-06-02 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80960

Thomas Koenig  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |5.5

[Bug middle-end/80960] [5/6/7/8 Regression] Huge memory use when compiling a very large test case

2017-06-02 Thread tkoenig at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80960

Thomas Koenig  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|amd64 Linux                 |
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |memory-hog
   Last reconfirmed|                            |2017-06-02
          Component|fortran                     |middle-end
               Host|amd64 Linux                 |
     Ever confirmed|0                           |1
            Summary|[regression since 4.9.2]    |[5/6/7/8 Regression] Huge
                   |gfortran crashes when       |memory use when compiling a
                   |compiling f90 file with msg |very large test case
                   |"Out of memory: Kill        |
                   |process 538 (f951)"         |