[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-03-02 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #32 from Martin Jambor  ---
(In reply to Martin Jambor from comment #30)
> I think that using the same approach to cache ipa_vr
> structures (used to store results of IPA-VR) could bring further
> savings

They were not really significant, so let's leave them as they are now.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-03-01 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Markus Trippelsdorf  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #31 from Markus Trippelsdorf  ---
Fixed, thanks.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-03-01 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #30 from Martin Jambor  ---
With the above commit, we have avoided the vast majority of the memory use
increase.  I think that using the same approach to cache ipa_vr
structures (used to store results of IPA-VR) could bring further
savings (possibly a hundred megabytes?) so I will try that.
In any event, this may no longer qualify as P1.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-03-01 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #29 from Martin Jambor  ---
Author: jamborm
Date: Wed Mar  1 09:37:27 2017
New Revision: 245805

URL: https://gcc.gnu.org/viewcvs?rev=245805&root=gcc&view=rev
Log:
[PR 78140] Reuse same IPA bits and VR info

2017-03-01  Martin Jambor  

PR lto/78140
* ipa-prop.h (ipa_bits): Removed field known.
(ipa_jump_func): Removed field vr_known.  Changed fields bits and m_vr
to pointers.  Adjusted their comments to warn about their sharing.
(ipcp_transformation_summary): Change bits to a vector of pointers.
(ipa_check_create_edge_args): Moved to ipa-prop.c, declare.
(ipa_get_ipa_bits_for_value): Declare.
* tree-vrp.h (value_range): Mark as GTY((for_user)).
* ipa-prop.c (ipa_bit_ggc_hash_traits): New.
(ipa_bits_hash_table): Likewise.
(ipa_vr_ggc_hash_traits): Likewise.
(ipa_vr_hash_table): Likewise.
(ipa_print_node_jump_functions_for_edge): Adjust for bits and m_vr
being pointers and vr_known being removed.
(ipa_set_jf_unknown): Likewise.
(ipa_get_ipa_bits_for_value): New function.
(ipa_set_jfunc_bits): Likewise.
(ipa_get_value_range): New overloaded functions.
(ipa_set_jfunc_vr): Likewise.
(ipa_compute_jump_functions_for_edge): Use the above functions to
construct bits and vr parts of jump functions.
(ipa_check_create_edge_args): Move here from ipa-prop.h, also allocate
ipa_bits_hash_table and ipa_vr_hash_table if they do not already
exist.
(ipcp_grow_transformations_if_necessary): Also allocate
ipa_bits_hash_table and ipa_vr_hash_table if they do not already
exist.
(ipa_node_params_t::duplicate): Do not copy bits, just pointers to
them.  Fix too long lines.
(ipa_write_jump_function): Adjust for bits and m_vr being pointers and
vr_known being removed.
(ipa_read_jump_function): Use new setter functions to construct bits
and vr parts of jump functions or set them to NULL.
(write_ipcp_transformation_info): Adjust for bits being pointers.
(read_ipcp_transformation_info): Likewise.
(ipcp_update_bits): Likewise.  Fix excessively long lines and a trailing
space.
Include gt-ipa-prop.h.
* ipa-cp.c (propagate_bits_across_jump_function): Adjust for bits
being pointers.
(ipcp_store_bits_results): Likewise.
(propagate_vr_across_jump_function): Adjust for m_vr being a pointer.
Do not write to existing jump functions but use a temporary instead.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-cp.c
trunk/gcc/ipa-prop.c
trunk/gcc/ipa-prop.h
trunk/gcc/tree-vrp.h

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-02-24 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #28 from Martin Jambor  ---
(In reply to Martin Jambor from comment #27)
> Unfortunately, something else has added a further gigabyte to WPA of
> FF in the last week:

So this fortunately turned out to be a mistake in measurement: I was
comparing a --enable-gather-detailed-mem-stats build with a normal
one.  The correct values are:

  | compiler            | wpa mem (KB) | wpa mem (GB) |
  |---------------------+--------------+--------------|
  | gcc 6 branch        |      4046451 |         3.86 |
  | trunk rev. 245382   |      5468227 |         5.21 |
  | patched rev. 245382 |      4255799 |         4.06 |
  | trunk rev. 245595   |      5452515 |         5.20 |
  | patched rev. 245595 |      4240379 |         4.04 |

Thus, the patch avoids most of the reported increase in memory use.
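
To put a number on "most", the corrected figures for rev. 245595 can be run through a short calculation. This is only arithmetic on the KB column above; the variable names are made up for the illustration.

```python
KB_PER_GB = 1024 ** 2

# WPA memory in KB, copied from the corrected table (rev. 245595 rows).
wpa_kb = {
    "gcc 6 branch": 4046451,
    "trunk rev. 245595": 5452515,
    "patched rev. 245595": 4240379,
}


def gb(kb):
    """Convert KB to GB, rounded to two decimals like the table."""
    return round(kb / KB_PER_GB, 2)


# Growth of trunk over gcc 6, and how much of it the patch removes.
regression_kb = wpa_kb["trunk rev. 245595"] - wpa_kb["gcc 6 branch"]
saved_kb = wpa_kb["trunk rev. 245595"] - wpa_kb["patched rev. 245595"]
recovered = saved_kb / regression_kb
remaining_kb = wpa_kb["patched rev. 245595"] - wpa_kb["gcc 6 branch"]

print(f"recovered {recovered:.0%} of the regression, {remaining_kb} KB remain")
```

On these numbers the patch recovers roughly 86% of the increase over the gcc 6 branch.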

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-02-22 Thread jamborm at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #27 from Martin Jambor  ---
I have submitted a patch to the mailing list, which re-uses
value_ranges and ipa_bits in jump functions and manages to save more
than one gigabyte of memory:

https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01369.html

Unfortunately, something else has added a further gigabyte to WPA of
FF in the last week:

  | compiler            | wpa mem (GB) |
  |---------------------+--------------|
  | gcc 6 branch        |         3.86 |
  | trunk rev. 245382   |         5.21 |
  | patched rev. 245382 |         4.06 |
  | trunk rev. 245595   |         6.59 |
  | patched rev. 245595 |         5.25 |

I will try bisecting to find if there is one single change responsible
for this.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-02-01 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-01-22 Thread kugan at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

kugan at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kugan at gcc dot gnu.org

--- Comment #26 from kugan at gcc dot gnu.org ---
(In reply to Richard Biener from comment #20)
> Look at tree-ssanames.c:range_info_def for "tricks" (make them variable
> size):
> 
> /* Value range information for SSA_NAMEs representing non-pointer
>    variables.  */
> 
> struct GTY ((variable_size)) range_info_def {
>   /* Minimum, maximum and nonzero bits.  */
>   TRAILING_WIDE_INT_ACCESSOR (min, ints, 0)
>   TRAILING_WIDE_INT_ACCESSOR (max, ints, 1)
>   TRAILING_WIDE_INT_ACCESSOR (nonzero_bits, ints, 2)
>   trailing_wide_ints <3> ints;
> };

I am working on a patch to change ipa vrp based on the above.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-01-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #25 from Martin Liška  ---
Created attachment 40549
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40549&action=edit
GCC 7 -fmem-report

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-01-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #24 from Martin Liška  ---
Created attachment 40548
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40548&action=edit
GCC 6 -fmem-report

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-01-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #23 from Martin Liška  ---
Depending on the memory layout of the structures, these 2 structures increase
memory use by about ((32+88)*3258685)/(1024**2) ~372 MB.
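
The estimate can be reproduced directly. A small sketch, using only the sizes from comment #22 and the jump function count from comment #21; the constant names are invented for the example.

```python
SIZEOF_VALUE_RANGE = 32    # bytes on x86_64, from comment #22
SIZEOF_IPA_BITS = 88       # bytes on x86_64, from comment #22
JUMP_FUNCTIONS = 3258685   # jump functions at the end of WPA, from comment #21

# Every jump function embeds one value_range and one ipa_bits by value.
extra_bytes = (SIZEOF_VALUE_RANGE + SIZEOF_IPA_BITS) * JUMP_FUNCTIONS
extra_mib = extra_bytes / 1024 ** 2

print(f"{extra_bytes} bytes = {extra_mib:.1f} MiB")
```

This ignores padding inside the enclosing jump function structure, which is why the comment hedges on memory layout.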

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-01-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Martin Liška  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|marxin at gcc dot gnu.org   |jamborm at gcc dot gnu.org

--- Comment #22 from Martin Liška  ---
Btw. sizeof(value_range) == 32 and sizeof(ipa_bits) == 88 on an x86_64 machine.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-01-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #21 from Martin Liška  ---
Looking at the number of distinct value ranges and bits, we get:

grep hash_vr /tmp/7.dump.ipa | sort | uniq -c | wc -l
65224

grep hash_bits /tmp/7.dump.ipa | sort | uniq -c | wc -l
13421

Where total # of jump functions at the end of WPA is 3258685.
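
These counts suggest how much sharing could save. The following back-of-the-envelope sketch compares embedding both structures in every jump function with storing each distinct record once and keeping two pointers per jump function. It assumes 8-byte pointers and ignores hash table overhead; both are assumptions of the sketch, not measurements from the thread.

```python
JUMP_FUNCTIONS = 3258685
DISTINCT_VR = 65224        # distinct hash_vr entries in the dump
DISTINCT_BITS = 13421      # distinct hash_bits entries in the dump
SIZEOF_VALUE_RANGE = 32    # bytes, x86_64 (comment #22)
SIZEOF_IPA_BITS = 88       # bytes, x86_64 (comment #22)
POINTER_SIZE = 8           # assumption: 64-bit host

# Every jump function carries its own copy of both structures.
embedded_bytes = (SIZEOF_VALUE_RANGE + SIZEOF_IPA_BITS) * JUMP_FUNCTIONS

# Each distinct record stored once, plus two pointers per jump function.
shared_bytes = (
    DISTINCT_VR * SIZEOF_VALUE_RANGE
    + DISTINCT_BITS * SIZEOF_IPA_BITS
    + 2 * POINTER_SIZE * JUMP_FUNCTIONS
)

ratio = embedded_bytes / shared_bytes
print(f"embedded: {embedded_bytes / 1024 ** 2:.1f} MiB, "
      f"shared: {shared_bytes / 1024 ** 2:.1f} MiB")
```

Under these assumptions the pointers dominate the shared layout, and the shared records themselves cost only a few megabytes.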

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-01-19 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #20 from Richard Biener  ---
Look at tree-ssanames.c:range_info_def for "tricks" (make them variable size):

/* Value range information for SSA_NAMEs representing non-pointer
   variables.  */

struct GTY ((variable_size)) range_info_def {
  /* Minimum, maximum and nonzero bits.  */
  TRAILING_WIDE_INT_ACCESSOR (min, ints, 0)
  TRAILING_WIDE_INT_ACCESSOR (max, ints, 1)
  TRAILING_WIDE_INT_ACCESSOR (nonzero_bits, ints, 2)
  trailing_wide_ints <3> ints;
};
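
The point of the "variable size" trick is that the three integers are allocated with only as many words as their precision needs, instead of reserving room for the widest supported integer. A rough Python model of the size difference follows. The 64-bit word size and the 256-bit maximum precision are assumptions chosen for illustration, and header fields such as the stored precision are ignored.

```python
import math

WORD_BITS = 64  # assumption: one storage word is 64 bits


def words_needed(precision_bits):
    """64-bit words required to hold one integer of the given precision."""
    return math.ceil(precision_bits / WORD_BITS)


def fixed_size_bytes(n_ints, max_precision_bits):
    """Every integer reserves room for the widest supported precision."""
    return n_ints * words_needed(max_precision_bits) * 8


def trailing_size_bytes(n_ints, precision_bits):
    """All integers share one precision; only that many words are allocated."""
    return n_ints * words_needed(precision_bits) * 8


# min, max and nonzero_bits for a 32-bit variable.
fixed = fixed_size_bytes(3, max_precision_bits=256)
trailing = trailing_size_bytes(3, precision_bits=32)
print(fixed, trailing)
```

For the common case of 32-bit or 64-bit values, the trailing layout in this model needs one word per integer.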

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-01-19 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Jan Hubicka  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kuganv at linaro dot org

--- Comment #19 from Jan Hubicka  ---
Looking into detailed mem reports we have increase in jump functions size:
ipa-prop.c:4701 (ipa_read_node_info)  0:  0.0%  47378288:  2.4% 161144168:  6.7%  20010080: 11.8%   1238962
to
ipa-prop.c:5047 (ipa_read_node_info)  0:  0.0%  74541136:  3.1% 567308480: 17.9%  13645376:  7.0%   1238212

So while we read about the same number of jump functions, the memory usage almost
triples.  The reason is that jump functions got a lot bigger now:

  /* Information about zero/non-zero bits.  */  
  struct ipa_bits bits; 

  /* Information about value range, containing valid data only when vr_known is 
 true.  */  
  value_range m_vr; 
  bool vr_known;

where

/* Information about zero/non-zero bits.  */
struct GTY(()) ipa_bits
{
  /* The propagated value.  */
  widest_int value;
  /* Mask corresponding to the value.
 Similar to ccp_lattice_t, if xth bit of mask is 0,
 implies xth bit of value is constant.  */
  widest_int mask;
  /* True if jump function is known.  */
  bool known;
};

/* Info about value ranges.  */
struct GTY(()) ipa_vr
{
  /* The data fields below are valid only if known is true.  */
  bool known;
  enum value_range_type type;
  wide_int min;
  wide_int max;
};

I think the two wide_ints and two widest_ints are the major offenders.  We need to find
a way to avoid allocating them for all nodes.  Perhaps implement sharing of
equal ipa_bits and ipa_vr records?
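
Sharing equal records is a form of interning (hash-consing): a table maps each distinct record to one canonical copy, and every jump function stores a reference to that copy. A minimal Python sketch of the idea is below. `InternTable` and the sample (value, mask) pairs are hypothetical and say nothing about how GCC's hash tables are implemented.

```python
class InternTable:
    """Maps each distinct record to a single canonical shared instance."""

    def __init__(self):
        self._records = {}
        self.requests = 0

    def get(self, *fields):
        """Return the shared record equal to `fields`, creating it if needed."""
        self.requests += 1
        return self._records.setdefault(fields, fields)

    def __len__(self):
        return len(self._records)


bits_table = InternTable()

# Four jump functions, but only two distinct (value, mask) pairs.
jump_functions = [
    bits_table.get(value, mask)
    for value, mask in [(0, 0xFF), (0, 0xFF), (4, 0xF0), (0, 0xFF)]
]

print(f"{bits_table.requests} requests, {len(bits_table)} records stored")
```

The saving grows with the ratio of jump functions to distinct records, which is why the counts of distinct ranges and bits matter.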

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-01-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #18 from Martin Liška  ---
Created attachment 40545
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40545&action=edit
GCC 7 graph

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-01-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #17 from Martin Liška  ---
Created attachment 40544
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40544&action=edit
GCC 6 graph

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-01-19 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #16 from Martin Liška  ---
It's still reproducible with current trunk; the increase is over 1GB on my development
machine. I wrote a simple script that dumps the sizes of all LTO objects loaded into
WPA:

GCC 7:
asm : 19.67 KB
profile : 31.76 KB
pureconst   : 1.08 MB
refs: 1.66 MB
icf : 2.44 MB
inline  : 7.05 MB
symbol_nodes: 13.63 MB
jmpfuncs: 14.98 MB
symtab  : 59.27 MB
decls   : 287.71 MB
symbols : 564.43 MB
total   : 952.31 MB

Total symbols: 505244

GCC 6:
./parse-lto.py /tmp/6.txt
asm : 19.66 KB
profile : 34.56 KB
pureconst   : 1.09 MB
refs: 1.67 MB
icf : 2.43 MB
inline  : 7.03 MB
jmpfuncs: 10.30 MB
symbol_nodes: 13.66 MB
symtab  : 59.89 MB
decls   : 284.59 MB
symbols : 559.50 MB
total   : 940.21 MB

Total symbols: 503275

Thus I guess there's no significant difference in the amount of streamed data.
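
The two listings can be compared numerically to back this up. The snippet below uses a few of the larger sections plus the totals, copied from the output above; the dictionary and function names are made up for the example.

```python
# Section sizes in MB, copied from the two listings above.
gcc7_mb = {"jmpfuncs": 14.98, "symtab": 59.27, "decls": 287.71,
           "symbols": 564.43, "total": 952.31}
gcc6_mb = {"jmpfuncs": 10.30, "symtab": 59.89, "decls": 284.59,
           "symbols": 559.50, "total": 940.21}


def growth(section):
    """Relative growth of a section from GCC 6 to GCC 7."""
    return gcc7_mb[section] / gcc6_mb[section] - 1.0


total_delta_mb = round(gcc7_mb["total"] - gcc6_mb["total"], 2)

for name in gcc7_mb:
    print(f"{name:10s} {growth(name):+.1%}")
```

The streamed total grows by about 12 MB (a little over 1%), far less than the reported increase of over 1GB in WPA memory. The jump function section grows the most in relative terms, by about 45%, but only by a few megabytes on disk.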

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2017-01-19 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #15 from Jan Hubicka  ---
How does the memory use look with current tree?

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-12-01 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Jan Hubicka  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-12-01
     Ever confirmed|0                           |1

--- Comment #14 from Jan Hubicka  ---
From a quick glance it seems to be mostly GGC memory related to
ipa-cp/ipa-inline and the global stream. Perhaps we just manage to make many more
cloning/inlining decisions than before?  How do the code size and inline
dumps compare?

I will try to reproduce this. We need a detailed mem report and to check whether the
optimization decisions diverge or whether it is just extra stuff brought in by the
extended jump functions or extra data we stream.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-11-02 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #13 from Markus Trippelsdorf  ---
(In reply to Martin Liška from comment #12)
> (In reply to Markus Trippelsdorf from comment #11)
> > js/src/jit/BaselineCompiler.cpp
> 
> Hm, I see the R0 defined as:
> 
> # 1 "/home/marxin/BIG/buildbot/slave/source/firefox/js/src/jit/x64/SharedICRegisters-x64.h" 1
> # 12 "/home/marxin/BIG/buildbot/slave/source/firefox/js/src/jit/x64/SharedICRegisters-x64.h"
> namespace js {
> namespace jit {
> 
> static constexpr Register BaselineFrameReg = rbp;
> static constexpr Register BaselineStackReg = rsp;
> 
> static constexpr ValueOperand R0(rcx);
> 
> not as an ASM statement.

Yes, you're right. I only took a cursory look and got confused by all these
masm. statements.
Not sure how to debug this further.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-11-02 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #12 from Martin Liška  ---
(In reply to Markus Trippelsdorf from comment #11)
> js/src/jit/BaselineCompiler.cpp

Hm, I see the R0 defined as:

# 1 "/home/marxin/BIG/buildbot/slave/source/firefox/js/src/jit/x64/SharedICRegisters-x64.h" 1
# 12 "/home/marxin/BIG/buildbot/slave/source/firefox/js/src/jit/x64/SharedICRegisters-x64.h"
namespace js {
namespace jit {

static constexpr Register BaselineFrameReg = rbp;
static constexpr Register BaselineStackReg = rsp;

static constexpr ValueOperand R0(rcx);

not as an ASM statement.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-11-02 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #11 from Markus Trippelsdorf  ---
js/src/jit/BaselineCompiler.cpp

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-11-02 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #10 from Martin Liška  ---
(In reply to Markus Trippelsdorf from comment #9)
> (In reply to Martin Liška from comment #8)
> > (In reply to Markus Trippelsdorf from comment #7)
> > > BTW Firefox trunk fails to build for me:
> > > 
> > > ld: error: /tmp/ccsbLieS.ltrans29.ltrans.o: requires dynamic R_X86_64_PC32
> > > reloc against '_ZN2js3jitL2R0E' which may overflow at runtime; recompile
> > > with -fPIC
> > > ld: error: read-only segment has dynamic relocations
> > > /tmp/ccsbLieS.ltrans29.ltrans.o::function
> > > js::jit::BaselineCompiler::emitCheckThis(js::jit::ValueOperand) [clone
> > > .constprop.20226]: error: undefined reference to 'js::jit::R0'
> > > 
> > > Haven't looked into it yet. Could well be a Firefox bug.
> > 
> > This looks known to me, I used to see this unresolved symbol, but currently
> > it's gone on x86_64-linux-gnu.
> 
> Not for me. I hit the issue yesterday with gcc trunk and mozilla trunk.
> js::jit::R0 is an asm statement, that could end up in the wrong partition.

Ah, I see. Can you please name the source file where it's defined? I can't grep
the symbol.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-11-02 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #9 from Markus Trippelsdorf  ---
(In reply to Martin Liška from comment #8)
> (In reply to Markus Trippelsdorf from comment #7)
> > BTW Firefox trunk fails to build for me:
> > 
> > ld: error: /tmp/ccsbLieS.ltrans29.ltrans.o: requires dynamic R_X86_64_PC32
> > reloc against '_ZN2js3jitL2R0E' which may overflow at runtime; recompile
> > with -fPIC
> > ld: error: read-only segment has dynamic relocations
> > /tmp/ccsbLieS.ltrans29.ltrans.o::function
> > js::jit::BaselineCompiler::emitCheckThis(js::jit::ValueOperand) [clone
> > .constprop.20226]: error: undefined reference to 'js::jit::R0'
> > 
> > Haven't looked into it yet. Could well be a Firefox bug.
> 
> This looks known to me, I used to see this unresolved symbol, but currently
> it's gone on x86_64-linux-gnu.

Not for me. I hit the issue yesterday with gcc trunk and mozilla trunk.
js::jit::R0 is an asm statement, that could end up in the wrong partition.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-11-02 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #8 from Martin Liška  ---
(In reply to Markus Trippelsdorf from comment #7)
> BTW Firefox trunk fails to build for me:
> 
> ld: error: /tmp/ccsbLieS.ltrans29.ltrans.o: requires dynamic R_X86_64_PC32
> reloc against '_ZN2js3jitL2R0E' which may overflow at runtime; recompile
> with -fPIC
> ld: error: read-only segment has dynamic relocations
> /tmp/ccsbLieS.ltrans29.ltrans.o::function
> js::jit::BaselineCompiler::emitCheckThis(js::jit::ValueOperand) [clone
> .constprop.20226]: error: undefined reference to 'js::jit::R0'
> 
> Haven't looked into it yet. Could well be a Firefox bug.

This looks known to me, I used to see this unresolved symbol, but currently
it's gone on x86_64-linux-gnu.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-11-01 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #7 from Markus Trippelsdorf  ---
BTW Firefox trunk fails to build for me:

ld: error: /tmp/ccsbLieS.ltrans29.ltrans.o: requires dynamic R_X86_64_PC32
reloc against '_ZN2js3jitL2R0E' which may overflow at runtime; recompile with
-fPIC
ld: error: read-only segment has dynamic relocations
/tmp/ccsbLieS.ltrans29.ltrans.o::function
js::jit::BaselineCompiler::emitCheckThis(js::jit::ValueOperand) [clone
.constprop.20226]: error: undefined reference to 'js::jit::R0'

Haven't looked into it yet. Could well be a Firefox bug.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-10-31 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Martin Liška  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |marxin at gcc dot gnu.org

--- Comment #6 from Martin Liška  ---
I'll take a look.

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-10-28 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #5 from Markus Trippelsdorf  ---
Similar picture on ppc64le (this uses a much older version of Firefox,
so overall memory usage is lower):

gcc7: Execution times (seconds)
 phase setup             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall    1232 kB ( 0%) ggc
 phase parsing           :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 phase opt and generate  :  37.53 (67%) usr   1.22 (48%) sys  38.75 (65%) wall 1163161 kB (35%) ggc
 phase stream in         :  16.18 (29%) usr   0.45 (18%) sys  16.66 (28%) wall 2173819 kB (65%) ggc
 phase stream out        :   2.42 ( 4%) usr   0.85 (34%) sys   4.10 ( 7%) wall       0 kB ( 0%) ggc
 garbage collection      :   1.14 ( 2%) usr   0.01 ( 0%) sys   1.17 ( 2%) wall       0 kB ( 0%) ggc
 callgraph optimization  :   0.60 ( 1%) usr   0.02 ( 1%) sys   0.62 ( 1%) wall       6 kB ( 0%) ggc
 ipa dead code removal   :   2.93 ( 5%) usr   0.04 ( 2%) sys   2.95 ( 5%) wall       1 kB ( 0%) ggc
 ipa virtual call target :   5.77 (10%) usr   0.13 ( 5%) sys   5.88 (10%) wall       0 kB ( 0%) ggc
 ipa devirtualization    :   0.29 ( 1%) usr   0.00 ( 0%) sys   0.27 ( 0%) wall   43938 kB ( 1%) ggc
 ipa cp                  :   2.13 ( 4%) usr   0.06 ( 2%) sys   2.20 ( 4%) wall  627117 kB (19%) ggc
 ipa inlining heuristics :  15.00 (27%) usr   0.39 (15%) sys  15.40 (26%) wall  752347 kB (23%) ggc
 ipa comdats             :   0.23 ( 0%) usr   0.01 ( 0%) sys   0.24 ( 0%) wall       0 kB ( 0%) ggc
 lto stream inflate      :   3.45 ( 6%) usr   0.10 ( 4%) sys   3.65 ( 6%) wall       0 kB ( 0%) ggc
 ipa lto gimple in       :   1.47 ( 3%) usr   0.27 (11%) sys   1.64 ( 3%) wall  259169 kB ( 8%) ggc
 ipa lto gimple out      :   0.25 ( 0%) usr   0.07 ( 3%) sys   0.33 ( 1%) wall       0 kB ( 0%) ggc
 ipa lto decl in         :   7.98 (14%) usr   0.14 ( 6%) sys   8.12 (14%) wall 1186633 kB (36%) ggc
 ipa lto decl out        :   1.82 ( 3%) usr   0.09 ( 4%) sys   1.92 ( 3%) wall       0 kB ( 0%) ggc
 ipa lto constructors in :   0.21 ( 0%) usr   0.05 ( 2%) sys   0.26 ( 0%) wall   13649 kB ( 0%) ggc
 ipa lto constructors out:   0.18 ( 0%) usr   0.05 ( 2%) sys   0.23 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto cgraph I/O      :   0.43 ( 1%) usr   0.04 ( 2%) sys   0.46 ( 1%) wall  312435 kB ( 9%) ggc
 ipa lto decl merge      :   1.13 ( 2%) usr   0.01 ( 0%) sys   1.15 ( 2%) wall   12473 kB ( 0%) ggc
 ipa lto cgraph merge    :   0.32 ( 1%) usr   0.00 ( 0%) sys   0.33 ( 1%) wall   10096 kB ( 0%) ggc
 whopr wpa               :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.21 ( 0%) wall       1 kB ( 0%) ggc
 whopr wpa I/O           :   0.11 ( 0%) usr   0.62 (25%) sys   1.54 ( 3%) wall       0 kB ( 0%) ggc
 whopr partitioning      :   2.18 ( 4%) usr   0.05 ( 2%) sys   2.22 ( 4%) wall    3758 kB ( 0%) ggc
 ipa reference           :   1.54 ( 3%) usr   0.03 ( 1%) sys   1.57 ( 3%) wall       0 kB ( 0%) ggc
 ipa profile             :   0.27 ( 0%) usr   0.00 ( 0%) sys   0.27 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const          :   1.46 ( 3%) usr   0.01 ( 0%) sys   1.47 ( 2%) wall       0 kB ( 0%) ggc
 ipa icf                 :   4.32 ( 8%) usr   0.11 ( 4%) sys   4.46 ( 7%) wall   17472 kB ( 1%) ggc
 parser (global)         :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA rewrite        :   0.08 ( 0%) usr   0.03 ( 1%) sys   0.11 ( 0%) wall   18785 kB ( 1%) ggc
 tree SSA incremental    :   0.22 ( 0%) usr   0.04 ( 2%) sys   0.26 ( 0%) wall    4857 kB ( 0%) ggc
 tree operand scan       :   0.12 ( 0%) usr   0.02 ( 1%) sys   0.19 ( 0%) wall   73942 kB ( 2%) ggc
 dominance frontiers     :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation   :   0.14 ( 0%) usr   0.03 ( 1%) sys   0.16 ( 0%) wall       0 kB ( 0%) ggc
 varconst                :   0.06 ( 0%) usr   0.05 ( 2%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 loop init               :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     282 kB ( 0%) ggc
 loop fini               :   0.02 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                   :  56.13             2.52            59.52            3338215 kB

gcc6: Execution times (seconds)
 phase setup             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall    1085 kB ( 0%) ggc
 phase opt and generate  :  37.56 (68%) usr   1.05 (50%) sys  38.64 (66%) wall  666760 kB (27%) ggc
 phase stream in         :  15.03 (27%) usr   0.37 (18%) sys  15.41 (26%) wall 1840687 kB (73%) ggc
 phase stream out        :   2.94 ( 5%) usr   0.67 (32%) sys   4.16 ( 7%) wall       0 kB ( 0%) ggc
 garbage collection      :   1.18 ( 2%) usr   0.01 ( 0%) sys   1.21 ( 2%) wall       0 kB ( 0%) ggc
 callgraph optimization  :   0.36 ( 1%) usr   0.00 ( 0%) sys   0.37 ( 1%) wall       6 kB ( 0%) ggc
 ipa dead code removal   :   3.02 ( 5%) usr   0.04 ( 2%) sys   3.09 ( 5%) wall       1 kB ( 0%) ggc
 ipa virtual call target :   6.41 (12%) 
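
Although the gcc6 report is cut off here, the phase rows that survive are enough to compare GGC allocation between the two compilers. A short sketch using only the kB figures from the "phase" rows of both reports; the variable names are invented for the example.

```python
# GGC allocation in kB from the "phase" rows of the two reports above.
gcc7_ggc_kb = {"phase opt and generate": 1163161, "phase stream in": 2173819}
gcc6_ggc_kb = {"phase opt and generate": 666760, "phase stream in": 1840687}

# Per-phase growth from gcc6 to gcc7.
delta_kb = {phase: gcc7_ggc_kb[phase] - gcc6_ggc_kb[phase]
            for phase in gcc7_ggc_kb}
total_delta_kb = sum(delta_kb.values())

for phase, kb in delta_kb.items():
    print(f"{phase:24s} +{kb} kB")
print(f"combined increase: {total_delta_kb / 1024:.0f} MB")
```

On ppc64le the two phases together allocate about 810 MB more GGC memory with gcc7, with the larger share of the growth in "phase opt and generate".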

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-10-28 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #4 from Markus Trippelsdorf  ---
Basically just "-O3 -flto".

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-10-28 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |lto
   Target Milestone|---                         |7.0

--- Comment #3 from Richard Biener  ---
Hmm, that's not much information ... (or "testcase").  WPA now has extra info
(IPA VRP stuff).  But else?  I suppose this is at -O2 -flto?  Or with FDO?
Or ...?

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-10-28 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #1 from Markus Trippelsdorf  ---
Created attachment 39915
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39915&action=edit
gcc-6 memory graph

[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6

2016-10-28 Thread trippels at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #2 from Markus Trippelsdorf  ---
Created attachment 39916
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39916&action=edit
gcc-7 memory graph