[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #32 from Martin Jambor ---
(In reply to Martin Jambor from comment #30)
> I think that using the same approach to cache ipa_vr
> structures (used to store results of IPA-VR) could bring further
> savings

They were not really significant, so let's leave them as they are now.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Markus Trippelsdorf changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #31 from Markus Trippelsdorf ---
Fixed, thanks.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #30 from Martin Jambor ---
With the above commit, we have avoided the vast majority of the memory use
increase. I think that using the same approach to cache ipa_vr structures
(used to store results of IPA-VR) could bring further savings (possibly a
hundred megabytes?), so I will try that. In any event, this may no longer
qualify as P1.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #29 from Martin Jambor ---
Author: jamborm
Date: Wed Mar  1 09:37:27 2017
New Revision: 245805

URL: https://gcc.gnu.org/viewcvs?rev=245805&root=gcc&view=rev
Log:
[PR 78140] Reuse same IPA bits and VR info

2017-03-01  Martin Jambor

        PR lto/78140
        * ipa-prop.h (ipa_bits): Removed field known.
        (ipa_jump_func): Removed field vr_known.  Changed fields bits and m_vr
        to pointers.  Adjusted their comments to warn about their sharing.
        (ipcp_transformation_summary): Change bits to a vector of pointers.
        (ipa_check_create_edge_args): Moved to ipa-prop.c, declare.
        (ipa_get_ipa_bits_for_value): Declare.
        * tree-vrp.h (value_range): Mark as GTY((for_user)).
        * ipa-prop.c (ipa_bit_ggc_hash_traits): New.
        (ipa_bits_hash_table): Likewise.
        (ipa_vr_ggc_hash_traits): Likewise.
        (ipa_vr_hash_table): Likewise.
        (ipa_print_node_jump_functions_for_edge): Adjust for bits and m_vr
        being pointers and vr_known being removed.
        (ipa_set_jf_unknown): Likewise.
        (ipa_get_ipa_bits_for_value): New function.
        (ipa_set_jfunc_bits): Likewise.
        (ipa_get_value_range): New overloaded functions.
        (ipa_set_jfunc_vr): Likewise.
        (ipa_compute_jump_functions_for_edge): Use the above functions to
        construct bits and vr parts of jump functions.
        (ipa_check_create_edge_args): Move here from ipa-prop.h, also allocate
        ipa_bits_hash_table and ipa_vr_hash_table if they do not already
        exist.
        (ipcp_grow_transformations_if_necessary): Also allocate
        ipa_bits_hash_table and ipa_vr_hash_table if they do not already
        exist.
        (ipa_node_params_t::duplicate): Do not copy bits, just pointers to
        them.  Fix too long lines.
        (ipa_write_jump_function): Adjust for bits and m_vr being pointers and
        vr_known being removed.
        (ipa_read_jump_function): Use new setter functions to construct bits
        and vr parts of jump functions or set them to NULL.
        (write_ipcp_transformation_info): Adjust for bits being pointers.
        (read_ipcp_transformation_info): Likewise.
        (ipcp_update_bits): Likewise.  Fix excessively long lines and a
        trailing space.
        Include gt-ipa-prop.h.
        * ipa-cp.c (propagate_bits_across_jump_function): Adjust for bits
        being pointers.
        (ipcp_store_bits_results): Likewise.
        (propagate_vr_across_jump_function): Adjust for m_vr being a pointer.
        Do not write to existing jump functions but use a temporary instead.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ipa-cp.c
    trunk/gcc/ipa-prop.c
    trunk/gcc/ipa-prop.h
    trunk/gcc/tree-vrp.h
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #28 from Martin Jambor ---
(In reply to Martin Jambor from comment #27)
> Unfortunately, something else has added a further gigabyte to WPA of
> FF in the last week:

Fortunately, this turned out to be a mistake in measurement: I was comparing
a --enable-gather-detailed-mem-stats build with a normal one. The correct
values are:

| compiler            | wpa mem (KB) | wpa mem (GB) |
|---------------------+--------------+--------------|
| gcc 6 branch        |      4046451 |         3.86 |
| trunk rev. 245382   |      5468227 |         5.21 |
| patched rev. 245382 |      4255799 |         4.06 |
| trunk rev. 245595   |      5452515 |         5.20 |
| patched rev. 245595 |      4240379 |         4.04 |

Thus, the patch avoids most of the reported increase in memory use.
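To put a number on "most of the reported increase", the figures in the table above can be fed through a small calculation. This is only a sketch: the helper names recovered_percent and kb_to_centi_gb are made up for illustration, and the "GB" column is assumed to be KB divided by 1024*1024, which matches the values quoted.

```cpp
#include <cassert>
#include <cstdint>

// Share (in whole percent, truncated) of a memory regression that a patch
// wins back. All arguments are peak WPA memory in KB.
// Hypothetical helper for illustration, not part of GCC.
inline std::uint64_t recovered_percent(std::uint64_t baseline_kb,
                                       std::uint64_t regressed_kb,
                                       std::uint64_t patched_kb) {
  assert(regressed_kb > baseline_kb);   // there must be a regression
  assert(regressed_kb >= patched_kb);   // the patch must not make it worse
  return (regressed_kb - patched_kb) * 100 / (regressed_kb - baseline_kb);
}

// KB converted to hundredths of a GB, rounded to nearest
// (386 stands for 3.86 GB). Assumes 1 GB = 1024 * 1024 KB.
inline std::uint64_t kb_to_centi_gb(std::uint64_t kb) {
  const std::uint64_t kb_per_gb = 1024 * 1024;
  return (kb * 100 + kb_per_gb / 2) / kb_per_gb;
}
```

With the gcc 6 branch as the baseline, recovered_percent(4046451, 5468227, 4255799) gives 85 for rev. 245382 and recovered_percent(4046451, 5452515, 4240379) gives 86 for rev. 245595.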
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #27 from Martin Jambor ---
I have submitted a patch to the mailing list, which re-uses value_ranges and
ipa_bits in jump functions and manages to save more than one gigabyte of
memory:

https://gcc.gnu.org/ml/gcc-patches/2017-02/msg01369.html

Unfortunately, something else has added a further gigabyte to WPA of FF in
the last week:

| compiler            | wpa mem (GB) |
|---------------------+--------------|
| gcc 6 branch        |         3.86 |
| trunk rev. 245382   |         5.21 |
| patched rev. 245382 |         4.06 |
| trunk rev. 245595   |         6.59 |
| patched rev. 245595 |         5.25 |

I will try bisecting to find if there is one single change responsible for
this.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

kugan at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kugan at gcc dot gnu.org

--- Comment #26 from kugan at gcc dot gnu.org ---
(In reply to Richard Biener from comment #20)
> Look at tree-ssanames.c:range_info_def for "tricks" (make them variable
> size):
>
> /* Value range information for SSA_NAMEs representing non-pointer variables.
> */
>
> struct GTY ((variable_size)) range_info_def {
>   /* Minimum, maximum and nonzero bits.  */
>   TRAILING_WIDE_INT_ACCESSOR (min, ints, 0)
>   TRAILING_WIDE_INT_ACCESSOR (max, ints, 1)
>   TRAILING_WIDE_INT_ACCESSOR (nonzero_bits, ints, 2)
>   trailing_wide_ints <3> ints;
> };

I am working on a patch to change ipa vrp based on the above.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #25 from Martin Liška ---
Created attachment 40549
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40549&action=edit
GCC 7 -fmem-report
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #24 from Martin Liška ---
Created attachment 40548
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40548&action=edit
GCC 6 -fmem-report
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #23 from Martin Liška ---
Depending on the memory layout of the structure, these two structures
increase memory use by about ((32+88)*3258685)/(1024**2) ~ 372 MB.
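The estimate above is easy to re-check, and the same arithmetic gives a rough idea of what sharing could buy. The sketch below uses only figures quoted in this thread (the sizes from comment #22 and the counts from comment #21). The second function rests on assumptions that are mine, not measured: 8-byte pointers, one pointer per structure per jump function, and no hash table overhead. All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Figures quoted in this thread for x86_64.
const std::uint64_t kSizeofValueRange = 32;       // sizeof(value_range)
const std::uint64_t kSizeofIpaBits = 88;          // sizeof(ipa_bits)
const std::uint64_t kJumpFunctions = 3258685;     // at the end of WPA
const std::uint64_t kDistinctRanges = 65224;      // distinct hash_vr lines
const std::uint64_t kDistinctBits = 13421;        // distinct hash_bits lines

// Bytes used when every jump function embeds both structures.
inline std::uint64_t embedded_bytes() {
  return (kSizeofValueRange + kSizeofIpaBits) * kJumpFunctions;
}

// Bytes used if each distinct record is stored once and every jump function
// holds two pointers instead. ASSUMPTION: 8-byte pointers, no table overhead.
inline std::uint64_t shared_bytes() {
  const std::uint64_t pointer_size = 8;
  return kDistinctRanges * kSizeofValueRange + kDistinctBits * kSizeofIpaBits +
         2 * pointer_size * kJumpFunctions;
}

// Bytes to whole MB (1024 * 1024 bytes), truncated.
inline std::uint64_t to_mb(std::uint64_t bytes) {
  assert(bytes > 0);
  return bytes / (1024 * 1024);
}
```

Under these assumptions to_mb(embedded_bytes()) is 372 and to_mb(shared_bytes()) is 52, so the pointers themselves would dominate the shared layout.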
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|marxin at gcc dot gnu.org   |jamborm at gcc dot gnu.org

--- Comment #22 from Martin Liška ---
Btw. sizeof(value_range) == 32 and sizeof(ipa_bits) == 88 on an x86_64
machine.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #21 from Martin Liška ---
Looking at the number of distinct value ranges and bits, we get:

grep hash_vr /tmp/7.dump.ipa | sort | uniq -c | wc -l
65224

grep hash_bits /tmp/7.dump.ipa | sort | uniq -c | wc -l
13421

The total number of jump functions at the end of WPA is 3258685.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #20 from Richard Biener ---
Look at tree-ssanames.c:range_info_def for "tricks" (make them variable
size):

/* Value range information for SSA_NAMEs representing non-pointer variables. */

struct GTY ((variable_size)) range_info_def {
  /* Minimum, maximum and nonzero bits.  */
  TRAILING_WIDE_INT_ACCESSOR (min, ints, 0)
  TRAILING_WIDE_INT_ACCESSOR (max, ints, 1)
  TRAILING_WIDE_INT_ACCESSOR (nonzero_bits, ints, 2)
  trailing_wide_ints <3> ints;
};
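The appeal of a variable-size record is that it only pays for the words its precision actually needs. The following is a sizing sketch only, not the trailing_wide_ints implementation: it assumes 64-bit storage words and an 8-byte header, and both function names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of 64-bit words needed to hold one integer of the given precision.
// ASSUMPTION: storage is in 64-bit words.
inline std::size_t words_for_precision(unsigned precision) {
  assert(precision > 0);
  return (precision + 63) / 64;
}

// Bytes for a record made of a small header followed by `count` integers of
// the same precision stored back to back.
// ASSUMPTION: the header (precision, length, ...) fits in 8 bytes.
inline std::size_t trailing_record_bytes(unsigned precision,
                                         std::size_t count) {
  const std::size_t header_bytes = sizeof(std::uint64_t);
  return header_bytes +
         count * words_for_precision(precision) * sizeof(std::uint64_t);
}
```

For three integers (min, max, nonzero bits) this model gives trailing_record_bytes(32, 3) == 32 and trailing_record_bytes(128, 3) == 56, so the size follows the precision in use rather than the widest type the compiler supports.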
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Jan Hubicka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kuganv at linaro dot org

--- Comment #19 from Jan Hubicka ---
Looking into the detailed mem reports, we have an increase in jump function
size, from

ipa-prop.c:4701 (ipa_read_node_info)  0: 0.0%  47378288: 2.4%  161144168: 6.7%  20010080: 11.8%  1238962

to

ipa-prop.c:5047 (ipa_read_node_info)  0: 0.0%  74541136: 3.1%  567308480: 17.9%  13645376: 7.0%  1238212

So while we read about the same number of jump functions, the memory usage
almost triples. The reason is that jump functions got a lot bigger now:

  /* Information about zero/non-zero bits.  */
  struct ipa_bits bits;

  /* Information about value range, containing valid data only when vr_known
     is true.  */
  value_range m_vr;
  bool vr_known;

where

/* Information about zero/non-zero bits.  */
struct GTY(()) ipa_bits
{
  /* The propagated value.  */
  widest_int value;
  /* Mask corresponding to the value.
     Similar to ccp_lattice_t, if xth bit of mask is 0,
     implies xth bit of value is constant.  */
  widest_int mask;
  /* True if jump function is known.  */
  bool known;
};

/* Info about value ranges.  */
struct GTY(()) ipa_vr
{
  /* The data fields below are valid only if known is true.  */
  bool known;
  enum value_range_type type;
  wide_int min;
  wide_int max;
};

I think the two wide_ints and two widest_ints are the major offenders. We
need to find a way to avoid allocating them for all nodes. Perhaps implement
sharing of equal ipa_bits and ipa_vr records?
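The sharing idea suggested above is commonly implemented as interning (hash-consing): equal records are stored once in a hash table and users keep a pointer to the shared copy. The sketch below shows the general technique with the standard library only. It is not GCC's code: Bits, BitsPool and JumpFunc are hypothetical stand-ins, plain 64-bit integers replace widest_int, and GCC's GGC-aware hash tables are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>

// Hypothetical stand-in for a value/mask record.
struct Bits {
  std::uint64_t value;
  std::uint64_t mask;
  bool operator==(const Bits &other) const {
    return value == other.value && mask == other.mask;
  }
};

// Hash over both fields so that equal records land in the same bucket.
struct BitsHash {
  std::size_t operator()(const Bits &b) const {
    const std::size_t h = std::hash<std::uint64_t>{}(b.value);
    return h * 31 + std::hash<std::uint64_t>{}(b.mask);
  }
};

// Pool that hands out one shared, immutable record per distinct value.
class BitsPool {
 public:
  // Returns the pooled record equal to (value, mask), adding it if needed.
  // std::unordered_set is node-based, so the pointer stays valid on rehash.
  const Bits *intern(std::uint64_t value, std::uint64_t mask) {
    return &*pool_.insert(Bits{value, mask}).first;
  }
  std::size_t size() const { return pool_.size(); }

 private:
  std::unordered_set<Bits, BitsHash> pool_;
};

// A user of the pool stores a pointer instead of an embedded record;
// nullptr takes over the role of a separate "known" flag.
struct JumpFunc {
  const Bits *bits = nullptr;
};
```

Because records are shared, they must never be modified through one user. An update has to intern a new record and repoint, which is why the pool returns pointers to const.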
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #18 from Martin Liška ---
Created attachment 40545
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40545&action=edit
GCC 7 graph
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #17 from Martin Liška ---
Created attachment 40544
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40544&action=edit
GCC 6 graph
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #16 from Martin Liška ---
It's still reproducible with current trunk; the difference is over 1GB on my
development machine.

I wrote a simple script that dumps the sizes of all LTO objects loaded to
WPA:

GCC 7:
asm         : 19.67 KB
profile     : 31.76 KB
pureconst   : 1.08 MB
refs        : 1.66 MB
icf         : 2.44 MB
inline      : 7.05 MB
symbol_nodes: 13.63 MB
jmpfuncs    : 14.98 MB
symtab      : 59.27 MB
decls       : 287.71 MB
symbols     : 564.43 MB
total       : 952.31 MB
Total symbols: 505244

GCC 6:
./parse-lto.py /tmp/6.txt
asm         : 19.66 KB
profile     : 34.56 KB
pureconst   : 1.09 MB
refs        : 1.67 MB
icf         : 2.43 MB
inline      : 7.03 MB
jmpfuncs    : 10.30 MB
symbol_nodes: 13.66 MB
symtab      : 59.89 MB
decls       : 284.59 MB
symbols     : 559.50 MB
total       : 940.21 MB
Total symbols: 503275

Thus I guess there's no significant difference in the amount of streamed
data.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140 --- Comment #15 from Jan Hubicka --- How does the memory use look with current tree?
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Jan Hubicka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2016-12-01
     Ever confirmed|0                           |1

--- Comment #14 from Jan Hubicka ---
From a quick glance it seems to be mostly GGC memory related to
ipa-cp/ipa-inline and the global stream. Perhaps we just make many more
cloning/inlining decisions than before? How do the code size and inline dumps
compare?

I will try to reproduce this. We need a detailed mem report, and we need to
check whether the optimization decisions diverge or whether it is just extra
stuff brought in by the extended jump functions or extra data we stream.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #13 from Markus Trippelsdorf ---
(In reply to Martin Liška from comment #12)
> (In reply to Markus Trippelsdorf from comment #11)
> > js/src/jit/BaselineCompiler.cpp
>
> Hm, I see the R0 defined as:
>
> # 1
> "/home/marxin/BIG/buildbot/slave/source/firefox/js/src/jit/x64/
> SharedICRegisters-x64.h" 1
> # 12
> "/home/marxin/BIG/buildbot/slave/source/firefox/js/src/jit/x64/
> SharedICRegisters-x64.h"
> namespace js {
> namespace jit {
>
> static constexpr Register BaselineFrameReg = rbp;
> static constexpr Register BaselineStackReg = rsp;
>
> static constexpr ValueOperand R0(rcx);
>
> not as an ASM statement.

Yes, you're right. I only took a cursory look and got confused by all these
masm. statements.
Not sure how to debug this further.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #12 from Martin Liška ---
(In reply to Markus Trippelsdorf from comment #11)
> js/src/jit/BaselineCompiler.cpp

Hm, I see the R0 defined as:

# 1 "/home/marxin/BIG/buildbot/slave/source/firefox/js/src/jit/x64/SharedICRegisters-x64.h" 1
# 12 "/home/marxin/BIG/buildbot/slave/source/firefox/js/src/jit/x64/SharedICRegisters-x64.h"
namespace js {
namespace jit {

static constexpr Register BaselineFrameReg = rbp;
static constexpr Register BaselineStackReg = rsp;

static constexpr ValueOperand R0(rcx);

not as an ASM statement.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140 --- Comment #11 from Markus Trippelsdorf --- js/src/jit/BaselineCompiler.cpp
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #10 from Martin Liška ---
(In reply to Markus Trippelsdorf from comment #9)
> (In reply to Martin Liška from comment #8)
> > (In reply to Markus Trippelsdorf from comment #7)
> > > BTW Firefox trunk fails to build for me:
> > >
> > > ld: error: /tmp/ccsbLieS.ltrans29.ltrans.o: requires dynamic R_X86_64_PC32
> > > reloc against '_ZN2js3jitL2R0E' which may overflow at runtime; recompile
> > > with -fPIC
> > > ld: error: read-only segment has dynamic relocations
> > > /tmp/ccsbLieS.ltrans29.ltrans.o::function
> > > js::jit::BaselineCompiler::emitCheckThis(js::jit::ValueOperand) [clone
> > > .constprop.20226]: error: undefined reference to 'js::jit::R0'
> > >
> > > Haven't looked into it yet. Could well be a Firefox bug.
> >
> > This looks known to me, I used to see this unresolved symbol, but currently
> > it's gone on x86_64-linux-gnu.
>
> Not for me. I hit the issue yesterday with gcc trunk and mozilla trunk.
> js::jit::R0 is an asm statement, that could end up in the wrong partition.

Ah, I see. Can you please name the source file where it's defined? I can't
grep the symbol.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #9 from Markus Trippelsdorf ---
(In reply to Martin Liška from comment #8)
> (In reply to Markus Trippelsdorf from comment #7)
> > BTW Firefox trunk fails to build for me:
> >
> > ld: error: /tmp/ccsbLieS.ltrans29.ltrans.o: requires dynamic R_X86_64_PC32
> > reloc against '_ZN2js3jitL2R0E' which may overflow at runtime; recompile
> > with -fPIC
> > ld: error: read-only segment has dynamic relocations
> > /tmp/ccsbLieS.ltrans29.ltrans.o::function
> > js::jit::BaselineCompiler::emitCheckThis(js::jit::ValueOperand) [clone
> > .constprop.20226]: error: undefined reference to 'js::jit::R0'
> >
> > Haven't looked into it yet. Could well be a Firefox bug.
>
> This looks known to me, I used to see this unresolved symbol, but currently
> it's gone on x86_64-linux-gnu.

Not for me. I hit the issue yesterday with gcc trunk and mozilla trunk.
js::jit::R0 is an asm statement, that could end up in the wrong partition.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #8 from Martin Liška ---
(In reply to Markus Trippelsdorf from comment #7)
> BTW Firefox trunk fails to build for me:
>
> ld: error: /tmp/ccsbLieS.ltrans29.ltrans.o: requires dynamic R_X86_64_PC32
> reloc against '_ZN2js3jitL2R0E' which may overflow at runtime; recompile
> with -fPIC
> ld: error: read-only segment has dynamic relocations
> /tmp/ccsbLieS.ltrans29.ltrans.o::function
> js::jit::BaselineCompiler::emitCheckThis(js::jit::ValueOperand) [clone
> .constprop.20226]: error: undefined reference to 'js::jit::R0'
>
> Haven't looked into it yet. Could well be a Firefox bug.

This looks known to me, I used to see this unresolved symbol, but currently
it's gone on x86_64-linux-gnu.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #7 from Markus Trippelsdorf ---
BTW Firefox trunk fails to build for me:

ld: error: /tmp/ccsbLieS.ltrans29.ltrans.o: requires dynamic R_X86_64_PC32 reloc against '_ZN2js3jitL2R0E' which may overflow at runtime; recompile with -fPIC
ld: error: read-only segment has dynamic relocations
/tmp/ccsbLieS.ltrans29.ltrans.o::function js::jit::BaselineCompiler::emitCheckThis(js::jit::ValueOperand) [clone .constprop.20226]: error: undefined reference to 'js::jit::R0'

Haven't looked into it yet. Could well be a Firefox bug.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org|marxin at gcc dot gnu.org

--- Comment #6 from Martin Liška ---
I'll take a look.
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #5 from Markus Trippelsdorf ---
Similar picture on ppc64le (this uses a much older version of Firefox, so
overall memory usage is lower):

gcc7:
Execution times (seconds)
 phase setup             :  0.00 ( 0%) usr  0.00 ( 0%) sys  0.00 ( 0%) wall    1232 kB ( 0%) ggc
 phase parsing           :  0.00 ( 0%) usr  0.00 ( 0%) sys  0.01 ( 0%) wall       0 kB ( 0%) ggc
 phase opt and generate  : 37.53 (67%) usr  1.22 (48%) sys 38.75 (65%) wall 1163161 kB (35%) ggc
 phase stream in         : 16.18 (29%) usr  0.45 (18%) sys 16.66 (28%) wall 2173819 kB (65%) ggc
 phase stream out        :  2.42 ( 4%) usr  0.85 (34%) sys  4.10 ( 7%) wall       0 kB ( 0%) ggc
 garbage collection      :  1.14 ( 2%) usr  0.01 ( 0%) sys  1.17 ( 2%) wall       0 kB ( 0%) ggc
 callgraph optimization  :  0.60 ( 1%) usr  0.02 ( 1%) sys  0.62 ( 1%) wall       6 kB ( 0%) ggc
 ipa dead code removal   :  2.93 ( 5%) usr  0.04 ( 2%) sys  2.95 ( 5%) wall       1 kB ( 0%) ggc
 ipa virtual call target :  5.77 (10%) usr  0.13 ( 5%) sys  5.88 (10%) wall       0 kB ( 0%) ggc
 ipa devirtualization    :  0.29 ( 1%) usr  0.00 ( 0%) sys  0.27 ( 0%) wall   43938 kB ( 1%) ggc
 ipa cp                  :  2.13 ( 4%) usr  0.06 ( 2%) sys  2.20 ( 4%) wall  627117 kB (19%) ggc
 ipa inlining heuristics : 15.00 (27%) usr  0.39 (15%) sys 15.40 (26%) wall  752347 kB (23%) ggc
 ipa comdats             :  0.23 ( 0%) usr  0.01 ( 0%) sys  0.24 ( 0%) wall       0 kB ( 0%) ggc
 lto stream inflate      :  3.45 ( 6%) usr  0.10 ( 4%) sys  3.65 ( 6%) wall       0 kB ( 0%) ggc
 ipa lto gimple in       :  1.47 ( 3%) usr  0.27 (11%) sys  1.64 ( 3%) wall  259169 kB ( 8%) ggc
 ipa lto gimple out      :  0.25 ( 0%) usr  0.07 ( 3%) sys  0.33 ( 1%) wall       0 kB ( 0%) ggc
 ipa lto decl in         :  7.98 (14%) usr  0.14 ( 6%) sys  8.12 (14%) wall 1186633 kB (36%) ggc
 ipa lto decl out        :  1.82 ( 3%) usr  0.09 ( 4%) sys  1.92 ( 3%) wall       0 kB ( 0%) ggc
 ipa lto constructors in :  0.21 ( 0%) usr  0.05 ( 2%) sys  0.26 ( 0%) wall   13649 kB ( 0%) ggc
 ipa lto constructors out:  0.18 ( 0%) usr  0.05 ( 2%) sys  0.23 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto cgraph I/O      :  0.43 ( 1%) usr  0.04 ( 2%) sys  0.46 ( 1%) wall  312435 kB ( 9%) ggc
 ipa lto decl merge      :  1.13 ( 2%) usr  0.01 ( 0%) sys  1.15 ( 2%) wall   12473 kB ( 0%) ggc
 ipa lto cgraph merge    :  0.32 ( 1%) usr  0.00 ( 0%) sys  0.33 ( 1%) wall   10096 kB ( 0%) ggc
 whopr wpa               :  0.21 ( 0%) usr  0.00 ( 0%) sys  0.21 ( 0%) wall       1 kB ( 0%) ggc
 whopr wpa I/O           :  0.11 ( 0%) usr  0.62 (25%) sys  1.54 ( 3%) wall       0 kB ( 0%) ggc
 whopr partitioning      :  2.18 ( 4%) usr  0.05 ( 2%) sys  2.22 ( 4%) wall    3758 kB ( 0%) ggc
 ipa reference           :  1.54 ( 3%) usr  0.03 ( 1%) sys  1.57 ( 3%) wall       0 kB ( 0%) ggc
 ipa profile             :  0.27 ( 0%) usr  0.00 ( 0%) sys  0.27 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const          :  1.46 ( 3%) usr  0.01 ( 0%) sys  1.47 ( 2%) wall       0 kB ( 0%) ggc
 ipa icf                 :  4.32 ( 8%) usr  0.11 ( 4%) sys  4.46 ( 7%) wall   17472 kB ( 1%) ggc
 parser (global)         :  0.00 ( 0%) usr  0.00 ( 0%) sys  0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA rewrite        :  0.08 ( 0%) usr  0.03 ( 1%) sys  0.11 ( 0%) wall   18785 kB ( 1%) ggc
 tree SSA incremental    :  0.22 ( 0%) usr  0.04 ( 2%) sys  0.26 ( 0%) wall    4857 kB ( 0%) ggc
 tree operand scan       :  0.12 ( 0%) usr  0.02 ( 1%) sys  0.19 ( 0%) wall   73942 kB ( 2%) ggc
 dominance frontiers     :  0.03 ( 0%) usr  0.00 ( 0%) sys  0.01 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation   :  0.14 ( 0%) usr  0.03 ( 1%) sys  0.16 ( 0%) wall       0 kB ( 0%) ggc
 varconst                :  0.06 ( 0%) usr  0.05 ( 2%) sys  0.08 ( 0%) wall       0 kB ( 0%) ggc
 loop init               :  0.00 ( 0%) usr  0.00 ( 0%) sys  0.01 ( 0%) wall     282 kB ( 0%) ggc
 loop fini               :  0.02 ( 0%) usr  0.01 ( 0%) sys  0.01 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                   : 56.13            2.52           59.52            3338215 kB

gcc6:
Execution times (seconds)
 phase setup             :  0.00 ( 0%) usr  0.00 ( 0%) sys  0.00 ( 0%) wall    1085 kB ( 0%) ggc
 phase opt and generate  : 37.56 (68%) usr  1.05 (50%) sys 38.64 (66%) wall  666760 kB (27%) ggc
 phase stream in         : 15.03 (27%) usr  0.37 (18%) sys 15.41 (26%) wall 1840687 kB (73%) ggc
 phase stream out        :  2.94 ( 5%) usr  0.67 (32%) sys  4.16 ( 7%) wall       0 kB ( 0%) ggc
 garbage collection      :  1.18 ( 2%) usr  0.01 ( 0%) sys  1.21 ( 2%) wall       0 kB ( 0%) ggc
 callgraph optimization  :  0.36 ( 1%) usr  0.00 ( 0%) sys  0.37 ( 1%) wall       6 kB ( 0%) ggc
 ipa dead code removal   :  3.02 ( 5%) usr  0.04 ( 2%) sys  3.09 ( 5%) wall       1 kB ( 0%) ggc
 ipa virtual call target :  6.41 (12%)
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #4 from Markus Trippelsdorf ---
Basically just "-O3 -flto".
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |lto
   Target Milestone|---                         |7.0

--- Comment #3 from Richard Biener ---
Hmm, that's not much information ... (or "testcase"). WPA now has extra info
(IPA VRP stuff). But else? I suppose this is at -O2 -flto? Or with FDO?
Or ...?
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #1 from Markus Trippelsdorf ---
Created attachment 39915
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39915&action=edit
gcc-6 memory graph
[Bug lto/78140] [7 Regression] libxul -flto uses 1GB more memory than gcc-6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78140

--- Comment #2 from Markus Trippelsdorf ---
Created attachment 39916
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39916&action=edit
gcc-7 memory graph