Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use
> > Updated patch passing LTO bootstrap (one warning fix) and with a
> > memory leak fixed.
>
> Testing with Firefox is impossible at the moment because of PR61885.
> One thing I've noticed (before the ICE) is that virtual memory usage is
> very high:
>
> Address           Kbytes     RSS   Dirty Mode  Mapping
> 00400000           16344    9084       0 r-x-- lto1
> 013f6000              36      36      28 rw--- lto1
> 013ff000            1072     276     276 rw--- [ anon ]
> 034aa000        10154940 1540384 1540384 rw--- [ anon ]
> 2acf04af2000         136     136       0 r-x-- ld-2.19.90.so
> 2acf04b14000          88      88      88 rw--- [ anon ]
> ...
>                 -------- ------- -------
> total kB        12022060 3388396 3377708

You should get into the memory peak caused by streaming before we get to
the ICE. I am about to commit the second half of the ipa-devirt
speculation changes that fixes the ICE.

Honza

> --
> Markus
Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use
On Wed, Jul 30, 2014 at 7:51 AM, Markus Trippelsdorf mar...@trippelsdorf.de wrote:
> Testing with Firefox is impossible at the moment because of PR61885.
> One thing I've noticed (before the ICE) is that virtual memory usage is
> very high:
> [...]
> total kB        12022060 3388396 3377708

Maybe there is still a memleak (just checked that LTOing
int main() {} doesn't leak). Otherwise I forgot to enable
compression at all ;)

Richard.
Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use
On 2014.07.30 at 10:31 +0200, Richard Biener wrote:
> Maybe there is still a memleak (just checked that LTOing
> int main() {} doesn't leak). Otherwise I forgot to enable
> compression at all ;)

He. Also parallel streaming stopped working.

--
Markus
Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use
On Wed, 30 Jul 2014, Markus Trippelsdorf wrote:
> On 2014.07.30 at 10:31 +0200, Richard Biener wrote:
> > Maybe there is still a memleak (just checked that LTOing
> > int main() {} doesn't leak). Otherwise I forgot to enable
> > compression at all ;)
>
> He. Also parallel streaming stopped working.

Interesting.

Meanwhile, after enabling compression, the benefit of the patch is the
reduction in LTRANS file size (stage3 cc1) from 460MB to 178MB because
we now compress here. Object file size of stage3-gcc/*.o increased
from 506MB to 540MB (note that LTO bootstrap uses fat LTO objects, so
the increase in LTO IL size is bigger than it looks), I suppose mainly
due to no longer compressing strings.

I can also observe the apparent memleak - will try to hunt it down.

Richard.
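To put the four sizes quoted above in proportion, here is a small sketch that computes the relative change. The function name percent_change is made up for illustration; the only inputs are the 460MB/178MB and 506MB/540MB figures from the message.

```cpp
#include <cassert>

// Relative size change in percent. A negative result means the output
// shrank, a positive one means it grew.
double percent_change(double before_mb, double after_mb) {
    assert(before_mb > 0.0);
    return (after_mb - before_mb) / before_mb * 100.0;
}
```

Calling percent_change(460, 178) gives roughly a 61% reduction for the LTRANS files, and percent_change(506, 540) gives an increase of a little under 7% for the fat objects. As the message notes, the fat objects also contain regular object code, so the growth of the LTO IL part alone is larger than that second figure suggests.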
Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use
On Wed, 30 Jul 2014, Richard Biener wrote:
> On Wed, Jul 30, 2014 at 7:51 AM, Markus Trippelsdorf mar...@trippelsdorf.de wrote:
> > One thing I've noticed (before the ICE) is that virtual memory usage is
> > very high:
> > [...]
>
> Maybe there is still a memleak (just checked that LTOing
> int main() {} doesn't leak).

Found it:

Index: gcc/lto-section-in.c
===================================================================
--- gcc/lto-section-in.c.orig	2014-07-30 12:40:27.950225826 +0200
+++ gcc/lto-section-in.c	2014-07-30 12:37:44.179237102 +0200
@@ -249,7 +249,7 @@ lto_destroy_simple_input_block (struct l
 				 struct lto_input_block *ib,
 				 const char *data, size_t len)
 {
-  free (ib);
+  delete ib;
   lto_free_section_data (file_data, section_type, NULL, data, len);
 }

Richard.
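Why free versus delete can matter for a leak: the sketch below assumes the input block has become a C++ object that owns further resources and is allocated with new. That is only inferred from the one-line fix; the class definition is not shown in this thread, and input_block_sketch, its buffer member, and the live_buffers counter are all hypothetical.

```cpp
#include <cassert>
#include <cstdlib>

// Counts owned buffers that have not been released yet.
static int live_buffers = 0;

// Hypothetical stand-in for an input block that owns a side buffer.
struct input_block_sketch {
    char *buffer;

    input_block_sketch() : buffer(static_cast<char *>(std::malloc(64))) {
        ++live_buffers;
    }
    ~input_block_sketch() {
        std::free(buffer);   // released only if the destructor actually runs
        --live_buffers;
    }
    input_block_sketch(const input_block_sketch &) = delete;
    input_block_sketch &operator=(const input_block_sketch &) = delete;
};

// Matching teardown for an object created with `new`: `delete` runs the
// destructor and then returns the object's own storage. Passing the same
// pointer to free() would skip the destructor (so `buffer` would leak) and
// is undefined behaviour for memory obtained from `new`.
void destroy_block(input_block_sketch *ib) {
    delete ib;
}
```

After `new input_block_sketch()` the counter is 1, and after destroy_block it is back to 0. With a free()-based teardown the counter would stay at 1 for every block ever read, which is the kind of steady growth a per-section leak would produce.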
Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use
On 07/30/2014 11:41 AM, Richard Biener wrote:
> On Wed, 30 Jul 2014, Richard Biener wrote:
> > Maybe there is still a memleak (just checked that LTOing
> > int main() {} doesn't leak).
>
> Found it:
> [...]
> -  free (ib);
> +  delete ib;
> [...]
>
> Richard.

Hello,
here is the memory/CPU usage for the patch. For both, I used sync and
drop_caches.

URL: https://drive.google.com/file/d/0B0pisUJ80pO1andOX19JMHV3LVE/edit?usp=sharing

Martin
Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use
On Wed, Jul 30, 2014 at 1:14 PM, Martin Liška mli...@suse.cz wrote:
> On 07/30/2014 11:41 AM, Richard Biener wrote:
> > Found it:
> > [...]
>
> Hello,
> there's memory/CPU usage for the patch. for both, I used sync and
> drop_caches.
>
> Url: https://drive.google.com/file/d/0B0pisUJ80pO1andOX19JMHV3LVE/edit?usp=sharing

Ok, it turns out setting -flto-compression-level to 0 doesn't really
short-circuit zlib for sections. So the following does that the
hard but effective way.

Index: gcc/lto-section-out.c
===================================================================
--- gcc/lto-section-out.c.orig	2014-07-30 13:33:06.634008355 +0200
+++ gcc/lto-section-out.c	2014-07-30 13:29:19.468023995 +0200
@@ -80,7 +80,7 @@ lto_begin_section (const char *name, boo
      data is anything other than assembler output.  The effect here is that
      we get compression of IL only in non-ltrans object files.  */
   gcc_assert (compression_stream == NULL);
-  if (compress)
+  if (compress && 0)
     compression_stream = lto_start_compression (lto_append_data, NULL);
 }

Index: gcc/lto-section-in.c
===================================================================
--- gcc/lto-section-in.c.orig	2014-07-30 13:33:06.637008355 +0200
+++ gcc/lto-section-in.c	2014-07-30 13:31:57.329013126 +0200
@@ -153,7 +153,7 @@ lto_get_section_data (struct lto_file_de

   /* FIXME lto: WPA mode does not write compressed sections, so for now
      suppress uncompression if flag_ltrans.  */
-  if (!flag_ltrans)
+  if (!flag_ltrans && 0)
     {
       /* Create a mapping header containing the underlying data and length,
	 and prepend this to the uncompression buffer.  The uncompressed data
@@ -200,7 +200,7 @@ lto_free_section_data (struct lto_file_d

   /* FIXME lto: WPA mode does not write compressed sections, so for now
      suppress uncompression mapping if flag_ltrans.  */
-  if (flag_ltrans)
+  if (flag_ltrans || 1)
     {
       (free_section_f) (file_data, section_type, name, data, len);
       return;

Does that change anything?

Thanks,
Richard.
Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use
On 07/30/2014 12:37 PM, Richard Biener wrote:
> Ok, it turns out setting -flto-compression-level to 0 doesn't really
> short-circuit zlib for sections. So the following does that the
> hard but effective way.
> [...]
> does that change anything?
>
> Thanks,
> Richard.

There are new numbers:
https://drive.google.com/file/d/0B0pisUJ80pO1aG83N2JXLWNVUW8/edit?usp=sharing,
where I reduced the scale to 10GB to better identify any differences.

Martin
[PATCH] LTO streamer reorg - try to reduce WPA memory use
This re-organizes the LTO streamer to do compression transparently
in the data-streamer routines (and disables section compression
by defaulting to -flto-compression-level=0). This avoids
keeping the whole uncompressed sections in memory, only retaining
the compressed ones.

The downside is that we lose compression of at least the string
parts (they are abusing the streaming interface quite awkwardly
and doing random-accesses with offsets into the uncompressed
section). With a little bit of surgery we can get that back I think
(but we'd have to keep the uncompressed piece in memory somewhere
which means losing the memory use advantage).

Very lightly tested so far (running lto.exp). I'll try an LTO
bootstrap now.

I wonder what the change is on WPA memory use for larger
projects and what the effect on object file size is.

Richard.

insert-changelog-here

Index: gcc/data-streamer-out.c
===================================================================
*** gcc/data-streamer-out.c.orig	2014-07-29 13:04:48.255073822 +0200
--- gcc/data-streamer-out.c	2014-07-29 13:09:40.301053715 +0200
*************** along with GCC; see the file COPYING3.
*** 22,27 ****
--- 22,32 ----
  #include "config.h"
  #include "system.h"
+ /* zlib.h includes other system headers.  Those headers may test feature
+    test macros.  config.h may define feature test macros.  For this reason,
+    zlib.h needs to be included after, rather than before, config.h and
+    system.h.  */
+ #include <zlib.h>
  #include "coretypes.h"
  #include "tree.h"
  #include "basic-block.h"
*************** along with GCC; see the file COPYING3.
*** 32,37 ****
--- 37,199 ----
  #include "gimple.h"
  #include "data-streamer.h"
+
+ /* Finishes the last block, eventually compressing it, and returns the
+    total size of the stream.  */
+
+ unsigned int
+ lto_output_stream::finish ()
+ {
+   if (compress
+       && current_pointer)
+     {
+       /* Compress the last (partial) block.  */
+       compress_current_block (true);
+       left_in_block = zlib_stream->avail_out;
+       int status = deflateEnd (zlib_stream);
+       if (status != Z_OK)
+         internal_error ("compressed stream: %s", zError (status));
+       free (zlib_stream);
+     }
+   current_pointer = NULL;
+
+   unsigned int size = 0;
+   for (lto_char_ptr_base *b = first_block; b; b = (lto_char_ptr_base *)b->ptr)
+     size += block_size - sizeof (lto_char_ptr_base);
+   size -= left_in_block;
+   return size;
+ }
+
+ /* Returns a pointer to the first block of the chain of blocks to output.  */
+
+ lto_char_ptr_base *
+ lto_output_stream::get_blocks ()
+ {
+   finish ();
+   return first_block;
+ }
+
+ /* Adds a new block to output stream OBS.  */
+
+ void
+ lto_output_stream::append_block ()
+ {
+   struct lto_char_ptr_base *new_block;
+   bool first_p = false;
+
+   gcc_assert (left_in_block == 0 && block_size > sizeof (lto_char_ptr_base));
+
+   if (first_block == NULL)
+     {
+       /* This is the first time the stream has been written into.  */
+       new_block = (struct lto_char_ptr_base*) xmalloc (block_size);
+       first_block = new_block;
+       first_p = true;
+     }
+   else
+     {
+       if (compress)
+         {
+           /* Compress the current block and link it into the list.  */
+           compress_current_block (false);
+           /* Re-use the uncompressed buffer.  */
+           new_block = current_block;
+         }
+       else
+         {
+           /* Get a new block and link it into the list.  */
+           new_block = (struct lto_char_ptr_base*) xmalloc (block_size);
+           /* The first bytes of the block are reserved as a pointer to
+              the next block.  Set the chain of the full block to the
+              pointer to the new block.  */
+           lto_char_ptr_base *tptr = current_block;
+           tptr->ptr = (char *) new_block;
+         }
+     }
+
+   /* Set the place for the next char at the first position after the
+      chain to the next block.  */
+   current_pointer
+     = ((char *) new_block) + sizeof (struct lto_char_ptr_base);
+   current_block = new_block;
+   /* Null out the newly allocated block's pointer to the next block.  */
+   new_block->ptr = NULL;
+   left_in_block = block_size - sizeof (struct lto_char_ptr_base);
+
+ #if 0
+   if (first_p)
+     streamer_write_hwi_stream (this, compress);
+ #endif
+ }
+
+ /* Return a zlib compression level that zlib will not reject.  Normalizes
+    the compression level from the command line flag, clamping non-default
+    values to the appropriate end of their valid range.  */
+
+ static int
+ lto_normalized_zlib_level (void)
+ {
+   int level = flag_lto_compression_level;
+
+   if (level != Z_DEFAULT_COMPRESSION)
+     {
+       if (level < Z_NO_COMPRESSION)
+         level = Z_NO_COMPRESSION;
+       else if (level > Z_BEST_COMPRESSION)
+         level = Z_BEST_COMPRESSION;
+     }
+
+   return level;
+ }
+
+ void
+ lto_output_stream::compress_current_block (bool last)
+ {
+   int status;
+
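The block-chain bookkeeping in finish () and append_block () can be illustrated with a stand-alone sketch. This is not the GCC code: chained_stream_sketch, block_header and put are hypothetical names, compression is left out entirely, and plain malloc stands in for xmalloc. It only shows how the stream size falls out of the number of chained blocks minus the unused tail of the last one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// The first bytes of every block hold the link to the next block.
struct block_header {
    block_header *next;
};

class chained_stream_sketch {
public:
    explicit chained_stream_sketch(std::size_t block_size)
        : block_size_(block_size) {
        // A block must hold its header plus at least one payload byte.
        assert(block_size_ > sizeof(block_header));
    }
    ~chained_stream_sketch() {
        for (block_header *b = first_; b != nullptr;) {
            block_header *next = b->next;
            std::free(b);
            b = next;
        }
    }
    chained_stream_sketch(const chained_stream_sketch &) = delete;
    chained_stream_sketch &operator=(const chained_stream_sketch &) = delete;

    // Append one byte, starting a new block when the current one is full.
    void put(char c) {
        if (left_in_block_ == 0)
            append_block();
        *current_pointer_++ = c;
        --left_in_block_;
    }

    // Payload capacity of all chained blocks minus the unused tail of the
    // last block, mirroring the loop at the end of finish ().
    std::size_t size() const {
        std::size_t total = 0;
        for (const block_header *b = first_; b != nullptr; b = b->next)
            total += block_size_ - sizeof(block_header);
        return total - left_in_block_;
    }

private:
    void append_block() {
        void *raw = std::malloc(block_size_);
        if (raw == nullptr)
            std::abort();
        block_header *new_block = static_cast<block_header *>(raw);
        new_block->next = nullptr;
        if (first_ == nullptr)
            first_ = new_block;            // first write into the stream
        else
            current_->next = new_block;    // chain the full block to the new one
        current_ = new_block;
        current_pointer_ = reinterpret_cast<char *>(new_block) + sizeof(block_header);
        left_in_block_ = block_size_ - sizeof(block_header);
    }

    std::size_t block_size_;
    block_header *first_ = nullptr;
    block_header *current_ = nullptr;
    char *current_pointer_ = nullptr;
    std::size_t left_in_block_ = 0;
};
```

Writing 50 bytes into a stream with 32-byte blocks chains several blocks, and size() still reports 50. In the patch, the compressing variant differs in that the uncompressed staging block is re-used instead of chained, so only compressed blocks accumulate in memory.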
Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use
On Tue, 29 Jul 2014, Richard Biener wrote:
> This re-organizes the LTO streamer to do compression transparently
> in the data-streamer routines (and disables section compression
> by defaulting to -flto-compression-level=0).
> [...]
> Very lightly tested so far (running lto.exp). I'll try an LTO
> bootstrap now.
>
> I wonder what the change is on WPA memory use for larger
> projects and what the effect on object file size is.

Updated patch passing LTO bootstrap (one warning fix) and with
a memory leak fixed.

I'll probably try to split out cleanups from this patch.

Richard.

Index: gcc/data-streamer-out.c
===================================================================
*** gcc/data-streamer-out.c.orig	2014-07-29 13:04:48.255073822 +0200
--- gcc/data-streamer-out.c	2014-07-29 14:35:22.908699653 +0200
*************** along with GCC; see the file COPYING3.
*** 22,27 ****
--- 22,32 ----
  #include "config.h"
  #include "system.h"
+ /* zlib.h includes other system headers.  Those headers may test feature
+    test macros.  config.h may define feature test macros.  For this reason,
+    zlib.h needs to be included after, rather than before, config.h and
+    system.h.  */
+ #include <zlib.h>
  #include "coretypes.h"
  #include "tree.h"
  #include "basic-block.h"
*************** along with GCC; see the file COPYING3.
*** 32,37 ****
--- 37,194 ----
  #include "gimple.h"
  #include "data-streamer.h"
+
+ /* Finishes the last block, eventually compressing it, and returns the
+    total size of the stream.  */
+
+ unsigned int
+ lto_output_stream::finish ()
+ {
+   if (compress
+       && current_pointer)
+     {
+       /* Compress the last (partial) block.  */
+       compress_current_block (true);
+       left_in_block = zlib_stream->avail_out;
+       free (current_block);
+       current_block = NULL;
+       int status = deflateEnd (zlib_stream);
+       if (status != Z_OK)
+         internal_error ("compressed stream: %s", zError (status));
+       free (zlib_stream);
+     }
+   current_pointer = NULL;
+
+   unsigned int size = 0;
+   for (lto_char_ptr_base *b = first_block; b; b = (lto_char_ptr_base *)b->ptr)
+     size += block_size - sizeof (lto_char_ptr_base);
+   size -= left_in_block;
+   return size;
+ }
+
+ /* Returns a pointer to the first block of the chain of blocks to output.  */
+
+ lto_char_ptr_base *
+ lto_output_stream::get_blocks ()
+ {
+   finish ();
+   return first_block;
+ }
+
+ /* Adds a new block to output stream OBS.  */
+
+ void
+ lto_output_stream::append_block ()
+ {
+   struct lto_char_ptr_base *new_block;
+
+   gcc_assert (left_in_block == 0 && block_size > sizeof (lto_char_ptr_base));
+
+   if (first_block == NULL)
+     {
+       /* This is the first time the stream has been written into.  */
+       new_block = (struct lto_char_ptr_base*) xmalloc (block_size);
+       first_block = new_block;
+     }
+   else
+     {
+       if (compress)
+         {
+           /* Compress the current block and link it into the list.  */
+           compress_current_block (false);
+           /* Re-use the uncompressed buffer.  */
+           new_block = current_block;
+         }
+       else
+         {
+           /* Get a new block and link it into the list.  */
+           new_block = (struct lto_char_ptr_base*) xmalloc (block_size);
+           /* The first bytes of the block are reserved as a pointer to
+              the next block.  Set the chain of the full block to the
+              pointer to the new block.  */
+           lto_char_ptr_base *tptr = current_block;
+           tptr->ptr = (char *) new_block;
+         }
+     }
+
+   /* Set the place for the next char at the first position after the
+      chain to the next block.  */
+   current_pointer
+     = ((char *) new_block) + sizeof (struct lto_char_ptr_base);
+   current_block = new_block;
+   /* Null out the newly allocated block's pointer to the next block.  */
+   new_block->ptr = NULL;
+   left_in_block = block_size - sizeof (struct lto_char_ptr_base);
+ }
+
+ /* Return a zlib compression level that zlib will not reject.  Normalizes
+    the compression level from the command line flag, clamping non-default
+    values to the appropriate end of their valid range.  */
+
+ static int
+ lto_normalized_zlib_level (void)
+ {
+   int level = flag_lto_compression_level;
+
+   if (level != Z_DEFAULT_COMPRESSION)
+     {
+       if (level < Z_NO_COMPRESSION)
+         level = Z_NO_COMPRESSION;
+       else if (level > Z_BEST_COMPRESSION)
+         level = Z_BEST_COMPRESSION;
+     }
+
+   return
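The clamping done by lto_normalized_zlib_level can be tried in isolation. The sketch below takes the requested level as a parameter instead of reading flag_lto_compression_level, and defines local constants so that it builds without zlib.h; the names normalized_level and the k* constants are made up here, while the values -1, 0 and 9 are the ones zlib.h gives Z_DEFAULT_COMPRESSION, Z_NO_COMPRESSION and Z_BEST_COMPRESSION.

```cpp
// Local copies of the zlib.h level constants, so the sketch needs no zlib.
constexpr int kDefaultCompression = -1;  // Z_DEFAULT_COMPRESSION
constexpr int kNoCompression = 0;        // Z_NO_COMPRESSION
constexpr int kBestCompression = 9;      // Z_BEST_COMPRESSION

// Map any requested level to one zlib accepts: the "default" marker passes
// through untouched, everything else is clamped into [0, 9].
constexpr int normalized_level(int requested) {
    if (requested == kDefaultCompression)
        return requested;
    if (requested < kNoCompression)
        return kNoCompression;
    if (requested > kBestCompression)
        return kBestCompression;
    return requested;
}

// Compile-time spot checks of the three interesting cases.
static_assert(normalized_level(-1) == -1, "default marker is preserved");
static_assert(normalized_level(-5) == 0, "too-low values clamp to 0");
static_assert(normalized_level(12) == 9, "too-high values clamp to 9");
```

Note that a normalized level of 0 is still a valid zlib level rather than a request to bypass zlib, which fits the later observation in this thread that -flto-compression-level=0 does not short-circuit zlib for sections.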
Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use
On 2014.07.29 at 15:10 +0200, Richard Biener wrote:
> Updated patch passing LTO bootstrap (one warning fix) and with
> a memory leak fixed.

Testing with Firefox is impossible at the moment because of PR61885.
One thing I've noticed (before the ICE) is that virtual memory usage is
very high:

Address           Kbytes     RSS   Dirty Mode  Mapping
00400000           16344    9084       0 r-x-- lto1
013f6000              36      36      28 rw--- lto1
013ff000            1072     276     276 rw--- [ anon ]
034aa000        10154940 1540384 1540384 rw--- [ anon ]
2acf04af2000         136     136       0 r-x-- ld-2.19.90.so
2acf04b14000          88      88      88 rw--- [ anon ]
...
                -------- ------- -------
total kB        12022060 3388396 3377708

--
Markus
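For a sense of scale, the `total kB` line can be converted to GiB and to a resident fraction. The helper names below are invented for this sketch; the inputs are only the Kbytes and RSS totals from the listing.

```cpp
#include <cassert>

// Convert a size in KiB (the unit of the listing) to GiB.
double kib_to_gib(double kib) {
    return kib / (1024.0 * 1024.0);
}

// Fraction of the mapped address space that is actually resident.
double resident_fraction(double mapped_kib, double rss_kib) {
    assert(mapped_kib > 0.0);
    return rss_kib / mapped_kib;
}
```

With the totals above, kib_to_gib(12022060) is about 11.5 GiB of mapped address space, while resident_fraction(12022060, 3388396) is about 0.28, so a bit more than a quarter of it was resident at the time of the snapshot. Nearly all of the mapped size comes from the single anonymous region at 034aa000.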