Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use

2014-07-30 Thread Jan Hubicka
  Updated patch passing LTO bootstrap (one warning fix) and
  with a memory leak fixed.
 
 Testing with Firefox is impossible at the moment because of PR61885.
 One thing I've noticed (before the ICE) is that virtual memory usage is
 very high:
 
 Address         Kbytes      RSS    Dirty  Mode   Mapping
 00400000         16344     9084        0  r-x--  lto1
 013f6000            36       36       28  rw---  lto1
 013ff000          1072      276      276  rw---  [ anon ]
 034aa000      10154940  1540384  1540384  rw---  [ anon ]
 2acf04af2000       136      136        0  r-x--  ld-2.19.90.so
 2acf04b14000        88       88       88  rw---  [ anon ]
 ...
               --------  -------  -------
 total kB      12022060  3388396  3377708

You should get into the memory peak caused by streaming before we get
to the ICE.  I am about to commit the second half of the ipa-devirt
speculation changes that fixes the ICE.

Honza
 
 -- 
 Markus


Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use

2014-07-30 Thread Richard Biener
On Wed, Jul 30, 2014 at 7:51 AM, Markus Trippelsdorf
mar...@trippelsdorf.de wrote:
 On 2014.07.29 at 15:10 +0200, Richard Biener wrote:
 On Tue, 29 Jul 2014, Richard Biener wrote:

 
  This re-organizes the LTO streamer to do compression transparently
  in the data-streamer routines (and disables section compression
  by defaulting to -flto-compression-level=0).  This avoids
  keeping the whole uncompressed sections in memory, only retaining
  the compressed ones.
 
  The downside is that we lose compression of at least the string
  parts (they are abusing the streaming interface quite awkwardly
  and doing random-accesses with offsets into the uncompressed
  section).  With a little bit of surgery we can get that back I
  think (but we'd have to keep the uncompressed piece in memory
  somewhere which means losing the memory use advantage).
 
  Very lightly tested so far (running lto.exp).  I'll try an LTO
  bootstrap now.
 
  I wonder what the change is on WPA memory use for larger
  projects and what the effect on object file size is.

 Updated patch passing LTO bootstrap (one warning fix) and
 with a memory leak fixed.

 Testing with Firefox is impossible at the moment because of PR61885.
 One thing I've noticed (before the ICE) is that virtual memory usage is
 very high:

 Address         Kbytes      RSS    Dirty  Mode   Mapping
 00400000         16344     9084        0  r-x--  lto1
 013f6000            36       36       28  rw---  lto1
 013ff000          1072      276      276  rw---  [ anon ]
 034aa000      10154940  1540384  1540384  rw---  [ anon ]
 2acf04af2000       136      136        0  r-x--  ld-2.19.90.so
 2acf04b14000        88       88       88  rw---  [ anon ]
 ...
               --------  -------  -------
 total kB      12022060  3388396  3377708

Maybe there is still a memleak (just checked that LTOing int main() {}
doesn't leak).

Otherwise I forgot to enable compression at all ;)

Richard.

 --
 Markus


Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use

2014-07-30 Thread Markus Trippelsdorf
On 2014.07.30 at 10:31 +0200, Richard Biener wrote:
 On Wed, Jul 30, 2014 at 7:51 AM, Markus Trippelsdorf
 mar...@trippelsdorf.de wrote:
  On 2014.07.29 at 15:10 +0200, Richard Biener wrote:
  On Tue, 29 Jul 2014, Richard Biener wrote:
 
  
   This re-organizes the LTO streamer to do compression transparently
   in the data-streamer routines (and disables section compression
   by defaulting to -flto-compression-level=0).  This avoids
   keeping the whole uncompressed sections in memory, only retaining
   the compressed ones.
  
   The downside is that we lose compression of at least the string
   parts (they are abusing the streaming interface quite awkwardly
   and doing random-accesses with offsets into the uncompressed
   section).  With a little bit of surgery we can get that back I
   think (but we'd have to keep the uncompressed piece in memory
   somewhere which means losing the memory use advantage).
  
   Very lightly tested so far (running lto.exp).  I'll try an LTO
   bootstrap now.
  
   I wonder what the change is on WPA memory use for larger
   projects and what the effect on object file size is.
 
  Updated patch passing LTO bootstrap (one warning fix) and
  with a memory leak fixed.
 
  Testing with Firefox is impossible at the moment because of PR61885.
  One thing I've noticed (before the ICE) is that virtual memory usage is
  very high:
 
  Address         Kbytes      RSS    Dirty  Mode   Mapping
  00400000         16344     9084        0  r-x--  lto1
  013f6000            36       36       28  rw---  lto1
  013ff000          1072      276      276  rw---  [ anon ]
  034aa000      10154940  1540384  1540384  rw---  [ anon ]
  2acf04af2000       136      136        0  r-x--  ld-2.19.90.so
  2acf04b14000        88       88       88  rw---  [ anon ]
  ...
                --------  -------  -------
  total kB      12022060  3388396  3377708
 
 Maybe there is still a memleak (just checked that LTOing int main() {}
 doesn't leak).
 
 Otherwise I forgot to enable compression at all ;)

Heh.  Parallel streaming has also stopped working.

-- 
Markus


Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use

2014-07-30 Thread Richard Biener
On Wed, 30 Jul 2014, Markus Trippelsdorf wrote:

 On 2014.07.30 at 10:31 +0200, Richard Biener wrote:
  On Wed, Jul 30, 2014 at 7:51 AM, Markus Trippelsdorf
  mar...@trippelsdorf.de wrote:
   On 2014.07.29 at 15:10 +0200, Richard Biener wrote:
   On Tue, 29 Jul 2014, Richard Biener wrote:
  
   
This re-organizes the LTO streamer to do compression transparently
in the data-streamer routines (and disables section compression
by defaulting to -flto-compression-level=0).  This avoids
keeping the whole uncompressed sections in memory, only retaining
the compressed ones.
   
The downside is that we lose compression of at least the string
parts (they are abusing the streaming interface quite awkwardly
and doing random-accesses with offsets into the uncompressed
section).  With a little bit of surgery we can get that back I
think (but we'd have to keep the uncompressed piece in memory
somewhere which means losing the memory use advantage).
   
Very lightly tested so far (running lto.exp).  I'll try an LTO
bootstrap now.
   
I wonder what the change is on WPA memory use for larger
projects and what the effect on object file size is.
  
   Updated patch passing LTO bootstrap (one warning fix) and
   with a memory leak fixed.
  
   Testing with Firefox is impossible at the moment because of PR61885.
   One thing I've noticed (before the ICE) is that virtual memory usage is
   very high:
  
   Address         Kbytes      RSS    Dirty  Mode   Mapping
   00400000         16344     9084        0  r-x--  lto1
   013f6000            36       36       28  rw---  lto1
   013ff000          1072      276      276  rw---  [ anon ]
   034aa000      10154940  1540384  1540384  rw---  [ anon ]
   2acf04af2000       136      136        0  r-x--  ld-2.19.90.so
   2acf04b14000        88       88       88  rw---  [ anon ]
   ...
                 --------  -------  -------
   total kB      12022060  3388396  3377708
  
  Maybe there is still a memleak (just checked that LTOing int main() {}
  doesn't leak).
  
  Otherwise I forgot to enable compression at all ;)
 
 He. Also parallel streaming stopped working.

Interesting.  Meanwhile after enabling compression the benefit of the
patch is the reduction in LTRANS file size (stage3 cc1) from
460MB to 178MB because we now compress here.

Object file size of stage3-gcc/*.o is increased to 540MB from 506MB
(note LTO bootstrap is with fat LTO objects, so the increase in
LTO IL size is bigger than it looks).  I suppose mainly due
to no longer compressing strings.

I can also observe the apparent memleak - will try to hunt it down.

Richard.


Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use

2014-07-30 Thread Richard Biener
On Wed, 30 Jul 2014, Richard Biener wrote:

 On Wed, Jul 30, 2014 at 7:51 AM, Markus Trippelsdorf
 mar...@trippelsdorf.de wrote:
  On 2014.07.29 at 15:10 +0200, Richard Biener wrote:
  On Tue, 29 Jul 2014, Richard Biener wrote:
 
  
   This re-organizes the LTO streamer to do compression transparently
   in the data-streamer routines (and disables section compression
   by defaulting to -flto-compression-level=0).  This avoids
   keeping the whole uncompressed sections in memory, only retaining
   the compressed ones.
  
   The downside is that we lose compression of at least the string
   parts (they are abusing the streaming interface quite awkwardly
   and doing random-accesses with offsets into the uncompressed
   section).  With a little bit of surgery we can get that back I
   think (but we'd have to keep the uncompressed piece in memory
   somewhere which means losing the memory use advantage).
  
   Very lightly tested so far (running lto.exp).  I'll try an LTO
   bootstrap now.
  
   I wonder what the change is on WPA memory use for larger
   projects and what the effect on object file size is.
 
  Updated patch passing LTO bootstrap (one warning fix) and
  with a memory leak fixed.
 
  Testing with Firefox is impossible at the moment because of PR61885.
  One thing I've noticed (before the ICE) is that virtual memory usage is
  very high:
 
  Address         Kbytes      RSS    Dirty  Mode   Mapping
  00400000         16344     9084        0  r-x--  lto1
  013f6000            36       36       28  rw---  lto1
  013ff000          1072      276      276  rw---  [ anon ]
  034aa000      10154940  1540384  1540384  rw---  [ anon ]
  2acf04af2000       136      136        0  r-x--  ld-2.19.90.so
  2acf04b14000        88       88       88  rw---  [ anon ]
  ...
                --------  -------  -------
  total kB      12022060  3388396  3377708
 
 Maybe there is still a memleak (just checked that LTOing int main() {}
 doesn't leak).

Found it:

Index: gcc/lto-section-in.c
===
--- gcc/lto-section-in.c.orig   2014-07-30 12:40:27.950225826 +0200
+++ gcc/lto-section-in.c        2014-07-30 12:37:44.179237102 +0200
@@ -249,7 +249,7 @@ lto_destroy_simple_input_block (struct l
struct lto_input_block *ib,
const char *data, size_t len)
 {
-  free (ib);
+  delete ib;
   lto_free_section_data (file_data, section_type, NULL, data, len);
 }
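
Some background on why this one-line change matters: an object created with new has to be released with delete, which runs the destructor before the storage is returned.  Passing such an object to free () is undefined behaviour and in practice skips the destructor, so anything the object owns is leaked.  The sketch below only illustrates that rule, under the assumption that the input block is now a C++ object allocated with new; input_block, live_buffers and destroy_block are hypothetical names, not the real GCC types.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for a C++ object that owns a heap buffer.
// It is not the real lto_input_block.
struct input_block
{
  char *buf;
  static int live_buffers;   // number of buffers currently allocated

  explicit input_block (std::size_t len)
    : buf (static_cast<char *> (std::malloc (len)))
  {
    ++live_buffers;
  }

  ~input_block ()
  {
    std::free (buf);
    --live_buffers;
  }
};

int input_block::live_buffers = 0;

// Release a block that was obtained from new.  delete runs the destructor
// and then frees the object itself; free () would do neither correctly.
static void
destroy_block (input_block *ib)
{
  assert (ib != nullptr);
  delete ib;
}
```

Creating a block with new input_block (64) and handing it to destroy_block brings live_buffers back to zero, which is the property a free ()-based teardown cannot guarantee.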
 
Richard.


Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use

2014-07-30 Thread Martin Liška


On 07/30/2014 11:41 AM, Richard Biener wrote:

On Wed, 30 Jul 2014, Richard Biener wrote:


On Wed, Jul 30, 2014 at 7:51 AM, Markus Trippelsdorf
mar...@trippelsdorf.de wrote:

On 2014.07.29 at 15:10 +0200, Richard Biener wrote:

On Tue, 29 Jul 2014, Richard Biener wrote:


This re-organizes the LTO streamer to do compression transparently
in the data-streamer routines (and disables section compression
by defaulting to -flto-compression-level=0).  This avoids
keeping the whole uncompressed sections in memory, only retaining
the compressed ones.

The downside is that we lose compression of at least the string
parts (they are abusing the streaming interface quite awkwardly
and doing random-accesses with offsets into the uncompressed
section).  With a little bit of surgery we can get that back I
think (but we'd have to keep the uncompressed piece in memory
somewhere which means losing the memory use advantage).

Very lightly tested so far (running lto.exp).  I'll try an LTO
bootstrap now.

I wonder what the change is on WPA memory use for larger
projects and what the effect on object file size is.

Updated patch passing LTO bootstrap (one warning fix) and
with a memory leak fixed.

Testing with Firefox is impossible at the moment because of PR61885.
One thing I've noticed (before the ICE) is that virtual memory usage is
very high:

Address         Kbytes      RSS    Dirty  Mode   Mapping
00400000         16344     9084        0  r-x--  lto1
013f6000            36       36       28  rw---  lto1
013ff000          1072      276      276  rw---  [ anon ]
034aa000      10154940  1540384  1540384  rw---  [ anon ]
2acf04af2000       136      136        0  r-x--  ld-2.19.90.so
2acf04b14000        88       88       88  rw---  [ anon ]
...
              --------  -------  -------
total kB      12022060  3388396  3377708

Maybe there is still a memleak (just checked that LTOing int main() {}
doesn't leak).

Found it:

Index: gcc/lto-section-in.c
===
--- gcc/lto-section-in.c.orig   2014-07-30 12:40:27.950225826 +0200
+++ gcc/lto-section-in.c        2014-07-30 12:37:44.179237102 +0200
@@ -249,7 +249,7 @@ lto_destroy_simple_input_block (struct l
 struct lto_input_block *ib,
 const char *data, size_t len)
  {
-  free (ib);
+  delete ib;
lto_free_section_data (file_data, section_type, NULL, data, len);
  }
  
Richard.

Hello,
   here is the memory/CPU usage for the patch.  For both, I used sync and
drop_caches.

Url: 
https://drive.google.com/file/d/0B0pisUJ80pO1andOX19JMHV3LVE/edit?usp=sharing

Martin



Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use

2014-07-30 Thread Richard Biener
On Wed, Jul 30, 2014 at 1:14 PM, Martin Liška mli...@suse.cz wrote:

 On 07/30/2014 11:41 AM, Richard Biener wrote:

 On Wed, 30 Jul 2014, Richard Biener wrote:

 On Wed, Jul 30, 2014 at 7:51 AM, Markus Trippelsdorf
 mar...@trippelsdorf.de wrote:

 On 2014.07.29 at 15:10 +0200, Richard Biener wrote:

 On Tue, 29 Jul 2014, Richard Biener wrote:

 This re-organizes the LTO streamer to do compression transparently
 in the data-streamer routines (and disables section compression
 by defaulting to -flto-compression-level=0).  This avoids
 keeping the whole uncompressed sections in memory, only retaining
 the compressed ones.

 The downside is that we lose compression of at least the string
 parts (they are abusing the streaming interface quite awkwardly
 and doing random-accesses with offsets into the uncompressed
 section).  With a little bit of surgery we can get that back I
 think (but we'd have to keep the uncompressed piece in memory
 somewhere which means losing the memory use advantage).

 Very lightly tested so far (running lto.exp).  I'll try an LTO
 bootstrap now.

 I wonder what the change is on WPA memory use for larger
 projects and what the effect on object file size is.

 Updated patch passing LTO bootstrap (one warning fix) and
 with a memory leak fixed.

 Testing with Firefox is impossible at the moment because of PR61885.
 One thing I've noticed (before the ICE) is that virtual memory usage is
 very high:

 Address         Kbytes      RSS    Dirty  Mode   Mapping
 00400000         16344     9084        0  r-x--  lto1
 013f6000            36       36       28  rw---  lto1
 013ff000          1072      276      276  rw---  [ anon ]
 034aa000      10154940  1540384  1540384  rw---  [ anon ]
 2acf04af2000       136      136        0  r-x--  ld-2.19.90.so
 2acf04b14000        88       88       88  rw---  [ anon ]
 ...
               --------  -------  -------
 total kB      12022060  3388396  3377708

 Maybe there is still a memleak (just checked that LTOing int main() {}
 doesn't leak).

 Found it:

 Index: gcc/lto-section-in.c
 ===
 --- gcc/lto-section-in.c.orig   2014-07-30 12:40:27.950225826 +0200
 +++ gcc/lto-section-in.c        2014-07-30 12:37:44.179237102 +0200
 @@ -249,7 +249,7 @@ lto_destroy_simple_input_block (struct l
  struct lto_input_block *ib,
  const char *data, size_t len)
   {
 -  free (ib);
 +  delete ib;
 lto_free_section_data (file_data, section_type, NULL, data, len);
   }
   Richard.

 Hello,
there's memory/CPU usage for the patch. for both, I used sync and
 drop_caches.

 Url:
 https://drive.google.com/file/d/0B0pisUJ80pO1andOX19JMHV3LVE/edit?usp=sharing

Ok, it turns out setting -flto-compression-level to 0 doesn't really
short-circuit zlib for sections.  So the following does that the hard
but effective way.

Index: gcc/lto-section-out.c
===
--- gcc/lto-section-out.c.orig  2014-07-30 13:33:06.634008355 +0200
+++ gcc/lto-section-out.c   2014-07-30 13:29:19.468023995 +0200
@@ -80,7 +80,7 @@ lto_begin_section (const char *name, boo
  data is anything other than assembler output.  The effect here is that
  we get compression of IL only in non-ltrans object files.  */
   gcc_assert (compression_stream == NULL);
-  if (compress)
+  if (compress && 0)
 compression_stream = lto_start_compression (lto_append_data, NULL);
 }

Index: gcc/lto-section-in.c
===
--- gcc/lto-section-in.c.orig   2014-07-30 13:33:06.637008355 +0200
+++ gcc/lto-section-in.c        2014-07-30 13:31:57.329013126 +0200
@@ -153,7 +153,7 @@ lto_get_section_data (struct lto_file_de

   /* FIXME lto: WPA mode does not write compressed sections, so for now
  suppress uncompression if flag_ltrans.  */
-  if (!flag_ltrans)
+  if (!flag_ltrans && 0)
 {
   /* Create a mapping header containing the underlying data and length,
 and prepend this to the uncompression buffer.  The uncompressed data
@@ -200,7 +200,7 @@ lto_free_section_data (struct lto_file_d

   /* FIXME lto: WPA mode does not write compressed sections, so for now
  suppress uncompression mapping if flag_ltrans.  */
-  if (flag_ltrans)
+  if (flag_ltrans || 1)
 {
   (free_section_f) (file_data, section_type, name, data, len);
   return;

does that change anything?
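
To spell out what the three edited guards do: `x && 0` is always false and `x || 1` is always true, so the writer never starts a compression stream, the reader never sets up the uncompression mapping, and the free path always hands the raw data straight back.  A small self-contained check of that reading follows; the function names are made up for this sketch and only wrap the conditions from the diff.

```cpp
#include <cassert>

// The three guards as they read after the patch.  Parameter names mirror
// the variables in the diff; the function names are hypothetical.
static bool writer_compresses (bool compress)      { return compress && 0; }
static bool reader_uncompresses (bool flag_ltrans) { return !flag_ltrans && 0; }
static bool free_takes_raw_path (bool flag_ltrans) { return flag_ltrans || 1; }

// Returns true if section compression is bypassed for every flag value.
static bool
section_compression_bypassed ()
{
  for (int c = 0; c < 2; ++c)
    for (int l = 0; l < 2; ++l)
      if (writer_compresses (c != 0)
          || reader_uncompresses (l != 0)
          || !free_takes_raw_path (l != 0))
        return false;
  return true;
}
```

In other words, the hack makes the section layer behave the same in WPA and LTRANS mode, leaving any compression to the data-streamer routines.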

Thanks,
Richard.

 Martin



Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use

2014-07-30 Thread Martin Liška


On 07/30/2014 12:37 PM, Richard Biener wrote:

On Wed, Jul 30, 2014 at 1:14 PM, Martin Liška mli...@suse.cz wrote:

On 07/30/2014 11:41 AM, Richard Biener wrote:

On Wed, 30 Jul 2014, Richard Biener wrote:


On Wed, Jul 30, 2014 at 7:51 AM, Markus Trippelsdorf
mar...@trippelsdorf.de wrote:

On 2014.07.29 at 15:10 +0200, Richard Biener wrote:

On Tue, 29 Jul 2014, Richard Biener wrote:


This re-organizes the LTO streamer to do compression transparently
in the data-streamer routines (and disables section compression
by defaulting to -flto-compression-level=0).  This avoids
keeping the whole uncompressed sections in memory, only retaining
the compressed ones.

The downside is that we lose compression of at least the string
parts (they are abusing the streaming interface quite awkwardly
and doing random-accesses with offsets into the uncompressed
section).  With a little bit of surgery we can get that back I
think (but we'd have to keep the uncompressed piece in memory
somewhere which means losing the memory use advantage).

Very lightly tested so far (running lto.exp).  I'll try an LTO
bootstrap now.

I wonder what the change is on WPA memory use for larger
projects and what the effect on object file size is.

Updated patch passing LTO bootstrap (one warning fix) and
with a memory leak fixed.

Testing with Firefox is impossible at the moment because of PR61885.
One thing I've noticed (before the ICE) is that virtual memory usage is
very high:

Address         Kbytes      RSS    Dirty  Mode   Mapping
00400000         16344     9084        0  r-x--  lto1
013f6000            36       36       28  rw---  lto1
013ff000          1072      276      276  rw---  [ anon ]
034aa000      10154940  1540384  1540384  rw---  [ anon ]
2acf04af2000       136      136        0  r-x--  ld-2.19.90.so
2acf04b14000        88       88       88  rw---  [ anon ]
...
              --------  -------  -------
total kB      12022060  3388396  3377708

Maybe there is still a memleak (just checked that LTOing int main() {}
doesn't leak).

Found it:

Index: gcc/lto-section-in.c
===
--- gcc/lto-section-in.c.orig   2014-07-30 12:40:27.950225826 +0200
+++ gcc/lto-section-in.c        2014-07-30 12:37:44.179237102 +0200
@@ -249,7 +249,7 @@ lto_destroy_simple_input_block (struct l
  struct lto_input_block *ib,
  const char *data, size_t len)
   {
-  free (ib);
+  delete ib;
 lto_free_section_data (file_data, section_type, NULL, data, len);
   }
   Richard.

Hello,
there's memory/CPU usage for the patch. for both, I used sync and
drop_caches.

Url:
https://drive.google.com/file/d/0B0pisUJ80pO1andOX19JMHV3LVE/edit?usp=sharing

Ok, it turns out setting -flto-compression-level to 0 doesn't really
short-circuit zlib for sections.  So the following does that the hard
but effective way.

Index: gcc/lto-section-out.c
===
--- gcc/lto-section-out.c.orig  2014-07-30 13:33:06.634008355 +0200
+++ gcc/lto-section-out.c   2014-07-30 13:29:19.468023995 +0200
@@ -80,7 +80,7 @@ lto_begin_section (const char *name, boo
   data is anything other than assembler output.  The effect here is that
   we get compression of IL only in non-ltrans object files.  */
gcc_assert (compression_stream == NULL);
-  if (compress)
+  if (compress && 0)
  compression_stream = lto_start_compression (lto_append_data, NULL);
  }

Index: gcc/lto-section-in.c
===
--- gcc/lto-section-in.c.orig   2014-07-30 13:33:06.637008355 +0200
+++ gcc/lto-section-in.c        2014-07-30 13:31:57.329013126 +0200
@@ -153,7 +153,7 @@ lto_get_section_data (struct lto_file_de

/* FIXME lto: WPA mode does not write compressed sections, so for now
   suppress uncompression if flag_ltrans.  */
-  if (!flag_ltrans)
+  if (!flag_ltrans && 0)
  {
/* Create a mapping header containing the underlying data and length,
  and prepend this to the uncompression buffer.  The uncompressed data
@@ -200,7 +200,7 @@ lto_free_section_data (struct lto_file_d

/* FIXME lto: WPA mode does not write compressed sections, so for now
   suppress uncompression mapping if flag_ltrans.  */
-  if (flag_ltrans)
+  if (flag_ltrans || 1)
  {
(free_section_f) (file_data, section_type, name, data, len);
return;

does that change anything?

Thanks,
Richard.

Here are the new numbers:
https://drive.google.com/file/d/0B0pisUJ80pO1aG83N2JXLWNVUW8/edit?usp=sharing,
where I reduced the scale to 10GB to better identify any differences.

Martin


Martin





[PATCH] LTO streamer reorg - try to reduce WPA memory use

2014-07-29 Thread Richard Biener

This re-organizes the LTO streamer to do compression transparently
in the data-streamer routines (and disables section compression
by defaulting to -flto-compression-level=0).  This avoids
keeping the whole uncompressed sections in memory, only retaining
the compressed ones.

The downside is that we lose compression of at least the string
parts (they are abusing the streaming interface quite awkwardly
and doing random-accesses with offsets into the uncompressed
section).  With a little bit of surgery we can get that back I
think (but we'd have to keep the uncompressed piece in memory
somewhere which means losing the memory use advantage).
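
A minimal sketch of the idea in the two paragraphs above, not the patch itself: bytes collect in one fixed-size scratch block, each full block is compressed and only the compressed bytes are retained, and the scratch block is reused.  To keep the sketch free of a zlib dependency, a trivial run-length encoding stands in for deflate; block_stream, rle_compress and peak_uncompressed are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for deflate: run-length encoding as (count byte, value byte).
static std::string
rle_compress (const std::string &in)
{
  std::string out;
  for (std::size_t i = 0; i < in.size (); )
    {
      std::size_t run = 1;
      while (i + run < in.size () && in[i + run] == in[i] && run < 255)
        ++run;
      out.push_back (static_cast<char> (run));
      out.push_back (in[i]);
      i += run;
    }
  return out;
}

// Hypothetical output stream that compresses block by block.
class block_stream
{
public:
  explicit block_stream (std::size_t block_size) : block_size_ (block_size) {}

  void write (char c)
  {
    scratch_.push_back (c);
    if (scratch_.size () == block_size_)
      flush ();
  }

  // Compress whatever is left and return the total compressed size.
  std::size_t finish ()
  {
    if (!scratch_.empty ())
      flush ();
    std::size_t total = 0;
    for (const std::string &b : blocks_)
      total += b.size ();
    return total;
  }

  // Largest number of uncompressed bytes held at any one time.
  std::size_t peak_uncompressed () const { return peak_; }

private:
  void flush ()
  {
    if (scratch_.size () > peak_)
      peak_ = scratch_.size ();
    blocks_.push_back (rle_compress (scratch_));
    scratch_.clear ();   // reuse the scratch buffer for the next block
  }

  std::size_t block_size_;
  std::size_t peak_ = 0;
  std::string scratch_;
  std::vector<std::string> blocks_;
};
```

Writing 100 identical bytes through a stream with 16-byte blocks never holds more than 16 uncompressed bytes.  It also shows why offset-based random access stops working: an offset into the uncompressed section no longer corresponds to a position in the retained blocks.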

Very lightly tested so far (running lto.exp).  I'll try an LTO
bootstrap now.

I wonder what the change is on WPA memory use for larger
projects and what the effect on object file size is.

Richard.


insert-changelog-here

Index: gcc/data-streamer-out.c
===
*** gcc/data-streamer-out.c.orig  2014-07-29 13:04:48.255073822 +0200
--- gcc/data-streamer-out.c  2014-07-29 13:09:40.301053715 +0200
*************** along with GCC; see the file COPYING3.
*** 22,27 ****
--- 22,32 ----
  
  #include "config.h"
  #include "system.h"
+ /* zlib.h includes other system headers.  Those headers may test feature
+    test macros.  config.h may define feature test macros.  For this reason,
+    zlib.h needs to be included after, rather than before, config.h and
+    system.h.  */
+ #include <zlib.h>
  #include "coretypes.h"
  #include "tree.h"
  #include "basic-block.h"
*************** along with GCC; see the file COPYING3.
*** 32,37 ****
--- 37,199 ----
  #include "gimple.h"
  #include "data-streamer.h"
  
  
+ 
+ /* Finishes the last block, eventually compressing it, and returns the
+    total size of the stream.  */
+ 
+ unsigned int
+ lto_output_stream::finish ()
+ {
+   if (compress
+       && current_pointer)
+     {
+       /* Compress the last (partial) block.  */
+       compress_current_block (true);
+       left_in_block = zlib_stream->avail_out;
+       int status = deflateEnd (zlib_stream);
+       if (status != Z_OK)
+         internal_error ("compressed stream: %s", zError (status));
+       free (zlib_stream);
+     }
+   current_pointer = NULL;
+ 
+   unsigned int size = 0;
+   for (lto_char_ptr_base *b = first_block; b; b = (lto_char_ptr_base *)b->ptr)
+     size += block_size - sizeof (lto_char_ptr_base);
+   size -= left_in_block;
+   return size;
+ }
+ 
+ /* Returns a pointer to the first block of the chain of blocks to output.  */
+ 
+ lto_char_ptr_base *
+ lto_output_stream::get_blocks ()
+ {
+   finish ();
+   return first_block;
+ }
+ 
+ /* Adds a new block to output stream OBS.  */
+ 
+ void
+ lto_output_stream::append_block ()
+ {
+   struct lto_char_ptr_base *new_block;
+   bool first_p = false;
+ 
+   gcc_assert (left_in_block == 0 && block_size > sizeof (lto_char_ptr_base));
+ 
+   if (first_block == NULL)
+     {
+       /* This is the first time the stream has been written into.  */
+       new_block = (struct lto_char_ptr_base*) xmalloc (block_size);
+       first_block = new_block;
+       first_p = true;
+     }
+   else
+     {
+       if (compress)
+         {
+           /* Compress the current block and link it into the list.  */
+           compress_current_block (false);
+           /* Re-use the uncompressed buffer.  */
+           new_block = current_block;
+         }
+       else
+         {
+           /* Get a new block and link it into the list.  */
+           new_block = (struct lto_char_ptr_base*) xmalloc (block_size);
+           /* The first bytes of the block are reserved as a pointer to
+              the next block.  Set the chain of the full block to the
+              pointer to the new block.  */
+           lto_char_ptr_base *tptr = current_block;
+           tptr->ptr = (char *) new_block;
+         }
+     }
+ 
+   /* Set the place for the next char at the first position after the
+      chain to the next block.  */
+   current_pointer
+     = ((char *) new_block) + sizeof (struct lto_char_ptr_base);
+   current_block = new_block;
+   /* Null out the newly allocated block's pointer to the next block.  */
+   new_block->ptr = NULL;
+   left_in_block = block_size - sizeof (struct lto_char_ptr_base);
+ 
+ #if 0
+   if (first_p)
+     streamer_write_hwi_stream (this, compress);
+ #endif
+ }
+ 
+ /* Return a zlib compression level that zlib will not reject.  Normalizes
+    the compression level from the command line flag, clamping non-default
+    values to the appropriate end of their valid range.  */
+ 
+ static int
+ lto_normalized_zlib_level (void)
+ {
+   int level = flag_lto_compression_level;
+ 
+   if (level != Z_DEFAULT_COMPRESSION)
+     {
+       if (level < Z_NO_COMPRESSION)
+         level = Z_NO_COMPRESSION;
+       else if (level > Z_BEST_COMPRESSION)
+         level = Z_BEST_COMPRESSION;
+     }
+ 
+   return level;
+ }
+ 
+ void
+ lto_output_stream::compress_current_block (bool last)
+ {
+   int status;
+ 
+   
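
For reference, the level clamping done by lto_normalized_zlib_level earlier in this patch can be tried in isolation.  The following is a sketch only: the constants repeat zlib's documented values (Z_NO_COMPRESSION is 0, Z_BEST_COMPRESSION is 9, Z_DEFAULT_COMPRESSION is -1) so that it builds without zlib.h, and normalized_level is a hypothetical name that takes the level as a parameter instead of reading flag_lto_compression_level.

```cpp
#include <cassert>

// zlib's documented level constants, repeated so the sketch needs no zlib.h.
enum
{
  NO_COMPRESSION = 0,
  BEST_COMPRESSION = 9,
  DEFAULT_COMPRESSION = -1
};

// Clamp a requested level into [NO_COMPRESSION, BEST_COMPRESSION], leaving
// the "use the library default" sentinel untouched.
static int
normalized_level (int level)
{
  if (level != DEFAULT_COMPRESSION)
    {
      if (level < NO_COMPRESSION)
        level = NO_COMPRESSION;
      else if (level > BEST_COMPRESSION)
        level = BEST_COMPRESSION;
    }
  return level;
}
```

The sentinel has to be tested first because -1 is below the valid range and would otherwise be clamped to 0, silently turning "default" into "no compression".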

Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use

2014-07-29 Thread Richard Biener
On Tue, 29 Jul 2014, Richard Biener wrote:

 
 This re-organizes the LTO streamer to do compression transparently
 in the data-streamer routines (and disables section compression
 by defaulting to -flto-compression-level=0).  This avoids
 keeping the whole uncompressed sections in memory, only retaining
 the compressed ones.
 
 The downside is that we lose compression of at least the string
 parts (they are abusing the streaming interface quite awkwardly
 and doing random-accesses with offsets into the uncompressed
 section).  With a little bit of surgery we can get that back I
 think (but we'd have to keep the uncompressed piece in memory
 somewhere which means losing the memory use advantage).
 
 Very lightly tested so far (running lto.exp).  I'll try an LTO
 bootstrap now.
 
 I wonder what the change is on WPA memory use for larger
 projects and what the effect on object file size is.

Updated patch passing LTO bootstrap (one warning fix) and
with a memory leak fixed.

I'll probably try to split out cleanups from this patch.

Richard.


Index: gcc/data-streamer-out.c
===
*** gcc/data-streamer-out.c.orig  2014-07-29 13:04:48.255073822 +0200
--- gcc/data-streamer-out.c  2014-07-29 14:35:22.908699653 +0200
*************** along with GCC; see the file COPYING3.
*** 22,27 ****
--- 22,32 ----
  
  #include "config.h"
  #include "system.h"
+ /* zlib.h includes other system headers.  Those headers may test feature
+    test macros.  config.h may define feature test macros.  For this reason,
+    zlib.h needs to be included after, rather than before, config.h and
+    system.h.  */
+ #include <zlib.h>
  #include "coretypes.h"
  #include "tree.h"
  #include "basic-block.h"
*************** along with GCC; see the file COPYING3.
*** 32,37 ****
--- 37,194 ----
  #include "gimple.h"
  #include "data-streamer.h"
  
  
+ 
+ /* Finishes the last block, eventually compressing it, and returns the
+    total size of the stream.  */
+ 
+ unsigned int
+ lto_output_stream::finish ()
+ {
+   if (compress
+       && current_pointer)
+     {
+       /* Compress the last (partial) block.  */
+       compress_current_block (true);
+       left_in_block = zlib_stream->avail_out;
+       free (current_block);
+       current_block = NULL;
+       int status = deflateEnd (zlib_stream);
+       if (status != Z_OK)
+         internal_error ("compressed stream: %s", zError (status));
+       free (zlib_stream);
+     }
+   current_pointer = NULL;
+ 
+   unsigned int size = 0;
+   for (lto_char_ptr_base *b = first_block; b; b = (lto_char_ptr_base *)b->ptr)
+     size += block_size - sizeof (lto_char_ptr_base);
+   size -= left_in_block;
+   return size;
+ }
+ 
+ /* Returns a pointer to the first block of the chain of blocks to output.  */
+ 
+ lto_char_ptr_base *
+ lto_output_stream::get_blocks ()
+ {
+   finish ();
+   return first_block;
+ }
+ 
+ /* Adds a new block to output stream OBS.  */
+ 
+ void
+ lto_output_stream::append_block ()
+ {
+   struct lto_char_ptr_base *new_block;
+ 
+   gcc_assert (left_in_block == 0 && block_size > sizeof (lto_char_ptr_base));
+ 
+   if (first_block == NULL)
+     {
+       /* This is the first time the stream has been written into.  */
+       new_block = (struct lto_char_ptr_base*) xmalloc (block_size);
+       first_block = new_block;
+     }
+   else
+     {
+       if (compress)
+         {
+           /* Compress the current block and link it into the list.  */
+           compress_current_block (false);
+           /* Re-use the uncompressed buffer.  */
+           new_block = current_block;
+         }
+       else
+         {
+           /* Get a new block and link it into the list.  */
+           new_block = (struct lto_char_ptr_base*) xmalloc (block_size);
+           /* The first bytes of the block are reserved as a pointer to
+              the next block.  Set the chain of the full block to the
+              pointer to the new block.  */
+           lto_char_ptr_base *tptr = current_block;
+           tptr->ptr = (char *) new_block;
+         }
+     }
+ 
+   /* Set the place for the next char at the first position after the
+      chain to the next block.  */
+   current_pointer
+     = ((char *) new_block) + sizeof (struct lto_char_ptr_base);
+   current_block = new_block;
+   /* Null out the newly allocated block's pointer to the next block.  */
+   new_block->ptr = NULL;
+   left_in_block = block_size - sizeof (struct lto_char_ptr_base);
+ }
+ 
+ /* Return a zlib compression level that zlib will not reject.  Normalizes
+    the compression level from the command line flag, clamping non-default
+    values to the appropriate end of their valid range.  */
+ 
+ static int
+ lto_normalized_zlib_level (void)
+ {
+   int level = flag_lto_compression_level;
+ 
+   if (level != Z_DEFAULT_COMPRESSION)
+     {
+       if (level < Z_NO_COMPRESSION)
+         level = Z_NO_COMPRESSION;
+       else if (level > Z_BEST_COMPRESSION)
+         level = Z_BEST_COMPRESSION;
+     }
+ 
+   return 

Re: [PATCH] LTO streamer reorg - try to reduce WPA memory use

2014-07-29 Thread Markus Trippelsdorf
On 2014.07.29 at 15:10 +0200, Richard Biener wrote:
 On Tue, 29 Jul 2014, Richard Biener wrote:
 
  
  This re-organizes the LTO streamer to do compression transparently
  in the data-streamer routines (and disables section compression
  by defaulting to -flto-compression-level=0).  This avoids
  keeping the whole uncompressed sections in memory, only retaining
  the compressed ones.
  
  The downside is that we lose compression of at least the string
  parts (they are abusing the streaming interface quite awkwardly
  and doing random-accesses with offsets into the uncompressed
  section).  With a little bit of surgery we can get that back I
  think (but we'd have to keep the uncompressed piece in memory
  somewhere which means losing the memory use advantage).
  
  Very lightly tested so far (running lto.exp).  I'll try an LTO
  bootstrap now.
  
  I wonder what the change is on WPA memory use for larger
  projects and what the effect on object file size is.
 
 Updated patch passing LTO bootstrap (one warning fix) and
 with a memory leak fixed.

Testing with Firefox is impossible at the moment because of PR61885.
One thing I've noticed (before the ICE) is that virtual memory usage is
very high:

Address         Kbytes      RSS    Dirty  Mode   Mapping
00400000         16344     9084        0  r-x--  lto1
013f6000            36       36       28  rw---  lto1
013ff000          1072      276      276  rw---  [ anon ]
034aa000      10154940  1540384  1540384  rw---  [ anon ]
2acf04af2000       136      136        0  r-x--  ld-2.19.90.so
2acf04b14000        88       88       88  rw---  [ anon ]
...
              --------  -------  -------
total kB      12022060  3388396  3377708

-- 
Markus