On Wed, Sep 12, 2012 at 7:20 PM, Dehao Chen <de...@google.com> wrote: > There is another bug in the patch (not covered by unittests, > discovered through spec benchmarks). > > When we remove unused locals, we do not mark the block as used for > debug stmt, but gimple-streamer-out will still stream out blocks for > debug stmt. There can be 2 fixes:
Because doing so would create code generation differences -g vs. -g0. > 1. > --- a/gcc/gimple-streamer-out.c > +++ b/gcc/gimple-streamer-out.c > @@ -77,7 +77,8 @@ output_gimple_stmt (struct output_block *ob, gimple stmt) > lto_output_location (ob, LOCATION_LOCUS (gimple_location (stmt))); > > /* Emit the lexical block holding STMT. */ > - stream_write_tree (ob, gimple_block (stmt), true); > + if (!is_gimple_debug (stmt)) > + stream_write_tree (ob, gimple_block (stmt), true); > > /* Emit the operands. */ > switch (gimple_code (stmt)) > > 2. > --- a/gcc/tree-ssa-live.c > +++ b/gcc/tree-ssa-live.c > @@ -726,9 +726,6 @@ remove_unused_locals (void) > gimple stmt = gsi_stmt (gsi); > tree b = gimple_block (stmt); > > - if (is_gimple_debug (stmt)) > - continue; > - > if (gimple_clobber_p (stmt)) > { > have_local_clobbers = true; > > Either fix could work. Any suggests which one should we go? The 2nd one will not work and is not acceptable. The 1st one - well ... what happens on trunk right now? The debug stmt points to a BLOCK that is possibly removed from the BLOCK tree? In this case I think the fix is 3. make sure remove_unused_scope_block_p will clear BLOCKs from all stmts / expressions that have been removed. Richard. > Thanks, > Dehao > > On Wed, Sep 12, 2012 at 10:05 AM, Dehao Chen <de...@google.com> wrote: >> There are two parts that needs memory management: >> >> 1. The BLOCK structure. This is managed by GC. I originally thought >> that removing blocks from tree.gsbase would paralyze GC. This turned >> out not to be a concern because DECL_INITIAL will still mark those >> used tree nodes. This patch may decrease the memory consumption by >> removing blocks from tree/gimple. However, as it makes more blocks >> become used, they also increase the memory consumption. >> 2. The data structure in libcpp that maintains the hashtable for the >> location->block mapping. This is relatively minor because for the >> largest source I've seen, it only maintains less than 100K entries in >> the array (less than 1M total memory consumption). However, as it is a >> global data structure, it may make LTO unhappy. Honza is helping >> testing the memory consumption on LTO (but we first need to make this >> patch work for LTO). If the LTO result turns out ok, we probably don't >> want to put these under GC because: 1. it'll make things much more >> complicated. 2. using self managed memory is more efficient (as this >> is frequently used in many passes). 3. not using GC actually saves >> memory because even though the block is in the map, it can still be >> GCed as soon as it's not reachable from DECL_INITIAL. >> >> I've tested this on some very large C++ files (each one takes more >> than 10s to build), the memory consumption does not see noticeable >> increase/decrease. >> >> Thanks, >> Dehao >> >> On Wed, Sep 12, 2012 at 9:39 AM, Xinliang David Li <davi...@google.com> >> wrote: >>> On Wed, Sep 12, 2012 at 2:13 AM, Richard Guenther >>> <richard.guent...@gmail.com> wrote: >>>> On Wed, Sep 12, 2012 at 7:06 AM, Dehao Chen <de...@google.com> wrote: >>>>> Now I think we are facing a more complex problem. The data structure >>>>> we use to store the location_adhoc_data are file-static in linemap.c >>>>> in libcpp. These data structures are not guarded by GTY(()). >>>>> Meanwhile, as we have removed the block data structure from >>>>> gimple.gsbase as well as tree.exp (encoding them into an location_t). >>>>> This could cause block being GCed and the LOCATION_BLOCK becoming >>>>> dangling pointers. >>>> >>>> Uh. Note that it is quite important that we are able to garbage-collect >>>> unused >>>> BLOCKs, this is the whole point of removing unused BLOCK scopes in >>>> remove_unused_locals. So this indeed becomes much more complicated ... >>>> What would be desired is that the garbage collector can NULL an entry in >>>> the mapping table when it is not referenced in any other way (that other >>>> reference would be the BLOCK tree as stored in a FUNCTION_DECLs >>>> DECL_INITIAL). >>> >>> It would be nice to GC those unused BLOCKS. I wonder how many BLOCKS >>> are created for a large C++ program. This patch saves memory by >>> shrinking tree size, is it a net win or loss without GC those BLOCKS? >>> >>> thanks, >>> >>> David >>> >>> >>>> >>>>> I tried to manipulate GTY to make it recognize the LOCATION_BLOCK from >>>>> gimple.gsbase.location. However, neigher nested_ptr nor mark_hook can >>>>> help me. >>>>> >>>>> Another approach would be guard the location_adhoc_data and related >>>>> data structures in GTY(()). However, this is non-trivial because tree >>>>> is not visible in libcpp. At the same time, my implementation heavily >>>>> relies on hashtable to make the code efficient, thus it's quite tricky >>>>> to make "param_is" and "use_params" work. >>>>> >>>>> The final approach, which I'll try tomorrow, would be move all my >>>>> implementation from libcpp to gcc, and guard them with GTY(()). I >>>>> still haven't thought of any potential problem of this approach. Any >>>>> comments? >>>> >>>> I think moving the mapping to GC in a lazy manner as I described above >>>> would be the way to go. For hashtables GC already supports if_marked, >>>> not sure if similar support is available for arrays/vecs. >>>> >>>> Richard. >>>> >>>>> Thanks, >>>>> Dehao >>>>> >>>>> On Tue, Sep 11, 2012 at 9:00 AM, Dehao Chen <de...@google.com> wrote: >>>>>> I saw comments in tree-streamer-out.c: >>>>>> >>>>>> /* Do not stream BLOCK_SOURCE_LOCATION. We cannot handle debug >>>>>> information >>>>>> for early inlining so drop it on the floor instead of ICEing in >>>>>> dwarf2out.c. */ >>>>>> streamer_write_chain (ob, BLOCK_VARS (expr), ref_p); >>>>>> >>>>>> However, what the code is doing seemed contradictory with the comment. >>>>>> Or am I missing something? >>>>>> >>>>>> >>>>>> >>>>>> On Tue, Sep 11, 2012 at 8:32 AM, Michael Matz <m...@suse.de> wrote: >>>>>>> Hi, >>>>>>> >>>>>>> On Tue, 11 Sep 2012, Dehao Chen wrote: >>>>>>> >>>>>>>> Looks like we have two choices: >>>>>>>> >>>>>>>> 1. Stream out block info, and use LTO_SET_PREVAIL for TREE_CHAIN(t) >>>>>>> >>>>>>> This will actually not work correctly in some cases. The problem is, if >>>>>>> the prevailing decl is already part of another chain (say in another >>>>>>> block_var list) you would break the current chain. Hence block vars >>>>>>> need >>>>>>> special handling in the lto streamer (another reason why tree_chain is >>>>>>> not >>>>>>> the most clever think to use for this chain). This problem area needs >>>>>>> to >>>>>>> be solved somehow if block info is to be preserved correctly. >>>>>>> >>>>>>>> 2. Don't stream out block info for LTO, and still call LTO_NO_PREVAIL >>>>>>>> (TREE_CHAIN (t)). >>>>>>> >>>>>>> That's also a large hammer as it basically will mean no debug info after >>>>>>> LTO :-/ Sigh, at this point I have no good solution that doesn't involve >>>>>>> quite some work, perhaps your hack is good enough for the time being, >>>>>>> though I hate it :) >>>>>> >>>>>> I got it. Then I'll keep the patch as it is (remove the >>>>>> LTO_NO_PREVAIL), and work with Honza to resolve the issue he had, and >>>>>> then we should be good to check in? >>>>>> >>>>>> Thanks, >>>>>> Dehao >>>>>> >>>>>>> >>>>>>> >>>>>>> Ciao, >>>>>>> Michael.