On Thu, Sep 13, 2012 at 8:02 PM, Richard Guenther <richard.guent...@gmail.com> wrote: > On Wed, Sep 12, 2012 at 6:39 PM, Xinliang David Li <davi...@google.com> wrote: >> On Wed, Sep 12, 2012 at 2:13 AM, Richard Guenther >> <richard.guent...@gmail.com> wrote: >>> On Wed, Sep 12, 2012 at 7:06 AM, Dehao Chen <de...@google.com> wrote: >>>> Now I think we are facing a more complex problem. The data structure >>>> we use to store the location_adhoc_data are file-static in linemap.c >>>> in libcpp. These data structures are not guarded by GTY(()). >>>> Meanwhile, as we have removed the block data structure from >>>> gimple.gsbase as well as tree.exp (encoding them into an location_t). >>>> This could cause block being GCed and the LOCATION_BLOCK becoming >>>> dangling pointers. >>> >>> Uh. Note that it is quite important that we are able to garbage-collect >>> unused >>> BLOCKs, this is the whole point of removing unused BLOCK scopes in >>> remove_unused_locals. So this indeed becomes much more complicated ... >>> What would be desired is that the garbage collector can NULL an entry in >>> the mapping table when it is not referenced in any other way (that other >>> reference would be the BLOCK tree as stored in a FUNCTION_DECLs >>> DECL_INITIAL). >> >> It would be nice to GC those unused BLOCKS. I wonder how many BLOCKS >> are created for a large C++ program. This patch saves memory by >> shrinking tree size, is it a net win or loss without GC those BLOCKS? > > Memory usage issues pop up with C++ code using expression templates > (try BOOST MPL or tramp3d or some larger spirit testcases). Inlining
I compared the memory consumption for tramp3d, the patched version has a peak of 504065kB, while non-patched version has a peak of 491853kB. > creates tons of "empty" BLOCK trees that just wrap others. It is important > to be able to GC those. Now, it might be that no expression / location > which references the BLOCK survives, and if the line-table is not scanned Those non-used blocks will still be GCed in this patch. > by GC then we will just end up with never re-usable entries (the BLOCK address > may get re-used - can we get false sharing here?) That is true (memory wasted). However, in the tramp3d case, only 409600 entries are allocated in location_adhoc_data (4.9MB, 1% of the peak mem consumption). Thus the wasted entry should not be significant. Concerning re-used BLOCK, if the block address and the location are the same, the previously allocated entry will be reused. But it'll not affect the correctness. Thanks, Dehao > > Richard. > >> thanks, >> >> David >> >> >>> >>>> I tried to manipulate GTY to make it recognize the LOCATION_BLOCK from >>>> gimple.gsbase.location. However, neigher nested_ptr nor mark_hook can >>>> help me. >>>> >>>> Another approach would be guard the location_adhoc_data and related >>>> data structures in GTY(()). However, this is non-trivial because tree >>>> is not visible in libcpp. At the same time, my implementation heavily >>>> relies on hashtable to make the code efficient, thus it's quite tricky >>>> to make "param_is" and "use_params" work. >>>> >>>> The final approach, which I'll try tomorrow, would be move all my >>>> implementation from libcpp to gcc, and guard them with GTY(()). I >>>> still haven't thought of any potential problem of this approach. Any >>>> comments? >>> >>> I think moving the mapping to GC in a lazy manner as I described above >>> would be the way to go. For hashtables GC already supports if_marked, >>> not sure if similar support is available for arrays/vecs. >>> >>> Richard. >>> >>>> Thanks, >>>> Dehao >>>> >>>> On Tue, Sep 11, 2012 at 9:00 AM, Dehao Chen <de...@google.com> wrote: >>>>> I saw comments in tree-streamer-out.c: >>>>> >>>>> /* Do not stream BLOCK_SOURCE_LOCATION. We cannot handle debug >>>>> information >>>>> for early inlining so drop it on the floor instead of ICEing in >>>>> dwarf2out.c. */ >>>>> streamer_write_chain (ob, BLOCK_VARS (expr), ref_p); >>>>> >>>>> However, what the code is doing seemed contradictory with the comment. >>>>> Or am I missing something? >>>>> >>>>> >>>>> >>>>> On Tue, Sep 11, 2012 at 8:32 AM, Michael Matz <m...@suse.de> wrote: >>>>>> Hi, >>>>>> >>>>>> On Tue, 11 Sep 2012, Dehao Chen wrote: >>>>>> >>>>>>> Looks like we have two choices: >>>>>>> >>>>>>> 1. Stream out block info, and use LTO_SET_PREVAIL for TREE_CHAIN(t) >>>>>> >>>>>> This will actually not work correctly in some cases. The problem is, if >>>>>> the prevailing decl is already part of another chain (say in another >>>>>> block_var list) you would break the current chain. Hence block vars need >>>>>> special handling in the lto streamer (another reason why tree_chain is >>>>>> not >>>>>> the most clever think to use for this chain). This problem area needs to >>>>>> be solved somehow if block info is to be preserved correctly. >>>>>> >>>>>>> 2. Don't stream out block info for LTO, and still call LTO_NO_PREVAIL >>>>>>> (TREE_CHAIN (t)). >>>>>> >>>>>> That's also a large hammer as it basically will mean no debug info after >>>>>> LTO :-/ Sigh, at this point I have no good solution that doesn't involve >>>>>> quite some work, perhaps your hack is good enough for the time being, >>>>>> though I hate it :) >>>>> >>>>> I got it. Then I'll keep the patch as it is (remove the >>>>> LTO_NO_PREVAIL), and work with Honza to resolve the issue he had, and >>>>> then we should be good to check in? >>>>> >>>>> Thanks, >>>>> Dehao >>>>> >>>>>> >>>>>> >>>>>> Ciao, >>>>>> Michael.