Hello, While trying to bootstrap with GCAC checking enabled and some instrumentation to measure how often objects are being marked, I noticed that a lot of cache misses happen because already-marked objects are being tested again (e.g. via ggc_marked_p). This struck me as rather odd, until I looked at what the GC markers for my favorite data structures (the CFG) look like.
Basic blocks live in GC space because they can point to other GC-space objects (rtx, gimple), and these objects can point back to the basic block they're contained in. In addition, the (function-specific) basic_block_info array points to all live basic blocks and of course the edges have basic_block pointers as well. Finally, loop have pointers to the loop header and latch basic_blocks. The effect is that, on average, ggc_marked_p or one of its variants gets called 17 times on each basic_block, *per ggc_collect call*! This turns out to be typical for many data structures. It's more difficult to get accurate numbers, but GCAC is especially slow when building a large PCH and I'm guessing that this is in part due to tree-to-tree pointers. This all made me wonder why we can't use the known hierarchy of the intermediate representations. Ideally, there should be only two classes of objects: global state variables (not ideal, but a reality) and objects reachable from the symbol table. AFAICT anything that isn't in one of those two categories has probably become garbage somewhere along the way. As an experiment, I did a small modification to the GC walkers of the CFG. I'd have started with the symbol table, but there's no documentation and I don't know how memory is managed or what invariants should hold (e.g. "everything that will be output must be reachable via the symbol table"). Modifying the CFG was easier. The effect for building my set of cc1-i files with a GCAC-checking enabled compiler is a ~20% speedup. It seems to be that this hierarchy could also be useful if/when GCC will move away from generated walkers (i.e. gengtype). If we can somehow enforce a hierarchy, the markers become much simpler. Also, the hierarchy could be used to verify "no leaks" from alloc pools, and probably for other things I can't think of right now. Honza, can you please add some documentation about the symtab? What do you think of the above from your symtab point of view? Another "problem" with gengtype is that it doesn't know what types can end up in a PCH. The CFG data structures can *never* be in a PCH, but there still are PCH writer functions. This is true for many other data structures as well. Has anyone ever measured/investigated what PCH writer functions are actually used? Should we (as an intermediate step towards streaming PCHs) explicitly GTY-mark types that should have PCH writer functions, and not generate such functions for unmarked types? Ciao! Steven
GC_hierarchy.diff
Description: Binary data