Hello,

While trying to bootstrap with GCAC checking enabled and some
instrumentation to measure how often objects are being marked, I
noticed that a lot of cache misses happen because already-marked
objects are being tested again (e.g. via ggc_marked_p). This struck me
as rather odd, until I looked at what the GC markers for my favorite
data structures (the CFG) look like.

Basic blocks live in GC space because they can point to other GC-space
objects (rtx, gimple), and these objects can point back to the basic
block they're contained in. In addition, the (function-specific)
basic_block_info array points to all live basic blocks and of course
the edges have basic_block pointers as well. Finally, loop have
pointers to the loop header and latch basic_blocks.

The effect is that, on average, ggc_marked_p or one of its variants
gets called 17 times on each basic_block, *per ggc_collect call*!

This turns out to be typical for many data structures. It's more
difficult to get accurate numbers, but GCAC is especially slow when
building a large PCH and I'm guessing that this is in part due to
tree-to-tree pointers.

This all made me wonder why we can't use the known hierarchy of the
intermediate representations. Ideally, there should be only two
classes of objects: global state variables (not ideal, but a reality)
and objects reachable from the symbol table. AFAICT anything that
isn't in one of those two categories has probably become garbage
somewhere along the way.

As an experiment, I did a small modification to the GC walkers of the
CFG. I'd have started with the symbol table, but there's no
documentation and I don't know how memory is managed or what
invariants should hold (e.g. "everything that will be output must be
reachable via the symbol table"). Modifying the CFG was easier.

The effect for building my set of cc1-i files with a GCAC-checking
enabled compiler is a ~20% speedup.

It seems to be that this hierarchy could also be useful if/when GCC
will move away from generated walkers (i.e. gengtype). If we can
somehow enforce a hierarchy, the markers become much simpler. Also,
the hierarchy could be used to verify "no leaks" from alloc pools, and
probably for other things I can't think of right now.

Honza, can you please add some documentation about the symtab? What do
you think of the above from your symtab point of view?

Another "problem" with gengtype is that it doesn't know what types can
end up in a PCH. The CFG data structures can *never* be in a PCH, but
there still are PCH writer functions. This is true for many other data
structures as well. Has anyone ever measured/investigated what PCH
writer functions are actually used? Should we (as an intermediate step
towards streaming PCHs) explicitly GTY-mark types that should have PCH
writer functions, and not generate such functions for unmarked types?

Ciao!
Steven

Attachment: GC_hierarchy.diff
Description: Binary data

Reply via email to