Hi,
I fear I wasn't as thorough in splitting this one into several
patches, but the different cleanups are at least mostly in different
files. They are:
* lto-lang remembers all builtin decls in a local list, to be returned
by the getdecls langhook. But as we have our own write_globals langhook
this isn't actually called (except by dbxout.c), so there's no point in
remembering.
* lto.c:lto_materialize_function has code to read in the function body
sections, do something with them in non-wpa mode, and then discard them.
There's no point in even reading them in wpa mode (except for a
dubious error message that would be better as an assert).
* gimple.c:gimple_type_leader_entry is a fixed length cache for speeding
up our type merging machinery. It can hold references to many trees that
have meanwhile been merged, which interferes with freeing up much memory
via a ggc_collect in early-merging LTO. We can simply make it deletable.
* ipa-inline.c: some tidying: don't pass function calls as macro
arguments, and call a costly function only after the early-outs.
* lto-streamer-out.c: it writes out and compares strings character by
character. memcmp and lto_output_data_stream work just as well.
* lto-streamer: output_unreferenced_globals writes out all global varpool
decls. The reading side simply reads over all of them, and ignores
them. This was supposed to help symbol resolution, and it probably once
did. But for some time now we have properly emitted varpool and cgraph
nodes, references between them, and a proper symtab. There's no need to
emit these trees again.
* lto-streamer: the following changes the bytecode:
1: all indices into the cache are unsigned, hence we should say
so, instead of casting back and forth
2: trees are only appended to the cache, when writing out. When reading
in we read in all trees in the stream one after the other, also
appending to the cache. References to existing trees _always_ are to
- well - existing trees, hence to those already emitted earlier in
the stream, i.e. with a smaller offset, and more importantly with a
known index even at reader side.
So the offset is never used; remove it and all associated
tracking and params.
3: for the same reason we also don't need to stream the index that new
trees get in the cache. They will get exactly the ones they also had
when writing out. We could use it as a consistency check, but we
stream the expected tree-node for back-references for that already.
Obviously we do need to stream the index in back references (aka
pickled references).
(the index could change if there's a different set of nodes preloaded
into the cache between writing out and reading in. But that would
have much worse problems already, silently overwriting slots with
trees from the stream; we should do away with the preloaded nodes,
and instead rely on type merging to get canonical versions of the
builtin trees)
Not streaming offset and index for most trees obviously shortens the
bytecode somewhat but I don't have statistics on how much. Not much would
be my guess.
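
To illustrate points 2 and 3, here is a toy model (not the patch, and
not the real streamer types): a "tree" is just an int, the cache is an
append-only array, and all toy_* names are made up for this sketch. The
writer never emits an index for a new node, only for back references;
the reader appends in stream order and so ends up with the same indices.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, not GCC's types: a "tree" is an int key and the cache
   is a fixed-size array that is only ever appended to.  */
#define TOY_CACHE_MAX 64

struct toy_cache {
  int nodes[TOY_CACHE_MAX];
  unsigned len;                 /* next index to hand out */
};

enum toy_tag { TOY_NEW, TOY_REF };

struct toy_record {
  enum toy_tag tag;
  int payload;                  /* node key for TOY_NEW, unused otherwise */
  unsigned ix;                  /* cache index for TOY_REF, unused otherwise */
};

/* Append NODE to the cache and return the index it received.  */
static unsigned
toy_cache_append (struct toy_cache *c, int node)
{
  assert (c->len < TOY_CACHE_MAX);
  c->nodes[c->len] = node;
  return c->len++;
}

/* Return 1 and set *IX if NODE is already cached, else return 0.  */
static int
toy_cache_lookup (const struct toy_cache *c, int node, unsigned *ix)
{
  for (unsigned i = 0; i < c->len; i++)
    if (c->nodes[i] == node)
      {
        *ix = i;
        return 1;
      }
  return 0;
}

/* Writer: one record per input node.  A new node is appended to the
   cache and streamed without its index; a known node becomes a back
   reference that carries only the index.  */
static void
toy_write (struct toy_cache *c, const int *in, size_t n,
           struct toy_record *out)
{
  for (size_t i = 0; i < n; i++)
    {
      unsigned ix;
      if (toy_cache_lookup (c, in[i], &ix))
        out[i] = (struct toy_record) { TOY_REF, 0, ix };
      else
        {
          toy_cache_append (c, in[i]);
          out[i] = (struct toy_record) { TOY_NEW, in[i], 0 };
        }
    }
}

/* Reader: append new nodes in stream order, so each one lands at the
   index the writer gave it; resolve back references by index.  */
static void
toy_read (struct toy_cache *c, const struct toy_record *in, size_t n,
          int *out)
{
  for (size_t i = 0; i < n; i++)
    if (in[i].tag == TOY_NEW)
      {
        toy_cache_append (c, in[i].payload);
        out[i] = in[i].payload;
      }
    else
      {
        /* Back references always point at an earlier node.  */
        assert (in[i].ix < c->len);
        out[i] = c->nodes[in[i].ix];
      }
}
```

This only holds if both sides start from the same cache contents, which
is the caveat about preloaded nodes mentioned above.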
Regstrapped on x86_64-linux with the other three cleanups. Okay for
trunk?
Ciao,
Michael.
--
* lto-streamer.h (struct lto_streamer_cache_d): Remove offsets
and next_slot members.
(lto_streamer_cache_insert, lto_streamer_cache_insert_at,
lto_streamer_cache_lookup, lto_streamer_cache_get): Adjust prototypes.
(lto_streamer_cache_append): Declare.
* lto-streamer.c (lto_streamer_cache_add_to_node_array): Use
unsigned index, remove offset parameter, ensure that we append
or update existing entries.
(lto_streamer_cache_insert_1): Use unsigned index, remove offset_p
parameter, update next_slot for append.
(lto_streamer_cache_insert): Use unsigned index, remove offset_p
parameter.
(lto_streamer_cache_insert_at): Likewise.
(lto_streamer_cache_append): New function.
(lto_streamer_cache_lookup): Use unsigned index.
(lto_streamer_cache_get): Likewise.
(lto_record_common_node): Don't test tree_node_can_be_shared.
(preload_common_node): Adjust call to lto_streamer_cache_insert.
(lto_streamer_cache_delete): Don't free offsets member.
* lto-streamer-out.c (eq_string_slot_node): Use memcmp.
(lto_output_string_with_length): Use lto_output_data_stream.
(lto_output_tree_header): Remove ix parameter, don't write it.
(lto_output_builtin_tree): Likewise.
(lto_write_tree): Adjust callers to above, don't track and write
offset, write unsigned index.
(output_unreferenced_globals): Don't emit all global vars.
(write_global_references): Use unsigned indices.
(lto_output_decl_state_refs): Likewise.