Hi, I tried to rebuild firefox without visibilities that leads to about 10fold increase in LTO time, mostly by ltrans. What happens is that by the old rule that DECL_ONE_ONLY is duplicated into each partition that needs it we end up duplicating things everywhere. Moreover we mark tem all as used_from_ohter_partition and that makes us to not optimize them out from any of the partitions even if they are dead.
I recall my analysis when I was adding this condition - I assumed that every unit that uses COMDAT must also define it. This is wrong for keyed classes (as was later corrected, but the not updated for forced_by_abi) and in fact when we know that the COMDAT is going to be output to final file - i.e. linker plugin tells us there is no one defining the symbol, we can decompose COMDAT group early. I also noticed bug in varasm where we ignore the fact that symbol is known to be output in other partition that causes us to use GOT/PLT references with this fix and hidden visibilities re-instantiated. I believe this is the problem causing random jumps in Martin Liska's systetap graphs. Martin, is there any chance to generated updated graphs for firefox or other C++ monster, idealy both with -freorder-blocks-and-partition and with -fno-reorder-blocks-and-partition? Bootstrapped/regtested x86_64-linux, tested on firefox and comitted. Honza * lto/lto-partition.c (get_symbol_class): Only unforced DECL_ONE_ONLY needs duplicating, not generic COMDAT. * ipa.c (function_and_variable_visibility): Decompose DECL_ONE_ONLY groups when we know they are controlled by LTO. * varasm.c (default_binds_local_p_1): If object is in other partition, it will be resolved locally. Index: lto/lto-partition.c =================================================================== --- lto/lto-partition.c (revision 207479) +++ lto/lto-partition.c (working copy) @@ -94,10 +94,12 @@ get_symbol_class (symtab_node *node) else if (!cgraph (node)->definition) return SYMBOL_EXTERNAL; - /* Comdats are duplicated to every use unless they are keyed. - Those do not need duplication. */ - if (DECL_COMDAT (node->decl) + /* Linker discardable symbols are duplicated to every use unless they are + keyed. + Keyed symbols or those. */ + if (DECL_ONE_ONLY (node->decl) && !node->force_output + && !node->forced_by_abi && !symtab_used_from_object_file_p (node)) return SYMBOL_DUPLICATE; Index: ipa.c =================================================================== --- ipa.c (revision 207451) +++ ipa.c (working copy) @@ -1002,6 +1002,36 @@ function_and_variable_visibility (bool w if (DECL_EXTERNAL (decl_node->decl)) DECL_EXTERNAL (node->decl) = 1; } + + /* If whole comdat group is used only within LTO code, we can dissolve it, + we handle the unification ourselves. + We keep COMDAT and weak so visibility out of DSO does not change. + Later we may bring the symbols static if they are not exported. */ + if (DECL_ONE_ONLY (node->decl) + && (node->resolution == LDPR_PREVAILING_DEF_IRONLY + || node->resolution == LDPR_PREVAILING_DEF_IRONLY_EXP)) + { + symtab_node *next = node; + + if (node->same_comdat_group) + for (next = node->same_comdat_group; + next != node; + next = next->same_comdat_group) + if (next->externally_visible + && (next->resolution != LDPR_PREVAILING_DEF_IRONLY + && next->resolution != LDPR_PREVAILING_DEF_IRONLY_EXP)) + break; + if (node == next) + { + if (node->same_comdat_group) + for (next = node->same_comdat_group; + next != node; + next = next->same_comdat_group) + DECL_COMDAT_GROUP (next->decl) = NULL; + DECL_COMDAT_GROUP (node->decl) = NULL; + symtab_dissolve_same_comdat_group_list (node); + } + } } FOR_EACH_DEFINED_FUNCTION (node) { Index: varasm.c =================================================================== --- varasm.c (revision 207451) +++ varasm.c (working copy) @@ -6739,7 +6739,7 @@ default_binds_local_p_1 (const_tree exp, && (TREE_STATIC (exp) || DECL_EXTERNAL (exp))) { varpool_node *vnode = varpool_get_node (exp); - if (vnode && resolution_local_p (vnode->resolution)) + if (vnode && (resolution_local_p (vnode->resolution) || vnode->in_other_partition)) resolved_locally = true; if (vnode && resolution_to_local_definition_p (vnode->resolution)) @@ -6749,7 +6749,7 @@ default_binds_local_p_1 (const_tree exp, { struct cgraph_node *node = cgraph_get_node (exp); if (node - && resolution_local_p (node->resolution)) + && (resolution_local_p (node->resolution) || node->in_other_partition)) resolved_locally = true; if (node && resolution_to_local_definition_p (node->resolution))