On Wed, Apr 17, 2024 at 09:42:47AM +0200, Jakub Jelinek wrote: > When expand_or_defer_fn is called at_eof time, it calls import_export_decl > and then maybe_clone_body, which uses DECL_ONE_ONLY and comdat name in a > couple of places to try to optimize cdtors which are known to have the > same body by making the complete cdtor an alias to base cdtor (and in > that case also uses *[CD]5* as comdat group name instead of the normal > comdat group names specific to each mangled name). > Now, this optimization depends on DECL_ONE_ONLY and DECL_INTERFACE_KNOWN, > maybe_clone_body and can_alias_cdtor use: > if (DECL_ONE_ONLY (fn)) > cgraph_node::get_create (clone)->set_comdat_group (cxx_comdat_group > (clone)); > ... > bool can_alias = can_alias_cdtor (fn); > ... > /* Tell cgraph if both ctors or both dtors are known to have > the same body. */ > if (can_alias > && fns[0] > && idx == 1 > && cgraph_node::get_create (fns[0])->create_same_body_alias > (clone, fns[0])) > { > alias = true; > if (DECL_ONE_ONLY (fns[0])) > { > /* For comdat base and complete cdtors put them > into the same, *[CD]5* comdat group instead of > *[CD][12]*. */ > comdat_group = cdtor_comdat_group (fns[1], fns[0]); > cgraph_node::get_create (fns[0])->set_comdat_group > (comdat_group); > if (symtab_node::get (clone)->same_comdat_group) > symtab_node::get (clone)->remove_from_same_comdat_group (); > symtab_node::get (clone)->add_to_same_comdat_group > (symtab_node::get (fns[0])); > } > } > and > /* Don't use aliases for weak/linkonce definitions unless we can put both > symbols in the same COMDAT group. */ > return (DECL_INTERFACE_KNOWN (fn) > && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn)) > && (!DECL_ONE_ONLY (fn) > || (HAVE_COMDAT_GROUP && DECL_WEAK (fn)))); > The following testcase regressed with Marek's r14-5979 change, > when pr113208_0.C is compiled where the ctor is marked constexpr, > we no longer perform this optimization, where > _ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the > _ZN6vectorI12QualityValueEC5ERKS1_ comdat group and > _ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it, > instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in > _ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same > content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in > _ZN6vectorI12QualityValueEC1ERKS1_ comdat group. > Now, the linker seems to somehow cope with that, eventhough it > probably keeps both copies of the ctor, but seems LTO can't cope > with that and Honza doesn't know what it should do in that case > (linker decides that the prevailing symbol is > _ZN6vectorI12QualityValueEC2ERKS1_ (from the > _ZN6vectorI12QualityValueEC2ERKS1_ comdat group) and > _ZN6vectorI12QualityValueEC1ERKS1_ alias (from the other TU, > from _ZN6vectorI12QualityValueEC5ERKS1_ comdat group)). > > Note, the case where some constructor is marked constexpr in one > TU and not in another one happens pretty often in libstdc++ when > one mixes -std= flags used to compile different compilation units. > > The reason the optimization doesn't trigger when the constructor is > constexpr is that expand_or_defer_fn is called in that case much earlier > than when it is not constexpr; in the former case it is called when we > try to constant evaluate that constructor. But DECL_INTERFACE_KNOWN > is false in that case and comdat_linkage hasn't been called either > (I think it is desirable, because comdat group is stored in the cgraph > node and am not sure it is a good idea to create cgraph nodes for > something that might not be needed later on at all), so maybe_clone_body > clones the bodies, but doesn't make them as aliases. > > The following patch is an attempt to redo this optimization when > import_export_decl is called at_eof time on the base/complete cdtor > (or deleting dtor). It will not do anything if maybe_clone_body > hasn't been called uyet (the TREE_ASM_WRITTEN check on the > DECL_MAYBE_IN_CHARGE_CDTOR_P), or when one or both of the base/complete > cdtors have been lowered already, or when maybe_clone_body called > maybe_thunk_body and it was successful. Otherwise retries the > can_alias_cdtor check and makes the complete cdtor alias to the > base cdtor with adjustments to the comdat group.
Here is an updated version of the patch which changes - DECL_SAVED_TREE (fns[1]) = NULL_TREE; + release_function_body (fns[1]); as suggested by Honza. Bootstrapped/regtested on x86_64-linux and i686-linux again, ok for trunk? 2024-04-22 Jakub Jelinek <ja...@redhat.com> PR lto/113208 * cp-tree.h (maybe_optimize_cdtor): Declare. * decl2.cc (import_export_decl): Call it for cloned cdtors. * optimize.cc (maybe_optimize_cdtor): New function. * g++.dg/lto/pr113208_0.C: New test. * g++.dg/lto/pr113208_1.C: New file. * g++.dg/lto/pr113208.h: New file. --- gcc/cp/cp-tree.h.jj 2024-04-16 17:18:37.286279533 +0200 +++ gcc/cp/cp-tree.h 2024-04-16 17:45:01.685635709 +0200 @@ -7447,6 +7447,7 @@ extern bool handle_module_option (unsign /* In optimize.cc */ extern tree clone_attrs (tree); extern bool maybe_clone_body (tree); +extern void maybe_optimize_cdtor (tree); /* In parser.cc */ extern tree cp_convert_range_for (tree, tree, tree, cp_decomp *, bool, --- gcc/cp/decl2.cc.jj 2024-04-16 17:18:37.287279519 +0200 +++ gcc/cp/decl2.cc 2024-04-16 17:45:01.686635695 +0200 @@ -3568,6 +3568,9 @@ import_export_decl (tree decl) } DECL_INTERFACE_KNOWN (decl) = 1; + + if (DECL_CLONED_FUNCTION_P (decl)) + maybe_optimize_cdtor (decl); } /* Return an expression that performs the destruction of DECL, which --- gcc/cp/optimize.cc.jj 2024-04-16 17:18:37.374278327 +0200 +++ gcc/cp/optimize.cc 2024-04-22 14:52:59.310630914 +0200 @@ -723,3 +723,58 @@ maybe_clone_body (tree fn) /* We don't need to process the original function any further. */ return 1; } + +/* If maybe_clone_body is called while the cdtor is still tentative, + DECL_ONE_ONLY will be false and so will be can_alias_cdtor (fn). + In that case we wouldn't try to optimize using an alias and instead + would emit separate base and complete cdtor. The following function + attempts to still optimize that case when we import_export_decl + is called first time on one of the clones. */ + +void +maybe_optimize_cdtor (tree orig_decl) +{ + tree fns[3]; + tree fn = DECL_CLONED_FUNCTION (orig_decl); + gcc_checking_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn)); + if (DECL_INTERFACE_KNOWN (fn) + || !TREE_ASM_WRITTEN (fn) + || !DECL_ONE_ONLY (orig_decl) + || symtab->global_info_ready) + return; + + populate_clone_array (fn, fns); + + if (!fns[0] || !fns[1]) + return; + for (int i = 2 - !fns[2]; i >= 0; --i) + if (fns[i] != orig_decl && DECL_INTERFACE_KNOWN (fns[i])) + return; + DECL_INTERFACE_KNOWN (fn) = 1; + comdat_linkage (fn); + if (!can_alias_cdtor (fn)) + return; + /* For comdat base and complete cdtors put them into the same, + *[CD]5* comdat group instead of *[CD][12]*. */ + auto n0 = cgraph_node::get_create (fns[0]); + auto n1 = cgraph_node::get_create (fns[1]); + auto n2 = fns[2] ? cgraph_node::get_create (fns[1]) : NULL; + if (n0->lowered || n1->lowered || (n2 && n2->lowered)) + return; + import_export_decl (fns[0]); + n1->definition = false; + if (!n0->create_same_body_alias (fns[1], fns[0])) + return; + + tree comdat_group = cdtor_comdat_group (fns[1], fns[0]); + n1 = cgraph_node::get (fns[1]); + n0->set_comdat_group (comdat_group); + if (n1->same_comdat_group) + n1->remove_from_same_comdat_group (); + n1->add_to_same_comdat_group (n0); + if (fns[2]) + n2->add_to_same_comdat_group (n0); + import_export_decl (fns[1]); + /* Remove the body now that it is an alias. */ + release_function_body (fns[1]); +} --- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj 2024-04-16 17:45:01.687635682 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_0.C 2024-04-16 17:45:01.687635682 +0200 @@ -0,0 +1,13 @@ +// { dg-lto-do link } +// { dg-lto-options { {-O1 -std=c++20 -flto}} } +// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" } +// { dg-require-linker-plugin "" } + +#define CONSTEXPR constexpr +#include "pr113208.h" + +struct QualityValue; +struct k : vector<QualityValue> {}; + +void m(k); +void n(k i) { m(i); } --- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj 2024-04-16 17:45:01.687635682 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208_1.C 2024-04-16 17:45:01.687635682 +0200 @@ -0,0 +1,6 @@ +#define CONSTEXPR +#include "pr113208.h" + +struct QualityValue; +vector<QualityValue> values1; +vector<QualityValue> values{values1}; --- gcc/testsuite/g++.dg/lto/pr113208.h.jj 2024-04-16 17:45:01.687635682 +0200 +++ gcc/testsuite/g++.dg/lto/pr113208.h 2024-04-16 17:45:01.687635682 +0200 @@ -0,0 +1,10 @@ +template <typename _Tp> struct _Vector_base { + int g() const; + _Vector_base(int, int); +}; +template <typename _Tp> +struct vector : _Vector_base<_Tp> { + CONSTEXPR vector(const vector &__x) + : _Vector_base<_Tp>(1, __x.g()) {} + vector() : _Vector_base<_Tp>(1, 2) {} +}; Jakub