On Wed, Apr 17, 2024 at 09:42:47AM +0200, Jakub Jelinek wrote:
> When expand_or_defer_fn is called at_eof time, it calls import_export_decl
> and then maybe_clone_body, which uses DECL_ONE_ONLY and comdat name in a
> couple of places to try to optimize cdtors which are known to have the
> same body by making the complete cdtor an alias to base cdtor (and in
> that case also uses *[CD]5* as comdat group name instead of the normal
> comdat group names specific to each mangled name).
> Now, this optimization depends on DECL_ONE_ONLY and DECL_INTERFACE_KNOWN,
> maybe_clone_body and can_alias_cdtor use:
>       if (DECL_ONE_ONLY (fn))
>         cgraph_node::get_create (clone)->set_comdat_group (cxx_comdat_group 
> (clone));
> ...
>   bool can_alias = can_alias_cdtor (fn);
> ...
>       /* Tell cgraph if both ctors or both dtors are known to have
>          the same body.  */
>       if (can_alias
>           && fns[0]
>           && idx == 1
>           && cgraph_node::get_create (fns[0])->create_same_body_alias
>                (clone, fns[0]))
>         {
>           alias = true;
>           if (DECL_ONE_ONLY (fns[0]))
>             {
>               /* For comdat base and complete cdtors put them
>                  into the same, *[CD]5* comdat group instead of
>                  *[CD][12]*.  */
>               comdat_group = cdtor_comdat_group (fns[1], fns[0]);
>               cgraph_node::get_create (fns[0])->set_comdat_group 
> (comdat_group);
>               if (symtab_node::get (clone)->same_comdat_group)
>                 symtab_node::get (clone)->remove_from_same_comdat_group ();
>               symtab_node::get (clone)->add_to_same_comdat_group
>                 (symtab_node::get (fns[0]));
>             }
>         }
> and
>   /* Don't use aliases for weak/linkonce definitions unless we can put both
>      symbols in the same COMDAT group.  */
>   return (DECL_INTERFACE_KNOWN (fn)
>           && (SUPPORTS_ONE_ONLY || !DECL_WEAK (fn))
>           && (!DECL_ONE_ONLY (fn)
>               || (HAVE_COMDAT_GROUP && DECL_WEAK (fn))));
> The following testcase regressed with Marek's r14-5979 change,
> when pr113208_0.C is compiled where the ctor is marked constexpr,
> we no longer perform this optimization, where
> _ZN6vectorI12QualityValueEC2ERKS1_ was emitted in the
> _ZN6vectorI12QualityValueEC5ERKS1_ comdat group and
> _ZN6vectorI12QualityValueEC1ERKS1_ was made an alias to it,
> instead we emit _ZN6vectorI12QualityValueEC2ERKS1_ in
> _ZN6vectorI12QualityValueEC2ERKS1_ comdat group and the same
> content _ZN6vectorI12QualityValueEC1ERKS1_ as separate symbol in
> _ZN6vectorI12QualityValueEC1ERKS1_ comdat group.
> Now, the linker seems to somehow cope with that, eventhough it
> probably keeps both copies of the ctor, but seems LTO can't cope
> with that and Honza doesn't know what it should do in that case
> (linker decides that the prevailing symbol is
> _ZN6vectorI12QualityValueEC2ERKS1_ (from the
> _ZN6vectorI12QualityValueEC2ERKS1_ comdat group) and
> _ZN6vectorI12QualityValueEC1ERKS1_ alias (from the other TU,
> from _ZN6vectorI12QualityValueEC5ERKS1_ comdat group)).
> 
> Note, the case where some constructor is marked constexpr in one
> TU and not in another one happens pretty often in libstdc++ when
> one mixes -std= flags used to compile different compilation units.
> 
> The reason the optimization doesn't trigger when the constructor is
> constexpr is that expand_or_defer_fn is called in that case much earlier
> than when it is not constexpr; in the former case it is called when we
> try to constant evaluate that constructor.  But DECL_INTERFACE_KNOWN
> is false in that case and comdat_linkage hasn't been called either
> (I think it is desirable, because comdat group is stored in the cgraph
> node and am not sure it is a good idea to create cgraph nodes for
> something that might not be needed later on at all), so maybe_clone_body
> clones the bodies, but doesn't make them as aliases.
> 
> The following patch is an attempt to redo this optimization when
> import_export_decl is called at_eof time on the base/complete cdtor
> (or deleting dtor).  It will not do anything if maybe_clone_body
> hasn't been called uyet (the TREE_ASM_WRITTEN check on the
> DECL_MAYBE_IN_CHARGE_CDTOR_P), or when one or both of the base/complete
> cdtors have been lowered already, or when maybe_clone_body called
> maybe_thunk_body and it was successful.  Otherwise retries the
> can_alias_cdtor check and makes the complete cdtor alias to the
> base cdtor with adjustments to the comdat group.

Here is an updated version of the patch which changes
-  DECL_SAVED_TREE (fns[1]) = NULL_TREE;
+  release_function_body (fns[1]);
as suggested by Honza.

Bootstrapped/regtested on x86_64-linux and i686-linux again, ok for trunk?

2024-04-22  Jakub Jelinek  <ja...@redhat.com>

        PR lto/113208
        * cp-tree.h (maybe_optimize_cdtor): Declare.
        * decl2.cc (import_export_decl): Call it for cloned cdtors.
        * optimize.cc (maybe_optimize_cdtor): New function.

        * g++.dg/lto/pr113208_0.C: New test.
        * g++.dg/lto/pr113208_1.C: New file.
        * g++.dg/lto/pr113208.h: New file.

--- gcc/cp/cp-tree.h.jj 2024-04-16 17:18:37.286279533 +0200
+++ gcc/cp/cp-tree.h    2024-04-16 17:45:01.685635709 +0200
@@ -7447,6 +7447,7 @@ extern bool handle_module_option (unsign
 /* In optimize.cc */
 extern tree clone_attrs                                (tree);
 extern bool maybe_clone_body                   (tree);
+extern void maybe_optimize_cdtor               (tree);
 
 /* In parser.cc */
 extern tree cp_convert_range_for (tree, tree, tree, cp_decomp *, bool,
--- gcc/cp/decl2.cc.jj  2024-04-16 17:18:37.287279519 +0200
+++ gcc/cp/decl2.cc     2024-04-16 17:45:01.686635695 +0200
@@ -3568,6 +3568,9 @@ import_export_decl (tree decl)
     }
 
   DECL_INTERFACE_KNOWN (decl) = 1;
+
+  if (DECL_CLONED_FUNCTION_P (decl))
+    maybe_optimize_cdtor (decl);
 }
 
 /* Return an expression that performs the destruction of DECL, which
--- gcc/cp/optimize.cc.jj       2024-04-16 17:18:37.374278327 +0200
+++ gcc/cp/optimize.cc  2024-04-22 14:52:59.310630914 +0200
@@ -723,3 +723,58 @@ maybe_clone_body (tree fn)
   /* We don't need to process the original function any further.  */
   return 1;
 }
+
+/* If maybe_clone_body is called while the cdtor is still tentative,
+   DECL_ONE_ONLY will be false and so will be can_alias_cdtor (fn).
+   In that case we wouldn't try to optimize using an alias and instead
+   would emit separate base and complete cdtor.  The following function
+   attempts to still optimize that case when we import_export_decl
+   is called first time on one of the clones.  */
+
+void
+maybe_optimize_cdtor (tree orig_decl)
+{
+  tree fns[3];
+  tree fn = DECL_CLONED_FUNCTION (orig_decl);
+  gcc_checking_assert (DECL_MAYBE_IN_CHARGE_CDTOR_P (fn));
+  if (DECL_INTERFACE_KNOWN (fn)
+      || !TREE_ASM_WRITTEN (fn)
+      || !DECL_ONE_ONLY (orig_decl)
+      || symtab->global_info_ready)
+    return;
+
+  populate_clone_array (fn, fns);
+
+  if (!fns[0] || !fns[1])
+    return;
+  for (int i = 2 - !fns[2]; i >= 0; --i)
+    if (fns[i] != orig_decl && DECL_INTERFACE_KNOWN (fns[i]))
+      return;
+  DECL_INTERFACE_KNOWN (fn) = 1;
+  comdat_linkage (fn);
+  if (!can_alias_cdtor (fn))
+    return;
+  /* For comdat base and complete cdtors put them into the same,
+     *[CD]5* comdat group instead of *[CD][12]*.  */
+  auto n0 = cgraph_node::get_create (fns[0]);
+  auto n1 = cgraph_node::get_create (fns[1]);
+  auto n2 = fns[2] ? cgraph_node::get_create (fns[1]) : NULL;
+  if (n0->lowered || n1->lowered || (n2 && n2->lowered))
+    return;
+  import_export_decl (fns[0]);
+  n1->definition = false;
+  if (!n0->create_same_body_alias (fns[1], fns[0]))
+    return;
+
+  tree comdat_group = cdtor_comdat_group (fns[1], fns[0]);
+  n1 = cgraph_node::get (fns[1]);
+  n0->set_comdat_group (comdat_group);
+  if (n1->same_comdat_group)
+    n1->remove_from_same_comdat_group ();
+  n1->add_to_same_comdat_group (n0);
+  if (fns[2])
+    n2->add_to_same_comdat_group (n0);
+  import_export_decl (fns[1]);
+  /* Remove the body now that it is an alias.  */
+  release_function_body (fns[1]);
+}
--- gcc/testsuite/g++.dg/lto/pr113208_0.C.jj    2024-04-16 17:45:01.687635682 
+0200
+++ gcc/testsuite/g++.dg/lto/pr113208_0.C       2024-04-16 17:45:01.687635682 
+0200
@@ -0,0 +1,13 @@
+// { dg-lto-do link }
+// { dg-lto-options { {-O1 -std=c++20 -flto}} }
+// { dg-extra-ld-options "-r -nostdlib -flinker-output=nolto-rel" }
+// { dg-require-linker-plugin "" }
+
+#define CONSTEXPR constexpr
+#include "pr113208.h"
+
+struct QualityValue;
+struct k : vector<QualityValue> {};
+
+void m(k);
+void n(k i) { m(i); }
--- gcc/testsuite/g++.dg/lto/pr113208_1.C.jj    2024-04-16 17:45:01.687635682 
+0200
+++ gcc/testsuite/g++.dg/lto/pr113208_1.C       2024-04-16 17:45:01.687635682 
+0200
@@ -0,0 +1,6 @@
+#define CONSTEXPR 
+#include "pr113208.h"
+
+struct QualityValue;
+vector<QualityValue> values1;
+vector<QualityValue> values{values1};
--- gcc/testsuite/g++.dg/lto/pr113208.h.jj      2024-04-16 17:45:01.687635682 
+0200
+++ gcc/testsuite/g++.dg/lto/pr113208.h 2024-04-16 17:45:01.687635682 +0200
@@ -0,0 +1,10 @@
+template <typename _Tp> struct _Vector_base {
+  int g() const;
+  _Vector_base(int, int);
+};
+template <typename _Tp>
+struct vector : _Vector_base<_Tp> {
+  CONSTEXPR vector(const vector &__x)
+      : _Vector_base<_Tp>(1, __x.g()) {}
+ vector() : _Vector_base<_Tp>(1, 2) {}
+};


        Jakub

Reply via email to