On 4/12/24 14:39, Patrick Palka wrote:
On Fri, 12 Apr 2024, Jason Merrill wrote:

On 4/12/24 13:48, Patrick Palka wrote:
On Fri, 12 Apr 2024, Jason Merrill wrote:

On 4/12/24 10:35, Patrick Palka wrote:
On Wed, 10 Apr 2024, Jason Merrill wrote:

On 4/10/24 14:48, Patrick Palka wrote:
On Tue, 9 Apr 2024, Jason Merrill wrote:

On 3/5/24 10:31, Patrick Palka wrote:
On Tue, 27 Feb 2024, Patrick Palka wrote:

Subject: [PATCH] c++/modules: local type merging [PR99426]

One known missing piece in the modules implementation is merging
of
a
streamed-in local type (class or enum) with the corresponding
in-TU
version of the local type.  This missing piece turns out to
cause a
hard-to-reduce use-after-free GC issue due to the entity_ary not
being
marked as a GC root (deliberately), and manifests as a
serialization
error on stream-in as in PR99426 (see comment #6 for a
reduction).
It's
also reproducible on trunk when running the xtreme-header tests
without
-fno-module-lazy.

This patch makes us merge such local types according to their
position
within the containing function's definition, analogous to how we
merge
FIELD_DECLs of a class according to their index in the
TYPE_FIELDS
list.

        PR c++/99426

gcc/cp/ChangeLog:

        * module.cc (merge_kind::MK_local_type): New enumerator.
        (merge_kind_name): Update.
        (trees_out::chained_decls): Move BLOCK-specific handling
        of DECL_LOCAL_DECL_P decls to ...
        (trees_out::core_vals) <case BLOCK>: ... here.  Stream
        BLOCK_VARS manually.
        (trees_in::core_vals) <case BLOCK>: Stream BLOCK_VARS
        manually.  Handle deduplicated local types..
        (trees_out::key_local_type): Define.
        (trees_in::key_local_type): Define.
        (trees_out::get_merge_kind) <case FUNCTION_DECL>: Return
        MK_local_type for a local type.
        (trees_out::key_mergeable) <case FUNCTION_DECL>: Use
        key_local_type.
        (trees_in::key_mergeable) <case FUNCTION_DECL>: Likewise.
        (trees_in::is_matching_decl): Be flexible with type mismatches
        for local entities.

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 80b63a70a62..d9e34e9a4b9 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -6714,7 +6720,37 @@ trees_in::core_vals (tree t)
          case BLOCK:
            t->block.locus = state->read_location (*this);
            t->block.end_locus = state->read_location (*this);
-      t->block.vars = chained_decls ();
+
+      for (tree *chain = &t->block.vars;;)
+       if (tree decl = tree_node ())
+         {
+           /* For a deduplicated local type or enumerator, chain the
+              duplicate decl instead of the canonical in-TU decl.
Seeing
+              a duplicate here means the containing function whose
body
+              we're streaming in is a duplicate too, so we'll end up
+              discarding this BLOCK (and the rest of the duplicate
function
+              body) anyway.  */
+           if (is_duplicate (decl))
+             decl = maybe_duplicate (decl);
+           else if (DECL_IMPLICIT_TYPEDEF_P (decl)
+                    && TYPE_TEMPLATE_INFO (TREE_TYPE (decl)))
+             {
+               tree tmpl = TYPE_TI_TEMPLATE (TREE_TYPE (decl));
+               if (DECL_TEMPLATE_RESULT (tmpl) == decl &&
is_duplicate
(tmpl))
+                 decl = DECL_TEMPLATE_RESULT (maybe_duplicate
(tmpl));
+             }

This seems like a lot of generally-applicable code for finding the
duplicate,
which other calls to maybe_duplicate/odr_duplicate don't use.  If
the
template
is a duplicate, why isn't its result?  If there's a good reason
for
that,
should this template handling go into maybe_duplicate?

Ah yeah, that makes sense.

Some context: IIUC modules treats the TEMPLATE_DECL instead of the
DECL_TEMPLATE_RESULT as the canonical decl, which in turn means
we'll
register_duplicate only the TEMPLATE_DECL.  But BLOCK_VARS never
contains
a TEMPLATE_DECL, always the DECL_TEMPLATE_RESULT (i.e. a TYPE_DECL),
hence the extra handling.

Given that it's relatively more difficult to get at the
TEMPLATE_DECL
from the DECL_TEMPLATE_RESULT rather than vice versa, maybe we
should
just register both as duplicates from register_duplicate?  That way
callers can just simply pass the DECL_TEMPLATE_RESULT to
maybe_duplicate
and it'll do the right thing.

Sounds good.

@@ -10337,6 +10373,83 @@ trees_in::fn_parms_fini (int tag, tree
fn,
tree
existing, bool is_defn)
          }
      }
      +/* Encode into KEY the position of the local type (class
or
enum)
+   declaration DECL within FN.  The position is encoded as the
+   index of the innermost BLOCK (numbered in BFS order) along
with
+   the index within its BLOCK_VARS list.  */

Since we already set DECL_DISCRIMINATOR for mangling, could we use
it+name
for
the key as well?

We could (and IIUc that'd be more robust to ODR violations), but
wouldn't it mean we'd have to do a linear walk over all BLOCK_VARs
of
all BLOCKS in order to find the one with the matching
name+discriminator?  That'd be slower than the current approach
which
lets us skip to the correct BLOCK and walk only its BLOCK_VARS.

Ah, good point.  How about block number + name instead of the index?

It seems DECL_DISCRIMINATOR is only set at instantiation time and so for
local types from a function template pattern the field is empty, which
means we can't use it as the key in general :/

I meant just block number and name, without DECL_DISCRIMINATOR.  Just
using
the name instead of an index in BLOCK_VARS.

Ah, I think that'd be enough for named local types, but what about
anonymous local types?  IIUC without DECL_DISCRIMINATOR we wouldn't be
able to reliably distinguisth between multiple anonymous local types
defined in the same block, since their identifiers aren't stable given
that they're based off of a global counter (and so sensitive to #include
order) :(

Good point.  But I'd still think to merge based on name if we have one; as you
said above, to be more robust to ODR violations.

Sounds good.


If the imported fn has a local class that the included header didn't, would we
get the same problem?

IIUC yes, all the intra-block indexes would be off by one in that case
and deduplicatation would fail or we'd deduplicate distinct types.

Here's an incremental diff for the updated patch.  The augmented
testcase triggered a latent qsort checking failure in depset_cmp
that was straightforwardly fixed:

OK.

Jason

Reply via email to