On Friday, 09.01.2026 at 08:21 +0100, Richard Biener wrote:
> On Thu, 8 Jan 2026, Martin Uecker wrote:
>
> >
> > Hi Richard,
> >
> > do you see an issue with the following approach? Calling
> > record_component_aliases for all versions? (should be rare
> > that there is more than one version.)
> >
> > Bootstrapped and regression tested on aarch64.
> > (and running on x86_64).
> >
> > Martin
> >
> >
> >
> > Given the following two types, the C FE assigns the same
> > TYPE_CANONICAL to both struct bar, because it treats pointer to
> > tagged types with the same tag as compatible (in this context).
> >
> > struct foo { int y; };
> > struct bar { struct foo *c; };
> >
> > struct foo { long y; };
> > struct bar { struct foo *c; };
>
> I assume this is only "valid" if split into two TUs? How does
> this then materialize?

It could materialize when the type goes through another TU
where struct foo is incomplete. Having incomplete structs
at API boundaries across TUs is common in C.

These differently completed types can then end up in the same TU,
which is why this is relevant for alias analysis even inside one TU.
I don't think it is very common, but it is possible.

> LTO wouldn't merge the two distinct 'foo'
> types and also not the distinct 'bar' ones (possibly -Wodr would
> complain, but IIRC the C FE doesn't set up types in a way to do)?

I assume the other way round? It would merge bar while globbing
the pointer to void?

>
> Does the C standard require that the two 'bar' inter-operate
> wrt TBAA even though 'foo' are not compatible? Does it require
> even the 'foo' to inter-operate? IMO, if it does, stupidly so,
> then it has to make sure such types have alias-set zero.

No, the C standard requires struct bar to interoperate with
another struct bar { struct foo *c; } where struct foo
is incomplete. The problem is that this implies, when
forming equivalence classes, that the above two incompatible types
end up in the same class because they both need to be compatible
with the one which points to the incomplete foo. This is
an inherent limitation of GCC's way of modelling aliasing
relationships via equivalence classes.

My understanding is that this is the reason why LTO globs
pointers to void in this situation.
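
The equivalence-class problem described above can be sketched with a
toy model. This is not GCC code: the three variant names, compatible_p,
find_class and build_classes are all hypothetical and only encode the
pairwise rule stated above (each completed variant is compatible with
the incomplete one, but not with the other completed one).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC code: three hypothetical variants of struct bar.  */
enum bar_variant
{
  BAR_INT,        /* struct bar whose struct foo is { int y; }  */
  BAR_LONG,       /* struct bar whose struct foo is { long y; }  */
  BAR_INCOMPLETE, /* struct bar whose struct foo is incomplete  */
  N_VARIANTS
};

/* Pairwise compatibility as described in the text: every variant is
   compatible with itself and with the incomplete one.  */
static bool
compatible_p (int a, int b)
{
  return a == b || a == BAR_INCOMPLETE || b == BAR_INCOMPLETE;
}

/* parent[v] links a variant to the representative of its class.  */
static int parent[N_VARIANTS];

static int
find_class (int v)
{
  while (parent[v] != v)
    v = parent[v];
  return v;
}

/* Merge the classes of every pair of compatible variants.  */
static void
build_classes (void)
{
  for (int i = 0; i < N_VARIANTS; i++)
    parent[i] = i;
  for (int a = 0; a < N_VARIANTS; a++)
    for (int b = a + 1; b < N_VARIANTS; b++)
      if (compatible_p (a, b))
        parent[find_class (a)] = find_class (b);
}
```

After build_classes (), BAR_INT and BAR_LONG share one class although
compatible_p (BAR_INT, BAR_LONG) is false, because both were merged
with BAR_INCOMPLETE.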
>
> > get_alias_set records the components of aggregate types, but only
> > considers the components of the canonical version. We need to record
> > the components of all such variations which we now do incrementally
> > as we discover these types.
>
> The proposed patch looks like a very ugly hack. Incrementally
> adding to the alias set forest looks fragile at best, I'm not
> convinced this results in correct operation.

In principle, it seems to me a clean approach
that should correctly model the relationships without
loss of information in the common case.

But if there is a technical reason why adding subsets at different
times is not possible, then this is of course a problem. I could not
identify a concrete issue though.

record_alias_subset has a comment that it "should not" be
called more than once per pair, but record_component_aliases
already does this.
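
To make "adding subsets at different times" concrete, here is a toy
sketch. It is not the data structure used in alias.cc: the set
numbers, toy_record_subset and toy_subset_p are hypothetical, and the
toy deliberately resolves subsets at query time. Whether the real
implementation behaves the same way when subsets arrive late is the
open question and is not settled by this sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SETS 32

/* Hypothetical alias set numbers, for illustration only.  */
enum
{
  SET_OUTER = 1,    /* an aggregate containing struct bar  */
  SET_BAR = 2,      /* the shared set of all struct bar versions  */
  SET_PTR_INT = 3,  /* component of the first version seen  */
  SET_PTR_LONG = 4  /* component of a version discovered later  */
};

/* Toy model: children[s] is a bitmask of the direct subsets of s.  */
static unsigned int children[MAX_SETS];

/* Record SUB as a direct subset of SUPER.  Repeating the call for the
   same pair leaves the mask unchanged.  */
static void
toy_record_subset (int super, int sub)
{
  children[super] |= 1u << sub;
}

/* Return true if SUB is reachable from SUPER through recorded subsets.
   The walk happens at query time, so late additions are visible.  */
static bool
toy_subset_p (int sub, int super)
{
  unsigned int seen = 0;
  unsigned int todo = 1u << super;
  while (todo != 0)
    {
      int s = 0;
      while ((todo & (1u << s)) == 0)
        s++;
      todo &= ~(1u << s);
      if (s == sub)
        return true;
      seen |= 1u << s;
      todo |= children[s] & ~seen;
    }
  return false;
}
```

In this toy, recording (SET_BAR, SET_PTR_INT) and (SET_OUTER, SET_BAR)
first and (SET_BAR, SET_PTR_LONG) afterwards still makes SET_PTR_LONG
a subset of both SET_BAR and SET_OUTER, and recording a pair twice has
no further effect.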

An alternative could possibly be to activate the "void" case
in record_component_aliases also without LTO.

Another alternative, though more work, would be for the C FE to
synthesize a fresh TYPE_CANONICAL that changes these pointers to
pointers to incomplete types.

Martin
>
> Richard.
>
> > PR c/122572
> >
> > gcc/ChangeLog:
> > * gcc/alias.cc (get_alias_set): Record components of all
> > versions with the same TYPE_CANONICAL.
> >
> > gcc/testsuite/ChangeLog:
> > * gcc.dg/struct-alias-2.c: New test.
> > ---
> > gcc/alias.cc | 32 +++++++-
> > gcc/testsuite/gcc.dg/struct-alias-2.c | 108 ++++++++++++++++++++++++++
> > 2 files changed, 138 insertions(+), 2 deletions(-)
> > create mode 100644 gcc/testsuite/gcc.dg/struct-alias-2.c
> >
> > diff --git a/gcc/alias.cc b/gcc/alias.cc
> > index bce026200a1..95f1f017989 100644
> > --- a/gcc/alias.cc
> > +++ b/gcc/alias.cc
> > @@ -923,6 +923,9 @@ get_alias_set (tree t)
> > && TYPE_TYPELESS_STORAGE (t))
> > return 0;
> >
> > + /* Remember the original type before replace with its canonical version. */
> > + tree torig = t;
> > +
> > /* Always use the canonical type as well. If this is a type that
> > requires structural comparisons to identify compatible types
> > use alias set zero. */
> > @@ -954,7 +957,21 @@ get_alias_set (tree t)
> > /* If this is a type with a known alias set, return it. */
> > gcc_checking_assert (t == TYPE_MAIN_VARIANT (t));
> > if (TYPE_ALIAS_SET_KNOWN_P (t))
> > - return TYPE_ALIAS_SET (t);
> > + {
> > + /* The C FE for C23 sometimes creates TYPE_CANONICAL for aggregate types
> > + that contain pointers to incompatible tagged types. In this case the
> > + types which we had seen before may not have had exactly the same
> > + components. We do need to record the components of the new type
> > + we haven't seen before. */
> > + if (!in_lto_p && AGGREGATE_TYPE_P(t) && !TYPE_ALIAS_SET_KNOWN_P (torig))
> > + {
> > + gcc_assert (torig != t);
> > + gcc_assert (t == TYPE_CANONICAL (torig));
> > + TYPE_ALIAS_SET (torig) = TYPE_ALIAS_SET (t);
> > + record_component_aliases (torig);
> > + }
> > + return TYPE_ALIAS_SET (t);
> > + }
> >
> > /* We don't want to set TYPE_ALIAS_SET for incomplete types. */
> > if (!COMPLETE_TYPE_P (t))
> > @@ -1110,7 +1127,7 @@ get_alias_set (tree t)
> > /* Assign the alias set to both p and t.
> > We cannot call get_alias_set (p) here as that would trigger
> > infinite recursion when p == t. In other cases it would just
> > - trigger unnecesary legwork of rebuilding the pointer again. */
> > + trigger unnecessary legwork of rebuilding the pointer again. */
> > gcc_checking_assert (p == TYPE_MAIN_VARIANT (p));
> > if (TYPE_ALIAS_SET_KNOWN_P (p))
> > set = TYPE_ALIAS_SET (p);
> > @@ -1147,6 +1164,17 @@ get_alias_set (tree t)
> > if (AGGREGATE_TYPE_P (t) || TREE_CODE (t) == COMPLEX_TYPE)
> > record_component_aliases (t);
> >
> > + /* The C FE for C23 sometimes creates TYPE_CANONICAL for aggregate types
> > + that contain pointers to incompatible tagged types. In this case
> > + TYPE_CANONICAL may not have exactly the same components as the original
> > + type. We do need to record the components of both versions. */
> > + if (!in_lto_p && torig != t && AGGREGATE_TYPE_P(t))
> > + {
> > + TYPE_ALIAS_SET (torig) = set;
> > + record_component_aliases (torig);
> > + }
> > +
> > +
> > /* We treat pointer types specially in alias_set_subset_of. */
> > if (POINTER_TYPE_P (t) && set)
> > {
> > diff --git a/gcc/testsuite/gcc.dg/struct-alias-2.c b/gcc/testsuite/gcc.dg/struct-alias-2.c
> > new file mode 100644
> > index 00000000000..114a1275594
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/struct-alias-2.c
> > @@ -0,0 +1,108 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-O2 -std=c23" } */
> > +
> > +
> > +/* Based on the submitted test case for PR123356 but using types with
> > + * tags and moving the second version into another scope. */
> > +
> > +struct foo { long x; };
> > +
> > +void f()
> > +{
> > + struct foo { };
> > + struct bar { struct foo *c; };
> > + union baz { struct foo *c; };
> > + struct arr { struct foo *c[1]; };
> > +}
> > +
> > +
> > +void f1()
> > +{
> > + struct foo { };
> > + struct bar { struct foo *c; };
> > + union baz { struct foo *c; };
> > + struct arr { struct foo *c[1]; };
> > +}
> > +
> > +struct bar { struct foo *c; };
> > +union baz { struct foo *c; };
> > +struct arr { struct foo *c[1]; };
> > +
> > +void f2()
> > +{
> > + struct foo { int y; };
> > + struct bar { struct foo *c; };
> > + union baz { struct foo *c; };
> > + struct arr { struct foo *c[1]; };
> > +}
> > +
> > +__attribute__((noinline))
> > +struct foo * g1(struct bar *B, struct bar *Q)
> > +{
> > + struct bar t = *B;
> > + *B = *Q;
> > + *Q = t;
> > + return B->c;
> > +}
> > +
> > +__attribute__((noinline))
> > +struct foo * g2(union baz *B, union baz *Q)
> > +{
> > + union baz t = *B;
> > + *B = *Q;
> > + *Q = t;
> > + return B->c;
> > +}
> > +
> > +__attribute__((noinline))
> > +struct foo { long x; } *
> > + g3(struct bar { struct foo { long x; } *c; } *B,
> > + struct bar { struct foo { long x; } *c; } *Q)
> > +{
> > + struct bar t = *B;
> > + *B = *Q;
> > + *Q = t;
> > + return B->c;
> > +}
> > +
> > +struct foo * g4(struct arr *B,
> > + struct arr *Q)
> > +{
> > + struct arr t = *B;
> > + *B = *Q;
> > + *Q = t;
> > + return B->c[0];
> > +}
> > +
> > +int main()
> > +{
> > + struct foo Bc = { };
> > + struct foo Qc = { };
> > +
> > + struct bar B = { &Bc };
> > + struct bar Q = { &Qc };
> > +
> > + if (g1(&B, &Q) != &Qc)
> > + __builtin_abort();
> > +
> > + union baz Bu = { &Bc };
> > + union baz Qu = { &Qc };
> > +
> > + if (g2(&Bu, &Qu) != &Qc)
> > + __builtin_abort();
> > +
> > + struct bar B2 = { &Bc };
> > + struct bar Q2 = { &Qc };
> > +
> > + if (g3(&B2, &Q2) != &Qc)
> > + __builtin_abort();
> > +
> > + struct arr Ba = { &Bc };
> > + struct arr Qa = { &Qc };
> > +
> > + if (g4(&Ba, &Qa) != &Qc)
> > + __builtin_abort();
> > +
> > + return 0;
> > +}
> > +
> >