On Fri, 9 Jan 2026, Martin Uecker wrote:
> On Friday, 09.01.2026 at 08:21 +0100, Richard Biener wrote:
> > On Thu, 8 Jan 2026, Martin Uecker wrote:
> >
> > >
> > > Hi Richard,
> > >
> > > do you see an issue with the following approach? Calling
> > > record_component_aliases for all versions? (should be rare
> > > that there is more than one version.)
> > >
> > > Bootstrapped and regression tested on aarch64.
> > > (and running on x86_64).
> > >
> > > Martin
> > >
> > >
> > >
> > > Given the following two types, the C FE assigns the same
> > > TYPE_CANONICAL to both struct bar, because it treats pointers to
> > > tagged types with the same tag as compatible (in this context).
> > >
> > > struct foo { int y; };
> > > struct bar { struct foo *c; };
> > >
> > > struct foo { long y; };
> > > struct bar { struct foo *c; };
> >
> > I assume this is only "valid" if split into two TUs? How does
> > this then materialize?
>
> It could materialize when it goes through another TU
> where struct foo is incomplete. Having incomplete structs
> at API boundaries across TUs is common in C.
>
> These differently completed types can then be in the same TU.
> This is why it is relevant for alias analysis even inside one TU.
> I don't think it is very common, but possible.

I can't seem to create a TU with both types that is not rejected by
the frontend.
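
For reference, a minimal sketch of the "middle" TU Martin describes,
where struct foo is never completed and struct bar only carries a
pointer to it. The helper name swap_bars is made up for illustration
and is not taken from the PR or the patch.

```c
#include <assert.h>

/* struct foo stays incomplete in this TU; only pointers to it are used. */
struct foo;
struct bar { struct foo *c; };

/* Hypothetical helper: swap two struct bar objects and return the
   pointer now stored in *b.  It never dereferences a struct foo, so it
   works regardless of how other TUs complete that type.  */
struct foo *
swap_bars (struct bar *b, struct bar *q)
{
  assert (b && q);
  struct bar t = *b;
  *b = *q;
  *q = t;
  return b->c;
}
```

Other TUs that complete struct foo differently could each call
swap_bars with their own struct bar, which is the kind of API boundary
referred to above.
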
>
> > LTO wouldn't merge the two distinct 'foo'
> > types and also not the distinct 'bar' ones (possibly -Wodr would
> > complain, but IIRC the C FE doesn't set up types in a way to do)?
>
> I assume the other way round? It would merge bar while globbing
> the pointer to void?

Ah yes, we "simplify" all pointers to be pointers to incomplete
types, not exactly to void *. Honza might remember better.
That makes accesses to the pointer data compatible, but when the
IL contains accesses to the incompatible 'y', those should remain.

>
> >
> > Does the C standard require that the two 'bar' inter-operate
> > wrt TBAA even though 'foo' are not compatible? Does it require
> > even the 'foo' to inter-operate? IMO, if it does, stupidly so,
> > then it has to make sure such types have alias-set zero.
>
> No, the C standard requires struct bar to interoperate with
> another struct bar { struct foo *c; } where struct foo
> is incomplete. The problem is that this implies - when
> forming equivalence classes - that the above two incompatible types
> end up in the same class because they both need to be compatible
> with the one which points to the incomplete foo. This is
> an inherent limitation of GCC's way of modelling aliasing
> relationships via equivalence classes.
>
> My understanding is that this is the reason why LTO globs
> pointer to void in this situation.

I think so. But how does this become a problem in non-LTO, given
I fail to see how two such distinct completions can be visible
in the same TU? Thus, why does the C frontend need to care?
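
The equivalence-class limitation described above can be sketched with a
toy union-find. This is only a model of the argument, not GCC code; the
type names and helper functions are hypothetical. Compatibility itself
is not transitive, but merging classes makes it so.

```c
#include <assert.h>

/* Toy model, not GCC code: three versions of 'struct bar'.  */
enum { BAR_INCOMPLETE_FOO, BAR_INT_FOO, BAR_LONG_FOO, N_TYPES };

static int parent[N_TYPES];

/* Put every type into its own class.  */
void
init_classes (void)
{
  for (int i = 0; i < N_TYPES; i++)
    parent[i] = i;
}

/* Return the representative of T's class.  */
int
find_class (int t)
{
  assert (t >= 0 && t < N_TYPES);
  while (parent[t] != t)
    t = parent[t];
  return t;
}

/* Merge the classes of A and B.  */
void
merge_classes (int a, int b)
{
  parent[find_class (a)] = find_class (b);
}

/* Record only the two compatibilities that are required: each complete
   version must interoperate with the version whose foo is incomplete.  */
void
record_required_compatibility (void)
{
  init_classes ();
  merge_classes (BAR_INT_FOO, BAR_INCOMPLETE_FOO);
  merge_classes (BAR_LONG_FOO, BAR_INCOMPLETE_FOO);
}
```

After record_required_compatibility, BAR_INT_FOO and BAR_LONG_FOO share
a class even though no rule ever related them directly.
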
> >
> > > get_alias_set records the components of aggregate types, but only
> > > considers the components of the canonical version. We need to record
> > > the components of all such variations which we now do incrementally
> > > as we discover these types.
> >
> > The proposed patch looks like a very ugly hack. Incrementally
> > adding to the alias set forest looks fragile at best, I'm not
> > convinced this results in correct operation.
>
> In principle, it seems a clean approach
> to me that should correctly model the relationships without
> loss of information in the common case.
>
> But if there is a technical reason why adding subsets at different
> times is not possible, then this is of course a problem. I could not
> identify a concrete issue though.

For one, visibility of such a type in completely unrelated code
would impact optimization of code in another context - so it
would be dependent on order of processing. That's highly
undesirable. Similarly this different optimization could
be invalid when such IL is then inlined into a context where
the additional type was visible.
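
A toy illustration of the order dependence, under the assumption that
alias queries consult a subset table that can still grow later. The set
names and functions are hypothetical and do not mirror alias.cc.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC code: one aggregate set and the member sets of two
   versions of that aggregate.  */
enum { SET_AGGREGATE, SET_MEMBER_V1, SET_MEMBER_V2, N_SETS };

static bool is_subset[N_SETS][N_SETS];

/* Forget all recorded subset relations.  */
void
reset_sets (void)
{
  for (int i = 0; i < N_SETS; i++)
    for (int j = 0; j < N_SETS; j++)
      is_subset[i][j] = false;
}

/* Record that SUBSET is a component set of SUPERSET.  */
void
record_subset (int superset, int subset)
{
  assert (superset >= 0 && superset < N_SETS);
  assert (subset >= 0 && subset < N_SETS);
  is_subset[superset][subset] = true;
}

/* Two sets conflict if they are equal or one contains the other.  */
bool
sets_conflict (int a, int b)
{
  return a == b || is_subset[a][b] || is_subset[b][a];
}

/* Answer "does the aggregate conflict with the second version's member?"
   as a pass would see it, depending on whether the second version has
   already been recorded when the pass runs.  */
bool
query_conflict (bool second_version_already_seen)
{
  reset_sets ();
  record_subset (SET_AGGREGATE, SET_MEMBER_V1);
  if (second_version_already_seen)
    record_subset (SET_AGGREGATE, SET_MEMBER_V2);
  return sets_conflict (SET_AGGREGATE, SET_MEMBER_V2);
}
```

The same query yields "no conflict" or "conflict" depending only on
what was processed earlier, which is the fragility in question.
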
> record_alias_subset has a comment that it "should not" be
> called more than once per pair, but record_component_aliases
> already does this.
>
> An alternative could possibly be to activate the "void" case
> in record_component_aliases also without LTO.
>
> Another alternative - but more work - is for the C FE to synthesize
> a fresh TYPE_CANONICAL that changes these pointers to pointers
> to incomplete types.

You can look into what ipa-free-lang-data.cc:fld_incomplete_type_of
does, it basically creates a struct T; (when it does not already
exist) for a struct T { ... }; and uses that when referenced as
pointed-to type. So the C frontend could make sure such an
incomplete type variant exists and make the pointer to that type
the canonical type for pointers?
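
A rough sketch of the "find or create the incomplete variant" idea,
using a toy type node rather than GCC's tree. Keying the lookup on the
tag is an assumption of this sketch, chosen so that both completions of
struct foo map to the same node; it is not a claim about how
fld_incomplete_type_of or the C frontend is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a tagged type node; not GCC's 'tree'.  */
struct toy_type
{
  const char *tag;
  int complete;		/* Nonzero for struct T { ... }.  */
};

#define MAX_TAGS 8
static struct toy_type incomplete_variants[MAX_TAGS];
static size_t n_incomplete;

/* Return the single incomplete 'struct TAG;' node for T's tag, creating
   it on first use.  Sharing by tag is an assumption of this sketch.  */
const struct toy_type *
toy_incomplete_type_of (const struct toy_type *t)
{
  assert (t != NULL && t->tag != NULL);
  for (size_t i = 0; i < n_incomplete; i++)
    if (strcmp (incomplete_variants[i].tag, t->tag) == 0)
      return &incomplete_variants[i];
  assert (n_incomplete < MAX_TAGS);
  incomplete_variants[n_incomplete].tag = t->tag;
  incomplete_variants[n_incomplete].complete = 0;
  return &incomplete_variants[n_incomplete++];
}
```

With this, a pointer to either completion of struct foo would use the
same pointed-to node as its canonical target.
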
Richard.

> Martin
>
> >
> > Richard.
> >
> > > PR c/122572
> > >
> > > gcc/ChangeLog:
> > > * alias.cc (get_alias_set): Record components of all
> > > versions with the same TYPE_CANONICAL.
> > >
> > > gcc/testsuite/ChangeLog:
> > > * gcc.dg/struct-alias-2.c: New test.
> > > ---
> > > gcc/alias.cc | 32 +++++++-
> > > gcc/testsuite/gcc.dg/struct-alias-2.c | 108 ++++++++++++++++++++++++++
> > > 2 files changed, 138 insertions(+), 2 deletions(-)
> > > create mode 100644 gcc/testsuite/gcc.dg/struct-alias-2.c
> > >
> > > diff --git a/gcc/alias.cc b/gcc/alias.cc
> > > index bce026200a1..95f1f017989 100644
> > > --- a/gcc/alias.cc
> > > +++ b/gcc/alias.cc
> > > @@ -923,6 +923,9 @@ get_alias_set (tree t)
> > > && TYPE_TYPELESS_STORAGE (t))
> > > return 0;
> > >
> > > + /* Remember the original type before replacing it with its canonical
> > > + version. */
> > > + tree torig = t;
> > > +
> > > /* Always use the canonical type as well. If this is a type that
> > > requires structural comparisons to identify compatible types
> > > use alias set zero. */
> > > @@ -954,7 +957,21 @@ get_alias_set (tree t)
> > > /* If this is a type with a known alias set, return it. */
> > > gcc_checking_assert (t == TYPE_MAIN_VARIANT (t));
> > > if (TYPE_ALIAS_SET_KNOWN_P (t))
> > > - return TYPE_ALIAS_SET (t);
> > > + {
> > > + /* The C FE for C23 sometimes creates TYPE_CANONICAL for aggregate types
> > > + that contain pointers to incompatible tagged types. In this case the
> > > + types which we had seen before may not have had exactly the same
> > > + components. We do need to record the components of the new type
> > > + we haven't seen before. */
> > > + if (!in_lto_p && AGGREGATE_TYPE_P (t) && !TYPE_ALIAS_SET_KNOWN_P (torig))
> > > + {
> > > + gcc_assert (torig != t);
> > > + gcc_assert (t == TYPE_CANONICAL (torig));
> > > + TYPE_ALIAS_SET (torig) = TYPE_ALIAS_SET (t);
> > > + record_component_aliases (torig);
> > > + }
> > > + return TYPE_ALIAS_SET (t);
> > > + }
> > >
> > > /* We don't want to set TYPE_ALIAS_SET for incomplete types. */
> > > if (!COMPLETE_TYPE_P (t))
> > > @@ -1110,7 +1127,7 @@ get_alias_set (tree t)
> > > /* Assign the alias set to both p and t.
> > > We cannot call get_alias_set (p) here as that would trigger
> > > infinite recursion when p == t. In other cases it would just
> > > - trigger unnecesary legwork of rebuilding the pointer again. */
> > > + trigger unnecessary legwork of rebuilding the pointer again. */
> > > gcc_checking_assert (p == TYPE_MAIN_VARIANT (p));
> > > if (TYPE_ALIAS_SET_KNOWN_P (p))
> > > set = TYPE_ALIAS_SET (p);
> > > @@ -1147,6 +1164,17 @@ get_alias_set (tree t)
> > > if (AGGREGATE_TYPE_P (t) || TREE_CODE (t) == COMPLEX_TYPE)
> > > record_component_aliases (t);
> > >
> > > + /* The C FE for C23 sometimes creates TYPE_CANONICAL for aggregate types
> > > + that contain pointers to incompatible tagged types. In this case
> > > + TYPE_CANONICAL may not have exactly the same components as the original
> > > + type. We do need to record the components of both versions. */
> > > + if (!in_lto_p && torig != t && AGGREGATE_TYPE_P (t))
> > > + {
> > > + TYPE_ALIAS_SET (torig) = set;
> > > + record_component_aliases (torig);
> > > + }
> > > +
> > > +
> > > /* We treat pointer types specially in alias_set_subset_of. */
> > > if (POINTER_TYPE_P (t) && set)
> > > {
> > > diff --git a/gcc/testsuite/gcc.dg/struct-alias-2.c b/gcc/testsuite/gcc.dg/struct-alias-2.c
> > > new file mode 100644
> > > index 00000000000..114a1275594
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/struct-alias-2.c
> > > @@ -0,0 +1,108 @@
> > > +/* { dg-do run } */
> > > +/* { dg-options "-O2 -std=c23" } */
> > > +
> > > +
> > > +/* Based on the submitted test case for PR123356 but using types with
> > > + * tags and moving the second version into another scope. */
> > > +
> > > +struct foo { long x; };
> > > +
> > > +void f()
> > > +{
> > > + struct foo { };
> > > + struct bar { struct foo *c; };
> > > + union baz { struct foo *c; };
> > > + struct arr { struct foo *c[1]; };
> > > +}
> > > +
> > > +
> > > +void f1()
> > > +{
> > > + struct foo { };
> > > + struct bar { struct foo *c; };
> > > + union baz { struct foo *c; };
> > > + struct arr { struct foo *c[1]; };
> > > +}
> > > +
> > > +struct bar { struct foo *c; };
> > > +union baz { struct foo *c; };
> > > +struct arr { struct foo *c[1]; };
> > > +
> > > +void f2()
> > > +{
> > > + struct foo { int y; };
> > > + struct bar { struct foo *c; };
> > > + union baz { struct foo *c; };
> > > + struct arr { struct foo *c[1]; };
> > > +}
> > > +
> > > +__attribute__((noinline))
> > > +struct foo * g1(struct bar *B, struct bar *Q)
> > > +{
> > > + struct bar t = *B;
> > > + *B = *Q;
> > > + *Q = t;
> > > + return B->c;
> > > +}
> > > +
> > > +__attribute__((noinline))
> > > +struct foo * g2(union baz *B, union baz *Q)
> > > +{
> > > + union baz t = *B;
> > > + *B = *Q;
> > > + *Q = t;
> > > + return B->c;
> > > +}
> > > +
> > > +__attribute__((noinline))
> > > +struct foo { long x; } *
> > > + g3(struct bar { struct foo { long x; } *c; } *B,
> > > + struct bar { struct foo { long x; } *c; } *Q)
> > > +{
> > > + struct bar t = *B;
> > > + *B = *Q;
> > > + *Q = t;
> > > + return B->c;
> > > +}
> > > +
> > > +struct foo * g4(struct arr *B,
> > > + struct arr *Q)
> > > +{
> > > + struct arr t = *B;
> > > + *B = *Q;
> > > + *Q = t;
> > > + return B->c[0];
> > > +}
> > > +
> > > +int main()
> > > +{
> > > + struct foo Bc = { };
> > > + struct foo Qc = { };
> > > +
> > > + struct bar B = { &Bc };
> > > + struct bar Q = { &Qc };
> > > +
> > > + if (g1(&B, &Q) != &Qc)
> > > + __builtin_abort();
> > > +
> > > + union baz Bu = { &Bc };
> > > + union baz Qu = { &Qc };
> > > +
> > > + if (g2(&Bu, &Qu) != &Qc)
> > > + __builtin_abort();
> > > +
> > > + struct bar B2 = { &Bc };
> > > + struct bar Q2 = { &Qc };
> > > +
> > > + if (g3(&B2, &Q2) != &Qc)
> > > + __builtin_abort();
> > > +
> > > + struct arr Ba = { &Bc };
> > > + struct arr Qa = { &Qc };
> > > +
> > > + if (g4(&Ba, &Qa) != &Qc)
> > > + __builtin_abort();
> > > +
> > > + return 0;
> > > +}
> > > +
> > >
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)