On Friday, 09.01.2026 at 10:08 +0100, Richard Biener wrote:
> On Fri, 9 Jan 2026, Martin Uecker wrote:
>
> > On Friday, 09.01.2026 at 08:21 +0100, Richard Biener wrote:
> > > On Thu, 8 Jan 2026, Martin Uecker wrote:
> > >
> > > >
> > > > Hi Richard,
> > > >
> > > > do you see an issue with the following approach, i.e. calling
> > > > record_component_aliases for all versions? (It should be rare
> > > > that there is more than one version.)
> > > >
> > > > Bootstrapped and regression tested on aarch64.
> > > > (and running on x86_64).
> > > >
> > > > Martin
> > > >
> > > >
> > > >
> > > > Given the following two pairs of types, the C FE assigns the same
> > > > TYPE_CANONICAL to both struct bar, because it treats pointers to
> > > > tagged types with the same tag as compatible (in this context).
> > > >
> > > > struct foo { int y; };
> > > > struct bar { struct foo *c; };
> > > >
> > > > struct foo { long y; };
> > > > struct bar { struct foo *c; };
> > >
> > > I assume this is only "valid" if split into two TUs? How does
> > > this then materialize?
> >
> > It could materialize when it goes through another TU
> > where struct foo is incomplete. Having incomplete structs
> > at API boundaries across TUs is common in C.
> >
> > These differently completed types can then be in the same TU.
> > This is why it is relevant for alias analysis even inside one TU.
> > I don't think it is very common, but possible.
>
> I can't seem to create a TU with both types that is not rejected by
> the frontend.
This can be done by defining the types in different scopes.
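A minimal sketch of what I mean (the function names are made up for
this illustration, and each function only touches its own types, so
nothing here is undefined): one completion of struct foo lives at file
scope, and a second, incompatible one with the same tags lives in
block scope, so both end up in the same TU without the front end
rejecting anything.

```c
/* File scope: the first completion of struct foo. */
struct foo { int y; };
struct bar { struct foo *c; };

/* Uses only the file-scope types. */
static int read_outer(void)
{
    struct foo f = { 7 };
    struct bar b = { &f };
    return b.c->y;
}

/* Block scope: a second, incompatible completion with the same tags. */
static long read_inner(void)
{
    struct foo { long y; };         /* hides the file-scope struct foo */
    struct bar { struct foo *c; };  /* points to the block-scope struct foo */
    struct foo f = { 42L };
    struct bar b = { &f };
    return b.c->y;
}
```

The test case in the patch uses the same trick, with the second
version of the types declared inside f2().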
>
> >
> > > LTO wouldn't merge the two distinct 'foo'
> > > types and also not the distinct 'bar' ones (possibly -Wodr would
> > > complain, but IIRC the C FE doesn't set up types in a way to do)?
> >
> > I assume the other way round? It would merge bar while globbing
> > the pointer to void?
>
> Ah yes, we "simplify" all pointers to be pointers to incomplete
> types, not exactly to void *. Honza might remember better.
> That makes accesses to the pointer data compatible but when the
> IL contains accesses to the incompatible 'y' those should remain.
If I am not missing something, it is merged to void and such
pointers are special-cased to alias with other void pointer
types (but not non-pointer types)
https://github.com/gcc-mirror/gcc/blob/master/gcc/alias.cc#L1261
> >
> > >
> > > Does the C standard require that the two 'bar' inter-operate
> > > wrt TBAA even though 'foo' are not compatible? Does it require
> > > even the 'foo' to inter-operate? IMO, if it does, stupidly so,
> > > then it has to make sure such types have alias-set zero.
> >
> > No, the C standard requires struct bar to interoperate with
> > another struct bar { struct foo *c; } where struct foo
> > is incomplete. The problem is that this implies - when
> > forming equivalence classes - that the above two incompatible types
> > end up in the same class because they both need to be compatible
> > with the one which points to the incomplete foo. This is
> > an inherent limitation of GCC's way of modelling aliasing
> > relationships via equivalence classes.
> >
> > My understanding is that this is the reason why LTO globs
> > pointer to void in this situation.
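As a toy model of the equivalence-class argument (this is not how
alias.cc is implemented; the enum and the helper names are invented
for the illustration): compatibility is not transitive, but merging
classes is. Once both complete versions of struct bar are merged with
the version that points to the incomplete foo, they necessarily land
in the same class.

```c
/* Three versions of struct bar, named after what foo looks like. */
enum { BAR_FOO_INT, BAR_FOO_INCOMPLETE, BAR_FOO_LONG, NTYPES };

static int parent[NTYPES];

/* Start with every type in its own class. */
static void classes_init(void)
{
    for (int i = 0; i < NTYPES; i++)
        parent[i] = i;
}

/* Return the representative of the class containing x. */
static int class_of(int x)
{
    while (parent[x] != x)
        x = parent[x];
    return x;
}

/* Merge the classes of a and b. */
static void classes_merge(int a, int b)
{
    parent[class_of(a)] = class_of(b);
}

/* Returns 1 if the two incompatible complete versions share a class
   after each has been merged with the incomplete version. */
static int incompatible_types_share_class(void)
{
    classes_init();
    classes_merge(BAR_FOO_INT, BAR_FOO_INCOMPLETE);
    classes_merge(BAR_FOO_LONG, BAR_FOO_INCOMPLETE);
    return class_of(BAR_FOO_INT) == class_of(BAR_FOO_LONG);
}
```

The two complete versions were never merged with each other directly;
they share a class only through the incomplete version.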
>
> I think so. But how does this become a problem in non-LTO given
> I fail to see how two such distinct completions can be visible
> in the same TU? Thus, why does the C frontend need to care?
>
> > >
> > > > get_alias_set records the components of aggregate types, but only
> > > > considers the components of the canonical version. We need to record
> > > > the components of all such variations which we now do incrementally
> > > > as we discover these types.
> > >
> > > The proposed patch looks like a very ugly hack. Incrementally
> > > adding to the alias set forest looks fragile at best, I'm not
> > > convinced this results in correct operation.
> >
> > In principle, it seems a clean approach
> > to me that should correctly model the relationships without
> > loss of information in the common case.
> >
> > But if there is a technical reason why adding subsets at different
> > times is not possible, then this is of course a problem. I could not
> > identify a concrete issue though.
>
> For one, visibility of such a type in completely unrelated code
> would impact optimization of code in another context - so it
> would be dependent on order of processing. That's highly
> undesirable.
Yes, I agree that this would be undesirable.
> Similarly this different optimization could
> be invalid when such IL is then inlined into a context where
> the additional type was visible.
I don't fully understand this. Even with inlining, any reordering
would have to take into account the relevant types used for the
accesses which are then already processed. This visibility of other
types should not matter and could only pessimize the decisions
if the equivalence class was enlarged.
>
> > record_alias_subset has a comment that it "should not" be
> > called more than once per pair, but record_component_aliases
> > already does this.
> >
> > An alternative could possibly be to activate the "void" case
> > in record_component_aliases also without LTO.
> >
> > Another alternative - but more work - is for the C FE to synthesize
> > a fresh TYPE_CANONICAL that changes these pointers to pointers
> > to incomplete types.
>
> You can look into what ipa-free-lang-data.cc:fld_incomplete_type_of
> does, it basically creates a struct T; (when it does not already
> exist) for a struct T { ... }; and uses that when referenced as
> pointed-to type. So the C frontend could make sure such an
> incomplete type variant exists and make the pointer to that type
> the canonical type for pointers?
Thanks, I will take a look at this.
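At the source level, my rough understanding of the idea corresponds to
the following sketch (names are hypothetical, and this says nothing
about how fld_incomplete_type_of works internally): code that only
copies struct bar objects needs nothing more than the incomplete
struct foo, so a canonical type built on a pointer to the incomplete
variant would cover every completion.

```c
struct foo;                     /* incomplete here, as at an API boundary */
struct bar { struct foo *c; };

/* Swap two struct bar objects and return the pointer now stored in *b.
   Nothing in this function depends on how struct foo is completed. */
static struct foo *swap_and_get(struct bar *b, struct bar *q)
{
    struct bar t = *b;
    *b = *q;
    *q = t;
    return b->c;
}

/* One possible completion, visible only after the code above. */
struct foo { int y; };

/* Returns 1 if the swap exchanged the two pointers as expected. */
static int swap_demo(void)
{
    struct foo bc = { 1 }, qc = { 2 };
    struct bar b = { &bc }, q = { &qc };
    return swap_and_get(&b, &q) == &qc && q.c == &bc;
}
```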
Martin
>
> Richard.
>
> > Martin
> >
> > >
> > > Richard.
> > >
> > > > PR c/122572
> > > >
> > > > gcc/ChangeLog:
> > > > * alias.cc (get_alias_set): Record components of all
> > > > versions with the same TYPE_CANONICAL.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > > * gcc.dg/struct-alias-2.c: New test.
> > > > ---
> > > > gcc/alias.cc | 32 +++++++-
> > > > gcc/testsuite/gcc.dg/struct-alias-2.c | 108 ++++++++++++++++++++++++++
> > > > 2 files changed, 138 insertions(+), 2 deletions(-)
> > > > create mode 100644 gcc/testsuite/gcc.dg/struct-alias-2.c
> > > >
> > > > diff --git a/gcc/alias.cc b/gcc/alias.cc
> > > > index bce026200a1..95f1f017989 100644
> > > > --- a/gcc/alias.cc
> > > > +++ b/gcc/alias.cc
> > > > @@ -923,6 +923,9 @@ get_alias_set (tree t)
> > > > && TYPE_TYPELESS_STORAGE (t))
> > > > return 0;
> > > >
> > > > + /* Remember the original type before replacing it with its
> > > > + canonical version. */
> > > > + tree torig = t;
> > > > +
> > > > /* Always use the canonical type as well. If this is a type that
> > > > requires structural comparisons to identify compatible types
> > > > use alias set zero. */
> > > > @@ -954,7 +957,21 @@ get_alias_set (tree t)
> > > > /* If this is a type with a known alias set, return it. */
> > > > gcc_checking_assert (t == TYPE_MAIN_VARIANT (t));
> > > > if (TYPE_ALIAS_SET_KNOWN_P (t))
> > > > - return TYPE_ALIAS_SET (t);
> > > > + {
> > > > + /* The C FE for C23 sometimes creates TYPE_CANONICAL for aggregate types
> > > > + that contain pointers to incompatible tagged types. In this case the
> > > > + types which we had seen before may not have had exactly the same
> > > > + components. We do need to record the components of the new type
> > > > + we haven't seen before. */
> > > > + if (!in_lto_p && AGGREGATE_TYPE_P (t) && !TYPE_ALIAS_SET_KNOWN_P (torig))
> > > > + {
> > > > + gcc_assert (torig != t);
> > > > + gcc_assert (t == TYPE_CANONICAL (torig));
> > > > + TYPE_ALIAS_SET (torig) = TYPE_ALIAS_SET (t);
> > > > + record_component_aliases (torig);
> > > > + }
> > > > + return TYPE_ALIAS_SET (t);
> > > > + }
> > > >
> > > > /* We don't want to set TYPE_ALIAS_SET for incomplete types. */
> > > > if (!COMPLETE_TYPE_P (t))
> > > > @@ -1110,7 +1127,7 @@ get_alias_set (tree t)
> > > > /* Assign the alias set to both p and t.
> > > > We cannot call get_alias_set (p) here as that would trigger
> > > > infinite recursion when p == t. In other cases it would just
> > > > - trigger unnecesary legwork of rebuilding the pointer again. */
> > > > + trigger unnecessary legwork of rebuilding the pointer again. */
> > > > gcc_checking_assert (p == TYPE_MAIN_VARIANT (p));
> > > > if (TYPE_ALIAS_SET_KNOWN_P (p))
> > > > set = TYPE_ALIAS_SET (p);
> > > > @@ -1147,6 +1164,17 @@ get_alias_set (tree t)
> > > > if (AGGREGATE_TYPE_P (t) || TREE_CODE (t) == COMPLEX_TYPE)
> > > > record_component_aliases (t);
> > > >
> > > > + /* The C FE for C23 sometimes creates TYPE_CANONICAL for aggregate types
> > > > + that contain pointers to incompatible tagged types. In this case
> > > > + TYPE_CANONICAL may not have exactly the same components as the original
> > > > + type. We do need to record the components of both versions. */
> > > > + if (!in_lto_p && torig != t && AGGREGATE_TYPE_P (t))
> > > > + {
> > > > + TYPE_ALIAS_SET (torig) = set;
> > > > + record_component_aliases (torig);
> > > > + }
> > > > +
> > > > +
> > > > /* We treat pointer types specially in alias_set_subset_of. */
> > > > if (POINTER_TYPE_P (t) && set)
> > > > {
> > > > diff --git a/gcc/testsuite/gcc.dg/struct-alias-2.c b/gcc/testsuite/gcc.dg/struct-alias-2.c
> > > > new file mode 100644
> > > > index 00000000000..114a1275594
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/gcc.dg/struct-alias-2.c
> > > > @@ -0,0 +1,108 @@
> > > > +/* { dg-do run } */
> > > > +/* { dg-options "-O2 -std=c23" } */
> > > > +
> > > > +
> > > > +/* Based on the submitted test case for PR123356 but using types with
> > > > + * tags and moving the second version into another scope. */
> > > > +
> > > > +struct foo { long x; };
> > > > +
> > > > +void f()
> > > > +{
> > > > + struct foo { };
> > > > + struct bar { struct foo *c; };
> > > > + union baz { struct foo *c; };
> > > > + struct arr { struct foo *c[1]; };
> > > > +}
> > > > +
> > > > +
> > > > +void f1()
> > > > +{
> > > > + struct foo { };
> > > > + struct bar { struct foo *c; };
> > > > + union baz { struct foo *c; };
> > > > + struct arr { struct foo *c[1]; };
> > > > +}
> > > > +
> > > > +struct bar { struct foo *c; };
> > > > +union baz { struct foo *c; };
> > > > +struct arr { struct foo *c[1]; };
> > > > +
> > > > +void f2()
> > > > +{
> > > > + struct foo { int y; };
> > > > + struct bar { struct foo *c; };
> > > > + union baz { struct foo *c; };
> > > > + struct arr { struct foo *c[1]; };
> > > > +}
> > > > +
> > > > +__attribute__((noinline))
> > > > +struct foo * g1(struct bar *B, struct bar *Q)
> > > > +{
> > > > + struct bar t = *B;
> > > > + *B = *Q;
> > > > + *Q = t;
> > > > + return B->c;
> > > > +}
> > > > +
> > > > +__attribute__((noinline))
> > > > +struct foo * g2(union baz *B, union baz *Q)
> > > > +{
> > > > + union baz t = *B;
> > > > + *B = *Q;
> > > > + *Q = t;
> > > > + return B->c;
> > > > +}
> > > > +
> > > > +__attribute__((noinline))
> > > > +struct foo { long x; } *
> > > > + g3(struct bar { struct foo { long x; } *c; } *B,
> > > > + struct bar { struct foo { long x; } *c; } *Q)
> > > > +{
> > > > + struct bar t = *B;
> > > > + *B = *Q;
> > > > + *Q = t;
> > > > + return B->c;
> > > > +}
> > > > +
> > > > +struct foo * g4(struct arr *B,
> > > > + struct arr *Q)
> > > > +{
> > > > + struct arr t = *B;
> > > > + *B = *Q;
> > > > + *Q = t;
> > > > + return B->c[0];
> > > > +}
> > > > +
> > > > +int main()
> > > > +{
> > > > + struct foo Bc = { };
> > > > + struct foo Qc = { };
> > > > +
> > > > + struct bar B = { &Bc };
> > > > + struct bar Q = { &Qc };
> > > > +
> > > > + if (g1(&B, &Q) != &Qc)
> > > > + __builtin_abort();
> > > > +
> > > > + union baz Bu = { &Bc };
> > > > + union baz Qu = { &Qc };
> > > > +
> > > > + if (g2(&Bu, &Qu) != &Qc)
> > > > + __builtin_abort();
> > > > +
> > > > + struct bar B2 = { &Bc };
> > > > + struct bar Q2 = { &Qc };
> > > > +
> > > > + if (g3(&B2, &Q2) != &Qc)
> > > > + __builtin_abort();
> > > > +
> > > > + struct arr Ba = { &Bc };
> > > > + struct arr Qa = { &Qc };
> > > > +
> > > > + if (g4(&Ba, &Qa) != &Qc)
> > > > + __builtin_abort();
> > > > +
> > > > + return 0;
> > > > +}
> > > > +
> > > >
> >
--
Univ.-Prof. Dr. rer. nat. Martin Uecker
Graz University of Technology
Institute of Biomedical Imaging