On Fri, 9 Jan 2026, Martin Uecker wrote:

> On Friday, 09.01.2026 at 08:21 +0100, Richard Biener wrote:
> > On Thu, 8 Jan 2026, Martin Uecker wrote:
> > 
> > > 
> > > Hi Richard,
> > > 
> > > do you see an issue with the following approach?  Calling
> > > record_component_aliases for all versions? (should be rare
> > > that there is more than one version.)
> > > 
> > > Bootstrapped and regression tested on aarch64.
> > > (and running on x86_64).
> > > 
> > > Martin
> > > 
> > > 
> > > 
> > > Given the following two types, the C FE assigns the same
> > > TYPE_CANONICAL to both struct bar, because it treats pointers to
> > > tagged types with the same tag as compatible (in this context).
> > > 
> > > struct foo { int y; };
> > > struct bar { struct foo *c; };
> > > 
> > > struct foo { long y; };
> > > struct bar { struct foo *c; };
> > 
> > I assume this is only "valid" if split into two TUs?  How does
> > this then materialize?  
> 
> It could materialize when it goes through another TU
> where struct foo is incomplete.   Having incomplete structs
> at API boundaries across TUs is common in C.
> 
> These differently completed types can then be in the same TU. 
> This is why it is relevant for alias analysis even inside one TU.
> I don't think it is very common, but possible.

I can't seem to create a TU with both types that is not rejected by
the frontend.
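
For reference, the test case further down gets both completions into
one TU by redeclaring the tags in block scope.  A minimal sketch of that
shape, not taken verbatim from the patch (read_outer and read_inner are
hypothetical names):

```c
/* File scope: struct bar points to a struct foo with a long member.  */
struct foo { long y; };
struct bar { struct foo *c; };

long read_outer(struct bar *b)
{
  return b->c->y;
}

/* Block scope: both tags are redeclared, so this struct bar points to
   a different, incompatible struct foo.  */
int read_inner(void)
{
  struct foo { int y; };
  struct bar { struct foo *c; };
  struct foo f = { 42 };
  struct bar b = { &f };
  return b.c->y;
}
```

Both struct bar variants are then visible to alias analysis in the same
TU, even though each is only usable within its own scope.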

> 
> > LTO wouldn't merge the two distinct 'foo'
> > types and also not the distinct 'bar' ones (possibly -Wodr would
> > complain, but IIRC the C FE doesn't set up types in a way that allows it)?
> 
> I assume the other way round?  It would merge bar while globbing
> the pointer to void?

Ah yes, we "simplify" all pointers to be pointers to incomplete
types, not exactly to void *.  Honza might remember better.
That makes accesses to the pointer data compatible but when the
IL contains accesses to the incompatible 'y' those should remain.

> 
> > 
> > Does the C standard require that the two 'bar' inter-operate
> > wrt TBAA even though 'foo' are not compatible?  Does it require
> > even the 'foo' to inter-operate?  IMO, if it does, stupidly so,
> > then it has to make sure such types have alias-set zero.
> 
> No, the C standard requires struct bar to interoperate with
> another struct bar { struct foo *c; }  where struct foo
> is incomplete.  The problem is that this implies - when
> forming equivalence classes - that the above two incompatible types
> end up in the same class because they both need to be compatible
> with the one which points to the incomplete foo.  This is
> an inherent limitation of GCC's way of modelling aliasing
> relationship via equivalence classes.
>
> My understanding is that this is the reason why LTO globs
> pointer to void in this situation. 

I think so.  But how does this become a problem in non-LTO given
I fail to see how two such distinct completions can be visible
in the same TU?  Thus, why does the C frontend need to care?
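
The equivalence-class effect described in the quoted text can be
sketched with a toy union-find.  This is only an illustration under
assumed names (the enum values and helper functions are hypothetical);
it does not reflect how TYPE_CANONICAL is actually computed:

```c
/* Three variants of struct bar, distinguished by what struct foo is.  */
enum { BAR_INCOMPLETE_FOO, BAR_INT_FOO, BAR_LONG_FOO, NTYPES };

static int parent[NTYPES];

/* Return the representative of the class that contains t.  */
static int find_class(int t)
{
  while (parent[t] != t)
    t = parent[t];
  return t;
}

/* Put a and b into the same class.  */
static void merge_classes(int a, int b)
{
  parent[find_class(a)] = find_class(b);
}

/* Each completed variant must be compatible with the incomplete one;
   merging on that basis also joins the two incompatible variants.  */
int incompatible_bars_share_class(void)
{
  for (int i = 0; i < NTYPES; i++)
    parent[i] = i;
  merge_classes(BAR_INT_FOO, BAR_INCOMPLETE_FOO);
  merge_classes(BAR_LONG_FOO, BAR_INCOMPLETE_FOO);
  return find_class(BAR_INT_FOO) == find_class(BAR_LONG_FOO);
}
```

Compatibility is not transitive here, but class membership is, which is
why the two completed variants end up together.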
 
> > 
> > > get_alias_set records the components of aggregate types, but only
> > > considers the components of the canonical version.  We need to record
> > > the components of all such variations which we now do incrementally
> > > as we discover these types.
> > 
> > The proposed patch looks like a very ugly hack.  Incrementally
> > adding to the alias set forest looks fragile at best, I'm not
> > convinced this results in correct operation.
> 
> In principle, it seems to me a clean approach that should correctly
> model the relationships without loss of information in the common case.
> 
> But if there is a technical reason why adding subsets at different
> times is not possible, then this is of course a problem.  I could not
> identify a concrete issue though.

For one, visibility of such a type in completely unrelated code
would impact optimization of code in another context - so it
would be dependent on order of processing.  That's highly
undesirable.  Similarly this different optimization could
be invalid when such IL is then inlined into a context where
the additional type was visible.
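
A toy model of the ordering concern, assuming a much simplified subset
table (the set names and helpers below are hypothetical and are not the
data structures in alias.cc):

```c
#include <stdbool.h>

enum { SET_BAR, SET_COMPONENT_V1, SET_COMPONENT_V2, NSETS };

/* subset[a][b] is true when b has been recorded as a subset of a.  */
static bool subset[NSETS][NSETS];

static void record_subset(int superset, int sub)
{
  subset[superset][sub] = true;
}

/* Two sets may conflict if they are equal or one contains the other.  */
static bool may_conflict(int a, int b)
{
  return a == b || subset[a][b] || subset[b][a];
}

/* Ask the same question before and after a second variant's component
   is recorded; returns 1 if the answer changed.  */
int answer_depends_on_order(void)
{
  for (int i = 0; i < NSETS; i++)
    for (int j = 0; j < NSETS; j++)
      subset[i][j] = false;

  record_subset(SET_BAR, SET_COMPONENT_V1);
  bool before = may_conflict(SET_BAR, SET_COMPONENT_V2);

  record_subset(SET_BAR, SET_COMPONENT_V2);
  bool after = may_conflict(SET_BAR, SET_COMPONENT_V2);

  return before != after;
}
```

Code optimized while only the first variant was known would have been
given the "before" answer.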

> record_alias_subset has a comment that it "should not" be
> called more than once per pair, but record_component_aliases
> already does this. 
> 
> An alternative could possibly be to activate the "void" case
> in record_component_aliases also without LTO.
> 
> Another alternative - but more work - is for the C FE to synthesize
> a fresh TYPE_CANONICAL that changes these pointers to pointers
> to incomplete types. 

You can look into what ipa-free-lang-data.cc:fld_incomplete_type_of
does, it basically creates a struct T; (when it does not already
exist) for a struct T { ... }; and uses that when referenced as
pointed-to type.  So the C frontend could make sure such an
incomplete type variant exists and make the pointer to that type
the canonical type for pointers?
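
At the source level the idea corresponds roughly to the following
sketch, where struct bar is laid out and copied while struct foo is
still incomplete (swap_bar and swap_demo are hypothetical names; this
does not show what fld_incomplete_type_of does internally):

```c
struct foo;                      /* incomplete variant: members unknown */
struct bar { struct foo *c; };   /* only needs the pointer itself */

/* Compiled without knowing how struct foo is completed.  */
void swap_bar(struct bar *a, struct bar *b)
{
  struct bar t = *a;
  *a = *b;
  *b = t;
}

struct foo { long y; };          /* one possible completion, seen later */

/* Returns 1 if the swap exchanged the two pointers.  */
int swap_demo(void)
{
  struct foo p = { 1 }, q = { 2 };
  struct bar x = { &p }, y = { &q };
  swap_bar(&x, &y);
  return x.c == &q && y.c == &p;
}
```

Accesses to the pointer member stay compatible across completions,
while accesses to the members of struct foo itself are unaffected.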

Richard.

> Martin
> 
> > 
> > Richard.
> > 
> > >   PR c/122572
> > > 
> > > gcc/ChangeLog:
> > >   * gcc/alias.cc (get_alias_set): Record components of all
> > >   versions with the same TYPE_CANONICAL.
> > > 
> > > gcc/testsuite/ChangeLog:
> > >   * gcc.dg/struct-alias-2.c: New test.
> > > ---
> > >  gcc/alias.cc                          |  32 +++++++-
> > >  gcc/testsuite/gcc.dg/struct-alias-2.c | 108 ++++++++++++++++++++++++++
> > >  2 files changed, 138 insertions(+), 2 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/struct-alias-2.c
> > > 
> > > diff --git a/gcc/alias.cc b/gcc/alias.cc
> > > index bce026200a1..95f1f017989 100644
> > > --- a/gcc/alias.cc
> > > +++ b/gcc/alias.cc
> > > @@ -923,6 +923,9 @@ get_alias_set (tree t)
> > >        && TYPE_TYPELESS_STORAGE (t))
> > >      return 0;
> > >  
> > > +  /* Remember the original type before replace with its canonical version.  */
> > > +  tree torig = t;
> > > +
> > >    /* Always use the canonical type as well.  If this is a type that
> > >       requires structural comparisons to identify compatible types
> > >       use alias set zero.  */
> > > @@ -954,7 +957,21 @@ get_alias_set (tree t)
> > >    /* If this is a type with a known alias set, return it.  */
> > >    gcc_checking_assert (t == TYPE_MAIN_VARIANT (t));
> > >    if (TYPE_ALIAS_SET_KNOWN_P (t))
> > > -    return TYPE_ALIAS_SET (t);
> > > +    {
> > > +      /* The C FE for C23 sometimes creates TYPE_CANONICAL for aggregate types
> > > +  that contain pointers to incompatible tagged types.  In this case the
> > > +  types which we had seen before may not have had exactly the same
> > > +  components.  We do need to record the components of the new type
> > > +  we haven't seen before.  */
> > > +      if (!in_lto_p && AGGREGATE_TYPE_P(t) && !TYPE_ALIAS_SET_KNOWN_P (torig))
> > > + {
> > > +   gcc_assert (torig != t);
> > > +   gcc_assert (t == TYPE_CANONICAL (torig));
> > > +   TYPE_ALIAS_SET (torig) = TYPE_ALIAS_SET (t);
> > > +   record_component_aliases (torig);
> > > + }
> > > +      return TYPE_ALIAS_SET (t);
> > > +    }
> > >  
> > >    /* We don't want to set TYPE_ALIAS_SET for incomplete types.  */
> > >    if (!COMPLETE_TYPE_P (t))
> > > @@ -1110,7 +1127,7 @@ get_alias_set (tree t)
> > >     /* Assign the alias set to both p and t.
> > >        We cannot call get_alias_set (p) here as that would trigger
> > >        infinite recursion when p == t.  In other cases it would just
> > > -      trigger unnecesary legwork of rebuilding the pointer again.  */
> > > +      trigger unnecessary legwork of rebuilding the pointer again.  */
> > >     gcc_checking_assert (p == TYPE_MAIN_VARIANT (p));
> > >     if (TYPE_ALIAS_SET_KNOWN_P (p))
> > >       set = TYPE_ALIAS_SET (p);
> > > @@ -1147,6 +1164,17 @@ get_alias_set (tree t)
> > >    if (AGGREGATE_TYPE_P (t) || TREE_CODE (t) == COMPLEX_TYPE)
> > >      record_component_aliases (t);
> > >  
> > > +  /* The C FE for C23 sometimes creates TYPE_CANONICAL for aggregate types
> > > +     that contain pointers to incompatible tagged types.  In this case
> > > +     TYPE_CANONICAL may not have exactly the same components as the original
> > > +     type.  We do need to record the components of both versions.  */
> > > +  if (!in_lto_p && torig != t && AGGREGATE_TYPE_P(t))
> > > +    {
> > > +      TYPE_ALIAS_SET (torig) = set;
> > > +      record_component_aliases (torig);
> > > +    }
> > > +
> > > +
> > >    /* We treat pointer types specially in alias_set_subset_of.  */
> > >    if (POINTER_TYPE_P (t) && set)
> > >      {
> > > diff --git a/gcc/testsuite/gcc.dg/struct-alias-2.c b/gcc/testsuite/gcc.dg/struct-alias-2.c
> > > new file mode 100644
> > > index 00000000000..114a1275594
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/struct-alias-2.c
> > > @@ -0,0 +1,108 @@
> > > +/* { dg-do run } */
> > > +/* { dg-options "-O2 -std=c23" } */
> > > +
> > > +
> > > +/* Based on the submitted test case for PR123356 but using types with
> > > + * tags and moving the second version into another scope.  */
> > > +
> > > +struct foo { long x; };
> > > +
> > > +void f()
> > > +{
> > > + struct foo { };
> > > + struct bar { struct foo *c; };
> > > + union baz { struct foo *c; };
> > > + struct arr { struct foo *c[1]; };
> > > +}
> > > +
> > > +
> > > +void f1()
> > > +{
> > > + struct foo { };
> > > + struct bar { struct foo *c; };
> > > + union baz { struct foo *c; };
> > > + struct arr { struct foo *c[1]; };
> > > +}
> > > +
> > > +struct bar { struct foo *c; };
> > > +union baz { struct foo *c; };
> > > +struct arr { struct foo *c[1]; };
> > > +
> > > +void f2()
> > > +{
> > > + struct foo { int y; };
> > > + struct bar { struct foo *c; };
> > > + union baz { struct foo *c; };
> > > + struct arr { struct foo *c[1]; };
> > > +}
> > > +
> > > +__attribute__((noinline))
> > > +struct foo * g1(struct bar *B, struct bar *Q)
> > > +{
> > > +    struct bar t = *B;
> > > +    *B = *Q;
> > > +    *Q = t;
> > > +    return B->c;
> > > +}
> > > +
> > > +__attribute__((noinline))
> > > +struct foo * g2(union baz *B, union baz *Q)
> > > +{
> > > +    union baz t = *B;
> > > +    *B = *Q;
> > > +    *Q = t;
> > > +    return B->c;
> > > +}
> > > +
> > > +__attribute__((noinline))
> > > +struct foo { long x; } * 
> > > + g3(struct bar { struct foo { long x; } *c; } *B,
> > > +    struct bar { struct foo { long x; } *c; } *Q)
> > > +{
> > > +    struct bar t = *B;
> > > +    *B = *Q;
> > > +    *Q = t;
> > > +    return B->c;
> > > +}
> > > +
> > > +struct foo * g4(struct arr *B,
> > > +         struct arr *Q)
> > > +{
> > > +    struct arr t = *B;
> > > +    *B = *Q;
> > > +    *Q = t;
> > > +    return B->c[0];
> > > +}
> > > +
> > > +int main()
> > > +{
> > > +    struct foo Bc = { };
> > > +    struct foo Qc = { };
> > > +
> > > +    struct bar B = { &Bc };
> > > +    struct bar Q = { &Qc };
> > > +
> > > +    if (g1(&B, &Q) != &Qc)
> > > +     __builtin_abort();
> > > +
> > > +    union baz Bu = { &Bc };
> > > +    union baz Qu = { &Qc };
> > > +
> > > +    if (g2(&Bu, &Qu) != &Qc)
> > > +     __builtin_abort();
> > > +
> > > +    struct bar B2 = { &Bc };
> > > +    struct bar Q2 = { &Qc };
> > > +
> > > +    if (g3(&B2, &Q2) != &Qc)
> > > +     __builtin_abort();
> > > +
> > > +    struct arr Ba = { &Bc };
> > > +    struct arr Qa = { &Qc };
> > > +
> > > +    if (g4(&Ba, &Qa) != &Qc)
> > > +     __builtin_abort();
> > > +
> > > +    return 0;
> > > +}
> > > +
> > > 
> 

-- 
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
