Hi, testcase of PR 117423 shows a flaw in the fancy way we do "total scalarization" in SRA now. We use the types encountered in the function body and not in type declaration (allowing us to totally scalarize when only one union field is ever used, since we effectively "skip" the union then) and can accommodate pre-existing accesses that happen to fall into padding.
In this case, we skipped the union (bypassing the totally_scalarizable_type_p check) and the access falling into the "padding" is an aggregate and so not a candidate for SRA but actually containing data. Arguably total scalarization should just bail out when it encounters this situation (but I decided not to depend on this mainly because we'd need to detect all cases when we eventually cannot scalarize, such as when a scalar access has children accesses) but the actual bug is that the detection if all data in an aggregate is indeed covered by replacements just assumes that is always the case if total scalarization triggers which however may not be the case in cases like this - and perhaps more. This patch fixes the bug by just assuming that all padding is taken care of when total scalarization triggered, not that every access was actually scalarized. Bootstrapped and tested on x86_64-linux, I'm also running a bootstrap and testing on aarch64-linux as I'm writing this but will know the results only tomorrow. OK for master and later for all active release branches if it passes there too? Thanks, Martin gcc/ChangeLog: 2025-07-17 Martin Jambor <mjam...@suse.cz> PR tree-optimization/117423 * tree-sra.cc (analyze_access_subtree): Fix computation of grp_covered flag. gcc/testsuite/ChangeLog: 2025-07-17 Martin Jambor <mjam...@suse.cz> PR tree-optimization/117423 * gcc.dg/tree-ssa/pr117423.c: New test. --- gcc/testsuite/gcc.dg/tree-ssa/pr117423.c | 49 ++++++++++++++++++++++++ gcc/tree-sra.cc | 9 ++++- 2 files changed, 56 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr117423.c diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr117423.c b/gcc/testsuite/gcc.dg/tree-ssa/pr117423.c new file mode 100644 index 00000000000..a5d3b29886f --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr117423.c @@ -0,0 +1,49 @@ +/* { dg-do run } */ +/* { dg-options "-O1" } */ + +struct s4 { + int _0; +}; +struct s1 { + unsigned char _4; + long _1; +}; +struct s2 { + union { + struct s3 { + unsigned char _1; + struct s4 _0; + } var_0; + struct s1 var_1; + } DATA; +}; +int f1(int arg0) { return arg0 > 12345; } +__attribute__((noinline)) +struct s4 f2(int arg0) { + struct s4 rv = {arg0}; + return rv; +} +struct s2 f3(int arg0) { + struct s2 rv; + struct s1 var6 = {0}; + struct s4 var7; + if (f1(arg0)) { + rv.DATA.var_1 = var6; + return rv; + } else { + rv.DATA.var_0._1 = 2; + var7 = f2(arg0); + rv.DATA.var_0._0 = var7; + return rv; + } +} +int main() { + if (f3(12345).DATA.var_0._0._0 == 12345) + ; + else + __builtin_abort(); + if (f3(12344).DATA.var_0._0._0 == 12344) + ; + else + __builtin_abort(); +} diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc index 23236fc6537..240af676ea3 100644 --- a/gcc/tree-sra.cc +++ b/gcc/tree-sra.cc @@ -2889,7 +2889,10 @@ analyze_access_subtree (struct access *root, struct access *parent, for (child = root->first_child; child; child = child->next_sibling) { - hole |= covered_to < child->offset; + if (totally) + covered_to = child->offset; + else + hole |= covered_to < child->offset; sth_created |= analyze_access_subtree (child, root, allow_replacements && !scalar && !root->grp_partial_lhs, @@ -2900,6 +2903,8 @@ analyze_access_subtree (struct access *root, struct access *parent, covered_to += child->size; else hole = true; + if (totally && !hole) + covered_to = limit; } if (allow_replacements && scalar && !root->first_child @@ -2972,7 +2977,7 @@ analyze_access_subtree (struct access *root, struct access *parent, root->grp_total_scalarization = 0; } - if (!hole || totally) + if (!hole) root->grp_covered = 1; else if (root->grp_write || comes_initialized_p (root->base)) root->grp_unscalarized_data = 1; /* not covered and written to */ -- 2.50.1