[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
dobbelaj-snps wrote: Next builds seem to succeed. Looks like an instability of the lldb buildbot ? https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
llvm-ci wrote: LLVM Buildbot has detected a new failure on builder `lldb-x86_64-debian` running on `lldb-x86_64-debian` while building `clang` at step 6 "test". Full details are available at: https://lab.llvm.org/buildbot/#/builders/162/builds/23916 Here is the relevant piece of the build log for the reference ``` Step 6 (test) failure: build (failure) ... UNSUPPORTED: lldb-shell :: ScriptInterpreter/Python/Crashlog/interactive_crashlog_invalid_target.test (2980 of 2991) UNSUPPORTED: lldb-shell :: ScriptInterpreter/Lua/watchpoint_callback.test (2981 of 2991) UNSUPPORTED: lldb-shell :: ScriptInterpreter/Python/Crashlog/last_exception_backtrace_crashlog.test (2982 of 2991) UNSUPPORTED: lldb-shell :: Expr/TestEnumExtensibility.m (2983 of 2991) UNSUPPORTED: lldb-shell :: ScriptInterpreter/Lua/breakpoint_oneline_callback.test (2984 of 2991) UNSUPPORTED: lldb-shell :: Process/Windows/msstl_smoke.cpp (2985 of 2991) UNSUPPORTED: lldb-shell :: ScriptInterpreter/Lua/command_script_import.test (2986 of 2991) UNSUPPORTED: lldb-shell :: ScriptInterpreter/Python/Crashlog/skipped_status_interactive_crashlog.test (2987 of 2991) PASS: lldb-api :: terminal/TestEditlineCompletions.py (2988 of 2991) UNRESOLVED: lldb-api :: tools/lldb-dap/launch/TestDAP_launch.py (2989 of 2991) TEST 'lldb-api :: tools/lldb-dap/launch/TestDAP_launch.py' FAILED Script: -- /usr/bin/python3 /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./lib --env LLVM_INCLUDE_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/include --env LLVM_TOOLS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./bin --arch x86_64 --build-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex --lldb-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/lldb --compiler /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/clang --dsymutil /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./bin --lldb-obj-root /home/worker/2.0.1/lldb-x86_64-debian/build/tools/lldb --lldb-libs-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./lib --cmake-build-type Release -t /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/tools/lldb-dap/launch -p TestDAP_launch.py -- Exit Code: 1 Command Output (stdout): -- lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision c3b8a15eab06fceb6f4d0f2a0f505d5290ff208a) clang revision c3b8a15eab06fceb6f4d0f2a0f505d5290ff208a llvm revision c3b8a15eab06fceb6f4d0f2a0f505d5290ff208a Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc'] -- Command Output (stderr): -- Change dir to: /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/tools/lldb-dap/launch runCmd: settings clear --all output: runCmd: settings set symbols.enable-external-lookup false output: runCmd: settings set target.inherit-tcc true output: runCmd: settings set target.disable-aslr false output: runCmd: settings set target.detach-on-error false output: runCmd: settings set target.auto-apply-fixits false ``` https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
dobbelaj-snps wrote: > Could someone with commit access please merge this? Thanks! Done https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
https://github.com/dobbelaj-snps closed https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
brunodf-snps wrote: Could someone with commit access please merge this? Thanks! https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
https://github.com/rjmccall approved this pull request. https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
brunodf-snps wrote: (Not sure if this requires re-approval, but I think this is ready to be merged.) https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
https://github.com/brunodf-snps updated https://github.com/llvm/llvm-project/pull/137719 >From d63c8fe4fcc8e89933bf3c1cc176941b0b9094fa Mon Sep 17 00:00:00 2001 From: Bruno De Fraine Date: Mon, 28 Apr 2025 14:12:00 +0200 Subject: [PATCH 1/6] [clang][CodeGen] Make tbaa-array test more robust Avoid unintentional matches against extra load/stores in the unoptimized LLVM IR. --- clang/test/CodeGen/tbaa-array.cpp | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/clang/test/CodeGen/tbaa-array.cpp b/clang/test/CodeGen/tbaa-array.cpp index 4a6576e2eeb7f..ce34e7d586e3e 100644 --- a/clang/test/CodeGen/tbaa-array.cpp +++ b/clang/test/CodeGen/tbaa-array.cpp @@ -1,6 +1,6 @@ -// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s \ +// RUN: %clang_cc1 -triple x86_64-linux -O1 %s \ // RUN: -emit-llvm -o - | FileCheck %s -// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s \ +// RUN: %clang_cc1 -triple x86_64-linux -O1 %s \ // RUN: -new-struct-path-tbaa -emit-llvm -o - | \ // RUN: FileCheck -check-prefix=CHECK-NEW %s // @@ -45,7 +45,6 @@ int bar3(C *c, int j) { // CHECK-NEW-DAG: [[TYPE_char:!.*]] = !{{{.*}}, i64 1, !"omnipotent char"} // CHECK-NEW-DAG: [[TYPE_int:!.*]] = !{[[TYPE_char]], i64 4, !"int"} // CHECK-NEW-DAG: [[TAG_int]] = !{[[TYPE_int]], [[TYPE_int]], i64 0, i64 4} -// CHECK-NEW-DAG: [[TYPE_pointer:!.*]] = !{[[TYPE_char]], i64 8, !"any pointer"} // CHECK-NEW-DAG: [[TYPE_A:!.*]] = !{[[TYPE_char]], i64 4, !"_ZTS1A", [[TYPE_int]], i64 0, i64 4} // CHECK-NEW-DAG: [[TAG_A_i]] = !{[[TYPE_A]], [[TYPE_int]], i64 0, i64 4} // CHECK-NEW-DAG: [[TYPE_C:!.*]] = !{[[TYPE_char]], i64 16, !"_ZTS1C", [[TYPE_int]], i64 0, i64 4, [[TYPE_int]], i64 4, i64 12} >From bba5ee5ed17af062f91604d3185d733df944df67 Mon Sep 17 00:00:00 2001 From: Bruno De Fraine Date: Tue, 29 Apr 2025 00:07:02 +0200 Subject: [PATCH 2/6] [CodeGen] Add TBAA struct path info for array members This enables the LLVM optimizer to view accesses to distinct struct members as independent, also for array members. For example, the following two stores no longer alias: struct S { int a[10]; int b; }; void test(S *p, int i) { p->a[i] = ...; p->b = ...; } Array members were already added to TBAA struct type nodes in commit 57493e2. Here, we extend a path tag for an array subscript expression. --- clang/lib/CodeGen/CGExpr.cpp | 27 ++- clang/test/CodeGen/tbaa-array.cpp | 21 +++-- 2 files changed, 45 insertions(+), 3 deletions(-) diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index bba7d1e805f3f..c95a54fcebba9 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -4503,7 +4503,32 @@ LValue CodeGenFunction::EmitArraySubscriptExpr(const ArraySubscriptExpr *E, E->getType(), !getLangOpts().PointerOverflowDefined, SignedIndices, E->getExprLoc(), &arrayType, E->getBase()); EltBaseInfo = ArrayLV.getBaseInfo(); -EltTBAAInfo = CGM.getTBAAInfoForSubobject(ArrayLV, E->getType()); +if (!CGM.getCodeGenOpts().NewStructPathTBAA) { + // Since CodeGenTBAA::getTypeInfoHelper only handles array types for + // new struct path TBAA, we must a use a plain access. + EltTBAAInfo = CGM.getTBAAInfoForSubobject(ArrayLV, E->getType()); +} else if (ArrayLV.getTBAAInfo().isMayAlias()) { + EltTBAAInfo = TBAAAccessInfo::getMayAliasInfo(); +} else if (ArrayLV.getTBAAInfo().isIncomplete()) { + EltTBAAInfo = CGM.getTBAAAccessInfo(E->getType()); +} else { + // Extend struct path from base lvalue, similar to EmitLValueForField. + // If no base type has been assigned for the array access, then try to + // generate one. + EltTBAAInfo = ArrayLV.getTBAAInfo(); + if (!EltTBAAInfo.BaseType) { +EltTBAAInfo.BaseType = CGM.getTBAABaseTypeInfo(ArrayLV.getType()); +assert(!EltTBAAInfo.Offset && + "Nonzero offset for an access with no base type!"); + } + // The index into the array is a runtime value. We use the same struct + // path for all array elements (that of the element at index 0). So we + // set the access type and size, but do not have to adjust + // EltTBAAInfo.Offset. + EltTBAAInfo.AccessType = CGM.getTBAATypeInfo(E->getType()); + EltTBAAInfo.Size = + getContext().getTypeSizeInChars(E->getType()).getQuantity(); +} } else { // The base must be a pointer; emit it with an estimate of its alignment. Addr = EmitPointerWithAlignment(E->getBase(), &EltBaseInfo, &EltTBAAInfo); diff --git a/clang/test/CodeGen/tbaa-array.cpp b/clang/test/CodeGen/tbaa-array.cpp index ce34e7d586e3e..7cda1dd8d5bf7 100644 --- a/clang/test/CodeGen/tbaa-array.cpp +++ b/clang/test/CodeGen/tbaa-array.cpp @@ -10,6 +10,8 @@ struct A { int i; }; struct B { A a[1]; }; struct C { int i; int x[3]; }; +struct D { int n; int arr[]; }; // flexible array memb
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
brunodf-snps wrote: > I was still adding test cases involving may_alias, but I still found problems This has been solved in 1a40063bc73ee468cb6c8634505232b6f2d833ec. This requires to update the member types in the struct type node in case of an array member with may_alias tag. I found it is the most logical to handle array types as a general case in `TypeHasMayAlias`, it seems relevant for all uses of this function. The upshot is that we can now reuse the access type from the base array lvalue, as was originally suggested by @rjmccall. https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
https://github.com/brunodf-snps updated https://github.com/llvm/llvm-project/pull/137719 >From d63c8fe4fcc8e89933bf3c1cc176941b0b9094fa Mon Sep 17 00:00:00 2001 From: Bruno De Fraine Date: Mon, 28 Apr 2025 14:12:00 +0200 Subject: [PATCH 1/5] [clang][CodeGen] Make tbaa-array test more robust Avoid unintentional matches against extra load/stores in the unoptimized LLVM IR. --- clang/test/CodeGen/tbaa-array.cpp | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/clang/test/CodeGen/tbaa-array.cpp b/clang/test/CodeGen/tbaa-array.cpp index 4a6576e2eeb7f..ce34e7d586e3e 100644 --- a/clang/test/CodeGen/tbaa-array.cpp +++ b/clang/test/CodeGen/tbaa-array.cpp @@ -1,6 +1,6 @@ -// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s \ +// RUN: %clang_cc1 -triple x86_64-linux -O1 %s \ // RUN: -emit-llvm -o - | FileCheck %s -// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s \ +// RUN: %clang_cc1 -triple x86_64-linux -O1 %s \ // RUN: -new-struct-path-tbaa -emit-llvm -o - | \ // RUN: FileCheck -check-prefix=CHECK-NEW %s // @@ -45,7 +45,6 @@ int bar3(C *c, int j) { // CHECK-NEW-DAG: [[TYPE_char:!.*]] = !{{{.*}}, i64 1, !"omnipotent char"} // CHECK-NEW-DAG: [[TYPE_int:!.*]] = !{[[TYPE_char]], i64 4, !"int"} // CHECK-NEW-DAG: [[TAG_int]] = !{[[TYPE_int]], [[TYPE_int]], i64 0, i64 4} -// CHECK-NEW-DAG: [[TYPE_pointer:!.*]] = !{[[TYPE_char]], i64 8, !"any pointer"} // CHECK-NEW-DAG: [[TYPE_A:!.*]] = !{[[TYPE_char]], i64 4, !"_ZTS1A", [[TYPE_int]], i64 0, i64 4} // CHECK-NEW-DAG: [[TAG_A_i]] = !{[[TYPE_A]], [[TYPE_int]], i64 0, i64 4} // CHECK-NEW-DAG: [[TYPE_C:!.*]] = !{[[TYPE_char]], i64 16, !"_ZTS1C", [[TYPE_int]], i64 0, i64 4, [[TYPE_int]], i64 4, i64 12} >From bba5ee5ed17af062f91604d3185d733df944df67 Mon Sep 17 00:00:00 2001 From: Bruno De Fraine Date: Tue, 29 Apr 2025 00:07:02 +0200 Subject: [PATCH 2/5] [CodeGen] Add TBAA struct path info for array members This enables the LLVM optimizer to view accesses to distinct struct members as independent, also for array members. For example, the following two stores no longer alias: struct S { int a[10]; int b; }; void test(S *p, int i) { p->a[i] = ...; p->b = ...; } Array members were already added to TBAA struct type nodes in commit 57493e2. Here, we extend a path tag for an array subscript expression. --- clang/lib/CodeGen/CGExpr.cpp | 27 ++- clang/test/CodeGen/tbaa-array.cpp | 21 +++-- 2 files changed, 45 insertions(+), 3 deletions(-) diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index bba7d1e805f3f..c95a54fcebba9 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -4503,7 +4503,32 @@ LValue CodeGenFunction::EmitArraySubscriptExpr(const ArraySubscriptExpr *E, E->getType(), !getLangOpts().PointerOverflowDefined, SignedIndices, E->getExprLoc(), &arrayType, E->getBase()); EltBaseInfo = ArrayLV.getBaseInfo(); -EltTBAAInfo = CGM.getTBAAInfoForSubobject(ArrayLV, E->getType()); +if (!CGM.getCodeGenOpts().NewStructPathTBAA) { + // Since CodeGenTBAA::getTypeInfoHelper only handles array types for + // new struct path TBAA, we must a use a plain access. + EltTBAAInfo = CGM.getTBAAInfoForSubobject(ArrayLV, E->getType()); +} else if (ArrayLV.getTBAAInfo().isMayAlias()) { + EltTBAAInfo = TBAAAccessInfo::getMayAliasInfo(); +} else if (ArrayLV.getTBAAInfo().isIncomplete()) { + EltTBAAInfo = CGM.getTBAAAccessInfo(E->getType()); +} else { + // Extend struct path from base lvalue, similar to EmitLValueForField. + // If no base type has been assigned for the array access, then try to + // generate one. + EltTBAAInfo = ArrayLV.getTBAAInfo(); + if (!EltTBAAInfo.BaseType) { +EltTBAAInfo.BaseType = CGM.getTBAABaseTypeInfo(ArrayLV.getType()); +assert(!EltTBAAInfo.Offset && + "Nonzero offset for an access with no base type!"); + } + // The index into the array is a runtime value. We use the same struct + // path for all array elements (that of the element at index 0). So we + // set the access type and size, but do not have to adjust + // EltTBAAInfo.Offset. + EltTBAAInfo.AccessType = CGM.getTBAATypeInfo(E->getType()); + EltTBAAInfo.Size = + getContext().getTypeSizeInChars(E->getType()).getQuantity(); +} } else { // The base must be a pointer; emit it with an estimate of its alignment. Addr = EmitPointerWithAlignment(E->getBase(), &EltBaseInfo, &EltTBAAInfo); diff --git a/clang/test/CodeGen/tbaa-array.cpp b/clang/test/CodeGen/tbaa-array.cpp index ce34e7d586e3e..7cda1dd8d5bf7 100644 --- a/clang/test/CodeGen/tbaa-array.cpp +++ b/clang/test/CodeGen/tbaa-array.cpp @@ -10,6 +10,8 @@ struct A { int i; }; struct B { A a[1]; }; struct C { int i; int x[3]; }; +struct D { int n; int arr[]; }; // flexible array memb
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
brunodf-snps wrote: Hm, I was still adding test cases involving may_alias, but I still found problems with the following: ``` typedef int __attribute__((may_alias)) aliasing_int; typedef int __attribute__((may_alias)) aliasing_array[10]; struct E { aliasing_int x[4]; aliasing_array y; }; ``` Both `e->x[i]` and `e->y[j]` fail in the IR verifier because of a mismatch between the access type and the type of the member in the type node of E (the access type is "int" and the member type is "omnipotent char" or vice versa). https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
@@ -28,25 +30,39 @@ int bar(C *c) { int bar2(C *c) { // CHECK-NEW-LABEL: _Z4bar2P1C -// CHECK-NEW: load i32, {{.*}}, !tbaa [[TAG_int:!.*]] +// CHECK-NEW: load i32, {{.*}}, !tbaa [[TAG_C_x:!.*]] return c->x[2]; } int bar3(C *c, int j) { // CHECK-NEW-LABEL: _Z4bar3P1Ci -// CHECK-NEW: load i32, {{.*}}, !tbaa [[TAG_int:!.*]] +// CHECK-NEW: load i32, {{.*}}, !tbaa [[TAG_C_x]] return c->x[j]; kosarev wrote: Makes sense, thanks. https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
https://github.com/kosarev approved this pull request. LGTM. https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
@@ -28,25 +30,39 @@ int bar(C *c) { int bar2(C *c) { // CHECK-NEW-LABEL: _Z4bar2P1C -// CHECK-NEW: load i32, {{.*}}, !tbaa [[TAG_int:!.*]] +// CHECK-NEW: load i32, {{.*}}, !tbaa [[TAG_C_x:!.*]] return c->x[2]; } int bar3(C *c, int j) { // CHECK-NEW-LABEL: _Z4bar3P1Ci -// CHECK-NEW: load i32, {{.*}}, !tbaa [[TAG_int:!.*]] +// CHECK-NEW: load i32, {{.*}}, !tbaa [[TAG_C_x]] return c->x[j]; brunodf-snps wrote: Yes, the access to any element of the x member array will get this tag. https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
@@ -28,25 +30,39 @@ int bar(C *c) { int bar2(C *c) { // CHECK-NEW-LABEL: _Z4bar2P1C -// CHECK-NEW: load i32, {{.*}}, !tbaa [[TAG_int:!.*]] +// CHECK-NEW: load i32, {{.*}}, !tbaa [[TAG_C_x:!.*]] return c->x[2]; } int bar3(C *c, int j) { // CHECK-NEW-LABEL: _Z4bar3P1Ci -// CHECK-NEW: load i32, {{.*}}, !tbaa [[TAG_int:!.*]] +// CHECK-NEW: load i32, {{.*}}, !tbaa [[TAG_C_x]] return c->x[j]; kosarev wrote: Am I reading this right that the tag now says we access a 4-byte int in C at offset 4? https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
https://github.com/rjmccall approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
https://github.com/brunodf-snps updated https://github.com/llvm/llvm-project/pull/137719 >From d63c8fe4fcc8e89933bf3c1cc176941b0b9094fa Mon Sep 17 00:00:00 2001 From: Bruno De Fraine Date: Mon, 28 Apr 2025 14:12:00 +0200 Subject: [PATCH 1/4] [clang][CodeGen] Make tbaa-array test more robust Avoid unintentional matches against extra load/stores in the unoptimized LLVM IR. --- clang/test/CodeGen/tbaa-array.cpp | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/clang/test/CodeGen/tbaa-array.cpp b/clang/test/CodeGen/tbaa-array.cpp index 4a6576e2eeb7f..ce34e7d586e3e 100644 --- a/clang/test/CodeGen/tbaa-array.cpp +++ b/clang/test/CodeGen/tbaa-array.cpp @@ -1,6 +1,6 @@ -// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s \ +// RUN: %clang_cc1 -triple x86_64-linux -O1 %s \ // RUN: -emit-llvm -o - | FileCheck %s -// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s \ +// RUN: %clang_cc1 -triple x86_64-linux -O1 %s \ // RUN: -new-struct-path-tbaa -emit-llvm -o - | \ // RUN: FileCheck -check-prefix=CHECK-NEW %s // @@ -45,7 +45,6 @@ int bar3(C *c, int j) { // CHECK-NEW-DAG: [[TYPE_char:!.*]] = !{{{.*}}, i64 1, !"omnipotent char"} // CHECK-NEW-DAG: [[TYPE_int:!.*]] = !{[[TYPE_char]], i64 4, !"int"} // CHECK-NEW-DAG: [[TAG_int]] = !{[[TYPE_int]], [[TYPE_int]], i64 0, i64 4} -// CHECK-NEW-DAG: [[TYPE_pointer:!.*]] = !{[[TYPE_char]], i64 8, !"any pointer"} // CHECK-NEW-DAG: [[TYPE_A:!.*]] = !{[[TYPE_char]], i64 4, !"_ZTS1A", [[TYPE_int]], i64 0, i64 4} // CHECK-NEW-DAG: [[TAG_A_i]] = !{[[TYPE_A]], [[TYPE_int]], i64 0, i64 4} // CHECK-NEW-DAG: [[TYPE_C:!.*]] = !{[[TYPE_char]], i64 16, !"_ZTS1C", [[TYPE_int]], i64 0, i64 4, [[TYPE_int]], i64 4, i64 12} >From bba5ee5ed17af062f91604d3185d733df944df67 Mon Sep 17 00:00:00 2001 From: Bruno De Fraine Date: Tue, 29 Apr 2025 00:07:02 +0200 Subject: [PATCH 2/4] [CodeGen] Add TBAA struct path info for array members This enables the LLVM optimizer to view accesses to distinct struct members as independent, also for array members. For example, the following two stores no longer alias: struct S { int a[10]; int b; }; void test(S *p, int i) { p->a[i] = ...; p->b = ...; } Array members were already added to TBAA struct type nodes in commit 57493e2. Here, we extend a path tag for an array subscript expression. --- clang/lib/CodeGen/CGExpr.cpp | 27 ++- clang/test/CodeGen/tbaa-array.cpp | 21 +++-- 2 files changed, 45 insertions(+), 3 deletions(-) diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index bba7d1e805f3f..c95a54fcebba9 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -4503,7 +4503,32 @@ LValue CodeGenFunction::EmitArraySubscriptExpr(const ArraySubscriptExpr *E, E->getType(), !getLangOpts().PointerOverflowDefined, SignedIndices, E->getExprLoc(), &arrayType, E->getBase()); EltBaseInfo = ArrayLV.getBaseInfo(); -EltTBAAInfo = CGM.getTBAAInfoForSubobject(ArrayLV, E->getType()); +if (!CGM.getCodeGenOpts().NewStructPathTBAA) { + // Since CodeGenTBAA::getTypeInfoHelper only handles array types for + // new struct path TBAA, we must a use a plain access. + EltTBAAInfo = CGM.getTBAAInfoForSubobject(ArrayLV, E->getType()); +} else if (ArrayLV.getTBAAInfo().isMayAlias()) { + EltTBAAInfo = TBAAAccessInfo::getMayAliasInfo(); +} else if (ArrayLV.getTBAAInfo().isIncomplete()) { + EltTBAAInfo = CGM.getTBAAAccessInfo(E->getType()); +} else { + // Extend struct path from base lvalue, similar to EmitLValueForField. + // If no base type has been assigned for the array access, then try to + // generate one. + EltTBAAInfo = ArrayLV.getTBAAInfo(); + if (!EltTBAAInfo.BaseType) { +EltTBAAInfo.BaseType = CGM.getTBAABaseTypeInfo(ArrayLV.getType()); +assert(!EltTBAAInfo.Offset && + "Nonzero offset for an access with no base type!"); + } + // The index into the array is a runtime value. We use the same struct + // path for all array elements (that of the element at index 0). So we + // set the access type and size, but do not have to adjust + // EltTBAAInfo.Offset. + EltTBAAInfo.AccessType = CGM.getTBAATypeInfo(E->getType()); + EltTBAAInfo.Size = + getContext().getTypeSizeInChars(E->getType()).getQuantity(); +} } else { // The base must be a pointer; emit it with an estimate of its alignment. Addr = EmitPointerWithAlignment(E->getBase(), &EltBaseInfo, &EltTBAAInfo); diff --git a/clang/test/CodeGen/tbaa-array.cpp b/clang/test/CodeGen/tbaa-array.cpp index ce34e7d586e3e..7cda1dd8d5bf7 100644 --- a/clang/test/CodeGen/tbaa-array.cpp +++ b/clang/test/CodeGen/tbaa-array.cpp @@ -10,6 +10,8 @@ struct A { int i; }; struct B { A a[1]; }; struct C { int i; int x[3]; }; +struct D { int n; int arr[]; }; // flexible array memb
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
@@ -4503,7 +4503,29 @@ LValue CodeGenFunction::EmitArraySubscriptExpr(const ArraySubscriptExpr *E, E->getType(), !getLangOpts().PointerOverflowDefined, SignedIndices, E->getExprLoc(), &arrayType, E->getBase()); EltBaseInfo = ArrayLV.getBaseInfo(); -EltTBAAInfo = CGM.getTBAAInfoForSubobject(ArrayLV, E->getType()); +if (!CGM.getCodeGenOpts().NewStructPathTBAA) { + // Since CodeGenTBAA::getTypeInfoHelper only handles array types for + // new struct path TBAA, we must a use a plain access. + EltTBAAInfo = CGM.getTBAAInfoForSubobject(ArrayLV, E->getType()); +} else if (ArrayLV.getTBAAInfo().isMayAlias()) { + EltTBAAInfo = TBAAAccessInfo::getMayAliasInfo(); +} else if (ArrayLV.getTBAAInfo().isIncomplete()) { + // The array element is complete, even if the array is not. + EltTBAAInfo = CGM.getTBAAAccessInfo(E->getType()); +} else { + // Extend struct path from base lvalue, similar to EmitLValueForField. + EltTBAAInfo = ArrayLV.getTBAAInfo(); + // If no base type has been assigned for the array access, there is no + // point trying to generate one, since an array is not a valid base type. + // + // The index into the array is a runtime value. We use the same struct + // path for all array elements (that of the element at index 0). So we + // set the access type and size, but do not have to adjust + // EltTBAAInfo.Offset. brunodf-snps wrote: OK, I've rewritten the code comments for this block. https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
https://github.com/rjmccall commented: Thanks, those seem like good reasons. https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
@@ -4503,7 +4503,29 @@ LValue CodeGenFunction::EmitArraySubscriptExpr(const ArraySubscriptExpr *E, E->getType(), !getLangOpts().PointerOverflowDefined, SignedIndices, E->getExprLoc(), &arrayType, E->getBase()); EltBaseInfo = ArrayLV.getBaseInfo(); -EltTBAAInfo = CGM.getTBAAInfoForSubobject(ArrayLV, E->getType()); +if (!CGM.getCodeGenOpts().NewStructPathTBAA) { + // Since CodeGenTBAA::getTypeInfoHelper only handles array types for + // new struct path TBAA, we must a use a plain access. + EltTBAAInfo = CGM.getTBAAInfoForSubobject(ArrayLV, E->getType()); +} else if (ArrayLV.getTBAAInfo().isMayAlias()) { + EltTBAAInfo = TBAAAccessInfo::getMayAliasInfo(); +} else if (ArrayLV.getTBAAInfo().isIncomplete()) { + // The array element is complete, even if the array is not. + EltTBAAInfo = CGM.getTBAAAccessInfo(E->getType()); +} else { + // Extend struct path from base lvalue, similar to EmitLValueForField. + EltTBAAInfo = ArrayLV.getTBAAInfo(); + // If no base type has been assigned for the array access, there is no + // point trying to generate one, since an array is not a valid base type. + // + // The index into the array is a runtime value. We use the same struct + // path for all array elements (that of the element at index 0). So we + // set the access type and size, but do not have to adjust + // EltTBAAInfo.Offset. rjmccall wrote: This comment doesn't seem to match the code anymore. https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
https://github.com/rjmccall edited https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
brunodf-snps wrote: > I'm surprised we need all this. Since we don't distinguish arrays from their > elements in TBAA, is it not sufficient to just make sure that array subscript > expressions produce an l-value with the same TBAA metadata as their base > l-value? The code is written after [this logic](https://github.com/brunodf-snps/llvm-project/blob/bba5ee5ed17af062f91604d3185d733df944df67/clang/lib/CodeGen/CGExpr.cpp#L5093-L5126) from EmitLValueForField. After investigation, I agree not all of it is needed. An array type is not a valid TBAA base type, so I've removed those lines now. But the rest I would keep though: - In `ArrayLV.getTBAAInfo()`, the TBAA access info from the base l-value, we at least have to update the access size (an element is smaller than the entire array). - Except when the access info is `MayAlias`, then we should just stay at `MayAlias`. Note: this triggers for an array like: ``` typedef int __attribute__((may_alias)) aliasing_array[10]; aliasing_array arr; ``` - When the array access info is `Incomplete`, the access to the element is nonetheless ordinary. So we have to handle this specifically. - Despite that we represent arrays as their elements in TBAA, I found a case where also the access type from the array access info is not correct. ``` typedef int __attribute__((may_alias)) aliasing_int; aliasing_int arr[10]; ``` When the TBAA access info is computed for the array, its type will be canonicalized [here](https://github.com/brunodf-snps/llvm-project/blob/bba5ee5ed17af062f91604d3185d733df944df67/clang/lib/CodeGen/CodeGenTBAA.cpp#L381) into `int[10]`, so we end up with the MDNode for "int" as the access type. While if we reset the access type from the element type `aliasing_int`, it will be the correct MDNode of "omnipotent char". https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
https://github.com/brunodf-snps updated https://github.com/llvm/llvm-project/pull/137719 >From d63c8fe4fcc8e89933bf3c1cc176941b0b9094fa Mon Sep 17 00:00:00 2001 From: Bruno De Fraine Date: Mon, 28 Apr 2025 14:12:00 +0200 Subject: [PATCH 1/3] [clang][CodeGen] Make tbaa-array test more robust Avoid unintentional matches against extra load/stores in the unoptimized LLVM IR. --- clang/test/CodeGen/tbaa-array.cpp | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/clang/test/CodeGen/tbaa-array.cpp b/clang/test/CodeGen/tbaa-array.cpp index 4a6576e2eeb7f..ce34e7d586e3e 100644 --- a/clang/test/CodeGen/tbaa-array.cpp +++ b/clang/test/CodeGen/tbaa-array.cpp @@ -1,6 +1,6 @@ -// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s \ +// RUN: %clang_cc1 -triple x86_64-linux -O1 %s \ // RUN: -emit-llvm -o - | FileCheck %s -// RUN: %clang_cc1 -triple x86_64-linux -O1 -disable-llvm-passes %s \ +// RUN: %clang_cc1 -triple x86_64-linux -O1 %s \ // RUN: -new-struct-path-tbaa -emit-llvm -o - | \ // RUN: FileCheck -check-prefix=CHECK-NEW %s // @@ -45,7 +45,6 @@ int bar3(C *c, int j) { // CHECK-NEW-DAG: [[TYPE_char:!.*]] = !{{{.*}}, i64 1, !"omnipotent char"} // CHECK-NEW-DAG: [[TYPE_int:!.*]] = !{[[TYPE_char]], i64 4, !"int"} // CHECK-NEW-DAG: [[TAG_int]] = !{[[TYPE_int]], [[TYPE_int]], i64 0, i64 4} -// CHECK-NEW-DAG: [[TYPE_pointer:!.*]] = !{[[TYPE_char]], i64 8, !"any pointer"} // CHECK-NEW-DAG: [[TYPE_A:!.*]] = !{[[TYPE_char]], i64 4, !"_ZTS1A", [[TYPE_int]], i64 0, i64 4} // CHECK-NEW-DAG: [[TAG_A_i]] = !{[[TYPE_A]], [[TYPE_int]], i64 0, i64 4} // CHECK-NEW-DAG: [[TYPE_C:!.*]] = !{[[TYPE_char]], i64 16, !"_ZTS1C", [[TYPE_int]], i64 0, i64 4, [[TYPE_int]], i64 4, i64 12} >From bba5ee5ed17af062f91604d3185d733df944df67 Mon Sep 17 00:00:00 2001 From: Bruno De Fraine Date: Tue, 29 Apr 2025 00:07:02 +0200 Subject: [PATCH 2/3] [CodeGen] Add TBAA struct path info for array members This enables the LLVM optimizer to view accesses to distinct struct members as independent, also for array members. For example, the following two stores no longer alias: struct S { int a[10]; int b; }; void test(S *p, int i) { p->a[i] = ...; p->b = ...; } Array members were already added to TBAA struct type nodes in commit 57493e2. Here, we extend a path tag for an array subscript expression. --- clang/lib/CodeGen/CGExpr.cpp | 27 ++- clang/test/CodeGen/tbaa-array.cpp | 21 +++-- 2 files changed, 45 insertions(+), 3 deletions(-) diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp index bba7d1e805f3f..c95a54fcebba9 100644 --- a/clang/lib/CodeGen/CGExpr.cpp +++ b/clang/lib/CodeGen/CGExpr.cpp @@ -4503,7 +4503,32 @@ LValue CodeGenFunction::EmitArraySubscriptExpr(const ArraySubscriptExpr *E, E->getType(), !getLangOpts().PointerOverflowDefined, SignedIndices, E->getExprLoc(), &arrayType, E->getBase()); EltBaseInfo = ArrayLV.getBaseInfo(); -EltTBAAInfo = CGM.getTBAAInfoForSubobject(ArrayLV, E->getType()); +if (!CGM.getCodeGenOpts().NewStructPathTBAA) { + // Since CodeGenTBAA::getTypeInfoHelper only handles array types for + // new struct path TBAA, we must a use a plain access. + EltTBAAInfo = CGM.getTBAAInfoForSubobject(ArrayLV, E->getType()); +} else if (ArrayLV.getTBAAInfo().isMayAlias()) { + EltTBAAInfo = TBAAAccessInfo::getMayAliasInfo(); +} else if (ArrayLV.getTBAAInfo().isIncomplete()) { + EltTBAAInfo = CGM.getTBAAAccessInfo(E->getType()); +} else { + // Extend struct path from base lvalue, similar to EmitLValueForField. + // If no base type has been assigned for the array access, then try to + // generate one. + EltTBAAInfo = ArrayLV.getTBAAInfo(); + if (!EltTBAAInfo.BaseType) { +EltTBAAInfo.BaseType = CGM.getTBAABaseTypeInfo(ArrayLV.getType()); +assert(!EltTBAAInfo.Offset && + "Nonzero offset for an access with no base type!"); + } + // The index into the array is a runtime value. We use the same struct + // path for all array elements (that of the element at index 0). So we + // set the access type and size, but do not have to adjust + // EltTBAAInfo.Offset. + EltTBAAInfo.AccessType = CGM.getTBAATypeInfo(E->getType()); + EltTBAAInfo.Size = + getContext().getTypeSizeInChars(E->getType()).getQuantity(); +} } else { // The base must be a pointer; emit it with an estimate of its alignment. Addr = EmitPointerWithAlignment(E->getBase(), &EltBaseInfo, &EltTBAAInfo); diff --git a/clang/test/CodeGen/tbaa-array.cpp b/clang/test/CodeGen/tbaa-array.cpp index ce34e7d586e3e..7cda1dd8d5bf7 100644 --- a/clang/test/CodeGen/tbaa-array.cpp +++ b/clang/test/CodeGen/tbaa-array.cpp @@ -10,6 +10,8 @@ struct A { int i; }; struct B { A a[1]; }; struct C { int i; int x[3]; }; +struct D { int n; int arr[]; }; // flexible array memb
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
https://github.com/rjmccall commented: I'm surprised we need all this. Since we don't distinguish arrays from their elements in TBAA, is it not sufficient to just make sure that array subscript expressions produce an l-value with the same TBAA metadata as their base l-value? https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [CodeGen] Add TBAA struct path info for array members (PR #137719)
brunodf-snps wrote: > @kosarev @fhahn @rjmccall @efriedma-quic @hfinkel Ping. @AaronBallman @nikic any idea who else could review TBAA changes? https://github.com/llvm/llvm-project/pull/137719 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits