[clang] [analyzer] Canonicalize the Decls of FieldRegions (PR #156668)

2025-09-04 Thread LLVM Continuous Integration via cfe-commits

llvm-ci wrote:

LLVM Buildbot has detected a new failure on builder `llvm-clang-aarch64-darwin` 
running on `doug-worker-4` while building `clang` at step 6 
"test-build-unified-tree-check-all".

Full details are available at: 
https://lab.llvm.org/buildbot/#/builders/190/builds/26685


Here is the relevant piece of the build log for the reference

```
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
 TEST 'lld :: MinGW/driver.test' FAILED 
Exit Code: 127

Command Output (stdout):
--
# RUN: at line 1
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/ld.lld -### foo.o -m 
i386pe 2>&1 | /Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/FileCheck 
-check-prefix=X86 
/Users/buildbot/buildbot-root/llvm-project/lld/test/MinGW/driver.test
# executed command: 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/ld.lld '-###' foo.o -m 
i386pe
# note: command had no output on stdout or stderr
# executed command: 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/FileCheck 
-check-prefix=X86 
/Users/buildbot/buildbot-root/llvm-project/lld/test/MinGW/driver.test
# note: command had no output on stdout or stderr
# RUN: at line 7
echo "-### foo.o -m i386pe" > 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/tools/lld/test/MinGW/Output/driver.test.tmp.rsp
# executed command: echo '-### foo.o -m i386pe'
# note: command had no output on stdout or stderr
# RUN: at line 8
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/ld.lld 
@/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/tools/lld/test/MinGW/Output/driver.test.tmp.rsp
 2>&1 | /Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/FileCheck 
-check-prefix=X86 
/Users/buildbot/buildbot-root/llvm-project/lld/test/MinGW/driver.test
# executed command: 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/ld.lld 
@/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/tools/lld/test/MinGW/Output/driver.test.tmp.rsp
# note: command had no output on stdout or stderr
# executed command: 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/FileCheck 
-check-prefix=X86 
/Users/buildbot/buildbot-root/llvm-project/lld/test/MinGW/driver.test
# note: command had no output on stdout or stderr
# RUN: at line 10
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/ld.lld -### foo.o -m 
i386pep 2>&1 | 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/FileCheck 
-check-prefix=X64 
/Users/buildbot/buildbot-root/llvm-project/lld/test/MinGW/driver.test
# executed command: 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/ld.lld '-###' foo.o -m 
i386pep
# note: command had no output on stdout or stderr
# executed command: 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/FileCheck 
-check-prefix=X64 
/Users/buildbot/buildbot-root/llvm-project/lld/test/MinGW/driver.test
# note: command had no output on stdout or stderr
# RUN: at line 16
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/ld.lld -### foo.o -m 
thumb2pe 2>&1 | 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/FileCheck 
-check-prefix=ARM 
/Users/buildbot/buildbot-root/llvm-project/lld/test/MinGW/driver.test
# executed command: 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/ld.lld '-###' foo.o -m 
thumb2pe
# note: command had no output on stdout or stderr
# executed command: 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/FileCheck 
-check-prefix=ARM 
/Users/buildbot/buildbot-root/llvm-project/lld/test/MinGW/driver.test
# note: command had no output on stdout or stderr
# RUN: at line 22
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/ld.lld -### foo.o -m 
arm64pe 2>&1 | 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/FileCheck 
-check-prefix=ARM64 
/Users/buildbot/buildbot-root/llvm-project/lld/test/MinGW/driver.test
# executed command: 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/ld.lld '-###' foo.o -m 
arm64pe
# note: command had no output on stdout or stderr
# executed command: 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/FileCheck 
-check-prefix=ARM64 
/Users/buildbot/buildbot-root/llvm-project/lld/test/MinGW/driver.test
# note: command had no output on stdout or stderr
# RUN: at line 28
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/ld.lld -### foo.o -m 
arm64ecpe 2>&1 | 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/FileCheck 
-check-prefix=ARM64EC 
/Users/buildbot/buildbot-root/llvm-project/lld/test/MinGW/driver.test
# executed command: 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/ld.lld '-###' foo.o -m 
arm64ecpe
# note: command had no output on stdout or stderr
# executed command: 
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/FileCheck 
-check-prefix=ARM64EC 
/Users/buildbot/buildbot-root/llvm-project/lld/test/MinGW/driver.test
# note: command had no output on stdout or stderr
# RUN: at line 34
/Volumes/RAMDisk/buildbot-root/aarch64-darwin/build/bin/ld.lld -### foo.o -m 
arm64xpe 2>&1 | 

[clang] [analyzer] Canonicalize the Decls of FieldRegions (PR #156668)

2025-09-03 Thread Balázs Benics via cfe-commits

https://github.com/balazs-benics-sonarsource closed 
https://github.com/llvm/llvm-project/pull/156668
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Canonicalize the Decls of FieldRegions (PR #156668)

2025-09-03 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

Thanks for the updates!

https://github.com/llvm/llvm-project/pull/156668
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Canonicalize the Decls of FieldRegions (PR #156668)

2025-09-03 Thread Balázs Benics via cfe-commits


@@ -1704,16 +1704,23 @@ static RegionOffset calculateOffset(const MemRegion *R) 
{
   if (SymbolicOffsetBase)
 continue;
 
-  // Get the field number.
-  unsigned idx = 0;
-  for (RecordDecl::field_iterator FI = RD->field_begin(),
- FE = RD->field_end(); FI != FE; ++FI, ++idx) {
-if (FR->getDecl() == *FI)
-  break;
+  auto MaybeFieldIdx = [FR, RD]() -> std::optional {
+assert(FR->getDecl()->getCanonicalDecl() == FR->getDecl());

balazs-benics-sonarsource wrote:

Fixed in 1d289652b54d22db2566e63391c86501ecaefff5

https://github.com/llvm/llvm-project/pull/156668
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Canonicalize the Decls of FieldRegions (PR #156668)

2025-09-03 Thread Balázs Benics via cfe-commits


@@ -1271,7 +1271,7 @@ const SymbolicRegion 
*MemRegionManager::getSymbolicHeapRegion(SymbolRef Sym) {
 const FieldRegion*
 MemRegionManager::getFieldRegion(const FieldDecl *d,
  const SubRegion* superRegion){
-  return getSubRegion(d, superRegion);
+  return getSubRegion(d->getCanonicalDecl(), superRegion);

balazs-benics-sonarsource wrote:

Fixed in 2d18911419310ef16e16e5d567b112f2c8d3805e

https://github.com/llvm/llvm-project/pull/156668
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Canonicalize the Decls of FieldRegions (PR #156668)

2025-09-03 Thread Balázs Benics via cfe-commits

https://github.com/balazs-benics-sonarsource updated 
https://github.com/llvm/llvm-project/pull/156668

From 5d776f7eec057c51e8d9346492b1d4f4128bd1be Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Wed, 3 Sep 2025 15:18:41 +0200
Subject: [PATCH 1/3] [analyzer] Canonicalize the Decls of FieldRegions

When calculating the offset of a FieldRegion, we need to find out which
field index the given field refers to.
Previously, if for some reason the field was not found, then the `Idx`
passed to `Layout.getFieldOffset` was out of bounds and caused undefined
behavior when dereferenced an out of bounds element in
`ASTVector::FieldOffsets::operator[]`, which asserts this in debug
builds, but exposes the undefined behavior in release builds.

In this patch, I refactored how we enumerate the fields, and gracefully
handle the scenario where the field is not found.
That case is still bad, but at least it should not expose the undefined
behavior in release builds, and should assert earlier in debug builds
than before.

The motivational case was transformed into a regression test, that would
fail if no canonicalization would happen when creating a FieldRegion.
This was reduced from a production crash.
In the test case, due to how modules work, there would be multiple
copies of the same template specialization in the AST.
This could lead into inconsistent state when the FieldRegion's Decl was
different to the RecordDecl's field - because one referred to the first
and the other to the second. This made `calculateOffset` fail to compute
the field index, triggering the undefined behavior in production.

While this inconsistency gets fixed now, I think the assertion is still
warranted to avoid potential undefined behavior in release builds.

CPP-6691,CPP-6849

Co-authored-by: Marco Borgeaud 
---
 clang/lib/StaticAnalyzer/Core/MemRegion.cpp   | 23 +++-
 .../explicit-templ-inst-crash-in-modules.cppm | 35 +++
 2 files changed, 50 insertions(+), 8 deletions(-)
 create mode 100644 
clang/test/Analysis/modules/explicit-templ-inst-crash-in-modules.cppm

diff --git a/clang/lib/StaticAnalyzer/Core/MemRegion.cpp 
b/clang/lib/StaticAnalyzer/Core/MemRegion.cpp
index 5f271963e3d09..f68fb1e6df759 100644
--- a/clang/lib/StaticAnalyzer/Core/MemRegion.cpp
+++ b/clang/lib/StaticAnalyzer/Core/MemRegion.cpp
@@ -1271,7 +1271,7 @@ const SymbolicRegion 
*MemRegionManager::getSymbolicHeapRegion(SymbolRef Sym) {
 const FieldRegion*
 MemRegionManager::getFieldRegion(const FieldDecl *d,
  const SubRegion* superRegion){
-  return getSubRegion(d, superRegion);
+  return getSubRegion(d->getCanonicalDecl(), superRegion);
 }
 
 const ObjCIvarRegion*
@@ -1704,16 +1704,23 @@ static RegionOffset calculateOffset(const MemRegion *R) 
{
   if (SymbolicOffsetBase)
 continue;
 
-  // Get the field number.
-  unsigned idx = 0;
-  for (RecordDecl::field_iterator FI = RD->field_begin(),
- FE = RD->field_end(); FI != FE; ++FI, ++idx) {
-if (FR->getDecl() == *FI)
-  break;
+  auto MaybeFieldIdx = [FR, RD]() -> std::optional {
+assert(FR->getDecl()->getCanonicalDecl() == FR->getDecl());
+for (auto [Idx, Field] : llvm::enumerate(RD->fields())) {
+  if (FR->getDecl() == Field->getCanonicalDecl())
+return Idx;
+}
+return std::nullopt;
+  }();
+
+  if (!MaybeFieldIdx.has_value()) {
+assert(false && "Field not found");
+goto Finish; // Invalid offset.
   }
+
   const ASTRecordLayout &Layout = R->getContext().getASTRecordLayout(RD);
   // This is offset in bits.
-  Offset += Layout.getFieldOffset(idx);
+  Offset += Layout.getFieldOffset(MaybeFieldIdx.value());
   break;
 }
 }
diff --git 
a/clang/test/Analysis/modules/explicit-templ-inst-crash-in-modules.cppm 
b/clang/test/Analysis/modules/explicit-templ-inst-crash-in-modules.cppm
new file mode 100644
index 0..6eec29c7187ba
--- /dev/null
+++ b/clang/test/Analysis/modules/explicit-templ-inst-crash-in-modules.cppm
@@ -0,0 +1,35 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+//
+// DEFINE: %{common-flags}= -std=c++20 -I %t -fprebuilt-module-path=%t
+//
+// RUN: %clang_cc1 %{common-flags} %t/other.cppm -emit-module-interface -o 
%t/other.pcm
+// RUN: %clang_analyze_cc1 -analyzer-checker=core %{common-flags} 
%t/entry.cppm -verify
+
+
+//--- MyStruct.h
+template  struct MyStruct {
+  T data = 0;
+};
+template struct MyStruct; // Explicit template instantiation.
+
+//--- other.cppm
+module;
+#include "MyStruct.h"
+export module other;
+static void implicit_instantiate_MyStruct() {
+  MyStruct var;
+  (void)var;
+}
+
+//--- entry.cppm
+// expected-no-diagnostics
+module;
+#include "MyStruct.h"
+module other;
+
+void entry_point() {
+  MyStruct var; // no-crash
+  (void)var;
+}

From 2d18911419310ef16e16e5d567b112f2c8d3805e Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Wed, 3 

[clang] [analyzer] Canonicalize the Decls of FieldRegions (PR #156668)

2025-09-03 Thread Donát Nagy via cfe-commits


@@ -1271,7 +1271,7 @@ const SymbolicRegion 
*MemRegionManager::getSymbolicHeapRegion(SymbolRef Sym) {
 const FieldRegion*
 MemRegionManager::getFieldRegion(const FieldDecl *d,
  const SubRegion* superRegion){
-  return getSubRegion(d, superRegion);
+  return getSubRegion(d->getCanonicalDecl(), superRegion);

NagyDonat wrote:

As your commit already touches this line, consider `UpperCamelCasing` the 
variable names (as a step in gradually spreading that style).

https://github.com/llvm/llvm-project/pull/156668
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Canonicalize the Decls of FieldRegions (PR #156668)

2025-09-03 Thread Donát Nagy via cfe-commits


@@ -1704,16 +1704,23 @@ static RegionOffset calculateOffset(const MemRegion *R) 
{
   if (SymbolicOffsetBase)
 continue;
 
-  // Get the field number.
-  unsigned idx = 0;
-  for (RecordDecl::field_iterator FI = RD->field_begin(),
- FE = RD->field_end(); FI != FE; ++FI, ++idx) {
-if (FR->getDecl() == *FI)
-  break;
+  auto MaybeFieldIdx = [FR, RD]() -> std::optional {
+assert(FR->getDecl()->getCanonicalDecl() == FR->getDecl());

NagyDonat wrote:

Please move this assertion out of the immediately executed lambda expression -- 
let's keep it as simple as possible.

Also consider adding something like `&& "This function should always be called 
with a canonical declaration."` because IMO this would be helpful for a reader 
(who otherwise may have thought about a more fundamental reason for asserting 
this). 

https://github.com/llvm/llvm-project/pull/156668
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Canonicalize the Decls of FieldRegions (PR #156668)

2025-09-03 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat edited 
https://github.com/llvm/llvm-project/pull/156668
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Canonicalize the Decls of FieldRegions (PR #156668)

2025-09-03 Thread Donát Nagy via cfe-commits

https://github.com/NagyDonat approved this pull request.

The patch LGTM, it is always good to see a fix for a crash.

I added two very minor stylistic suggestions in inline comments, but I think 
the change is acceptable even without those changes.

https://github.com/llvm/llvm-project/pull/156668
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Canonicalize the Decls of FieldRegions (PR #156668)

2025-09-03 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-static-analyzer-1

Author: Balázs Benics (balazs-benics-sonarsource)


Changes

When calculating the offset of a FieldRegion, we need to find out which field 
index the given field refers to.
Previously, if for some reason the field was not found, then the `Idx` passed 
to `Layout.getFieldOffset` was out of bounds and caused undefined behavior when 
dereferenced an out of bounds element in `ASTVector::FieldOffsets::operator[]`, 
which asserts this in debug builds, but exposes the undefined behavior in 
release builds.

In this patch, I refactored how we enumerate the fields, and gracefully handle 
the scenario where the field is not found.
That case is still bad, but at least it should not expose the undefined 
behavior in release builds, and should assert earlier in debug builds than 
before.

The motivational case was transformed into a regression test, that would fail 
if no canonicalization would happen when creating a FieldRegion. This was 
reduced from a production crash.
In the test case, due to how modules work, there would be multiple copies of 
the same template specialization in the AST. This could lead into inconsistent 
state when the FieldRegion's Decl was different to the RecordDecl's field - 
because one referred to the first and the other to the second. This made 
`calculateOffset` fail to compute the field index, triggering the undefined 
behavior in production.

While this inconsistency gets fixed now, I think the assertion is still 
warranted to avoid potential undefined behavior in release builds.

CPP-6691,CPP-6849

---
Full diff: https://github.com/llvm/llvm-project/pull/156668.diff


2 Files Affected:

- (modified) clang/lib/StaticAnalyzer/Core/MemRegion.cpp (+15-8) 
- (added) clang/test/Analysis/modules/explicit-templ-inst-crash-in-modules.cppm 
(+35) 


``diff
diff --git a/clang/lib/StaticAnalyzer/Core/MemRegion.cpp 
b/clang/lib/StaticAnalyzer/Core/MemRegion.cpp
index 5f271963e3d09..f68fb1e6df759 100644
--- a/clang/lib/StaticAnalyzer/Core/MemRegion.cpp
+++ b/clang/lib/StaticAnalyzer/Core/MemRegion.cpp
@@ -1271,7 +1271,7 @@ const SymbolicRegion 
*MemRegionManager::getSymbolicHeapRegion(SymbolRef Sym) {
 const FieldRegion*
 MemRegionManager::getFieldRegion(const FieldDecl *d,
  const SubRegion* superRegion){
-  return getSubRegion(d, superRegion);
+  return getSubRegion(d->getCanonicalDecl(), superRegion);
 }
 
 const ObjCIvarRegion*
@@ -1704,16 +1704,23 @@ static RegionOffset calculateOffset(const MemRegion *R) 
{
   if (SymbolicOffsetBase)
 continue;
 
-  // Get the field number.
-  unsigned idx = 0;
-  for (RecordDecl::field_iterator FI = RD->field_begin(),
- FE = RD->field_end(); FI != FE; ++FI, ++idx) {
-if (FR->getDecl() == *FI)
-  break;
+  auto MaybeFieldIdx = [FR, RD]() -> std::optional {
+assert(FR->getDecl()->getCanonicalDecl() == FR->getDecl());
+for (auto [Idx, Field] : llvm::enumerate(RD->fields())) {
+  if (FR->getDecl() == Field->getCanonicalDecl())
+return Idx;
+}
+return std::nullopt;
+  }();
+
+  if (!MaybeFieldIdx.has_value()) {
+assert(false && "Field not found");
+goto Finish; // Invalid offset.
   }
+
   const ASTRecordLayout &Layout = R->getContext().getASTRecordLayout(RD);
   // This is offset in bits.
-  Offset += Layout.getFieldOffset(idx);
+  Offset += Layout.getFieldOffset(MaybeFieldIdx.value());
   break;
 }
 }
diff --git 
a/clang/test/Analysis/modules/explicit-templ-inst-crash-in-modules.cppm 
b/clang/test/Analysis/modules/explicit-templ-inst-crash-in-modules.cppm
new file mode 100644
index 0..6eec29c7187ba
--- /dev/null
+++ b/clang/test/Analysis/modules/explicit-templ-inst-crash-in-modules.cppm
@@ -0,0 +1,35 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: split-file %s %t
+//
+// DEFINE: %{common-flags}= -std=c++20 -I %t -fprebuilt-module-path=%t
+//
+// RUN: %clang_cc1 %{common-flags} %t/other.cppm -emit-module-interface -o 
%t/other.pcm
+// RUN: %clang_analyze_cc1 -analyzer-checker=core %{common-flags} 
%t/entry.cppm -verify
+
+
+//--- MyStruct.h
+template  struct MyStruct {
+  T data = 0;
+};
+template struct MyStruct; // Explicit template instantiation.
+
+//--- other.cppm
+module;
+#include "MyStruct.h"
+export module other;
+static void implicit_instantiate_MyStruct() {
+  MyStruct var;
+  (void)var;
+}
+
+//--- entry.cppm
+// expected-no-diagnostics
+module;
+#include "MyStruct.h"
+module other;
+
+void entry_point() {
+  MyStruct var; // no-crash
+  (void)var;
+}

``




https://github.com/llvm/llvm-project/pull/156668
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Canonicalize the Decls of FieldRegions (PR #156668)

2025-09-03 Thread Balázs Benics via cfe-commits

https://github.com/balazs-benics-sonarsource edited 
https://github.com/llvm/llvm-project/pull/156668
___
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [analyzer] Canonicalize the Decls of FieldRegions (PR #156668)

2025-09-03 Thread Balázs Benics via cfe-commits

https://github.com/balazs-benics-sonarsource created 
https://github.com/llvm/llvm-project/pull/156668

When calculating the offset of a FieldRegion, we need to find out which field 
index the given field refers to.
Previously, if for some reason the field was not found, then the `Idx` passed 
to `Layout.getFieldOffset` was out of bounds and caused undefined behavior when 
dereferenced an out of bounds element in `ASTVector::FieldOffsets::operator[]`, 
which asserts this in debug builds, but exposes the undefined behavior in 
release builds.

In this patch, I refactored how we enumerate the fields, and gracefully handle 
the scenario where the field is not found.
That case is still bad, but at least it should not expose the undefined 
behavior in release builds, and should assert earlier in debug builds than 
before.

The motivational case was transformed into a regression test, that would fail 
if no canonicalization would happen when creating a FieldRegion. This was 
reduced from a production crash.
In the test case, due to how modules work, there would be multiple copies of 
the same template specialization in the AST. This could lead into inconsistent 
state when the FieldRegion's Decl was different to the RecordDecl's field - 
because one referred to the first and the other to the second. This made 
`calculateOffset` fail to compute the field index, triggering the undefined 
behavior in production.

While this inconsistency gets fixed now, I think the assertion is still 
warranted to avoid potential undefined behavior in release builds.

CPP-6691,CPP-6849

Co-authored-by: Marco Borgeaud 

From 5d776f7eec057c51e8d9346492b1d4f4128bd1be Mon Sep 17 00:00:00 2001
From: Balazs Benics 
Date: Wed, 3 Sep 2025 15:18:41 +0200
Subject: [PATCH] [analyzer] Canonicalize the Decls of FieldRegions

When calculating the offset of a FieldRegion, we need to find out which
field index the given field refers to.
Previously, if for some reason the field was not found, then the `Idx`
passed to `Layout.getFieldOffset` was out of bounds and caused undefined
behavior when dereferenced an out of bounds element in
`ASTVector::FieldOffsets::operator[]`, which asserts this in debug
builds, but exposes the undefined behavior in release builds.

In this patch, I refactored how we enumerate the fields, and gracefully
handle the scenario where the field is not found.
That case is still bad, but at least it should not expose the undefined
behavior in release builds, and should assert earlier in debug builds
than before.

The motivational case was transformed into a regression test, that would
fail if no canonicalization would happen when creating a FieldRegion.
This was reduced from a production crash.
In the test case, due to how modules work, there would be multiple
copies of the same template specialization in the AST.
This could lead into inconsistent state when the FieldRegion's Decl was
different to the RecordDecl's field - because one referred to the first
and the other to the second. This made `calculateOffset` fail to compute
the field index, triggering the undefined behavior in production.

While this inconsistency gets fixed now, I think the assertion is still
warranted to avoid potential undefined behavior in release builds.

CPP-6691,CPP-6849

Co-authored-by: Marco Borgeaud 
---
 clang/lib/StaticAnalyzer/Core/MemRegion.cpp   | 23 +++-
 .../explicit-templ-inst-crash-in-modules.cppm | 35 +++
 2 files changed, 50 insertions(+), 8 deletions(-)
 create mode 100644 
clang/test/Analysis/modules/explicit-templ-inst-crash-in-modules.cppm

diff --git a/clang/lib/StaticAnalyzer/Core/MemRegion.cpp 
b/clang/lib/StaticAnalyzer/Core/MemRegion.cpp
index 5f271963e3d09..f68fb1e6df759 100644
--- a/clang/lib/StaticAnalyzer/Core/MemRegion.cpp
+++ b/clang/lib/StaticAnalyzer/Core/MemRegion.cpp
@@ -1271,7 +1271,7 @@ const SymbolicRegion 
*MemRegionManager::getSymbolicHeapRegion(SymbolRef Sym) {
 const FieldRegion*
 MemRegionManager::getFieldRegion(const FieldDecl *d,
  const SubRegion* superRegion){
-  return getSubRegion(d, superRegion);
+  return getSubRegion(d->getCanonicalDecl(), superRegion);
 }
 
 const ObjCIvarRegion*
@@ -1704,16 +1704,23 @@ static RegionOffset calculateOffset(const MemRegion *R) 
{
   if (SymbolicOffsetBase)
 continue;
 
-  // Get the field number.
-  unsigned idx = 0;
-  for (RecordDecl::field_iterator FI = RD->field_begin(),
- FE = RD->field_end(); FI != FE; ++FI, ++idx) {
-if (FR->getDecl() == *FI)
-  break;
+  auto MaybeFieldIdx = [FR, RD]() -> std::optional {
+assert(FR->getDecl()->getCanonicalDecl() == FR->getDecl());
+for (auto [Idx, Field] : llvm::enumerate(RD->fields())) {
+  if (FR->getDecl() == Field->getCanonicalDecl())
+return Idx;
+}
+return std::nullopt;
+  }();
+
+  if (!MaybeFieldIdx.has_value()) {
+assert(false && "Field not fo