https://github.com/yxsamliu created
https://github.com/llvm/llvm-project/pull/175767
## Summary
`__builtin_amdgcn_global_load_lds` and related builtins crash when the size
argument is a template parameter. The semantic checker calls
`VerifyIntegerConstantExpression` without first checking if the expression is
value-dependent, causing an assertion failure.
**Affected builtins:**
- `__builtin_amdgcn_global_load_lds`
- `__builtin_amdgcn_raw_ptr_buffer_load_lds`
- `__builtin_amdgcn_struct_ptr_buffer_load_lds`
- `__builtin_amdgcn_load_to_lds`
**Reproducer:**
```cpp
template<int N>
__device__ void load(void* src, __shared__ void* dst) {
  __builtin_amdgcn_global_load_lds(src, dst, N, 0, 0); // Crashes
}
template __device__ void load<16>(void*, __shared__ void*);
```
**Error before fix:**
```
Assertion `!isValueDependent() && "Expression evaluator can't be called on a dependent expression."' failed
```
**Fix:** Add an `isValueDependent()` check before evaluating the constant
expression. A value-dependent size is accepted while the template definition is
parsed, and the size is validated when the template is instantiated.
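
The control flow of the fix can be sketched outside of Clang. The following is a minimal standalone model, not the code in `SemaAMDGPU.cpp`: `SizeArg`, `isValidSize`, and `checkSizeArg` are hypothetical names introduced for illustration, and an empty `std::optional` stands in for a value-dependent expression. It assumes the convention that the checker returns `true` on error.

```cpp
#include <optional>

// Hypothetical stand-in for the size argument. An empty optional models a
// value-dependent expression such as a template parameter `N`.
struct SizeArg {
  std::optional<int> value;
  bool isValueDependent() const { return !value.has_value(); }
};

// Sizes that the diagnostic in this PR lists as valid: 1, 2, 4, 12 or 16.
constexpr bool isValidSize(int n) {
  return n == 1 || n == 2 || n == 4 || n == 12 || n == 16;
}

// Returns true on error and false when there is nothing to report.
bool checkSizeArg(const SizeArg &arg) {
  // The guard: a dependent size has no value yet, so skip evaluation and
  // let the check run again once the value is known.
  if (arg.isValueDependent())
    return false;
  return !isValidSize(*arg.value);
}
```

In this model, `checkSizeArg(SizeArg{std::nullopt})` reports no error (the template-definition case), while `checkSizeArg(SizeArg{5})` reports one (the instantiation case with an invalid size).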
## Test plan
- [x] New lit test `clang/test/SemaHIP/amdgpu-global-load-lds-template.hip`
  - Tests that valid template parameter sizes compile without errors
  - Tests that invalid sizes are caught at template instantiation
- [x] All existing SemaHIP tests pass (17/17)
- [x] Related builtin tests pass
From a3cabc1e35f627de0152fea86b60d72a2cd5e5df Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu" <[email protected]>
Date: Tue, 13 Jan 2026 09:35:20 -0500
Subject: [PATCH] [AMDGPU] Fix builtin crash with template parameter size argument
__builtin_amdgcn_global_load_lds and related builtins (raw_ptr_buffer_load_lds,
struct_ptr_buffer_load_lds, load_to_lds) crash when the size argument is a
template parameter because the semantic checker calls
VerifyIntegerConstantExpression without first checking if the expression is
value-dependent.
This causes an assertion failure:
"Expression evaluator can't be called on a dependent expression."
The fix adds an isValueDependent() check before evaluating the constant
expression. Value-dependent expressions (like template parameters) are
accepted during template definition and properly checked during template
instantiation.
Fixes a regression reported by the FBGEMM team when building with ROCm 7.2
for gfx950.
---
clang/lib/Sema/SemaAMDGPU.cpp | 4 ++
.../amdgpu-global-load-lds-template.hip | 59 +++++++++++++++++++
2 files changed, 63 insertions(+)
create mode 100644 clang/test/SemaHIP/amdgpu-global-load-lds-template.hip
diff --git a/clang/lib/Sema/SemaAMDGPU.cpp b/clang/lib/Sema/SemaAMDGPU.cpp
index 9d154c65c932e..917da7360e7cf 100644
--- a/clang/lib/Sema/SemaAMDGPU.cpp
+++ b/clang/lib/Sema/SemaAMDGPU.cpp
@@ -44,6 +44,10 @@ bool SemaAMDGPU::CheckAMDGCNBuiltinFunctionCall(unsigned BuiltinID,
constexpr const int SizeIdx = 2;
llvm::APSInt Size;
Expr *ArgExpr = TheCall->getArg(SizeIdx);
+ // Check for value-dependent expressions (e.g., template parameters).
+ // These will be checked again during template instantiation.
+ if (ArgExpr->isValueDependent())
+ return false;
[[maybe_unused]] ExprResult R =
SemaRef.VerifyIntegerConstantExpression(ArgExpr, &Size);
assert(!R.isInvalid());
diff --git a/clang/test/SemaHIP/amdgpu-global-load-lds-template.hip b/clang/test/SemaHIP/amdgpu-global-load-lds-template.hip
new file mode 100644
index 0000000000000..48c5bdc7aa35b
--- /dev/null
+++ b/clang/test/SemaHIP/amdgpu-global-load-lds-template.hip
@@ -0,0 +1,59 @@
+// RUN: %clang_cc1 -std=c++20 -triple amdgcn -target-cpu gfx950 -fsyntax-only -fcuda-is-device -verify %s
+
+// Test that __builtin_amdgcn_global_load_lds and related builtins accept
+// template parameters as the size argument. These are value-dependent
+// expressions that should be accepted during template definition and
+// checked during template instantiation.
+
+#define __device__ __attribute__((device))
+#define __shared__ __attribute__((shared))
+
+template<int N>
+__device__ void test_global_load_lds_template(void* src, __shared__ void* dst) {
+ // Template parameter should be accepted as size argument
+ __builtin_amdgcn_global_load_lds(src, dst, N, 0, 0); // #load_lds
+}
+
+template<int N>
+__device__ void test_load_to_lds_template(void* src, __shared__ void* dst) {
+ __builtin_amdgcn_load_to_lds(src, dst, N, 0, 0); // #load_to_lds
+}
+
+// Test with valid sizes - these should compile without errors
+template __device__ void test_global_load_lds_template<1>(void*, __shared__ void*);
+template __device__ void test_global_load_lds_template<2>(void*, __shared__ void*);
+template __device__ void test_global_load_lds_template<4>(void*, __shared__ void*);
+template __device__ void test_global_load_lds_template<12>(void*, __shared__ void*);
+template __device__ void test_global_load_lds_template<16>(void*, __shared__ void*);
+
+template __device__ void test_load_to_lds_template<1>(void*, __shared__ void*);
+template __device__ void test_load_to_lds_template<2>(void*, __shared__ void*);
+template __device__ void test_load_to_lds_template<4>(void*, __shared__ void*);
+template __device__ void test_load_to_lds_template<12>(void*, __shared__ void*);
+template __device__ void test_load_to_lds_template<16>(void*, __shared__ void*);
+
+// Test with constexpr computed from template parameter
+template<int N>
+__device__ void test_computed_size(void* src, __shared__ void* dst) {
+ constexpr int Size = N * 2;
+ __builtin_amdgcn_global_load_lds(src, dst, Size, 0, 0); // #computed
+}
+
+template __device__ void test_computed_size<2>(void*, __shared__ void*); // Size = 4
+template __device__ void test_computed_size<8>(void*, __shared__ void*); // Size = 16
+
+// Test that invalid sizes are caught at instantiation time
+template __device__ void test_global_load_lds_template<5>(void*, __shared__ void*); // #inst1
+// expected-error@#load_lds {{invalid size value}}
+// expected-note@#load_lds {{size must be 1, 2, 4, 12 or 16}}
+// expected-note@#inst1 {{in instantiation of function template specialization 'test_global_load_lds_template<5>' requested here}}
+
+template __device__ void test_load_to_lds_template<7>(void*, __shared__ void*); // #inst2
+// expected-error@#load_to_lds {{invalid size value}}
+// expected-note@#load_to_lds {{size must be 1, 2, 4, 12 or 16}}
+// expected-note@#inst2 {{in instantiation of function template specialization 'test_load_to_lds_template<7>' requested here}}
+
+template __device__ void test_computed_size<3>(void*, __shared__ void*); // #inst3
+// expected-error@#computed {{invalid size value}}
+// expected-note@#computed {{size must be 1, 2, 4, 12 or 16}}
+// expected-note@#inst3 {{in instantiation of function template specialization 'test_computed_size<3>' requested here}}