llvmorg-github-actions[bot] wrote:

<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-clang

Author: Igor Gorban (igorban-intel)

<details>
<summary>Changes</summary>

Add CodeGen tests and extend Sema-negative tests for the Intel OpenCL
extension builtins declared in OpenCLBuiltins.td.

New CodeGen tests:

  clang/test/CodeGenOpenCL/intel-bfloat16-conversions.cl
  clang/test/CodeGenOpenCL/intel-subgroup-buffer-prefetch-builtins.cl
  clang/test/CodeGenOpenCL/intel-subgroup-local-block-io-builtins.cl
  clang/test/CodeGenOpenCL/intel-subgroups-builtins.cl
  clang/test/CodeGenOpenCL/intel-subgroups-char-builtins.cl
  clang/test/CodeGenOpenCL/intel-subgroups-long-builtins.cl
  clang/test/CodeGenOpenCL/intel-subgroups-short-builtins.cl

Extended Sema -verify tests covering the same extensions:

  clang/test/SemaOpenCL/intel-bfloat16-conversions-builtins.cl
  clang/test/SemaOpenCL/intel-subgroup-buffer-prefetch-builtins.cl
  clang/test/SemaOpenCL/intel-subgroup-local-block-io-builtins.cl
  clang/test/SemaOpenCL/intel-subgroups-builtins.cl
  clang/test/SemaOpenCL/intel-subgroups-char-builtins.cl
  clang/test/SemaOpenCL/intel-subgroups-long-builtins.cl
  clang/test/SemaOpenCL/intel-subgroups-short-builtins.cl

Extensions covered:

  cl_intel_bfloat16_conversions
  cl_intel_subgroup_buffer_prefetch
  cl_intel_subgroup_local_block_io
  cl_intel_subgroups
  cl_intel_subgroups_char
  cl_intel_subgroups_long
  cl_intel_subgroups_short

No functional change to the compiler; tests only.

---

Patch is 121.92 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/199968.diff


14 Files Affected:

- (added) clang/test/CodeGenOpenCL/intel-bfloat16-conversions.cl (+162)
- (added) clang/test/CodeGenOpenCL/intel-subgroup-buffer-prefetch-builtins.cl (+96)
- (added) clang/test/CodeGenOpenCL/intel-subgroup-local-block-io-builtins.cl (+319)
- (added) clang/test/CodeGenOpenCL/intel-subgroups-builtins.cl (+239)
- (added) clang/test/CodeGenOpenCL/intel-subgroups-char-builtins.cl (+185)
- (added) clang/test/CodeGenOpenCL/intel-subgroups-long-builtins.cl (+143)
- (added) clang/test/CodeGenOpenCL/intel-subgroups-short-builtins.cl (+182)
- (modified) clang/test/SemaOpenCL/intel-bfloat16-conversions-builtins.cl (+22-1)
- (modified) clang/test/SemaOpenCL/intel-subgroup-buffer-prefetch-builtins.cl (+42)
- (modified) clang/test/SemaOpenCL/intel-subgroup-local-block-io-builtins.cl (+57)
- (modified) clang/test/SemaOpenCL/intel-subgroups-builtins.cl (+39)
- (modified) clang/test/SemaOpenCL/intel-subgroups-char-builtins.cl (+32)
- (modified) clang/test/SemaOpenCL/intel-subgroups-long-builtins.cl (+21-1)
- (modified) clang/test/SemaOpenCL/intel-subgroups-short-builtins.cl (+32)
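
The `align` values and the `<3 x ...>` to `<4 x ...>` shuffles in the CodeGen checks follow from the OpenCL C layout rule that a vector is aligned to its own size and that a 3-component vector is sized like a 4-component one. A minimal sketch of that arithmetic; the helper name `vector_storage` is hypothetical and is not part of the patch:

```python
def vector_storage(elem_bytes, width):
    """Return (stored_width, align_bytes) for an OpenCL C vector type.

    Assumes the OpenCL C layout rule: a vector is aligned to its own size,
    and a 3-component vector is sized like a 4-component one.
    """
    stored_width = 4 if width == 3 else width
    return stored_width, elem_bytes * stored_width


# Print the layout for the vector widths exercised by the bfloat16 test.
for name, elem_bytes in (("float", 4), ("ushort", 2)):
    for width in (2, 3, 4, 8, 16):
        stored, align = vector_storage(elem_bytes, width)
        print(f"{name}{width}: stored as {stored} elements, align {align}")
```

For `float3` this gives four stored elements and 16-byte alignment, which is why the `float3` test stores a `<4 x float>` with `align 16` before narrowing back to `<3 x float>`.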


``````````diff
diff --git a/clang/test/CodeGenOpenCL/intel-bfloat16-conversions.cl b/clang/test/CodeGenOpenCL/intel-bfloat16-conversions.cl
new file mode 100644
index 0000000000000..d0a0e97150ad6
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/intel-bfloat16-conversions.cl
@@ -0,0 +1,162 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 6
+// RUN: %clang_cc1 %s -triple spir-unknown-unknown -finclude-default-header -fdeclare-opencl-builtins -cl-std=CL3.0 -emit-llvm -o - -O0 | FileCheck %s
+
+// CHECK-LABEL: define dso_local spir_func zeroext i16 @test_convert_bfloat16_as_ushort(
+// CHECK-SAME: float noundef [[SOURCE:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[SOURCE_ADDR:%.*]] = alloca float, align 4
+// CHECK-NEXT:    store float [[SOURCE]], ptr [[SOURCE_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load float, ptr [[SOURCE_ADDR]], align 4
+// CHECK-NEXT:    [[CALL:%.*]] = call spir_func zeroext i16 @_Z32intel_convert_bfloat16_as_ushortf(float noundef [[TMP0]]) #[[ATTR2:[0-9]+]]
+// CHECK-NEXT:    ret i16 [[CALL]]
+//
+ushort test_convert_bfloat16_as_ushort(float source) {
+  return intel_convert_bfloat16_as_ushort(source);
+}
+
+// CHECK-LABEL: define dso_local spir_func <2 x i16> @test_convert_bfloat162_as_ushort2(
+// CHECK-SAME: <2 x float> noundef [[SOURCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[SOURCE_ADDR:%.*]] = alloca <2 x float>, align 8
+// CHECK-NEXT:    store <2 x float> [[SOURCE]], ptr [[SOURCE_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load <2 x float>, ptr [[SOURCE_ADDR]], align 8
+// CHECK-NEXT:    [[CALL:%.*]] = call spir_func <2 x i16> @_Z34intel_convert_bfloat162_as_ushort2Dv2_f(<2 x float> noundef [[TMP0]]) #[[ATTR2]]
+// CHECK-NEXT:    ret <2 x i16> [[CALL]]
+//
+ushort2 test_convert_bfloat162_as_ushort2(float2 source) {
+  return intel_convert_bfloat162_as_ushort2(source);
+}
+
+// CHECK-LABEL: define dso_local spir_func <3 x i16> @test_convert_bfloat163_as_ushort3(
+// CHECK-SAME: <3 x float> noundef [[SOURCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[SOURCE_ADDR:%.*]] = alloca <3 x float>, align 16
+// CHECK-NEXT:    [[EXTRACTVEC:%.*]] = shufflevector <3 x float> [[SOURCE]], <3 x float> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+// CHECK-NEXT:    store <4 x float> [[EXTRACTVEC]], ptr [[SOURCE_ADDR]], align 16
+// CHECK-NEXT:    [[LOADVECN:%.*]] = load <4 x float>, ptr [[SOURCE_ADDR]], align 16
+// CHECK-NEXT:    [[EXTRACTVEC1:%.*]] = shufflevector <4 x float> [[LOADVECN]], <4 x float> poison, <3 x i32> <i32 0, i32 1, i32 2>
+// CHECK-NEXT:    [[CALL:%.*]] = call spir_func <3 x i16> @_Z34intel_convert_bfloat163_as_ushort3Dv3_f(<3 x float> noundef [[EXTRACTVEC1]]) #[[ATTR2]]
+// CHECK-NEXT:    ret <3 x i16> [[CALL]]
+//
+ushort3 test_convert_bfloat163_as_ushort3(float3 source) {
+  return intel_convert_bfloat163_as_ushort3(source);
+}
+
+// CHECK-LABEL: define dso_local spir_func <4 x i16> @test_convert_bfloat164_as_ushort4(
+// CHECK-SAME: <4 x float> noundef [[SOURCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[SOURCE_ADDR:%.*]] = alloca <4 x float>, align 16
+// CHECK-NEXT:    store <4 x float> [[SOURCE]], ptr [[SOURCE_ADDR]], align 16
+// CHECK-NEXT:    [[TMP0:%.*]] = load <4 x float>, ptr [[SOURCE_ADDR]], align 16
+// CHECK-NEXT:    [[CALL:%.*]] = call spir_func <4 x i16> @_Z34intel_convert_bfloat164_as_ushort4Dv4_f(<4 x float> noundef [[TMP0]]) #[[ATTR2]]
+// CHECK-NEXT:    ret <4 x i16> [[CALL]]
+//
+ushort4 test_convert_bfloat164_as_ushort4(float4 source) {
+  return intel_convert_bfloat164_as_ushort4(source);
+}
+
+// CHECK-LABEL: define dso_local spir_func <8 x i16> @test_convert_bfloat168_as_ushort8(
+// CHECK-SAME: <8 x float> noundef [[SOURCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[SOURCE_ADDR:%.*]] = alloca <8 x float>, align 32
+// CHECK-NEXT:    store <8 x float> [[SOURCE]], ptr [[SOURCE_ADDR]], align 32
+// CHECK-NEXT:    [[TMP0:%.*]] = load <8 x float>, ptr [[SOURCE_ADDR]], align 32
+// CHECK-NEXT:    [[CALL:%.*]] = call spir_func <8 x i16> @_Z34intel_convert_bfloat168_as_ushort8Dv8_f(<8 x float> noundef [[TMP0]]) #[[ATTR2]]
+// CHECK-NEXT:    ret <8 x i16> [[CALL]]
+//
+ushort8 test_convert_bfloat168_as_ushort8(float8 source) {
+  return intel_convert_bfloat168_as_ushort8(source);
+}
+
+// CHECK-LABEL: define dso_local spir_func <16 x i16> @test_convert_bfloat1616_as_ushort16(
+// CHECK-SAME: <16 x float> noundef [[SOURCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[SOURCE_ADDR:%.*]] = alloca <16 x float>, align 64
+// CHECK-NEXT:    store <16 x float> [[SOURCE]], ptr [[SOURCE_ADDR]], align 64
+// CHECK-NEXT:    [[TMP0:%.*]] = load <16 x float>, ptr [[SOURCE_ADDR]], align 64
+// CHECK-NEXT:    [[CALL:%.*]] = call spir_func <16 x i16> @_Z36intel_convert_bfloat1616_as_ushort16Dv16_f(<16 x float> noundef [[TMP0]]) #[[ATTR2]]
+// CHECK-NEXT:    ret <16 x i16> [[CALL]]
+//
+ushort16 test_convert_bfloat1616_as_ushort16(float16 source) {
+  return intel_convert_bfloat1616_as_ushort16(source);
+}
+
+// CHECK-LABEL: define dso_local spir_func float @test_convert_as_bfloat16_float(
+// CHECK-SAME: i16 noundef zeroext [[SOURCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[SOURCE_ADDR:%.*]] = alloca i16, align 2
+// CHECK-NEXT:    store i16 [[SOURCE]], ptr [[SOURCE_ADDR]], align 2
+// CHECK-NEXT:    [[TMP0:%.*]] = load i16, ptr [[SOURCE_ADDR]], align 2
+// CHECK-NEXT:    [[CALL:%.*]] = call spir_func float @_Z31intel_convert_as_bfloat16_floatt(i16 noundef zeroext [[TMP0]]) #[[ATTR2]]
+// CHECK-NEXT:    ret float [[CALL]]
+//
+float test_convert_as_bfloat16_float(ushort source) {
+  return intel_convert_as_bfloat16_float(source);
+}
+
+// CHECK-LABEL: define dso_local spir_func <2 x float> @test_convert_as_bfloat162_float2(
+// CHECK-SAME: <2 x i16> noundef [[SOURCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[SOURCE_ADDR:%.*]] = alloca <2 x i16>, align 4
+// CHECK-NEXT:    store <2 x i16> [[SOURCE]], ptr [[SOURCE_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load <2 x i16>, ptr [[SOURCE_ADDR]], align 4
+// CHECK-NEXT:    [[CALL:%.*]] = call spir_func <2 x float> @_Z33intel_convert_as_bfloat162_float2Dv2_t(<2 x i16> noundef [[TMP0]]) #[[ATTR2]]
+// CHECK-NEXT:    ret <2 x float> [[CALL]]
+//
+float2 test_convert_as_bfloat162_float2(ushort2 source) {
+  return intel_convert_as_bfloat162_float2(source);
+}
+
+// CHECK-LABEL: define dso_local spir_func <3 x float> @test_convert_as_bfloat163_float3(
+// CHECK-SAME: <3 x i16> noundef [[SOURCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[SOURCE_ADDR:%.*]] = alloca <3 x i16>, align 8
+// CHECK-NEXT:    [[EXTRACTVEC:%.*]] = shufflevector <3 x i16> [[SOURCE]], <3 x i16> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+// CHECK-NEXT:    store <4 x i16> [[EXTRACTVEC]], ptr [[SOURCE_ADDR]], align 8
+// CHECK-NEXT:    [[LOADVECN:%.*]] = load <4 x i16>, ptr [[SOURCE_ADDR]], align 8
+// CHECK-NEXT:    [[EXTRACTVEC1:%.*]] = shufflevector <4 x i16> [[LOADVECN]], <4 x i16> poison, <3 x i32> <i32 0, i32 1, i32 2>
+// CHECK-NEXT:    [[CALL:%.*]] = call spir_func <3 x float> @_Z33intel_convert_as_bfloat163_float3Dv3_t(<3 x i16> noundef [[EXTRACTVEC1]]) #[[ATTR2]]
+// CHECK-NEXT:    ret <3 x float> [[CALL]]
+//
+float3 test_convert_as_bfloat163_float3(ushort3 source) {
+  return intel_convert_as_bfloat163_float3(source);
+}
+
+// CHECK-LABEL: define dso_local spir_func <4 x float> @test_convert_as_bfloat164_float4(
+// CHECK-SAME: <4 x i16> noundef [[SOURCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[SOURCE_ADDR:%.*]] = alloca <4 x i16>, align 8
+// CHECK-NEXT:    store <4 x i16> [[SOURCE]], ptr [[SOURCE_ADDR]], align 8
+// CHECK-NEXT:    [[TMP0:%.*]] = load <4 x i16>, ptr [[SOURCE_ADDR]], align 8
+// CHECK-NEXT:    [[CALL:%.*]] = call spir_func <4 x float> @_Z33intel_convert_as_bfloat164_float4Dv4_t(<4 x i16> noundef [[TMP0]]) #[[ATTR2]]
+// CHECK-NEXT:    ret <4 x float> [[CALL]]
+//
+float4 test_convert_as_bfloat164_float4(ushort4 source) {
+  return intel_convert_as_bfloat164_float4(source);
+}
+
+// CHECK-LABEL: define dso_local spir_func <8 x float> @test_convert_as_bfloat168_float8(
+// CHECK-SAME: <8 x i16> noundef [[SOURCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[SOURCE_ADDR:%.*]] = alloca <8 x i16>, align 16
+// CHECK-NEXT:    store <8 x i16> [[SOURCE]], ptr [[SOURCE_ADDR]], align 16
+// CHECK-NEXT:    [[TMP0:%.*]] = load <8 x i16>, ptr [[SOURCE_ADDR]], align 16
+// CHECK-NEXT:    [[CALL:%.*]] = call spir_func <8 x float> @_Z33intel_convert_as_bfloat168_float8Dv8_t(<8 x i16> noundef [[TMP0]]) #[[ATTR2]]
+// CHECK-NEXT:    ret <8 x float> [[CALL]]
+//
+float8 test_convert_as_bfloat168_float8(ushort8 source) {
+  return intel_convert_as_bfloat168_float8(source);
+}
+
+// CHECK-LABEL: define dso_local spir_func <16 x float> @test_convert_as_bfloat1616_float16(
+// CHECK-SAME: <16 x i16> noundef [[SOURCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[SOURCE_ADDR:%.*]] = alloca <16 x i16>, align 32
+// CHECK-NEXT:    store <16 x i16> [[SOURCE]], ptr [[SOURCE_ADDR]], align 32
+// CHECK-NEXT:    [[TMP0:%.*]] = load <16 x i16>, ptr [[SOURCE_ADDR]], align 32
+// CHECK-NEXT:    [[CALL:%.*]] = call spir_func <16 x float> @_Z35intel_convert_as_bfloat1616_float16Dv16_t(<16 x i16> noundef [[TMP0]]) #[[ATTR2]]
+// CHECK-NEXT:    ret <16 x float> [[CALL]]
+//
+float16 test_convert_as_bfloat1616_float16(ushort16 source) {
+  return intel_convert_as_bfloat1616_float16(source);
+}
diff --git a/clang/test/CodeGenOpenCL/intel-subgroup-buffer-prefetch-builtins.cl b/clang/test/CodeGenOpenCL/intel-subgroup-buffer-prefetch-builtins.cl
new file mode 100644
index 0000000000000..6a543431e5751
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/intel-subgroup-buffer-prefetch-builtins.cl
@@ -0,0 +1,96 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 6
+// RUN: %clang_cc1 %s -triple spir-unknown-unknown -finclude-default-header -fdeclare-opencl-builtins -cl-std=CL3.0 -emit-llvm -o - -O0 | FileCheck %s
+
+// CHECK-LABEL: define dso_local spir_func void @test_block_prefetch_ui(
+// CHECK-SAME: ptr addrspace(1) noundef [[IN:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[IN_ADDR:%.*]] = alloca ptr addrspace(1), align 4
+// CHECK-NEXT:    store ptr addrspace(1) [[IN]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z33intel_sub_group_block_prefetch_uiPU3AS1Kj(ptr addrspace(1) noundef [[TMP0]]) #[[ATTR2:[0-9]+]]
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z34intel_sub_group_block_prefetch_ui2PU3AS1Kj(ptr addrspace(1) noundef [[TMP1]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP2:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z34intel_sub_group_block_prefetch_ui4PU3AS1Kj(ptr addrspace(1) noundef [[TMP2]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP3:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z34intel_sub_group_block_prefetch_ui8PU3AS1Kj(ptr addrspace(1) noundef [[TMP3]]) #[[ATTR2]]
+// CHECK-NEXT:    ret void
+//
+void test_block_prefetch_ui(const __global uint *in) {
+  intel_sub_group_block_prefetch_ui(in);
+  intel_sub_group_block_prefetch_ui2(in);
+  intel_sub_group_block_prefetch_ui4(in);
+  intel_sub_group_block_prefetch_ui8(in);
+}
+
+// CHECK-LABEL: define dso_local spir_func void @test_block_prefetch_us(
+// CHECK-SAME: ptr addrspace(1) noundef [[IN:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[IN_ADDR:%.*]] = alloca ptr addrspace(1), align 4
+// CHECK-NEXT:    store ptr addrspace(1) [[IN]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z33intel_sub_group_block_prefetch_usPU3AS1Kt(ptr addrspace(1) noundef [[TMP0]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z34intel_sub_group_block_prefetch_us2PU3AS1Kt(ptr addrspace(1) noundef [[TMP1]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP2:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z34intel_sub_group_block_prefetch_us4PU3AS1Kt(ptr addrspace(1) noundef [[TMP2]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP3:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z34intel_sub_group_block_prefetch_us8PU3AS1Kt(ptr addrspace(1) noundef [[TMP3]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP4:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z35intel_sub_group_block_prefetch_us16PU3AS1Kt(ptr addrspace(1) noundef [[TMP4]]) #[[ATTR2]]
+// CHECK-NEXT:    ret void
+//
+void test_block_prefetch_us(const __global ushort *in) {
+  intel_sub_group_block_prefetch_us(in);
+  intel_sub_group_block_prefetch_us2(in);
+  intel_sub_group_block_prefetch_us4(in);
+  intel_sub_group_block_prefetch_us8(in);
+  intel_sub_group_block_prefetch_us16(in);
+}
+
+// CHECK-LABEL: define dso_local spir_func void @test_block_prefetch_uc(
+// CHECK-SAME: ptr addrspace(1) noundef [[IN:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[IN_ADDR:%.*]] = alloca ptr addrspace(1), align 4
+// CHECK-NEXT:    store ptr addrspace(1) [[IN]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z33intel_sub_group_block_prefetch_ucPU3AS1Kh(ptr addrspace(1) noundef [[TMP0]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z34intel_sub_group_block_prefetch_uc2PU3AS1Kh(ptr addrspace(1) noundef [[TMP1]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP2:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z34intel_sub_group_block_prefetch_uc4PU3AS1Kh(ptr addrspace(1) noundef [[TMP2]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP3:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z34intel_sub_group_block_prefetch_uc8PU3AS1Kh(ptr addrspace(1) noundef [[TMP3]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP4:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z35intel_sub_group_block_prefetch_uc16PU3AS1Kh(ptr addrspace(1) noundef [[TMP4]]) #[[ATTR2]]
+// CHECK-NEXT:    ret void
+//
+void test_block_prefetch_uc(const __global uchar *in) {
+  intel_sub_group_block_prefetch_uc(in);
+  intel_sub_group_block_prefetch_uc2(in);
+  intel_sub_group_block_prefetch_uc4(in);
+  intel_sub_group_block_prefetch_uc8(in);
+  intel_sub_group_block_prefetch_uc16(in);
+}
+
+// CHECK-LABEL: define dso_local spir_func void @test_block_prefetch_ul(
+// CHECK-SAME: ptr addrspace(1) noundef [[IN:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[IN_ADDR:%.*]] = alloca ptr addrspace(1), align 4
+// CHECK-NEXT:    store ptr addrspace(1) [[IN]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z33intel_sub_group_block_prefetch_ulPU3AS1Km(ptr addrspace(1) noundef [[TMP0]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z34intel_sub_group_block_prefetch_ul2PU3AS1Km(ptr addrspace(1) noundef [[TMP1]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP2:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z34intel_sub_group_block_prefetch_ul4PU3AS1Km(ptr addrspace(1) noundef [[TMP2]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP3:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z34intel_sub_group_block_prefetch_ul8PU3AS1Km(ptr addrspace(1) noundef [[TMP3]]) #[[ATTR2]]
+// CHECK-NEXT:    ret void
+//
+void test_block_prefetch_ul(const __global ulong *in) {
+  intel_sub_group_block_prefetch_ul(in);
+  intel_sub_group_block_prefetch_ul2(in);
+  intel_sub_group_block_prefetch_ul4(in);
+  intel_sub_group_block_prefetch_ul8(in);
+}
diff --git a/clang/test/CodeGenOpenCL/intel-subgroup-local-block-io-builtins.cl b/clang/test/CodeGenOpenCL/intel-subgroup-local-block-io-builtins.cl
new file mode 100644
index 0000000000000..63253ceb40384
--- /dev/null
+++ b/clang/test/CodeGenOpenCL/intel-subgroup-local-block-io-builtins.cl
@@ -0,0 +1,319 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 6
+// RUN: %clang_cc1 %s -triple spir-unknown-unknown -finclude-default-header -fdeclare-opencl-builtins -cl-std=CL3.0 -emit-llvm -o - -O0 | FileCheck %s
+
+// CHECK-LABEL: define dso_local spir_func void @test_block_read_local(
+// CHECK-SAME: ptr addrspace(3) noundef [[IN:%.*]], ptr addrspace(3) noundef [[OUT:%.*]], i32 noundef [[VALUE:%.*]], <2 x i32> noundef [[VALUE2:%.*]], <4 x i32> noundef [[VALUE4:%.*]], <8 x i32> noundef [[VALUE8:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[IN_ADDR:%.*]] = alloca ptr addrspace(3), align 4
+// CHECK-NEXT:    [[OUT_ADDR:%.*]] = alloca ptr addrspace(3), align 4
+// CHECK-NEXT:    [[VALUE_ADDR:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[VALUE2_ADDR:%.*]] = alloca <2 x i32>, align 8
+// CHECK-NEXT:    [[VALUE4_ADDR:%.*]] = alloca <4 x i32>, align 16
+// CHECK-NEXT:    [[VALUE8_ADDR:%.*]] = alloca <8 x i32>, align 32
+// CHECK-NEXT:    [[V:%.*]] = alloca i32, align 4
+// CHECK-NEXT:    [[V2:%.*]] = alloca <2 x i32>, align 8
+// CHECK-NEXT:    [[V4:%.*]] = alloca <4 x i32>, align 16
+// CHECK-NEXT:    [[V8:%.*]] = alloca <8 x i32>, align 32
+// CHECK-NEXT:    store ptr addrspace(3) [[IN]], ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    store ptr addrspace(3) [[OUT]], ptr [[OUT_ADDR]], align 4
+// CHECK-NEXT:    store i32 [[VALUE]], ptr [[VALUE_ADDR]], align 4
+// CHECK-NEXT:    store <2 x i32> [[VALUE2]], ptr [[VALUE2_ADDR]], align 8
+// CHECK-NEXT:    store <4 x i32> [[VALUE4]], ptr [[VALUE4_ADDR]], align 16
+// CHECK-NEXT:    store <8 x i32> [[VALUE8]], ptr [[VALUE8_ADDR]], align 32
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr addrspace(3), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[CALL:%.*]] = call spir_func i32 @_Z26intel_sub_group_block_readPU3AS3Kj(ptr addrspace(3) noundef [[TMP0]]) #[[ATTR2:[0-9]+]]
+// CHECK-NEXT:    store i32 [[CALL]], ptr [[V]], align 4
+// CHECK-NEXT:    [[TMP1:%.*]] = load ptr addrspace(3), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[CALL1:%.*]] = call spir_func <2 x i32> @_Z27intel_sub_group_block_read2PU3AS3Kj(ptr addrspace(3) noundef [[TMP1]]) #[[ATTR2]]
+// CHECK-NEXT:    store <2 x i32> [[CALL1]], ptr [[V2]], align 8
+// CHECK-NEXT:    [[TMP2:%.*]] = load ptr addrspace(3), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[CALL2:%.*]] = call spir_func <4 x i32> @_Z27intel_sub_group_block_read4PU3AS3Kj(ptr addrspace(3) noundef [[TMP2]]) #[[ATTR2]]
+// CHECK-NEXT:    store <4 x i32> [[CALL2]], ptr [[V4]], align 16
+// CHECK-NEXT:    [[TMP3:%.*]] = load ptr addrspace(3), ptr [[IN_ADDR]], align 4
+// CHECK-NEXT:    [[CALL3:%.*]] = call spir_func <8 x i32> @_Z27intel_sub_group_block_read8PU3AS3Kj(ptr addrspace(3) noundef [[TMP3]]) #[[ATTR2]]
+// CHECK-NEXT:    store <8 x i32> [[CALL3]], ptr [[V8]], align 32
+// CHECK-NEXT:    [[TMP4:%.*]] = load ptr addrspace(3), ptr [[OUT_ADDR]], align 4
+// CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[VALUE_ADDR]], align 4
+// CHECK-NEXT:    call spir_func void @_Z27intel_sub_group_block_writePU3AS3jj(ptr addrspace(3) noundef [[TMP4]], i32 noundef [[TMP5]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP6:%.*]] = load ptr addrspace(3), ptr [[OUT_ADDR]], align 4
+// CHECK-NEXT:    [[TMP7:%.*]] = load <2 x i32>, ptr [[VALUE2_ADDR]], align 8
+// CHECK-NEXT:    call spir_func void @_Z28intel_sub_group_block_write2PU3AS3jDv2_j(ptr addrspace(3) noundef [[TMP6]], <2 x i32> noundef [[TMP7]]) #[[ATTR2]]
+// CHECK-NEXT:    [[TMP8:%.*]] = load ptr addrspace(3), ptr [[OUT_ADDR]], align 4
+// CHECK-NEXT:    [[TMP9:%.*]] = load <4 x i3...
[truncated]

``````````
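
The callee names in the CHECK lines above are Itanium-style mangled names: `_Z`, the length-prefixed function name, then one code per parameter (`f` float, `t` ushort, `Dv2_f` a 2-element float vector, `PU3AS1Kj` a pointer to `const uint` in address space 1). The sketch below rebuilds a few of them for illustration. It is not Clang's mangler; the helper names are hypothetical, and it only handles the simple signatures seen in this diff, with no substitutions:

```python
# Hypothetical helpers, not part of the patch or of Clang.
# Itanium codes for the scalar types that appear in the diff.
SCALAR_CODES = {"float": "f", "ushort": "t", "uchar": "h", "uint": "j", "ulong": "m"}


def scalar(name):
    """Code for a scalar parameter, e.g. float -> f."""
    return SCALAR_CODES[name]


def vector(name, width):
    """Code for a vector parameter, e.g. float2 -> Dv2_f."""
    return f"Dv{width}_{SCALAR_CODES[name]}"


def pointer(name, addrspace, const=False):
    """Code for a pointer into a numbered address space, e.g. PU3AS1Kj."""
    qual = f"AS{addrspace}"  # vendor qualifier, emitted as U<length><text>
    return f"PU{len(qual)}{qual}" + ("K" if const else "") + SCALAR_CODES[name]


def mangle(name, *params):
    """Join the length-prefixed function name with its parameter codes."""
    return f"_Z{len(name)}{name}" + "".join(params)


print(mangle("intel_convert_bfloat16_as_ushort", scalar("float")))
print(mangle("intel_convert_bfloat162_as_ushort2", vector("float", 2)))
print(mangle("intel_sub_group_block_prefetch_ui", pointer("uint", 1, const=True)))
```

Comparing the printed names with the `call spir_func` lines is a quick way to confirm which overload a given CHECK line pins down.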

</details>


https://github.com/llvm/llvm-project/pull/199968
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
