llvmorg-github-actions[bot] wrote:
<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-clang

Author: Igor Gorban (igorban-intel)

<details>
<summary>Changes</summary>

Add CodeGen tests and extend Sema-negative tests for the Intel OpenCL extension builtins declared in OpenCLBuiltins.td.

New CodeGen tests:
- clang/test/CodeGenOpenCL/intel-bfloat16-conversions.cl
- clang/test/CodeGenOpenCL/intel-subgroup-buffer-prefetch-builtins.cl
- clang/test/CodeGenOpenCL/intel-subgroup-local-block-io-builtins.cl
- clang/test/CodeGenOpenCL/intel-subgroups-builtins.cl
- clang/test/CodeGenOpenCL/intel-subgroups-char-builtins.cl
- clang/test/CodeGenOpenCL/intel-subgroups-long-builtins.cl
- clang/test/CodeGenOpenCL/intel-subgroups-short-builtins.cl

Extended Sema -verify tests covering the same extensions:
- clang/test/SemaOpenCL/intel-bfloat16-conversions-builtins.cl
- clang/test/SemaOpenCL/intel-subgroup-buffer-prefetch-builtins.cl
- clang/test/SemaOpenCL/intel-subgroup-local-block-io-builtins.cl
- clang/test/SemaOpenCL/intel-subgroups-builtins.cl
- clang/test/SemaOpenCL/intel-subgroups-char-builtins.cl
- clang/test/SemaOpenCL/intel-subgroups-long-builtins.cl
- clang/test/SemaOpenCL/intel-subgroups-short-builtins.cl

Extensions covered:
- cl_intel_bfloat16_conversions
- cl_intel_subgroup_buffer_prefetch
- cl_intel_subgroup_local_block_io
- cl_intel_subgroups
- cl_intel_subgroups_char
- cl_intel_subgroups_long
- cl_intel_subgroups_short

No functional change to the compiler; tests only.

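A note for readers of the CHECK lines in the patch below: the CodeGen tests match calls to Itanium-mangled symbols such as `_Z32intel_convert_bfloat16_as_ushortf`, where the number after `_Z` is the length of the builtin's name and the trailing characters encode the parameter types (`f` for `float`, `t` for `ushort`, `Dv2_f` for `float2`, `PU3AS1Kj` for a `const __global uint *`). The C sketch below only illustrates that layout. `mangle_simple` is a hypothetical helper, not something from the PR or from clang, and it assumes the caller supplies the already-encoded parameter string.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical illustration, not part of the PR or of clang.
 * Builds "_Z<len><name><params>" for a plain (non-nested) function name.
 * `param_codes` must already be the Itanium-encoded parameter list,
 * e.g. "f" for float or "PU3AS1Kj" for `const __global uint *`.
 * Returns the number of characters snprintf would have written.
 */
static int mangle_simple(char *out, size_t cap,
                         const char *name, const char *param_codes) {
    return snprintf(out, cap, "_Z%zu%s%s", strlen(name), name, param_codes);
}
```

Calling `mangle_simple(buf, sizeof buf, "intel_convert_bfloat16_as_ushort", "f")` fills `buf` with `_Z32intel_convert_bfloat16_as_ushortf`, the symbol expected by the first CHECK block. This can serve as a quick sanity check when reading the autogenerated assertions; the authoritative names still come from running `utils/update_cc_test_checks.py`.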
--- Patch is 121.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/199968.diff 14 Files Affected: - (added) clang/test/CodeGenOpenCL/intel-bfloat16-conversions.cl (+162) - (added) clang/test/CodeGenOpenCL/intel-subgroup-buffer-prefetch-builtins.cl (+96) - (added) clang/test/CodeGenOpenCL/intel-subgroup-local-block-io-builtins.cl (+319) - (added) clang/test/CodeGenOpenCL/intel-subgroups-builtins.cl (+239) - (added) clang/test/CodeGenOpenCL/intel-subgroups-char-builtins.cl (+185) - (added) clang/test/CodeGenOpenCL/intel-subgroups-long-builtins.cl (+143) - (added) clang/test/CodeGenOpenCL/intel-subgroups-short-builtins.cl (+182) - (modified) clang/test/SemaOpenCL/intel-bfloat16-conversions-builtins.cl (+22-1) - (modified) clang/test/SemaOpenCL/intel-subgroup-buffer-prefetch-builtins.cl (+42) - (modified) clang/test/SemaOpenCL/intel-subgroup-local-block-io-builtins.cl (+57) - (modified) clang/test/SemaOpenCL/intel-subgroups-builtins.cl (+39) - (modified) clang/test/SemaOpenCL/intel-subgroups-char-builtins.cl (+32) - (modified) clang/test/SemaOpenCL/intel-subgroups-long-builtins.cl (+21-1) - (modified) clang/test/SemaOpenCL/intel-subgroups-short-builtins.cl (+32) ``````````diff diff --git a/clang/test/CodeGenOpenCL/intel-bfloat16-conversions.cl b/clang/test/CodeGenOpenCL/intel-bfloat16-conversions.cl new file mode 100644 index 0000000000000..d0a0e97150ad6 --- /dev/null +++ b/clang/test/CodeGenOpenCL/intel-bfloat16-conversions.cl @@ -0,0 +1,162 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 6 +// RUN: %clang_cc1 %s -triple spir-unknown-unknown -finclude-default-header -fdeclare-opencl-builtins -cl-std=CL3.0 -emit-llvm -o - -O0 | FileCheck %s + +// CHECK-LABEL: define dso_local spir_func zeroext i16 @test_convert_bfloat16_as_ushort( +// CHECK-SAME: float noundef [[SOURCE:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[SOURCE_ADDR:%.*]] = alloca 
float, align 4 +// CHECK-NEXT: store float [[SOURCE]], ptr [[SOURCE_ADDR]], align 4 +// CHECK-NEXT: [[TMP0:%.*]] = load float, ptr [[SOURCE_ADDR]], align 4 +// CHECK-NEXT: [[CALL:%.*]] = call spir_func zeroext i16 @_Z32intel_convert_bfloat16_as_ushortf(float noundef [[TMP0]]) #[[ATTR2:[0-9]+]] +// CHECK-NEXT: ret i16 [[CALL]] +// +ushort test_convert_bfloat16_as_ushort(float source) { + return intel_convert_bfloat16_as_ushort(source); +} + +// CHECK-LABEL: define dso_local spir_func <2 x i16> @test_convert_bfloat162_as_ushort2( +// CHECK-SAME: <2 x float> noundef [[SOURCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[SOURCE_ADDR:%.*]] = alloca <2 x float>, align 8 +// CHECK-NEXT: store <2 x float> [[SOURCE]], ptr [[SOURCE_ADDR]], align 8 +// CHECK-NEXT: [[TMP0:%.*]] = load <2 x float>, ptr [[SOURCE_ADDR]], align 8 +// CHECK-NEXT: [[CALL:%.*]] = call spir_func <2 x i16> @_Z34intel_convert_bfloat162_as_ushort2Dv2_f(<2 x float> noundef [[TMP0]]) #[[ATTR2]] +// CHECK-NEXT: ret <2 x i16> [[CALL]] +// +ushort2 test_convert_bfloat162_as_ushort2(float2 source) { + return intel_convert_bfloat162_as_ushort2(source); +} + +// CHECK-LABEL: define dso_local spir_func <3 x i16> @test_convert_bfloat163_as_ushort3( +// CHECK-SAME: <3 x float> noundef [[SOURCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[SOURCE_ADDR:%.*]] = alloca <3 x float>, align 16 +// CHECK-NEXT: [[EXTRACTVEC:%.*]] = shufflevector <3 x float> [[SOURCE]], <3 x float> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3> +// CHECK-NEXT: store <4 x float> [[EXTRACTVEC]], ptr [[SOURCE_ADDR]], align 16 +// CHECK-NEXT: [[LOADVECN:%.*]] = load <4 x float>, ptr [[SOURCE_ADDR]], align 16 +// CHECK-NEXT: [[EXTRACTVEC1:%.*]] = shufflevector <4 x float> [[LOADVECN]], <4 x float> poison, <3 x i32> <i32 0, i32 1, i32 2> +// CHECK-NEXT: [[CALL:%.*]] = call spir_func <3 x i16> @_Z34intel_convert_bfloat163_as_ushort3Dv3_f(<3 x float> noundef [[EXTRACTVEC1]]) #[[ATTR2]] +// CHECK-NEXT: ret 
<3 x i16> [[CALL]] +// +ushort3 test_convert_bfloat163_as_ushort3(float3 source) { + return intel_convert_bfloat163_as_ushort3(source); +} + +// CHECK-LABEL: define dso_local spir_func <4 x i16> @test_convert_bfloat164_as_ushort4( +// CHECK-SAME: <4 x float> noundef [[SOURCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[SOURCE_ADDR:%.*]] = alloca <4 x float>, align 16 +// CHECK-NEXT: store <4 x float> [[SOURCE]], ptr [[SOURCE_ADDR]], align 16 +// CHECK-NEXT: [[TMP0:%.*]] = load <4 x float>, ptr [[SOURCE_ADDR]], align 16 +// CHECK-NEXT: [[CALL:%.*]] = call spir_func <4 x i16> @_Z34intel_convert_bfloat164_as_ushort4Dv4_f(<4 x float> noundef [[TMP0]]) #[[ATTR2]] +// CHECK-NEXT: ret <4 x i16> [[CALL]] +// +ushort4 test_convert_bfloat164_as_ushort4(float4 source) { + return intel_convert_bfloat164_as_ushort4(source); +} + +// CHECK-LABEL: define dso_local spir_func <8 x i16> @test_convert_bfloat168_as_ushort8( +// CHECK-SAME: <8 x float> noundef [[SOURCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[SOURCE_ADDR:%.*]] = alloca <8 x float>, align 32 +// CHECK-NEXT: store <8 x float> [[SOURCE]], ptr [[SOURCE_ADDR]], align 32 +// CHECK-NEXT: [[TMP0:%.*]] = load <8 x float>, ptr [[SOURCE_ADDR]], align 32 +// CHECK-NEXT: [[CALL:%.*]] = call spir_func <8 x i16> @_Z34intel_convert_bfloat168_as_ushort8Dv8_f(<8 x float> noundef [[TMP0]]) #[[ATTR2]] +// CHECK-NEXT: ret <8 x i16> [[CALL]] +// +ushort8 test_convert_bfloat168_as_ushort8(float8 source) { + return intel_convert_bfloat168_as_ushort8(source); +} + +// CHECK-LABEL: define dso_local spir_func <16 x i16> @test_convert_bfloat1616_as_ushort16( +// CHECK-SAME: <16 x float> noundef [[SOURCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[SOURCE_ADDR:%.*]] = alloca <16 x float>, align 64 +// CHECK-NEXT: store <16 x float> [[SOURCE]], ptr [[SOURCE_ADDR]], align 64 +// CHECK-NEXT: [[TMP0:%.*]] = load <16 x float>, ptr [[SOURCE_ADDR]], align 64 +// CHECK-NEXT: 
[[CALL:%.*]] = call spir_func <16 x i16> @_Z36intel_convert_bfloat1616_as_ushort16Dv16_f(<16 x float> noundef [[TMP0]]) #[[ATTR2]] +// CHECK-NEXT: ret <16 x i16> [[CALL]] +// +ushort16 test_convert_bfloat1616_as_ushort16(float16 source) { + return intel_convert_bfloat1616_as_ushort16(source); +} + +// CHECK-LABEL: define dso_local spir_func float @test_convert_as_bfloat16_float( +// CHECK-SAME: i16 noundef zeroext [[SOURCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[SOURCE_ADDR:%.*]] = alloca i16, align 2 +// CHECK-NEXT: store i16 [[SOURCE]], ptr [[SOURCE_ADDR]], align 2 +// CHECK-NEXT: [[TMP0:%.*]] = load i16, ptr [[SOURCE_ADDR]], align 2 +// CHECK-NEXT: [[CALL:%.*]] = call spir_func float @_Z31intel_convert_as_bfloat16_floatt(i16 noundef zeroext [[TMP0]]) #[[ATTR2]] +// CHECK-NEXT: ret float [[CALL]] +// +float test_convert_as_bfloat16_float(ushort source) { + return intel_convert_as_bfloat16_float(source); +} + +// CHECK-LABEL: define dso_local spir_func <2 x float> @test_convert_as_bfloat162_float2( +// CHECK-SAME: <2 x i16> noundef [[SOURCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[SOURCE_ADDR:%.*]] = alloca <2 x i16>, align 4 +// CHECK-NEXT: store <2 x i16> [[SOURCE]], ptr [[SOURCE_ADDR]], align 4 +// CHECK-NEXT: [[TMP0:%.*]] = load <2 x i16>, ptr [[SOURCE_ADDR]], align 4 +// CHECK-NEXT: [[CALL:%.*]] = call spir_func <2 x float> @_Z33intel_convert_as_bfloat162_float2Dv2_t(<2 x i16> noundef [[TMP0]]) #[[ATTR2]] +// CHECK-NEXT: ret <2 x float> [[CALL]] +// +float2 test_convert_as_bfloat162_float2(ushort2 source) { + return intel_convert_as_bfloat162_float2(source); +} + +// CHECK-LABEL: define dso_local spir_func <3 x float> @test_convert_as_bfloat163_float3( +// CHECK-SAME: <3 x i16> noundef [[SOURCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[SOURCE_ADDR:%.*]] = alloca <3 x i16>, align 8 +// CHECK-NEXT: [[EXTRACTVEC:%.*]] = shufflevector <3 x i16> [[SOURCE]], <3 x i16> undef, <4 x i32> 
<i32 0, i32 1, i32 2, i32 3> +// CHECK-NEXT: store <4 x i16> [[EXTRACTVEC]], ptr [[SOURCE_ADDR]], align 8 +// CHECK-NEXT: [[LOADVECN:%.*]] = load <4 x i16>, ptr [[SOURCE_ADDR]], align 8 +// CHECK-NEXT: [[EXTRACTVEC1:%.*]] = shufflevector <4 x i16> [[LOADVECN]], <4 x i16> poison, <3 x i32> <i32 0, i32 1, i32 2> +// CHECK-NEXT: [[CALL:%.*]] = call spir_func <3 x float> @_Z33intel_convert_as_bfloat163_float3Dv3_t(<3 x i16> noundef [[EXTRACTVEC1]]) #[[ATTR2]] +// CHECK-NEXT: ret <3 x float> [[CALL]] +// +float3 test_convert_as_bfloat163_float3(ushort3 source) { + return intel_convert_as_bfloat163_float3(source); +} + +// CHECK-LABEL: define dso_local spir_func <4 x float> @test_convert_as_bfloat164_float4( +// CHECK-SAME: <4 x i16> noundef [[SOURCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[SOURCE_ADDR:%.*]] = alloca <4 x i16>, align 8 +// CHECK-NEXT: store <4 x i16> [[SOURCE]], ptr [[SOURCE_ADDR]], align 8 +// CHECK-NEXT: [[TMP0:%.*]] = load <4 x i16>, ptr [[SOURCE_ADDR]], align 8 +// CHECK-NEXT: [[CALL:%.*]] = call spir_func <4 x float> @_Z33intel_convert_as_bfloat164_float4Dv4_t(<4 x i16> noundef [[TMP0]]) #[[ATTR2]] +// CHECK-NEXT: ret <4 x float> [[CALL]] +// +float4 test_convert_as_bfloat164_float4(ushort4 source) { + return intel_convert_as_bfloat164_float4(source); +} + +// CHECK-LABEL: define dso_local spir_func <8 x float> @test_convert_as_bfloat168_float8( +// CHECK-SAME: <8 x i16> noundef [[SOURCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[SOURCE_ADDR:%.*]] = alloca <8 x i16>, align 16 +// CHECK-NEXT: store <8 x i16> [[SOURCE]], ptr [[SOURCE_ADDR]], align 16 +// CHECK-NEXT: [[TMP0:%.*]] = load <8 x i16>, ptr [[SOURCE_ADDR]], align 16 +// CHECK-NEXT: [[CALL:%.*]] = call spir_func <8 x float> @_Z33intel_convert_as_bfloat168_float8Dv8_t(<8 x i16> noundef [[TMP0]]) #[[ATTR2]] +// CHECK-NEXT: ret <8 x float> [[CALL]] +// +float8 test_convert_as_bfloat168_float8(ushort8 source) { + return 
intel_convert_as_bfloat168_float8(source); +} + +// CHECK-LABEL: define dso_local spir_func <16 x float> @test_convert_as_bfloat1616_float16( +// CHECK-SAME: <16 x i16> noundef [[SOURCE:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[SOURCE_ADDR:%.*]] = alloca <16 x i16>, align 32 +// CHECK-NEXT: store <16 x i16> [[SOURCE]], ptr [[SOURCE_ADDR]], align 32 +// CHECK-NEXT: [[TMP0:%.*]] = load <16 x i16>, ptr [[SOURCE_ADDR]], align 32 +// CHECK-NEXT: [[CALL:%.*]] = call spir_func <16 x float> @_Z35intel_convert_as_bfloat1616_float16Dv16_t(<16 x i16> noundef [[TMP0]]) #[[ATTR2]] +// CHECK-NEXT: ret <16 x float> [[CALL]] +// +float16 test_convert_as_bfloat1616_float16(ushort16 source) { + return intel_convert_as_bfloat1616_float16(source); +} diff --git a/clang/test/CodeGenOpenCL/intel-subgroup-buffer-prefetch-builtins.cl b/clang/test/CodeGenOpenCL/intel-subgroup-buffer-prefetch-builtins.cl new file mode 100644 index 0000000000000..6a543431e5751 --- /dev/null +++ b/clang/test/CodeGenOpenCL/intel-subgroup-buffer-prefetch-builtins.cl @@ -0,0 +1,96 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 6 +// RUN: %clang_cc1 %s -triple spir-unknown-unknown -finclude-default-header -fdeclare-opencl-builtins -cl-std=CL3.0 -emit-llvm -o - -O0 | FileCheck %s + +// CHECK-LABEL: define dso_local spir_func void @test_block_prefetch_ui( +// CHECK-SAME: ptr addrspace(1) noundef [[IN:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[IN_ADDR:%.*]] = alloca ptr addrspace(1), align 4 +// CHECK-NEXT: store ptr addrspace(1) [[IN]], ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: [[TMP0:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z33intel_sub_group_block_prefetch_uiPU3AS1Kj(ptr addrspace(1) noundef [[TMP0]]) #[[ATTR2:[0-9]+]] +// CHECK-NEXT: [[TMP1:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void 
@_Z34intel_sub_group_block_prefetch_ui2PU3AS1Kj(ptr addrspace(1) noundef [[TMP1]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP2:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z34intel_sub_group_block_prefetch_ui4PU3AS1Kj(ptr addrspace(1) noundef [[TMP2]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP3:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z34intel_sub_group_block_prefetch_ui8PU3AS1Kj(ptr addrspace(1) noundef [[TMP3]]) #[[ATTR2]] +// CHECK-NEXT: ret void +// +void test_block_prefetch_ui(const __global uint *in) { + intel_sub_group_block_prefetch_ui(in); + intel_sub_group_block_prefetch_ui2(in); + intel_sub_group_block_prefetch_ui4(in); + intel_sub_group_block_prefetch_ui8(in); +} + +// CHECK-LABEL: define dso_local spir_func void @test_block_prefetch_us( +// CHECK-SAME: ptr addrspace(1) noundef [[IN:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[IN_ADDR:%.*]] = alloca ptr addrspace(1), align 4 +// CHECK-NEXT: store ptr addrspace(1) [[IN]], ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: [[TMP0:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z33intel_sub_group_block_prefetch_usPU3AS1Kt(ptr addrspace(1) noundef [[TMP0]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP1:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z34intel_sub_group_block_prefetch_us2PU3AS1Kt(ptr addrspace(1) noundef [[TMP1]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP2:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z34intel_sub_group_block_prefetch_us4PU3AS1Kt(ptr addrspace(1) noundef [[TMP2]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP3:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z34intel_sub_group_block_prefetch_us8PU3AS1Kt(ptr addrspace(1) noundef [[TMP3]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP4:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// 
CHECK-NEXT: call spir_func void @_Z35intel_sub_group_block_prefetch_us16PU3AS1Kt(ptr addrspace(1) noundef [[TMP4]]) #[[ATTR2]] +// CHECK-NEXT: ret void +// +void test_block_prefetch_us(const __global ushort *in) { + intel_sub_group_block_prefetch_us(in); + intel_sub_group_block_prefetch_us2(in); + intel_sub_group_block_prefetch_us4(in); + intel_sub_group_block_prefetch_us8(in); + intel_sub_group_block_prefetch_us16(in); +} + +// CHECK-LABEL: define dso_local spir_func void @test_block_prefetch_uc( +// CHECK-SAME: ptr addrspace(1) noundef [[IN:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[IN_ADDR:%.*]] = alloca ptr addrspace(1), align 4 +// CHECK-NEXT: store ptr addrspace(1) [[IN]], ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: [[TMP0:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z33intel_sub_group_block_prefetch_ucPU3AS1Kh(ptr addrspace(1) noundef [[TMP0]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP1:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z34intel_sub_group_block_prefetch_uc2PU3AS1Kh(ptr addrspace(1) noundef [[TMP1]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP2:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z34intel_sub_group_block_prefetch_uc4PU3AS1Kh(ptr addrspace(1) noundef [[TMP2]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP3:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z34intel_sub_group_block_prefetch_uc8PU3AS1Kh(ptr addrspace(1) noundef [[TMP3]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP4:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z35intel_sub_group_block_prefetch_uc16PU3AS1Kh(ptr addrspace(1) noundef [[TMP4]]) #[[ATTR2]] +// CHECK-NEXT: ret void +// +void test_block_prefetch_uc(const __global uchar *in) { + intel_sub_group_block_prefetch_uc(in); + intel_sub_group_block_prefetch_uc2(in); + intel_sub_group_block_prefetch_uc4(in); + 
intel_sub_group_block_prefetch_uc8(in); + intel_sub_group_block_prefetch_uc16(in); +} + +// CHECK-LABEL: define dso_local spir_func void @test_block_prefetch_ul( +// CHECK-SAME: ptr addrspace(1) noundef [[IN:%.*]]) #[[ATTR0]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[IN_ADDR:%.*]] = alloca ptr addrspace(1), align 4 +// CHECK-NEXT: store ptr addrspace(1) [[IN]], ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: [[TMP0:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z33intel_sub_group_block_prefetch_ulPU3AS1Km(ptr addrspace(1) noundef [[TMP0]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP1:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z34intel_sub_group_block_prefetch_ul2PU3AS1Km(ptr addrspace(1) noundef [[TMP1]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP2:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z34intel_sub_group_block_prefetch_ul4PU3AS1Km(ptr addrspace(1) noundef [[TMP2]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP3:%.*]] = load ptr addrspace(1), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z34intel_sub_group_block_prefetch_ul8PU3AS1Km(ptr addrspace(1) noundef [[TMP3]]) #[[ATTR2]] +// CHECK-NEXT: ret void +// +void test_block_prefetch_ul(const __global ulong *in) { + intel_sub_group_block_prefetch_ul(in); + intel_sub_group_block_prefetch_ul2(in); + intel_sub_group_block_prefetch_ul4(in); + intel_sub_group_block_prefetch_ul8(in); +} diff --git a/clang/test/CodeGenOpenCL/intel-subgroup-local-block-io-builtins.cl b/clang/test/CodeGenOpenCL/intel-subgroup-local-block-io-builtins.cl new file mode 100644 index 0000000000000..63253ceb40384 --- /dev/null +++ b/clang/test/CodeGenOpenCL/intel-subgroup-local-block-io-builtins.cl @@ -0,0 +1,319 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 6 +// RUN: %clang_cc1 %s -triple spir-unknown-unknown -finclude-default-header 
-fdeclare-opencl-builtins -cl-std=CL3.0 -emit-llvm -o - -O0 | FileCheck %s + +// CHECK-LABEL: define dso_local spir_func void @test_block_read_local( +// CHECK-SAME: ptr addrspace(3) noundef [[IN:%.*]], ptr addrspace(3) noundef [[OUT:%.*]], i32 noundef [[VALUE:%.*]], <2 x i32> noundef [[VALUE2:%.*]], <4 x i32> noundef [[VALUE4:%.*]], <8 x i32> noundef [[VALUE8:%.*]]) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT: [[IN_ADDR:%.*]] = alloca ptr addrspace(3), align 4 +// CHECK-NEXT: [[OUT_ADDR:%.*]] = alloca ptr addrspace(3), align 4 +// CHECK-NEXT: [[VALUE_ADDR:%.*]] = alloca i32, align 4 +// CHECK-NEXT: [[VALUE2_ADDR:%.*]] = alloca <2 x i32>, align 8 +// CHECK-NEXT: [[VALUE4_ADDR:%.*]] = alloca <4 x i32>, align 16 +// CHECK-NEXT: [[VALUE8_ADDR:%.*]] = alloca <8 x i32>, align 32 +// CHECK-NEXT: [[V:%.*]] = alloca i32, align 4 +// CHECK-NEXT: [[V2:%.*]] = alloca <2 x i32>, align 8 +// CHECK-NEXT: [[V4:%.*]] = alloca <4 x i32>, align 16 +// CHECK-NEXT: [[V8:%.*]] = alloca <8 x i32>, align 32 +// CHECK-NEXT: store ptr addrspace(3) [[IN]], ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: store ptr addrspace(3) [[OUT]], ptr [[OUT_ADDR]], align 4 +// CHECK-NEXT: store i32 [[VALUE]], ptr [[VALUE_ADDR]], align 4 +// CHECK-NEXT: store <2 x i32> [[VALUE2]], ptr [[VALUE2_ADDR]], align 8 +// CHECK-NEXT: store <4 x i32> [[VALUE4]], ptr [[VALUE4_ADDR]], align 16 +// CHECK-NEXT: store <8 x i32> [[VALUE8]], ptr [[VALUE8_ADDR]], align 32 +// CHECK-NEXT: [[TMP0:%.*]] = load ptr addrspace(3), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: [[CALL:%.*]] = call spir_func i32 @_Z26intel_sub_group_block_readPU3AS3Kj(ptr addrspace(3) noundef [[TMP0]]) #[[ATTR2:[0-9]+]] +// CHECK-NEXT: store i32 [[CALL]], ptr [[V]], align 4 +// CHECK-NEXT: [[TMP1:%.*]] = load ptr addrspace(3), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: [[CALL1:%.*]] = call spir_func <2 x i32> @_Z27intel_sub_group_block_read2PU3AS3Kj(ptr addrspace(3) noundef [[TMP1]]) #[[ATTR2]] +// CHECK-NEXT: store <2 x i32> [[CALL1]], ptr 
[[V2]], align 8 +// CHECK-NEXT: [[TMP2:%.*]] = load ptr addrspace(3), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: [[CALL2:%.*]] = call spir_func <4 x i32> @_Z27intel_sub_group_block_read4PU3AS3Kj(ptr addrspace(3) noundef [[TMP2]]) #[[ATTR2]] +// CHECK-NEXT: store <4 x i32> [[CALL2]], ptr [[V4]], align 16 +// CHECK-NEXT: [[TMP3:%.*]] = load ptr addrspace(3), ptr [[IN_ADDR]], align 4 +// CHECK-NEXT: [[CALL3:%.*]] = call spir_func <8 x i32> @_Z27intel_sub_group_block_read8PU3AS3Kj(ptr addrspace(3) noundef [[TMP3]]) #[[ATTR2]] +// CHECK-NEXT: store <8 x i32> [[CALL3]], ptr [[V8]], align 32 +// CHECK-NEXT: [[TMP4:%.*]] = load ptr addrspace(3), ptr [[OUT_ADDR]], align 4 +// CHECK-NEXT: [[TMP5:%.*]] = load i32, ptr [[VALUE_ADDR]], align 4 +// CHECK-NEXT: call spir_func void @_Z27intel_sub_group_block_writePU3AS3jj(ptr addrspace(3) noundef [[TMP4]], i32 noundef [[TMP5]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP6:%.*]] = load ptr addrspace(3), ptr [[OUT_ADDR]], align 4 +// CHECK-NEXT: [[TMP7:%.*]] = load <2 x i32>, ptr [[VALUE2_ADDR]], align 8 +// CHECK-NEXT: call spir_func void @_Z28intel_sub_group_block_write2PU3AS3jDv2_j(ptr addrspace(3) noundef [[TMP6]], <2 x i32> noundef [[TMP7]]) #[[ATTR2]] +// CHECK-NEXT: [[TMP8:%.*]] = load ptr addrspace(3), ptr [[OUT_ADDR]], align 4 +// CHECK-NEXT: [[TMP9:%.*]] = load <4 x i3... [truncated] `````````` </details> https://github.com/llvm/llvm-project/pull/199968 _______________________________________________ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
