[clang] [llvm] [clang-tools-extra] [AMDGPU][GFX12] Add 16 bit atomic fadd instructions (PR #75917)

2024-01-15 Thread Mariusz Sikora via cfe-commits
mariusz-sikora-at-amd wrote: What is the plan for atomic_{flat/ds/global}_bf16 builtins ? Right now they are accepting <2 x i16> instead of <2 x bfloat>. Do we want to create new builtins or we want to override them to accept both <2 x i16> and <2 x bfloat> ? https://github.com/llvm/llvm-proj

[clang] [llvm] [clang-tools-extra] [AMDGPU][GFX12] Add 16 bit atomic fadd instructions (PR #75917)

2024-01-15 Thread Mariusz Sikora via cfe-commits
@@ -27,34 +27,23 @@ main_body: ret float %out0 } -define amdgpu_ps float @atomic_pk_add_bf16_1d_v2(<8 x i32> inreg %rsrc, <2 x i16> %data, i32 %s) { +define amdgpu_ps float @atomic_pk_add_bf16_1d_v2(<8 x i32> inreg %rsrc, <2 x bfloat> %data, i32 %s) { ; GFX12-LABEL: atomi

[clang] [llvm] [clang-tools-extra] [AMDGPU][GFX12] Add 16 bit atomic fadd instructions (PR #75917)

2024-01-15 Thread Mariusz Sikora via cfe-commits
@@ -362,24 +358,34 @@ define amdgpu_ps void @struct_buffer_atomic_add_v2f16_noret(<2 x half> %val, <4 ret void } -define amdgpu_ps float @struct_buffer_atomic_add_v2bf16_ret(<2 x i16> %val, <4 x i32> inreg %rsrc, i32 %vindex, i32 %voffset, i32 inreg %soffset) { +define amd

[clang] [llvm] [clang-tools-extra] [AMDGPU][GFX12] Add 16 bit atomic fadd instructions (PR #75917)

2024-01-15 Thread Mariusz Sikora via cfe-commits
@@ -0,0 +1,92 @@ +// RUN: %clang_cc1 -O0 -cl-std=CL2.0 -triple amdgcn-amd-amdhsa -target-cpu gfx1200 \ +// RUN: %s -S -emit-llvm -o - | FileCheck %s + +// RUN: %clang_cc1 -O0 -cl-std=CL2.0 -triple amdgcn-amd-amdhsa -target-cpu gfx1200 \ +// RUN: -S -o - %s | FileCheck -check