[clang] [llvm] [AMDGPU] Introduce a new generic target `gfx9-4-generic` (PR #115190)

2024-11-06 Thread Stanislav Mekhanoshin via cfe-commits
@@ -1,6 +1,7 @@ ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 ; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -o - %s | FileCheck -check-prefixes=GFX9 %s ; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -o - %s | File

[clang] [llvm] [AMDGPU] Introduce a new generic target `gfx9-4-generic` (PR #115190)

2024-11-06 Thread Stanislav Mekhanoshin via cfe-commits
@@ -1,5 +1,6 @@ # NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2 # RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -start-before=machine-scheduler -verify-misched -o - %s | FileCheck -check-prefix=GCN %s +# RUN: llc -mtriple

[clang] [llvm] [AMDGPU] modify named barrier builtins and intrinsics (PR #114550)

2024-11-06 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec approved this pull request. It's the same code already reviewed downstream. LGTM. https://github.com/llvm/llvm-project/pull/114550 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailma

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-04 Thread Stanislav Mekhanoshin via cfe-commits
@@ -1024,6 +1024,16 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const { } break; } + case Intrinsic::amdgcn_wavefrontsize: { +// TODO: this is a workaround for the pseudo-generic target one gets with no +// specified mcpu, which

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-04 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec commented: In general LGTM. https://github.com/llvm/llvm-project/pull/114481

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-04 Thread Stanislav Mekhanoshin via cfe-commits
@@ -1024,6 +1024,16 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const { } break; } + case Intrinsic::amdgcn_wavefrontsize: { +// TODO: this is a workaround for the pseudo-generic target one gets with no +// specified mcpu, which

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-04 Thread Stanislav Mekhanoshin via cfe-commits
@@ -1024,6 +1024,15 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const { } break; } + case Intrinsic::amdgcn_wavefrontsize: { +// TODO: this is a workaround for the pseudo-generic target one gets with no +// specified mcpu, which

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-04 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec edited https://github.com/llvm/llvm-project/pull/114481

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-04 Thread Stanislav Mekhanoshin via cfe-commits
@@ -1024,6 +1024,15 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const { } break; } + case Intrinsic::amdgcn_wavefrontsize: { +// TODO: this is a workaround for the pseudo-generic target one gets with no +// specified mcpu, which

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-04 Thread Stanislav Mekhanoshin via cfe-commits
@@ -345,6 +345,15 @@ extern char &AMDGPUPrintfRuntimeBindingID; void initializeAMDGPUResourceUsageAnalysisPass(PassRegistry &); extern char &AMDGPUResourceUsageAnalysisID; +struct AMDGPUExpandPseudoIntrinsicsPass rampitec wrote: The pass isn't needed now? ht

[clang] [llvm] [opt][AMDGPU] Add pass to handle AMDGCN pseudo-intrinsics, start with `llvm.amdgcn.wavefrontsize` (PR #114481)

2024-11-01 Thread Stanislav Mekhanoshin via cfe-commits
rampitec wrote: This is really just constant folding rather than a new pass. If the concern is that InstCombine runs too late, it is possible to add an earlier invocation. https://github.com/llvm/llvm-project/pull/114481
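The "constant folding" point can be illustrated with a small standalone sketch. It assumes a toy subtarget description (`ToySubtarget` and `foldWavefrontSize` are hypothetical names, not LLVM types or the PR's code) and only shows the idea: the query folds to a constant when exactly one wave size is known, and is left alone for the pseudo-generic case.

```cpp
#include <optional>

// Toy stand-in for subtarget information; not an LLVM type.
struct ToySubtarget {
  bool hasWave32 = false;
  bool hasWave64 = false;
};

// Returns the constant that a wavefront-size query can be folded to,
// or std::nullopt when the size is not known and the query must stay.
inline std::optional<unsigned> foldWavefrontSize(const ToySubtarget &st) {
  if (st.hasWave32 == st.hasWave64)
    return std::nullopt; // neither or both set: ambiguous, do not fold
  return st.hasWave32 ? 32u : 64u;
}
```

In this sketch, a default-constructed `ToySubtarget` models the "no specified mcpu" case mentioned in the diff comment, so nothing is folded for it.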

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp8 (PR #113610)

2024-10-31 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec closed https://github.com/llvm/llvm-project/pull/113610

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp8 (PR #113610)

2024-10-31 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/113610 >From edda0e600abeabff4d44e8b0b897104efacc8f98 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Thu, 24 Oct 2024 11:31:52 -0700 Subject: [PATCH 1/2] [AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp8 (PR #113610)

2024-10-29 Thread Stanislav Mekhanoshin via cfe-commits
rampitec wrote: It does not really work w/o https://github.com/llvm/llvm-project/pull/113500 though. https://github.com/llvm/llvm-project/pull/113610

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp8 (PR #113610)

2024-10-25 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/113610 >From edda0e600abeabff4d44e8b0b897104efacc8f98 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Thu, 24 Oct 2024 11:31:52 -0700 Subject: [PATCH 1/2] [AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp8 (PR #113610)

2024-10-25 Thread Stanislav Mekhanoshin via cfe-commits
@@ -152,6 +115,44 @@ bool SemaAMDGPU::CheckAMDGCNBuiltinFunctionCall(unsigned BuiltinID, return false; } +bool SemaAMDGPU::CheckMovDPPFunctionCall(CallExpr *TheCall, unsigned NumArgs, rampitec wrote: Done https://github.com/llvm/llvm-project/pull/113610 _

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp8 (PR #113610)

2024-10-24 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec created https://github.com/llvm/llvm-project/pull/113610 The same handling as for __builtin_amdgcn_mov_dpp. >From edda0e600abeabff4d44e8b0b897104efacc8f98 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Thu, 24 Oct 2024 11:31:52 -0700 Subject: [PATCH] [AM

[clang] [AMDGPU] Relax __builtin_amdgcn_update_dpp sema check (PR #113341)

2024-10-22 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec closed https://github.com/llvm/llvm-project/pull/113341

[clang] [AMDGPU] Relax __builtin_amdgcn_update_dpp sema check (PR #113341)

2024-10-22 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec created https://github.com/llvm/llvm-project/pull/113341 A recent change applied too strict a check that the old and src operands match. They must be compatible, but not necessarily exactly the same. >From 01e8c4224a1a0b8e1067c087f3a5e1283566f80a Mon Sep 17 00:00:00 2001 F

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-21 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec closed https://github.com/llvm/llvm-project/pull/112447

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-21 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/112447 >From 761b3e21748dd3a7b53cd0ead745943213317eb4 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Tue, 15 Oct 2024 15:23:28 -0700 Subject: [PATCH 1/7] [AMDGPU] Allow overload of __builtin_amdgcn_mov/up

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-18 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/112447 >From 761b3e21748dd3a7b53cd0ead745943213317eb4 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Tue, 15 Oct 2024 15:23:28 -0700 Subject: [PATCH 1/7] [AMDGPU] Allow overload of __builtin_amdgcn_mov/up

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-18 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/112447 >From 761b3e21748dd3a7b53cd0ead745943213317eb4 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Tue, 15 Oct 2024 15:23:28 -0700 Subject: [PATCH 1/6] [AMDGPU] Allow overload of __builtin_amdgcn_mov/up

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-18 Thread Stanislav Mekhanoshin via cfe-commits
rampitec wrote: I actually wish for a better way to have overloaded builtins in clang. I do not believe any user of these builtins expects that a wide integer will be silently truncated, or that any fp value will go through fptosi and back afterwards, like we do now. We have many more builtins like tha
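The two failure modes described here can be reproduced in plain C++ without any GPU target. The sketch below is only an illustration under stated assumptions: `int_only_op` is a hypothetical stand-in for a builtin that accepts a 32-bit integer (the actual lane permutation is omitted), and the helper names are invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an int-only builtin; it just returns its input.
inline int32_t int_only_op(int32_t src) { return src; }

// Implicit narrowing of a 64-bit argument: the high 32 bits are lost.
inline uint64_t pass_wide_integer(uint64_t v) {
  const int32_t narrowed = static_cast<int32_t>(static_cast<uint32_t>(v));
  return static_cast<uint32_t>(int_only_op(narrowed));
}

// Value conversion of a float (fptosi, then back): the fraction is dropped.
inline float pass_float_by_value(float f) {
  assert(f >= -2147483648.0f && f < 2147483648.0f); // out of range is UB
  return static_cast<float>(int_only_op(static_cast<int32_t>(f)));
}

// Bit-preserving alternative: move the 32 float bits through the int op.
inline float pass_float_by_bits(float f) {
  static_assert(sizeof(float) == sizeof(int32_t), "float must be 32 bits");
  int32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  bits = int_only_op(bits);
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

// 64-bit value handled as two separate 32-bit halves.
inline uint64_t pass_wide_split(uint64_t v) {
  const uint32_t lo = static_cast<uint32_t>(
      int_only_op(static_cast<int32_t>(static_cast<uint32_t>(v))));
  const uint32_t hi = static_cast<uint32_t>(
      int_only_op(static_cast<int32_t>(static_cast<uint32_t>(v >> 32))));
  return (static_cast<uint64_t>(hi) << 32) | lo;
}
```

With these helpers, `pass_wide_integer(0x1234567800000001)` keeps only the low half, and `pass_float_by_value(1.75f)` comes back as `1.0f`, while the bit-preserving and split variants return their inputs unchanged.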

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-18 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/112447 >From 761b3e21748dd3a7b53cd0ead745943213317eb4 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Tue, 15 Oct 2024 15:23:28 -0700 Subject: [PATCH 1/6] [AMDGPU] Allow overload of __builtin_amdgcn_mov/up

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-17 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/112447 >From 761b3e21748dd3a7b53cd0ead745943213317eb4 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Tue, 15 Oct 2024 15:23:28 -0700 Subject: [PATCH 1/6] [AMDGPU] Allow overload of __builtin_amdgcn_mov/up

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-17 Thread Stanislav Mekhanoshin via cfe-commits
@@ -7,3 +7,37 @@ void test_gfx9_fmed3h(global half *out, half a, half b, half c) { *out = __builtin_amdgcn_fmed3h(a, b, c); // expected-error {{'__builtin_amdgcn_fmed3h' needs target feature gfx9-insts}} } + +void test_mov_dpp(global int* out, int src, int i) +{ + *out = __

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-17 Thread Stanislav Mekhanoshin via cfe-commits
@@ -102,20 +102,66 @@ void test_s_dcache_wb() __builtin_amdgcn_s_dcache_wb(); } -// CHECK-LABEL: @test_mov_dpp +// CHECK-LABEL: @test_mov_dpp_int // CHECK: {{.*}}call{{.*}} i32 @llvm.amdgcn.update.dpp.i32(i32 poison, i32 %src, i32 0, i32 0, i32 0, i1 false) -void test_mov_

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-17 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/112447 >From 761b3e21748dd3a7b53cd0ead745943213317eb4 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Tue, 15 Oct 2024 15:23:28 -0700 Subject: [PATCH 1/5] [AMDGPU] Allow overload of __builtin_amdgcn_mov/up

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-17 Thread Stanislav Mekhanoshin via cfe-commits
@@ -102,20 +102,66 @@ void test_s_dcache_wb() __builtin_amdgcn_s_dcache_wb(); } -// CHECK-LABEL: @test_mov_dpp +// CHECK-LABEL: @test_mov_dpp_int // CHECK: {{.*}}call{{.*}} i32 @llvm.amdgcn.update.dpp.i32(i32 poison, i32 %src, i32 0, i32 0, i32 0, i1 false) -void test_mov_

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-17 Thread Stanislav Mekhanoshin via cfe-commits
@@ -102,20 +102,66 @@ void test_s_dcache_wb() __builtin_amdgcn_s_dcache_wb(); } -// CHECK-LABEL: @test_mov_dpp +// CHECK-LABEL: @test_mov_dpp_int // CHECK: {{.*}}call{{.*}} i32 @llvm.amdgcn.update.dpp.i32(i32 poison, i32 %src, i32 0, i32 0, i32 0, i1 false) -void test_mov_

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-16 Thread Stanislav Mekhanoshin via cfe-commits
rampitec wrote: > This needs some sema type restrictions to make sure it's something sensible Added. https://github.com/llvm/llvm-project/pull/112447

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-16 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/112447 >From 761b3e21748dd3a7b53cd0ead745943213317eb4 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Tue, 15 Oct 2024 15:23:28 -0700 Subject: [PATCH 1/4] [AMDGPU] Allow overload of __builtin_amdgcn_mov/up

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-16 Thread Stanislav Mekhanoshin via cfe-commits
@@ -224,8 +224,8 @@ TARGET_BUILTIN(__builtin_amdgcn_frexp_exph, "sh", "nc", "16-bit-insts") TARGET_BUILTIN(__builtin_amdgcn_fracth, "hh", "nc", "16-bit-insts") TARGET_BUILTIN(__builtin_amdgcn_classh, "bhi", "nc", "16-bit-insts") TARGET_BUILTIN(__builtin_amdgcn_s_memrealtime, "

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-15 Thread Stanislav Mekhanoshin via cfe-commits
rampitec wrote: Note, there is also dpp8 with a similar problem. But dpp8 is not properly handled even if the intrinsic is used with a 64-bit type (i.e. not split into 2 separate 32-bit dpp ops). This would be nice to have, but not absolutely necessary like here, because there are no 64-bit re

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-15 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/112447 >From 761b3e21748dd3a7b53cd0ead745943213317eb4 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Tue, 15 Oct 2024 15:23:28 -0700 Subject: [PATCH 1/3] [AMDGPU] Allow overload of __builtin_amdgcn_mov/up

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-15 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/112447 >From 761b3e21748dd3a7b53cd0ead745943213317eb4 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Tue, 15 Oct 2024 15:23:28 -0700 Subject: [PATCH 1/2] [AMDGPU] Allow overload of __builtin_amdgcn_mov/up

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-15 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec created https://github.com/llvm/llvm-project/pull/112447 We need to support 64-bit data types (the intrinsics do support them). We are also silently converting an FP argument to integer now; this is fixed as well. >From 761b3e21748dd3a7b53cd0ead745943213317eb4 Mon Sep 17 00:00:00 2001

[clang] [llvm] [AMDGPU] Add target intrinsic for s_buffer_prefetch_data (PR #107293)

2024-09-06 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec closed https://github.com/llvm/llvm-project/pull/107293

[clang] [llvm] [AMDGPU] Add target intrinsic for s_buffer_prefetch_data (PR #107293)

2024-09-06 Thread Stanislav Mekhanoshin via cfe-commits
rampitec wrote: > I think the parent needs some revision for global/flat/infer handling Do you like this more? https://github.com/llvm/llvm-project/pull/107624 https://github.com/llvm/llvm-project/pull/107293

[clang] [llvm] [AMDGPU] Add target intrinsic for s_buffer_prefetch_data (PR #107293)

2024-09-06 Thread Stanislav Mekhanoshin via cfe-commits
@@ -0,0 +1,36 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 +; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx1200 < %s | FileCheck --check-prefix=GCN %s +; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx1200 < %s | FileC

[clang] [llvm] [AMDGPU] Add target intrinsic for s_buffer_prefetch_data (PR #107293)

2024-09-06 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/107293 >From 8361742ca5fe20a3168b3274166909412e225184 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Wed, 4 Sep 2024 12:00:27 -0700 Subject: [PATCH 1/2] [AMDGPU] Add target intrinsic for s_buffer_prefetch_

[clang] [llvm] [AMDGPU] Add target intrinsic for s_buffer_prefetch_data (PR #107293)

2024-09-06 Thread Stanislav Mekhanoshin via cfe-commits
@@ -9934,6 +9934,12 @@ SDValue SITargetLowering::LowerINTRINSIC_VOID(SDValue Op, auto NewMI = DAG.getMachineNode(Opc, DL, Op->getVTList(), Ops); return SDValue(NewMI, 0); } + case Intrinsic::amdgcn_s_prefetch_data: { +// For non-global address space preserve the

[clang] [llvm] [AMDGPU] Add target intrinsic for s_buffer_prefetch_data (PR #107293)

2024-09-06 Thread Stanislav Mekhanoshin via cfe-commits
@@ -9934,6 +9934,12 @@ SDValue SITargetLowering::LowerINTRINSIC_VOID(SDValue Op, auto NewMI = DAG.getMachineNode(Opc, DL, Op->getVTList(), Ops); return SDValue(NewMI, 0); } + case Intrinsic::amdgcn_s_prefetch_data: { +// For non-global address space preserve the

[clang] [llvm] [AMDGPU] Add target intrinsic for s_buffer_prefetch_data (PR #107293)

2024-09-06 Thread Stanislav Mekhanoshin via cfe-commits
@@ -9934,6 +9934,12 @@ SDValue SITargetLowering::LowerINTRINSIC_VOID(SDValue Op, auto NewMI = DAG.getMachineNode(Opc, DL, Op->getVTList(), Ops); return SDValue(NewMI, 0); } + case Intrinsic::amdgcn_s_prefetch_data: { +// For non-global address space preserve the

[clang] [llvm] [AMDGPU] Add target intrinsic for s_prefetch_data (PR #107133)

2024-09-05 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec closed https://github.com/llvm/llvm-project/pull/107133

[clang] [llvm] [AMDGPU] Add target intrinsic for s_buffer_prefetch_data (PR #107293)

2024-09-04 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec created https://github.com/llvm/llvm-project/pull/107293 None >From 8361742ca5fe20a3168b3274166909412e225184 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Wed, 4 Sep 2024 12:00:27 -0700 Subject: [PATCH] [AMDGPU] Add target intrinsic for s_buffer_prefetc

[clang] [llvm] [AMDGPU] Add target intrinsic for s_prefetch_data (PR #107133)

2024-09-04 Thread Stanislav Mekhanoshin via cfe-commits
@@ -2689,6 +2689,12 @@ def int_amdgcn_global_load_tr_b128 : AMDGPULoadIntrinsic; def int_amdgcn_wave_id : DefaultAttrsIntrinsic<[llvm_i32_ty], [], [NoUndef, IntrNoMem, IntrSpeculatable]>; +def int_amdgcn_s_prefetch_data : + Intrinsic<[], [llvm_anyptr_ty, llvm_i32_ty], ---

[clang] [llvm] [AMDGPU] Add target intrinsic for s_prefetch_data (PR #107133)

2024-09-04 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/107133 >From 000e16cbd27783be68afdd9952c65e58f4cd7040 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Tue, 3 Sep 2024 10:14:35 -0700 Subject: [PATCH 1/4] [AMDGPU] Add target intrinsic for s_prefetch_data -

[clang] [llvm] [AMDGPU] Add target intrinsic for s_prefetch_data (PR #107133)

2024-09-03 Thread Stanislav Mekhanoshin via cfe-commits
@@ -0,0 +1,136 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 +; RUN: llc -global-isel=0 -march=amdgcn -mcpu=gfx1200 < %s | FileCheck --check-prefixes=GCN,SDAG %s +; RUN: llc -global-isel=1 -march=amdgcn -mcpu=gfx1200 < %s

[clang] [llvm] [AMDGPU] Add target intrinsic for s_prefetch_data (PR #107133)

2024-09-03 Thread Stanislav Mekhanoshin via cfe-commits
@@ -19489,6 +19489,12 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, F, {EmitScalarExpr(E->getArg(0)), EmitScalarExpr(E->getArg(1)), EmitScalarExpr(E->getArg(2)), EmitScalarExpr(E->getArg(3))}); } + case AMDGPU::BI__builtin_amdgcn_s

[clang] [llvm] [AMDGPU] Add target intrinsic for s_prefetch_data (PR #107133)

2024-09-03 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/107133 >From 000e16cbd27783be68afdd9952c65e58f4cd7040 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Tue, 3 Sep 2024 10:14:35 -0700 Subject: [PATCH 1/3] [AMDGPU] Add target intrinsic for s_prefetch_data -

[clang] [llvm] [AMDGPU] Add target intrinsic for s_prefetch_data (PR #107133)

2024-09-03 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/107133 >From 000e16cbd27783be68afdd9952c65e58f4cd7040 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Tue, 3 Sep 2024 10:14:35 -0700 Subject: [PATCH 1/2] [AMDGPU] Add target intrinsic for s_prefetch_data -

[clang] [llvm] [AMDGPU] Add target intrinsic for s_prefetch_data (PR #107133)

2024-09-03 Thread Stanislav Mekhanoshin via cfe-commits
@@ -19489,6 +19489,12 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, F, {EmitScalarExpr(E->getArg(0)), EmitScalarExpr(E->getArg(1)), EmitScalarExpr(E->getArg(2)), EmitScalarExpr(E->getArg(3))}); } + case AMDGPU::BI__builtin_amdgcn_s

[clang] [llvm] [AMDGPU] Add target intrinsic for s_prefetch_data (PR #107133)

2024-09-03 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec created https://github.com/llvm/llvm-project/pull/107133 None >From 000e16cbd27783be68afdd9952c65e58f4cd7040 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Tue, 3 Sep 2024 10:14:35 -0700 Subject: [PATCH] [AMDGPU] Add target intrinsic for s_prefetch_data

[clang] [llvm] AMDGPU: Loop over the types for global_load_tr16 pats (NFC) (PR #99551)

2024-07-18 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/99551

[clang] [llvm] AMDGPU: Add back half and bfloat support for global_load_tr16 pats (PR #99540)

2024-07-18 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/99540

[clang] [llvm] [AMDGPU] Report error in clang if wave32 is requested where unsupported (PR #97633)

2024-07-09 Thread Stanislav Mekhanoshin via cfe-commits
rampitec wrote: > /build/buildbot/premerge-monolithic-linux/llvm-project/flang/lib/Frontend/CompilerInstance.cpp:226:44: > error: too many arguments to function call, expected 3, have 4 Fixed. https://github.com/llvm/llvm-project/pull/97633

[clang] [llvm] [AMDGPU] Report error in clang if wave32 is requested where unsupported (PR #97633)

2024-07-09 Thread Stanislav Mekhanoshin via cfe-commits
rampitec wrote: Fixed https://github.com/llvm/llvm-project/pull/98231 Sorry. Stas From: LLVM Continuous Integration ***@***.***> Date: Tuesday, July 9, 2024 at 14:37 To: llvm/llvm-project ***@***.***> Cc: Mekhanoshin, Stanislav ***@***.*

[clang] [llvm] [AMDGPU] Report error in clang if wave32 is requested where unsupported (PR #97633)

2024-07-09 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec closed https://github.com/llvm/llvm-project/pull/97633

[clang] [llvm] [AMDGPU] Report error in clang if wave32 is requested where unsupported (PR #97633)

2024-07-09 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/97633 >From dc9d1e2039981bb412e68975570d9911511bb880 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Wed, 3 Jul 2024 13:12:21 -0700 Subject: [PATCH 1/3] [AMDGPU] Report error in clang if wave32 is requested

[clang] [llvm] [AMDGPU] Report error in clang if wave32 is requested where unsupported (PR #97633)

2024-07-09 Thread Stanislav Mekhanoshin via cfe-commits
@@ -188,8 +188,12 @@ bool AMDGPUTargetInfo::initFeatureMap( // TODO: Should move this logic into TargetParser std::string ErrorMsg; - if (!insertWaveSizeFeature(CPU, getTriple(), Features, ErrorMsg)) { -Diags.Report(diag::err_invalid_feature_combination) << ErrorMsg;

[clang] [llvm] [AMDGPU] Report error in clang if wave32 is requested where unsupported (PR #97633)

2024-07-09 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/97633 >From dc9d1e2039981bb412e68975570d9911511bb880 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Wed, 3 Jul 2024 13:12:21 -0700 Subject: [PATCH 1/2] [AMDGPU] Report error in clang if wave32 is requested

[clang] [llvm] [AMDGPU] Report error in clang if wave32 is requested where unsupported (PR #97633)

2024-07-08 Thread Stanislav Mekhanoshin via cfe-commits
@@ -188,8 +188,12 @@ bool AMDGPUTargetInfo::initFeatureMap( // TODO: Should move this logic into TargetParser std::string ErrorMsg; - if (!insertWaveSizeFeature(CPU, getTriple(), Features, ErrorMsg)) { -Diags.Report(diag::err_invalid_feature_combination) << ErrorMsg;

[clang] [llvm] [AMDGPU] Report error in clang if wave32 is requested where unsupported (PR #97633)

2024-07-08 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec edited https://github.com/llvm/llvm-project/pull/97633

[clang] [llvm] [AMDGPU] Report error in clang if wave32 is requested where unsupported (PR #97633)

2024-07-08 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec edited https://github.com/llvm/llvm-project/pull/97633

[clang] [llvm] [AMDGPU] Report error in clang if wave32 is requested where unsupported (PR #97633)

2024-07-08 Thread Stanislav Mekhanoshin via cfe-commits
@@ -188,8 +188,12 @@ bool AMDGPUTargetInfo::initFeatureMap( // TODO: Should move this logic into TargetParser std::string ErrorMsg; - if (!insertWaveSizeFeature(CPU, getTriple(), Features, ErrorMsg)) { -Diags.Report(diag::err_invalid_feature_combination) << ErrorMsg;

[clang] [llvm] [AMDGPU] Report error in clang if wave32 is requested where unsupported (PR #97633)

2024-07-03 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec created https://github.com/llvm/llvm-project/pull/97633 None >From dc9d1e2039981bb412e68975570d9911511bb880 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Wed, 3 Jul 2024 13:12:21 -0700 Subject: [PATCH] [AMDGPU] Report error in clang if wave32 is request
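The behaviour named in this PR title (rejecting a wave32 request on a target that cannot support it) can be sketched as a small standalone check. Everything below is a hypothetical illustration: `checkWaveSizeFeatures`, its parameters, and the error strings are assumptions made for this example and are not clang's or LLVM's actual implementation; only the general shape (return `false` and fill an error message, which the caller then reports) follows the diff quoted in the thread.

```cpp
#include <map>
#include <string>

// Toy model of a wave-size feature check. All names are hypothetical.
// Returns false and fills `error` when the requested combination is invalid.
inline bool checkWaveSizeFeatures(bool targetSupportsWave32,
                                  const std::map<std::string, bool> &features,
                                  std::string &error) {
  // A feature counts as requested only if present and enabled.
  auto enabled = [&](const char *name) {
    auto it = features.find(name);
    return it != features.end() && it->second;
  };
  const bool wantWave32 = enabled("wavefrontsize32");
  const bool wantWave64 = enabled("wavefrontsize64");
  if (wantWave32 && wantWave64) {
    error = "'wavefrontsize32' and 'wavefrontsize64' are mutually exclusive";
    return false;
  }
  if (wantWave32 && !targetSupportsWave32) {
    error = "'wavefrontsize32' is not supported on this target";
    return false;
  }
  return true;
}
```

A caller in this sketch would pass the error string on to its diagnostics mechanism when the function returns `false`, rather than silently continuing with an invalid feature set.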

[clang] [libclc] [llvm] [AMDGPU] Add a new target gfx1152 (PR #94534)

2024-06-05 Thread Stanislav Mekhanoshin via cfe-commits
@@ -1534,6 +1534,12 @@ def FeatureISAVersion11_5_1 : FeatureSet< FeatureVGPRSingleUseHintInsts, Feature1_5xVGPRs])>; +def FeatureISAVersion11_5_2 : FeatureSet< rampitec wrote: Then I defer review to Jay. https://github.com/llvm/llvm-project/pull/94

[clang] [libclc] [llvm] [AMDGPU] Add a new target gfx1152 (PR #94534)

2024-06-05 Thread Stanislav Mekhanoshin via cfe-commits
@@ -1534,6 +1534,12 @@ def FeatureISAVersion11_5_1 : FeatureSet< FeatureVGPRSingleUseHintInsts, Feature1_5xVGPRs])>; +def FeatureISAVersion11_5_2 : FeatureSet< rampitec wrote: I don't know, but if they are, I have a question: why is a new target needed?

[clang] [libclc] [llvm] [AMDGPU] Add a new target gfx1152 (PR #94534)

2024-06-05 Thread Stanislav Mekhanoshin via cfe-commits
@@ -1534,6 +1534,12 @@ def FeatureISAVersion11_5_1 : FeatureSet< FeatureVGPRSingleUseHintInsts, Feature1_5xVGPRs])>; +def FeatureISAVersion11_5_2 : FeatureSet< rampitec wrote: Looks the same as 1150? https://github.com/llvm/llvm-project/pull/94534

[clang] [llvm] [AMDGPU][Clang] Builtin for GLOBAL_LOAD_LDS on GFX940 (PR #92962)

2024-05-21 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/92962

[clang] [llvm] [AMDGPU] Clang builtin for GLOBAL_LOAD_LDS on GFX940 (PR #92962)

2024-05-21 Thread Stanislav Mekhanoshin via cfe-commits
@@ -2466,23 +2466,20 @@ def int_amdgcn_perm : // GFX9 Intrinsics //===--===// -class AMDGPUGlobalLoadLDS : Intrinsic < - [], - [LLVMQualPointerType<1>, // Base global pointer to load from - LL

[clang] [llvm] [AMDGPU] Clang builtin for GLOBAL_LOAD_LDS on GFX940 (PR #92962)

2024-05-21 Thread Stanislav Mekhanoshin via cfe-commits
@@ -2466,23 +2466,24 @@ def int_amdgcn_perm : // GFX9 Intrinsics //===--===// -class AMDGPUGlobalLoadLDS : Intrinsic < - [], - [LLVMQualPointerType<1>, // Base global pointer to load from - LL

[clang] [llvm] [AMDGPU] Add Clang builtins for amdgcn s_ttrace intrinsics (PR #88076)

2024-04-11 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/88076

[clang] [llvm] [AMDGPU] Add Clang builtins for amdgcn s_ttrace intrinsics (PR #88076)

2024-04-11 Thread Stanislav Mekhanoshin via cfe-commits
@@ -61,6 +61,8 @@ BUILTIN(__builtin_amdgcn_s_waitcnt, "vIi", "n") BUILTIN(__builtin_amdgcn_s_sendmsg, "vIiUi", "n") BUILTIN(__builtin_amdgcn_s_sendmsghalt, "vIiUi", "n") BUILTIN(__builtin_amdgcn_s_barrier, "v", "n") +BUILTIN(__builtin_amdgcn_s_ttracedata, "vi", "n") +BUILTIN(__

[clang] [llvm] AMDGPU: Rename intrinsics and remove f16/bf16 versions for load transpose (PR #86313)

2024-03-22 Thread Stanislav Mekhanoshin via cfe-commits
rampitec wrote: > global_load_re_b64 Typo: global_load_re_b64. https://github.com/llvm/llvm-project/pull/86313

[clang] AMDGPU: Rename and add bf16 support for global_load_tr builtins (PR #86202)

2024-03-21 Thread Stanislav Mekhanoshin via cfe-commits
rampitec wrote: > I don't think intrinsics are meant for users. Builtins are the user-facing > front. :-) Depending on who you consider a user. Are folks writing MLIR generators users? https://github.com/llvm/llvm-project/pull/86202

[clang] AMDGPU: Rename and add bf16 support for global_load_tr builtins (PR #86202)

2024-03-21 Thread Stanislav Mekhanoshin via cfe-commits
rampitec wrote: > > Do you want to rename intrinsics as well? Because now intrinsic names do > > not match builtin names. > > Do we have to match builtins with intrinsics? Renaming intrinsics here means > we will have to duplicate the intrinsics. Is that because of the mangling? https://gith

[clang] AMDGPU: Rename and add bf16 support for global_load_tr builtins (PR #86202)

2024-03-21 Thread Stanislav Mekhanoshin via cfe-commits
@@ -432,13 +432,15 @@ TARGET_BUILTIN(__builtin_amdgcn_s_wakeup_barrier, "vi", "n", "gfx12-insts") TARGET_BUILTIN(__builtin_amdgcn_s_barrier_leave, "b", "n", "gfx12-insts") TARGET_BUILTIN(__builtin_amdgcn_s_get_barrier_state, "Uii", "n", "gfx12-insts") -TARGET_BUILTIN(__builti

[clang] AMDGPU: Rename and add bf16 support for global_load_tr builtins (PR #86202)

2024-03-21 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec commented: Do you want to rename intrinsics as well? Because now intrinsic names do not match builtin names. https://github.com/llvm/llvm-project/pull/86202

[clang] [llvm] AMDGPU: Define a feature for v_dot4_f32_* instructions (PR #84248)

2024-03-06 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec approved this pull request. LGTM, thanks! https://github.com/llvm/llvm-project/pull/84248

[clang] [llvm] [AMDGPU] Implement 'llvm.get.fpenv' and 'llvm.set.fpenv' (PR #83906)

2024-03-04 Thread Stanislav Mekhanoshin via cfe-commits
@@ -1122,7 +1122,7 @@ class S_SETREG_B32_Pseudo pattern=[]> : SOPK_Pseudo < pattern>; def S_SETREG_B32 : S_SETREG_B32_Pseudo < - [(int_amdgcn_s_setreg (i32 SIMM16bit:$simm16), i32:$sdst)]> { + [(int_amdgcn_s_setreg (i32 timm:$simm16), i32:$sdst)]> { ramp

[clang] [llvm] [AMDGPU] Implement 'llvm.get.fpenv' and 'llvm.set.fpenv' (PR #83906)

2024-03-04 Thread Stanislav Mekhanoshin via cfe-commits
@@ -1122,7 +1122,7 @@ class S_SETREG_B32_Pseudo pattern=[]> : SOPK_Pseudo < pattern>; def S_SETREG_B32 : S_SETREG_B32_Pseudo < - [(int_amdgcn_s_setreg (i32 SIMM16bit:$simm16), i32:$sdst)]> { + [(int_amdgcn_s_setreg (i32 timm:$simm16), i32:$sdst)]> { ramp

[clang] [llvm] [AMDGPU] Implement 'llvm.get.fpenv' and 'llvm.set.fpenv' (PR #83906)

2024-03-04 Thread Stanislav Mekhanoshin via cfe-commits
@@ -1122,7 +1122,7 @@ class S_SETREG_B32_Pseudo pattern=[]> : SOPK_Pseudo < pattern>; def S_SETREG_B32 : S_SETREG_B32_Pseudo < - [(int_amdgcn_s_setreg (i32 SIMM16bit:$simm16), i32:$sdst)]> { + [(int_amdgcn_s_setreg (i32 timm:$simm16), i32:$sdst)]> { ramp

[clang] [llvm] [AMDGPU] Fix operand types for `V_DOT2_F32_BF16` (PR #82044)

2024-02-20 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/82044

[clang] [llvm] [AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-16 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec approved this pull request. Thanks. There are definitely at least 2 outstanding problems, but it seems there are no regressions compared to what we have now. LGTM. https://github.com/llvm/llvm-project/pull/80908

[clang] [llvm] [AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-16 Thread Stanislav Mekhanoshin via cfe-commits
@@ -0,0 +1,8 @@ +// RUN: llvm-mc -arch=amdgcn -mcpu=gfx1100 -show-encoding %s | FileCheck %s +// RUN: llvm-mc -arch=amdgcn -mcpu=gfx1200 -show-encoding %s | FileCheck %s + +v_dot2_bf16_bf16 v5, v1, v2, 100.0 +// CHECK: v_dot2_bf16_bf16 v5, v1, v2, 0x42c8 ; encoding: [0x05,0x00,0x
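The test above expects the literal 100.0 to be printed as 0x42c8. That value is the upper 16 bits of the IEEE-754 binary32 encoding of 100.0 (0x42C80000), which is how a bfloat16 bit pattern relates to a 32-bit float. A minimal sketch of that relationship follows. The helper names are hypothetical and are not LLVM APIs, and the conversion truncates rather than rounds, which is only exact for values whose low 16 bits are zero (such as 100.0 and 1.0).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an LLVM API): return the bfloat16 bit pattern of
// a binary32 value by keeping its upper 16 bits. This truncates instead of
// rounding, so it is exact only when the low 16 bits of the float are zero.
inline uint16_t f32ToBF16Bits(float Value) {
  uint32_t Bits;
  static_assert(sizeof(Bits) == sizeof(Value), "expects a 32-bit float");
  std::memcpy(&Bits, &Value, sizeof(Bits));
  return static_cast<uint16_t>(Bits >> 16);
}

// Hypothetical inverse: widen a bfloat16 bit pattern back to binary32 by
// placing it in the upper half and zero-filling the lower half.
inline float bf16BitsToF32(uint16_t Bits) {
  uint32_t Wide = static_cast<uint32_t>(Bits) << 16;
  float Value;
  std::memcpy(&Value, &Wide, sizeof(Value));
  return Value;
}

// Self-check using the literal from the assembler test (100.0 <-> 0x42c8)
// and the bfloat16 encoding of 1.0 (0x3F80).
inline void checkBF16Examples() {
  assert(f32ToBF16Bits(100.0f) == 0x42C8);
  assert(bf16BitsToF32(0x3F80) == 1.0f);
}
```

A production conversion would normally round to nearest even and handle NaN payloads; this sketch skips both to keep the bit-level relationship visible.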

[clang] [llvm] [AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-16 Thread Stanislav Mekhanoshin via cfe-commits
@@ -0,0 +1,8 @@ +// RUN: llvm-mc -arch=amdgcn -mcpu=gfx1100 -show-encoding %s | FileCheck %s +// RUN: llvm-mc -arch=amdgcn -mcpu=gfx1200 -show-encoding %s | FileCheck %s + +v_dot2_bf16_bf16 v5, v1, v2, 100.0 +// CHECK: v_dot2_bf16_bf16 v5, v1, v2, 0x42c8 ; encoding: [0x05,0x00,0x

[clang] [llvm] [AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-16 Thread Stanislav Mekhanoshin via cfe-commits
@@ -2652,6 +2652,23 @@ bool isInlinableLiteral32(int32_t Literal, bool HasInv2Pi) { (Val == 0x3e22f983 && HasInv2Pi); } +bool isInlinableLiteralBF16(int16_t Literal, bool HasInv2Pi) { + if (!HasInv2Pi) +return false; rampitec wrote: It does not

[clang] [llvm] [RFC][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-15 Thread Stanislav Mekhanoshin via cfe-commits
@@ -0,0 +1,8 @@ +# RUN: llvm-mc -triple=amdgcn -mcpu=gfx1100 -disassemble -show-encoding < %s | FileCheck %s +# RUN: llvm-mc -triple=amdgcn -mcpu=gfx1200 -disassemble -show-encoding < %s | FileCheck %s + +# CHECK: v_dot2_bf16_bf16 v5, v1, v2, 0x42c8 rampitec wro

[clang] [llvm] [RFC][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-13 Thread Stanislav Mekhanoshin via cfe-commits
@@ -4185,9 +4185,17 @@ bool SIInstrInfo::isInlineConstant(const MachineOperand &MO, case AMDGPU::OPERAND_REG_INLINE_C_V2FP16: case AMDGPU::OPERAND_REG_INLINE_AC_V2FP16: return AMDGPU::isInlinableLiteralV2F16(Imm); + case AMDGPU::OPERAND_REG_IMM_V2BF16: + case AMDGPU:

[clang] [llvm] [RFC][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-13 Thread Stanislav Mekhanoshin via cfe-commits
@@ -2819,11 +2819,11 @@ def int_amdgcn_fdot2_f16_f16 : def int_amdgcn_fdot2_bf16_bf16 : ClangBuiltin<"__builtin_amdgcn_fdot2_bf16_bf16">, DefaultAttrsIntrinsic< -[llvm_i16_ty], // %r +[llvm_bfloat_ty], // %r rampitec wrote: clang/test/CodeGenOp

[clang] [llvm] [RFC][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-13 Thread Stanislav Mekhanoshin via cfe-commits
@@ -4185,9 +4185,17 @@ bool SIInstrInfo::isInlineConstant(const MachineOperand &MO, case AMDGPU::OPERAND_REG_INLINE_C_V2FP16: case AMDGPU::OPERAND_REG_INLINE_AC_V2FP16: return AMDGPU::isInlinableLiteralV2F16(Imm); + case AMDGPU::OPERAND_REG_IMM_V2BF16: + case AMDGPU:

[clang] [llvm] [RFC][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-13 Thread Stanislav Mekhanoshin via cfe-commits
@@ -488,6 +488,49 @@ static bool printImmediateFloat16(uint32_t Imm, const MCSubtargetInfo &STI, return true; } +static bool printImmediateBFloat16(uint32_t Imm, const MCSubtargetInfo &STI, + raw_ostream &O) { + if (Imm == 0x3F80) +O <

[clang] [llvm] [RFC][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-13 Thread Stanislav Mekhanoshin via cfe-commits
@@ -1,8 +1,7 @@ ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py ; RUN: llc -mtriple=amdgcn -mcpu=gfx1100 -verify-machineinstrs < %s | FileCheck %s --check-prefixes=GFX11,SDAG-GFX11 -; RUN: llc -global-isel -mtriple=amdgcn -mcpu=gfx1100 -verify-mach

[clang] [llvm] [RFC][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-13 Thread Stanislav Mekhanoshin via cfe-commits
@@ -0,0 +1,8 @@ +// RUN: llvm-mc -arch=amdgcn -mcpu=gfx1100 -show-encoding %s | FileCheck %s rampitec wrote: You also need a disasm test for this. https://github.com/llvm/llvm-project/pull/80908

[clang] [llvm] [RFC][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-12 Thread Stanislav Mekhanoshin via cfe-commits
@@ -79,17 +79,17 @@ define amdgpu_ps void @test_llvm_amdgcn_fdot2_bf16_bf16_sis( ; GFX11: ; %bb.0: ; %entry ; GFX11-NEXT:v_mov_b32_e32 v2, s1 ; GFX11-NEXT:s_delay_alu instid0(VALU_DEP_1) -; GFX11-NEXT:v_dot2_bf16_bf16 v2, s0, 0x10001, v2 +; GFX11-NEXT:v_do

[clang] [llvm] [RFC][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-08 Thread Stanislav Mekhanoshin via cfe-commits
https://github.com/rampitec edited https://github.com/llvm/llvm-project/pull/80908

[clang] [llvm] [RFC][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-08 Thread Stanislav Mekhanoshin via cfe-commits
@@ -4181,13 +4181,20 @@ bool SIInstrInfo::isInlineConstant(const MachineOperand &MO, case AMDGPU::OPERAND_REG_INLINE_C_V2INT16: case AMDGPU::OPERAND_REG_INLINE_AC_V2INT16: return AMDGPU::isInlinableLiteralV2I16(Imm); + case AMDGPU::OPERAND_REG_IMM_V2BF16:
