[llvm] [clang] [mlir] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-24 Thread Mirko Brkušanin via cfe-commits
https://github.com/mbrkusanin closed https://github.com/llvm/llvm-project/pull/77795 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [mlir] [llvm] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-24 Thread Mirko Brkušanin via cfe-commits
mbrkusanin wrote: Rebased and updated after https://github.com/llvm/llvm-project/pull/76143 https://github.com/llvm/llvm-project/pull/77795 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[mlir] [llvm] [clang] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-24 Thread Mirko Brkušanin via cfe-commits
mbrkusanin wrote: Rebased and reverted bfloat https://github.com/llvm/llvm-project/pull/77795 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[mlir] [llvm] [clang] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-23 Thread Mirko Brkušanin via cfe-commits
@@ -2601,67 +2601,73 @@ def int_amdgcn_ds_bvh_stack_rtn : [ImmArg>, IntrWillReturn, IntrNoCallback, IntrNoFree] >; +def int_amdgcn_s_wait_event_export_ready : + ClangBuiltin<"__builtin_amdgcn_s_wait_event_export_ready">, + Intrinsic<[], [], [IntrNoMem,

[llvm] [mlir] [clang] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-23 Thread Mirko Brkušanin via cfe-commits
mbrkusanin wrote: Ping https://github.com/llvm/llvm-project/pull/77795 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[lld] [llvm] [compiler-rt] [libcxx] [flang] [clang-tools-extra] [lldb] [libc] [clang] [AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (PR #78414)

2024-01-23 Thread Mirko Brkušanin via cfe-commits
@@ -305,6 +305,11 @@ class VOP3OpSel_gfx10 op, VOPProfile p> : VOP3e_gfx10 { class VOP3OpSel_gfx11_gfx12 op, VOPProfile p> : VOP3OpSel_gfx10; +class VOP3FP8OpSel_gfx11_gfx12 op, VOPProfile p> : VOP3e_gfx10 { + let Inst{11} = !if(p.HasSrc0, src0_modifiers{2}, 0); + let

[lld] [llvm] [compiler-rt] [libcxx] [flang] [clang-tools-extra] [lldb] [libc] [clang] [AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (PR #78414)

2024-01-23 Thread Mirko Brkušanin via cfe-commits
@@ -305,6 +305,11 @@ class VOP3OpSel_gfx10 op, VOPProfile p> : VOP3e_gfx10 { class VOP3OpSel_gfx11_gfx12 op, VOPProfile p> : VOP3OpSel_gfx10; +class VOP3FP8OpSel_gfx11_gfx12 op, VOPProfile p> : VOP3e_gfx10 { + let Inst{11} = !if(p.HasSrc0, src0_modifiers{2}, 0); + let

[clang] [llvm] [mlir] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-23 Thread Mirko Brkušanin via cfe-commits
@@ -253,22 +253,22 @@ def ROCDL_mfma_f32_32x32x16_fp8_fp8 : ROCDL_Mfma_IntrOp<"mfma.f32.32x32x16.fp8.f //===-===// // WMMA intrinsics -class ROCDL_Wmma_IntrOp traits = []> : +class ROCDL_Wmma_IntrOp

[mlir] [clang] [llvm] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-22 Thread Mirko Brkušanin via cfe-commits
mbrkusanin wrote: If there are no further comments, should I merge this? https://github.com/llvm/llvm-project/pull/77795 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[libc] [llvm] [clang] [lldb] [libcxx] [compiler-rt] [lld] [clang-tools-extra] [flang] [AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (PR #78414)

2024-01-22 Thread Mirko Brkušanin via cfe-commits
@@ -305,6 +305,11 @@ class VOP3OpSel_gfx10 op, VOPProfile p> : VOP3e_gfx10 { class VOP3OpSel_gfx11_gfx12 op, VOPProfile p> : VOP3OpSel_gfx10; +class VOP3FP8OpSel_gfx11_gfx12 op, VOPProfile p> : VOP3e_gfx10 { + let Inst{11} = !if(p.HasSrc0, src0_modifiers{2}, 0); + let

[clang] [llvm] [mlir] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-22 Thread Mirko Brkušanin via cfe-commits
@@ -253,22 +253,22 @@ def ROCDL_mfma_f32_32x32x16_fp8_fp8 : ROCDL_Mfma_IntrOp<"mfma.f32.32x32x16.fp8.f //===-===// // WMMA intrinsics -class ROCDL_Wmma_IntrOp traits = []> : +class ROCDL_Wmma_IntrOp

[compiler-rt] [clang-tools-extra] [lld] [lldb] [clang] [flang] [libc] [libcxx] [llvm] [AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (PR #78414)

2024-01-22 Thread Mirko Brkušanin via cfe-commits
mbrkusanin wrote: > > Correct, some of these instructions use opsel[1] which in LLVM in stored in > > src1_modifiers so a dummy src1 is used. > > Why can't we just use `SRCMODS.OP_SEL_1` with src0? That could work. We would have to make custom encoding classes then since OP_SEL_1 would have

[clang] [clang-tools-extra] [compiler-rt] [llvm] [flang] [libc] [lld] [lldb] [libcxx] [AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (PR #78414)

2024-01-22 Thread Mirko Brkušanin via cfe-commits
@@ -626,11 +629,82 @@ class Cvt_PK_F32_F8_Pat; -foreach Index = [0, -1] in { - def : Cvt_PK_F32_F8_Pat; - def : Cvt_PK_F32_F8_Pat; +let SubtargetPredicate = isGFX9Only in { + foreach Index = [0, -1] in { +def : Cvt_PK_F32_F8_Pat; +def : Cvt_PK_F32_F8_Pat; + } +} + +

[libc] [clang] [libcxx] [llvm] [lld] [clang-tools-extra] [flang] [compiler-rt] [lldb] [AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (PR #78414)

2024-01-22 Thread Mirko Brkušanin via cfe-commits
mbrkusanin wrote: > > > Why is so there so much special casing in the assembler/disassembler? > > > > > > I'm not an original author of these change, but from what I understand it > > is a workaround to handle VOP3 instructions which have a single source but > > require the use of two bits

[libc] [flang] [compiler-rt] [llvm] [clang-tools-extra] [lldb] [clang] [libcxx] [lld] [AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (PR #78414)

2024-01-22 Thread Mirko Brkušanin via cfe-commits
mbrkusanin wrote: > > Why is so there so much special casing in the assembler/disassembler? > > I'm not an original author of these change, but from what I understand it is > a workaround to handle VOP3 instructions which have a single source but > require the use of two bits from OPSEL.

[clang] [llvm] [mlir] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-19 Thread Mirko Brkušanin via cfe-commits
mbrkusanin wrote: Rebased. https://github.com/llvm/llvm-project/pull/77795 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[llvm] [clang] [AMDGPU] Do not emit `V_DOT2C_F32_F16_e32` on GFX12 (PR #78709)

2024-01-19 Thread Mirko Brkušanin via cfe-commits
https://github.com/mbrkusanin approved this pull request. https://github.com/llvm/llvm-project/pull/78709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [clang-tools-extra] [AMDGPU] Update uses of new VOP2 pseudos for GFX12 (PR #78155)

2024-01-18 Thread Mirko Brkušanin via cfe-commits
https://github.com/mbrkusanin approved this pull request. https://github.com/llvm/llvm-project/pull/78155 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-18 Thread Mirko Brkušanin via cfe-commits
@@ -423,6 +423,67 @@ TARGET_BUILTIN(__builtin_amdgcn_s_wakeup_barrier, "vi", "n", "gfx12-insts") TARGET_BUILTIN(__builtin_amdgcn_s_barrier_leave, "b", "n", "gfx12-insts") TARGET_BUILTIN(__builtin_amdgcn_s_get_barrier_state, "Uii", "n", "gfx12-insts")

[llvm] [clang] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-11 Thread Mirko Brkušanin via cfe-commits
mbrkusanin wrote: Note that the first commit in this PR is: https://github.com/llvm/llvm-project/pull/77785 https://github.com/llvm/llvm-project/pull/77795 ___ cfe-commits mailing list cfe-commits@lists.llvm.org