[clang] 4a78225 - [AMDGPU] Add WMMA clang builtins

2022-06-30 Thread Piotr Sobczak via cfe-commits
Author: Piotr Sobczak
Date: 2022-07-01T08:55:25+02:00
New Revision: 4a782252127761b60d33e74f9d9acb0aad6f742f
URL: https://github.com/llvm/llvm-project/commit/4a782252127761b60d33e74f9d9acb0aad6f742f
DIFF: https://github.com/llvm/llvm-project/commit/4a782252127761b60d33e74f9d9acb0aad6f742f.diff

[clang-tools-extra] [AMDGPU] Rematerialize scalar loads (PR #68778)

2023-10-25 Thread Piotr Sobczak via cfe-commits
piotrAMD wrote: Thanks. If there are no more comments, I will merge the change tomorrow. https://github.com/llvm/llvm-project/pull/68778 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang-tools-extra] [AMDGPU] Rematerialize scalar loads (PR #68778)

2023-10-26 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD closed https://github.com/llvm/llvm-project/pull/68778

[llvm] [clang] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-11 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD created https://github.com/llvm/llvm-project/pull/77772
Support new amdgcn_global_load_tr instructions for load with transpose.
* MC layer support for GLOBAL_LOAD_TR_B64/GLOBAL_LOAD_TR_B128
* Intrinsics int_amdgcn_global_load_tr_b64/int_amdgcn_global_load_tr_b128
* C

[llvm] [clang] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-12 Thread Piotr Sobczak via cfe-commits
@@ -18240,65 +18240,211 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w32: case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu4_w64: case AMDGPU::BI__builtin_amdgcn_wmma_i32_16x16x16_iu8_w32: -

[llvm] [clang] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-12 Thread Piotr Sobczak via cfe-commits
@@ -18178,6 +18178,51 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, llvm::Function *F = CGM.getIntrinsic(IID, {ArgTy}); return Builder.CreateCall(F, {Addr, Val, ZeroI32, ZeroI32, ZeroI1}); } + case AMDGPU::BI__builtin_amdgcn_global_load_tr_b64

[llvm] [clang] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-12 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD updated https://github.com/llvm/llvm-project/pull/77772 >From 1b2085465dd0988459a4c71dab6cd65b1de065be Mon Sep 17 00:00:00 2001 From: Piotr Sobczak Date: Thu, 11 Jan 2024 14:52:59 +0100 Subject: [PATCH 1/2] [AMDGPU] Add global_load_tr for GFX12 Support new amdgcn_gl

[llvm] [clang] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-12 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD updated https://github.com/llvm/llvm-project/pull/77772 >From 1b2085465dd0988459a4c71dab6cd65b1de065be Mon Sep 17 00:00:00 2001 From: Piotr Sobczak Date: Thu, 11 Jan 2024 14:52:59 +0100 Subject: [PATCH 1/3] [AMDGPU] Add global_load_tr for GFX12 Support new amdgcn_gl

[clang-tools-extra] [AMDGPU] Rematerialize scalar loads (PR #68778)

2023-10-25 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD updated https://github.com/llvm/llvm-project/pull/68778 >From 6b5ada294d999ba1412020806ce8fab8e34a408e Mon Sep 17 00:00:00 2001 From: Piotr Sobczak Date: Wed, 11 Oct 2023 10:32:57 +0200 Subject: [PATCH 1/5] [AMDGPU] Rematerialize scalar loads Extend the list of inst

[libcxx] [clang] [llvm] [mlir] [compiler-rt] [flang] [libc] [AMDGPU] Define new targets gfx1200 and gfx1201 (PR #73133)

2023-11-23 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD approved this pull request. LGTM. The failures in buildkite/github-pull-requests look unrelated. https://github.com/llvm/llvm-project/pull/73133

[lldb] [lld] [llvm] [flang] [libc] [libcxx] [clang-tools-extra] [compiler-rt] [clang] [libcxxabi] [libunwind] [AMDGPU] Add test for GCNRegPressure tracker bug (PR #73786)

2023-11-30 Thread Piotr Sobczak via cfe-commits
piotrAMD wrote: Rebased and added missing live-throughs. https://github.com/llvm/llvm-project/pull/73786

[lldb] [compiler-rt] [lld] [libc] [libcxx] [llvm] [flang] [clang-tools-extra] [clang] [libunwind] [libcxxabi] [AMDGPU] Add test for GCNRegPressure tracker bug (PR #73786)

2023-11-30 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD closed https://github.com/llvm/llvm-project/pull/73786

[llvm] [clang] [AMDGPU] GFX12: Add Split Workgroup Barrier (PR #74836)

2023-12-12 Thread Piotr Sobczak via cfe-commits
@@ -684,6 +684,59 @@ # GFX12: s_rndne_f16 s5, 0x3456 ; encoding: [0xff,0x6e,0x85,0xbe,0x56,0x34,0x00,0x00] 0xff,0x6e,0x85,0xbe,0x56,0x34,0x00,0x00 +# GFX12: s_barrier_signal -2 ; encoding: [0xc2,0x4e,0x80,0xbe] +0xc2,0x4e,0x80,0xbe + +#

[compiler-rt] [clang] [libcxx] [llvm] [flang] [clang-tools-extra] [libc] [AMDGPU] GFX12: Add Split Workgroup Barrier (PR #74836)

2023-12-12 Thread Piotr Sobczak via cfe-commits
@@ -15,6 +15,15 @@ # GFX12: s_singleuse_vdst 0x1234 ; encoding: [0x34,0x12,0x93,0xbf] 0x34,0x12,0x93,0xbf +# GFX12: s_barrier_wait 0xffff ; encoding: [0xff,0xff,0x94,0xbf] +0xff,0xff,0x94,0xbf + +# GFX12: s_barrier_wait 1

[clang] [clang-tools-extra] [llvm] [libcxx] [flang] [compiler-rt] [libc] [AMDGPU] GFX12: Add Split Workgroup Barrier (PR #74836)

2023-12-12 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD approved this pull request. LGTM with a nit. https://github.com/llvm/llvm-project/pull/74836

[clang] [libcxx] [llvm] [lld] [compiler-rt] [flang] [libc] [lldb] [clang-tools-extra] [AMDGPU] Update IEEE and DX10_CLAMP for GFX12 (PR #75030)

2023-12-13 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD closed https://github.com/llvm/llvm-project/pull/75030

[lld] [libc] [clang] [flang] [llvm] [lldb] [clang-tools-extra] [libcxx] [mlir] [compiler-rt] [AMDGPU] Min/max changes for GFX12 (PR #75214)

2023-12-13 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD closed https://github.com/llvm/llvm-project/pull/75214

[clang] [llvm] [AMDGPU][GFX12] Add new v_permlane16 variants (PR #75475)

2023-12-14 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD approved this pull request. LGTM. You could also update existing permlane tests with run lines for gfx12:
* test/CodeGen/AMDGPU/llvm.amdgcn.permlane.ll
* test/CodeGen/AMDGPU/vcmpx-permlane-hazard.mir
This can also be a separate patch. https://github.com/llvm/llvm-pro

[clang] AMDGPU: Rename and add bf16 support for global_load_tr builtins (PR #86202)

2024-03-21 Thread Piotr Sobczak via cfe-commits
piotrAMD wrote: The change LG - thanks for adding support for bf16. Agreed that the intrinsics should match the builtins for consistency (now or in a follow-up commit). These intrinsics were added for the upcoming generation - it should be fine to rename them at this stage. https://github.com

[clang] [AMDGPU] Check wavefrontsize for GFX11 WMMA builtins (PR #79980)

2024-01-30 Thread Piotr Sobczak via cfe-commits
piotrAMD wrote: Do you think it makes sense to add two gfx11 tests where _w32 variant is now rejected with w64, and _w64 variant rejected with w32? Maybe what is being printed in *-gfx10-err.cl test is enough, though. https://github.com/llvm/llvm-project/pull/79980

[clang] [AMDGPU] Check wavefrontsize for GFX11 WMMA builtins (PR #79980)

2024-01-30 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD approved this pull request. https://github.com/llvm/llvm-project/pull/79980

[llvm] [clang-tools-extra] [clang] [AMDGPU] Add IR-level pass to rewrite away address space 7 (PR #77952)

2024-02-02 Thread Piotr Sobczak via cfe-commits
@@ -0,0 +1,1983 @@ +//===-- AMDGPULowerBufferFatPointers.cpp ---=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0

[llvm] [clang] [clang-tools-extra] [AMDGPU] Add IR-level pass to rewrite away address space 7 (PR #77952)

2024-02-02 Thread Piotr Sobczak via cfe-commits
piotrAMD wrote: Came across an assertion, see the attachment for the reproducer [repro.txt](https://github.com/llvm/llvm-project/files/14138750/repro.txt): opt -S -mcpu=gfx1100 -amdgpu-lower-buffer-fat-pointers repro.txt #10 0x03aed749 llvm::StructLayout::StructLayout(llvm::StructType*

[lld] [lldb] [libcxx] [clang] [libc] [clang-tools-extra] [flang] [llvm] [AMDGPU] Add pal metadata 3.0 support to callable pal funcs (PR #67104)

2024-02-05 Thread Piotr Sobczak via cfe-commits
@@ -1025,6 +1025,26 @@ void AMDGPUAsmPrinter::EmitProgramInfoSI(const MachineFunction &MF, OutStreamer->emitInt32(MFI->getNumSpilledVGPRs()); } +// Helper function to add common PAL Metadata 3.0+ +static void EmitPALMetadataCommon(AMDGPUPALMetadata *MD, +

[clang] [lld] [flang] [llvm] [compiler-rt] [openmp] [lldb] [clang-tools-extra] [libcxx] [libc] [mlir] AMDGPU: Do not generate non-temporal hint when Load_Tr intrinsic did not specify it (PR #79104)

2024-01-23 Thread Piotr Sobczak via cfe-commits
@@ -13,9 +13,8 @@ define amdgpu_kernel void @global_load_tr_b64(ptr addrspace(1) %addr, ptr addrsp ; GFX12-SDAG-W32-NEXT:s_load_b128 s[0:3], s[0:1], 0x24 ; GFX12-SDAG-W32-NEXT:v_mov_b32_e32 v2, 0 ; GFX12-SDAG-W32-NEXT:s_wait_kmcnt 0x0 -; GFX12-SDAG-W32-NEXT:glo

[clang] [llvm] [mlir] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-24 Thread Piotr Sobczak via cfe-commits
@@ -2601,67 +2601,73 @@ def int_amdgcn_ds_bvh_stack_rtn : [ImmArg>, IntrWillReturn, IntrNoCallback, IntrNoFree] >; +def int_amdgcn_s_wait_event_export_ready : + ClangBuiltin<"__builtin_amdgcn_s_wait_event_export_ready">, + Intrinsic<[], [], [IntrNoMem, IntrHasSideEffec

[llvm] [clang] [mlir] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-24 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD approved this pull request. https://github.com/llvm/llvm-project/pull/77795

[llvm] [clang] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-15 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD updated https://github.com/llvm/llvm-project/pull/77772 >From 1b2085465dd0988459a4c71dab6cd65b1de065be Mon Sep 17 00:00:00 2001 From: Piotr Sobczak Date: Thu, 11 Jan 2024 14:52:59 +0100 Subject: [PATCH 1/4] [AMDGPU] Add global_load_tr for GFX12 Support new amdgcn_gl

[llvm] [clang] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-15 Thread Piotr Sobczak via cfe-commits
@@ -2496,6 +2496,26 @@ def int_amdgcn_flat_atomic_fmax_num : AMDGPUAtomicRtn; def int_amdgcn_global_atomic_fmin_num : AMDGPUAtomicRtn; def int_amdgcn_global_atomic_fmax_num : AMDGPUAtomicRtn; +class AMDGPUGlobalLoadTr : + Intrinsic< +[data_ty], +[global_ptr_ty], +

[clang] [llvm] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-16 Thread Piotr Sobczak via cfe-commits
@@ -18178,6 +18178,51 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, llvm::Function *F = CGM.getIntrinsic(IID, {ArgTy}); return Builder.CreateCall(F, {Addr, Val, ZeroI32, ZeroI32, ZeroI1}); } + case AMDGPU::BI__builtin_amdgcn_global_load_tr_b64

[llvm] [clang] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-16 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD updated https://github.com/llvm/llvm-project/pull/77772 >From 1b2085465dd0988459a4c71dab6cd65b1de065be Mon Sep 17 00:00:00 2001 From: Piotr Sobczak Date: Thu, 11 Jan 2024 14:52:59 +0100 Subject: [PATCH 1/5] [AMDGPU] Add global_load_tr for GFX12 Support new amdgcn_gl

[llvm] [clang] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-18 Thread Piotr Sobczak via cfe-commits
piotrAMD wrote: Discussed it some more internally and the agreement was to keep the "global" and have one intrinsic for both instructions. Just updated the PR to reflect that - this effectively reverts the previous update. https://github.com/llvm/llvm-project/pull/77772

[clang-tools-extra] [libc] [openmp] [llvm] [clang] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-18 Thread Piotr Sobczak via cfe-commits
piotrAMD wrote: Rebased and regenerated lit tests after GFX12 waitcnt codegen changes. https://github.com/llvm/llvm-project/pull/77772

[libc] [llvm] [clang] [clang-tools-extra] [openmp] [AMDGPU] Add global_load_tr for GFX12 (PR #77772)

2024-01-18 Thread Piotr Sobczak via cfe-commits
https://github.com/piotrAMD closed https://github.com/llvm/llvm-project/pull/77772

[clang] [compiler-rt] [llvm] [PAC][AArch64] Support init/fini array signing (PR #96478)

2024-08-06 Thread Piotr Sobczak via cfe-commits
piotrAMD wrote: I am getting build errors with gcc. Any ideas? ``` llvm-project/compiler-rt/lib/builtins/crtbegin.c:11:18: error: missing binary operator before token "(" 11 | #if __has_feature(ptrauth_init_fini) | ^ llvm-project/compiler-rt/lib/builtins/crtbegin.c:
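The "missing binary operator before token" diagnostic is what a preprocessor typically reports when `__has_feature` is not defined as a function-like macro, so `#if __has_feature(...)` does not parse. A common defensive idiom is sketched below; it is only an illustration and not necessarily the change that landed upstream. The names `INIT_FINI_SIGNED` and `init_fini_signed` are hypothetical and do not come from crtbegin.c.

```c
/* Fallback for compilers that do not provide __has_feature:
   define it as a macro that always evaluates to 0, so the
   #if below parses everywhere. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Hypothetical flag: 1 when the compiler reports the
   ptrauth_init_fini feature, 0 otherwise. */
#if __has_feature(ptrauth_init_fini)
#define INIT_FINI_SIGNED 1
#else
#define INIT_FINI_SIGNED 0
#endif

/* Hypothetical helper so the result can be inspected at run time. */
int init_fini_signed(void) { return INIT_FINI_SIGNED; }
```

With the fallback in place, the feature check simply evaluates to false on compilers that lack `__has_feature`, instead of stopping the build.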

[clang] [compiler-rt] [llvm] [PAC][AArch64] Support init/fini array signing (PR #96478)

2024-08-06 Thread Piotr Sobczak via cfe-commits
piotrAMD wrote: Thanks for the quick fix, but apparently the fix was already submitted in 41b83ca559c402d238e303c0ac233180d60dcd57 - I wasn't aware of that. https://github.com/llvm/llvm-project/pull/96478