[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-13 Thread Krzysztof Drewniak via cfe-commits
@@ -0,0 +1,9 @@ + +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -triple amdgcn -emit-llvm -o - %s -debug-info-kind=limited 2>&1 | FileCheck %s + +// CHECK: name: "__amdgcn_buffer_rsrc_t",{{.*}}baseType: ![[BT:[0-9]+]] +// CHECK: [[BT]] = !DICompositeType(tag:

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-12 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: Just a note - and maybe this was already discussed above - is there a good reason not to explicitly make this type a 128-bit scalar? The LLVM data layout already does this https://github.com/llvm/llvm-project/pull/94830

[clang] [Clang][AMDGPU] Add a builtin for llvm.amdgcn.make.buffer.rsrc intrinsic (PR #95276)

2024-06-12 Thread Krzysztof Drewniak via cfe-commits
@@ -0,0 +1,95 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -cl-std=CL2.0 -target-cpu verde -emit-llvm -o - %s | FileCheck %s +// RUN: %clang_cc1 -triple

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-10 Thread Krzysztof Drewniak via cfe-commits
@@ -2201,6 +2207,9 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const { Align = 8; \ break; #include "clang/Basic/WebAssemblyReferenceTypes.def" +case BuiltinType::AMDGPUBufferRsrc: +

[clang] [llvm] [APFloat] Add APFloat support for FP6 data types (PR #94735)

2024-06-07 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: @ThomasRaoux No, I just left a nitpick. I'm happy with the state of this. https://github.com/llvm/llvm-project/pull/94735 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: (The ugly version of the arbitrary types code lives around https://github.com/GPUOpen-Drivers/llpc/blob/6c770c7d276d2c2504aed2a0278aab1610993ecf/lgc/patch/PatchBufferOp.cpp#L1559 and really should be an isel legalization instead) https://github.com/llvm/llvm-project/pull/94576

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: (My guess for how I might use `soffset` is if I've got multiple identical buffers concatenated and I need to pick between them without messing with the extent field) https://github.com/llvm/llvm-project/pull/94576
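
A minimal sketch of the concatenated-buffers idea in this comment, not code from the PR: if several equally sized sub-buffers sit back to back behind one descriptor, a wave-uniform `soffset` can select the sub-buffer while `voffset` stays a per-lane offset within it. The names `BufferOffsets` and `selectSubBuffer` are hypothetical, and the hardware bounds-checking behaviour, which the thread notes differs between targets, is deliberately not modelled.

```cpp
#include <cstdint>

// Hypothetical pair of byte offsets as they might be handed to a raw buffer
// load/store; this is not a type from LLVM or Clang.
struct BufferOffsets {
  std::uint32_t voffset; // per-lane byte offset (would live in a VGPR)
  std::uint32_t soffset; // wave-uniform byte offset (would live in an SGPR)
};

// Pick sub-buffer `subBufferIndex` out of a run of identical sub-buffers of
// `subBufferBytes` bytes each, leaving the descriptor's extent untouched.
constexpr BufferOffsets selectSubBuffer(std::uint32_t subBufferIndex,
                                        std::uint32_t subBufferBytes,
                                        std::uint32_t offsetInSubBuffer) {
  return BufferOffsets{offsetInSubBuffer, subBufferIndex * subBufferBytes};
}
```

For instance, `selectSubBuffer(2, 1024, 16)` keeps the per-lane offset at 16 and puts 2048 in the uniform offset.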

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: The thing is, in all the use cases I've seen, `soffset == 0`, and so you can legalize on `voffset` (voffset is also what the constant offsets on an instruction get added to) https://github.com/llvm/llvm-project/pull/94576

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: `raw.ptr.buffer.load` (and `.store`) are loads and stores and should be able to deal with any type you could send through a normal pointer (especially since a partially-OOB read is already hardware-level UB, so extending that through the intrinsics is reasonable)

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: `voffset` and `soffset` are "offset that goes in VGPRs" and "offset that goes in SGPRs", with the latter having some different bounds-checking semantics on ... at least some of the gfx9's, IIRC. The address space 7 lowering just uses voffset. Re arbitrary aggregates: LLPC has

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: 1. For the swizzled case, that's `struct.ptr.buffer.*`, and yeah, those will always need builtins because LLVM can't deal in 2D addressing schemes 2. What I mean is that "types that work" isn't the right framing: any type can be legalized to one or more types that work. That

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: Actually, even ignoring address space 7, it feels like these builtins if you could `raw.ptr.buffer.store` any type you liked, and then they could be type-varying in Clang? https://github.com/llvm/llvm-project/pull/94576

[clang] [llvm] [APFloat] Add APFloat support for FP6 data types (PR #94735)

2024-06-07 Thread Krzysztof Drewniak via cfe-commits
https://github.com/krzysz00 edited https://github.com/llvm/llvm-project/pull/94735 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [APFloat] Add APFloat support for FP6 data types (PR #94735)

2024-06-07 Thread Krzysztof Drewniak via cfe-commits
@@ -139,6 +143,10 @@ static constexpr fltSemantics semFloat8E4M3FNUZ = { static constexpr fltSemantics semFloat8E4M3B11FNUZ = { 4, -10, 4, 8, fltNonfiniteBehavior::NanOnly, fltNanEncoding::NegativeZero}; static constexpr fltSemantics semFloatTF32 = {127, -126, 11, 19};

[clang] [llvm] [APFloat] Add APFloat support for FP6 data types (PR #94735)

2024-06-07 Thread Krzysztof Drewniak via cfe-commits
https://github.com/krzysz00 commented: I have no issues with the code as written, but I'm rather confused by how it will be used. What's the motivation for this PR? Will anyone be trying to constant-fold these things? (If it's for MLIR support, I'd like to have a discussion there, since I

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-06 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: Re addrspace 7, there's one major piece of work missing: arbitrary-typed inputs. That is, we can't currently handle, for example, `load <16 x i8>, ptr addrspace(7) %p` (or, worse, `load i256, ptr addrspace(7) %p`). That's been a follow-up ticket I never have time to do. If we do

[llvm] [clang] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-06 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: @arsenm Are you suggesting that these should instead be a range of minimum/maximum number of workitems globally? https://github.com/llvm/llvm-project/pull/79035

[llvm] [lldb] [lld] [libc] [clang-tools-extra] [clang] [libcxx] [flang] [AMDGPU] Add IR-level pass to rewrite away address space 7 (PR #77952)

2024-02-02 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: @piotrAMD Thanks for the thorough testing! I found the issue (stale pointer) and your code also gave me an unrelated crash, namely that I wasn't correctly handling unreachable intrinsics. https://github.com/llvm/llvm-project/pull/77952

[clang] [clang-tools-extra] [llvm] [AMDGPU] Add IR-level pass to rewrite away address space 7 (PR #77952)

2024-02-01 Thread Krzysztof Drewniak via cfe-commits
@@ -0,0 +1,1983 @@ +//===-- AMDGPULowerBufferFatPointers.cpp ---=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-01-30 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: Yeah, that's my proposal for metadata that's useful to record, especially since `min == max` gives the present case https://github.com/llvm/llvm-project/pull/79035

[llvm] [clang] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-01-29 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: I'm suggesting that this might be a more general design and that there might be more uses for it. https://github.com/llvm/llvm-project/pull/79035

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-01-29 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: Do we want to also get `min-num-work-groups` and `max-num-work-groups` versions? https://github.com/llvm/llvm-project/pull/79035

[llvm] [clang-tools-extra] [clang] [SeperateConstOffsetFromGEP] Handle `or disjoint` flags (PR #76997)

2024-01-26 Thread Krzysztof Drewniak via cfe-commits
https://github.com/krzysz00 closed https://github.com/llvm/llvm-project/pull/76997

[clang] [clang-tools-extra] [llvm] [SeperateConstOffsetFromGEP] Handle `or disjoint` flags (PR #76997)

2024-01-26 Thread Krzysztof Drewniak via cfe-commits
https://github.com/krzysz00 edited https://github.com/llvm/llvm-project/pull/76997

[clang] [clang-tools-extra] [llvm] [SeperateConstOffsetFromGEP] Handle `or disjoint` flags (PR #76997)

2024-01-25 Thread Krzysztof Drewniak via cfe-commits
https://github.com/krzysz00 updated https://github.com/llvm/llvm-project/pull/76997 From 5cc46862df42e7d01a2d45ccc18f221744af0b93 Mon Sep 17 00:00:00 2001 From: Krzysztof Drewniak Date: Thu, 4 Jan 2024 20:20:54 + Subject: [PATCH 1/2] [SeperateConstOffsetFromGEP] Handle `or disjoint` flags

[llvm] [clang] [mlir] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-22 Thread Krzysztof Drewniak via cfe-commits
@@ -253,22 +253,22 @@ def ROCDL_mfma_f32_32x32x16_fp8_fp8 : ROCDL_Mfma_IntrOp<"mfma.f32.32x32x16.fp8.f //===-===// // WMMA intrinsics -class ROCDL_Wmma_IntrOp traits = []> : +class ROCDL_Wmma_IntrOp

[clang] [llvm] [mlir] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-22 Thread Krzysztof Drewniak via cfe-commits
@@ -253,22 +253,22 @@ def ROCDL_mfma_f32_32x32x16_fp8_fp8 : ROCDL_Mfma_IntrOp<"mfma.f32.32x32x16.fp8.f //===-===// // WMMA intrinsics -class ROCDL_Wmma_IntrOp traits = []> : +class ROCDL_Wmma_IntrOp

[clang] [mlir] [llvm] [AMDGPU] Add GFX12 WMMA and SWMMAC instructions (PR #77795)

2024-01-22 Thread Krzysztof Drewniak via cfe-commits
@@ -253,22 +253,22 @@ def ROCDL_mfma_f32_32x32x16_fp8_fp8 : ROCDL_Mfma_IntrOp<"mfma.f32.32x32x16.fp8.f //===-===// // WMMA intrinsics -class ROCDL_Wmma_IntrOp traits = []> : +class ROCDL_Wmma_IntrOp

[llvm] [clang] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #75647)

2024-01-17 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: "dispatch size"? https://github.com/llvm/llvm-project/pull/75647

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #75647)

2024-01-16 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: Good to know that other targets have that sort of "how many work groups will be launched" information. Having that be a min/max (either per dimension or in total or both) may be the right approach here, and this could be a good excuse for the unification being talked about.

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #75647)

2024-01-15 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: I'd go with Matt's point: close this, and then add metadata for required launch grid sizes. Then you can update `AMDGPULowerKernelAttributes` to use said metadata. https://github.com/llvm/llvm-project/pull/75647

[llvm] [clang] [compiler-rt] [clang-tools-extra] [AMDGPU] Avoid hitting AMDGPUAsmPrinter related asserts for local functions at O0 (PR #72129)

2024-01-12 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: As a somewhat naive question, what would it take to turn off requiring codegen to be in SCC order? We seem to be the only target doing that. The comments on that line say something about function calls and noinline https://github.com/llvm/llvm-project/pull/72129

[llvm] [clang] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #75647)

2024-01-10 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: @arsenm It's entirely possible that max dispatch size per dimension is the right feature instead, now that you mention it (I keep forgetting we have a grid). Currently I was thinking this'll be useful for `KnownBits`-type info, so ... yeah, per-dimension
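
One way to read the `KnownBits` remark, as a sketch under assumptions rather than anything taken from the PR: if a kernel is known to launch at most `maxGroups` work-groups along a dimension, the work-group ID in that dimension is below `maxGroups`, so its high bits are known to be zero. The helper names `bitsNeeded` and `knownZeroHighBits` are hypothetical, and a 32-bit ID is assumed.

```cpp
#include <cstdint>

// Number of bits needed to represent `v` (0 needs no bits).
constexpr unsigned bitsNeeded(std::uint32_t v) {
  return v == 0 ? 0u : 1u + bitsNeeded(v >> 1);
}

// Leading bits of a 32-bit work-group ID that must be zero when the ID is
// known to lie in [0, maxGroups). Hypothetical helper, not an LLVM API.
constexpr unsigned knownZeroHighBits(std::uint32_t maxGroups) {
  // The largest possible ID is maxGroups - 1.
  return 32u - bitsNeeded(maxGroups == 0 ? 0u : maxGroups - 1u);
}
```

With a per-dimension bound of 256 work-groups, for example, the top 24 bits of the ID in that dimension would be known zero.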

[mlir] [clang] [llvm] [AMDGPU] - Add address space for strided buffers (PR #74471)

2023-12-12 Thread Krzysztof Drewniak via cfe-commits
@@ -864,6 +865,16 @@ supported for the ``amdgcn`` target. (bits `127:96`). The specific interpretation of these fields varies by the target architecture and is detailed in the ISA descriptions. +**Buffer Strided Pointer** + The buffer index pointer is an experimental

[llvm] [mlir] [clang] [AMDGPU] - Add address space for strided buffers (PR #74471)

2023-12-12 Thread Krzysztof Drewniak via cfe-commits
https://github.com/krzysz00 edited https://github.com/llvm/llvm-project/pull/74471

[llvm] [clang] [mlir] [AMDGPU] - Add address space for strided buffers (PR #74471)

2023-12-12 Thread Krzysztof Drewniak via cfe-commits
https://github.com/krzysz00 approved this pull request. Looks good to me, aside from a documentation nit. https://github.com/llvm/llvm-project/pull/74471

[llvm] [mlir] [clang] [AMDGPU] - Add address space for strided buffers (PR #74471)

2023-12-07 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: I'm going to ask the annoying questions: 1. Isn't a strided buffer one where the field that's named something like `stride` (bits 61:48 or 63:48) is non-zero? 2. And therefore it uses structured buffers and the `llvm.struct[.ptr].buffer.*` intrinsics? 3. So, with LLVM's gep, how
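
The bit ranges quoted in this comment can be made concrete with a small sketch. It assumes the 128-bit descriptor is held as four 32-bit words, so descriptor bit 48 is bit 16 of the second word; the function names are hypothetical, and the field's exact name and width are target-dependent, as the comment itself hedges.

```cpp
#include <cstdint>

// Extract a `width`-bit field starting at descriptor bit 48, given `word1`,
// the descriptor's bits 63:32. width = 14 covers bits 61:48 and width = 16
// covers bits 63:48. Hypothetical helper, not an LLVM API.
constexpr std::uint32_t strideField(std::uint32_t word1, unsigned width) {
  return (word1 >> 16) & ((1u << width) - 1u);
}

// Under the comment's reading, a buffer is "strided" when that field is
// non-zero.
constexpr bool looksStrided(std::uint32_t word1, unsigned width) {
  return strideField(word1, width) != 0;
}
```

Note that the two candidate widths disagree only when bits 63:62 are set, which is why the choice between 61:48 and 63:48 matters for the question being asked.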

[mlir] [flang] [clang-tools-extra] [compiler-rt] [clang] [libcxx] [llvm] [libc] Make SmallVectorImpl destructor protected (PR #71439)

2023-11-08 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: I put up a PR to fix SerializeToHsaco and unblock this https://github.com/llvm/llvm-project/pull/71439

[libc] [clang-tools-extra] [libcxx] [clang] [llvm] [flang] [compiler-rt] [mlir] Make SmallVectorImpl destructor protected (PR #71439)

2023-11-07 Thread Krzysztof Drewniak via cfe-commits
krzysz00 wrote: I don't know if I'll be able to get to the SerializeToHsaco fix today, but passing in `SmallVectorImpl&` would be my preferred solution ... Or really that should be `MemoryBuffer &` or some other such structure if feasible. https://github.com/llvm/llvm-project/pull/71439

[clang] [Sema] -Wzero-as-null-pointer-constant: don't warn for __null (PR #69126)

2023-10-24 Thread Krzysztof Drewniak via cfe-commits
https://github.com/krzysz00 updated https://github.com/llvm/llvm-project/pull/69126 From 357a21c38c1036a012affc85026fcba376ab7128 Mon Sep 17 00:00:00 2001 From: Arseny Kapoulkine Date: Sun, 15 Oct 2023 13:20:31 -0700 Subject: [PATCH 1/2] [Sema] -Wzero-as-null-pointer-constant: don't warn for

[clang] 5d8da5a - Add missing cases to clang switch after D141863

2023-02-09 Thread Krzysztof Drewniak via cfe-commits
Author: Krzysztof Drewniak Date: 2023-02-09T23:17:55Z New Revision: 5d8da5a208e6501baff7a8fd8de76ea143e49646 URL: https://github.com/llvm/llvm-project/commit/5d8da5a208e6501baff7a8fd8de76ea143e49646 DIFF: https://github.com/llvm/llvm-project/commit/5d8da5a208e6501baff7a8fd8de76ea143e49646.diff

[clang] d6ef3d2 - [mlir] Remove VectorToROCDL

2022-07-12 Thread Krzysztof Drewniak via cfe-commits
Author: Krzysztof Drewniak Date: 2022-07-12T15:21:22Z New Revision: d6ef3d20b4e3768dc30fb229dfa938d8059fffef URL: https://github.com/llvm/llvm-project/commit/d6ef3d20b4e3768dc30fb229dfa938d8059fffef DIFF: https://github.com/llvm/llvm-project/commit/d6ef3d20b4e3768dc30fb229dfa938d8059fffef.diff