https://github.com/arsenm updated
https://github.com/llvm/llvm-project/pull/96738
>From 5f614809ac4ffa5e29a01c7e9410d91eadcbe6f2 Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Tue, 11 Jun 2024 10:40:27 +0200
Subject: [PATCH 1/2] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins
---
c
@@ -178,6 +178,20 @@ bool AMDGPUAtomicOptimizerImpl::run(Function &F) {
return Changed;
}
+static bool shouldOptimize(Type *Ty) {
arsenm wrote:
Better name that expresses why this type is handleable.
Also in a follow up, really should cover the i16/half/b
@@ -178,6 +178,20 @@ bool AMDGPUAtomicOptimizerImpl::run(Function &F) {
return Changed;
}
+static bool shouldOptimize(Type *Ty) {
+ switch (Ty->getTypeID()) {
+ case Type::FloatTyID:
+ case Type::DoubleTyID:
+return true;
+ case Type::IntegerTyID: {
+if (Ty->getI
https://github.com/arsenm ready_for_review
https://github.com/llvm/llvm-project/pull/96738
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
arsenm wrote:
* **#96739** https://app.graphite.dev/github/pr/llvm/llvm-project/96739?utm_source=stack-comment-icon
* **#96738** https://app.graphite.dev/github/pr/llvm/llvm-proj
https://github.com/arsenm created
https://github.com/llvm/llvm-project/pull/96738
None
>From 0d9ab2bcbaa2b4b11832a8ac1848505cf73f4880 Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Tue, 11 Jun 2024 10:40:27 +0200
Subject: [PATCH] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins
---
arsenm wrote:
> > > > Incrementing by align is just a bug, of course the size is the real
> > > > value. Whether we want to continue wasting space is another
> > > > not-correctness discussion
> > >
> > >
> > > Struct padding is pretty universal, AMDGPU seems the odd one out here. I
> > > wo
arsenm wrote:
> > Incrementing by align is just a bug, of course the size is the real value.
> > Whether we want to continue wasting space is another not-correctness
> > discussion
>
> Struct padding is pretty universal, AMDGPU seems the odd one out here. I
> wouldn't mind it so much if it di
@@ -228,10 +228,11 @@ void
AMDGPUAtomicOptimizerImpl::visitAtomicRMWInst(AtomicRMWInst &I) {
// If the value operand is divergent, each lane is contributing a different
// value to the atomic calculation. We can only optimize divergent values if
- // we have DPP availabl
@@ -311,10 +312,11 @@ void
AMDGPUAtomicOptimizerImpl::visitIntrinsicInst(IntrinsicInst &I) {
// If the value operand is divergent, each lane is contributing a different
// value to the atomic calculation. We can only optimize divergent values if
- // we have DPP availabl
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/92725
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/94576
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/89217
@@ -2626,14 +2629,20 @@ void CodeGenFunction::EmitAsmStmt(const AsmStmt &S) {
SmallVector OutputConstraintInfos;
SmallVector InputConstraintInfos;
+ const FunctionDecl *FD = dyn_cast_or_null(CurCodeDecl);
arsenm wrote:
I think we should just get rid of d
https://github.com/arsenm commented:
It's really unfortunate to have to add all this asm handling to clang. Can't it
rely on backend diagnostic remarks for this?
https://github.com/llvm/llvm-project/pull/96363
@@ -2626,14 +2629,20 @@ void CodeGenFunction::EmitAsmStmt(const AsmStmt &S) {
SmallVector OutputConstraintInfos;
SmallVector InputConstraintInfos;
+ const FunctionDecl *FD = dyn_cast_or_null(CurCodeDecl);
arsenm wrote:
Where do you get dyn_cast_or_null i
arsenm wrote:
> Kindly review only the top commit here
If you're going to repost with a pre-commit, it would be better to have all the
pieces squashed into one. Also you could look into using graphite or SPR for
managing dependent pull requests
https://github.com/llvm/llvm-project/pull/96473
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/95396
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/95593
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/95592
arsenm wrote:
### Merge activity
* **Jun 23, 4:06 AM EDT**: @arsenm started a stack merge that includes this
pull request via
[Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/95592).
https://github.com/llvm/llvm-project/pull/95592
arsenm wrote:
Incrementing by align is just a bug, of course the size is the real value.
Whether we want to continue wasting space is another not-correctness discussion
https://github.com/llvm/llvm-project/pull/96370
arsenm wrote:
> Here, because the minimum alignment is 4, we will only increment the
> buffer by 4,
It should be incrementing by the size? 4 byte aligned access of 8 byte type
should work fine
https://github.com/llvm/llvm-project/pull/96370
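The size-versus-alignment point above can be illustrated with a small model. This is only a sketch of the argument, not code from the pull request: `alignUp`, `readArg`, and `secondArgOffset` are hypothetical names, and the buffer layout (an 8-byte argument followed by a 4-byte one, with a minimum alignment of 4) is an assumed example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round Offset up to the next multiple of Align.
std::size_t alignUp(std::size_t Offset, std::size_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 &&
         "Align must be a power of two");
  return (Offset + Align - 1) & ~(Align - 1);
}

// Hypothetical model of reading one argument from a va_list-style buffer.
// Returns the offset the argument is read from and advances Cursor either
// by the argument's size or by the minimum alignment.
std::size_t readArg(std::size_t &Cursor, std::size_t Size,
                    std::size_t MinAlign, bool AdvanceBySize) {
  std::size_t ArgOffset = alignUp(Cursor, MinAlign);
  Cursor = ArgOffset + (AdvanceBySize ? Size : MinAlign);
  return ArgOffset;
}

// Assumed example: an 8-byte argument followed by a 4-byte argument.
// Returns the offset at which the second argument would be read.
std::size_t secondArgOffset(bool AdvanceBySize) {
  std::size_t Cursor = 0;
  readArg(Cursor, /*Size=*/8, /*MinAlign=*/4, AdvanceBySize);
  return readArg(Cursor, /*Size=*/4, /*MinAlign=*/4, AdvanceBySize);
}
```

In this model, advancing by the alignment puts the second argument at offset 4, inside the upper half of the 8-byte value, while advancing by the size puts it at offset 8. The 8-byte value itself is still read from a 4-byte-aligned offset in both cases.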
@@ -1671,6 +1671,7 @@ int main(int Argc, char **Argv) {
NewArgv.push_back(Arg->getValue());
for (const opt::Arg *Arg : Args.filtered(OPT_offload_opt_eq_minus))
NewArgv.push_back(Args.MakeArgString(StringRef("-") + Arg->getValue()));
+ llvm::errs() << "asdfasdf\n";
--
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/96313
@@ -149,6 +149,12 @@ BUILTIN(__builtin_amdgcn_mqsad_pk_u16_u8, "WUiWUiUiWUi",
"nc")
BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc")
BUILTIN(__builtin_amdgcn_make_buffer_rsrc, "Qbv*sii", "nc")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_b8, "vcQbiiIi", "n")
+BUILT
@@ -149,6 +149,12 @@ BUILTIN(__builtin_amdgcn_mqsad_pk_u16_u8, "WUiWUiUiWUi",
"nc")
BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc")
BUILTIN(__builtin_amdgcn_make_buffer_rsrc, "Qbv*sii", "nc")
+BUILTIN(__builtin_amdgcn_raw_buffer_store_b8, "vcQbiiIi", "n")
+BUILT
@@ -581,49 +581,19 @@ static Value
*emitCallMaybeConstrainedFPBuiltin(CodeGenFunction &CGF,
return CGF.Builder.CreateCall(F, Args);
}
-// Emit a simple mangled intrinsic that has 1 argument and a return type
-// matching the argument type.
-static Value *emitUnaryBuiltin(
@@ -149,6 +149,19 @@ BUILTIN(__builtin_amdgcn_mqsad_pk_u16_u8, "WUiWUiUiWUi",
"nc")
BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc")
BUILTIN(__builtin_amdgcn_make_buffer_rsrc, "Qbv*sii", "nc")
+BUILTIN(__builtin_amdgcn_raw_ptr_buffer_store_i8, "vcQbiiIi", "n")
--
@@ -149,6 +149,19 @@ BUILTIN(__builtin_amdgcn_mqsad_pk_u16_u8, "WUiWUiUiWUi",
"nc")
BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc")
BUILTIN(__builtin_amdgcn_make_buffer_rsrc, "Qbv*sii", "nc")
+BUILTIN(__builtin_amdgcn_raw_ptr_buffer_store_i8, "vcQbiiIi", "n")
--
@@ -626,6 +626,18 @@ static Value *emitQuaternaryBuiltin(CodeGenFunction &CGF,
const CallExpr *E,
return CGF.Builder.CreateCall(F, {Src0, Src1, Src2, Src3});
}
+static Value *emitQuinaryBuiltin(CodeGenFunction &CGF, const CallExpr *E,
arsenm wrote:
The nam
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/94576
@@ -149,6 +149,19 @@ BUILTIN(__builtin_amdgcn_mqsad_pk_u16_u8, "WUiWUiUiWUi",
"nc")
BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc")
BUILTIN(__builtin_amdgcn_make_buffer_rsrc, "Qbv*sii", "nc")
+BUILTIN(__builtin_amdgcn_raw_ptr_buffer_store_i8, "vcQbiiIi", "n")
--
https://github.com/arsenm commented:
I'm wondering if we should really have all the different typed variants, and if
this should be the name. I guess
https://github.com/llvm/llvm-project/pull/94576
@@ -169,6 +180,11 @@
// COMMON-UNSAFE-MATH-SAME: "-mlink-builtin-bitcode"
"{{.*}}/amdgcn/bitcode/oclc_finite_only_off.bc"
// COMMON-UNSAFE-MATH-SAME: "-mlink-builtin-bitcode"
"{{.*}}/amdgcn/bitcode/oclc_correctly_rounded_sqrt_off.bc"
+// ASAN-SAME: "-fsanitize=address"
+
+//
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/96262
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/96262
https://github.com/arsenm updated
https://github.com/llvm/llvm-project/pull/95396
>From f0f8e09caff2df5632d4252ca354b24c0c6f0e87 Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Mon, 10 Jun 2024 19:48:13 +0200
Subject: [PATCH] AMDGPU: Remove ds atomic fadd intrinsics
These have been replace
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/95396
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/95395
@@ -33,6 +33,7 @@
// q -> Scalable vector, followed by the number of elements and the base type.
// Q -> target builtin type, followed by a character to distinguish the
builtin type
//Qa -> AArch64 svcount_t builtin type.
+//Qb -> AMDGPU __amdgpu_buffer_rsrc_t builti
https://github.com/arsenm updated
https://github.com/llvm/llvm-project/pull/95395
>From 35c741fe2563094bc20c179ee9f244620025405c Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Mon, 10 Jun 2024 19:40:59 +0200
Subject: [PATCH] clang/AMDGPU: Emit atomicrmw from ds_fadd builtins
We should hav
@@ -33,6 +33,7 @@
// q -> Scalable vector, followed by the number of elements and the base type.
// Q -> target builtin type, followed by a character to distinguish the
builtin type
//Qa -> AArch64 svcount_t builtin type.
+//Qb -> AMDGPU __amdgpu_buffer_rsrc_t builti
@@ -19082,6 +19082,15 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned
BuiltinID,
CGM.getIntrinsic(Intrinsic::amdgcn_s_sendmsg_rtn, {ResultType});
return Builder.CreateCall(F, {Arg});
}
+ case AMDGPU::BI__builtin_amdgcn_make_buffer_rsrc: {
+llvm::Va
https://github.com/arsenm approved this pull request.
LGTM but I'm not a frontend expert
https://github.com/llvm/llvm-project/pull/94830
@@ -0,0 +1,11 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -verify -triple amdgcn-amd-amdhsa -Wno-unused-value %s
arsenm wrote:
set explicit -cl-std, and check 1.2 and 2.0?
https://github.com/llvm/llvm-project/pull/94830
@@ -305,43 +304,43 @@ bool SIAnnotateControlFlow::handleLoop(BranchInst *Term) {
}
/// Close the last opened control flow
-bool SIAnnotateControlFlow::closeControlFlow(BasicBlock *BB) {
- llvm::Loop *L = LI->getLoopFor(BB);
+bool SIAnnotateControlFlow::tryWaveReconverge(Basic
@@ -0,0 +1 @@
+remark: :0:0: removing function 'needs_extimg': +extended-image-insts
is not supported on the current target
arsenm wrote:
accidentally added file?
https://github.com/llvm/llvm-project/pull/92809
@@ -15740,6 +15740,32 @@ void
SITargetLowering::finalizeLowering(MachineFunction &MF) const {
}
}
+ // ISel inserts copy to regs for the successor PHIs
+ // at the BB end. We need to move the SI_WAVE_RECONVERGE right before the
arsenm wrote:
Can you
@@ -2103,12 +2103,36 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI)
const {
MI.setDesc(get(AMDGPU::S_MOV_B64));
break;
+ case AMDGPU::S_CMOV_B64_term:
+// This is only a terminator to get the correct spill code placement during
+// register allocat
@@ -1,3 +1,4 @@
+; XFAIL: *
arsenm wrote:
can't just xfail tests
https://github.com/llvm/llvm-project/pull/92809
@@ -3172,8 +3172,8 @@ def int_amdgcn_loop : Intrinsic<[llvm_i1_ty],
[llvm_anyint_ty], [IntrWillReturn, IntrNoCallback, IntrNoFree]
>;
-def int_amdgcn_end_cf : Intrinsic<[], [llvm_anyint_ty],
- [IntrWillReturn, IntrNoCallback, IntrNoFree]>;
+def int_amdgcn_wave_reconverge :
@@ -15740,6 +15740,32 @@ void
SITargetLowering::finalizeLowering(MachineFunction &MF) const {
}
}
+ // ISel inserts copy to regs for the successor PHIs
+ // at the BB end. We need to move the SI_WAVE_RECONVERGE right before the
+ // branch.
+ for (auto &MBB : MF) {
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/92809
https://github.com/arsenm requested changes to this pull request.
There are quite a few code quality regressions, and XFAILed tests. The
description needs more elaboration on what the strategy is here
https://github.com/llvm/llvm-project/pull/92809
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/89217
@@ -1764,6 +1764,13 @@ class TargetInfo : public TransferrableTargetInfo,
return 0;
}
+ /// \returns Target specific flat ptr address space; a flat ptr is a ptr that
+ /// can be casted to / from all other target address spaces. If the target
+ /// exposes no such add
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdg
@@ -0,0 +1,84 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
UTC_ARGS: --function-signature
+ // REQUIRES: amdgpu-registered-target
+ // RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde
-emit-llvm -o - %s | FileCheck %s
+ // RUN
@@ -0,0 +1,84 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
UTC_ARGS: --function-signature
+ // REQUIRES: amdgpu-registered-target
+ // RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde
-emit-llvm -o - %s | FileCheck %s
+ // RUN
@@ -0,0 +1,9 @@
+
arsenm wrote:
Extra blank line
https://github.com/llvm/llvm-project/pull/94830
@@ -0,0 +1,17 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -fsyntax-only -verify -std=gnu++11 -triple amdgcn
-Wno-unused-value %s
+
arsenm wrote:
We probably want another similar sema test for OpenCL/HIP/OpenMP
https://github.com/llvm/llvm-pro
@@ -1764,6 +1764,13 @@ class TargetInfo : public TransferrableTargetInfo,
return 0;
}
+ /// \returns Target specific flat ptr address space; a flat ptr is a ptr that
+ /// can be casted to / from all other target address spaces. If the target
+ /// exposes no such add
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdg
@@ -0,0 +1,86 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
UTC_ARGS: --version 5
+// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm
-o - | FileCheck --check-prefix=OPENCL12 %s
+// RUN: %clang_cc1 %s -O0 -triple amdg
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/95395
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/89217
@@ -0,0 +1,65 @@
+; RUN: llc -stop-after=amdgpu-isel -mtriple=amdgcn-- -mcpu=gfx1100
-verify-machineinstrs -o - %s | FileCheck --check-prefixes=CHECK,ISEL %s
+
+; CHECK-LABEL: name:basic_readfirstlane_i64
+; CHECK:[[TOKEN:%[0-9]+]]{{[^ ]*}} = CONVERGENCECTRL
@@ -0,0 +1,21 @@
+//===-- AMDGPUTypes.def - Metadata about AMDGPU types ---*- C++
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apa
@@ -0,0 +1,84 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
UTC_ARGS: --function-signature
+ // REQUIRES: amdgpu-registered-target
+ // RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde
-emit-llvm -o - %s | FileCheck %s
+ // RUN
@@ -0,0 +1,21 @@
+//===-- AMDGPUTypes.def - Metadata about AMDGPU types ---*- C++
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apa
@@ -0,0 +1,65 @@
+; RUN: llc -stop-after=amdgpu-isel -mtriple=amdgcn-- -mcpu=gfx1100
-verify-machineinstrs -o - %s | FileCheck --check-prefixes=CHECK,ISEL %s
+
+; CHECK-LABEL: name:basic_readfirstlane_i64
+; CHECK:[[TOKEN:%[0-9]+]]{{[^ ]*}} = CONVERGENCECTRL
@@ -6129,13 +6150,55 @@ static SDValue lowerLaneOp(const SITargetLowering &TLI,
SDNode *N,
if (ValSize % 32 != 0)
return SDValue();
+ auto unrollLaneOp = [&DAG, &SL](SDNode *N) -> SDValue {
+EVT VT = N->getValueType(0);
+unsigned NE = VT.getVectorNumElements();
@@ -0,0 +1,65 @@
+; RUN: llc -stop-after=amdgpu-isel -mtriple=amdgcn-- -mcpu=gfx1100
-verify-machineinstrs -o - %s | FileCheck --check-prefixes=CHECK,ISEL %s
+
+; CHECK-LABEL: name:basic_readfirstlane_i64
+; CHECK:[[TOKEN:%[0-9]+]]{{[^ ]*}} = CONVERGENCECTRL
@@ -0,0 +1,69 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+ // REQUIRES: amdgpu-registered-target
+ // RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde
-emit-llvm -o - %s | FileCheck %s
+ // RUN: %clang_cc1 -triple amdgcn-unkn
https://github.com/arsenm approved this pull request.
https://github.com/llvm/llvm-project/pull/95373
@@ -1125,6 +1125,22 @@ void Clang::AddPreprocessingOptions(Compilation &C,
const JobAction &JA,
CmdArgs.push_back("__clang_openmp_device_functions.h");
}
+ if (Args.hasArg(options::OPT_foffload_via_llvm)) {
+// Add llvm_wrappers/* to our system include path. This
@@ -1125,6 +1125,22 @@ void Clang::AddPreprocessingOptions(Compilation &C,
const JobAction &JA,
CmdArgs.push_back("__clang_openmp_device_functions.h");
}
+ if (Args.hasArg(options::OPT_foffload_via_llvm)) {
+// Add llvm_wrappers/* to our system include path. This
arsenm wrote:
> Just a note - and maybe this was already discussed above - is there good
> reason not to explicitly make this type a 128-bit scalar? The LLVM data
> layout already does this
I thought this was the 160 bit version?
Can we have an opaque-but-sized type? The concern is exposing
@@ -0,0 +1,95 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -cl-std=CL2.0 -target-cpu
verde -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple
arsenm wrote:
> I understand the chance of conflict is low. It may be like the chance of
> hitting by a meteor. However, if we prefix with `__amdgcn_`, there is no such
> risk. And we have the benefit to clearly indicate it is a amdgcn
> target-specific type.
Should use amdgpu
https://githu
@@ -0,0 +1,95 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -cl-std=CL2.0 -target-cpu
verde -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple
@@ -0,0 +1,14 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde -emit-llvm
-o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple amdgcn-unknown
@@ -128,12 +128,13 @@ enum class CudaArch {
GFX12_GENERIC,
GFX1200,
GFX1201,
+ AMDGCNSPIRV,
Generic, // A processor model named 'generic' if the target backend defines a
// public one.
LAST,
CudaDefault = CudaArch::SM_52,
- HIPDefault = CudaArch::
arsenm wrote:
> Or drop the new nodes altogether and legalize to intrinsics directly?
That's another option. The only real plus to the intermediate is it's slightly
less annoying to write combines for. But there are limited combining
opportunities for these
https://github.com/llvm/llvm-p
@@ -0,0 +1,46 @@
+# RUN: not --crash llc -mtriple=amdgcn -run-pass=none -verify-machineinstrs -o
/dev/null %s 2>&1 | FileCheck %s
arsenm wrote:
I'd still test all 3, but yes an IR test
https://github.com/llvm/llvm-project/pull/89217
@@ -0,0 +1,46 @@
+# RUN: not --crash llc -mtriple=amdgcn -run-pass=none -verify-machineinstrs -o
/dev/null %s 2>&1 | FileCheck %s
arsenm wrote:
You should not need to introduce any new machine verifier tests, they are not
useful. The useful test would be the IR
@@ -2201,6 +2207,9 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const
{
Align = 8;
\
break;
#include "clang/Basic/WebAssemblyReferenceTypes.def"
+case BuiltinType::AMDGPUBufferRsrc:
+ W
@@ -0,0 +1,11 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -fsyntax-only -verify -triple amdgcn -Wno-unused-value %s
+
+void foo() {
+ int n = 100;
+ __buffer_rsrc_t v = 0; // expected-error {{cannot initialize a variable of
type '__buffer_rsrc_t' with an rvalu
@@ -0,0 +1,11 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -fsyntax-only -verify -triple amdgcn -Wno-unused-value %s
+
+void foo() {
+ int n = 100;
+ __buffer_rsrc_t v = 0; // expected-error {{cannot initialize a variable of
type '__buffer_rsrc_t' with an rvalu
@@ -2200,6 +2206,9 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const
{
Align = 8;
\
break;
#include "clang/Basic/WebAssemblyReferenceTypes.def"
+case BuiltinType::AMDGPUBufferRsrc:
+ W
@@ -0,0 +1,30 @@
+// RUN: %clang++ -foffload-via-llvm --offload-arch=native %s -o %t
+// RUN: %t | %fcheck-generic
+
+// UNSUPPORTED: aarch64-unknown-linux-gnu
+// UNSUPPORTED: aarch64-unknown-linux-gnu-LTO
+// UNSUPPORTED: x86_64-pc-linux-gnu
+// UNSUPPORTED: x86_64-pc-linux-gnu-
@@ -0,0 +1,9 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -fclang-abi-compat=latest -triple amdgcn %s -emit-llvm -o -
| FileCheck %s
arsenm wrote:
Why do you need -fclang-abi-compat=latest
https://github.com/llvm/llvm-project/pull/94830
@@ -0,0 +1,21 @@
+//===-- AMDGPUTypes.def - Metadata about AMDGPU types ---*- C++
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apa
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/94830
@@ -2200,6 +2206,9 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const
{
Align = 8;
\
break;
#include "clang/Basic/WebAssemblyReferenceTypes.def"
+case BuiltinType::AMDGPUBufferRsrc:
+ W
https://github.com/arsenm commented:
Need stacked PR that adds the make_buffer_rsrc builtin that shows its use
https://github.com/llvm/llvm-project/pull/94830
@@ -1091,6 +1091,9 @@ enum PredefinedTypeIDs {
// \brief WebAssembly reference types with auto numeration
#define WASM_TYPE(Name, Id, SingletonId) PREDEF_TYPE_##Id##_ID,
#include "clang/Basic/WebAssemblyReferenceTypes.def"
+// \breif AMDGPU types with auto numeration
--
@@ -16055,6 +16145,90 @@ of the two arguments. -0.0 is considered to be less
than +0.0 for this
intrinsic. Note that these are the semantics specified in the draft of
IEEE 754-2019.
+.. _i_minimumnum:
+
+'``llvm.minimumnum.*``' Intrinsic
+^
+
+
@@ -16055,6 +16145,90 @@ of the two arguments. -0.0 is considered to be less
than +0.0 for this
intrinsic. Note that these are the semantics specified in the draft of
IEEE 754-2019.
+.. _i_minimumnum:
+
+'``llvm.minimumnum.*``' Intrinsic
+^
+
+