[clang] [llvm] [OpenMP] Ensure the actual kernel is annotated with launch bounds (PR #99927)

2024-07-22 Thread Joseph Huber via cfe-commits
@@ -4569,7 +4569,17 @@ OpenMPIRBuilder::createTargetInit(const LocationDescription &Loc, bool IsSPMD, Constant *MayUseNestedParallelismVal = ConstantInt::getSigned(Int8, true); Constant *DebugIndentionLevelVal = ConstantInt::getSigned(Int16, 0); - Function *Kernel =

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Joseph Huber via cfe-commits
jhuber6 wrote: ping https://github.com/llvm/llvm-project/pull/96561 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-07-22 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > Also, apparently there's some driver tests that expect > > `--target=amdgcn-amd-amdhsa-opencl` as the environment type. Is that the > > expected way to specify OpenCL? I'm not overly familiar. > > That was a mistake from long ago we shouldn't continue. A language is not an

[clang] [Clang] Fix C library wrappers for offloading (PR #99716)

2024-07-20 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/99716

[clang] [Clang] Fix C library wrappers for offloading (PR #99716)

2024-07-19 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/99716 >From 02012a704c4ad0c666d38bf2b2a7bf74c7f3b2c1 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Fri, 19 Jul 2024 17:33:08 -0500 Subject: [PATCH] [Clang] Fix C library wrappers for offloading Summary: This

[clang] [Clang] Fix C library wrappers for offloading (PR #99716)

2024-07-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Yeah, rewriting the function signature is gonna be quite a hassle. LG for now. > > If we change the function signature, does it make easier to set default > argument? It'll be similar code, but it would allow us to use the same helpers that the other targets use.

[clang] [Clang] Fix C library wrappers for offloading (PR #99716)

2024-07-19 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/99716 Summary: This block of code wraps around the standard C library includes. However, the order C library includes are presented is actually important. If they are visible before the `libc++` headers then it will

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-07-19 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/99687 >From 59901100a2c11d37947938dfb9db5dd1164cbbf5 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Fri, 19 Jul 2024 14:07:18 -0500 Subject: [PATCH 1/3] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-07-19 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/99687 >From 59901100a2c11d37947938dfb9db5dd1164cbbf5 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Fri, 19 Jul 2024 14:07:18 -0500 Subject: [PATCH 1/2] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-07-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Okay, so the only user of the toolchain is `Darwin`. So I'd need to somehow rework that logic to make it work. https://github.com/llvm/llvm-project/pull/99687

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-07-19 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/99687 >From 59901100a2c11d37947938dfb9db5dd1164cbbf5 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Fri, 19 Jul 2024 14:07:18 -0500 Subject: [PATCH] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-07-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: There's a test that's `.cl` that doesn't pass `-opencl` so probably can't rely on it. https://github.com/llvm/llvm-project/pull/99687

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-07-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Also, apparently there's some driver tests that expect `--target=amdgcn-amd-amdhsa-opencl` as the environment type. Is that the expected way to specify OpenCL? I'm not overly familiar. https://github.com/llvm/llvm-project/pull/99687

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-07-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Needs test. I also would assume you shouldn't have to reinvent language > detection code There's an existing function for getting the inputs, but it takes the `ToolChain` as input so it makes a circular dependency. The implementation doesn't seem to need _that_ much from the

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-07-19 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/99687 Summary: The `getToolChain` pass uses the triple to determine which toolchain to create. Currently the `amdgcn-amd-amdhsa` triple maps to the `ROCmToolChain` which uses things expected to be provided by `ROCm`.
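
The triple-based selection described in this summary can be sketched roughly as follows. This is an illustration only, not the driver's code: `ToolChainKind`, `InputLang`, and `selectToolChain` are hypothetical names, and the rule that plain C/C++ inputs pick the `AMDGPUToolChain` while other inputs keep the `ROCmToolChain` is an assumption drawn from the PR title.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-ins for the driver's toolchain classes.
enum class ToolChainKind { Generic, AMDGPU, ROCm };

// Hypothetical summary of the input language the driver detected.
enum class InputLang { C, CXX, OpenCL };

// Assumption: only the amdgcn-amd-amdhsa triple is special-cased here.
inline ToolChainKind selectToolChain(std::string_view Triple, InputLang Lang) {
  assert(!Triple.empty() && "triple must not be empty");
  if (Triple != "amdgcn-amd-amdhsa")
    return ToolChainKind::Generic;
  // Assumed rule: plain C/C++ does not need what ROCm provides.
  if (Lang == InputLang::C || Lang == InputLang::CXX)
    return ToolChainKind::AMDGPU;
  return ToolChainKind::ROCm;
}
```

Keeping the language check separate from the triple check mirrors the review discussion in this thread, where detecting the input language without a `ToolChain` instance was the sticking point.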

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-18 Thread Joseph Huber via cfe-commits
jhuber6 wrote: The CI seems to fail because it's not building the tool yet. The patch should enable that dependency, but maybe the bot doesn't know how to pick it up until it lands. https://github.com/llvm/llvm-project/pull/96561

[clang] [libc] [llvm] [OpenMP][libc] Remove special handling for OpenMP printf (PR #98940)

2024-07-17 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Do we have tests that cover these cases? There are already `printf`-related tests AFAIK; this just changes the implementation. I could add more if needed. https://github.com/llvm/llvm-project/pull/98940

[clang] [libc] [libcxx] [libc][libcxx] Support for building libc++ against LLVM libc (PR #99287)

2024-07-17 Thread Joseph Huber via cfe-commits
@@ -293,6 +293,7 @@ option(LIBCXX_ENABLE_THREADS "Build libc++ with support for threads." ON) option(LIBCXX_ENABLE_MONOTONIC_CLOCK "Build libc++ with support for a monotonic clock. This option may only be set to OFF when LIBCXX_ENABLE_THREADS=OFF." ON)

[clang] [ClangLinkerWrapper] Fix intermediate file naming for multi-arch compilation (PR #99325)

2024-07-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/99325

[clang] [libc] [libcxx] [libc][libcxx] Support for building libc++ against LLVM libc (PR #99287)

2024-07-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. I tested this locally and it worked quite well thanks. My nit is that we should add a check that mirrors the already existing `LIBCXXABI_USE_LLVM_UNWINDER` flag, which prints an error if `libunwind` isn't in the runtimes list. After that

[clang] [libc] [libcxx] [libc][libcxx] Support for building libc++ against LLVM libc (PR #99287)

2024-07-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 commented: I was just about to do something similar, so I'm glad you got to it before me. I think we'll need a CMake check on whether or not `libc` was enabled as a runtime if this option is enabled in `libcxx`, but beyond that it looks very reasonable. All we're

[clang] [OpenMP][AMDGPU] Do not attach -fcuda-is-device (PR #99002)

2024-07-16 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/99002

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > First batch of comments on the patch -- I only got till about the middle of > ClangNVLinkWrapper.cpp. Will continue reviewing tomorrow. Appreciate it. Sorry about dropping a huge patch like this on you. https://github.com/llvm/llvm-project/pull/96561

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,776 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,64 @@ + +Clang nvlink Wrapper + + +.. contents:: + :local: + +.. _clang-nvlink-wrapper: + +Introduction + + +This tool works as a wrapper around the NVIDIA ``nvlink`` linker. The purpose +of this wrapper is to

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Ping https://github.com/llvm/llvm-project/pull/96561

[clang] [libc] [llvm] [OpenMP][libc] Remove special handling for OpenMP printf (PR #98940)

2024-07-15 Thread Joseph Huber via cfe-commits
@@ -5892,8 +5892,6 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, getTarget().getTriple().isAMDGCN() || (getTarget().getTriple().isSPIRV() && getTarget().getTriple().getVendor() == Triple::VendorType::AMD)) { -

[clang] [libc] [llvm] [OpenMP][libc] Remove special handling for OpenMP printf (PR #98940)

2024-07-15 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/98940 >From c5be26a03cde6a4818e44298a37e41addc2cb4c5 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 15 Jul 2024 12:42:09 -0500 Subject: [PATCH] [OpenMP][libc] Remove special handling for OpenMP printf

[clang] [libc] [llvm] [OpenMP][libc] Remove special handling for OpenMP printf (PR #98940)

2024-07-15 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/98940 Summary: Currently there are several layers to handle `printf`. Since we now have varargs and an implementation of `printf` this can be heavily simplified. 1. The frontend renames `printf` into `omp_vprintf` and
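
Since the summary relies on working varargs, the simplification amounts to treating `printf` as an ordinary variadic function. A minimal host-side sketch of that idea follows; it assumes nothing about the actual GPU implementation, and `formatInto` is a hypothetical helper that forwards to the standard `vsnprintf`.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats into a caller-provided buffer by forwarding
// its variadic arguments to the standard vsnprintf.
inline int formatInto(char *Buf, std::size_t Size, const char *Fmt, ...) {
  assert(Buf && Fmt && "buffer and format must be non-null");
  va_list Args;
  va_start(Args, Fmt);
  // vsnprintf returns the number of characters the full output needs,
  // excluding the terminating null byte.
  int Written = std::vsnprintf(Buf, Size, Fmt, Args);
  va_end(Args);
  return Written;
}
```

With real variadic support on the target, a wrapper like this needs no frontend renaming or special casing.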

[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-12 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/96015

[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-12 Thread Joseph Huber via cfe-commits
@@ -203,8 +203,12 @@ ABIArgInfo NVPTXABIInfo::classifyArgumentType(QualType Ty) const { void NVPTXABIInfo::computeInfo(CGFunctionInfo &FI) const { if (!getCXXABI().classifyReturnType(FI)) FI.getReturnInfo() = classifyReturnType(FI.getReturnType()); + + unsigned

[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-12 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/96015 >From 698ec5bb3e9247b4b47a99eeac4d0933fc0a59ee Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 17 Jun 2024 15:32:31 -0500 Subject: [PATCH] [NVPTX] Implement variadic functions using IR lowering Summary:

[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-12 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Ping, waiting on this so I can land https://github.com/llvm/llvm-project/pull/96369. https://github.com/llvm/llvm-project/pull/96015

[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-12 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/96015 >From 8bd49caa9fa93fd3d0812e0a4315f8ff4956056a Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 17 Jun 2024 15:32:31 -0500 Subject: [PATCH 1/2] [NVPTX] Implement variadic functions using IR lowering

[clang] [Clang] Correctly enable the f16 type for offloading (PR #98331)

2024-07-10 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/98331

[clang] [Clang] Correctly enable the f16 type for offloading (PR #98331)

2024-07-10 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,117 @@ +// REQUIRES: nvptx-registered-target +// +// RUN: %clang_cc1 -ffp-contract=off -triple nvptx-unknown-unknown -target-cpu \ +// RUN: sm_86 -target-feature +ptx72 -fcuda-is-device -x cuda -emit-llvm -o - %s \ jhuber6 wrote: They're probably

[clang] [Clang] Correctly enable the f16 type for offloading (PR #98331)

2024-07-10 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/98331 >From b8f50d9fb7576c0ff7b6b9202736d47913af47ee Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 10 Jul 2024 09:39:44 -0500 Subject: [PATCH] [Clang] Correctly enable the f16 type for offloading Summary:

[clang] [Clang] Do not emit intrinsic math functions on GPU targets (PR #98209)

2024-07-10 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/98209 >From 605e9e78c1cba3b1947a538c566ffedbb9525be0 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 10 Jul 2024 09:39:44 -0500 Subject: [PATCH 1/2] [Clang] Correctly enable the f16 type for offloading

[clang] [Clang] Correctly enable the f16 type for offloading (PR #98331)

2024-07-10 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/98331 Summary: There's an extra argument that's required to *actually* enable f16 usage. For whatever reason there's a difference between fp16 and f16, where fp16 is some weird version that converts between the two.

[clang] [Clang] Add `__CLANG_GPU_DISABLE_MATH_WRAPPERS` macro for offloading math (PR #98234)

2024-07-10 Thread Joseph Huber via cfe-commits
@@ -345,4 +349,5 @@ __DEVICE__ float ynf(int __a, float __b) { return __nv_ynf(__a, __b); } #pragma pop_macro("__DEVICE_VOID__") #pragma pop_macro("__FAST_OR_SLOW") +#endif // __CLANG_GPU_DISABLE_MATH_WRAPPERS jhuber6 wrote: Hm, good question. I added

[clang] [Clang] Do not emit intrinsic math functions on GPU targets (PR #98209)

2024-07-10 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > This is going to break the library build. We use the __builtin functions to > access the intrinsic in the cases where the llvm intrinsic lowering provides > the implementation of the function. In a more sensible world, the library > would not provide the implementations of

[clang] [Clang] Add `__CLANG_GPU_DISABLE_MATH_WRAPPERS` macro for offloading math (PR #98234)

2024-07-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/98234 Summary: Currently we replace all math calls with vendor specific ones. This patch introduces a macro `__CLANG_GPU_DISABLE_MATH_WRAPPERS` that when defined will disable this. I went this route instead of a flag
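
The guard this summary describes can be sketched as a plain preprocessor check. The macro name comes from the PR; everything else here (`vendorSinf`, `wrappedSinf`, `UsesVendorWrappers`) is hypothetical and only stands in for the real wrapper headers.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a vendor math routine.
inline float vendorSinf(float X) { return std::sin(X); }

#ifndef __CLANG_GPU_DISABLE_MATH_WRAPPERS
// Default: route the call through the vendor implementation.
inline float wrappedSinf(float X) { return vendorSinf(X); }
constexpr bool UsesVendorWrappers = true;
#else
// Macro defined: skip the wrapper and use the standard library call.
inline float wrappedSinf(float X) { return std::sin(X); }
constexpr bool UsesVendorWrappers = false;
#endif
```

Compiling with `-D__CLANG_GPU_DISABLE_MATH_WRAPPERS` selects the second branch, which is the behavior a macro-based opt-out gives without adding a new driver flag.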

[clang] [Clang] Do not emit intrinsic math functions on GPU targets (PR #98209)

2024-07-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/98209 Summary: Currently, the GPU gets its math by using wrapper headers that eagerly replace libcalls with calls to the vendor's math library. e.g. ``` // __clang_cuda_math.h [[gnu::always_inline]] double sin(double

[clang] [llvm] [Offload] Move HIP and CUDA to new driver by default (PR #84420)

2024-07-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/84420 >From 778fff60cb81c3a0ffaf0a74264eb7cddd6dfb58 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 7 Mar 2024 15:48:00 -0600 Subject: [PATCH] [Offload] Move HIP and CUDA to new driver by default Summary:

[clang] [Offload] Move HIP and CUDA to new driver by default (PR #84420)

2024-07-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/84420 >From 3b5c3110cc1e781e9e7a8d9a621970fd3d7e9aa0 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 7 Mar 2024 15:48:00 -0600 Subject: [PATCH] [Offload] Move HIP and CUDA to new driver by default Summary:

[clang] [Clang] Make the GPU toolchains implicitly link `-lm` and `-lc` (PR #98170)

2024-07-09 Thread Joseph Huber via cfe-commits
@@ -633,6 +633,17 @@ void amdgpu::Linker::ConstructJob(Compilation &C, const JobAction &JA, else if (Args.hasArg(options::OPT_mcpu_EQ)) CmdArgs.push_back(Args.MakeArgString( "-plugin-opt=mcpu=" + Args.getLastArgValue(options::OPT_mcpu_EQ))); +

[clang] [Clang] Make the GPU toolchains implicitly link `-lm` and `-lc` (PR #98170)

2024-07-09 Thread Joseph Huber via cfe-commits
@@ -633,6 +633,17 @@ void amdgpu::Linker::ConstructJob(Compilation &C, const JobAction &JA, else if (Args.hasArg(options::OPT_mcpu_EQ)) CmdArgs.push_back(Args.MakeArgString( "-plugin-opt=mcpu=" + Args.getLastArgValue(options::OPT_mcpu_EQ))); + + // If the user's

[clang] [Clang] Make the GPU toolchains implicitly link `-lm` and `-lc` (PR #98170)

2024-07-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/98170 >From 6c6c781a658c4349073a40e0a0ecc10a893a4ca8 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 24 Jun 2024 15:14:52 -0500 Subject: [PATCH 1/3] [Clang] Introduce 'clang-nvlink-wrappaer' to work around

[clang] [Clang] Make the GPU toolchains implicitly link `-lm` and `-lc` (PR #98170)

2024-07-09 Thread Joseph Huber via cfe-commits
jhuber6 wrote: So, one thing I've noticed is that passing `-lc` and `-lm` to the `ld.lld` invocation greatly increases link times for trivial applications. This is because the handling in `ld.lld` will intentionally extract known `libcall` functions from LTO static libraries. We then have

[clang] [Clang] Make the GPU toolchains implicitly link `-lm` and `-lc` (PR #98170)

2024-07-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/98170 Summary: The previous patches (The other commits in this chain) allow the offloading toolchain to directly invoke the device linker. Because of this, we can now just have the toolchain implicitly include `-lc`

[clang] [LinkerWrapper] Pass all files to the device linker (PR #97573)

2024-07-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/97573 >From 6c6c781a658c4349073a40e0a0ecc10a893a4ca8 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 24 Jun 2024 15:14:52 -0500 Subject: [PATCH 1/2] [Clang] Introduce 'clang-nvlink-wrappaer' to work around

[clang] [LinkerWrapper] Pass all files to the device linker (PR #97573)

2024-07-08 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/97573 >From 7a64ee668b33c912f83d4f848ab72d421f8a1bec Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 24 Jun 2024 15:14:52 -0500 Subject: [PATCH 1/2] [Clang] Introduce 'clang-nvlink-wrappaer' to work around

[clang] [OpenMP] Correctly code-gen default atomic mem order (PR #97663)

2024-07-03 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/97663 Summary: The parsing for this was implemented, but we never hooked up the default value to the result of this clause. This patch adds the support by making it default to the requires directive. >From
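
For context, a minimal sketch of the situation this summary describes: a `requires` directive sets the default memory order, and an `atomic` construct without an explicit memory-order clause is expected to pick it up. The names `Counter` and `bumpCounter` are illustrative; without `-fopenmp` the pragmas are ignored and the code simply runs serially.

```cpp
#include <cassert>

// File-scope default for atomics that carry no memory-order clause.
#pragma omp requires atomic_default_mem_order(seq_cst)

static int Counter = 0;

// Increments Counter atomically. No memory-order clause is given, so the
// default from the requires directive above is the one that should apply.
inline void bumpCounter() {
#pragma omp atomic
  Counter += 1;
}
```

The `requires` directive has to appear before the first `atomic` construct in the translation unit, which is why it sits at the top of the sketch.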

[libunwind] [libunwind] Remove needless `sys/uio.h` (PR #97495)

2024-07-03 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/97495

[clang] [LinkerWrapper] Pass all files to the device linker (PR #97573)

2024-07-03 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/97573 Summary: The linker wrapper's job is to extract embedded device code from fat binaries and create linked images that can then be embedded and executed. In order to support LTO, we originally reinvented all of the

[clang] [flang] [Flang-new][OpenMP] Add bitcode files for AMD GPU OpenMP (PR #96742)

2024-07-03 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Would it be possible for you to investigate that? It really shouldn't be required if we can't help it. https://github.com/llvm/llvm-project/pull/96742

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-02 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/96561 >From 2d3957ac14906d569acf5b3ceb5c7e2f4dfabe54 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 24 Jun 2024 15:14:52 -0500 Subject: [PATCH] [Clang] Introduce 'clang-nvlink-wrappaer' to work around

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-02 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/96561 >From 2bb5bd081a29b9bf1c4e6e0f727e21a1b9258920 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 24 Jun 2024 15:14:52 -0500 Subject: [PATCH] [Clang] Introduce 'clang-nvlink-wrappaer' to work around

[clang] [llvm] [mlir] [OpenMP] Migrate GPU Reductions CodeGen from Clang to OMPIRBuilder (PR #80343)

2024-07-02 Thread Joseph Huber via cfe-commits
jhuber6 wrote: This patch causes the `offloading/bug51781.c` test to fail when compiled with reductions + debug information. ```console > clang ../offload/test/offloading/bug51781.c -fopenmp -O1 --offload-arch=sm_89 > -DADD_REDUCTION --offload-device-only -gline-tables-only !dbg attachment

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-02 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/96561 >From 3b10fce6b3d3f8eeb7bd9a3828d488362bb061dd Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 24 Jun 2024 15:14:52 -0500 Subject: [PATCH] [Clang] Introduce 'clang-nvlink-wrappaer' to work around

[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-01 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/96015 >From 8bd49caa9fa93fd3d0812e0a4315f8ff4956056a Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 17 Jun 2024 15:32:31 -0500 Subject: [PATCH 1/2] [NVPTX] Implement variadic functions using IR lowering

[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-01 Thread Joseph Huber via cfe-commits
@@ -116,8 +116,7 @@ class LLVM_LIBRARY_VISIBILITY NVPTXTargetInfo : public TargetInfo { } BuiltinVaListKind getBuiltinVaListKind() const override { -// FIXME: implement -return TargetInfo::CharPtrBuiltinVaList; +return TargetInfo::VoidPtrBuiltinVaList;

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-01 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/96561 >From 12d00a54169fef15efccfe9472db25b1261d31d3 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 24 Jun 2024 15:14:52 -0500 Subject: [PATCH] [Clang] Introduce 'clang-nvlink-wrappaer' to work around

[clang] [libc] [llvm] [libc] Implement (v|f)printf on the GPU (PR #96369)

2024-07-01 Thread Joseph Huber via cfe-commits
@@ -54,7 +54,34 @@ class MockArgList { } template LIBC_INLINE T next_var() { -++arg_counter; +arg_counter++; +return T(arg_counter); + } + + size_t read_count() const { return arg_counter; } +}; + +// Used by the GPU implementation to parse how many bytes
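
The visible diff lines suggest a counting stand-in for a real argument list. A reduced sketch of that idea is below, reconstructed only from the fragment shown; the member spelling `ArgCounter` and the use of `template <class T>` are assumptions, and the real class in LLVM libc has more to it.

```cpp
#include <cassert>
#include <cstddef>

// Reduced sketch: counts how many arguments a format string would consume
// without touching a real va_list.
class MockArgList {
  std::size_t ArgCounter = 0;

public:
  // Pretends to read the next argument and returns the running count as T.
  template <class T> T next_var() {
    ArgCounter++;
    return T(ArgCounter);
  }

  // Number of arguments "read" so far.
  std::size_t read_count() const { return ArgCounter; }
};
```

A parser run against such a mock reports how many arguments a format string expects, which can be checked before any real arguments are read.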

[clang] [llvm] [LLVM] Fix incorrect alignment on AMDGPU variadics (PR #96370)

2024-07-01 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Lower than native alignment is legal in AMDGPU hardware and it's possible to work around in the `printf` implementation, closing. https://github.com/llvm/llvm-project/pull/96370

[clang] [llvm] [LLVM] Fix incorrect alignment on AMDGPU variadics (PR #96370)

2024-07-01 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/96370

[clang] [libc] [llvm] [libc] Implement (v|f)printf on the GPU (PR #96369)

2024-07-01 Thread Joseph Huber via cfe-commits
@@ -942,6 +942,36 @@ struct Amdgpu final : public VariadicABIInfo { } }; +struct NVPTX final : public VariadicABIInfo { + + bool enableForTarget() override { return true; } + + bool vaListPassedInSSARegister() override { return true; } + + Type *vaListType(LLVMContext )

[clang] [libc] [llvm] [libc] Implement (v|f)printf on the GPU (PR #96369)

2024-07-01 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > You could theoretically break this if you didn't go through the C ABI and > > ignored type promotion, but I'm not concerned with that kind of misuse > > since it's against the ABI in the first place. > > The IR has its own ABI that may or may not match whatever the platform

[clang] [llvm] [LLVM] Fix incorrect alignment on AMDGPU variadics (PR #96370)

2024-07-01 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Patch should not land. Need to know what bug this was trying to address to > guess at what the right fix would be. My understanding was that the variadics did lowering to a struct with a minimum alignment of four. This currently *doesn't* do that, hence my confusion. The

[clang] [libc] [llvm] [libc] Implement (v|f)printf on the GPU (PR #96369)

2024-07-01 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > The nvptx lowering looks dubious - values smaller than slot size should be > passed with the same alignment as the slot and presently aren't. A struct > containing i8, i16 or half should be miscompiled on nvptx as written. I mentioned this in the original patch, it's correct

[clang] [libc] [llvm] [libc] Implement (v|f)printf on the GPU (PR #96369)

2024-07-01 Thread Joseph Huber via cfe-commits
@@ -942,6 +942,36 @@ struct Amdgpu final : public VariadicABIInfo { } }; +struct NVPTX final : public VariadicABIInfo { + + bool enableForTarget() override { return true; } + + bool vaListPassedInSSARegister() override { return true; } + + Type *vaListType(LLVMContext )

[clang-tools-extra] Revert: [clangd] Replace an include with a forward declaration (PR #97082)

2024-06-28 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Seems reasonable as I believe there were extra uses that needed the size. https://github.com/llvm/llvm-project/pull/97082

[clang] [llvm] [openmp] [OpenMP][offload] Fix dynamic schedule tracking (PR #97065)

2024-06-28 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Malloc cannot be helped here if we want to have correctness. Currently it is > just broken and not even runnable. I figured that all this code would go away if we just made all schedules static. https://github.com/llvm/llvm-project/pull/97065

[clang] [llvm] [openmp] [OpenMP][offload] Fix dynamic schedule tracking (PR #97065)

2024-06-28 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 commented: Could you provide a more descriptive summary? I thought we discussed that the dynamic support would just use the static scheduler, but this seems to implement it? I personally don't want to see more things in the OpenMP runtime relying on `malloc` if we

[clang] [CUDA][NFC] CudaArch to OffloadArch rename (PR #97028)

2024-06-28 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. This is definitely overdue, thanks. https://github.com/llvm/llvm-project/pull/97028 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[clang] [Clang] Introduce 'clang-nvlink-wrappaer' to work around 'nvlink' (PR #96561)

2024-06-27 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Re-did it and tested it against `libc` in https://github.com/llvm/llvm-project/pull/96972 so it will have a CI running it once that lands. It works for other cases I've tested, but let me know if something else should be added. https://github.com/llvm/llvm-project/pull/96561

[clang] [Clang] Introduce 'clang-nvlink-wrappaer' to work around 'nvlink' (PR #96561)

2024-06-27 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/96561 >From 849c8dab14c9332081a8c6331c9ca0c234793393 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 24 Jun 2024 15:14:52 -0500 Subject: [PATCH] [Clang] Introduce 'clang-nvlink-wrappaer' to work around

[clang] [AMDGPU][OpenMP] Do not attach -fcuda-is-device flag for AMDGPU OpenMP (PR #96909)

2024-06-27 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. We don't even pass this in the NVPTX offloading case, so there's no reason to do it for AMDGPU. https://github.com/llvm/llvm-project/pull/96909 ___ cfe-commits mailing list

[clang] [libc] [libc] Remove atomic alignment diagnostics globally (PR #96803)

2024-06-26 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/96803 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [libc] [libc] Remove atomic alignment diagnostics globally (PR #96803)

2024-06-26 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/96803 >From 66b82f970e8914a920259dd12decd65fbb325356 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 26 Jun 2024 12:58:22 -0500 Subject: [PATCH] [libc] Remove atomic alignment diagnostics globally Summary:

[clang] [llvm] [openmp] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-06-26 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 commented: Looks fine in general, I'm not a huge fan of all the `isGPUProfTarget` things we have around now, but I understand it's required to set up the visibility. I wonder if we could factor that out into something more common.

[clang] [libc] [llvm] [libc] Implement (v|f)printf on the GPU (PR #96369)

2024-06-26 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,77 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5 +// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -emit-llvm -o - %s | FileCheck %s + +extern void varargs_simple(int, ...); + +// CHECK-LABEL: define dso_local

[clang] [flang] [Flang-new][OpenMP] Add offload related flags for AMDGPU (PR #96742)

2024-06-26 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. The fact that it's called `-fcuda-is-device` is historical cruft, but I guess it's easiest to just work with it. I also hate `-mlink-builtin-bitcode` as a concept, but we're not quite ready to move away from its hacks unfortunately.

[clang] [LinkerWrapper] Extend with usual pass options (PR #96704)

2024-06-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. I think this looks good overall, though I'd like to hear some other clang maintainers chime in on the LIT config changes. https://github.com/llvm/llvm-project/pull/96704 ___ cfe-commits mailing

[clang] [LinkerWrapper] Extend with usual pass options (PR #96704)

2024-06-25 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,77 @@ +; Check various clang-linker-wrapper pass options after -offload-opt. jhuber6 wrote: I see, probably fine then. https://github.com/llvm/llvm-project/pull/96704 ___ cfe-commits mailing list

[clang] [LinkerWrapper] Extend with usual pass options (PR #96704)

2024-06-25 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,77 @@ +; Check various clang-linker-wrapper pass options after -offload-opt. jhuber6 wrote: -disable-O0-optnone handles the optnone, don't think `noinline` affects that much. https://github.com/llvm/llvm-project/pull/96704

[clang] [LinkerWrapper] Extend with usual pass options (PR #96704)

2024-06-25 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,77 @@ +; Check various clang-linker-wrapper pass options after -offload-opt. jhuber6 wrote: Hm, is this really the only LLVM-IR file in the Driver directory? I guess it makes sense, though you could probably just do what the other linker wrapper

[clang] [LinkerWrapper] Extend with usual pass options (PR #96704)

2024-06-25 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,86 @@ +; Check various clang-linker-wrapper pass options after -offload-opt. + +; REQUIRES: llvm-plugins, llvm-examples +; REQUIRES: x86-registered-target +; REQUIRES: amdgpu-registered-target + +; Setup. +; RUN: split-file %s %t +; RUN: opt -o

[clang] [LinkerWrapper] Extend with usual pass options (PR #96704)

2024-06-25 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,10 @@ +// Check that these simple command lines for listing LLVM options are supported, jhuber6 wrote: Do we have any other tests that just check the output for `--help`? Might be a little excessive. https://github.com/llvm/llvm-project/pull/96704

[clang] [LinkerWrapper] Extend with usual pass options (PR #96704)

2024-06-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 commented: Makes sense overall. However in the future I'm looking to move away from the home-baked LTO pipeline in favor of giving it to the linker. That allows me to set up libraries as a part of the target toolchain in the driver. I guess for that I'll just need

[clang] [LinkerWrapper] Extend with usual pass options (PR #96704)

2024-06-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/96704 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [LinkerWrapper] Extend with usual pass options (PR #96704)

2024-06-25 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,86 @@ +; Check various clang-linker-wrapper pass options after -offload-opt. + +; REQUIRES: llvm-plugins, llvm-examples +; REQUIRES: x86-registered-target +; REQUIRES: amdgpu-registered-target + +; Setup. +; RUN: split-file %s %t +; RUN: opt -o
