[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79768 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>; def : Proc<"sm_62", [SM62, PTX50]>; def : Proc<"sm_70", [SM70, PTX60]>; def : Proc<"sm_72", [SM72, PTX61]>; -def : Proc<"sm_75", [SM75, PTX63]>; +def : Proc<"sm_75", [SM75, PTX62, PTX63]>; jhuber6 wrote:

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79768 From 2c7049defef3b62de7017640948cccfb07ff756c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Sun, 28 Jan 2024 14:57:05 -0600 Subject: [PATCH 1/3] [NVPTX] Add 'activemask' builtin and intrinsic support

[llvm] [clang] [NVPTX] Add builtin support for 'nanosleep' PTX instrunction (PR #79888)

2024-01-29 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79888 Summary: This patch adds a builtin for the `nanosleep` PTX function. It takes either an immediate or a register and sleeps for [0, 2t] nanoseconds given t. More information at the documentation:

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>; def : Proc<"sm_62", [SM62, PTX50]>; def : Proc<"sm_70", [SM70, PTX60]>; def : Proc<"sm_72", [SM72, PTX61]>; -def : Proc<"sm_75", [SM75, PTX63]>; +def : Proc<"sm_75", [SM75, PTX62, PTX63]>; jhuber6 wrote:

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>; def : Proc<"sm_62", [SM62, PTX50]>; def : Proc<"sm_70", [SM70, PTX60]>; def : Proc<"sm_72", [SM72, PTX61]>; -def : Proc<"sm_75", [SM75, PTX63]>; +def : Proc<"sm_75", [SM75, PTX62, PTX63]>; jhuber6 wrote:

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>; def : Proc<"sm_62", [SM62, PTX50]>; def : Proc<"sm_70", [SM70, PTX60]>; def : Proc<"sm_72", [SM72, PTX61]>; -def : Proc<"sm_75", [SM75, PTX63]>; +def : Proc<"sm_75", [SM75, PTX62, PTX63]>; jhuber6 wrote:

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
@@ -4599,6 +4599,14 @@ def int_nvvm_vote_ballot_sync : [IntrInaccessibleMemOnly, IntrConvergent, IntrNoCallback], "llvm.nvvm.vote.ballot.sync">, ClangBuiltin<"__nvvm_vote_ballot_sync">; +// +// ACTIVEMASK +// +def int_nvvm_activemask : +

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Added the side effects attribute; I believe this matches the current behavior of the inline asm better. https://github.com/llvm/llvm-project/pull/79768

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79768 From 2c7049defef3b62de7017640948cccfb07ff756c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Sun, 28 Jan 2024 14:57:05 -0600 Subject: [PATCH 1/2] [NVPTX] Add 'activemask' builtin and intrinsic support

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > https://bugs.llvm.org/show_bug.cgi?id=35249 Yeah, there are constant issues with convergence analysis. I included one of the tests to try to show that it won't merge with the convergent attribute, since this is a general issue for all of these things. In the past I usually add

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > I was planning on updating this to use the new intrinsic for the newer > > version. Alternatively we could make __activemask the builtin which expands > > to both versions, but I'm somewhat averse since we should target the > > instruction directly I feel. > > Yes, I

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Unlike the other PRs, this one has a CUDA function, `__activemask()`. > Presumably we should make that one work by hacking our headers? That is currently defined here https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/__clang_cuda_intrinsics.h#L214. I was

[clang] [NVPTX] Allow compiling LLVM-IR without `-march` set (PR #79873)

2024-01-29 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79873 Summary: The NVPTX tools require an architecture to be used, however if we are creating generic LLVM-IR we should be able to leave it unspecified. This will result in the `target-cpu` attributes not being set on

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-29 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Reverted. I don't think there's a "proper" solution here since this seems to have leaked into the headers due to whoever set this up initially not properly setting these on the host. That seems to be endemic now, so the best we can do is just set it to some dummy values I

[clang] 72d4fc1 - Revert "[AMDGPU] Do not emit arch dependent macros with unspecified cpu (#79660)"

2024-01-29 Thread Joseph Huber via cfe-commits
Author: Joseph Huber Date: 2024-01-29T11:11:25-06:00 New Revision: 72d4fc1b4d5cfc4f7d50cc5cf1b315543c088f4d URL: https://github.com/llvm/llvm-project/commit/72d4fc1b4d5cfc4f7d50cc5cf1b315543c088f4d DIFF: https://github.com/llvm/llvm-project/commit/72d4fc1b4d5cfc4f7d50cc5cf1b315543c088f4d.diff

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-29 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > This seems to have perturbed the HIP build. > > https://lab.llvm.org/staging/#/builders/22/builds/22 > > The problem is that we used to set `__AMDGCN_WAVEFRONTSIZE` for the host > > compilation as well in a bunch of the wave function macros. I think that > > this is just

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-29 Thread Joseph Huber via cfe-commits
jhuber6 wrote: This seems to have perturbed the HIP build. https://lab.llvm.org/staging/#/builders/22/builds/22 The problem is that we used to set `__AMDGCN_WAVEFRONTSIZE` for the host compilation as well in a bunch of the wave function macros. I think that this is just poor programming,
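The host/device mismatch described here can be pictured with a small guard. This is only a sketch under assumptions: the fallback value of 64 and the names `kWavefrontSize` and `wavesNeeded` are hypothetical and do not come from the HIP headers or from the patch. It shows one way a header could avoid depending on `__AMDGCN_WAVEFRONTSIZE` being defined during the host pass.

```cpp
// Sketch only: prefer the compiler-provided macro when it exists and fall
// back to a placeholder when it does not (for example, in a host pass).
#if defined(__AMDGCN_WAVEFRONTSIZE)
constexpr unsigned kWavefrontSize = __AMDGCN_WAVEFRONTSIZE;
#else
constexpr unsigned kWavefrontSize = 64; // assumed dummy value, not authoritative
#endif

// Hypothetical helper: number of wavefronts needed to cover `threads` lanes,
// rounding up to a whole wavefront.
constexpr unsigned wavesNeeded(unsigned threads) {
  return (threads + kWavefrontSize - 1) / kWavefrontSize;
}
```

Code written this way compiles whether or not the macro is emitted, which is the property the revert discussion is concerned with.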

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-29 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79660

[clang] [llvm] [NVPTX] Add builtin support for 'globaltimer' (PR #79765)

2024-01-29 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/79765

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-28 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79768 From 2c7049defef3b62de7017640948cccfb07ff756c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Sun, 28 Jan 2024 14:57:05 -0600 Subject: [PATCH] [NVPTX] Add 'activemask' builtin and intrinsic support Summary:

[clang] [llvm] [NVPTX} Add builtin support for 'globaltimer' (PR #79765)

2024-01-28 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79765 From cb2503ee6c10a3d03548b6bd44d6800ed67b2753 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 29 Jan 2024 08:12:35 -0600 Subject: [PATCH] [NVPTX} Add builtin support for 'globaltimer' Summary: This

[llvm] [clang] [NVPTX] Add builtin for 'exit' handling (PR #79777)

2024-01-28 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79777 Summary: The PTX ISA has always supported the 'exit' instruction to terminate individual threads. This patch adds a builtin to handle it. See the PTX documentation for further details.

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-28 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79768 Summary: This patch adds support for getting the 'activemask' instruction's value without needing to use inline assembly. See the relevant PTX reference for details.
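For readers unfamiliar with the instruction, the value it produces can be pictured with a small host-side model. This is only a sketch under assumptions: the 32-lane warp width and the names `kWarpSize` and `modelActiveMask` are illustrative and are not part of the patch, and the code runs on the CPU rather than calling the new builtin.

```cpp
#include <array>
#include <cstdint>

// Assumed warp width for this model; illustrative only.
constexpr unsigned kWarpSize = 32;

// Host-side model: bit i of the result is set when lane i is marked active.
// This mirrors the shape of the value, not the hardware semantics.
std::uint32_t modelActiveMask(const std::array<bool, kWarpSize> &laneActive) {
  std::uint32_t mask = 0;
  for (unsigned lane = 0; lane < kWarpSize; ++lane) {
    if (laneActive[lane])
      mask |= std::uint32_t{1} << lane;
  }
  return mask;
}
```

In device code the mask would come from the hardware; the model only illustrates that each bit corresponds to one lane of the warp.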

[clang] [llvm] [NVPTX} Add builtin support for 'globaltimer' (PR #79765)

2024-01-28 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79765 From 9a07e319274f4ec2f7b12a174b7664af118de4e9 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 29 Jan 2024 08:12:35 -0600 Subject: [PATCH] [NVPTX} Add builtin support for 'globaltimer' Summary: This

[llvm] [clang] [NVPTX} Add builtin support for 'globaltimer' (PR #79765)

2024-01-28 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79765 Summary: This patch adds support for `globaltimer` to match `clock` and `clock64`. See the PTX ISA reference for details. This patch does not implement the `hi` or `lo` variants for brevity as they can be

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-26 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > LGTM. AFAIK only device libs compile OpenCL code without -mcpu. I don't think > it uses any of these predefined macros. That's what I figured from a cursory look at the ROCm-Device-Libs. The goal is to formalize this more to make more generic LLVM-IR.

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-26 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79660 From ba04b20709cbf76ef6f1490081aecc125bdafec7 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Fri, 26 Jan 2024 16:25:30 -0600 Subject: [PATCH 1/2] [AMDGPU] Do not emit arch dependent macros with unspecified

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-26 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79660 From ba04b20709cbf76ef6f1490081aecc125bdafec7 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Fri, 26 Jan 2024 16:25:30 -0600 Subject: [PATCH] [AMDGPU] Do not emit arch dependent macros with unspecified cpu

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-26 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79660 Summary: Currently, the AMDGPU toolchain accepts not passing `-mcpu` as a means to create a sort of "generic" IR. The resulting IR will not contain any target dependent attributes and can then be inserted into

[clang-tools-extra] [llvm] [libc] [clang] [libcxx] [lldb] [lld] [flang] [compiler-rt] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79373

[flang] [clang] [libc] [compiler-rt] [clang-tools-extra] [llvm] [lld] [lldb] [libcxx] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Got it, okay, thanks. > > Since this change only applies to `--target=nvptx64-nvidia-cuda`, fine by me. > Thanks for putting up with our scrutiny. :) No problem, I probably should have been clearer in my commit messages. https://github.com/llvm/llvm-project/pull/79373

[lld] [lldb] [libcxx] [compiler-rt] [clang-tools-extra] [llvm] [libc] [clang] [flang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I...think I understand. > > Is the output of this compilation step a cubin, then? Yes, it will spit out a simple `cubin` instead of a fatbinary. The NVIDIA toolchain is much worse about this stuff than the AMD one, but in general it works. You can check with `-###` or

[lld] [lldb] [libcxx] [compiler-rt] [clang-tools-extra] [llvm] [libc] [clang] [flang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > This method of compilation is not like CUDA, so we can't target all the > > GPUs at the same time. > > Can you clarify for me -- what are you compiling where it's impossible to > target multiple GPUs in the binary? I'm confused because Art is understanding > that it's not

[flang] [clang] [clang-tools-extra] [llvm] [compiler-rt] [libcxx] [libc] [lldb] [lld] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > This method of compilation is not like CUDA, so we can't target all the > > GPUs at the same time. > > I think this is the key fact I was missing. If the patch is only for a > standalone compilation which does not do multi-GPU compilation in principle, > then your approach

[clang] [clang-tools-extra] [lldb] [libc] [libcxx] [lld] [llvm] [flang] [compiler-rt] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > I think the semantics of native on other architectures are clear enough > > here. > > I don't think we have the same idea about that. Let's spell it out, so > there's no confusion. > > [GCC > manual](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-march-16) >

[clang] [lld] [libcxx] [flang] [compiler-rt] [libc] [clang-tools-extra] [llvm] [lldb] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > User confusion is only part of the issue here. With any single GPU choice we > would still potentially produce a nonworking binary, if our GPU choice does > not match what the user wants. > > "all GPUs" has the advantage of always producing the binary that's guaranteed > to

[lld] [lldb] [llvm] [compiler-rt] [clang-tools-extra] [libc] [clang] [flang] [libcxx] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > On the other hand, I'd be OK with providing --offload-arch=native translating > into "compile for all present GPU variants", with a possibility to further > adjust the selected set with the usual --no-offload-arch-foo, if the user > needs to. This will at least produce code

[compiler-rt] [flang] [libcxx] [clang] [llvm] [clang-tools-extra] [lldb] [lld] [libc] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79373 From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 24 Jan 2024 15:34:00 -0600 Subject: [PATCH 1/3] [NVPTX] Add support for -march=native in standalone NVPTX

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79373 From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 24 Jan 2024 15:34:00 -0600 Subject: [PATCH 1/3] [NVPTX] Add support for -march=native in standalone NVPTX

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79373 From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 24 Jan 2024 15:34:00 -0600 Subject: [PATCH 1/2] [NVPTX] Add support for -march=native in standalone NVPTX

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I think I'm with Art on this one. > > > > Problem #2 [...] The arch=native will create a working configuration, but > > > would build more than necessary. > > > > > > It will target the first GPU it finds. We could maybe change the behavior > > to detect the newest, but the

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-24 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Some interesting points; I'll try to clarify some things. > This option may not work as well as one would hope. > > Problem #1 is that it will drastically slow down compilation for some users. > NVIDIA GPU drivers are loaded on demand, and the process takes a while > (O(second),

[lldb] [clang] [openmp] [compiler-rt] [lld] [llvm] [libc] [libcxx] [clang-tools-extra] [mlir] [pstl] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-24 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Maybe need to specify `--target=x86_64-unknown-linux-gnu` in the test? https://github.com/llvm/llvm-project/pull/79222

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-24 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79373 Summary: We support `--target=nvptx64-nvidia-cuda` as a way to target the NVPTX architecture from a standard CPU. This patch simply uses the existing support for handling `--offload-arch=native` to also apply to

[llvm] [clang] [Offload] Fix the offloading wrapper when merged multiple times. (PR #79231)

2024-01-24 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79231

[clang] [LinkerWrapper] Do not link device code under a relocatable link (PR #79314)

2024-01-24 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79314

[clang] [llvm] [Offload] Fix the offloading wrapper when merged multiple times. (PR #79231)

2024-01-24 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Do we need two different linkages or could the COFF setting be used in both? > Can we have a test to show the merging works as expected? Doing a merge intentionally will be difficult until I add another flag to do this on purpose as an extra feature. This patch just changes

[clang] [LinkerWrapper] Do not link device code under a relocatable link (PR #79314)

2024-01-24 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79314 From 0f8d9bb329b6d50493286e117ea0fe45e0a49247 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 24 Jan 2024 09:41:15 -0600 Subject: [PATCH 1/2] [LinkerWrapper] Do not link device code under a relocatable

[lldb] [pstl] [llvm] [mlir] [libc] [compiler-rt] [libcxx] [openmp] [clang-tools-extra] [clang] [lld] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-24 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,5 @@ +/// Some target-specific options are ignored for GPU, so %clang exits with code 0. +// DEFINE: %{check} = %clang -### -c -mcmodel=medium jhuber6 wrote: Probably depends on the option we're testing. We could do both.

[clang] [LinkerWrapper] Do not link device code under a relocatable link (PR #79314)

2024-01-24 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79314 Summary: A relocatable link through `clang -r` can go through the clang-linker-wrapper if offloading is enabled. This will have the effect of linking the device code and creating the wrapper module. It will then

[libc] [clang] [openmp] [lld] [clang-tools-extra] [lldb] [libcxx] [compiler-rt] [mlir] [llvm] [pstl] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-23 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/79222

[llvm] [lldb] [lld] [compiler-rt] [clang] [mlir] [libc] [libcxx] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-23 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,7 @@ +/// Some target-specific options are ignored for GPU, so %clang exits with code 0. +// DEFINE: %{gpu_opts} = --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA/usr/local/cuda --no-cuda-version-check +// DEFINE: %{check} = %clang -### -c %{gpu_opts}

[clang] [libc] [lldb] [llvm] [mlir] [compiler-rt] [lld] [libcxx] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-23 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,7 @@ +/// Some target-specific options are ignored for GPU, so %clang exits with code 0. +// DEFINE: %{gpu_opts} = --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA/usr/local/cuda --no-cuda-version-check +// DEFINE: %{check} = %clang -### -c %{gpu_opts}

[llvm] [clang] [Offload] Fix the offloading wrapper when merged multiple times. (PR #79231)

2024-01-23 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79231 Summary: The offloading wrapper is an object file that contains code necessary to register offloading entries for the given runtime. Currently, we expect only one of these to be present when we make the final

[clang] [Clang][Driver] Fix `--save-temps` for OpenCL AoT compilation (PR #78333)

2024-01-23 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/78333

[mlir] [llvm] [clang] [AMDGPU] Update llvm-objdump lit tests for COV5 (PR #79039)

2024-01-22 Thread Joseph Huber via cfe-commits
@@ -99,6 +99,7 @@ class ROCDLDialectLLVMIRTranslationInterface if (!llvmFunc->hasFnAttribute("amdgpu-flat-work-group-size")) { llvmFunc->addFnAttr("amdgpu-flat-work-group-size", "1,256"); } + llvmFunc->addFnAttr("amdgpu-implicitarg-num-bytes", "256");

[mlir] [llvm] [clang] [AMDGPU] Update llvm-objdump lit tests for COV5 (PR #79039)

2024-01-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/79039

[clang] [llvm] [mlir] [AMDGPU] Update llvm-objdump lit tests for COV5 (PR #79039)

2024-01-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/79039

[llvm] [clang] [mlir] [AMDGPU] Change default AMDHSA Code Object version to 5 (PR #79038)

2024-01-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Seems straightforward enough https://github.com/llvm/llvm-project/pull/79038

[clang] [Clang][Driver] Fix `--save-temps` for OpenCL AoT compilation (PR #78333)

2024-01-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/78333

[clang] [Clang][Driver] Fix `--save-temps` for OpenCL AoT compilation (PR #78333)

2024-01-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 commented: You should add a test that checks the output of `-ccc-print-phases` and `-ccc-print-bindings`. https://github.com/llvm/llvm-project/pull/78333

[llvm] [clang] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-22 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > FYI. There is a failure in linker-wrapper.c in > https://buildkite.com/llvm-project/github-pull-requests/builds/30337#018d1aaa-8225-4630-a5f0-527d1c7c129d > > ``` > # note: command had no output on stdout or stderr > | # error: command failed with exit status: 1 > | #

[clang] ec0ac85 - [Clang][Obvious] Correctly disable Windows on linker-wrapper test

2024-01-20 Thread Joseph Huber via cfe-commits
Author: Joseph Huber Date: 2024-01-20T12:53:03-06:00 New Revision: ec0ac85e58f0a80cc52a132336b132ffe7b50b59 URL: https://github.com/llvm/llvm-project/commit/ec0ac85e58f0a80cc52a132336b132ffe7b50b59 DIFF: https://github.com/llvm/llvm-project/commit/ec0ac85e58f0a80cc52a132336b132ffe7b50b59.diff

[clang] cb2f340 - [CUDA] Disable registering surfaces and textures with the new driver

2024-01-18 Thread Joseph Huber via cfe-commits
Author: Joseph Huber Date: 2024-01-18T10:56:33-06:00 New Revision: cb2f340850db007aebf5012858697ba5afc1ce4e URL: https://github.com/llvm/llvm-project/commit/cb2f340850db007aebf5012858697ba5afc1ce4e DIFF: https://github.com/llvm/llvm-project/commit/cb2f340850db007aebf5012858697ba5afc1ce4e.diff

[clang] 2b804f8 - [LinkerWrapper][Obvious] Fix move on temporary object

2024-01-18 Thread Joseph Huber via cfe-commits
Author: Joseph Huber Date: 2024-01-18T10:42:13-06:00 New Revision: 2b804f875579995b1588f1a079e265929163d0e4 URL: https://github.com/llvm/llvm-project/commit/2b804f875579995b1588f1a079e265929163d0e4 DIFF: https://github.com/llvm/llvm-project/commit/2b804f875579995b1588f1a079e265929163d0e4.diff

[clang] [llvm] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-18 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/78359

[llvm] [clang] [LinkerWrapper] Support device binaries in multiple link jobs (PR #72442)

2024-01-18 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Replaced by https://github.com/llvm/llvm-project/pull/78359 https://github.com/llvm/llvm-project/pull/72442

[clang] [llvm] [LinkerWrapper] Support device binaries in multiple link jobs (PR #72442)

2024-01-18 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/72442

[clang] [openmp] [OpenMP][USM] Introduces -fopenmp-force-usm flag (PR #76571)

2024-01-18 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/76571

[clang] [llvm] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/78359 From 2a460f6ff9e7bca938adca5487609df41616e8c1 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 15 Jan 2024 15:42:06 -0600 Subject: [PATCH] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking

[clang] [llvm] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/78359 From d7c8a6e0cb2289af939a90e82afbc6e35b08010c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 15 Jan 2024 15:42:06 -0600 Subject: [PATCH 1/3] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when

[llvm] [clang] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/78359 From d7c8a6e0cb2289af939a90e82afbc6e35b08010c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 15 Jan 2024 15:42:06 -0600 Subject: [PATCH 1/2] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when

[clang] [llvm] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-17 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Looks like it still has that Windows failure. That's going to be impossible to debug on account of the fact that I have no clue how to run this thing on Windows. The precommit checking takes a whole day to run as well. The only error message is "invalid argument", so I really

[llvm] [clang] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-17 Thread Joseph Huber via cfe-commits
@@ -162,6 +162,19 @@ class OffloadFile : public OwningBinary { std::unique_ptr Buffer) : OwningBinary(std::move(Binary), std::move(Buffer)) {} + /// Make a deep copy of this offloading file. + OffloadFile copy() const { +std::unique_ptr Buffer =

[clang] [llvm] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-16 Thread Joseph Huber via cfe-commits
jhuber6 wrote: This is a redo of what was originally in https://github.com/llvm/llvm-project/pull/72442 https://github.com/llvm/llvm-project/pull/78359

[clang] [llvm] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-16 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/78359 Summary: The linker wrapper's job is to sort various embedded inputs into a list of files that participate in a single link job. So far, this has been completely 1-to-1, that is, each input file participates in
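To make the target-ID matching concrete, here is a rough sketch of how two IDs might be compared when deciding whether an input can join a link job. It is written under stated assumptions: the `TargetID` struct and the `parseTargetID` and `areCompatible` helpers are hypothetical and are not the linker wrapper's actual code, and the rule modeled (same processor, and no feature set to conflicting `+`/`-` values, with an unspecified feature matching either setting) reflects the general target-ID convention as understood here.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical representation of an ID such as "gfx90a:sramecc+:xnack-".
struct TargetID {
  std::string processor;
  std::map<std::string, bool> features; // feature name -> true for '+', false for '-'
};

// Split on ':'; the first piece is the processor, the rest are features
// that end in '+' or '-'. Malformed pieces are skipped in this sketch.
TargetID parseTargetID(const std::string &text) {
  TargetID id;
  std::istringstream in(text);
  std::getline(in, id.processor, ':');
  std::string part;
  while (std::getline(in, part, ':')) {
    if (part.size() < 2)
      continue;
    const char sign = part.back();
    if (sign != '+' && sign != '-')
      continue;
    id.features[part.substr(0, part.size() - 1)] = (sign == '+');
  }
  return id;
}

// Two IDs are treated as compatible when the processors match and no
// feature is explicitly set to different values on both sides.
bool areCompatible(const TargetID &a, const TargetID &b) {
  if (a.processor != b.processor)
    return false;
  for (const auto &[name, enabled] : a.features) {
    const auto it = b.features.find(name);
    if (it != b.features.end() && it->second != enabled)
      return false;
  }
  return true;
}
```

Under this model an input built for plain `gfx90a` is compatible with both `gfx90a:xnack+` and `gfx90a:xnack-`, which is why a strictly 1-to-1 mapping of files to link jobs is not enough.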

[clang] [Clang] Add a NULL check (PR #77131)

2024-01-16 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Thanks for the patch, this one likely fell through the cracks because it has no assigned reviewers. We'll need a test based off of the original bug report. Put that in `clang/test/OpenMP/.c` and then look at other tests for what it should look like. LLVM uses `lit` to test, you

[clang] [Clang] Add a NULL check (PR #77131)

2024-01-16 Thread Joseph Huber via cfe-commits
@@ -21067,6 +21067,10 @@ Sema::ActOnOpenMPDependClause(const OMPDependClause::DependDataTy , ExprTy = ATy->getElementType(); else ExprTy = BaseType->getPointeeType(); +// bug 69200 +if (ExprTy.isNull()) { +

[libc] [clang] [libc] Give more functions restrict qualifiers (NFC) (PR #78061)

2024-01-15 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/78061

[llvm] [clang] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-15 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Thanks. I'll probably make a patch after this to make the surface handling for CUDA default off because it seems to be unsupported. https://github.com/llvm/llvm-project/pull/78057

[clang] [libc] [Libc] Give more functions restrict qualifiers (PR #78061)

2024-01-15 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Thanks. https://github.com/llvm/llvm-project/pull/78061

[libc] [clang] [llvm] [Libc] Give more functions restrict qualifiers (PR #78061)

2024-01-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > LLVM changes look unrelated, it was originally copied from OpenBSD it > > seems. But it's not a major issue. > > FWIW I opened a few PRs in FreeBSD regarding this. Yeah, go ahead and move that portion there so the people who know more about LLVM's regex can look at it

[libc] [llvm] [clang] [Libc] Give more functions restrict qualifiers (PR #78061)

2024-01-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: LLVM changes look unrelated, it was originally copied from OpenBSD it seems. But it's not a major issue. https://github.com/llvm/llvm-project/pull/78061

[llvm] [clang] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits
@@ -568,32 +590,45 @@ void createRegisterFatbinFunction(Module , GlobalVariable *FatbinDesc, } // namespace -Error wrapOpenMPBinaries(Module , ArrayRef> Images) { - GlobalVariable *Desc = createBinDesc(M, Images); +Error OffloadWrapper::wrapOpenMPBinaries( +Module ,

[clang] [llvm] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/78057

[clang] [llvm] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits
@@ -568,32 +590,45 @@ void createRegisterFatbinFunction(Module , GlobalVariable *FatbinDesc, } // namespace -Error wrapOpenMPBinaries(Module , ArrayRef> Images) { - GlobalVariable *Desc = createBinDesc(M, Images); +Error OffloadWrapper::wrapOpenMPBinaries( +Module ,

[llvm] [clang] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,62 @@ +//===- OffloadWrapper.h --r-*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[llvm] [clang] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits
@@ -568,32 +590,45 @@ void createRegisterFatbinFunction(Module , GlobalVariable *FatbinDesc, } // namespace -Error wrapOpenMPBinaries(Module , ArrayRef> Images) { - GlobalVariable *Desc = createBinDesc(M, Images); +Error OffloadWrapper::wrapOpenMPBinaries( +Module ,

[llvm] [clang] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/78057

[llvm] [clang] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits
@@ -568,32 +590,45 @@ void createRegisterFatbinFunction(Module , GlobalVariable *FatbinDesc, } // namespace -Error wrapOpenMPBinaries(Module , ArrayRef> Images) { - GlobalVariable *Desc = createBinDesc(M, Images); +Error OffloadWrapper::wrapOpenMPBinaries( +Module ,

[clang] [llvm] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,62 @@ +//===- OffloadWrapper.h --r-*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 commented: Thanks, some comments. https://github.com/llvm/llvm-project/pull/78057

[clang] [compiler-rt] [clang-tools-extra] [llvm] [AMDGPU] Avoid hitting AMDGPUAsmPrinter related asserts for local functions at O0 (PR #72129)

2024-01-12 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > As a somewhat naive question, what would it take to turn off requiring > codegen to be in SCC order? We seem to be the only target doing that. The > comments on that line say something about function calls and noinline I believe this is also the reason parallel codegen via

[clang] [AMDGPU] add function attrbute amdgpu-lib-fun (PR #74737)

2024-01-12 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > > An AMDGPU library function is not internalized and can be used to > > > fulfill calls generated by LLVM passes or instruction selection. > > > > > > I am confused by the description of "internalized". Do you refer to LTO > > internalization? You can leverage `llvm.used`

[clang] [AMDGPU] add function attrbute amdgpu-lib-fun (PR #74737)

2024-01-09 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > An AMDGPU library function is not internalized and can be used to fulfill > > calls generated by LLVM passes or instruction selection. > > I am confused by the description of "internalized". Do you refer to LTO > internalization? You can leverage `llvm.used` to disable LTO

[clang] [AMDGPU] add function attrbute amdgpu-lib-fun (PR #74737)

2024-01-09 Thread Joseph Huber via cfe-commits
@@ -2011,6 +2011,13 @@ def AMDGPUNumVGPR : InheritableAttr { let Subjects = SubjectList<[Function], ErrorDiag, "kernel functions">; } +def AMDGPULibFun : InheritableAttr { jhuber6 wrote: Why isn't this a `TargetSpecificAttr`? We should have one for AMDGPU.

[clang] [AMDGPU] add function attrbute amdgpu-lib-fun (PR #74737)

2024-01-09 Thread Joseph Huber via cfe-commits
@@ -2693,6 +2693,17 @@ An error will be given if: }]; } +def AMDGPULibFunDocs : Documentation { + let Category = DocCatAMDGPUAttributes; + let Content = [{ +The ``amdgpu_lib_fun`` attribute can be applied to a function for AMDGPU target +to indicate it is a library

[clang] [AMDGPU] add function attrbute amdgpu-lib-fun (PR #74737)

2024-01-09 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I was thinking of implementing libm/libc for nvptx, which would produce an IR > library . We'll still need to keep the functions around if they are not used > explicitly, because we may need them to fulfill libcalls later in the > compilation pipeline. Sort of a libdevice

[clang] [AMDGPU] add function attrbute amdgpu-lib-fun (PR #74737)

2024-01-09 Thread Joseph Huber via cfe-commits
jhuber6 wrote: My use-case is more to be able to write functions like `is_wavefrontsize64()` in regular C++ code. This would require some way to emit builtins for these. I believe the use-case here is a workaround for the issues caused by library ordering? I'm guessing this is related to the
