https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/79768
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>;
def : Proc<"sm_62", [SM62, PTX50]>;
def : Proc<"sm_70", [SM70, PTX60]>;
def : Proc<"sm_72", [SM72, PTX61]>;
-def : Proc<"sm_75", [SM75, PTX63]>;
+def : Proc<"sm_75", [SM75, PTX62, PTX63]>;
jhuber6 wrote:
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79768
>From 2c7049defef3b62de7017640948cccfb07ff756c Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sun, 28 Jan 2024 14:57:05 -0600
Subject: [PATCH 1/3] [NVPTX] Add 'activemask' builtin and intrinsic support
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/79888
Summary:
This patch adds a builtin for the `nanosleep` PTX function. It takes
either an immediate or a register and sleeps for [0, 2t] nanoseconds
given t. More information at the documentation:
@@ -4599,6 +4599,14 @@ def int_nvvm_vote_ballot_sync :
[IntrInaccessibleMemOnly, IntrConvergent, IntrNoCallback],
"llvm.nvvm.vote.ballot.sync">,
ClangBuiltin<"__nvvm_vote_ballot_sync">;
+//
+// ACTIVEMASK
+//
+def int_nvvm_activemask :
+
jhuber6 wrote:
Added side effects attribute, I believe this matches the current behavior of
the inline asm better.
https://github.com/llvm/llvm-project/pull/79768
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79768
>From 2c7049defef3b62de7017640948cccfb07ff756c Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sun, 28 Jan 2024 14:57:05 -0600
Subject: [PATCH 1/2] [NVPTX] Add 'activemask' builtin and intrinsic support
jhuber6 wrote:
> https://bugs.llvm.org/show_bug.cgi?id=35249
Yeah, there are constant issues with convergence analysis. I included one of the
tests to try to show that it won't merge with the convergent attribute, since
this is a general issue for all of these things. In the past I usually add
jhuber6 wrote:
> > I was planning on updating this to use the new intrinsic for the newer
> > version. Alternatively we could make __activemask the builtin which expands
> > to both versions, but I'm somewhat averse since we should target the
> > instruction directly I feel.
>
> Yes, I
jhuber6 wrote:
> Unlike the other PRs, this one has a CUDA function, `__activemask()`.
> Presumably we should make that one work by hacking our headers?
That is currently defined here
https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/__clang_cuda_intrinsics.h#L214.
I was
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/79873
Summary:
The NVPTX tools require an architecture to be used, however if we are
creating generic LLVM-IR we should be able to leave it unspecified. This
will result in the `target-cpu` attributes not being set on
jhuber6 wrote:
Reverted. I don't think there's a "proper" solution here since this seems to
have leaked into the headers due to whoever set this up initially not properly
setting these on the host. That seems to be endemic now, so the best we can do
is just set it to some dummy values I
Author: Joseph Huber
Date: 2024-01-29T11:11:25-06:00
New Revision: 72d4fc1b4d5cfc4f7d50cc5cf1b315543c088f4d
URL:
https://github.com/llvm/llvm-project/commit/72d4fc1b4d5cfc4f7d50cc5cf1b315543c088f4d
DIFF:
https://github.com/llvm/llvm-project/commit/72d4fc1b4d5cfc4f7d50cc5cf1b315543c088f4d.diff
jhuber6 wrote:
> > This seems to have perturbed the HIP build.
> > https://lab.llvm.org/staging/#/builders/22/builds/22
> > The problem is that we used to set `__AMDGCN_WAVEFRONTSIZE` for the host
> > compilation as well in a bunch of the wave function macros. I think that
> > this is just
jhuber6 wrote:
This seems to have perturbed the HIP build.
https://lab.llvm.org/staging/#/builders/22/builds/22
The problem is that we used to set `__AMDGCN_WAVEFRONTSIZE` for the host
compilation as well in a bunch of the wave function macros. I think that this
is just poor programming,
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/79660
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/79765
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79768
>From 2c7049defef3b62de7017640948cccfb07ff756c Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sun, 28 Jan 2024 14:57:05 -0600
Subject: [PATCH] [NVPTX] Add 'activemask' builtin and intrinsic support
Summary:
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79765
>From cb2503ee6c10a3d03548b6bd44d6800ed67b2753 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Mon, 29 Jan 2024 08:12:35 -0600
Subject: [PATCH] [NVPTX] Add builtin support for 'globaltimer'
Summary:
This
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/79777
Summary:
The PTX ISA has always supported the 'exit' instruction to terminate
individual threads. This patch adds a builtin to handle it. See the PTX
documentation for further details.
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/79768
Summary:
This patch adds support for getting the 'activemask' instruction's value
without needing to use inline assembly. See the relevant PTX reference
for details.
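To illustrate the difference this summary describes, here is a minimal C++ sketch for a standalone `--target=nvptx64-nvidia-cuda` build. It assumes the builtin is spelled `__nvvm_activemask` (guarded with `__has_builtin`, so the sketch still compiles if that name is wrong or unavailable) and otherwise falls back to the `activemask.b32` inline assembly. The helper names `active_lane_mask` and `count_active_lanes`, and the single-lane host fallback, are hypothetical and exist only so the file also builds on a CPU.

```cpp
#include <cstdint>

// Hypothetical helper: returns one bit per lane that is currently executing.
inline std::uint32_t active_lane_mask() {
#if defined(__NVPTX__)
#if __has_builtin(__nvvm_activemask)
  // Assumed builtin name; intended to map to the 'activemask' instruction.
  return __nvvm_activemask();
#else
  // Fallback: the inline-assembly form the builtin is meant to replace.
  std::uint32_t mask;
  asm volatile("activemask.b32 %0;" : "=r"(mask));
  return mask;
#endif
#else
  // Host build (assumption for illustration): a CPU thread acts as one lane.
  return 1u;
#endif
}

// Hypothetical helper: number of set bits in a lane mask.
inline int count_active_lanes(std::uint32_t mask) {
  int count = 0;
  for (; mask != 0; mask &= mask - 1) // clear the lowest set bit each step
    ++count;
  return count;
}
```

On a host build, `count_active_lanes(active_lane_mask())` evaluates to 1; on the GPU it would reflect however many lanes of the warp reached the call together.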
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79765
>From 9a07e319274f4ec2f7b12a174b7664af118de4e9 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Mon, 29 Jan 2024 08:12:35 -0600
Subject: [PATCH] [NVPTX] Add builtin support for 'globaltimer'
Summary:
This
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/79765
Summary:
This patch adds support for `globaltimer` to match `clock` and
`clock64`. See the PTX ISA reference for details. This patch does not
implement the `hi` or `lo` variants for brevity as they can be
jhuber6 wrote:
> LGTM. AFAIK only device libs compile OpenCL code without -mcpu. I don't think
> it uses any of these predefined macros.
That's what I figured from a cursory look at the ROCm-Device-Libs. The goal is
to formalize this more to make more generic LLVM-IR.
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79660
>From ba04b20709cbf76ef6f1490081aecc125bdafec7 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 26 Jan 2024 16:25:30 -0600
Subject: [PATCH 1/2] [AMDGPU] Do not emit arch dependent macros with
unspecified
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79660
>From ba04b20709cbf76ef6f1490081aecc125bdafec7 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 26 Jan 2024 16:25:30 -0600
Subject: [PATCH] [AMDGPU] Do not emit arch dependent macros with unspecified
cpu
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/79660
Summary:
Currently, the AMDGPU toolchain accepts not passing `-mcpu` as a means
to create a sort of "generic" IR. The resulting IR will not contain any
target dependent attributes and can then be inserted into
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/79373
jhuber6 wrote:
> Got it, okay, thanks.
>
> Since this change only applies to `--target=nvptx64-nvidia-cuda`, fine by me.
> Thanks for putting up with our scrutiny. :)
No problem, I probably should have been clearer in my commit messages.
https://github.com/llvm/llvm-project/pull/79373
jhuber6 wrote:
> I...think I understand.
>
> Is the output of this compilation step a cubin, then?
Yes, it will spit out a simple `cubin` instead of a fatbinary. The NVIDIA
toolchain is much worse about this stuff than the AMD one, but in general it
works. You can check with `-###` or
jhuber6 wrote:
> > This method of compilation is not like CUDA, so we can't target all the
> > GPUs at the same time.
>
> Can you clarify for me -- what are you compiling where it's impossible to
> target multiple GPUs in the binary? I'm confused because Art is understanding
> that it's not
jhuber6 wrote:
> > This method of compilation is not like CUDA, so we can't target all the
> > GPUs at the same time.
>
> I think this is the key fact I was missing. If the patch is only for a
> standalone compilation which does not do multi-GPU compilation in principle,
> then your approach
jhuber6 wrote:
> > I think the semantics of native on other architectures are clear enough
> > here.
>
> I don't think we have the same idea about that. Let's spell it out, so
> there's no confusion.
>
> [GCC
> manual](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-march-16)
>
jhuber6 wrote:
> User confusion is only part of the issue here. With any single GPU choice we
> would still potentially produce a nonworking binary, if our GPU choice does
> not match what the user wants.
>
> "all GPUs" has the advantage of always producing the binary that's guaranteed
> to
jhuber6 wrote:
> On the other hand, I'd be OK with providing --offload-arch=native translating
> into "compile for all present GPU variants", with a possibility to further
> adjust the selected set with the usual --no-offload-arch-foo, if the user
> needs to. This will at least produce code
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79373
>From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 24 Jan 2024 15:34:00 -0600
Subject: [PATCH 1/3] [NVPTX] Add support for -march=native in standalone NVPTX
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79373
>From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 24 Jan 2024 15:34:00 -0600
Subject: [PATCH 1/2] [NVPTX] Add support for -march=native in standalone NVPTX
jhuber6 wrote:
> I think I'm with Art on this one.
>
> > > Problem #2 [...] The arch=native will create a working configuration, but
> > > would build more than necessary.
> >
> >
> > It will target the first GPU it finds. We could maybe change the behavior
> > to detect the newest, but the
jhuber6 wrote:
Some interesting points, I'll try to clarify some things.
> This option may not work as well as one would hope.
>
> Problem #1 is that it will drastically slow down compilation for some users.
> NVIDIA GPU drivers are loaded on demand, and the process takes a while
> (O(second),
jhuber6 wrote:
Maybe need to specify `--target=x86_64-unknown-linux-gnu` in the test?
https://github.com/llvm/llvm-project/pull/79222
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/79373
Summary:
We support `--target=nvptx64-nvidia-cuda` as a way to target the NVPTX
architecture from standard CPU. This patch simply uses the existing
support for handling `--offload-arch=native` to also apply to
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/79231
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/79314
jhuber6 wrote:
> Do we need two different linkages or could the COFF setting be used in both?
> Can we have a test to show the merging works as expected?
Doing a merge intentionally will be difficult until I add another flag to do
this on purpose as an extra feature. This patch just changes
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79314
>From 0f8d9bb329b6d50493286e117ea0fe45e0a49247 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 24 Jan 2024 09:41:15 -0600
Subject: [PATCH 1/2] [LinkerWrapper] Do not link device code under a
relocatable
@@ -0,0 +1,5 @@
+/// Some target-specific options are ignored for GPU, so %clang exits with code 0.
+// DEFINE: %{check} = %clang -### -c -mcmodel=medium
jhuber6 wrote:
Probably depends on the option we're testing. We could do both.
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/79314
Summary:
A relocatable link through `clang -r` can go through the
clang-linker-wrapper if offloading is enabled. This will have the effect
of linking the device code and creating the wrapper module. It will then
https://github.com/jhuber6 approved this pull request.
https://github.com/llvm/llvm-project/pull/79222
@@ -0,0 +1,7 @@
+/// Some target-specific options are ignored for GPU, so %clang exits with code 0.
+// DEFINE: %{gpu_opts} = --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA/usr/local/cuda --no-cuda-version-check
+// DEFINE: %{check} = %clang -### -c %{gpu_opts}
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/79231
Summary:
The offloading wrapper is an object file that contains code necessary to
register offloading entries for the given runtime. Currently, we
expect only one of these to be present when we make the final
https://github.com/jhuber6 approved this pull request.
https://github.com/llvm/llvm-project/pull/78333
@@ -99,6 +99,7 @@ class ROCDLDialectLLVMIRTranslationInterface
if (!llvmFunc->hasFnAttribute("amdgpu-flat-work-group-size")) {
llvmFunc->addFnAttr("amdgpu-flat-work-group-size", "1,256");
}
+ llvmFunc->addFnAttr("amdgpu-implicitarg-num-bytes", "256");
https://github.com/jhuber6 approved this pull request.
https://github.com/llvm/llvm-project/pull/79039
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/79039
https://github.com/jhuber6 approved this pull request.
Seems straightforward enough
https://github.com/llvm/llvm-project/pull/79038
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/78333
https://github.com/jhuber6 commented:
You should add a test that checks the output of `-ccc-print-phases` and
`-ccc-print-bindings`.
https://github.com/llvm/llvm-project/pull/78333
jhuber6 wrote:
> FYI. There is a failure in linker-wrapper.c in
> https://buildkite.com/llvm-project/github-pull-requests/builds/30337#018d1aaa-8225-4630-a5f0-527d1c7c129d
>
> ```
> # note: command had no output on stdout or stderr
> | # error: command failed with exit status: 1
> | #
Author: Joseph Huber
Date: 2024-01-20T12:53:03-06:00
New Revision: ec0ac85e58f0a80cc52a132336b132ffe7b50b59
URL:
https://github.com/llvm/llvm-project/commit/ec0ac85e58f0a80cc52a132336b132ffe7b50b59
DIFF:
https://github.com/llvm/llvm-project/commit/ec0ac85e58f0a80cc52a132336b132ffe7b50b59.diff
Author: Joseph Huber
Date: 2024-01-18T10:56:33-06:00
New Revision: cb2f340850db007aebf5012858697ba5afc1ce4e
URL:
https://github.com/llvm/llvm-project/commit/cb2f340850db007aebf5012858697ba5afc1ce4e
DIFF:
https://github.com/llvm/llvm-project/commit/cb2f340850db007aebf5012858697ba5afc1ce4e.diff
Author: Joseph Huber
Date: 2024-01-18T10:42:13-06:00
New Revision: 2b804f875579995b1588f1a079e265929163d0e4
URL:
https://github.com/llvm/llvm-project/commit/2b804f875579995b1588f1a079e265929163d0e4
DIFF:
https://github.com/llvm/llvm-project/commit/2b804f875579995b1588f1a079e265929163d0e4.diff
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/78359
jhuber6 wrote:
Replaced by https://github.com/llvm/llvm-project/pull/78359
https://github.com/llvm/llvm-project/pull/72442
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/72442
https://github.com/jhuber6 approved this pull request.
https://github.com/llvm/llvm-project/pull/76571
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/78359
>From 2a460f6ff9e7bca938adca5487609df41616e8c1 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Mon, 15 Jan 2024 15:42:06 -0600
Subject: [PATCH] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when
linking
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/78359
>From d7c8a6e0cb2289af939a90e82afbc6e35b08010c Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Mon, 15 Jan 2024 15:42:06 -0600
Subject: [PATCH 1/3] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/78359
>From d7c8a6e0cb2289af939a90e82afbc6e35b08010c Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Mon, 15 Jan 2024 15:42:06 -0600
Subject: [PATCH 1/2] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when
jhuber6 wrote:
Looks like it still has that Windows failure. That's going to be impossible to
debug on account of the fact that I have no clue how to run this thing on
Windows. The precommit checking takes a whole day to run as well. The only
error message is "invalid argument", so I really
@@ -162,6 +162,19 @@ class OffloadFile : public OwningBinary<OffloadBinary> {
std::unique_ptr<MemoryBuffer> Buffer)
: OwningBinary<OffloadBinary>(std::move(Binary), std::move(Buffer)) {}
+ /// Make a deep copy of this offloading file.
+ OffloadFile copy() const {
+std::unique_ptr<MemoryBuffer> Buffer =
jhuber6 wrote:
This is a redo of what was originally in
https://github.com/llvm/llvm-project/pull/72442
https://github.com/llvm/llvm-project/pull/78359
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/78359
Summary:
The linker wrapper's job is to sort various embedded inputs into a list
of files that participate in a single link job. So far, this has been
completely 1-to-1, that is, each input file participates in
jhuber6 wrote:
Thanks for the patch, this one likely fell through the cracks because it has no
assigned reviewers. We'll need a test based off of the original bug report. Put
that in `clang/test/OpenMP/.c` and then look at other tests for what
it should look like. LLVM uses `lit` to test, you
@@ -21067,6 +21067,10 @@ Sema::ActOnOpenMPDependClause(const
OMPDependClause::DependDataTy ,
ExprTy = ATy->getElementType();
else
ExprTy = BaseType->getPointeeType();
+// bug 69200
+if (ExprTy.isNull()) {
+
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/78061
https://github.com/jhuber6 approved this pull request.
Thanks. I'll probably make a patch after this to make the surface handling for
CUDA default off because it seems to be unsupported.
https://github.com/llvm/llvm-project/pull/78057
https://github.com/jhuber6 approved this pull request.
Thanks.
https://github.com/llvm/llvm-project/pull/78061
jhuber6 wrote:
> > LLVM changes look unrelated, it was originally copied from OpenBSD it
> > seems. But it's not a major issue.
>
> FWIW I opened a few PRs in FreeBSD regarding this.
Yeah, go ahead and move that portion there so the people who know more about
LLVM's regex can look at it
jhuber6 wrote:
LLVM changes look unrelated, it was originally copied from OpenBSD it seems.
But it's not a major issue.
https://github.com/llvm/llvm-project/pull/78061
@@ -568,32 +590,45 @@ void createRegisterFatbinFunction(Module &M,
GlobalVariable *FatbinDesc,
} // namespace
-Error wrapOpenMPBinaries(Module &M, ArrayRef<ArrayRef<char>> Images) {
- GlobalVariable *Desc = createBinDesc(M, Images);
+Error OffloadWrapper::wrapOpenMPBinaries(
+Module &M,
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/78057
@@ -0,0 +1,62 @@
+//===- OffloadWrapper.h --r-*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier:
https://github.com/jhuber6 commented:
Thanks, some comments.
https://github.com/llvm/llvm-project/pull/78057
jhuber6 wrote:
> As a somewhat naive question, what would it take to turn off requiring
> codegen to be in SCC order? We seem to be the only target doing that. The
> comments on that line say something about function calls and noinline
I believe this is also the reason parallel codegen via
jhuber6 wrote:
> > > An AMDGPU library function is not internalized and can be used to
> > > fulfill calls generated by LLVM passes or instruction selection.
> >
> >
> > I am confused by the description of "internalized". Do you refer to LTO
> > internalization? You can leverage `llvm.used`
jhuber6 wrote:
> > An AMDGPU library function is not internalized and can be used to fulfill
> > calls generated by LLVM passes or instruction selection.
>
> I am confused by the description of "internalized". Do you refer to LTO
> internalization? You can leverage `llvm.used` to disable LTO
@@ -2011,6 +2011,13 @@ def AMDGPUNumVGPR : InheritableAttr {
let Subjects = SubjectList<[Function], ErrorDiag, "kernel functions">;
}
+def AMDGPULibFun : InheritableAttr {
jhuber6 wrote:
Why isn't this a `TargetSpecificAttr`? We should have one for AMDGPU.
@@ -2693,6 +2693,17 @@ An error will be given if:
}];
}
+def AMDGPULibFunDocs : Documentation {
+ let Category = DocCatAMDGPUAttributes;
+ let Content = [{
+The ``amdgpu_lib_fun`` attribute can be applied to a function for AMDGPU target
+to indicate it is a library
jhuber6 wrote:
> I was thinking of implementing libm/libc for nvptx, which would produce an IR
> library. We'll still need to keep the functions around if they are not used
> explicitly, because we may need them to fulfill libcalls later in the
> compilation pipeline. Sort of a libdevice
jhuber6 wrote:
My use-case is more to be able to write functions like `is_wavefrontsize64()`
in regular C++ code. This would require some way to emit builtins for these.
I believe the use-case here is a workaround for the issues caused by library
ordering? I'm guessing this is related to the