[clang] [Clang][AMDGPU][Driver] Add `avail-extern-gv-in-addrspace-to-local` option when ThinLTO is enabled (PR #144914)

2025-06-19 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Bit of a hack it seems, but better than it being broken. https://github.com/llvm/llvm-project/pull/144914 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mail

[clang] [HIP] Emit the CUID value in the module with the new driver (PR #144570)

2025-06-19 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/144570 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang] Add standalone AMDGPU SPIR-V toolchain (PR #144576)

2025-06-19 Thread Joseph Huber via cfe-commits
@@ -417,3 +417,15 @@ void HIPAMDToolChain::checkTargetID( getDriver().Diag(clang::diag::err_drv_bad_target_id) << *PTID.OptionalTargetID; } + +SPIRVAMDToolChain::SPIRVAMDToolChain(const Driver &D, + const llvm::Triple &Triple, +

[clang] [HIP] Emit the CUID value in the module with the new driver (PR #144570)

2025-06-18 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > `llvm.compiler.used` global uses AS(0) which makes SPIR-V unhappy, but with > > this global it's AS(4) which makes it happy. Either way, this should be > > fixed. > > llvm.used and llvm.compiler.used should universally use addrspace(0) and > SPIRV should be fixed to not bre

[clang] [LinkerWrapper] Fix 'save-temps' when targeting SPIR-V (PR #144605)

2025-06-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/144605

[clang] [LinkerWrapper] Fix 'save-temps' when targeting SPIR-V (PR #144605)

2025-06-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/144605 Summary: The logic here is flawed, it was only intended to apply to the CPU case where we use the linker passed in on the command line. This was falsely applying to SPIR-V which caused issues. From a0a041b7904

[clang] [HIP] Emit the CUID value in the module with the new driver (PR #144570)

2025-06-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/144570

[clang] [Clang] Add standalone AMDGPU SPIR-V toolchain (PR #144576)

2025-06-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/144576 From 94ad34a699173fe6b62614874952ca5cfe98f471 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Tue, 17 Jun 2025 12:58:14 -0500 Subject: [PATCH] [Clang] Add standalone AMDGPU SPIR-V toolchain Summary: The AMDG

[clang] [Clang] Add standalone AMDGPU SPIR-V toolchain (PR #144576)

2025-06-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/144576 From 92db44cf46098f9171b06d0251b632eb1ff6d5e6 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Tue, 17 Jun 2025 12:58:14 -0500 Subject: [PATCH] [Clang] Add standalone AMDGPU SPIR-V toolchain Summary: The AMDG

[clang] [Clang] Add standalone AMDGPU SPIR-V toolchain (PR #144576)

2025-06-17 Thread Joseph Huber via cfe-commits
@@ -417,3 +417,17 @@ void HIPAMDToolChain::checkTargetID( getDriver().Diag(clang::diag::err_drv_bad_target_id) << *PTID.OptionalTargetID; } + +SPIRVAMDToolChain::SPIRVAMDToolChain(const Driver &D, + const llvm::Triple &Triple, +

[clang] [Clang] Add standalone AMDGPU SPIR-V toolchain (PR #144576)

2025-06-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/144576 Summary: The AMDGPU toolchain uses a different set of tools than the standard SPIR-V toolchain. The linker wrapper prefers to invoke a linker via a clang toolchain. To make that work we introduce `--target=spirv6

[clang] [HIP] Emit the CUID value in the module with the new driver (PR #144570)

2025-06-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/144570 Summary: This is a weird point of divergence, which is also apparently critical for SPIR-V compilation not failing? Somehow if we don't emit this global then the `llvm.compiler.used` global uses AS(0) which makes

[clang] Reland [HIP] use offload wrapper for non-device-only non-rdc (#132869) (PR #143964)

2025-06-12 Thread Joseph Huber via cfe-commits
@@ -9249,8 +9249,20 @@ void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA, // Add the linker arguments to be forwarded by the wrapper. CmdArgs.push_back(Args.MakeArgString(Twine("--linker-path=") + LinkCommand->getEx

[clang] Reland [HIP] use offload wrapper for non-device-only non-rdc (#132869) (PR #143964)

2025-06-12 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/143964

[clang] Reland [HIP] use offload wrapper for non-device-only non-rdc (#132869) (PR #143964)

2025-06-12 Thread Joseph Huber via cfe-commits
@@ -310,8 +310,8 @@ Error relocateOffloadSection(const ArgList &Args, StringRef Output) { // Remove the old .llvm.offloading section to prevent further linking. ObjcopyArgs.emplace_back("--remove-section"); ObjcopyArgs.emplace_back(".llvm.offloading"); - for (StringRef

[clang] Reland [HIP] use offload wrapper for non-device-only non-rdc (#132869) (PR #143964)

2025-06-12 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Looks good. Thanks for digging into this, guess I forgot to update the relocatable bit when I made that change. https://github.com/llvm/llvm-project/pull/143964

[clang] [llvm] [Offload][PGO] Fix PGO on NVPTX targets (PR #143568)

2025-06-10 Thread Joseph Huber via cfe-commits
@@ -947,11 +954,18 @@ bool InstrLowerer::lower() { if (!ContainsProfiling && !CoverageNamesVar) return MadeChange; + // Cached info for generating delayed offset calculations + // This is only relevant on NVPTX targets + SmallVector Kernels; + SmallVector ValueSites;

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-06-10 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/134016

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-06-10 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,21 @@ +// REQUIRES: amdgpu-registered-target +// REQUIRES: spirv-registered-target +// RUN: %clang_cc1 -fsyntax-only -verify -triple amdgcn -Wno-unused-value %s +// RUN: %clang_cc1 -fsyntax-only -verify -triple spirv64-amd-amdhsa -Wno-unused-value %s +// RUN: %clang_cc

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-06-10 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 commented: This looks generally good to me, but I'll let the clang code owners make the final decision. https://github.com/llvm/llvm-project/pull/134016

[clang] [llvm] [PGO][Offload] Fix offload coverage mapping (PR #143490)

2025-06-10 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/143490

[clang] Revert "[HIP] use offload wrapper for non-device-only non-rdc (#132869)" (PR #143432)

2025-06-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/143432

[clang] Revert "[HIP] use offload wrapper for non-device-only non-rdc (#132869)" (PR #143432)

2025-06-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/143432 This breaks a lot of new driver HIP compilation. We should probably revert this for now until we can make a fixed version. ```c++ static __global__ void print() { printf("%s\n", "foo"); } void b(); int main()

[clang] [NVPTX] Enable OpenCL 3d_image_writes support (PR #143331)

2025-06-09 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > LGTM but I'm never sure who owns this aspect of NVPTX. @Artem-B, @jhuber6 ? I work at AMD and don't really know much about OpenCL. https://github.com/llvm/llvm-project/pull/143331

[clang] Reapply "[AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly" (PR #125744)

2025-06-03 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/125744

[clang] Reapply "[AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly" (PR #125744)

2025-06-02 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/125744 From ce7701b7c95ee1e59c7942b23833a7a7336abfb7 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Tue, 4 Feb 2025 12:06:34 -0600 Subject: [PATCH] Reapply "[AMDGPU] Use the AMDGPUToolChain when targeting C/C++ di

[clang] [Clang] Always pass the detected CUDA path to the linker wrapper (PR #142021)

2025-05-29 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/142021 Summary: This patch identifies the CUDA path that clang used and forwards it to the linker wrapper. This should make sure that we're using a consistent CUDA path, but the behavior should be the same for normal us

[clang] [clang][Driver][OpenMP][SPIR-V] Fix SPIR-V OpenMP DeviceRTL expected file name (PR #141855)

2025-05-29 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. The `.bc` file for AMDGPU is unused, I'd imagine SPIR-V is as well since its compilation flow is like AMDGPU. https://github.com/llvm/llvm-project/pull/141855

[libclc] [llvm] [libclc] Support LLVM_ENABLE_RUNTIMES when building (PR #141574)

2025-05-29 Thread Joseph Huber via cfe-commits
@@ -72,7 +72,7 @@ else() # Note that we check this later (for both build types) but we can provide a # more useful error message when built in-tree. We assume that LLVM tools are # always available so don't warn here. - if( NOT clang IN_LIST LLVM_ENABLE_PROJECTS ) + if(

[clang] [OpenMP] Fix atomic compare handling with overloaded operators (PR #141142)

2025-05-28 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/141142 From 07caec33a1113602f3d6ba79357edeae6b66647c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 22 May 2025 16:21:34 -0500 Subject: [PATCH] [OpenMP] Fix atomic compare handling with overloaded operators

[clang] [OpenMP] Fix atomic compare handling with overloaded operators (PR #141142)

2025-05-28 Thread Joseph Huber via cfe-commits
@@ -12062,32 +12154,56 @@ bool OpenMPAtomicCompareCaptureChecker::checkForm3(IfStmt *S, X = BO->getLHS(); D = BO->getRHS(); - auto *Cond = dyn_cast(S->getCond()); - if (!Cond) { + if (auto *Cond = dyn_cast(S->getCond())) { +C = Cond; +if (Cond->getOpcode() != B

[clang] [OpenMP] Fix atomic compare handling with overloaded operators (PR #141142)

2025-05-28 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/141142 From f2c18ba64744320a8e2a63938b17137a1b6e74d7 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 22 May 2025 16:21:34 -0500 Subject: [PATCH] [OpenMP] Fix atomic compare handling with overloaded operators

[clang] [OpenMP] Fix atomic compare handling with overloaded operators (PR #141142)

2025-05-28 Thread Joseph Huber via cfe-commits
@@ -991,3 +991,34 @@ int mixed() { // expected-note@+1 {{in instantiation of function template specialization 'mixed' requested here}} return mixed(); } + +#ifdef OMP51 +struct U {}; +struct U operator<(U, U); +struct U operator>(U, U); +struct U operator==(U, U); + +templ

[clang] [OpenMP] Fix atomic compare handling with overloaded operators (PR #141142)

2025-05-28 Thread Joseph Huber via cfe-commits
@@ -12062,32 +12154,56 @@ bool OpenMPAtomicCompareCaptureChecker::checkForm3(IfStmt *S, X = BO->getLHS(); D = BO->getRHS(); - auto *Cond = dyn_cast(S->getCond()); - if (!Cond) { + if (auto *Cond = dyn_cast(S->getCond())) { +C = Cond; +if (Cond->getOpcode() != B

[libclc] [llvm] [libclc] Support LLVM_ENABLE_RUNTIMES when building (PR #141574)

2025-05-28 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I'll need to look into that - maybe we can talk offline. Since `libclc` is > used by downstream toolchains I feel it'll be hard to significantly change > how it's built or presented to the host. We could support two methods of > building but that would get sticky pretty quickl

[clang] [OpenMP] Fix atomic compare handling with overloaded operators (PR #141142)

2025-05-27 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/141142 From a45dc43315631f28ced9cf5a14890e46e011e6d2 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 22 May 2025 16:21:34 -0500 Subject: [PATCH] [OpenMP] Fix atomic compare handling with overloaded operators

[libclc] [llvm] [libclc] Support LLVM_ENABLE_RUNTIMES when building (PR #141574)

2025-05-27 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > That said, I don't believe it "works" in the way it's supposed to. It still > grabs the host tools using `get_host_tool_path` in CMake, and custom commands > to build with that. I take it we're supposed to use `CMAKE_C_COMPILER` as if > we were a regular CMake project? Are we

[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

2025-05-27 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > This commit breaks a critical optimization for us. We have a project that > compiles most of the C++26 language features to Vulkan SPIRV. I've thought about doing that, since I've spent a few years just using normal C++ to write GPU runtimes. The main issue right now is that t

[libclc] [llvm] [libclc] Support LLVM_ENABLE_RUNTIMES when building (PR #141574)

2025-05-27 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/141574

[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

2025-05-27 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Also, if you're compiling C++26, why is it enabling OpenCL language features? You can already do this pretty easily with `clang --target=spirv64` so long as you're okay with using vendor intrinsic extensions in SPIR-V. That's pretty much what the HIP support for SPIR-V does anyw

[libclc] [llvm] [libclc] Support LLVM_ENABLE_RUNTIMES when building (PR #141574)

2025-05-27 Thread Joseph Huber via cfe-commits
@@ -35,7 +35,7 @@ list(INSERT CMAKE_MODULE_PATH 0 # We order libraries to mirror roughly how they are layered, except that compiler-rt can depend # on libc++, so we put it after. -set(LLVM_DEFAULT_RUNTIMES "libc;libunwind;libcxxabi;pstl;libcxx;compiler-rt;openmp;offload") +s

[libclc] [llvm] [libclc] Support LLVM_ENABLE_RUNTIMES when building (PR #141574)

2025-05-27 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. The changes make sense for just adding it, but does it actually work? The CMake I've seen in `libclc` does multiple compilations and some with `--target=amdgcn` for example, which is a little different from how the runtimes builds handle i

[clang] [Driver][LTO] Move common code for LTO to addLTOOptions() (PR #74178)

2025-05-23 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Looks like a nice cleanup https://github.com/llvm/llvm-project/pull/74178

[clang] [OpenMP] Fix atomic compare handling with overloaded operators (PR #141142)

2025-05-22 Thread Joseph Huber via cfe-commits
@@ -11762,52 +11762,98 @@ bool OpenMPAtomicCompareChecker::checkCondUpdateStmt(IfStmt *S, X = BO->getLHS(); - auto *Cond = dyn_cast(S->getCond()); - if (!Cond) { -ErrorInfo.Error = ErrorTy::NotABinaryOp; -ErrorInfo.ErrorLoc = ErrorInfo.NoteLoc = S->getCond()->get

[clang] [OpenMP] Fix atomic compare handling with overloaded operators (PR #141142)

2025-05-22 Thread Joseph Huber via cfe-commits
@@ -11762,52 +11762,98 @@ bool OpenMPAtomicCompareChecker::checkCondUpdateStmt(IfStmt *S, X = BO->getLHS(); - auto *Cond = dyn_cast(S->getCond()); - if (!Cond) { -ErrorInfo.Error = ErrorTy::NotABinaryOp; -ErrorInfo.ErrorLoc = ErrorInfo.NoteLoc = S->getCond()->get

[clang] [XRay] Fix argument parsing with offloading (#140748) (PR #141043)

2025-05-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/141043

[clang] [XRay] Fix argument parsing with offloading (#140748) (PR #141043)

2025-05-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. We'll now be creating the XRayArgs class when we do this every time, but I don't think it's expensive enough or done enough times to be an issue. Thanks. https://github.com/llvm/llvm-project/pull/141043

[clang] [CUDA][HIP] add option -gpuinc (PR #140106)

2025-05-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > but there is other comgr user expecting comgr to have -nogpuinc by default. > changing that will cause regressions. If `comgr` can have custom flags then you could just pass the 'do not pass `-nogpuinc` by default' flag presumably. https://github.com/llvm/llvm-project/pull/14

[clang] [CUDA][HIP] add option -gpuinc (PR #140106)

2025-05-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > > > Hmm, in what cases is `-nogpuinc` added when we don't actually want it? > > > > I think we should avoid adding `-nogpuinc` if it's not needed, if > > > > possible. > > > > > > > > > comgr is the JIT compiler for HIP on ROCm. comgr uses -nogpuinc by > > > default. Howev

[clang] [CUDA][HIP] add option -gpuinc (PR #140106)

2025-05-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > Hmm, in what cases is `-nogpuinc` added when we don't actually want it? I > > think we should avoid adding `-nogpuinc` if it's not needed, if possible. > > comgr is the JIT compiler for HIP on ROCm. comgr uses -nogpuinc by default. > However, some users of comgr need to over

[clang] [HIP] change default offload archs (PR #139281)

2025-05-13 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > It's just the AMDGCN target without any `+features`, right? The only issue > > I was aware of was assuming w64 when unspecified but you fixed that > > previously. > > Almost, but it's problematic in several ways. The problems multiply once you > start adding in manually spe

[clang] [HIP] change default offload archs (PR #139281)

2025-05-13 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > The main obstacle of letting clang emit error when `--offload-arch` is not > > specified is HIP apps using hipcc as CMAKE_CXX_COMPILER. hipcc adds -xhip > > by default for .cpp programs. This is a known and long existing issue. > > Another option is to have multiple `--offloa

[clang] [CUDA] fix wrapper cmath header to match #136101 (PR #139164)

2025-05-12 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/139164

[clang] [CUDA] fix wrapper cmath header to match #136101 (PR #139164)

2025-05-12 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Right now this checks for `libc++` less than 14. Is that still relevant following that change? https://github.com/llvm/llvm-project/pull/139164

[clang] [NFC][Clang][CodeGen] Remove vestigial assertion (PR #127528)

2025-05-10 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > @jhuber6 Was the follow-up for this backported too? I don't remember, sorry. I think the whole thing got reverted or something? https://github.com/llvm/llvm-project/pull/127528

[clang] [OpenMP] Fix crash when diagnosing dist_schedule (PR #139277)

2025-05-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/139277

[clang] [HIP] change default offload archs (PR #139281)

2025-05-09 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > @jhuber6 do you think we can use `native` instead? I think it would be a > somewhat better option here. If we have to choose a GPU variant by default, > we may as well choose the actual GPU, rather than a conditional choice > between generic SPIR-V or an old GPU, which has the

[clang] [Clang][SYCL] Add AOT compilation support for Intel GPUs in clang-sycl-linker (PR #133194)

2025-05-06 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Seems like a lot of the linker wrapper utilities copied and applied to Intel binaries, harmless enough. I'm wondering though, is there a reason we can't just use the backend right now? What do these tools do that running `llc` can't? http

[clang] [Clang][Driver] Only enable internalization for OpenMP target offloading with ThinLTO on AMDGPU (PR #138547)

2025-05-05 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/138547

[clang] [clang-linker-wrapper] Remove unused local variables (NFC) (PR #138480)

2025-05-04 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/138480

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-03 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I don't think OpenMP is more special than HIP here. Anything exposed to the > host should not be internalized. In addition, OpenMP actually also heavily > uses internalization as well in OpenMPOpt. It is likely that this change > exposes something bad in the downstream. > > T

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-03 Thread Joseph Huber via cfe-commits
@@ -9284,6 +9284,12 @@ void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA, CmdArgs.push_back(Args.MakeArgString( "--device-linker=" + TC->getTripleString() + "=" + Arg)); + // Enable internalization for AMDGPU. + if (TC->getTrip

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-03 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > also seeing "PluginInterface" error: Failure to look up global address: Error > in hsa_executable_get_symbol_by_name(grid_points): > HSA_STATUS_ERROR_INVALID_SYMBOL_NAME: There is no symbol with the given name. > omptarget error: Failed to load symbol grid_points Yeah, this i

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-02 Thread Joseph Huber via cfe-commits
@@ -9284,6 +9284,12 @@ void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA, CmdArgs.push_back(Args.MakeArgString( "--device-linker=" + TC->getTripleString() + "=" + Arg)); + // Enable internalization for AMDGPU. + if (TC->getTrip

[clang] [Clang][SYCL] Add initial set of Intel OffloadArch values (PR #138158)

2025-05-01 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/138158

[clang] [Clang][SYCL] Add initial set of Intel OffloadArch values (PR #138158)

2025-05-01 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/138158

[clang] 0a1dde1 - [Clang] Fix GPU match any truncating 64-bit lane mask

2025-04-30 Thread Joseph Huber via cfe-commits
Author: Joseph Huber Date: 2025-04-30T16:25:28-05:00 New Revision: 0a1dde1d7957531701ba56e357276033a927f496 URL: https://github.com/llvm/llvm-project/commit/0a1dde1d7957531701ba56e357276033a927f496 DIFF: https://github.com/llvm/llvm-project/commit/0a1dde1d7957531701ba56e357276033a927f496.diff
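
For readers unfamiliar with the class of bug named in the subject line, here is a minimal sketch of how a 64-bit lane mask can be truncated. It is not the code from the commit: the helper names `activeLanesTruncated`, `activeLanesFull`, and `laneIsActive` are hypothetical and exist only for illustration. It assumes a 64-wide wavefront where each bit of the mask stands for one lane.

```cpp
#include <cstdint>

// Hypothetical helper (not from the commit): narrows the mask to 32 bits on
// the way through, which silently drops lanes 32-63.
inline uint64_t activeLanesTruncated(uint64_t lane_mask) {
  uint32_t narrowed = static_cast<uint32_t>(lane_mask); // upper half is lost
  return narrowed;
}

// Hypothetical helper: keeps the full 64-bit width end to end.
inline uint64_t activeLanesFull(uint64_t lane_mask) { return lane_mask; }

// Returns true if bit `lane` is set in `lane_mask`.
inline bool laneIsActive(uint64_t lane_mask, unsigned lane) {
  return lane < 64 && ((lane_mask >> lane) & 1u) != 0;
}
```

With a mask that only has lane 40 set, the full-width path still reports lane 40 as active while the truncated path does not; masks confined to lanes 0-31 behave identically in both, which is why such a bug may go unnoticed on 32-wide hardware.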

[clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828)

2025-04-29 Thread Joseph Huber via cfe-commits
@@ -3979,6 +3979,16 @@ def fsyntax_only : Flag<["-"], "fsyntax-only">, Visibility<[ClangOption, CLOption, DXCOption, CC1Option, FC1Option, FlangOption]>, Group, HelpText<"Run the preprocessor, parser and semantic analysis stages">; + + +def fno_builtin_modules : Flag<["-

[clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828)

2025-04-29 Thread Joseph Huber via cfe-commits
@@ -157,6 +157,9 @@ class ToolChain { /// The list of toolchain specific path prefixes to search for programs. path_list ProgramPaths; +path_list ModulePaths; +path_list IntrinsicModulePaths; jhuber6 wrote: Format. https://github.com/llvm/llv

[clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828)

2025-04-29 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/137828

[clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828)

2025-04-29 Thread Joseph Huber via cfe-commits
@@ -299,6 +310,18 @@ elseif (FLANG_RT_GCC_RESOURCE_DIR) endif () endif () + + +if (CMAKE_C_BYTE_ORDER STREQUAL "BIG_ENDIAN") jhuber6 wrote: I was hoping I got rid of needing to detect endianness in CMake, since it makes cross-compiling a pain. Not eager to

[clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828)

2025-04-29 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 commented: What's the main limitation here? If this is just a file dependency it should be identical to how all the OpenMP tests depend on `omp.h` being in the resource directory. IMHO this is trivial if we do a runtimes build, since we can just require that `openmp;

[clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828)

2025-04-29 Thread Joseph Huber via cfe-commits
@@ -102,6 +102,10 @@ ToolChain::ToolChain(const Driver &D, const llvm::Triple &T, getFilePaths().push_back(*Path); for (const auto &Path : getArchSpecificLibPaths()) addIfExists(getFilePaths(), Path); + + if (D.IsFlangMode()) { +getIntrinsicModulePaths().append(

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-29 Thread Joseph Huber via cfe-commits
@@ -29,6 +29,8 @@ MODULE_PASS("amdgpu-printf-runtime-binding", AMDGPUPrintfRuntimeBindingPass()) MODULE_PASS("amdgpu-remove-incompatible-functions", AMDGPURemoveIncompatibleFunctionsPass(*this)) MODULE_PASS("amdgpu-sw-lower-lds", AMDGPUSwLowerLDSPass(*this)) MODULE_PASS("amdg

[clang] [Clang] Disable RTTI for offloading at the frontend level (PR #127082)

2025-04-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/127082 From b17f35541bb5de23389afe0af61cda2cac749e81 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 13 Feb 2025 09:27:24 -0600 Subject: [PATCH] [Clang] Disable RTTI for offloading at the frontend level Summar

[clang] [Clang][NFC] Move OffloadArch enum to a generic location (PR #137070)

2025-04-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/137070

[clang] [llvm] [OpenMP] Do not emit default thread limits of 128 (PR #87558)

2025-04-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/87558

[clang] [Clang][NFC] Move OffloadArch enum to a generic location (PR #137070)

2025-04-24 Thread Joseph Huber via cfe-commits
@@ -97,30 +97,30 @@ static const OffloadArchToStringMap arch_names[] = { #undef GFX const char *OffloadArchToString(OffloadArch A) { - auto result = std::find_if( - std::begin(arch_names), std::end(arch_names), - [A](const OffloadArchToStringMap &map) { return A ==
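
The hunk above is cut off, so as a rough, self-contained sketch of the table-lookup pattern it touches: the struct fields and the `std::find_if` call follow the quoted snippets in this thread, but the three enum values, the table contents, and the "unknown" fallback are placeholders assumed for illustration, not the real contents of `OffloadArch.h` or `OffloadArch.cpp`.

```cpp
#include <algorithm>
#include <cstring>
#include <iterator>

// Placeholder subset of architectures; the real enum is much larger.
enum class OffloadArch { Unknown, SM_90, GFX90a };

struct OffloadArchToStringMap {
  OffloadArch arch;
  const char *arch_name;
  const char *virtual_arch_name;
};

// Illustrative table entries only (assumed values).
static const OffloadArchToStringMap arch_names[] = {
    {OffloadArch::Unknown, "unknown", ""},
    {OffloadArch::SM_90, "sm_90", "compute_90"},
    {OffloadArch::GFX90a, "gfx90a", "compute_amdgcn"},
};

// Linear search over the table; falls back to "unknown" when nothing matches.
const char *OffloadArchToString(OffloadArch A) {
  auto result = std::find_if(
      std::begin(arch_names), std::end(arch_names),
      [A](const OffloadArchToStringMap &map) { return A == map.arch; });
  if (result == std::end(arch_names))
    return "unknown";
  return result->arch_name;
}
```

A call such as `OffloadArchToString(OffloadArch::GFX90a)` walks the table once and returns the matching `arch_name`. The table is small, so a linear scan is probably adequate here; a map would only matter if lookups became hot.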

[clang] [Clang][NFC] Move OffloadArch enum to a generic location (PR #137070)

2025-04-24 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Thanks https://github.com/llvm/llvm-project/pull/137070

[clang] [Clang][NFC] Move OffloadArch enum to a generic location (PR #137070)

2025-04-24 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/137070

[clang] [Clang][NFC] Move OffloadArch enum to a generic location (PR #137070)

2025-04-24 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,126 @@ +#include "clang/Basic/OffloadArch.h" + +#include "llvm/ADT/StringRef.h" + +#include + +namespace clang { + +namespace { +struct OffloadArchToStringMap { + OffloadArch arch; + const char *arch_name; + const char *virtual_arch_name; jhuber6 wr

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-24 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > A naive question from someone who is not familiar with this area: Is any of > this stuff usable with anything but a matching version of clang? If no, can > we place these things in the clang resource directory, where the other > version-bound runtimes live? It's not intended,

[clang] [Clang] Move OffloadArch enum to a generic location and add initial set of Intel OffloadArch values (PR #137070)

2025-04-23 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,143 @@ +//===--- OffloadArch.h - Definition of offloading architectures --- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [OpenMP] Update the bitcode library install and search path (PR #136754)

2025-04-23 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/136754

[clang] [llvm] [openmp] [OpenMP] Change build of OpenMP device runtime to be a separate runtime (PR #136729)

2025-04-23 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I think using the LLVM_ENABLE_RUNTIMES-mechanism is a great idea. Regarding > the move back to `openmp/device`, I don't really have an opinion. However, > there are some arguments to make: > > 1. The same arguments apply to `libomptarget` as well > > 2. Definitions su

[clang] [llvm] [openmp] [OpenMP] Change build of OpenMP device runtime to be a separate runtime (PR #136729)

2025-04-23 Thread Joseph Huber via cfe-commits
@@ -122,35 +130,41 @@ else() get_clang_resource_dir(LIBOMP_HEADERS_INSTALL_PATH SUBDIR include) endif() -# Build host runtime library, after LIBOMPTARGET variables are set since they are needed -# to enable time profiling support in the OpenMP runtime. -add_subdirectory(run

[clang] [llvm] [OpenMP] Update the bitcode library install and search path (PR #136754)

2025-04-23 Thread Joseph Huber via cfe-commits
@@ -2794,6 +2794,11 @@ void tools::addOpenMPDeviceRTL(const Driver &D, for (const auto &LibPath : HostTC.getFilePaths()) LibraryPaths.emplace_back(LibPath); + // Check the target specific library path for the triple as well. + SmallString<128> P(D.Dir); + llvm::sys::p

[clang] [llvm] [OpenMP] Update the bitcode library install and search path (PR #136754)

2025-04-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/136754 Summary: This was accidentally kept in the old location when we moved to the new `lib//` location for the DeviceRTL. Move this to reduce the delta with https://github.com/llvm/llvm-project/pull/136729. >From 21

[clang] [llvm] [openmp] [OpenMP] Change build of OpenMP device runtime to be a separate runtime (PR #136729)

2025-04-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/136729 >From 748a7f76bf0188e0a1b72fcd5527a03a5ca2f054 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Tue, 22 Apr 2025 12:05:42 -0500 Subject: [PATCH] [OpenMP] Change build of OpenMP device runtime to be a separate

[clang] [llvm] [openmp] [OpenMP] Change build of OpenMP device runtime to be a separate runtime (PR #136729)

2025-04-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/136729 >From ee6ca9501a07746c446a106619567d3faff07e98 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Tue, 22 Apr 2025 12:05:42 -0500 Subject: [PATCH] [OpenMP] Change build of OpenMP device runtime to be a separate

[clang] [llvm] [openmp] [OpenMP] Change build of OpenMP device runtime to be a separate runtime (PR #136729)

2025-04-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/136729 Summary: Currently we build the OpenMP device runtime as part of the `offload/` project. This is problematic because it has several restrictions when compared to the normal offloading runtime. It can only be buil

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Yeah, appending `-march=` seems to work. Is this a functional work-around for now? ```diff diff --git a/offload/DeviceRTL/CMakeLists.txt b/offload/DeviceRTL/CMakeLists.txt index cce360236960..277ad9816411 100644 --- a/offload/DeviceRTL/CMakeLists.txt +++ b/offload/DeviceRTL/CMak

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > No, I'm afraid that didn't change anything. However, it did if I added it to > `target_link_options` too. > > That said, you want to instead: > > ```diff > --- a/offload/DeviceRTL/CMakeLists.txt > +++ b/offload/DeviceRTL/CMakeLists.txt > @@ -132,7 +132,7 @@ function(compileDev

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > It should only be invoking `nvptx-arch` if the user passed `-march=native`. > > Sorry, didn't notice this sentence. Well, _I am_ building with > `-march=native` here — after all, other files are built for a CPU. If I > change it to, say, `-march=zen2`, then it indeed compile

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > How did you disable it? Perhaps it's failing because of the specific error: > > ``` > $ nvptx-arch > > Failed to 'dlopen' libcuda.so.1 > ``` > > For comparison, `amdgpu-

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: I disabled my NVIDIA GPU discovery and I could build it successfully. What's your `clang` version? I'm wondering what could be different here. https://github.com/llvm/llvm-project/pull/126143

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Nah, building standalone directly. And separately from OpenMP. I see, it previously worked because when you built with `gcc` it was still finding `clang` in your environment and using that. I'm going to move this code so that it's more explicit that we only support a just-buil

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > This change broke building with GCC set as the C++ compiler: > > ``` > FAILED: libomptarget-nvptx.bc > : && /usr/lib/ccache/bin/x86_64-pc-linux-gnu-g++ -O2 -pipe -march=native > -Wl,-O1 -Wl,--as-needed -Wl,-z,pack-relative-relocs > --target=nvptx64-nvidia-cuda -r -nostdlib

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-18 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/126143

[clang] [llvm] [SYCL] Add clang-linker-wrapper changes to call clang-sycl-linker for SYCL offloads (PR #135683)

2025-04-17 Thread Joseph Huber via cfe-commits
@@ -792,6 +805,7 @@ bundleLinkedOutput(ArrayRef Images, const ArgList &Args, llvm::TimeTraceScope TimeScope("Bundle linked output"); switch (Kind) { case OFK_OpenMP: + case OFK_SYCL: return bundleOpenMP(Images); jhuber6 wrote: Could call it `offlo
