[PATCH] D128914: [HIP] Add support for handling HIP in the linker wrapper

2022-06-30 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, JonChesterfield, yaxunl, tra. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, sstefan1. Herald added a project: clang. This patch adds the necessary changes required

[PATCH] D128850: [HIP] Generate offloading entries for HIP with the new driver.

2022-06-29 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, JonChesterfield, yaxunl, tra. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, sstefan1. Herald added a project: clang. This patch adds the small change required to

[PATCH] D128923: [LinkerWrapper] Add AMDGPU specific options to the LLD invocation

2022-06-30 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: arsenm, JonChesterfield, saiislam, yaxunl. Herald added subscribers: kosarev, t-tye, tpr, dstuttard, kzhuravl. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, wdng. Herald added a

[PATCH] D128752: [CUDA] Stop adding CUDA features twice

2022-06-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, tra, yaxunl. Herald added subscribers: mattd, carlosgalvezp. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, MaskRay. Herald added a project: clang. We currently call

[PATCH] D128752: [CUDA] Stop adding CUDA features twice

2022-06-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D128752#3616553 , @tra wrote: >> we no longer will have a cached CUDA installation so we will usually create >> it twice. > > Does that result in extra output in case we find an unexpected CUDA version, > or when compiler is

[PATCH] D128914: [HIP] Add support for handling HIP in the linker wrapper

2022-06-30 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. Thanks for the comments. Comment at: clang/test/Driver/linker-wrapper.c:109 // RUN: clang-offload-packager -o %t-lib.out \ // RUN: --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70 \ tra

[PATCH] D127686: [Offloading] Embed the target features in the OffloadBinary

2022-06-22 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. ping. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D127686/new/ https://reviews.llvm.org/D127686 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[PATCH] D127901: [LinkerWrapper] Add PTX output to CUDA fatbinary in LTO-mode

2022-06-22 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D127901#3603006 , @tra wrote: > Then we do need a knob controlling whether we do want to embed PTX or not. > The default should be "off" IMO. > We currently have `--[no-]cuda-include-ptx=` we may reuse for that purpose. We

[PATCH] D127901: [LinkerWrapper] Add PTX output to CUDA fatbinary in LTO-mode

2022-06-22 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D127901#3603467 , @tra wrote: > I'm not sure I follow. WDYM by "go inside the binary itself" ? I assume you > mean the per-GPU offload binaries inside per-TU .o. so that it could be used > when that GPU object gets linked

[PATCH] D127246: [LinkerWrapper] Rework the linker wrapper and use owning binaries

2022-06-08 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 435118. jhuber6 added a comment. There was a problem where the Triple and Arch data would be deallocated when the LTO pass took ownership of every single file. Add a UniqueStringSaver to make sure they are still accessible after linking. Repository: rG

[PATCH] D127246: [LinkerWrapper] Rework the linker wrapper and use owning binaries

2022-06-08 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 435124. jhuber6 added a comment. Sorry for this noise. This is a pretty large change but shouldn't affect any functionality and passes all the tests I know of, so this should be good to land. Let me know if you have any objections to how I've structured

[PATCH] D127304: [LinkerWrapper] Embed OffloadBinaries for OpenMP offloading images

2022-06-08 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, saiislam, JonChesterfield, tianshilei1992. Herald added subscribers: guansong, yaxunl. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, sstefan1. Herald added a project:

[PATCH] D127246: [LinkerWrapper] Rework the linker wrapper and use owning binaries

2022-06-08 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 435122. jhuber6 added a comment. Add use of bitcode libraries so this works on AMD. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D127246/new/ https://reviews.llvm.org/D127246 Files:

[PATCH] D127515: [Clang] Change host/device only compilation to a driver mode

2022-06-10 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: tra, jdoerfert, JonChesterfield, yaxunl. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, MaskRay. Herald added a project: clang. We use the flags `--offload-host-only` and

[PATCH] D127686: [Offloading] Embed the target features in the OffloadBinary

2022-06-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 436611. jhuber6 added a comment. Does this approach work? I'm just using the reverse iterator and only adding the argument if it hasn't been seen yet. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D127686/new/

[PATCH] D127686: [Offloading] Embed the target features in the OffloadBinary

2022-06-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. Thanks for the comments, I'll try to address them. Comment at: clang/lib/Driver/ToolChains/Clang.cpp:8320 +TC->getDriver().isUsingLTO(/* IsOffload */ true) +? ",feature=" + llvm::join(FeatureArgs, ",feature=") +: "";

[PATCH] D127707: [Clang] Simplify unifying target features

2022-06-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: yaxunl, jdoerfert, tra. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, MaskRay. Herald added a project: clang. This patch simplifies how we unify target features. Now we simply

[PATCH] D127686: [Offloading] Embed the target features in the OffloadBinary

2022-06-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 436607. jhuber6 added a comment. Adjust how we generate arguments. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D127686/new/ https://reviews.llvm.org/D127686 Files: clang/lib/Driver/ToolChains/Clang.cpp

[PATCH] D127901: [LinkerWrapper] Add PTX output to CUDA fatbinary in LTO-mode

2022-06-15 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, JonChesterfield, tra, yaxunl. Herald added subscribers: mattd, gchakrabarti, asavonic, inglorion. Herald added a project: All. jhuber6 requested review of this revision. Herald added a project: clang. Herald added a subscriber:

[PATCH] D127686: [Offloading] Embed the target features in the OffloadBinary

2022-06-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, JonChesterfield, tianshilei1992, tra, yaxunl, saiislam. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, sstefan1, MaskRay. Herald added a project: clang. The target

[PATCH] D127901: [LinkerWrapper] Add PTX output to CUDA fatbinary in LTO-mode

2022-06-16 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D127901#3590402 , @tra wrote: > Playing devil's advocate, I've got to ask -- do we even want to support JIT? > > JIT brings more trouble than benefits. > > - substantial start-up time on nontrivial apps. Last time I tried

[PATCH] D127673: [OpenMP] Fix offload packager not writing to temps correctly

2022-06-14 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG6a6484c666ed: [OpenMP] Fix offload packager not writing to temps correctly (authored by jhuber6). Repository: rG LLVM Github Monorepo CHANGES

[PATCH] D127707: [Clang] Simplify unifying target features

2022-06-14 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGc4a2674e21c4: [Clang] Simplify unifying target features (authored by jhuber6). Changed prior to commit:

[PATCH] D128206: [Clang] Allow multiple comma separated arguments to `--offload-arch=`

2022-06-20 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: tra, yaxunl, jdoerfert, JonChesterfield, ye-luo. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, sstefan1, MaskRay. Herald added a project: clang. This patch updates the

[PATCH] D127246: [LinkerWrapper] Rework the linker wrapper and use owning binaries

2022-06-20 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. Is anyone up to review this? I'm mostly looking for some feedback on the interfaces I've built. If no one has time to look into it I can probably just land without review. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D127246: [LinkerWrapper] Rework the linker wrapper and use owning binaries

2022-06-07 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, tianshilei1992, JonChesterfield, tra, yaxunl. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, sstefan1. Herald added a project: clang. The linker wrapper currently

[PATCH] D127246: [LinkerWrapper] Rework the linker wrapper and use owning binaries

2022-06-09 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 435571. jhuber6 added a comment. Fixing bug when capturing a StringRef by reference in a callback. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D127246/new/ https://reviews.llvm.org/D127246 Files:

[PATCH] D127515: [Clang] Change host/device only compilation to a driver mode

2022-06-13 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG1054a7318788: [Clang] Change host/device only compilation to a driver mode (authored by jhuber6). Repository: rG LLVM Github Monorepo CHANGES

[PATCH] D127673: [OpenMP] Fix offload packager not writing to temps correctly

2022-06-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 marked an inline comment as done. jhuber6 added a comment. Thanks for the review. Comment at: clang/lib/Driver/Driver.cpp:5420 +/*CreatePrefixForHost=*/isa(A) || +(!!A->getOffloadingHostActiveKinds() && !AtTopLevel)); if (isa(JA)) {

[PATCH] D127673: [OpenMP] Fix offload packager not writing to temps correctly

2022-06-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 436493. jhuber6 added a comment. Addressing nits. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D127673/new/ https://reviews.llvm.org/D127673 Files: clang/lib/Driver/Driver.cpp

[PATCH] D127673: [OpenMP] Fix offload packager not writing to temps correctly

2022-06-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, JonChesterfield, yaxunl, tra. Herald added a subscriber: guansong. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, sstefan1, MaskRay. Herald added a project: clang.

[PATCH] D125165: [Clang] Introduce clang-offload-packager tool to bundle device files

2022-06-03 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D125165#3557015 , @tra wrote: > @jhuber6 -- @MaskRay has found that `ninja install` is failing in a clean > build with: > > clang: error: unable to execute command: Executable > "clang-offload-packager" doesn't exist! > >

[PATCH] D125165: [Clang] Introduce clang-offload-packager tool to bundle device files

2022-06-03 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D125165#3557441 , @MaskRay wrote: > Add openmp to `LLVM_ENABLE_PROJECTS` to trigger the issue: > > cmake -GNinja -Sllvm -B/tmp/out/play -DCMAKE_BUILD_TYPE=Release > -DLLVM_ENABLE_PROJECTS='clang;openmp' >

[PATCH] D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given

2022-05-24 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D125904#3532608 , @tra wrote: > That said, I would consider compiling the same source with different > preprocessor options to be a legitimate use case that we should support. > Explicitly passing cuid would work as a

[PATCH] D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker

2022-05-24 Thread Joseph Huber via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. jhuber6 marked an inline comment as done. Closed by commit rGf37101983fc9: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker (authored by jhuber6). Changed prior to commit:

[PATCH] D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given

2022-05-24 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 431666. jhuber6 added a comment. Removing use of the line number, instead replacing it with an 8 character wide hash of the `-D` options passed to the front-end. This should make it sufficiently unique for users compiling the same file with different

[PATCH] D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given

2022-05-24 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:6845-6846 +llvm::MD5::MD5Result Result; +for (const auto &Arg : PreprocessorOpts.Macros) + Hash.update(Arg.first); +Hash.final(Result); yaxunl wrote: > Are these options

[PATCH] D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given

2022-05-24 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: clang/lib/CodeGen/CodeGenModule.cpp:6845-6846 +llvm::MD5::MD5Result Result; +for (const auto &Arg : PreprocessorOpts.Macros) + Hash.update(Arg.first); +Hash.final(Result); yaxunl wrote: > jhuber6 wrote: > >

[PATCH] D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given

2022-05-24 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 431702. jhuber6 added a comment. Adding extra comment to mention the hidden requirement that the driver should not define a different `-D` option for the host and device. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D129885: [CUDA] Make the new driver properly ignore non-CUDA inputs

2022-07-15 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGbb957a8d524c: [CUDA] Make the new driver properly ignore non-CUDA inputs (authored by jhuber6). Repository: rG LLVM Github Monorepo CHANGES

[PATCH] D129885: [CUDA] Make the new driver properly ignore non-CUDA inputs

2022-07-15 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 445087. jhuber6 added a comment. Adjusting tests Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D129885/new/ https://reviews.llvm.org/D129885 Files: clang/lib/Driver/Driver.cpp

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-07-19 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 marked an inline comment as done. jhuber6 added a comment. In D130096#3663295 , @yaxunl wrote: > There is no constant propagation for globals with weak linkage, right? > Otherwise, it won't work. My concern is that there may be optimization

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-07-19 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D130096#3663062 , @JonChesterfield wrote: > A safer bet is to use the current control flow that links in specific bitcode > files, but create the global directly instead of linking in the file. That'll > give us zero

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-07-19 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D130096#3663411 , @arsenm wrote: > In D130096#3663398 , @jhuber6 wrote: > >> In D130096#3663295 , @yaxunl wrote: >> >>> There is no constant

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-07-19 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D130096#3663295 , @yaxunl wrote: > There is no constant propagation for globals with weak linkage, right? > Otherwise, it won't work. My concern is that there may be optimization passes > which do not respect the weak linkage

[PATCH] D129586: [LinkerWrapper] Support remarks files for device LTO

2022-07-12 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, tianshilei1992, JonChesterfield. Herald added subscribers: wenlei, inglorion. Herald added a project: All. jhuber6 requested review of this revision. Herald added a project: clang. Herald added a subscriber: cfe-commits. This

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-07-20 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D130096#3666155 , @yaxunl wrote: > The current patch does not consider HIP/OpenCL compile options, therefore the > value of these variables are not correct for OpenCL/HIP. They need to be > overridden by the variables with

[PATCH] D129784: [HIP] Allow the new driver to compile HIP in non-RDC mode

2022-07-20 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. ping Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D129784/new/ https://reviews.llvm.org/D129784 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[PATCH] D129784: [HIP] Allow the new driver to compile HIP in non-RDC mode

2022-07-20 Thread Joseph Huber via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG0c1b32717bcf: [HIP] Allow the new driver to compile HIP in non-RDC mode (authored by jhuber6). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D129784/new/

[PATCH] D122683: [OpenMP] Use new offloading binary when embedding offloading images

2022-07-21 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D122683#3668423 , @jhuber6 wrote: > In D122683#3668412 , @mgorny wrote: > >> I'm sorry for noticing it this late but this change seems to have broken the >> test on 32-bit x86: > >

[PATCH] D122683: [OpenMP] Use new offloading binary when embedding offloading images

2022-07-21 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D122683#3668412 , @mgorny wrote: > I'm sorry for noticing it this late but this change seems to have broken the > test on 32-bit x86: Seems to be a difference in the alignment, I'm not sure why this changes though because

[PATCH] D129581: [Clang] Rework LTO argument handling in the linker wrapper

2022-07-14 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. On second thought I'm not sure if overloading `-plugin-opt` is a good idea because we could have situations where we'd want them to be mutually exclusive, although I'd like to reuse the logic to set the arguments. I could change it to emit `-offload-opt=` instead or

[PATCH] D129784: [HIP] Allow the new driver to compile HIP in non-RDC mode

2022-07-14 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, tra, yaxunl, JonChesterfield. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, MaskRay. Herald added a project: clang. The new driver primarily allows us to support

[PATCH] D129694: [OPENMP] Make declare target static global externally visible

2022-07-14 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: clang/test/OpenMP/target_update_messages.cpp:17 -static int y; -#pragma omp declare target(y) - -void yyy() { -#pragma omp target update to(y) // expected-error {{the host cannot update a declare target variable that is not

[PATCH] D128090: [Clang][OpenMP] Process multi-arch compilation options given via -march

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. Sorry never noticed this revision. The purpose of this patch seems to be supporting something like this clang input.c -fopenmp -fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64 -march=sm_70 -Xopenmp-target=nvptx64 -march=sm_80 Right now the above works if you replace

[PATCH] D129655: [CUDA] Allow the new driver to compile CUDA in non-RDC mode

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, tra, JonChesterfield, yaxunl. Herald added subscribers: mattd, carlosgalvezp. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, MaskRay. Herald added a project: clang.

[PATCH] D128090: [Clang][OpenMP] Process multi-arch compilation options given via -march

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D128090#3648984 , @tra wrote: > At some point we should start consolidating the ways we can specify an > offload target and try to avoid adding new ones until then. Agreed, that was my intention with making `--offload-arch`

[PATCH] D128090: [Clang][OpenMP] Process multi-arch compilation options given via -march

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D128090#3649202 , @tra wrote: > In D128090#3649125 , @jhuber6 wrote: > >> It just defaults to `sm_35` if CUDA isn't present on the system IIRC. >> Alternatively we could ship a tool

[PATCH] D128090: [Clang][OpenMP] Process multi-arch compilation options given via -march

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D128090#3649579 , @tra wrote: > In D128090#3649235 , @jhuber6 wrote: > >> Interesting, may be worthwhile to query that if it exists, though AMD does >> this with `amdgpu-arch` which

[PATCH] D128090: [Clang][OpenMP] Process multi-arch compilation options given via -march

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a subscriber: tra. jhuber6 added a comment. In D128090#3648879 , @saiislam wrote: > `-Xopenmp-target -march ` used to be the only option to target a specific sub > arch before `--offload-arch`. But, it doesn't support multiple archs. >

[PATCH] D129655: [CUDA] Allow the new driver to compile CUDA in non-RDC mode

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 marked an inline comment as done. jhuber6 added inline comments. Comment at: clang/lib/Driver/ToolChains/Clang.cpp:7009 + // Host-side offloading compilation receives all device-side outputs. Include + // them in the host compilation depending on the target. if

[PATCH] D128090: [Clang][OpenMP] Process multi-arch compilation options given via -march

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D128090#3649059 , @tra wrote: > In D128090#3648999 , @jhuber6 wrote: > >> Right now there's `CLANG_OPENMP_NVPTX_DEFAULT_ARCH`, which is defined by >> CMake to be the architecture of

[PATCH] D129655: [CUDA] Allow the new driver to compile CUDA in non-RDC mode

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 marked an inline comment as done. jhuber6 added a comment. It's also worth noting that this doesn't include the `PTX` output for JIT in the fatbinary, it would be relatively easy to include that but I wanted to ask how we should handle that. Comment at:

[PATCH] D129655: [CUDA] Allow the new driver to compile CUDA in non-RDC mode

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 444398. jhuber6 marked an inline comment as done. jhuber6 added a comment. Updating and making suggested changes. I removed the old `fgpu-rdc` in rG6abaa8e2103760025cee76528f555de7cf6698e6

[PATCH] D129655: [CUDA] Allow the new driver to compile CUDA in non-RDC mode

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: clang/lib/Driver/ToolChains/Clang.cpp:6998-6999 +CmdArgs.push_back(CudaDeviceInput->getFilename()); +if (IsRDCMode) + CmdArgs.push_back("-fgpu-rdc"); + } else if (IsCuda && !HostOffloadingInputs.empty() && !IsRDCMode) {

[PATCH] D129694: [OPENMP] Make declare target static global externally visible

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: clang/lib/CodeGen/CGOpenMPRuntime.cpp:3284-3286 // Hidden or internal symbols on the device are not externally visible. We // should not attempt to register them by creating an offloading entry. if (auto *GV =

[PATCH] D129694: [OPENMP] Make declare target static global externally visible

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. Thanks for the patch. I still think this is a silly feature to support, but users will probably expect it. See comments. Comment at: clang/lib/CodeGen/CGOpenMPRuntime.cpp:10790 Flags = OffloadEntriesInfoManagerTy::OMPTargetGlobalVarEntryTo; -

[PATCH] D129655: [CUDA] Allow the new driver to compile CUDA in non-RDC mode

2022-07-13 Thread Joseph Huber via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. jhuber6 marked an inline comment as done. Closed by commit rGb370be37cca7: [CUDA] Allow the new driver to compile CUDA in non-RDC mode (authored by jhuber6). Changed prior to commit:

[PATCH] D129694: [OPENMP] Make declare target static global externally visible

2022-07-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. I still think we shouldn't bother making all the noise containing the original name. Just mangle it and treat it like every other declare target variable without introducing any extra complexity. These symbols never should've been emitted in the first place so I'm not

[PATCH] D127304: [LinkerWrapper] Embed OffloadBinaries for OpenMP offloading images

2022-07-21 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG080022d8ed6c: [LinkerWrapper] Embed OffloadBinaries for OpenMP offloading images (authored by jhuber6). Repository: rG LLVM Github Monorepo

[PATCH] D129885: [CUDA] Make the new driver properly ignore non-CUDA inputs

2022-07-15 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, tra, yaxunl. Herald added subscribers: mattd, carlosgalvezp. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, MaskRay. Herald added a project: clang. The new driver

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-07-19 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D130096#3663010 , @JonChesterfield wrote: > Tagging Brian as the code owner of rocm device libs - emitting these in clang > would simplify that library. > > Currently clang reads these commandline flags and conditionally

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-07-19 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: JonChesterfield, yaxunl, saiislam, arsenm, carlo.bertolli, MaskRay, jdoerfert, tianshilei1992. Herald added subscribers: kosarev, StephenFan, t-tye, tpr, dstuttard, jvesely, kzhuravl. Herald added a project: All. jhuber6 requested review of

[PATCH] D130096: [Clang][AMDGPU] Emit AMDGPU library control constants in clang

2022-07-19 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. Let me know if I should move this code somewhere else, or if there are problems. One change I made is that the constant is `weak_odr` and `hidden` instead of `linkonce_odr` and `protected`. This is so this constant is alive until link time, AMDGPU pretty much always

[PATCH] D124721: [OpenMP] Allow compiling multiple target architectures with OpenMP

2022-05-02 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: clang/test/Driver/amdgpu-openmp-toolchain-new.c:6 // RUN: | FileCheck %s +// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa \ +// RUN: --offload-arch=gfx906

[PATCH] D123471: [CUDA] Create offloading entries when using the new driver

2022-05-02 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 426404. jhuber6 added a comment. Fix test. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D123471/new/ https://reviews.llvm.org/D123471 Files: clang/include/clang/Basic/LangOptions.def

[PATCH] D124721: [OpenMP] Allow compiling multiple target architectures with OpenMP

2022-04-30 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, JonChesterfield, yaxunl, saiislam, tianshilei1992, tra. Herald added subscribers: kerbowa, guansong, jvesely. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, sstefan1,

[PATCH] D123810: [Cuda] Add initial support for wrapping CUDA images in the new driver.

2022-05-02 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 426420. jhuber6 added a comment. Updating to use the `OffloadKind` enum rather than the string. I will also probably simplify some of the logic for handling multiple files here in a later patch. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST

[PATCH] D123471: [CUDA] Create offloading entries when using the new driver

2022-04-29 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 426150. jhuber6 added a comment. Fixed missing info flag for `--offload-new-driver`. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D123471/new/ https://reviews.llvm.org/D123471 Files:

[PATCH] D124721: [OpenMP] Allow compiling multiple target architectures with OpenMP

2022-05-02 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: clang/test/Driver/amdgpu-openmp-toolchain-new.c:6 // RUN: | FileCheck %s +// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa \ +// RUN: --offload-arch=gfx906

[PATCH] D124721: [OpenMP] Allow compiling multiple target architectures with OpenMP

2022-05-02 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: clang/test/Driver/amdgpu-openmp-toolchain-new.c:6 // RUN: | FileCheck %s +// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa \ +// RUN: --offload-arch=gfx906

[PATCH] D122683: [OpenMP] Use new offloading binary when embedding offloading images

2022-04-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 422580. jhuber6 added a comment. Fix test after move to opaque pointers. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D122683/new/ https://reviews.llvm.org/D122683 Files:

[PATCH] D123313: [OpenMP] Make clang argument handling for the new driver more generic

2022-04-18 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. ping Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D123313/new/ https://reviews.llvm.org/D123313 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[PATCH] D123325: [Clang] Make enabling the new driver more generic

2022-04-18 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. ping Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D123325/new/ https://reviews.llvm.org/D123325

[PATCH] D122831: [OpenMP] Make the new offloading driver the default

2022-04-18 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 423416. jhuber6 added a comment. Herald added a subscriber: mattd. Splitting major changes into two files as per suggestion. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D122831/new/

[PATCH] D122831: [OpenMP] Make the new offloading driver the default

2022-04-18 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 423433. jhuber6 added a comment. Fix test Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D122831/new/ https://reviews.llvm.org/D122831 Files: clang/include/clang/Driver/Options.td

[PATCH] D123325: [Clang] Make enabling the new driver more generic

2022-04-18 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: clang/lib/Driver/Driver.cpp:3885-3888 + bool UseNewOffloadingDriver = + C.isOffloadingHostKind(C.getActiveOffloadKinds()) && + (Args.hasArg(options::OPT_foffload_new_driver) || +

[PATCH] D123946: [CUDA][HIP] Fix gpu.used.external

2022-04-18 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 accepted this revision. jhuber6 added a comment. This revision is now accepted and ready to land. LGTM CHANGES SINCE LAST ACTION https://reviews.llvm.org/D123946/new/ https://reviews.llvm.org/D123946

[PATCH] D122831: [OpenMP] Make the new offloading driver the default

2022-04-18 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGae23be84cb60: [OpenMP] Make the new offloading driver the default (authored by jhuber6). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST

[PATCH] D120273: [OpenMP] Allow CUDA to be linked with OpenMP using the new driver

2022-04-18 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 423460. jhuber6 added a comment. Herald added subscribers: mattd, dexonsmith, MaskRay. rebase. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D120273/new/ https://reviews.llvm.org/D120273 Files:

[PATCH] D123325: [Clang] Make enabling the new driver more generic

2022-04-18 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 423459. jhuber6 added a comment. Rebase & update after making the new driver default for OpenMP. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D123325/new/ https://reviews.llvm.org/D123325 Files:

[PATCH] D125050: [OpenMP] Try to Infer target triples using the offloading architecture

2022-05-06 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 marked an inline comment as done. jhuber6 added a comment. In D125050#3496289, @saiislam wrote: > Looks good to me. > > Will it work with `-fno-openmp`? Sometimes `-fno-openmp` is used by the > end-user to override system provided `-fopenmp`

[PATCH] D125050: [OpenMP] Try to Infer target triples using the offloading architecture

2022-05-06 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 427599. jhuber6 added a comment. Add test for disabling openmp with `-fno-openmp` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D125050/new/ https://reviews.llvm.org/D125050 Files:

[PATCH] D125092: [OpenMP] Add basic support for properly handling static libraries

2022-05-06 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D125092#3497033, @saiislam wrote: >> Ideally we could just put this on the linker itself, but nvlink doesn't seem >> to support .a files. > > Yes, nvlink does not support archives. So we used a wrapper to extract cubin >

[PATCH] D123812: [CUDA] Add wrapper code generation for registering CUDA images

2022-05-06 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D123812#3496914, @yaxunl wrote: > LGTM. Did you forget to accept the revision? D123810 and D123471 still need to be looked at, but these are mostly

[PATCH] D125092: [OpenMP] Add basic support for properly handling static libraries

2022-05-06 Thread Joseph Huber via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGe12905b4d5f9: [OpenMP] Add basic support for properly handling static libraries (authored by jhuber6). Changed prior to commit: https://reviews.llvm.org/D125092?vs=427618&id=427646#toc Repository: rG

[PATCH] D125092: [OpenMP] Add basic support for properly handling static libraries

2022-05-06 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision. jhuber6 added reviewers: jdoerfert, JonChesterfield, tianshilei1992. Herald added subscribers: guansong, yaxunl. Herald added a project: All. jhuber6 requested review of this revision. Herald added subscribers: cfe-commits, sstefan1. Herald added a project: clang.

[PATCH] D123471: [CUDA] Create offloading entries when using the new driver

2022-05-06 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: clang/lib/CodeGen/CGCUDARuntime.h:58-70 + OffloadRegionKernelEntry = 0x0, +}; + +/// The kind flag of the global variable entry. +enum OffloadVarEntryKindFlag : uint32_t { + /// Mark the entry as a global variable.
