[PATCH] D141350: Fix runtime problem for base class member data used in target region.

2023-01-09 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. Does this patch work when there is more than one level of inheritance? For example `class B: public A; class C: public B` Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D141350/new/ https://reviews.llvm.org/D141350
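
The multi-level case asked about above could be exercised with something like the following. This is only a sketch, not the test from the patch: the member `data`, the methods `fill` and `sum`, and the helper `inheritedMemberUpdated` are hypothetical names, and only the `A`/`B`/`C` hierarchy comes from the comment. It assumes that mapping an inherited array member by name inside a member function is accepted by the compiler.

```cpp
#include <cassert>

// Hypothetical three-level hierarchy: A owns the data, C is two levels below.
class A {
public:
  int data[4] = {0, 0, 0, 0};
};

class B : public A {};

class C : public B {
public:
  // Writes the member inherited from A inside a target region.
  // When OpenMP is not enabled the pragma is ignored and the loop runs on the host.
  void fill(int base) {
#pragma omp target map(tofrom : data)
    for (int i = 0; i < 4; ++i)
      data[i] = base + i;
  }

  // Sums the inherited member on the host.
  int sum() const {
    int total = 0;
    for (int i = 0; i < 4; ++i)
      total += data[i];
    return total;
  }
};

// Hypothetical check: true when the values written in the target region
// are visible on the host afterwards.
bool inheritedMemberUpdated() {
  C c;
  c.fill(10);
  assert(c.data[0] == 10); // first element must hold the base value
  return c.sum() == 10 + 11 + 12 + 13;
}
```

Calling `inheritedMemberUpdated()` from a small `main` and building once with and once without offloading flags would show whether the device path behaves like the host path.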

[PATCH] D139287: [OpenMP] Introduce basic JIT support to OpenMP target offloading

2022-12-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. Got tons of runtime failures: `target doesn't support jit UNREACHABLE executed at /gpfs/jlse-fs0/users/yeluo/opt/llvm-clang/llvm-project-nightly/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h:543!` on AMD GPU gfx908 when running miniqmc

[PATCH] D133694: [Clang][OpenMP] Fix use_device_addr

2022-09-12 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp:14 +{ +#pragma omp target data use_device_addr(x) +{ dreachem wrote: > ye-luo wrote: > > doru1004 wrote: > > > doru1004 wrote: > > > >

[PATCH] D133694: [Clang][OpenMP] Fix use_device_addr

2022-09-12 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp:14 +{ +#pragma omp target data use_device_addr(x) +{ doru1004 wrote: > doru1004 wrote: > > ye-luo wrote: > > > doru1004 wrote: > > > >

[PATCH] D133694: [Clang][OpenMP] Fix use_device_addr

2022-09-12 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp:14 +{ +#pragma omp target data use_device_addr(x) +{ doru1004 wrote: > ye-luo wrote: > > doru1004 wrote: > > > ye-luo wrote: > > > >

[PATCH] D133694: [Clang][OpenMP] Fix use_device_addr

2022-09-12 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp:14 +{ +#pragma omp target data use_device_addr(x) +{ doru1004 wrote: > ye-luo wrote: > > doru1004 wrote: > > > ye-luo wrote: > > > > In my

[PATCH] D133694: [Clang][OpenMP] Fix use_device_addr

2022-09-12 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp:14 +{ +#pragma omp target data use_device_addr(x) +{ doru1004 wrote: > ye-luo wrote: > > In my understanding of the spec. > >

[PATCH] D133694: [Clang][OpenMP] Fix use_device_addr

2022-09-12 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp:14 +{ +#pragma omp target data use_device_addr(x) +{ In my understanding of the spec. `map(tofrom:x[0:256])` only maps the memory segment

[PATCH] D130020: [OpenMP] Deprecate the old driver for OpenMP offloading

2022-08-24 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision. ye-luo added a comment. The new driver still fails to offload to x86 in certain scenarios when linking static libraries. Once I link object files directly there is no issue. This is no worse than the old driver, which only supports linking object files directly.

[PATCH] D130020: [OpenMP] Deprecate the old driver for OpenMP offloading

2022-07-27 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. I noticed that in one of my applications, offload to x86 is not fully working with static libraries, but directly linking all object files resolves the issue. So the new driver doesn't cause a regression compared to the old driver, which doesn't work with static libraries

[PATCH] D129393: [Clang] Fix the wrong features being derivec in the offload packager

2022-07-08 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision. ye-luo added a comment. This revision is now accepted and ready to land. LGTM

[PATCH] D129383: [LinkerWrapper] Fix use of string savers and correctly pass bitcode libraries

2022-07-08 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision. ye-luo added a comment. This revision is now accepted and ready to land. Confirm that #56445 is fixed now

[PATCH] D128206: [Clang] Allow multiple comma separated arguments to `--offload-arch=`

2022-06-20 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision. ye-luo added a comment. This revision is now accepted and ready to land. LGTM. This allows me to write concise compile lines.

[PATCH] D123498: [clang] Adding Platform/Architecture Specific Resource Header Installation Targets

2022-04-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. @qiongsiwu1 Tested the updated patch. Works fine now.

[PATCH] D123498: [clang] Adding Platform/Architecture Specific Resource Header Installation Targets

2022-04-23 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. See https://github.com/llvm/llvm-project/issues/55002

[PATCH] D122592: [OpenMP] Fix library path missing when using OpenMP

2022-03-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. > using the multiarch directory If we can cross-compile libomp and libomptarget for the target system, we may have lib/x86_64-unknown-linux-gnu/libomp.so and lib/aarch64-unknown-linux-gnu/libomp.so. Compile clang once but compile the runtime library for multiple architectures.

[PATCH] D120106: [OpenMP] Add flag for disabling threat state in runtime

2022-02-17 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. Change the title: "threat state" should be "thread state".

[PATCH] D116865: [OpenMP][FIX] Emit debug declares only if debug info is available

2022-01-08 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. Confirm that #52938 is fixed by this patch.

[PATCH] D109885: [MLIR][[amdgpu-arch]][OpenMP] Remove direct dependency on /opt/rocm

2021-12-20 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: clang/tools/amdgpu-arch/CMakeLists.txt:13 + +find_package(hsa-runtime64 QUIET 1.2.0 HINTS ${CMAKE_INSTALL_PREFIX} PATH ${ROCM_PATH}) if (NOT ${hsa-runtime64_FOUND}) JonChesterfield wrote: > arsenm wrote: > > I also

[PATCH] D113140: [OpenMP][NFCI] Introduce the kernel environment for target regions

2021-11-11 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. > the kernel environment which contains information passed by the compiler to a > GPU kernel. Did you mean this environment is baked into the kernel at compile time? So there is no additional H2D cost at each call, right?

[PATCH] D105191: [Clang][OpenMP] Add partial support for Static Device Libraries

2021-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision. ye-luo added a comment. This revision is now accepted and ready to land. LGTM.

[PATCH] D105191: [Clang][OpenMP] Add partial support for Static Device Libraries

2021-10-05 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.h:62 + bool postClangLink); +void AddStaticDeviceLibs(Compilation *C, const Tool *T, const JobAction *JA, + const InputInfoList *Inputs, const Driver

[PATCH] D105191: [Clang][OpenMP] Add partial support for Static Device Libraries

2021-09-29 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.h:62 + bool postClangLink); +void AddStaticDeviceLibs(Compilation *C, const Tool *T, const JobAction *JA, + const InputInfoList *Inputs, const Driver

[PATCH] D105191: [Clang][OpenMP] Add partial support for Static Device Libraries

2021-09-29 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.h:62 + bool postClangLink); +void AddStaticDeviceLibs(Compilation *C, const Tool *T, const JobAction *JA, + const InputInfoList *Inputs, const Driver

[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-21 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. This patch doesn't seem to break anything on my side. @saiislam could you 1. address all the in-source review comments 2. update the title to `[Clang][OpenMP] Add partial support for Static Device Libraries` 3. update the patch description about what works and what

[PATCH] D109885: [MLIR][[amdgpu-arch]][OpenMP] Remove direct dependency on /opt/rocm

2021-09-16 Thread Ye Luo via Phabricator via cfe-commits
ye-luo requested changes to this revision. ye-luo added a comment. This revision now requires changes to proceed. The fallback /opt/rocm is desired. If a module system needs to point to a specific ROCm installation, set CMAKE_PREFIX_PATH= in the module file. If you would like to honor

[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-15 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. > The option of adding sm_XX in Bundle Entry ID when user hasn't used -march > flag, comes under command line simplification. I have a bunch of upcoming > patches which will significantly simplify OpenMP command line for GPU > offloading. But, don't you think this

[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-14 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. 1. modf works now. 2. if I modify the complile.sh clang++ -fopenmp -fopenmp-targets=nvptx64 -c classA.cpp rm -f libmylib.a ar qc libmylib.a classA.o ranlib libmylib.a clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp -L. -lmylib ./a.out doesn't work. I

[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-14 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. yeluo@epyc-server:~/opt/openmp-target/tests/math$ clang++ -fopenmp -fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64 -march=sm_80 modf.cpp -c yeluo@epyc-server:~/opt/openmp-target/tests/math$ clang-offload-bundler -type=o --inputs=modf.o --list openmp-nvptx64

[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-13 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. the modf test still doesn't work. The issue was from unbundle. case 1 works. clang++ -fopenmp -fopenmp-targets=nvptx64 modf.cpp -c clang++ -fopenmp -fopenmp-targets=nvptx64 modf.o case 2 clang++ -fopenmp -fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64

[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-13 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. @saiislam did you turn on offload? https://github.com/ye-luo/openmp-target/wiki/OpenMP-offload-compilers#llvm-clang On NVIDIA, it fails at CMake step. On AMD, make step stops because of unrelated issue. Please make the exact reproducer 1 working. Right now I got $

[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-11 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. @saiislam do my test cases work on your side? I tried this patch and still got a linking failure.

[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. @saiislam since clang-nvlink-wrapper has landed, could you update this patch?

[PATCH] D108291: [clang-nvlink-wrapper] Wrapper around nvlink for archive files

2021-09-05 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. This patch has landed as 83f3782c6129e7a5df3faaf0ae576611d16a8d49, but that is not reflected on Phabricator.

[PATCH] D108291: [clang-nvlink-wrapper] Wrapper around nvlink for archive files

2021-08-31 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision. ye-luo added a comment. This revision is now accepted and ready to land. Documentation is much improved. LGTM.

[PATCH] D101960: [openmp] Drop requirement on library path environment variables

2021-08-30 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. In D101960#2961133, @jdoerfert wrote: > There are 3 problems here (ignoring our test setup which should be discussed > separately): > > 1. make sure clang finds libomp.so > 2. make sure libomp.so (or clang?) finds

[PATCH] D101935: [clang] Search runtimes build tree for openmp runtime

2021-08-30 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. It seems that this path is baked into the clang executable even after make install. I'm not convinced this is the right direction.

[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-08-23 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: clang/lib/Driver/ToolChains/CommonArgs.h:62 + bool postClangLink); +void AddStaticDeviceLibs(Compilation *C, const Tool *T, const JobAction *JA, + const InputInfoList *Inputs, const Driver

[PATCH] D108291: [clang-nvlink-wrapper] Wrapper around nvlink for archive files

2021-08-23 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp:19 +/// Such an archive is then passed to this tool to extract cubin files before +/// passing to nvlink. +/// Right now clang-offload-bundler is only used to

[PATCH] D108291: [clang-nvlink-wrapper] Wrapper around nvlink for archive files

2021-08-18 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. These are the working steps in the linking script. clang-offload-bundler (host,device) in: complex_reduction.cpp.o out: complex_reduction-494ba8.o, complex_reduction-5aba63.cubin nvlink (device) in: complex_reduction-5aba63.cubin out:

[PATCH] D106960: [OffloadArch] Library to query properties of current offload archicture

2021-08-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. I am testing with aomp 13.0-5 on Ubuntu 20.04.2 LTS (Focal Fossa) yeluo@epyc-server:~$ offload-arch -a gfx906 ERROR: offload-arch not found for 10de:2486. yeluo@epyc-server:~$ offload-arch -c gfx906 sramecc+ xnack- yeluo@epyc-server:~$ offload-arch -n gfx906

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-30 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. Unfortunately I hit an error #include int main() { } ~/opt/llvm-clang/build_mirror_offload_main/bin/clang++ -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx906 main.cpp -c works fine

[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-29 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. How do we get this moving?

[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-07-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. Must I use llvm-ar/ranlib, or is the system ar/ranlib OK? 1. existing use case breaks Use https://github.com/ye-luo/openmp-target/blob/master/tests/math/modf.cpp $ clang++ -fopenmp -fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64 -march=sm_80 modf.cpp # still OK $

[PATCH] D106870: [OpenMP] Multi architecture compilation support

2021-07-27 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. In D106870#2907257, @saiislam wrote: > In D106870#2907252, @ye-luo wrote: > >> `-fopenmp-targets=amdgcn-amd-amdhsa,amdgcn-amd-amdhsa` seems burdensome. >> Could you just count how many

[PATCH] D106870: [OpenMP] Multi architecture compilation support

2021-07-27 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. `-fopenmp-targets=amdgcn-amd-amdhsa,amdgcn-amd-amdhsa` seems burdensome. Could you just count how many `-Xopenmp-target=amdgcn-amd-amdhsa` there are on the command line and then count the unique ones?
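
One possible reading of the counting suggestion above, sketched in C++. This is not driver code: `TargetCount` and `countOffloadTargets` are hypothetical names, and treating the argument that follows each `-Xopenmp-target=<triple>` as the flag forwarded to that target (for example `-march=gfx906`) is an assumption made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: how many -Xopenmp-target= options were seen,
// and how many distinct (triple, forwarded flag) pairs they form.
struct TargetCount {
  std::size_t total;
  std::size_t unique;
};

// Hypothetical helper: scans a command line for -Xopenmp-target=<triple>
// options and pairs each one with the argument that follows it.
TargetCount countOffloadTargets(const std::vector<std::string> &args) {
  const std::string prefix = "-Xopenmp-target=";
  std::set<std::pair<std::string, std::string>> seen;
  std::size_t total = 0;
  for (std::size_t i = 0; i < args.size(); ++i) {
    if (args[i].compare(0, prefix.size(), prefix) != 0)
      continue; // not an -Xopenmp-target= option
    ++total;
    const std::string triple = args[i].substr(prefix.size());
    const std::string forwarded =
        (i + 1 < args.size()) ? args[i + 1] : std::string();
    seen.insert({triple, forwarded});
  }
  assert(seen.size() <= total); // duplicates can only shrink the set
  return {total, seen.size()};
}
```

With three `-Xopenmp-target=amdgcn-amd-amdhsa` options forwarding `-march=gfx906`, `-march=gfx908` and `-march=gfx906` again, the sketch reports three occurrences and two unique targets.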

[PATCH] D106793: [OpenMP] Add a driver flag to enable the new device runtime library

2021-07-26 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. In D106793#2904943, @jhuber6 wrote: > In D106793#2904661, @ye-luo wrote: > >> Not clear from the summary what is the new driver option. > > It's a rewrite of the current device runtime.

[PATCH] D106793: [OpenMP] Add a driver flag to enable the new device runtime library

2021-07-26 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. It is not clear from the summary what the new driver option is.

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-22 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. In D96877#2578752, @tianshilei1992 wrote: > In D96877#2578748, @ye-luo wrote: > >> to me this is still desired + cmake creating libomptarget-nvptx-unknown.bc >> as a solution for forward

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-22 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. To me this is still desired, plus CMake creating libomptarget-nvptx-unknown.bc, as a solution for forward compatibility until a clean solution lands.

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-18 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision. ye-luo added a comment. This revision is now accepted and ready to land. Got it. Copying a file can be tricky. Compiling one more can easily be done in CMake.

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-18 Thread Ye Luo via Phabricator via cfe-commits
ye-luo requested changes to this revision. ye-luo added a comment. This revision now requires changes to proceed. Letting users copy the bc file is not feasible. Please handle this in CMake. libomptarget-nvptx-unknown.bc

[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-18 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. Does this patch include creating 'libomptarget-nvptx-unknown.bc'?

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-07 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: openmp/libomptarget/src/omptarget.cpp:233 MapperComponents -.Components[target_data_function == targetDataEnd ? I : E - I - 1]; +.Components[target_data_function == targetDataEnd ? E - I - 1 : I];
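
The changed index expression in the diff above flips which call walks the components back to front: after the change, `targetDataEnd` uses `E - I - 1` and every other call uses `I`. A minimal sketch of that selection, not the libomptarget code itself: `Phase` and `visitOrder` are hypothetical names, and only `E`, `I` and the ternary shape follow the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for "which target data function is running".
enum class Phase { Begin, End };

// Returns the components in the order they would be visited:
// front to back for Begin, back to front for End.
std::vector<int> visitOrder(const std::vector<int> &components, Phase phase) {
  std::vector<int> order;
  const std::size_t E = components.size();
  for (std::size_t I = 0; I < E; ++I) {
    // Same shape as the new line in the diff: End ? E - I - 1 : I.
    const std::size_t idx = (phase == Phase::End) ? E - I - 1 : I;
    assert(idx < E); // the index always stays inside the container
    order.push_back(components[idx]);
  }
  return order;
}
```

For components `{1, 2, 3}`, the Begin order is `1, 2, 3` and the End order is `3, 2, 1`.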

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-07 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: openmp/libomptarget/src/omptarget.cpp:233 MapperComponents -.Components[target_data_function == targetDataEnd ? I : E - I - 1]; +.Components[target_data_function == targetDataEnd ? E - I - 1 : I];

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-07 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: openmp/libomptarget/src/omptarget.cpp:233 MapperComponents -.Components[target_data_function == targetDataEnd ? I : E - I - 1]; +.Components[target_data_function == targetDataEnd ? E - I - 1 : I];

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: openmp/libomptarget/src/omptarget.cpp:233 MapperComponents -.Components[target_data_function == targetDataEnd ? I : E - I - 1]; +.Components[target_data_function == targetDataEnd ? E - I - 1 : I];

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: openmp/libomptarget/src/omptarget.cpp:233 MapperComponents -.Components[target_data_function == targetDataEnd ? I : E - I - 1]; +.Components[target_data_function == targetDataEnd ? E - I - 1 : I];

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: openmp/libomptarget/src/omptarget.cpp:233 MapperComponents -.Components[target_data_function == targetDataEnd ? I : E - I - 1]; +.Components[target_data_function == targetDataEnd ? E - I - 1 : I];

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: openmp/libomptarget/src/omptarget.cpp:233 MapperComponents -.Components[target_data_function == targetDataEnd ? I : E - I - 1]; +.Components[target_data_function == targetDataEnd ? E - I - 1 : I];

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: openmp/libomptarget/src/omptarget.cpp:233 MapperComponents -.Components[target_data_function == targetDataEnd ? I : E - I - 1]; +.Components[target_data_function == targetDataEnd ? E - I - 1 : I];

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: openmp/libomptarget/src/omptarget.cpp:233 MapperComponents -.Components[target_data_function == targetDataEnd ? I : E - I - 1]; +.Components[target_data_function == targetDataEnd ? E - I - 1 : I];

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: openmp/libomptarget/src/omptarget.cpp:233 MapperComponents -.Components[target_data_function == targetDataEnd ? I : E - I - 1]; +.Components[target_data_function == targetDataEnd ? E - I - 1 : I];

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: openmp/libomptarget/src/omptarget.cpp:233 MapperComponents -.Components[target_data_function == targetDataEnd ? I : E - I - 1]; +.Components[target_data_function == targetDataEnd ? E - I - 1 : I];

[PATCH] D80743: (PR46111) Properly handle elaborated types in an implicit deduction guide

2020-11-27 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. This patch caused a severe regression in Clang 11. https://bugs.llvm.org/show_bug.cgi?id=48177

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-11-18 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. 1. Could you split the reordering-related changes into a separate patch? 2. Could you mention which line in spec 4.5 was the restriction? Even 5.0/5.1 has some restrictions. It needs to be clear which one you refer to.

[PATCH] D89844: [Clang][OpenMP][WIP] Fixed an issue of segment fault when using target nowait

2020-10-21 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. Getting this even when compiling without offload. You can use the reproducer from the original bug report. clang++: /home/yeluo/opt/llvm-clang/llvm-project/llvm/include/llvm/ADT/APInt.h:1151: bool llvm::APInt::operator==(const llvm::APInt &) const: Assertion

[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-07 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision. ye-luo added a comment. This revision is now accepted and ready to land. LGTM

[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:80 + else() +list(APPEND compute_capabilities ${CMAKE_MATCH_1}) + endif() jhuber6 wrote: > ye-luo wrote: > > 1. Doesn't work right now. Missing comma

[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:92 foreach(sm ${nvptx_sm_list}) set(CUDA_ARCH ${CUDA_ARCH} -gencode arch=compute_${sm},code=sm_${sm}) endforeach() my point 2 refers to here CUDA_ARCH which

[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo requested changes to this revision. ye-luo added inline comments. This revision now requires changes to proceed. Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:80 + else() +list(APPEND compute_capabilities ${CMAKE_MATCH_1}) + endif()

[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. In D88929#2315640, @JonChesterfield wrote: > An alternative approach is to build the deviceRTL for multiple cuda versions > and then pick whichever one is the best fit when compiling application code. > That has advantages when

[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. In D88929#2315538, @jhuber6 wrote: > In D88929#2315519, @ye-luo wrote: > >> Probably not messing with `enable_language(CUDA)` at the moment, just add >>

[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. In D88929#2315513, @jhuber6 wrote: > In D88929#2315451, @ye-luo wrote: > >> I just realized that this patch affects clang and libomptarget. >> I cannot comment on clang. Regarding

[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. I just realized that this patch affects clang and libomptarget. I cannot comment on clang. Regarding libomptarget, could you explain why the detection is not put together with other cuda stuff in `openmp/libomptarget/cmake/Modules/LibomptargetGetDependencies.cmake`

[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. 3.18 introduces CMAKE_CUDA_ARCHITECTURES. Does 3.18 support detection? If we know a new way works since 3.18, I think putting both with if-else makes sense.

[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. The link I posted indicates that the independent feature has been merged since 3.12. It is better to avoid deprecated stuff when introducing new cmake lines, even though some existing lines may rely on deprecated cmake.

[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. FindCUDA has been deprecated. Please explore the following feature without directly calling FindCUDA. https://gitlab.kitware.com/cmake/cmake/-/merge_requests/1856

[PATCH] D88384: [OpenMP][FIX] Verify compatible types for declare variant calls

2020-09-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. The minimal reproducer and full app work now.

[PATCH] D78075: [WIP][Clang][OpenMP] Added support for nowait target in CodeGen

2020-09-14 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. In D78075#2272474, @tianshilei1992 wrote: > In D78075#2272398, @ye-luo wrote: > >>> However, OpenMP task has a problem that it must be within >>> to a parallel region; otherwise the task

[PATCH] D78075: [WIP][Clang][OpenMP] Added support for nowait target in CodeGen

2020-09-14 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. > However, OpenMP task has a problem that it must be within > to a parallel region; otherwise the task will be executed immediately. As a > result, if we directly wrap to a regular task, the nowait target outside of a > parallel region is still a synchronous version. The

[PATCH] D84767: [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.

2020-07-29 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision. ye-luo added a comment. This revision is now accepted and ready to land. LGTM. My applications run as expected now. PR46824, PR46012, PR46868 all work fine.

[PATCH] D84767: [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.

2020-07-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. In D84767#2180280, @ye-luo wrote:
> This patch
> GPU activities: 96.99% 350.05ms 10 35.005ms 1.5680us 350.00ms [CUDA memcpy HtoD]
> before the July21 change
> GPU activities: 95.33% 20.317ms 4 5.0793ms

[PATCH] D84767: [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.

2020-07-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.
This patch:
GPU activities: 96.99% 350.05ms 10 35.005ms 1.5680us 350.00ms [CUDA memcpy HtoD]
Before the July 21 change:
GPU activities: 95.33% 20.317ms 4 5.0793ms 1.6000us 20.305ms [CUDA memcpy HtoD]
Still more transfer than there should be.

[PATCH] D84767: [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.

2020-07-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo requested changes to this revision. ye-luo added a comment. This revision now requires changes to proceed. Please check the reproducer in https://bugs.llvm.org/show_bug.cgi?id=46868 with LIBOMPTARGET_DEBUG=1. The reference counting on the base pointer variable has side effects. It was

[PATCH] D84182: [OPENMP]Fix PR46012: declare target pointer cannot be accessed in target region.

2020-07-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. In D84182#2173578, @grokos wrote: > @ABataev: > > After this patch was committed, I tried to run the following example: > > #include > > int *yptr; > > int main() { > int y[10]; > y[1] = 1; > yptr = &y[0]; >

[PATCH] D84182: [OPENMP]Fix PR46012: declare target pointer cannot be accessed in target region.

2020-07-21 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision. ye-luo added a comment. This revision is now accepted and ready to land. I verified that 46012 is fixed with this patch

[PATCH] D83707: [OpenMP][NFC] Emit remarks during GPU state machine optimization

2020-07-14 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision. ye-luo added a comment. This revision is now accepted and ready to land. LGTM

[PATCH] D83707: [OpenMP][NFC] Emit remarks during GPU state machine optimization

2020-07-13 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments. Comment at: clang/test/OpenMP/remarks_parallel_in_target_state_machine.c:12 +#pragma omp parallel // #1 +// expected-remark@#1 {{Found parallel region that is called through a state machine__omp_outlined__2_wrapper in non-SPMD target

[PATCH] D75788: [OpenMP] Provide math functions in OpenMP device code via OpenMP variants

2020-04-02 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment. My RHEL issue was caused by a CPLUS_INCLUDE_PATH environment variable. So this is a feature, not a bug. After removing it, everything works smoothly for me.

[PATCH] D75788: [OpenMP] Provide math functions in OpenMP device code via OpenMP variants

2020-04-02 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision. ye-luo added a comment. This revision is now accepted and ready to land. Good work. I verified that PR42798 and PR42799 are fixed by this. Tests are completed on Ubuntu 18.04. Clang now becomes usable for application developers. There are still issues on RHEL that