[clang-tools-extra] [libc] [llvm] [compiler-rt] [lld] [libcxx] [lldb] [flang] [clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Artem Belevich via cfe-commits
Artem-B wrote: > This is what we already do for `--offload-arch=native` on CUDA, but this is > somewhat tangential. I've updated this patch to present the warning in the > case of multiply GPUs being detected, so I don't think there's a concern here > with the user being confused. If they have

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Artem Belevich via cfe-commits
Artem-B wrote: > It's not unspecified per-se, it just picks the one the CUDA driver assigned > to ID zero, so it will correspond to the layman using a default device if > loaded into CUDA. The default "fastest card first" is also somewhat flaky. First, the "default" enumeration order is affec

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79373 >From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 24 Jan 2024 15:34:00 -0600 Subject: [PATCH 1/3] [NVPTX] Add support for -march=native in standalone NVPTX Sum

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79373 >From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 24 Jan 2024 15:34:00 -0600 Subject: [PATCH 1/2] [NVPTX] Add support for -march=native in standalone NVPTX Sum

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I think I'm with Art on this one. > > > > Problem #2 [...] The arch=native will create a working configuration, but > > > would build more than necessary. > > > > > > It will target the first GPU it finds. We could maybe change the behavior > > to detect the newest, but the

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-24 Thread Justin Lebar via cfe-commits
jlebar wrote: I think I'm with Art on this one. >> Problem #2 [...] The arch=native will create a working configuration, but >> would build more than necessary. > > It will target the first GPU it finds. We could maybe change the behavior to > detect the newest, but the idea is just to target

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-24 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Some interesting points, I'll try to clarify some things. > This option may not as well as one would hope. > > Problem #1 is that it will drastically slow down compilation for some users. > NVIDIA GPU drivers are loaded on demand, and the process takes a while > (O(second), dep

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-24 Thread Artem Belevich via cfe-commits
Artem-B wrote: This option may not as well as one would hope. Problem #1 is that it will drastically slow down compilation for some users. NVIDIA GPU drivers are loaded on demand, and the process takes a while (O(second), depending on the kind and number of GPUs). If you build on a headless m

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-24 Thread via cfe-commits
llvmbot wrote: @llvm/pr-subscribers-clang Author: Joseph Huber (jhuber6) Changes Summary: We support `--target=nvptx64-nvidia-cuda` as a way to target the NVPTX architecture from standard CPU. This patch simply uses the existing support for handling `--offload-arch=native` to also apply to

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-24 Thread via cfe-commits
llvmbot wrote: @llvm/pr-subscribers-clang-driver Author: Joseph Huber (jhuber6) Changes Summary: We support `--target=nvptx64-nvidia-cuda` as a way to target the NVPTX architecture from standard CPU. This patch simply uses the existing support for handling `--offload-arch=native` to also a

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-24 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79373 Summary: We support `--target=nvptx64-nvidia-cuda` as a way to target the NVPTX architecture from standard CPU. This patch simply uses the existing support for handling `--offload-arch=native` to also apply to the