Artem-B wrote:
> This is what we already do for `--offload-arch=native` on CUDA, but this is
> somewhat tangential. I've updated this patch to present the warning in the
> case of multiple GPUs being detected, so I don't think there's a concern here
> with the user being confused. If they have
Artem-B wrote:
> It's not unspecified per se; it just picks the one the CUDA driver assigned
> to ID zero, so it corresponds to the default device a typical user would get
> when running under CUDA.
The default "fastest card first" is also somewhat flaky. First, the "default"
enumeration order is affec
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79373
>From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 24 Jan 2024 15:34:00 -0600
Subject: [PATCH 1/3] [NVPTX] Add support for -march=native in standalone NVPTX
Sum
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79373
>From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 24 Jan 2024 15:34:00 -0600
Subject: [PATCH 1/2] [NVPTX] Add support for -march=native in standalone NVPTX
Sum
jhuber6 wrote:
> I think I'm with Art on this one.
>
> > > Problem #2 [...] The arch=native will create a working configuration, but
> > > would build more than necessary.
> >
> >
> > It will target the first GPU it finds. We could maybe change the behavior
> > to detect the newest, but the
jlebar wrote:
I think I'm with Art on this one.
>> Problem #2 [...] The arch=native will create a working configuration, but
>> would build more than necessary.
>
> It will target the first GPU it finds. We could maybe change the behavior to
> detect the newest, but the idea is just to target
jhuber6 wrote:
Some interesting points, I'll try to clarify some things.
> This option may not work as well as one would hope.
>
> Problem #1 is that it will drastically slow down compilation for some users.
> NVIDIA GPU drivers are loaded on demand, and the process takes a while
> (O(second), dep
Artem-B wrote:
This option may not work as well as one would hope.
Problem #1 is that it will drastically slow down compilation for some users.
NVIDIA GPU drivers are loaded on demand, and the process takes a while
(O(second), depending on the kind and number of GPUs). If you build on a
headless m
llvmbot wrote:
@llvm/pr-subscribers-clang
Author: Joseph Huber (jhuber6)
Changes
Summary:
We support `--target=nvptx64-nvidia-cuda` as a way to target the NVPTX
architecture from standard CPU. This patch simply uses the existing
support for handling `--offload-arch=native` to also apply to
llvmbot wrote:
@llvm/pr-subscribers-clang-driver
Author: Joseph Huber (jhuber6)
Changes
Summary:
We support `--target=nvptx64-nvidia-cuda` as a way to target the NVPTX
architecture from standard CPU. This patch simply uses the existing
support for handling `--offload-arch=native` to also a
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/79373
Summary:
We support `--target=nvptx64-nvidia-cuda` as a way to target the NVPTX
architecture from standard CPU. This patch simply uses the existing
support for handling `--offload-arch=native` to also apply to the