On 19/12/2025 16:32, Tobias Burnus wrote:
Andrew Stubbs wrote:
The -fopenmp-simd option enables the "#pragma omp simd" directive (only) as a
vectorization hint, but it did not work for SIMT vectorization.
This patch enables the feature for backends in which the TARGET_SIMT_VF hook is
present and indicates SIMT is available. Only the NVPTX backend actually does
this, at present.
...
--- a/gcc/omp-expand.cc
+++ b/gcc/omp-expand.cc
@@ -6588,6 +6588,14 @@ expand_omp_simd (struct omp_region *region, struct omp_for_data *fd)
bool is_simt = omp_find_clause (gimple_omp_for_clauses (fd->for_stmt),
OMP_CLAUSE__SIMT_);
+
+ /* Allow compatible targets to use SIMT with -fopenmp-simd. */
+ if (!is_simt
+ && flag_openmp_simd
+ && targetm.simt.vf
+ && targetm.simt.vf () > 1)
+ is_simt = true;
This one feels a bit odd. It seems as if this should be either
(flag_openmp_simd || flag_openmp)
or
flag_openmp_simd
&& !flag_openmp
(and possibly even && !flag_openacc?)
My goal here was to enable -fopenmp-simd so that it works similarly on
SIMT targets as it already works on SIMD targets. This has nothing to
do with -fopenmp (or -fopenacc).
Basically, with this patch we can use all 32 "threads" in an NVPTX warp,
instead of being limited to entirely scalar code.
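For illustration, a loop of the kind this affects might look like the following. This is only a sketch: the function name saxpy and its parameters are made up for the example and are not taken from the patch or from the testcases I ran.

```c
#include <assert.h>

/* Hypothetical example: y[i] = a * x[i] + y[i] for 0 <= i < n.  */
static void
saxpy (int n, float a, const float *x, float *y)
{
  assert (n >= 0);

  /* With -fopenmp-simd this pragma is honoured as a vectorization hint.
     Without it the pragma is ignored and the loop is compiled as
     ordinary scalar code; the result is the same either way.  */
#pragma omp simd
  for (int i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

The intent is that, built with -fopenmp-simd and optimization enabled, the iterations of such a loop can be spread across the lanes of a warp on a SIMT target rather than executed one at a time.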
Additionally, I wonder whether there should be a
&& optimize.
Agreed: this is probably appropriate, to match the behaviour on SIMD
targets.
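With that extra test the decision can be modelled outside the compiler roughly as below. This is a stand-alone sketch, not GCC code: struct simt_inputs, use_simt, vf_32 and vf_1 are hypothetical names that stand in for the clause lookup, the global flags and the targetm.simt.vf hook.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the state consulted in expand_omp_simd.  */
struct simt_inputs
{
  bool has_simt_clause;   /* models the OMP_CLAUSE__SIMT_ lookup */
  bool flag_openmp_simd;  /* models -fopenmp-simd */
  int optimize;           /* models the optimization level */
  int (*simt_vf) (void);  /* models targetm.simt.vf; NULL if no hook */
};

/* Example hooks: a 32-lane SIMT target and a target reporting VF 1.  */
static int vf_32 (void) { return 32; }
static int vf_1 (void) { return 1; }

/* Returns true when the SIMT expansion path would be taken.  */
static bool
use_simt (const struct simt_inputs *in)
{
  assert (in != NULL);
  bool is_simt = in->has_simt_clause;

  /* Same shape as the condition under discussion.  */
  if (!is_simt
      && in->flag_openmp_simd
      && in->optimize
      && in->simt_vf
      && in->simt_vf () > 1)
    is_simt = true;

  return is_simt;
}
```

In this model an existing _simt_ clause always wins, and otherwise all four of the remaining tests have to pass; -fopenmp and -fopenacc are deliberately not inputs.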
For the current target(s), -fopenmp-simd without -fopenmp and without
-fopenacc would only apply to stand-alone code, and nvptx has, to my
knowledge, no OpenMP/OpenACC parallelization with nvptx as the GPU, i.e. it
is low risk.
Yes, this is the intended use case.
Otherwise, for '-fopenmp-simd' (with '|| flag_openmp' or when the user
specified both), it would also be active. I am not sure whether that
would be useful in terms of performance or not, but it would be more
widely visible. I guess that for the latter case it would be worthwhile
to check how often it applies, whether there is a performance win, and
whether there is a correctness issue.
I'm pretty sure there are deeper correctness issues with trying to use
-fopenmp when it wasn't enabled at configure time, not to mention that
there's no attempt at porting any of the libgomp features for using
these targets as a "host" system.
Actually, the latter is already true if only enabled for -fopenmp-simd.
(I had no deeper thought about correctness issues, yet.)
I've attached a v2 patch with the "&& optimize".
OK?
Andrew
From 9737d0690ff639932a583bb12d99ee977c554572 Mon Sep 17 00:00:00 2001
From: Andrew Stubbs <[email protected]>
Date: Mon, 15 Dec 2025 11:15:24 +0000
Subject: [PATCH v2] openmp: Enable -fopenmp-simd for SIMT architecture
The -fopenmp-simd option enables the "#pragma omp simd" directive (only) as a
vectorization hint, but it did not work for SIMT vectorization.
This patch enables the feature for backends in which the TARGET_SIMT_VF hook is
present and indicates SIMT is available. Only the NVPTX backend actually does
this, at present.
I've tested this with some simple testcases, and run the code using
"nvptx-none-run -L32 a.out" successfully, but I've not messed with any of the
testsuite effective-target settings yet, so there are no changes to the test
results.
The purpose is to allow me to test some SIMT features without using a full
offloading toolchain.
gcc/ChangeLog:
* omp-expand.cc (expand_omp_simd): Use targetm.simt.vf to enable
-fopenmp-simd on SIMT architectures.
---
gcc/omp-expand.cc | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/gcc/omp-expand.cc b/gcc/omp-expand.cc
index 9864ce40219..4c6aaf97cf4 100644
--- a/gcc/omp-expand.cc
+++ b/gcc/omp-expand.cc
@@ -6588,6 +6588,15 @@ expand_omp_simd (struct omp_region *region, struct omp_for_data *fd)
bool is_simt = omp_find_clause (gimple_omp_for_clauses (fd->for_stmt),
OMP_CLAUSE__SIMT_);
+
+ /* Allow compatible targets to use SIMT with -fopenmp-simd. */
+ if (!is_simt
+ && flag_openmp_simd
+ && optimize
+ && targetm.simt.vf
+ && targetm.simt.vf () > 1)
+ is_simt = true;
+
if (is_simt)
{
cfun->curr_properties &= ~PROP_gimple_lomp_dev;
--
2.51.0