https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/176791
Summary:
One issue with the CMake detection scripts is that they use `_OPENMP` to determine the OpenMP version. When compiling in HIP mode this detection always fails, because the device portion of the compilation does not define the macro. This PR adds an option that defines `_OPENMP` to zero on the device, so code that checks for specific OpenMP features will not find any. There is a slight possibility that this breaks headers that use `_OPENMP` to distinguish between OpenMP and HIP offloading, but I think in most cases OpenMP source uses pragmas and `__HIP__` for that. Meanwhile, without this it's pretty much impossible to compile standards-compliant CPU OpenMP code with HIP mode on.

From 0b79928555ed9989664ed8ce607543d45e10cb03 Mon Sep 17 00:00:00 2001
From: Joseph Huber <[email protected]>
Date: Mon, 19 Jan 2026 12:01:34 -0600
Subject: [PATCH] [HIP] Define `_OPENMP` on the device for mixed OpenMP CPU compilations

---
 clang/include/clang/Basic/LangOptions.def       | 1 +
 clang/include/clang/Options/Options.td          | 3 +++
 clang/lib/Driver/ToolChains/Clang.cpp           | 8 ++++++++
 clang/lib/Frontend/InitPreprocessor.cpp         | 5 +++++
 clang/test/Driver/offloading-interoperability.c | 2 +-
 5 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/clang/include/clang/Basic/LangOptions.def b/clang/include/clang/Basic/LangOptions.def
index 8cba1dbaee24e..2831e1f6b2b14 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -223,6 +223,7 @@ LANGOPT(NativeInt16Type , 1, 1, NotCompatible, "Native int 16 type support")
 LANGOPT(CUDA , 1, 0, NotCompatible, "CUDA")
 LANGOPT(HIP , 1, 0, NotCompatible, "HIP")
 LANGOPT(OpenMP , 32, 0, NotCompatible, "OpenMP support and version of OpenMP (31, 40 or 45)")
+LANGOPT(OpenMPMacros , 1, 0, NotCompatible, "Define the OpenMP Macros")
 LANGOPT(OpenMPExtensions , 1, 1, NotCompatible, "Enable all Clang extensions for OpenMP directives and clauses")
 LANGOPT(OpenMPSimd , 1, 0, NotCompatible, "Use SIMD only OpenMP support.")
 LANGOPT(OpenMPUseTLS , 1, 0, NotCompatible, "Use TLS for threadprivates or runtime calls")
diff --git a/clang/include/clang/Options/Options.td b/clang/include/clang/Options/Options.td
index 2f57a5b13b917..485da985a0716 100644
--- a/clang/include/clang/Options/Options.td
+++ b/clang/include/clang/Options/Options.td
@@ -3886,6 +3886,9 @@ def fopenmp_cuda_blocks_per_sm_EQ : Joined<["-"], "fopenmp-cuda-blocks-per-sm=">
   Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[ClangOption, CC1Option]>;
 def fopenmp_cuda_teams_reduction_recs_num_EQ : Joined<["-"], "fopenmp-cuda-teams-reduction-recs-num=">, Group<f_Group>,
   Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[ClangOption, CC1Option]>;
+def fopenmp_macros : Flag<["-"], "fopenmp-macros">, Group<f_Group>,
+  MarshallingInfoFlag<LangOpts<"OpenMPMacros">>,
+  Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[ClangOption, CC1Option]>;

 //===----------------------------------------------------------------------===//
 // Shared cc1 + fc1 OpenMP Target Options
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 430130ec92cab..ef248b1b103d6 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -6746,6 +6746,14 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
       // semantic analysis, etc.
       break;
     }
+  }
+  if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ,
+                   options::OPT_fno_openmp, false) &&
+      (JA.isDeviceOffloading(Action::OFK_Cuda) ||
+       JA.isDeviceOffloading(Action::OFK_HIP))) {
+    // We need to define only the OpenMP macros on the device so two-pass
+    // compilation can succeed when they are used on the host.
+    CmdArgs.push_back("-fopenmp-macros");
   } else {
     Args.AddLastArg(CmdArgs, options::OPT_fopenmp_simd,
                     options::OPT_fno_openmp_simd);
diff --git a/clang/lib/Frontend/InitPreprocessor.cpp b/clang/lib/Frontend/InitPreprocessor.cpp
index 8253fad9e5503..9823cc944d4ae 100644
--- a/clang/lib/Frontend/InitPreprocessor.cpp
+++ b/clang/lib/Frontend/InitPreprocessor.cpp
@@ -1460,6 +1460,11 @@ static void InitializePredefinedMacros(const TargetInfo &TI,
     }
   }

+  // CUDA / HIP offloading only supports OpenMP's CPU support, but both
+  // compilations must define these macros to compile.
+  if (LangOpts.OpenMPMacros)
+    Builder.defineMacro("_OPENMP", "0");
+
   // CUDA device path compilaton
   if (LangOpts.CUDAIsDevice && !LangOpts.HIP) {
     // The CUDA_ARCH value is set for the GPU target specified in the NVPTX
diff --git a/clang/test/Driver/offloading-interoperability.c b/clang/test/Driver/offloading-interoperability.c
index f4d980e5fa5ce..8efc095750f94 100644
--- a/clang/test/Driver/offloading-interoperability.c
+++ b/clang/test/Driver/offloading-interoperability.c
@@ -5,7 +5,7 @@
 // RUN: | FileCheck %s --check-prefix NO-OPENMP-FLAGS-FOR-CUDA-DEVICE
 //
 // NO-OPENMP-FLAGS-FOR-CUDA-DEVICE: "-cc1" "-triple" "nvptx64-nvidia-cuda"
-// NO-OPENMP-FLAGS-FOR-CUDA-DEVICE-NOT: -fopenmp
+// NO-OPENMP-FLAGS-FOR-CUDA-DEVICE: -fopenmp-macros
 // NO-OPENMP-FLAGS-FOR-CUDA-DEVICE-NEXT: ptxas" "-m64"
 // NO-OPENMP-FLAGS-FOR-CUDA-DEVICE-NEXT: fatbinary"{{( "--cuda")?}} "-64"
 // NO-OPENMP-FLAGS-FOR-CUDA-DEVICE-NEXT: "-cc1" "-triple" "powerpc64le-unknown-linux-gnu"

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
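
As a reading aid for the summary: the patch defines `_OPENMP` to `0` on the device pass, which is meant to satisfy presence checks in the style of `#ifdef _OPENMP` while leaving version-gated feature checks disabled. The sketch below models that behaviour in standalone C++. The helper names (`kUndefined`, `kSeenOpenMPMacro`, `macroIsDefined`, `featureAvailable`, `preprocessorAgrees`, `selfCheck`) are hypothetical and not part of the patch; 201811 is the `_OPENMP` value published for OpenMP 5.0.

```cpp
#include <cassert>

// _OPENMP value published for OpenMP 5.0 (yyyymm of the specification).
constexpr long kOpenMP50 = 201811;

// Hypothetical sentinel standing in for "macro not defined at all".
constexpr long kUndefined = -1;

// The value this translation unit actually sees.
#if defined(_OPENMP)
constexpr long kSeenOpenMPMacro = _OPENMP;
#else
constexpr long kSeenOpenMPMacro = kUndefined;
#endif

// Models `#ifdef _OPENMP`: a pure presence check.
constexpr bool macroIsDefined(long macro) { return macro != kUndefined; }

// Models `#if defined(_OPENMP) && _OPENMP >= required`: a version-gated check.
constexpr bool featureAvailable(long macro, long required) {
  return macroIsDefined(macro) && macro >= required;
}

// A device pass that sees `_OPENMP == 0` passes the presence check...
static_assert(macroIsDefined(0), "zero still counts as defined");
// ...but advertises no OpenMP features.
static_assert(!featureAvailable(0, kOpenMP50), "zero enables no features");
// A host pass with a real version keeps its features.
static_assert(featureAvailable(kOpenMP50, kOpenMP50), "5.0 enables 5.0");

// Cross-checks the model against the real preprocessor in this build.
inline bool preprocessorAgrees() {
#if defined(_OPENMP) && _OPENMP >= 201811
  const bool gate = true;
#else
  const bool gate = false;
#endif
  return featureAvailable(kSeenOpenMPMacro, kOpenMP50) == gate;
}

// Runtime self-check; aborts if the model and the preprocessor disagree.
inline void selfCheck() { assert(preprocessorAgrees()); }
```

Calling `selfCheck()` from a host build with or without `-fopenmp` should pass in both cases, since the model only mirrors what the preprocessor reports for that translation unit.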
