[PATCH] D118399: [OpenMP] Only generate runtime flags with host input

2022-01-27 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 403762.
jhuber6 added a comment.

Changing to use host bitcode instead of adding a new flag.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118399/new/

https://reviews.llvm.org/D118399

Files:
  clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp


Index: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
===
--- clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -1198,7 +1198,8 @@
     llvm_unreachable("OpenMP can only handle device code.");
 
   llvm::OpenMPIRBuilder &OMPBuilder = getOMPBuilder();
-  if (CGM.getLangOpts().OpenMPTargetNewRuntime) {
+  if (CGM.getLangOpts().OpenMPTargetNewRuntime &&
+      !CGM.getLangOpts().OMPHostIRFile.empty()) {
     OMPBuilder.createGlobalFlag(CGM.getLangOpts().OpenMPTargetDebug,
                                 "__omp_rtl_debug_kind");
     OMPBuilder.createGlobalFlag(CGM.getLangOpts().OpenMPTeamSubscription,


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/lib/CodeGen/BackendUtil.cpp:1774
+  SectionName += ".";
+  SectionName += *BinarySection;
+}

JonChesterfield wrote:
> jhuber6 wrote:
> > JonChesterfield wrote:
> > > This looks lossy - if two files use the same section name, they'll end up 
> > > appended in an order that is probably an implementation quirk of 
> > > llvm-link, and I think we've thrown away the filename info so can't get 
> > > back to where we were.
> > > 
> > > Would .llvm.offloading.filename be a reasonable name for each section, 
> > > with either error on duplicates or warning + discard?
> > We only care about the sections per file, right? When I extract these in the 
> > `linker-wrapper`, I simply look at each file's sections and put them into a 
> > list of device inputs. We don't need them to be unique as long as there 
> > aren't multiple in the same file.
> I think we'll have problems if multiple files are embedded with the same 
> section string, as they'll get concatenated in the output. llvm-link or ld -r 
> on the host bitcode files will hit that.
> 
> It would be worth testing this with two input files, for the same offloading 
> architecture, on amdgpu since I think it will feed the host bitcode to 
> llvm-link which will implicitly concatenate the two embedded files.
This scheme works on AMDGPU because we don't use `llvm-link` as a part of the 
driver anymore; the new scheme unifies the behavior between NVPTX and AMDGPU 
until we hit the linker wrapper. But you're right that if the user creates host 
bitcode and runs `llvm-link` on that, or performs a relocatable link, we'll get 
conflicts. I can add a unique string at the end of the section name to avoid 
this in a later patch.
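
The collision discussed in this thread can be illustrated with a toy model. This is only a sketch, assuming that a relocatable link (or `llvm-link` on host bitcode) concatenates same-named sections from its inputs in input order. `ObjectSketch` and `mergeSections` are hypothetical names and do not model any real linker or LLVM API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of an object file: section name -> section contents.
using ObjectSketch = std::map<std::string, std::string>;

// Toy relocatable link: same-named sections from different inputs are
// concatenated in input order, which loses the boundary between them.
ObjectSketch mergeSections(const std::vector<ObjectSketch> &Inputs) {
  ObjectSketch Merged;
  for (const ObjectSketch &Obj : Inputs) {
    for (const auto &[Name, Data] : Obj) {
      assert(!Name.empty() && "sections in this model must be named");
      Merged[Name] += Data;
    }
  }
  return Merged;
}
```

In this model, two inputs that both carry `.llvm.offloading.nvptx64-nvidia-cuda.sm_70` end up with one merged blob, while inputs whose section names carry a distinct suffix stay separate.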


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116542/new/

https://reviews.llvm.org/D116542



[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/test/Frontend/embed-object.ll:2
+; RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \
+; RUN:-fembed-offload-object=%S/Inputs/empty.h,section -x ir %s -o - \
+; RUN:| FileCheck %s -check-prefix=CHECK

JonChesterfield wrote:
> I think we need a test case with more than one embedded file, given there's 
> the careful splitting around commas in the implementation
I only split to create a pair now; I can add a test where we pass this flag 
multiple times.
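
The pair split mentioned here can be sketched as follows. This is a minimal illustration, assuming the flag value has the form `<file>,<section>` and is split at the first comma only, which is how `StringRef::split(',')` behaves. `EmbedRequest`, `splitOffloadObject`, and `parseOffloadObjects` are hypothetical names, not part of the patch; the `.llvm.offloading.` prefix follows the CHECK lines in the test.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper type: one parsed -fembed-offload-object value.
struct EmbedRequest {
  std::string Filename;
  std::string SectionName;
};

// Split "<file>,<section>" at the first comma only, so each flag yields
// exactly one (file, section) pair. With no comma, no suffix is appended.
EmbedRequest splitOffloadObject(std::string_view Value) {
  assert(!Value.empty() && "expected a non-empty flag value");
  const std::size_t Comma = Value.find(',');
  EmbedRequest Req;
  Req.Filename = std::string(Value.substr(0, Comma));
  Req.SectionName = ".llvm.offloading";
  if (Comma != std::string_view::npos) {
    Req.SectionName += ".";
    Req.SectionName += std::string(Value.substr(Comma + 1));
  }
  return Req;
}

// Passing the flag several times yields one request per occurrence.
std::vector<EmbedRequest>
parseOffloadObjects(const std::vector<std::string> &Values) {
  std::vector<EmbedRequest> Requests;
  for (const std::string &Value : Values)
    Requests.push_back(splitOffloadObject(Value));
  return Requests;
}
```

Because only the first comma is significant, any later commas stay in the section part of the pair.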


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116542/new/

https://reviews.llvm.org/D116542



[PATCH] D116543: [OpenMP] Embed device files into the host IR

2022-01-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 404030.
jhuber6 added a comment.

Adding test for multiple input files to embed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116543/new/

https://reviews.llvm.org/D116543

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/openmp-offload-gpu.c
  clang/test/Frontend/embed-object.ll


Index: clang/test/Frontend/embed-object.ll
===
--- clang/test/Frontend/embed-object.ll
+++ clang/test/Frontend/embed-object.ll
@@ -1,9 +1,11 @@
 ; RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \
-; RUN:    -fembed-offload-object=%S/Inputs/empty.h,section -x ir %s -o - \
+; RUN:    -fembed-offload-object=%S/Inputs/empty.h,section1 \
+; RUN:    -fembed-offload-object=%S/Inputs/empty.h,section2 -x ir %s -o - \
 ; RUN:    | FileCheck %s -check-prefix=CHECK
 
-; CHECK: @llvm.embedded.object = private constant [0 x i8] zeroinitializer, section ".llvm.offloading.section"
-; CHECK: @llvm.compiler.used = appending global [2 x i8*] [i8* @x, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @llvm.embedded.object, i32 0, i32 0)], section "llvm.metadata"
+; CHECK: @[[OBJECT1:.+]] = private constant [0 x i8] zeroinitializer, section ".llvm.offloading.section1"
+; CHECK: @[[OBJECT2:.+]] = private constant [0 x i8] zeroinitializer, section ".llvm.offloading.section2"
+; CHECK: @llvm.compiler.used = appending global [3 x i8*] [i8* @x, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @[[OBJECT1]], i32 0, i32 0), i8* getelementptr inbounds ([0 x i8], [0 x i8]* @[[OBJECT2]], i32 0, i32 0)], section "llvm.metadata"
 
 @x = private constant i8 1
 @llvm.compiler.used = appending global [1 x i8*] [i8* @x], section "llvm.metadata"
Index: clang/test/Driver/openmp-offload-gpu.c
===
--- clang/test/Driver/openmp-offload-gpu.c
+++ clang/test/Driver/openmp-offload-gpu.c
@@ -353,3 +353,10 @@
 // NEW_DRIVER: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_ASM]]"], output: "[[DEVICE_OBJ:.+]]"
 // NEW_DRIVER: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[DEVICE_OBJ]]"], output: "[[HOST_OBJ:.+]]"
 // NEW_DRIVER: "x86_64-unknown-linux-gnu" - "[[LINKER:.+]]", inputs: ["[[HOST_OBJ]]"], output: "openmp-offload-gpu"
+
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvida-cuda -march=sm_70 \
+// RUN:  --libomptarget-nvptx-bc-path=%S/Inputs/libomptarget/libomptarget-new-nvptx-test.bc \
+// RUN:  -fopenmp-new-driver -no-canonical-prefixes %s -o openmp-offload-gpu 2>&1 \
+// RUN:   | FileCheck -check-prefix=NEW_DRIVER_EMBEDDING %s
+
+// NEW_DRIVER_EMBEDDING: -fembed-offload-object=[[CUBIN:.*\.cubin]],nvptx64-nvidia-cuda.sm_70
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -4366,9 +4366,9 @@
       IsHeaderModulePrecompile ? HeaderModuleInput : Inputs[0];
 
   InputInfoList ModuleHeaderInputs;
+  InputInfoList OpenMPHostInputs;
   const InputInfo *CudaDeviceInput = nullptr;
   const InputInfo *OpenMPDeviceInput = nullptr;
-  const InputInfo *OpenMPHostInput = nullptr;
   for (const InputInfo &I : Inputs) {
     if (&I == &Input) {
       // This is the primary input.
@@ -4385,8 +4385,8 @@
       CudaDeviceInput = &I;
     } else if (IsOpenMPDevice && !OpenMPDeviceInput) {
       OpenMPDeviceInput = &I;
-    } else if (IsOpenMPHost && !OpenMPHostInput) {
-      OpenMPHostInput = &I;
+    } else if (IsOpenMPHost) {
+      OpenMPHostInputs.push_back(I);
     } else {
       llvm_unreachable("unexpectedly given multiple inputs");
     }
@@ -6894,6 +6894,24 @@
     }
   }
 
+  // Host-side OpenMP offloading receives the device object files and embeds them
+  // in a named section including the associated target triple and architecture.
+  if (IsOpenMPHost && !OpenMPHostInputs.empty()) {
+    auto InputFile = OpenMPHostInputs.begin();
+    auto OpenMPTCs = C.getOffloadToolChains<Action::OFK_OpenMP>();
+    for (auto TI = OpenMPTCs.first, TE = OpenMPTCs.second; TI != TE;
+         ++TI, ++InputFile) {
+      const ToolChain *TC = TI->second;
+      const ArgList &TCArgs = C.getArgsForToolChain(TC, "", Action::OFK_OpenMP);
+      StringRef File =
+          C.getArgs().MakeArgString(TC->getInputFilename(*InputFile));
+
+      CmdArgs.push_back(Args.MakeArgString(
+          "-fembed-offload-object=" + File + "," + TC->getTripleString() + "." +
+          TCArgs.getLastArgValue(options::OPT_march_EQ)));
+    }
+  }
+
   if (Triple.isAMDGPU()) {
     handleAMDGPUCodeObjectVersionOptions(D, Args, CmdArgs);
 



[PATCH] D118399: [OpenMP] Only generate runtime flags with host input

2022-01-27 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG2945f11c605b: [OpenMP] Only generate runtime flags with host 
input (authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118399/new/

https://reviews.llvm.org/D118399

Files:
  clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
  clang/test/OpenMP/target_globals_codegen.cpp


Index: clang/test/OpenMP/target_globals_codegen.cpp
===
--- clang/test/OpenMP/target_globals_codegen.cpp
+++ clang/test/OpenMP/target_globals_codegen.cpp
@@ -6,6 +6,7 @@
 // RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-target-new-runtime -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s --check-prefix=CHECK-DEFAULT
 // RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-target-new-runtime -fopenmp-assume-threads-oversubscription -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s --check-prefix=CHECK-THREADS
 // RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-target-new-runtime -fopenmp-assume-teams-oversubscription -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s --check-prefix=CHECK-TEAMS
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-target-new-runtime -fopenmp-assume-teams-oversubscription -fopenmp-is-device -o - | FileCheck %s --check-prefix=CHECK-RUNTIME
 // expected-no-diagnostics
 
 #ifndef HEADER
@@ -32,6 +33,10 @@
 // CHECK-TEAMS: @__omp_rtl_assume_teams_oversubscription = weak_odr hidden constant i32 1
 // CHECK-TEAMS: @__omp_rtl_assume_threads_oversubscription = weak_odr hidden constant i32 0
 //.
+// CHECK-RUNTIME-NOT: @__omp_rtl_debug_kind = weak_odr hidden constant i32 0
+// CHECK-RUNTIME-NOT: @__omp_rtl_assume_teams_oversubscription = weak_odr hidden constant i32 1
+// CHECK-RUNTIME-NOT: @__omp_rtl_assume_threads_oversubscription = weak_odr hidden constant i32 0
+//.
 void foo() {
 #pragma omp target
   { }
Index: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
===
--- clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -1198,7 +1198,8 @@
     llvm_unreachable("OpenMP can only handle device code.");
 
   llvm::OpenMPIRBuilder &OMPBuilder = getOMPBuilder();
-  if (CGM.getLangOpts().OpenMPTargetNewRuntime) {
+  if (CGM.getLangOpts().OpenMPTargetNewRuntime &&
+      !CGM.getLangOpts().OMPHostIRFile.empty()) {
     OMPBuilder.createGlobalFlag(CGM.getLangOpts().OpenMPTargetDebug,
                                 "__omp_rtl_debug_kind");
     OMPBuilder.createGlobalFlag(CGM.getLangOpts().OpenMPTeamSubscription,



[PATCH] D116545: [OpenMP] Add support for extracting device code in linker wrapper

2022-01-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 404035.
jhuber6 added a comment.

Changing the section handling to account for the filenames previously added to the section names.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116545/new/

https://reviews.llvm.org/D116545

Files:
  clang/tools/clang-linker-wrapper/CMakeLists.txt
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -5,23 +5,41 @@
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===-===//
-///
+//
+// This tool works as a wrapper over a linking job. This tool is used to create
+// linked device images for offloading. It scans the linker's input for embedded
+// device offloading data stored in sections `.llvm.offloading.<triple>.<arch>`
+// and extracts it as a temporary file. The extracted device files will then be
+// passed to a device linking job to create a final device image.
+//
 //===-===//
 
 #include "clang/Basic/Version.h"
+#include "llvm/BinaryFormat/Magic.h"
+#include "llvm/Bitcode/BitcodeWriter.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IRReader/IRReader.h"
 #include "llvm/Object/Archive.h"
+#include "llvm/Object/ArchiveWriter.h"
+#include "llvm/Object/Binary.h"
+#include "llvm/Object/ObjectFile.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Errc.h"
+#include "llvm/Support/FileOutputBuffer.h"
 #include "llvm/Support/FileSystem.h"
+#include "llvm/Support/InitLLVM.h"
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/Program.h"
 #include "llvm/Support/Signals.h"
+#include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/StringSaver.h"
 #include "llvm/Support/WithColor.h"
 #include "llvm/Support/raw_ostream.h"
 
 using namespace llvm;
+using namespace llvm::object;
 
 static cl::opt<bool> Help("h", cl::desc("Alias for -help"), cl::Hidden);
 
@@ -30,16 +48,42 @@
 static cl::OptionCategory
     ClangLinkerWrapperCategory("clang-linker-wrapper options");
 
+static cl::opt<bool> StripSections(
+    "strip-sections", cl::ZeroOrMore,
+    cl::desc("Strip offloading sections from the host object file."),
+    cl::init(true), cl::cat(ClangLinkerWrapperCategory));
+
 static cl::opt<std::string> LinkerUserPath("linker-path",
                                            cl::desc("Path of linker binary"),
                                            cl::cat(ClangLinkerWrapperCategory));
 
-// Do not parse linker options
+// Do not parse linker options.
 static cl::list<std::string>
     LinkerArgs(cl::Sink, cl::desc("..."));
 
-static Error runLinker(std::string LinkerPath,
-   SmallVectorImpl ) {
+/// Path of the current binary.
+static std::string LinkerExecutable;
+
+/// Magic section string that marks the existence of offloading data. The
+/// section string will be formatted as `.llvm.offloading.<triple>.<arch>`.
+#define OFFLOAD_SECTION_MAGIC_STR ".llvm.offloading"
+
+struct DeviceFile {
+  DeviceFile(StringRef TheTriple, StringRef Arch, StringRef Filename)
+  : TheTriple(TheTriple), Arch(Arch), Filename(Filename) {}
+
+  const Triple TheTriple;
+  const std::string Arch;
+  const std::string Filename;
+};
+
+namespace {
+
+Expected>
+extractFromBuffer(std::unique_ptr Buffer,
+  SmallVectorImpl );
+
+Error runLinker(std::string , SmallVectorImpl ) {
   std::vector LinkerArgs;
   LinkerArgs.push_back(LinkerPath);
   for (auto  : Args)
@@ -50,11 +94,301 @@
   return Error::success();
 }
 
-static void PrintVersion(raw_ostream &OS) {
+void PrintVersion(raw_ostream &OS) {
   OS << clang::getClangToolFullVersion("clang-linker-wrapper") << '\n';
 }
 
+void removeFromCompilerUsed(Module , GlobalValue ) {
+  GlobalVariable *GV = M.getGlobalVariable("llvm.compiler.used");
+  Type *Int8PtrTy = Type::getInt8PtrTy(M.getContext());
+  Constant *ValueToRemove =
+  ConstantExpr::getPointerBitCastOrAddrSpaceCast(, Int8PtrTy);
+  SmallPtrSet InitAsSet;
+  SmallVector Init;
+  if (GV) {
+if (GV->hasInitializer()) {
+  auto *CA = cast(GV->getInitializer());
+  for (auto  : CA->operands()) {
+Constant *C = cast_or_null(Op);
+if (C != ValueToRemove && InitAsSet.insert(C).second)
+  Init.push_back(C);
+  }
+}
+GV->eraseFromParent();
+  }
+
+  if (Init.empty())
+return;
+
+  ArrayType *ATy = ArrayType::get(Int8PtrTy, Init.size());
+  GV = new llvm::GlobalVariable(M, ATy, false, GlobalValue::AppendingLinkage,
+ConstantArray::get(ATy, Init),
+"llvm.compiler.used");
+  GV->setSection("llvm.metadata");
+}
+
+Expected>
+extractFromBinary(const ObjectFile ,
+ 

[PATCH] D116543: [OpenMP] Embed device files into the host IR

2022-01-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 404038.
jhuber6 added a comment.

Remove test that was intended for previous commit.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116543/new/

https://reviews.llvm.org/D116543

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/openmp-offload-gpu.c


Index: clang/test/Driver/openmp-offload-gpu.c
===
--- clang/test/Driver/openmp-offload-gpu.c
+++ clang/test/Driver/openmp-offload-gpu.c
@@ -353,3 +353,10 @@
 // NEW_DRIVER: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_ASM]]"], output: "[[DEVICE_OBJ:.+]]"
 // NEW_DRIVER: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[DEVICE_OBJ]]"], output: "[[HOST_OBJ:.+]]"
 // NEW_DRIVER: "x86_64-unknown-linux-gnu" - "[[LINKER:.+]]", inputs: ["[[HOST_OBJ]]"], output: "openmp-offload-gpu"
+
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvida-cuda -march=sm_70 \
+// RUN:  --libomptarget-nvptx-bc-path=%S/Inputs/libomptarget/libomptarget-new-nvptx-test.bc \
+// RUN:  -fopenmp-new-driver -no-canonical-prefixes %s -o openmp-offload-gpu 2>&1 \
+// RUN:   | FileCheck -check-prefix=NEW_DRIVER_EMBEDDING %s
+
+// NEW_DRIVER_EMBEDDING: -fembed-offload-object=[[CUBIN:.*\.cubin]],nvptx64-nvidia-cuda.sm_70
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -4366,9 +4366,9 @@
       IsHeaderModulePrecompile ? HeaderModuleInput : Inputs[0];
 
   InputInfoList ModuleHeaderInputs;
+  InputInfoList OpenMPHostInputs;
   const InputInfo *CudaDeviceInput = nullptr;
   const InputInfo *OpenMPDeviceInput = nullptr;
-  const InputInfo *OpenMPHostInput = nullptr;
   for (const InputInfo &I : Inputs) {
     if (&I == &Input) {
       // This is the primary input.
@@ -4385,8 +4385,8 @@
       CudaDeviceInput = &I;
     } else if (IsOpenMPDevice && !OpenMPDeviceInput) {
       OpenMPDeviceInput = &I;
-    } else if (IsOpenMPHost && !OpenMPHostInput) {
-      OpenMPHostInput = &I;
+    } else if (IsOpenMPHost) {
+      OpenMPHostInputs.push_back(I);
     } else {
       llvm_unreachable("unexpectedly given multiple inputs");
     }
@@ -6894,6 +6894,25 @@
     }
   }
 
+  // Host-side OpenMP offloading receives the device object files and embeds them
+  // in a named section including the associated target triple and architecture.
+  if (IsOpenMPHost && !OpenMPHostInputs.empty()) {
+    auto InputFile = OpenMPHostInputs.begin();
+    auto OpenMPTCs = C.getOffloadToolChains<Action::OFK_OpenMP>();
+    for (auto TI = OpenMPTCs.first, TE = OpenMPTCs.second; TI != TE;
+         ++TI, ++InputFile) {
+      const ToolChain *TC = TI->second;
+      const ArgList &TCArgs = C.getArgsForToolChain(TC, "", Action::OFK_OpenMP);
+      StringRef File =
+          C.getArgs().MakeArgString(TC->getInputFilename(*InputFile));
+      StringRef InputName = Clang::getBaseInputStem(Args, Inputs);
+
+      CmdArgs.push_back(Args.MakeArgString(
+          "-fembed-offload-object=" + File + "," + TC->getTripleString() + "." +
+          TCArgs.getLastArgValue(options::OPT_march_EQ) + "." + InputName));
+    }
+  }
+
   if (Triple.isAMDGPU()) {
     handleAMDGPUCodeObjectVersionOptions(D, Args, CmdArgs);
 



[PATCH] D116543: [OpenMP] Embed device files into the host IR

2022-01-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 404033.
jhuber6 added a comment.

Add input filename to the section name to prevent it from being merged if the 
user does a link.
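
As a rough illustration of the resulting flag shape, the driver change in this revision composes `-fembed-offload-object=<file>,<triple>.<arch>.<input stem>`. The sketch below reproduces only that string composition with standard-library types; `makeEmbedFlag` is a hypothetical name, and the comma check is an assumption added here because the frontend splits the value at the first comma.

```cpp
#include <cassert>
#include <string>

// Builds the value handed to the frontend, following the shape in the diff:
//   -fembed-offload-object=<file>,<triple>.<arch>.<input stem>
// The input stem makes the section name differ per translation unit, so a
// later relocatable link does not merge sections from different inputs.
std::string makeEmbedFlag(const std::string &File, const std::string &Triple,
                          const std::string &Arch,
                          const std::string &InputStem) {
  // Assumption of this sketch: a comma in the file name would confuse the
  // file/section split on the receiving side.
  assert(File.find(',') == std::string::npos && "unexpected comma in file");
  return "-fembed-offload-object=" + File + "," + Triple + "." + Arch + "." +
         InputStem;
}
```

Two inputs compiled for the same triple and architecture then produce different section names as long as their stems differ.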


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116543/new/

https://reviews.llvm.org/D116543

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/openmp-offload-gpu.c
  clang/test/Frontend/embed-object.ll


Index: clang/test/Frontend/embed-object.ll
===
--- clang/test/Frontend/embed-object.ll
+++ clang/test/Frontend/embed-object.ll
@@ -1,9 +1,11 @@
 ; RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \
-; RUN:    -fembed-offload-object=%S/Inputs/empty.h,section -x ir %s -o - \
+; RUN:    -fembed-offload-object=%S/Inputs/empty.h,section1 \
+; RUN:    -fembed-offload-object=%S/Inputs/empty.h,section2 -x ir %s -o - \
 ; RUN:    | FileCheck %s -check-prefix=CHECK
 
-; CHECK: @llvm.embedded.object = private constant [0 x i8] zeroinitializer, section ".llvm.offloading.section"
-; CHECK: @llvm.compiler.used = appending global [2 x i8*] [i8* @x, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @llvm.embedded.object, i32 0, i32 0)], section "llvm.metadata"
+; CHECK: @[[OBJECT1:.+]] = private constant [0 x i8] zeroinitializer, section ".llvm.offloading.section1"
+; CHECK: @[[OBJECT2:.+]] = private constant [0 x i8] zeroinitializer, section ".llvm.offloading.section2"
+; CHECK: @llvm.compiler.used = appending global [3 x i8*] [i8* @x, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @[[OBJECT1]], i32 0, i32 0), i8* getelementptr inbounds ([0 x i8], [0 x i8]* @[[OBJECT2]], i32 0, i32 0)], section "llvm.metadata"
 
 @x = private constant i8 1
 @llvm.compiler.used = appending global [1 x i8*] [i8* @x], section "llvm.metadata"
Index: clang/test/Driver/openmp-offload-gpu.c
===
--- clang/test/Driver/openmp-offload-gpu.c
+++ clang/test/Driver/openmp-offload-gpu.c
@@ -353,3 +353,10 @@
 // NEW_DRIVER: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_ASM]]"], output: "[[DEVICE_OBJ:.+]]"
 // NEW_DRIVER: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[DEVICE_OBJ]]"], output: "[[HOST_OBJ:.+]]"
 // NEW_DRIVER: "x86_64-unknown-linux-gnu" - "[[LINKER:.+]]", inputs: ["[[HOST_OBJ]]"], output: "openmp-offload-gpu"
+
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvida-cuda -march=sm_70 \
+// RUN:  --libomptarget-nvptx-bc-path=%S/Inputs/libomptarget/libomptarget-new-nvptx-test.bc \
+// RUN:  -fopenmp-new-driver -no-canonical-prefixes %s -o openmp-offload-gpu 2>&1 \
+// RUN:   | FileCheck -check-prefix=NEW_DRIVER_EMBEDDING %s
+
+// NEW_DRIVER_EMBEDDING: -fembed-offload-object=[[CUBIN:.*\.cubin]],nvptx64-nvidia-cuda.sm_70
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -4366,9 +4366,9 @@
       IsHeaderModulePrecompile ? HeaderModuleInput : Inputs[0];
 
   InputInfoList ModuleHeaderInputs;
+  InputInfoList OpenMPHostInputs;
   const InputInfo *CudaDeviceInput = nullptr;
   const InputInfo *OpenMPDeviceInput = nullptr;
-  const InputInfo *OpenMPHostInput = nullptr;
   for (const InputInfo &I : Inputs) {
     if (&I == &Input) {
       // This is the primary input.
@@ -4385,8 +4385,8 @@
       CudaDeviceInput = &I;
     } else if (IsOpenMPDevice && !OpenMPDeviceInput) {
       OpenMPDeviceInput = &I;
-    } else if (IsOpenMPHost && !OpenMPHostInput) {
-      OpenMPHostInput = &I;
+    } else if (IsOpenMPHost) {
+      OpenMPHostInputs.push_back(I);
     } else {
       llvm_unreachable("unexpectedly given multiple inputs");
     }
@@ -6894,6 +6894,25 @@
     }
   }
 
+  // Host-side OpenMP offloading receives the device object files and embeds them
+  // in a named section including the associated target triple and architecture.
+  if (IsOpenMPHost && !OpenMPHostInputs.empty()) {
+    auto InputFile = OpenMPHostInputs.begin();
+    auto OpenMPTCs = C.getOffloadToolChains<Action::OFK_OpenMP>();
+    for (auto TI = OpenMPTCs.first, TE = OpenMPTCs.second; TI != TE;
+         ++TI, ++InputFile) {
+      const ToolChain *TC = TI->second;
+      const ArgList &TCArgs = C.getArgsForToolChain(TC, "", Action::OFK_OpenMP);
+      StringRef File =
+          C.getArgs().MakeArgString(TC->getInputFilename(*InputFile));
+      StringRef InputName = Clang::getBaseInputStem(Args, Inputs);
+
+      CmdArgs.push_back(Args.MakeArgString(
+          "-fembed-offload-object=" + File + "," + TC->getTripleString() + "." +
+          TCArgs.getLastArgValue(options::OPT_march_EQ) + "." + InputName));
+    }
+  }
+
   if (Triple.isAMDGPU()) {
     handleAMDGPUCodeObjectVersionOptions(D, Args, CmdArgs);
 



[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 404037.
jhuber6 added a comment.

Adding test for multiple files (added it to wrong commit).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116542/new/

https://reviews.llvm.org/D116542

Files:
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/CodeGen/BackendUtil.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/CodeGen/CodeGenAction.cpp
  clang/test/Frontend/embed-object.ll
  llvm/include/llvm/Bitcode/BitcodeWriter.h
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/Bitcode/Writer/CMakeLists.txt

Index: llvm/lib/Bitcode/Writer/CMakeLists.txt
===
--- llvm/lib/Bitcode/Writer/CMakeLists.txt
+++ llvm/lib/Bitcode/Writer/CMakeLists.txt
@@ -11,6 +11,7 @@
   Analysis
   Core
   MC
+  TransformUtils
   Object
   Support
   )
Index: llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
===
--- llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -69,6 +69,7 @@
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/SHA1.h"
 #include "llvm/Support/raw_ostream.h"
+#include "llvm/Transforms/Utils/ModuleUtils.h"
 #include 
 #include 
 #include 
@@ -4973,3 +4974,19 @@
   llvm::ConstantArray::get(ATy, UsedArray), "llvm.compiler.used");
   NewUsed->setSection("llvm.metadata");
 }
+
+void llvm::EmbedBufferInModule(llvm::Module &M, llvm::MemoryBufferRef Buf,
+                               StringRef SectionName) {
+  ArrayRef<char> ModuleData =
+      ArrayRef<char>(Buf.getBufferStart(), Buf.getBufferSize());
+
+  // Embed the buffer into the module as a private constant global.
+  llvm::Constant *ModuleConstant =
+      llvm::ConstantDataArray::get(M.getContext(), ModuleData);
+  llvm::GlobalVariable *GV = new llvm::GlobalVariable(
+      M, ModuleConstant->getType(), true, llvm::GlobalValue::PrivateLinkage,
+      ModuleConstant, "llvm.embedded.object");
+  GV->setSection(SectionName);
+
+  appendToCompilerUsed(M, GV);
+}
Index: llvm/include/llvm/Bitcode/BitcodeWriter.h
===
--- llvm/include/llvm/Bitcode/BitcodeWriter.h
+++ llvm/include/llvm/Bitcode/BitcodeWriter.h
@@ -165,6 +165,11 @@
 bool EmbedCmdline,
 const std::vector );
 
+  /// Embeds the memory buffer \p Buf into the module \p M as a global using the
+  /// section name \p SectionName.
+  void EmbedBufferInModule(Module &M, MemoryBufferRef Buf,
+                           StringRef SectionName);
+
 } // end namespace llvm
 
 #endif // LLVM_BITCODE_BITCODEWRITER_H
Index: clang/test/Frontend/embed-object.ll
===
--- /dev/null
+++ clang/test/Frontend/embed-object.ll
@@ -0,0 +1,15 @@
+; RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \
+; RUN:-fembed-offload-object=%S/Inputs/empty.h,section1 \
+; RUN:-fembed-offload-object=%S/Inputs/empty.h,section2 -x ir %s -o - \
+; RUN:| FileCheck %s -check-prefix=CHECK
+
+; CHECK: @[[OBJECT1:.+]] = private constant [0 x i8] zeroinitializer, section ".llvm.offloading.section1"
+; CHECK: @[[OBJECT2:.+]] = private constant [0 x i8] zeroinitializer, section ".llvm.offloading.section2"
+; CHECK: @llvm.compiler.used = appending global [3 x i8*] [i8* @x, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @[[OBJECT1]], i32 0, i32 0), i8* getelementptr inbounds ([0 x i8], [0 x i8]* @[[OBJECT2]], i32 0, i32 0)], section "llvm.metadata"
+
+@x = private constant i8 1
+@llvm.compiler.used = appending global [1 x i8*] [i8* @x], section "llvm.metadata"
+
+define i32 @foo() {
+  ret i32 0
+}
Index: clang/lib/CodeGen/CodeGenAction.cpp
===
--- clang/lib/CodeGen/CodeGenAction.cpp
+++ clang/lib/CodeGen/CodeGenAction.cpp
@@ -1134,6 +1134,7 @@
 TheModule->setTargetTriple(TargetOpts.Triple);
   }
 
+  EmbedObject(TheModule.get(), CodeGenOpts, Diagnostics);
   EmbedBitcode(TheModule.get(), CodeGenOpts, *MainFile);
 
   LLVMContext  = TheModule->getContext();
Index: clang/lib/CodeGen/BackendUtil.cpp
===
--- clang/lib/CodeGen/BackendUtil.cpp
+++ clang/lib/CodeGen/BackendUtil.cpp
@@ -1750,3 +1750,25 @@
   CGOpts.getEmbedBitcode() != CodeGenOptions::Embed_Bitcode,
   CGOpts.CmdArgs);
 }
+
+void clang::EmbedObject(llvm::Module *M, const CodeGenOptions &CGOpts,
+                        DiagnosticsEngine &Diags) {
+  if (CGOpts.OffloadObjects.empty())
+    return;
+
+  for (StringRef OffloadObject : CGOpts.OffloadObjects) {
+    auto FilenameAndSection = OffloadObject.split(',');
+    llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> ObjectOrErr =
+        llvm::MemoryBuffer::getFileOrSTDIN(std::get<0>(FilenameAndSection));
+    if (std::error_code EC = ObjectOrErr.getError()) {
+  auto DiagID = 

[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 404047.
jhuber6 added a comment.

Changed the global's name to match the section name. This ensures that if the
sections would be merged we get a linker error instead of failing silently.
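
To illustrate how the section name relates to the flag value, here is a
minimal standalone sketch. It assumes the `file,section` form of
`-fembed-offload-object=` and the `.llvm.offloading.` prefix seen in the test;
`splitOffloadObject` and `offloadSectionName` are hypothetical helpers written
with `std::string`, not the LLVM types the patch itself uses.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: split "file,section" at the first comma. If there is
// no comma, the whole string is the file and the section is empty.
std::pair<std::string, std::string>
splitOffloadObject(const std::string &Arg) {
  const std::string::size_type Comma = Arg.find(',');
  if (Comma == std::string::npos)
    return {Arg, std::string()};
  return {Arg.substr(0, Comma), Arg.substr(Comma + 1)};
}

// Hypothetical helper: prefix the user-provided section with
// ".llvm.offloading.". In this revision the same string is also used as the
// name of the embedded global.
std::string offloadSectionName(const std::string &Section) {
  return ".llvm.offloading." + Section;
}
```

Because the global takes the section's name and is no longer private, two
object files that embed into the same section would presumably define the same
symbol, which is what should show up as a link error.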


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116542/new/

https://reviews.llvm.org/D116542

Files:
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/CodeGen/BackendUtil.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/CodeGen/CodeGenAction.cpp
  clang/test/Frontend/embed-object.ll
  llvm/include/llvm/Bitcode/BitcodeWriter.h
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/Bitcode/Writer/CMakeLists.txt

Index: llvm/lib/Bitcode/Writer/CMakeLists.txt
===
--- llvm/lib/Bitcode/Writer/CMakeLists.txt
+++ llvm/lib/Bitcode/Writer/CMakeLists.txt
@@ -11,6 +11,7 @@
   Analysis
   Core
   MC
+  TransformUtils
   Object
   Support
   )
Index: llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
===
--- llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -69,6 +69,7 @@
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/SHA1.h"
 #include "llvm/Support/raw_ostream.h"
+#include "llvm/Transforms/Utils/ModuleUtils.h"
 #include 
 #include 
 #include 
@@ -4973,3 +4974,22 @@
   llvm::ConstantArray::get(ATy, UsedArray), "llvm.compiler.used");
   NewUsed->setSection("llvm.metadata");
 }
+
+void llvm::EmbedBufferInModule(llvm::Module &M, llvm::MemoryBufferRef Buf,
+                               StringRef SectionName) {
+  ArrayRef<char> ModuleData =
+      ArrayRef<char>(Buf.getBufferStart(), Buf.getBufferSize());
+
+  // Embed the buffer into the module. These sections are not supposed to be
+  // merged by the linker, so we set the variable name to prevent linking if
+  // they would otherwise be merged.
+  llvm::Constant *ModuleConstant =
+  llvm::ConstantDataArray::get(M.getContext(), ModuleData);
+  llvm::GlobalVariable *GV = new llvm::GlobalVariable(
+  M, ModuleConstant->getType(), true, llvm::GlobalValue::ExternalLinkage,
+  ModuleConstant, SectionName);
+  GV->setVisibility(GlobalValue::HiddenVisibility);
+  GV->setSection(SectionName);
+
+  appendToCompilerUsed(M, GV);
+}
Index: llvm/include/llvm/Bitcode/BitcodeWriter.h
===
--- llvm/include/llvm/Bitcode/BitcodeWriter.h
+++ llvm/include/llvm/Bitcode/BitcodeWriter.h
@@ -165,6 +165,11 @@
 bool EmbedCmdline,
 const std::vector );
 
+  /// Embeds the memory buffer \p Buf into the module \p M as a global using the
+  /// section name \p SectionName.
+  void EmbedBufferInModule(Module &M, MemoryBufferRef Buf,
+                           StringRef SectionName);
+
 } // end namespace llvm
 
 #endif // LLVM_BITCODE_BITCODEWRITER_H
Index: clang/test/Frontend/embed-object.ll
===
--- /dev/null
+++ clang/test/Frontend/embed-object.ll
@@ -0,0 +1,15 @@
+; RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \
+; RUN:-fembed-offload-object=%S/Inputs/empty.h,section1 \
+; RUN:-fembed-offload-object=%S/Inputs/empty.h,section2 -x ir %s -o - \
+; RUN:| FileCheck %s -check-prefix=CHECK
+
+; CHECK: @[[OBJECT1:.+]] = hidden constant [0 x i8] zeroinitializer, section ".llvm.offloading.section1"
+; CHECK: @[[OBJECT2:.+]] = hidden constant [0 x i8] zeroinitializer, section ".llvm.offloading.section2"
+; CHECK: @llvm.compiler.used = appending global [3 x i8*] [i8* @x, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @[[OBJECT1]], i32 0, i32 0), i8* getelementptr inbounds ([0 x i8], [0 x i8]* @[[OBJECT2]], i32 0, i32 0)], section "llvm.metadata"
+
+@x = private constant i8 1
+@llvm.compiler.used = appending global [1 x i8*] [i8* @x], section "llvm.metadata"
+
+define i32 @foo() {
+  ret i32 0
+}
Index: clang/lib/CodeGen/CodeGenAction.cpp
===
--- clang/lib/CodeGen/CodeGenAction.cpp
+++ clang/lib/CodeGen/CodeGenAction.cpp
@@ -1134,6 +1134,7 @@
 TheModule->setTargetTriple(TargetOpts.Triple);
   }
 
+  EmbedObject(TheModule.get(), CodeGenOpts, Diagnostics);
   EmbedBitcode(TheModule.get(), CodeGenOpts, *MainFile);
 
   LLVMContext  = TheModule->getContext();
Index: clang/lib/CodeGen/BackendUtil.cpp
===
--- clang/lib/CodeGen/BackendUtil.cpp
+++ clang/lib/CodeGen/BackendUtil.cpp
@@ -1750,3 +1750,25 @@
   CGOpts.getEmbedBitcode() != CodeGenOptions::Embed_Bitcode,
   CGOpts.CmdArgs);
 }
+
+void clang::EmbedObject(llvm::Module *M, const CodeGenOptions &CGOpts,
+                        DiagnosticsEngine &Diags) {
+  if (CGOpts.OffloadObjects.empty())
+    return;
+
+  for (StringRef OffloadObject : 

[PATCH] D118495: [OpenMP] Accept shortened triples for -Xopenmp-target=

2022-01-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision.
jhuber6 added reviewers: JonChesterfield, jdoerfert, tianshilei1992.
Herald added subscribers: guansong, yaxunl.
jhuber6 requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

This patch builds on the change in D117634, which expanded shortened
triples passed in by the user. It adds the same functionality to the
`-Xopenmp-target=` flag. Previously, passing `-fopenmp-targets=nvptx64
-Xopenmp-target=nvptx64 ` unintuitively did not forward the argument,
because the triples no longer matched once `nvptx64` was expanded to
`nvptx64-nvidia-cuda`.
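
The matching problem described above can be sketched in a few lines. This is
a simplified standalone illustration, not the patch's code: it uses
`std::string` rather than `llvm::Triple`, and it treats only a bare
architecture name as "shortened", whereas the patch checks for an unknown
vendor or OS. The names `expandOpenMPTriple` and `forwardsToToolChain` are
hypothetical.

```cpp
#include <string>

// Simplified stand-in for the triple expansion. Assumption: only a bare
// architecture name (no '-') counts as shortened.
std::string expandOpenMPTriple(const std::string &Triple) {
  if (Triple.find('-') != std::string::npos)
    return Triple; // already has more than an architecture component
  if (Triple == "nvptx")
    return "nvptx-nvidia-cuda";
  if (Triple == "nvptx64")
    return "nvptx64-nvidia-cuda";
  if (Triple == "amdgcn")
    return "amdgcn-amd-amdhsa";
  return Triple; // unrecognized architectures pass through unchanged
}

// Hypothetical check mirroring the comparison against the toolchain triple:
// the argument is forwarded only when the expanded triples match.
bool forwardsToToolChain(const std::string &UserTriple,
                         const std::string &ToolChainTriple) {
  return expandOpenMPTriple(UserTriple) == ToolChainTriple;
}
```

Without the expansion step, comparing the raw `nvptx64` against the toolchain's
`nvptx64-nvidia-cuda` fails, which is the behaviour the patch sets out to fix.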


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D118495

Files:
  clang/lib/Driver/ToolChain.cpp


Index: clang/lib/Driver/ToolChain.cpp
===
--- clang/lib/Driver/ToolChain.cpp
+++ clang/lib/Driver/ToolChain.cpp
@@ -1129,8 +1129,20 @@
 A->getOption().matches(options::OPT_Xopenmp_target);
 
 if (A->getOption().matches(options::OPT_Xopenmp_target_EQ)) {
+  llvm::Triple TT(A->getValue(0));
+  // We want to expand the shortened versions of the triples passed in to
+  // the values used for the bitcode libraries for convenience.
+  if (TT.getVendor() == llvm::Triple::UnknownVendor ||
+  TT.getOS() == llvm::Triple::UnknownOS) {
+if (TT.getArch() == llvm::Triple::nvptx)
+  TT = llvm::Triple("nvptx-nvidia-cuda");
+else if (TT.getArch() == llvm::Triple::nvptx64)
+  TT = llvm::Triple("nvptx64-nvidia-cuda");
+else if (TT.getArch() == llvm::Triple::amdgcn)
+  TT = llvm::Triple("amdgcn-amd-amdhsa");
+  }
   // Passing device args: -Xopenmp-target= -opt=val.
-  if (A->getValue(0) == getTripleString())
+  if (TT.getTriple() == getTripleString())
 Index = Args.getBaseArgs().MakeIndex(A->getValue(1));
   else
 continue;


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-26 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 403468.
jhuber6 added a comment.

clang format.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116542/new/

https://reviews.llvm.org/D116542

Files:
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/CodeGen/BackendUtil.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/CodeGen/CodeGenAction.cpp
  clang/test/Frontend/embed-object.ll
  llvm/include/llvm/Bitcode/BitcodeWriter.h
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/Bitcode/Writer/CMakeLists.txt

Index: llvm/lib/Bitcode/Writer/CMakeLists.txt
===
--- llvm/lib/Bitcode/Writer/CMakeLists.txt
+++ llvm/lib/Bitcode/Writer/CMakeLists.txt
@@ -11,6 +11,7 @@
   Analysis
   Core
   MC
+  TransformUtils
   Object
   Support
   )
Index: llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
===
--- llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -69,6 +69,7 @@
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/SHA1.h"
 #include "llvm/Support/raw_ostream.h"
+#include "llvm/Transforms/Utils/ModuleUtils.h"
 #include 
 #include 
 #include 
@@ -4972,3 +4973,19 @@
   llvm::ConstantArray::get(ATy, UsedArray), "llvm.compiler.used");
   NewUsed->setSection("llvm.metadata");
 }
+
+void llvm::EmbedBufferInModule(llvm::Module &M, llvm::MemoryBufferRef Buf,
+                               StringRef SectionName) {
+  ArrayRef<char> ModuleData =
+      ArrayRef<char>(Buf.getBufferStart(), Buf.getBufferSize());
+
+  // Embed the data in the
+  llvm::Constant *ModuleConstant =
+  llvm::ConstantDataArray::get(M.getContext(), ModuleData);
+  llvm::GlobalVariable *GV = new llvm::GlobalVariable(
+  M, ModuleConstant->getType(), true, llvm::GlobalValue::PrivateLinkage,
+  ModuleConstant, "llvm.embedded.object");
+  GV->setSection(SectionName);
+
+  appendToCompilerUsed(M, GV);
+}
Index: llvm/include/llvm/Bitcode/BitcodeWriter.h
===
--- llvm/include/llvm/Bitcode/BitcodeWriter.h
+++ llvm/include/llvm/Bitcode/BitcodeWriter.h
@@ -165,6 +165,11 @@
 bool EmbedCmdline,
 const std::vector );
 
+  /// Embeds the memory buffer \p Buf into the module \p M as a global using the
+  /// section name \p SectionName.
+  void EmbedBufferInModule(Module &M, MemoryBufferRef Buf,
+                           StringRef SectionName);
+
 } // end namespace llvm
 
 #endif // LLVM_BITCODE_BITCODEWRITER_H
Index: clang/test/Frontend/embed-object.ll
===
--- /dev/null
+++ clang/test/Frontend/embed-object.ll
@@ -0,0 +1,13 @@
+; RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \
+; RUN:-fembed-offload-object=%S/Inputs/empty.h,section -x ir %s -o - \
+; RUN:| FileCheck %s -check-prefix=CHECK
+
+; CHECK: @llvm.embedded.object = private constant [0 x i8] zeroinitializer, section ".llvm.offloading.section"
+; CHECK: @llvm.compiler.used = appending global [2 x i8*] [i8* @x, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @llvm.embedded.object, i32 0, i32 0)], section "llvm.metadata"
+
+@x = private constant i8 1
+@llvm.compiler.used = appending global [1 x i8*] [i8* @x], section "llvm.metadata"
+
+define i32 @foo() {
+  ret i32 0
+}
Index: clang/lib/CodeGen/CodeGenAction.cpp
===
--- clang/lib/CodeGen/CodeGenAction.cpp
+++ clang/lib/CodeGen/CodeGenAction.cpp
@@ -1134,6 +1134,7 @@
 TheModule->setTargetTriple(TargetOpts.Triple);
   }
 
+  EmbedBinary(TheModule.get(), CodeGenOpts, Diagnostics);
   EmbedBitcode(TheModule.get(), CodeGenOpts, *MainFile);
 
   LLVMContext  = TheModule->getContext();
Index: clang/lib/CodeGen/BackendUtil.cpp
===
--- clang/lib/CodeGen/BackendUtil.cpp
+++ clang/lib/CodeGen/BackendUtil.cpp
@@ -1745,8 +1745,31 @@
  llvm::MemoryBufferRef Buf) {
   if (CGOpts.getEmbedBitcode() == CodeGenOptions::Embed_Off)
 return;
+
   llvm::EmbedBitcodeInModule(
   *M, Buf, CGOpts.getEmbedBitcode() != CodeGenOptions::Embed_Marker,
   CGOpts.getEmbedBitcode() != CodeGenOptions::Embed_Bitcode,
   CGOpts.CmdArgs);
 }
+
+void clang::EmbedObject(llvm::Module *M, const CodeGenOptions &CGOpts,
+                        DiagnosticsEngine &Diags) {
+  if (CGOpts.OffloadObjects.empty())
+    return;
+
+  for (StringRef OffloadObject : CGOpts.OffloadObjects) {
+    auto FilenameAndSection = OffloadObject.split(',');
+    llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> ObjectOrErr =
+        llvm::MemoryBuffer::getFileOrSTDIN(std::get<0>(FilenameAndSection));
+    if (std::error_code EC = ObjectOrErr.getError()) {
+  auto DiagID = Diags.getCustomDiagID(DiagnosticsEngine::Error,
+  

[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-26 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 403476.
jhuber6 added a comment.

Forgot to rename file.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116542/new/

https://reviews.llvm.org/D116542

Files:
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/CodeGen/BackendUtil.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/CodeGen/CodeGenAction.cpp
  clang/test/Frontend/embed-object.ll
  llvm/include/llvm/Bitcode/BitcodeWriter.h
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/Bitcode/Writer/CMakeLists.txt

Index: llvm/lib/Bitcode/Writer/CMakeLists.txt
===
--- llvm/lib/Bitcode/Writer/CMakeLists.txt
+++ llvm/lib/Bitcode/Writer/CMakeLists.txt
@@ -11,6 +11,7 @@
   Analysis
   Core
   MC
+  TransformUtils
   Object
   Support
   )
Index: llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
===
--- llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -69,6 +69,7 @@
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/SHA1.h"
 #include "llvm/Support/raw_ostream.h"
+#include "llvm/Transforms/Utils/ModuleUtils.h"
 #include 
 #include 
 #include 
@@ -4972,3 +4973,19 @@
   llvm::ConstantArray::get(ATy, UsedArray), "llvm.compiler.used");
   NewUsed->setSection("llvm.metadata");
 }
+
+void llvm::EmbedBufferInModule(llvm::Module &M, llvm::MemoryBufferRef Buf,
+                               StringRef SectionName) {
+  ArrayRef<char> ModuleData =
+      ArrayRef<char>(Buf.getBufferStart(), Buf.getBufferSize());
+
+  // Embed the data in the
+  llvm::Constant *ModuleConstant =
+  llvm::ConstantDataArray::get(M.getContext(), ModuleData);
+  llvm::GlobalVariable *GV = new llvm::GlobalVariable(
+  M, ModuleConstant->getType(), true, llvm::GlobalValue::PrivateLinkage,
+  ModuleConstant, "llvm.embedded.object");
+  GV->setSection(SectionName);
+
+  appendToCompilerUsed(M, GV);
+}
Index: llvm/include/llvm/Bitcode/BitcodeWriter.h
===
--- llvm/include/llvm/Bitcode/BitcodeWriter.h
+++ llvm/include/llvm/Bitcode/BitcodeWriter.h
@@ -165,6 +165,11 @@
 bool EmbedCmdline,
 const std::vector );
 
+  /// Embeds the memory buffer \p Buf into the module \p M as a global using the
+  /// section name \p SectionName.
+  void EmbedBufferInModule(Module &M, MemoryBufferRef Buf,
+                           StringRef SectionName);
+
 } // end namespace llvm
 
 #endif // LLVM_BITCODE_BITCODEWRITER_H
Index: clang/test/Frontend/embed-object.ll
===
--- /dev/null
+++ clang/test/Frontend/embed-object.ll
@@ -0,0 +1,13 @@
+; RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \
+; RUN:-fembed-offload-object=%S/Inputs/empty.h,section -x ir %s -o - \
+; RUN:| FileCheck %s -check-prefix=CHECK
+
+; CHECK: @llvm.embedded.object = private constant [0 x i8] zeroinitializer, section ".llvm.offloading.section"
+; CHECK: @llvm.compiler.used = appending global [2 x i8*] [i8* @x, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @llvm.embedded.object, i32 0, i32 0)], section "llvm.metadata"
+
+@x = private constant i8 1
+@llvm.compiler.used = appending global [1 x i8*] [i8* @x], section "llvm.metadata"
+
+define i32 @foo() {
+  ret i32 0
+}
Index: clang/lib/CodeGen/CodeGenAction.cpp
===
--- clang/lib/CodeGen/CodeGenAction.cpp
+++ clang/lib/CodeGen/CodeGenAction.cpp
@@ -1134,6 +1134,7 @@
 TheModule->setTargetTriple(TargetOpts.Triple);
   }
 
+  EmbedObject(TheModule.get(), CodeGenOpts, Diagnostics);
   EmbedBitcode(TheModule.get(), CodeGenOpts, *MainFile);
 
   LLVMContext  = TheModule->getContext();
Index: clang/lib/CodeGen/BackendUtil.cpp
===
--- clang/lib/CodeGen/BackendUtil.cpp
+++ clang/lib/CodeGen/BackendUtil.cpp
@@ -1750,3 +1750,25 @@
   CGOpts.getEmbedBitcode() != CodeGenOptions::Embed_Bitcode,
   CGOpts.CmdArgs);
 }
+
+void clang::EmbedObject(llvm::Module *M, const CodeGenOptions &CGOpts,
+                        DiagnosticsEngine &Diags) {
+  if (CGOpts.OffloadObjects.empty())
+    return;
+
+  for (StringRef OffloadObject : CGOpts.OffloadObjects) {
+    auto FilenameAndSection = OffloadObject.split(',');
+    llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> ObjectOrErr =
+        llvm::MemoryBuffer::getFileOrSTDIN(std::get<0>(FilenameAndSection));
+    if (std::error_code EC = ObjectOrErr.getError()) {
+      auto DiagID = Diags.getCustomDiagID(DiagnosticsEngine::Error,
+                                          "could not open '%0' for embedding");
+      Diags.Report(DiagID) << std::get<0>(FilenameAndSection);
+      return;
+    }
+
+    SmallString<128> SectionName(
+{".llvm.offloading.", 

[PATCH] D118495: [OpenMP] Accept shortened triples for -Xopenmp-target=

2022-01-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 404136.
jhuber6 added a comment.

Adding test and shared function.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118495/new/

https://reviews.llvm.org/D118495

Files:
  clang/include/clang/Driver/ToolChain.h
  clang/lib/Driver/Driver.cpp
  clang/lib/Driver/ToolChain.cpp
  clang/test/Driver/openmp-offload-gpu.c


Index: clang/test/Driver/openmp-offload-gpu.c
===
--- clang/test/Driver/openmp-offload-gpu.c
+++ clang/test/Driver/openmp-offload-gpu.c
@@ -343,3 +343,10 @@
 // RUN:   | FileCheck -check-prefix=SAVE_TEMPS_NAMES %s
 
 // SAVE_TEMPS_NAMES-NOT: "GNU::Linker"{{.*}}["[[SAVE_TEMPS_INPUT1:.*\.o]]", "[[SAVE_TEMPS_INPUT1]]"]
+
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64 -march=sm_35 \
+// RUN:  -save-temps -no-canonical-prefixes %s -o openmp-offload-gpu 2>&1 \
+// RUN:   | FileCheck -check-prefix=TRIPLE %s
+
+// TRIPLE: "-triple" "nvptx64-nvidia-cuda"
+// TRIPLE: "-target-cpu" "sm_35"
Index: clang/lib/Driver/ToolChain.cpp
===
--- clang/lib/Driver/ToolChain.cpp
+++ clang/lib/Driver/ToolChain.cpp
@@ -1129,8 +1129,10 @@
 A->getOption().matches(options::OPT_Xopenmp_target);
 
 if (A->getOption().matches(options::OPT_Xopenmp_target_EQ)) {
+  llvm::Triple TT(getOpenMPTriple(A->getValue(0)));
+
   // Passing device args: -Xopenmp-target= -opt=val.
-  if (A->getValue(0) == getTripleString())
+  if (TT.getTriple() == getTripleString())
 Index = Args.getBaseArgs().MakeIndex(A->getValue(1));
   else
 continue;
Index: clang/lib/Driver/Driver.cpp
===
--- clang/lib/Driver/Driver.cpp
+++ clang/lib/Driver/Driver.cpp
@@ -773,21 +773,9 @@
   if (HasValidOpenMPRuntime) {
 llvm::StringMap FoundNormalizedTriples;
 for (const char *Val : OpenMPTargets->getValues()) {
-  llvm::Triple TT(Val);
+  llvm::Triple TT(ToolChain::getOpenMPTriple(Val));
   std::string NormalizedName = TT.normalize();
 
-  // We want to expand the shortened versions of the triples passed in to
-  // the values used for the bitcode libraries for convenience.
-  if (TT.getVendor() == llvm::Triple::UnknownVendor ||
-  TT.getOS() == llvm::Triple::UnknownOS) {
-if (TT.getArch() == llvm::Triple::nvptx)
-  TT = llvm::Triple("nvptx-nvidia-cuda");
-else if (TT.getArch() == llvm::Triple::nvptx64)
-  TT = llvm::Triple("nvptx64-nvidia-cuda");
-else if (TT.getArch() == llvm::Triple::amdgcn)
-  TT = llvm::Triple("amdgcn-amd-amdhsa");
-  }
-
   // Make sure we don't have a duplicate triple.
   auto Duplicate = FoundNormalizedTriples.find(NormalizedName);
   if (Duplicate != FoundNormalizedTriples.end()) {
Index: clang/include/clang/Driver/ToolChain.h
===
--- clang/include/clang/Driver/ToolChain.h
+++ clang/include/clang/Driver/ToolChain.h
@@ -711,6 +711,22 @@
   const llvm::fltSemantics *FPType = nullptr) const {
 return llvm::DenormalMode::getIEEE();
   }
+
+  // We want to expand the shortened versions of the triples passed in to
+  // the values used for the bitcode libraries.
+  static llvm::Triple getOpenMPTriple(StringRef TripleStr) {
+llvm::Triple TT(TripleStr);
+if (TT.getVendor() == llvm::Triple::UnknownVendor ||
+TT.getOS() == llvm::Triple::UnknownOS) {
+  if (TT.getArch() == llvm::Triple::nvptx)
+return llvm::Triple("nvptx-nvidia-cuda");
+  if (TT.getArch() == llvm::Triple::nvptx64)
+return llvm::Triple("nvptx64-nvidia-cuda");
+  if (TT.getArch() == llvm::Triple::amdgcn)
+return llvm::Triple("amdgcn-amd-amdhsa");
+}
+return TT;
+  }
 };
 
 /// Set a ToolChain's effective triple. Reset it when the registration object



[PATCH] D116543: [OpenMP] Embed device files into the host IR

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 404513.
jhuber6 added a comment.

Fix test


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116543/new/

https://reviews.llvm.org/D116543

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/openmp-offload-gpu.c


Index: clang/test/Driver/openmp-offload-gpu.c
===
--- clang/test/Driver/openmp-offload-gpu.c
+++ clang/test/Driver/openmp-offload-gpu.c
@@ -353,3 +353,10 @@
 // NEW_DRIVER: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_ASM]]"], output: "[[DEVICE_OBJ:.+]]"
 // NEW_DRIVER: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[DEVICE_OBJ]]"], output: "[[HOST_OBJ:.+]]"
 // NEW_DRIVER: "x86_64-unknown-linux-gnu" - "[[LINKER:.+]]", inputs: ["[[HOST_OBJ]]"], output: "openmp-offload-gpu"
+
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvida-cuda -march=sm_70 \
+// RUN:  --libomptarget-nvptx-bc-path=%S/Inputs/libomptarget/libomptarget-new-nvptx-test.bc \
+// RUN:  -fopenmp-new-driver -no-canonical-prefixes -nogpulib %s -o openmp-offload-gpu 2>&1 \
+// RUN:   | FileCheck -check-prefix=NEW_DRIVER_EMBEDDING %s
+
+// NEW_DRIVER_EMBEDDING: -fembed-offload-object=[[CUBIN:.*\.cubin]],nvptx64-nvidia-cuda.sm_70
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -4366,9 +4366,9 @@
   IsHeaderModulePrecompile ? HeaderModuleInput : Inputs[0];
 
   InputInfoList ModuleHeaderInputs;
+  InputInfoList OpenMPHostInputs;
   const InputInfo *CudaDeviceInput = nullptr;
   const InputInfo *OpenMPDeviceInput = nullptr;
-  const InputInfo *OpenMPHostInput = nullptr;
   for (const InputInfo &I : Inputs) {
 if ( == ) {
   // This is the primary input.
@@ -4385,8 +4385,8 @@
   CudaDeviceInput = &I;
 } else if (IsOpenMPDevice && !OpenMPDeviceInput) {
   OpenMPDeviceInput = &I;
-} else if (IsOpenMPHost && !OpenMPHostInput) {
-  OpenMPHostInput = &I;
+} else if (IsOpenMPHost) {
+  OpenMPHostInputs.push_back(I);
 } else {
   llvm_unreachable("unexpectedly given multiple inputs");
 }
@@ -6894,6 +6894,25 @@
 }
   }
 
+  // Host-side OpenMP offloading receives the device object files and embeds
+  // them in a named section including the associated target triple and
+  // architecture.
+  if (IsOpenMPHost && !OpenMPHostInputs.empty()) {
+    auto InputFile = OpenMPHostInputs.begin();
+    auto OpenMPTCs = C.getOffloadToolChains<Action::OFK_OpenMP>();
+    for (auto TI = OpenMPTCs.first, TE = OpenMPTCs.second; TI != TE;
+         ++TI, ++InputFile) {
+      const ToolChain *TC = TI->second;
+      const ArgList &TCArgs =
+          C.getArgsForToolChain(TC, "", Action::OFK_OpenMP);
+      StringRef File =
+          C.getArgs().MakeArgString(TC->getInputFilename(*InputFile));
+      StringRef InputName = Clang::getBaseInputStem(Args, Inputs);
+
+      CmdArgs.push_back(Args.MakeArgString(
+          "-fembed-offload-object=" + File + "," + TC->getTripleString() + "." +
+          TCArgs.getLastArgValue(options::OPT_march_EQ) + "." + InputName));
+    }
+  }
+
   if (Triple.isAMDGPU()) {
 handleAMDGPUCodeObjectVersionOptions(D, Args, CmdArgs);
 



[PATCH] D116545: [OpenMP] Add support for extracting device code in linker wrapper

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

In D116545#3284541, @jdoerfert wrote:

> what commit contains the tests?

The previous four have clang tests, showing that we call this tool with the 
expected arguments. Testing the tool itself requires running it, so I was 
thinking I could make a new class of OpenMP tests that run with the new driver 
similar to how we handled the new runtime transition. Haven't gotten that done 
yet.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116545/new/

https://reviews.llvm.org/D116545



[PATCH] D116675: [OpenMP] Search for static libraries in offload linker tool

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:646
+if (Arg.startswith("-L"))
+  LibraryPaths.push_back(Arg.drop_front(2));
+

jdoerfert wrote:
> This seems to handle `-Lfoo`, what about `-L bar`? at least a todo would be 
> good.
These arguments come from clang, which normalizes them somewhat, so they will
always be in the form `"-Lfoo"`. I could add some checks for the other form if
we expect the user to be calling this tool manually.
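
For reference, handling both spellings could look roughly like the sketch
below. This is a standalone illustration using standard containers, not the
tool's code; `collectLibraryPaths` is a hypothetical name, and treating the
argument after a bare `-L` as the path is an assumption about the desired
behaviour.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: gather library search paths from both the joined
// ("-Lfoo") and the separate ("-L", "bar") spellings.
std::vector<std::string>
collectLibraryPaths(const std::vector<std::string> &Args) {
  std::vector<std::string> Paths;
  for (std::size_t I = 0; I < Args.size(); ++I) {
    const std::string &Arg = Args[I];
    if (Arg.compare(0, 2, "-L") != 0)
      continue; // not a library path option
    if (Arg.size() > 2)
      Paths.push_back(Arg.substr(2)); // joined form: -Lfoo
    else if (I + 1 < Args.size())
      Paths.push_back(Args[++I]); // separate form: -L bar
  }
  return Paths;
}
```

A trailing `-L` with nothing after it is silently ignored here; a real tool
would probably want to report that as an error.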


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116675/new/

https://reviews.llvm.org/D116675



[PATCH] D118493: Set rpath on openmp executables

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 accepted this revision.
jhuber6 added a comment.
This revision is now accepted and ready to land.

LGTM, unless someone else has reservations.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118493/new/

https://reviews.llvm.org/D118493



[PATCH] D118493: Set rpath on openmp executables

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/test/OpenMP/implicit_rpath.c:28
+// CHECK-COMPOSABLE: ({{R|RUN}}PATH) Library {{r|run}}path: [early:late:{{.*}}llvm/lib]
+
+int main() {}

JonChesterfield wrote:
> This ^ probably has path separator issues on windows, will try to find what 
> the proper regex is for that
I think it's one of the substitutions listed at
https://llvm.org/docs/CommandGuide/lit.html#substitutions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118493/new/

https://reviews.llvm.org/D118493



[PATCH] D116545: [OpenMP] Add support for extracting device code in linker wrapper

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:258
+  if (ToBeDeleted.empty())
+return None;
+

jdoerfert wrote:
> if (!StripSections)
>   return None;
Fixed this later; I could rebase it so the fix applies here if that makes the
review easier.



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:435
+Arg = **NewFileOrErr;
+  }
+}

jdoerfert wrote:
> Does this work with the "do not strip option"? I somehow doubt it and would 
> expect an error.
If we don't strip then we just return `None` to indicate we didn't create a new 
host file and this code won't execute.



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:445
   // TODO: Wrap device image in a host binary and pass it to the linker.
   WithColor::warning(errs(), argv[0]) << "Offload linking not yet supported.\n";
 

jdoerfert wrote:
> I know this is not new but I don't understand the warning here.
It was just a stand-in to indicate that the tool was indeed running with the
`-fopenmp-new-driver` flag but was not yet doing the offload linking step, and
thus nothing would execute on the device. It's removed once offloading actually
works.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116545/new/

https://reviews.llvm.org/D116545



[PATCH] D116544: [Clang] Introduce Clang Linker Wrapper Tool

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 404585.
jhuber6 added a comment.

Adding documentation for tool.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116544/new/

https://reviews.llvm.org/D116544

Files:
  clang/docs/ClangLinkerWrapper.rst
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Driver/Action.h
  clang/include/clang/Driver/Job.h
  clang/include/clang/Driver/ToolChain.h
  clang/lib/Driver/Action.cpp
  clang/lib/Driver/Driver.cpp
  clang/lib/Driver/ToolChain.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/Clang.h
  clang/tools/CMakeLists.txt
  clang/tools/clang-linker-wrapper/CMakeLists.txt
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- /dev/null
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -0,0 +1,91 @@
+//===-- clang-linker-wrapper/ClangLinkerWrapper.cpp - wrapper over linker-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===-===//
+///
+//===-===//
+
+#include "clang/Basic/Version.h"
+#include "llvm/Object/Archive.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/Errc.h"
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/MemoryBuffer.h"
+#include "llvm/Support/Path.h"
+#include "llvm/Support/Program.h"
+#include "llvm/Support/Signals.h"
+#include "llvm/Support/StringSaver.h"
+#include "llvm/Support/WithColor.h"
+#include "llvm/Support/raw_ostream.h"
+
+using namespace llvm;
+
+static cl::opt<bool> Help("h", cl::desc("Alias for -help"), cl::Hidden);
+
+// Mark all our options with this category, everything else (except for -help)
+// will be hidden.
+static cl::OptionCategory
+    ClangLinkerWrapperCategory("clang-linker-wrapper options");
+
+static cl::opt<std::string> LinkerUserPath("linker-path",
+                                           cl::desc("Path of linker binary"),
+                                           cl::cat(ClangLinkerWrapperCategory));
+
+// Do not parse linker options
+static cl::list<std::string>
+    LinkerArgs(cl::Sink, cl::desc("..."));
+
+static Error runLinker(std::string LinkerPath,
+                       SmallVectorImpl<std::string> &Args) {
+  std::vector<StringRef> LinkerArgs;
+  LinkerArgs.push_back(LinkerPath);
+  for (auto &Arg : Args)
+    LinkerArgs.push_back(Arg);
+
+  if (sys::ExecuteAndWait(LinkerPath, LinkerArgs))
+    return createStringError(inconvertibleErrorCode(), "'linker' failed");
+  return Error::success();
+}
+
+static void PrintVersion(raw_ostream &OS) {
+  OS << clang::getClangToolFullVersion("clang-linker-wrapper") << '\n';
+}
+
+int main(int argc, const char **argv) {
+  sys::PrintStackTraceOnErrorSignal(argv[0]);
+  cl::SetVersionPrinter(PrintVersion);
+  cl::HideUnrelatedOptions(ClangLinkerWrapperCategory);
+  cl::ParseCommandLineOptions(
+      argc, argv,
+      "A wrapper utility over the host linker. It scans the input files for\n"
+      "sections that require additional processing prior to linking. The tool\n"
+      "will then transparently pass all arguments and input to the specified\n"
+      "host linker to create the final binary.\n");
+
+  if (Help) {
+    cl::PrintHelpMessage();
+    return EXIT_SUCCESS;
+  }
+
+  auto reportError = [argv](Error E) {
+    logAllUnhandledErrors(std::move(E), WithColor::error(errs(), argv[0]));
+    exit(EXIT_FAILURE);
+  };
+
+  // TODO: Scan input object files for offloading sections and extract them.
+  // TODO: Perform appropriate device linking action.
+  // TODO: Wrap device image in a host binary and pass it to the linker.
+  WithColor::warning(errs(), argv[0]) << "Offload linking not yet supported.\n";
+
+  SmallVector Argv;
+  for (const std::string &Arg : LinkerArgs)
+    Argv.push_back(Arg);
+
+  if (Error Err = runLinker(LinkerUserPath, Argv))
+    reportError(std::move(Err));
+
+  return EXIT_SUCCESS;
+}
Index: clang/tools/clang-linker-wrapper/CMakeLists.txt
===
--- /dev/null
+++ clang/tools/clang-linker-wrapper/CMakeLists.txt
@@ -0,0 +1,25 @@
+set(LLVM_LINK_COMPONENTS BitWriter Core Object Support)
+
+if(NOT CLANG_BUILT_STANDALONE)
+  set(tablegen_deps intrinsics_gen)
+endif()
+
+add_clang_executable(clang-linker-wrapper
+  ClangLinkerWrapper.cpp
+
+  DEPENDS
+  ${tablegen_deps}
+  )
+
+set(CLANG_LINKER_WRAPPER_LIB_DEPS
+  clangBasic
+  )
+
+add_dependencies(clang clang-linker-wrapper)
+
+target_link_libraries(clang-linker-wrapper
+  PRIVATE
+  ${CLANG_LINKER_WRAPPER_LIB_DEPS}
+  )
+
+install(TARGETS clang-linker-wrapper RUNTIME DESTINATION bin)
Index: clang/tools/CMakeLists.txt

[PATCH] D116627: [Clang] Initial support for linking offloading code in tool

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:425
+  ErrorOr<std::string> NvlinkPath = sys::findProgramByName(
+      "nvlink", sys::path::parent_path(LinkerExecutable));
+  if (!NvlinkPath)

jdoerfert wrote:
> Unsure why we look at the linker bin dir first, PATH seems to be the right 
> choice here.
Copied it from somewhere; I should change it. Right now it's just a weird way to 
look through `/bin/`.
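
The two lookups differ only in which directories are searched and in what 
order. A rough sketch of that idea, not the `sys::findProgramByName` 
implementation: the helper `findProgramIn` and its `Exists` callback are 
hypothetical, and the directories are passed in explicitly rather than read 
from `PATH`:

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Returns the first "Dir/Name" for which Exists reports true, trying the
// directories in the order given; std::nullopt if none matches.
std::optional<std::string>
findProgramIn(const std::string &Name, const std::vector<std::string> &Dirs,
              const std::function<bool(const std::string &)> &Exists) {
  for (const std::string &Dir : Dirs) {
    std::string Candidate = Dir + "/" + Name;
    if (Exists(Candidate))
      return Candidate;
  }
  return std::nullopt;
}
```

Putting the linker's own directory first in `Dirs` reproduces the current 
behaviour; passing only the `PATH` entries gives the lookup the reviewer asks 
for.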



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:439
+
+  // TODO: Pass in arguments like `-g` and `-v` from the driver.
+  SmallVector CmdArgs;

jdoerfert wrote:
> If this is not addressed in a follow up, it seems trivial to add it.
It is addressed later.



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:468
-
-  return EXIT_SUCCESS;
 }

jdoerfert wrote:
> keep exit success, makes it obvious this worked.
Will do.



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:497
+Triple TheTriple(TargetFeatures.first);
+StringRef Arch(TargetFeatures.second);
+

jdoerfert wrote:
> .str() above to get a key is somewhat ok, but then parsing the key again is 
> weird.
> Can we make it a densemap and provide a densemapinfo for `struct DeviceFile`. 
> It can just inherit from the DenseMapInfo. Might even work w/o 
> explicitly calling the "super" functions with the `.str()` version if we have 
> a `std::string operator`. Might then even work w/o DenseMapInfo by telling 
> the DenseMap to use the std::string specialization of DenseMapInfo in the 
> first place.
Probably a smarter idea, I needed a way to group all files with the same device 
file type. I'll look into it.
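
One way to group files per device target without building a string key and 
parsing it back is to key the map on a small struct. This is only a sketch 
under assumptions: `std::map` with `operator<` stands in for the suggested 
`DenseMap` plus `DenseMapInfo`, and `DeviceTarget`, the simplified `DeviceFile` 
and `groupByTarget` are hypothetical names, not code from the patch:

```cpp
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Identifies a device target; ordered so it can be used as a std::map key.
struct DeviceTarget {
  std::string TheTriple;
  std::string Arch;
  bool operator<(const DeviceTarget &Other) const {
    return std::tie(TheTriple, Arch) < std::tie(Other.TheTriple, Other.Arch);
  }
};

// Simplified stand-in for the DeviceFile struct in the patch.
struct DeviceFile {
  DeviceTarget Target;
  std::string Filename;
};

// Collects the filenames of all device files that share a triple and arch.
std::map<DeviceTarget, std::vector<std::string>>
groupByTarget(const std::vector<DeviceFile> &Files) {
  std::map<DeviceTarget, std::vector<std::string>> Groups;
  for (const DeviceFile &File : Files)
    Groups[File.Target].push_back(File.Filename);
  return Groups;
}
```

The triple and architecture stay available as separate fields on the key, so 
nothing has to be split out of a joined string later.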



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:647
 if (std::error_code EC = sys::fs::remove(TempFile))
-  reportError(createFileError(TempFile, EC));
-  }
-
-  return EXIT_SUCCESS;
+  return reportError(createFileError(TempFile, EC));
 }

jdoerfert wrote:
> do we really want to return here?
Might be a good idea to keep trying to remove temp files even if one fails, 
will address.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116627/new/

https://reviews.llvm.org/D116627



[PATCH] D116627: [Clang] Initial support for linking offloading code in tool

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 404598.
jhuber6 added a comment.

Making suggested changes.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116627/new/

https://reviews.llvm.org/D116627

Files:
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -28,6 +28,7 @@
 #include "llvm/Support/Errc.h"
 #include "llvm/Support/FileOutputBuffer.h"
 #include "llvm/Support/FileSystem.h"
+#include "llvm/Support/Host.h"
 #include "llvm/Support/InitLLVM.h"
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/Path.h"
@@ -59,22 +60,26 @@
 
 // Do not parse linker options.
 static cl::list<std::string>
-    LinkerArgs(cl::Sink, cl::desc("..."));
+    HostLinkerArgs(cl::Sink, cl::desc("..."));
 
 /// Path of the current binary.
 static std::string LinkerExecutable;
 
+static SmallVector TempFiles;
 /// Magic section string that marks the existence of offloading data. The
 /// section string will be formatted as `.llvm.offloading.<triple>.<arch>`.
-#define OFFLOAD_SECTION_MAGIC_STR ".llvm.offloading"
+#define OFFLOAD_SECTION_MAGIC_STR ".llvm.offloading."
 
+/// Information for a device offloading file extracted from the host.
 struct DeviceFile {
   DeviceFile(StringRef TheTriple, StringRef Arch, StringRef Filename)
   : TheTriple(TheTriple), Arch(Arch), Filename(Filename) {}
 
-  const Triple TheTriple;
+  const std::string TheTriple;
   const std::string Arch;
   const std::string Filename;
+
+  operator std::string() const { return TheTriple + "-" + Arch; }
 };
 
 namespace {
@@ -83,6 +88,16 @@
 extractFromBuffer(std::unique_ptr Buffer,
   SmallVectorImpl );
 
+static StringRef getDeviceFileExtension(StringRef DeviceTriple,
+bool IsBitcode = false) {
+  Triple TheTriple(DeviceTriple);
+  if (TheTriple.isAMDGPU() || IsBitcode)
+return "bc";
+  if (TheTriple.isNVPTX())
+return "cubin";
+  return "o";
+}
+
 Error runLinker(std::string &LinkerPath, SmallVectorImpl<std::string> &Args) {
   std::vector<StringRef> LinkerArgs;
   LinkerArgs.push_back(LinkerPath);
@@ -150,9 +165,12 @@
 
     if (Expected<StringRef> Contents = Sec.getContents()) {
       SmallString<128> TempFile;
+      StringRef DeviceExtension = getDeviceFileExtension(
+          DeviceTriple, identify_magic(*Contents) == file_magic::bitcode);
       if (std::error_code EC = sys::fs::createTemporaryFile(
-              Prefix + "-device-" + DeviceTriple, Extension, TempFile))
+              Prefix + "-device-" + DeviceTriple, DeviceExtension, TempFile))
         return createFileError(TempFile, EC);
+      TempFiles.push_back(static_cast<std::string>(TempFile));
 
       Expected<std::unique_ptr<FileOutputBuffer>> OutputOrErr =
           FileOutputBuffer::create(TempFile, Sec.getSize());
@@ -173,10 +191,7 @@
 
   // We will use llvm-strip to remove the now unneeded section containing the
   // offloading code.
-  ErrorOr<std::string> StripPath = sys::findProgramByName(
-      "llvm-strip", sys::path::parent_path(LinkerExecutable));
-  if (!StripPath)
-    StripPath = sys::findProgramByName("llvm-strip");
+  ErrorOr<std::string> StripPath = sys::findProgramByName("llvm-strip");
   if (!StripPath)
     return createStringError(StripPath.getError(),
                              "Unable to find 'llvm-strip' in path");
@@ -185,6 +200,7 @@
   if (std::error_code EC =
   sys::fs::createTemporaryFile(Prefix + "-host", Extension, TempFile))
 return createFileError(TempFile, EC);
+  TempFiles.push_back(static_cast<std::string>(TempFile));
 
   SmallVector StripArgs;
   StripArgs.push_back(*StripPath);
@@ -237,9 +253,12 @@
 
     StringRef Contents = CDS->getAsString();
     SmallString<128> TempFile;
+    StringRef DeviceExtension = getDeviceFileExtension(
+        DeviceTriple, identify_magic(Contents) == file_magic::bitcode);
     if (std::error_code EC = sys::fs::createTemporaryFile(
-            Prefix + "-device-" + DeviceTriple, Extension, TempFile))
+            Prefix + "-device-" + DeviceTriple, DeviceExtension, TempFile))
       return createFileError(TempFile, EC);
+    TempFiles.push_back(static_cast<std::string>(TempFile));
 
     Expected<std::unique_ptr<FileOutputBuffer>> OutputOrErr =
         FileOutputBuffer::create(TempFile, Contents.size());
@@ -271,6 +290,8 @@
   if (std::error_code EC =
   sys::fs::createTemporaryFile(Prefix + "-host", Extension, TempFile))
 return createFileError(TempFile, EC);
+  TempFiles.push_back(static_cast<std::string>(TempFile));
+
   std::error_code EC;
   raw_fd_ostream HostOutput(TempFile, EC, sys::fs::OF_None);
   if (EC)
@@ -341,6 +362,7 @@
   if (std::error_code EC =
   sys::fs::createTemporaryFile(Prefix + "-host", Extension, TempFile))
 return createFileError(TempFile, EC);
+  TempFiles.push_back(static_cast<std::string>(TempFile));
 
   std::unique_ptr<MemoryBuffer> Buffer =
       MemoryBuffer::getMemBuffer(Library.getMemoryBufferRef(), false);

[PATCH] D116975: [OpenMP] Initial Implementation of LTO and bitcode linking in linker wrapper

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 404528.
jhuber6 added a comment.

Moving adding OpenMPOpt to LTO pipeline to a new patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116975/new/

https://reviews.llvm.org/D116975

Files:
  clang/lib/Driver/Driver.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/tools/clang-linker-wrapper/CMakeLists.txt
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -17,9 +17,12 @@
 #include "clang/Basic/Version.h"
 #include "llvm/BinaryFormat/Magic.h"
 #include "llvm/Bitcode/BitcodeWriter.h"
+#include "llvm/CodeGen/CommandFlags.h"
 #include "llvm/IR/Constants.h"
+#include "llvm/IR/DiagnosticPrinter.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IRReader/IRReader.h"
+#include "llvm/LTO/LTO.h"
 #include "llvm/Object/Archive.h"
 #include "llvm/Object/ArchiveWriter.h"
 #include "llvm/Object/Binary.h"
@@ -36,6 +39,7 @@
 #include "llvm/Support/Signals.h"
 #include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/StringSaver.h"
+#include "llvm/Support/TargetSelect.h"
 #include "llvm/Support/WithColor.h"
 #include "llvm/Support/raw_ostream.h"
 
@@ -58,6 +62,15 @@
cl::desc("Path of linker binary"),
cl::cat(ClangLinkerWrapperCategory));
 
+static cl::opt<std::string>
+    TargetFeatures("target-feature", cl::desc("Target features for triple"),
+                   cl::cat(ClangLinkerWrapperCategory));
+
+static cl::opt<std::string> OptLevel("opt-level",
+                                     cl::desc("Optimization level for LTO"),
+                                     cl::init("O0"),
+                                     cl::cat(ClangLinkerWrapperCategory));
+
 // Do not parse linker options.
 static cl::list<std::string>
     HostLinkerArgs(cl::Sink, cl::desc("..."));
@@ -68,6 +81,9 @@
 /// Temporary files created by the linker wrapper.
 static SmallVector TempFiles;
 
+/// Codegen flags for LTO backend.
+static codegen::RegisterCodeGenFlags CodeGenFlags;
+
 /// Magic section string that marks the existence of offloading data. The
 /// section string will be formatted as `.llvm.offloading.<triple>.<arch>`.
 #define OFFLOAD_SECTION_MAGIC_STR ".llvm.offloading."
@@ -191,6 +207,28 @@
   if (ToBeStripped.empty())
 return None;
 
+  // If the object file to strip doesn't exist we need to write it so we can
+  // pass it to llvm-strip.
+  SmallString<128> StripFile = Obj.getFileName();
+  if (!sys::fs::exists(StripFile)) {
+    SmallString<128> TempFile;
+    if (std::error_code EC = sys::fs::createTemporaryFile(
+            sys::path::stem(StripFile), "o", TempFile))
+      return createFileError(TempFile, EC);
+    TempFiles.push_back(static_cast<std::string>(TempFile));
+
+    auto Contents = Obj.getMemoryBufferRef().getBuffer();
+    Expected<std::unique_ptr<FileOutputBuffer>> OutputOrErr =
+        FileOutputBuffer::create(TempFile, Contents.size());
+    if (!OutputOrErr)
+      return OutputOrErr.takeError();
+    std::unique_ptr<FileOutputBuffer> Output = std::move(*OutputOrErr);
+    std::copy(Contents.begin(), Contents.end(), Output->getBufferStart());
+    if (Error E = Output->commit())
+      return E;
+    StripFile = TempFile;
+  }
+
   // We will use llvm-strip to remove the now unneeded section containing the
   // offloading code.
   ErrorOr<std::string> StripPath = sys::findProgramByName(
@@ -210,7 +248,7 @@
   SmallVector StripArgs;
   StripArgs.push_back(*StripPath);
   StripArgs.push_back("--no-strip-all");
-  StripArgs.push_back(Obj.getFileName());
+  StripArgs.push_back(StripFile);
   for (auto &Section : ToBeStripped) {
     StripArgs.push_back("--remove-section");
     StripArgs.push_back(Section);
@@ -411,6 +449,44 @@
 
 // TODO: Move these to a separate file.
 namespace nvptx {
+Expected<std::string> assemble(StringRef InputFile, Triple TheTriple,
+                               StringRef Arch) {
+  // NVPTX uses the ptxas binary to create device object files.
+  ErrorOr<std::string> PtxasPath =
+      sys::findProgramByName("ptxas", sys::path::parent_path(LinkerExecutable));
+  if (!PtxasPath)
+    PtxasPath = sys::findProgramByName("ptxas");
+  if (!PtxasPath)
+    return createStringError(PtxasPath.getError(),
+                             "Unable to find 'ptxas' in path");
+
+  // Create a new file to write the linked device image to.
+  SmallString<128> TempFile;
+  if (std::error_code EC = sys::fs::createTemporaryFile(
+          TheTriple.getArchName() + "-" + Arch, "cubin", TempFile))
+    return createFileError(TempFile, EC);
+  TempFiles.push_back(static_cast<std::string>(TempFile));
+
+  // TODO: Pass in arguments like `-g` and `-v` from the driver.
+  SmallVector CmdArgs;
+  std::string Opt = "-" + OptLevel;
+  CmdArgs.push_back(*PtxasPath);
+  CmdArgs.push_back(TheTriple.isArch64Bit() ? "-m64" : "-m32");
+  

[PATCH] D116541: [OpenMP] Introduce new flag to change offloading driver pipeline

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

In D116541#3285455, @thakis wrote:

> Looks like this breaks tests on macOS: 
> http://45.33.8.238/macm1/26856/step_7.txt
>
> Please take a look and revert for now if it takes a while to fix (maybe just 
> needs an explicit triple?)

Not sure what I expected when I hard-coded the host-triple in the test. I 
pushed a change in rGb79e2a1ccd3b, can you check it again?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116541/new/

https://reviews.llvm.org/D116541



[PATCH] D116544: [Clang] Introduce Clang Linker Wrapper Tool

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG95c8f7464092: [Clang] Introduce Clang Linker Wrapper Tool 
(authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116544/new/

https://reviews.llvm.org/D116544

Files:
  clang/docs/ClangLinkerWrapper.rst
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Driver/Action.h
  clang/include/clang/Driver/Job.h
  clang/include/clang/Driver/ToolChain.h
  clang/lib/Driver/Action.cpp
  clang/lib/Driver/Driver.cpp
  clang/lib/Driver/ToolChain.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/Clang.h
  clang/tools/CMakeLists.txt
  clang/tools/clang-linker-wrapper/CMakeLists.txt
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp


[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG551b17745244: [OpenMP] Add a flag for embedding a file into 
the module (authored by jhuber6).

Changed prior to commit:
  https://reviews.llvm.org/D116542?vs=404506&id=404685#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116542/new/

https://reviews.llvm.org/D116542

Files:
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/CodeGen/BackendUtil.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/CodeGen/CodeGenAction.cpp
  clang/test/Frontend/embed-object.ll
  llvm/include/llvm/Bitcode/BitcodeWriter.h
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/Bitcode/Writer/CMakeLists.txt

Index: llvm/lib/Bitcode/Writer/CMakeLists.txt
===
--- llvm/lib/Bitcode/Writer/CMakeLists.txt
+++ llvm/lib/Bitcode/Writer/CMakeLists.txt
@@ -11,6 +11,7 @@
   Analysis
   Core
   MC
+  TransformUtils
   Object
   Support
   )
Index: llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
===
--- llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -69,6 +69,7 @@
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/SHA1.h"
 #include "llvm/Support/raw_ostream.h"
+#include "llvm/Transforms/Utils/ModuleUtils.h"
 #include 
 #include 
 #include 
@@ -4973,3 +4974,19 @@
   llvm::ConstantArray::get(ATy, UsedArray), "llvm.compiler.used");
   NewUsed->setSection("llvm.metadata");
 }
+
+void llvm::EmbedBufferInModule(llvm::Module &M, llvm::MemoryBufferRef Buf,
+                               StringRef SectionName) {
+  ArrayRef<char> ModuleData =
+      ArrayRef<char>(Buf.getBufferStart(), Buf.getBufferSize());
+
+  // Embed the buffer into the module.
+  llvm::Constant *ModuleConstant =
+      llvm::ConstantDataArray::get(M.getContext(), ModuleData);
+  llvm::GlobalVariable *GV = new llvm::GlobalVariable(
+      M, ModuleConstant->getType(), true, llvm::GlobalValue::PrivateLinkage,
+      ModuleConstant, "llvm.embedded.object");
+  GV->setSection(SectionName);
+
+  appendToCompilerUsed(M, GV);
+}
Index: llvm/include/llvm/Bitcode/BitcodeWriter.h
===
--- llvm/include/llvm/Bitcode/BitcodeWriter.h
+++ llvm/include/llvm/Bitcode/BitcodeWriter.h
@@ -165,6 +165,11 @@
 bool EmbedCmdline,
 const std::vector );
 
+  /// Embeds the memory buffer \p Buf into the module \p M as a global using the
+  /// section name \p SectionName.
+  void EmbedBufferInModule(Module &M, MemoryBufferRef Buf,
+                           StringRef SectionName);
+
 } // end namespace llvm
 
 #endif // LLVM_BITCODE_BITCODEWRITER_H
Index: clang/test/Frontend/embed-object.ll
===
--- /dev/null
+++ clang/test/Frontend/embed-object.ll
@@ -0,0 +1,15 @@
+; RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \
+; RUN:-fembed-offload-object=%S/Inputs/empty.h,section1 \
+; RUN:-fembed-offload-object=%S/Inputs/empty.h,section2 -x ir %s -o - \
+; RUN:| FileCheck %s -check-prefix=CHECK
+
+; CHECK: @[[OBJECT1:.+]] = private constant [0 x i8] zeroinitializer, section ".llvm.offloading.section1"
+; CHECK: @[[OBJECT2:.+]] = private constant [0 x i8] zeroinitializer, section ".llvm.offloading.section2"
+; CHECK: @llvm.compiler.used = appending global [3 x i8*] [i8* @x, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @[[OBJECT1]], i32 0, i32 0), i8* getelementptr inbounds ([0 x i8], [0 x i8]* @[[OBJECT2]], i32 0, i32 0)], section "llvm.metadata"
+
+@x = private constant i8 1
+@llvm.compiler.used = appending global [1 x i8*] [i8* @x], section "llvm.metadata"
+
+define i32 @foo() {
+  ret i32 0
+}
Index: clang/lib/CodeGen/CodeGenAction.cpp
===
--- clang/lib/CodeGen/CodeGenAction.cpp
+++ clang/lib/CodeGen/CodeGenAction.cpp
@@ -1134,6 +1134,7 @@
 TheModule->setTargetTriple(TargetOpts.Triple);
   }
 
+  EmbedObject(TheModule.get(), CodeGenOpts, Diagnostics);
   EmbedBitcode(TheModule.get(), CodeGenOpts, *MainFile);
 
   LLVMContext  = TheModule->getContext();
Index: clang/lib/CodeGen/BackendUtil.cpp
===
--- clang/lib/CodeGen/BackendUtil.cpp
+++ clang/lib/CodeGen/BackendUtil.cpp
@@ -1750,3 +1750,31 @@
   CGOpts.getEmbedBitcode() != CodeGenOptions::Embed_Bitcode,
   CGOpts.CmdArgs);
 }
+
+void clang::EmbedObject(llvm::Module *M, const CodeGenOptions ,
+DiagnosticsEngine ) {
+  if (CGOpts.OffloadObjects.empty())
+return;
+
+  for (StringRef OffloadObject : CGOpts.OffloadObjects) {
+if (OffloadObject.count(',') != 1) {
+  

[PATCH] D116541: [OpenMP] Introduce new flag to change offloading driver pipeline

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG2f9ace9e9a58: [OpenMP] Introduce new flag to change 
offloading driver pipeline (authored by jhuber6).

Changed prior to commit:
  https://reviews.llvm.org/D116541?vs=397089&id=404684#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116541/new/

https://reviews.llvm.org/D116541

Files:
  clang/include/clang/Driver/Driver.h
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/Driver.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/openmp-offload-gpu.c

Index: clang/test/Driver/openmp-offload-gpu.c
===
--- clang/test/Driver/openmp-offload-gpu.c
+++ clang/test/Driver/openmp-offload-gpu.c
@@ -350,3 +350,13 @@
 
 // TRIPLE: "-triple" "nvptx64-nvidia-cuda"
 // TRIPLE: "-target-cpu" "sm_35"
+
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda \
+// RUN:  -fopenmp-new-driver -no-canonical-prefixes -ccc-print-bindings %s -o openmp-offload-gpu 2>&1 \
+// RUN:   | FileCheck -check-prefix=NEW_DRIVER %s
+
+// NEW_DRIVER: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_INPUT:.+]]"], output: "[[HOST_BC:.+]]" 
+// NEW_DRIVER: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[DEVICE_INPUT:.+]]", "[[HOST_BC]]"], output: "[[DEVICE_ASM:.+]]"
+// NEW_DRIVER: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_ASM]]"], output: "[[DEVICE_OBJ:.+]]" 
+// NEW_DRIVER: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[DEVICE_OBJ]]"], output: "[[HOST_OBJ:.+]]" 
+// NEW_DRIVER: "x86_64-unknown-linux-gnu" - "[[LINKER:.+]]", inputs: ["[[HOST_OBJ]]"], output: "openmp-offload-gpu"
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -4351,6 +4351,7 @@
   bool IsHIP = JA.isOffloading(Action::OFK_HIP);
   bool IsHIPDevice = JA.isDeviceOffloading(Action::OFK_HIP);
   bool IsOpenMPDevice = JA.isDeviceOffloading(Action::OFK_OpenMP);
+  bool IsOpenMPHost = JA.isHostOffloading(Action::OFK_OpenMP);
   bool IsHeaderModulePrecompile = isa(JA);
   bool IsDeviceOffloadAction = !(JA.isDeviceOffloading(Action::OFK_None) ||
  JA.isDeviceOffloading(Action::OFK_Host));
@@ -4371,6 +4372,7 @@
   InputInfoList ModuleHeaderInputs;
   const InputInfo *CudaDeviceInput = nullptr;
   const InputInfo *OpenMPDeviceInput = nullptr;
+  const InputInfo *OpenMPHostInput = nullptr;
   for (const InputInfo &I : Inputs) {
     if (&I == &Input) {
   // This is the primary input.
@@ -4387,6 +4389,8 @@
       CudaDeviceInput = &I;
     } else if (IsOpenMPDevice && !OpenMPDeviceInput) {
       OpenMPDeviceInput = &I;
+    } else if (IsOpenMPHost && !OpenMPHostInput) {
+      OpenMPHostInput = &I;
 } else {
   llvm_unreachable("unexpectedly given multiple inputs");
 }
Index: clang/lib/Driver/Driver.cpp
===
--- clang/lib/Driver/Driver.cpp
+++ clang/lib/Driver/Driver.cpp
@@ -3830,6 +3830,11 @@
   // Builder to be used to build offloading actions.
   OffloadingActionBuilder OffloadBuilder(C, Args, Inputs);
 
+  // Offload kinds active for this compilation.
+  unsigned OffloadKinds = Action::OFK_None;
+  if (C.hasOffloadToolChain())
+OffloadKinds |= Action::OFK_OpenMP;
+
   // Construct the actions to perform.
   HeaderModulePrecompileJobAction *HeaderModuleAction = nullptr;
   ActionList LinkerInputs;
@@ -3850,14 +3855,16 @@
 
 // Use the current host action in any of the offloading actions, if
 // required.
-if (OffloadBuilder.addHostDependenceToDeviceActions(Current, InputArg))
-  break;
+if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+  if (OffloadBuilder.addHostDependenceToDeviceActions(Current, InputArg))
+break;
 
 for (phases::ID Phase : PL) {
 
   // Add any offload action the host action depends on.
-  Current = OffloadBuilder.addDeviceDependencesToHostAction(
-  Current, InputArg, Phase, PL.back(), FullPL);
+  if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+Current = OffloadBuilder.addDeviceDependencesToHostAction(
+Current, InputArg, Phase, PL.back(), FullPL);
   if (!Current)
 break;
 
@@ -3890,6 +3897,11 @@
 break;
   }
 
+  // Try to build the offloading actions and add the result as a dependency
+  // to the host.
+  if (Args.hasArg(options::OPT_fopenmp_new_driver))
+Current = BuildOffloadingActions(C, Args, I, Current);
+
   // FIXME: Should we include any prior module file outputs as inputs of
   // later actions in the same command line?
 
@@ -3907,8 +3919,9 @@
 
   // Use the current host action in any of the offloading actions, if
   // 

[PATCH] D116543: [OpenMP] Embed device files into the host IR

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG12ae095bbb63: [OpenMP] Embed device files into the host IR 
(authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116543/new/

https://reviews.llvm.org/D116543

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/openmp-offload-gpu.c


Index: clang/test/Driver/openmp-offload-gpu.c
===
--- clang/test/Driver/openmp-offload-gpu.c
+++ clang/test/Driver/openmp-offload-gpu.c
@@ -360,3 +360,10 @@
 // NEW_DRIVER: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_ASM]]"], output: "[[DEVICE_OBJ:.+]]"
 // NEW_DRIVER: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[DEVICE_OBJ]]"], output: "[[HOST_OBJ:.+]]"
 // NEW_DRIVER: "x86_64-unknown-linux-gnu" - "[[LINKER:.+]]", inputs: ["[[HOST_OBJ]]"], output: "openmp-offload-gpu"
+
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvida-cuda -march=sm_70 \
+// RUN:  --libomptarget-nvptx-bc-path=%S/Inputs/libomptarget/libomptarget-new-nvptx-test.bc \
+// RUN:  -fopenmp-new-driver -no-canonical-prefixes -nogpulib %s -o openmp-offload-gpu 2>&1 \
+// RUN:   | FileCheck -check-prefix=NEW_DRIVER_EMBEDDING %s
+
+// NEW_DRIVER_EMBEDDING: -fembed-offload-object=[[CUBIN:.*\.cubin]],nvptx64-nvidia-cuda.sm_70
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -4370,9 +4370,9 @@
   IsHeaderModulePrecompile ? HeaderModuleInput : Inputs[0];
 
   InputInfoList ModuleHeaderInputs;
+  InputInfoList OpenMPHostInputs;
   const InputInfo *CudaDeviceInput = nullptr;
   const InputInfo *OpenMPDeviceInput = nullptr;
-  const InputInfo *OpenMPHostInput = nullptr;
   for (const InputInfo &I : Inputs) {
 if (&I == &Input) {
   // This is the primary input.
@@ -4389,8 +4389,8 @@
   CudaDeviceInput = &I;
 } else if (IsOpenMPDevice && !OpenMPDeviceInput) {
   OpenMPDeviceInput = &I;
-} else if (IsOpenMPHost && !OpenMPHostInput) {
-  OpenMPHostInput = &I;
+} else if (IsOpenMPHost) {
+  OpenMPHostInputs.push_back(I);
 } else {
   llvm_unreachable("unexpectedly given multiple inputs");
 }
@@ -6897,6 +6897,25 @@
 }
   }
 
+  // Host-side OpenMP offloading receives the device object files and embeds them
+  // in a named section including the associated target triple and architecture.
+  if (IsOpenMPHost && !OpenMPHostInputs.empty()) {
+auto InputFile = OpenMPHostInputs.begin();
+auto OpenMPTCs = C.getOffloadToolChains();
+for (auto TI = OpenMPTCs.first, TE = OpenMPTCs.second; TI != TE;
+ ++TI, ++InputFile) {
+  const ToolChain *TC = TI->second;
+  const ArgList &TCArgs = C.getArgsForToolChain(TC, "", Action::OFK_OpenMP);
+  StringRef File =
+  C.getArgs().MakeArgString(TC->getInputFilename(*InputFile));
+  StringRef InputName = Clang::getBaseInputStem(Args, Inputs);
+
+  CmdArgs.push_back(Args.MakeArgString(
+  "-fembed-offload-object=" + File + "," + TC->getTripleString() + "." +
+  TCArgs.getLastArgValue(options::OPT_march_EQ) + "." + InputName));
+}
+  }
+
   if (Triple.isAMDGPU()) {
 handleAMDGPUCodeObjectVersionOptions(D, Args, CmdArgs);
 
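
For reference, the argument assembled in the hunk above has the shape `-fembed-offload-object=<file>,<triple>.<arch>.<input stem>`. The following is only a minimal sketch of that layout using `std::string`; `makeEmbedOffloadArg` is a hypothetical helper name and is not part of the patch, which builds the string with `Args.MakeArgString` directly.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of the patch: builds the same string the
// patch passes to CmdArgs, using std::string instead of LLVM's Twine.
std::string makeEmbedOffloadArg(const std::string &File,
                                const std::string &Triple,
                                const std::string &Arch,
                                const std::string &InputName) {
  // Layout taken from the patch: <file>,<triple>.<arch>.<input stem>
  return "-fembed-offload-object=" + File + "," + Triple + "." + Arch + "." +
         InputName;
}
```

With a `.cubin` input, the triple `nvptx64-nvidia-cuda` and `-march=sm_70`, the result starts with the prefix that the `NEW_DRIVER_EMBEDDING` check line in the test matches.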



[PATCH] D116544: [Clang] Introduce Clang Linker Wrapper Tool

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

In D116544#3285369, @jyknight wrote:

> "clang-linker-wrapper" seems like a very generic name for a command which is 
> OpenMP offloading specific?

This could potentially be used to handle CUDA / HIP offloading as well, so I 
deliberately chose not to bind it too tightly to OpenMP offloading. I can change 
the name if people think that it's too confusing.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116544/new/

https://reviews.llvm.org/D116544

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D116975: [OpenMP] Initial Implementation of LTO and bitcode linking in linker wrapper

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:8154
+for (auto TI = OpenMPTCRange.first, TE = OpenMPTCRange.second; TI != TE;
+ ++TI) {
+  const ToolChain *TC = TI->second;

jdoerfert wrote:
> Nit: maybe `for (auto  : make_range(OpenMPTCRange.first, 
> OpenMPTCRange.second))`
Will do, forgot about that helper.
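
The helper in question is `llvm::make_range`, which wraps a begin/end iterator pair so it can drive a range-based `for`. The sketch below shows the idea without LLVM headers; `IteratorRange`, `makeRange` and `collectToolChains` are stand-in names invented for this illustration, and the `std::multimap` only approximates the offload toolchain container.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for llvm::iterator_range so the sketch builds without LLVM.
template <typename It> struct IteratorRange {
  It First, Last;
  It begin() const { return First; }
  It end() const { return Last; }
};

// Stand-in for llvm::make_range(begin, end).
template <typename It> IteratorRange<It> makeRange(It First, It Last) {
  return {First, Last};
}

// Hypothetical: gather the names registered under one offload kind.
std::vector<std::string>
collectToolChains(const std::multimap<int, std::string> &TCs, int Kind) {
  std::vector<std::string> Names;
  auto Range = TCs.equal_range(Kind); // pair of iterators, like OpenMPTCRange
  for (const auto &Entry : makeRange(Range.first, Range.second))
    Names.push_back(Entry.second);
  return Names;
}
```

The loop body sees each map entry directly, so the explicit `TI != TE; ++TI` bookkeeping from the original loop goes away.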



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:8180
+CmdArgs.push_back(Args.MakeArgString(Twine("-opt-level=O") + OOpt));
+}
+  }

jdoerfert wrote:
> I thought there is a helper somewhere that does this translation, isn't there?
There might be, but I didn't find an easy one. I copied this from the GNU 
toolchain's handling of the LTO arguments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116975/new/

https://reviews.llvm.org/D116975



[PATCH] D117048: [OpenMP] Link the bitcode library late for device LTO

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:75
+static cl::opt
+BitcodeLibrary("target-library",
+   cl::desc("Path for the target bitcode library"),

tianshilei1992 wrote:
> `target-library` is not the common name we call it. Maybe 
> `device-runtime-library`?
I think I called it `bitcode-library` later, since technically it's just any 
bitcode file that's linked in during LTO.



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:986
+  if (!BitcodeLibrary.empty()) {
+// FIXME: Hacky workaround to avoid a backend crash at O0.
+if (OptLevel[1] - '0' == 0)

tianshilei1992 wrote:
> Is this still needed now?
It still breaks, but I removed it later. My ordering for these patches is a bit 
of a mess, sorry.
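
For context, the landed D117048 diff parses the option value with `StringRef::split('=')` followed by `rsplit('-')` on the left half, which implies a `<triple>-<arch>=<path>` format, and bumps `O0` to `O1` as the workaround quoted above. The sketch below mirrors both steps with `std::string`; the struct and function names are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical record for the three pieces of a "<triple>-<arch>=<path>" value.
struct BitcodeLibrarySpec {
  std::string Triple, Arch, Path;
};

// Split at the first '=' and then at the last '-', mirroring the semantics of
// StringRef::split / rsplit (a missing separator leaves the second part empty).
BitcodeLibrarySpec parseBitcodeLibrary(const std::string &Value) {
  BitcodeLibrarySpec Spec;
  const auto Eq = Value.find('=');
  const std::string Device = Value.substr(0, Eq);
  if (Eq != std::string::npos)
    Spec.Path = Value.substr(Eq + 1);
  const auto Dash = Device.rfind('-');
  Spec.Triple = Device.substr(0, Dash);
  if (Dash != std::string::npos)
    Spec.Arch = Device.substr(Dash + 1);
  return Spec;
}

// The O0 workaround: rewrite "O0" to "O1", leave other levels untouched.
std::string avoidO0(std::string OptLevel) {
  if (OptLevel.size() > 1 && OptLevel[1] == '0')
    OptLevel[1] = '1';
  return OptLevel;
}
```

Splitting on the last `-` matters because the triple itself contains dashes, so only the trailing component is treated as the architecture.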


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117048/new/

https://reviews.llvm.org/D117048



[PATCH] D117156: [OpenMP] Add extra flag handling to linker wrapper

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 marked an inline comment as done.
jhuber6 added inline comments.



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:98
+static cl::opt
+PtxasOption("ptxas-option", cl::ZeroOrMore,
+cl::desc("Argument to pass to the ptxas invocation"),

tianshilei1992 wrote:
> `ptxas-args`?
changed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117156/new/

https://reviews.llvm.org/D117156



[PATCH] D117049: [OpenMP] Add support for embedding bitcode images in wrapper tool

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 404785.
jhuber6 added a comment.

Removing clang flag because LTO won't be supported when these land.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117049/new/

https://reviews.llvm.org/D117049

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -76,12 +76,18 @@
cl::desc("Path for the target bitcode library"),
cl::cat(ClangLinkerWrapperCategory));
 
+static cl::opt EmbedBC(
+"target-embed-bc", cl::ZeroOrMore,
+cl::desc("Embed linked bitcode instead of an executable device image."),
+cl::init(false), cl::cat(ClangLinkerWrapperCategory));
+
 // Do not parse linker options.
 static cl::list
-HostLinkerArgs(cl::Sink, cl::desc("..."));
+HostLinkerArgs(cl::Positional,
+   cl::desc("..."));
 
 /// Path of the current binary.
-static std::string LinkerExecutable;
+static const char *LinkerExecutable;
 
 /// Temporary files created by the linker wrapper.
 static SmallVector TempFiles;
@@ -411,8 +417,8 @@
 
   std::unique_ptr Buffer =
   MemoryBuffer::getMemBuffer(Library.getMemoryBufferRef(), false);
-  if (Error Err = writeArchive(TempFile, Members, true, Library.kind(),
-true, Library.isThin(), std::move(Buffer)))
+  if (Error Err = writeArchive(TempFile, Members, true, Library.kind(), true,
+   Library.isThin(), std::move(Buffer)))
 return std::move(Err);
 
   return static_cast(TempFile);
@@ -489,7 +495,7 @@
   return static_cast(TempFile);
 }
 
-Expected link(ArrayRef InputFiles,
+Expected link(ArrayRef InputFiles,
ArrayRef LinkerArgs, Triple TheTriple,
StringRef Arch) {
   // NVPTX uses the nvlink binary to link device object files.
@@ -520,7 +526,7 @@
   CmdArgs.push_back(Arg);
 
   // Add extracted input files.
-  for (auto Input : InputFiles)
+  for (StringRef Input : InputFiles)
 CmdArgs.push_back(Input);
 
   if (sys::ExecuteAndWait(*NvlinkPath, CmdArgs))
@@ -530,7 +536,7 @@
 }
 } // namespace nvptx
 
-Expected linkDevice(ArrayRef InputFiles,
+Expected linkDevice(ArrayRef InputFiles,
  ArrayRef LinkerArgs,
  Triple TheTriple, StringRef Arch) {
   switch (TheTriple.getArch()) {
@@ -597,8 +603,10 @@
   llvm_unreachable("Invalid optimization level");
 }
 
-std::unique_ptr createLTO(const Triple , StringRef Arch,
-bool WholeProgram) {
+template >
+std::unique_ptr createLTO(
+const Triple , StringRef Arch, bool WholeProgram,
+ModuleHook Hook = [](size_t, const Module &) { return true; }) {
   lto::Config Conf;
   lto::ThinBackend Backend;
   // TODO: Handle index-only thin-LTO
@@ -617,7 +625,7 @@
   Conf.PTO.LoopVectorization = Conf.OptLevel > 1;
   Conf.PTO.SLPVectorization = Conf.OptLevel > 1;
 
-  // TODO: Handle outputting bitcode using a module hook.
+  Conf.PostInternalizeModuleHook = Hook;
   if (TheTriple.isNVPTX())
 Conf.CGFileType = CGFT_AssemblyFile;
   else
@@ -637,11 +645,11 @@
  [](char C) { return C == '_' || isAlnum(C); });
 }
 
-Expected> linkBitcodeFiles(ArrayRef InputFiles,
- const Triple ,
- StringRef Arch) {
+Error linkBitcodeFiles(SmallVectorImpl ,
+   const Triple , StringRef Arch) {
   SmallVector, 4> SavedBuffers;
   SmallVector, 4> BitcodeFiles;
+  SmallVector NewInputFiles;
   StringMap UsedInRegularObj;
 
   // Search for bitcode files in the input and create an LTO input file. If it
@@ -660,6 +668,7 @@
   if (!ObjFile)
 return ObjFile.takeError();
 
+  NewInputFiles.push_back(File.str());
   for (auto  : (*ObjFile)->symbols()) {
 Expected Name = Sym.getName();
 if (!Name)
@@ -679,12 +688,36 @@
   }
 
   if (BitcodeFiles.empty())
-return None;
+return Error::success();
+
+  auto HandleError = [&](std::error_code EC) {
+logAllUnhandledErrors(errorCodeToError(EC),
+  WithColor::error(errs(), LinkerExecutable));
+exit(1);
+  };
+
+  // LTO Module hook to output bitcode without running the backend.
+  auto LinkOnly = [&](size_t Task, const Module ) {
+SmallString<128> TempFile;
+if (std::error_code EC = sys::fs::createTemporaryFile(
+"jit-" + TheTriple.getTriple(), "bc", TempFile))
+  HandleError(EC);
+std::error_code EC;
+raw_fd_ostream LinkedBitcode(TempFile, EC, sys::fs::OF_None);
+if (EC)
+  HandleError(EC);
+

[PATCH] D118155: [OpenMP] Improve symbol resolution for OpenMP Offloading LTO

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:777-781
+// Record if we've seen these symbols in any object or shared libraries.
+if ((*ObjFile)->isRelocatableObject()) {
+  UsedInRegularObj[*Name] = true;
+} else
+  UsedInSharedLib[*Name] = true;

tianshilei1992 wrote:
> 
Fixed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118155/new/

https://reviews.llvm.org/D118155



[PATCH] D118198: [OpenMP] Remove call to 'clang-offload-wrapper' binary

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG19fac745e322: [OpenMP] Remove call to 
clang-offload-wrapper binary (authored by jhuber6).

Changed prior to commit:
  https://reviews.llvm.org/D118198?vs=403043&id=404797#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118198/new/

https://reviews.llvm.org/D118198

Files:
  clang/tools/clang-linker-wrapper/CMakeLists.txt
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
  clang/tools/clang-linker-wrapper/OffloadWrapper.cpp
  clang/tools/clang-linker-wrapper/OffloadWrapper.h

Index: clang/tools/clang-linker-wrapper/OffloadWrapper.h
===
--- /dev/null
+++ clang/tools/clang-linker-wrapper/OffloadWrapper.h
@@ -0,0 +1,20 @@
+//===- OffloadWrapper.h ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_CLANG_LINKER_WRAPPER_OFFLOAD_WRAPPER_H
+#define LLVM_CLANG_TOOLS_CLANG_LINKER_WRAPPER_OFFLOAD_WRAPPER_H
+
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/IR/Module.h"
+
+/// Wrap the input device images into the module \p M as global symbols and
+/// registers the images with the OpenMP Offloading runtime libomptarget.
+llvm::Error wrapBinaries(llvm::Module &M,
+ llvm::ArrayRef<llvm::ArrayRef<char>> Images);
+
+#endif
Index: clang/tools/clang-linker-wrapper/OffloadWrapper.cpp
===
--- /dev/null
+++ clang/tools/clang-linker-wrapper/OffloadWrapper.cpp
@@ -0,0 +1,267 @@
+//===- OffloadWrapper.cpp ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "OffloadWrapper.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/Triple.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/GlobalVariable.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/LLVMContext.h"
+#include "llvm/IR/Module.h"
+#include "llvm/Transforms/Utils/ModuleUtils.h"
+
+using namespace llvm;
+
+namespace {
+
+IntegerType *getSizeTTy(Module &M) {
+  LLVMContext &C = M.getContext();
+  switch (M.getDataLayout().getPointerTypeSize(Type::getInt8PtrTy(C))) {
+  case 4u:
+return Type::getInt32Ty(C);
+  case 8u:
+return Type::getInt64Ty(C);
+  }
+  llvm_unreachable("unsupported pointer type size");
+}
+
+// struct __tgt_offload_entry {
+//   void *addr;
+//   char *name;
+//   size_t size;
+//   int32_t flags;
+//   int32_t reserved;
+// };
+StructType *getEntryTy(Module &M) {
+  LLVMContext &C = M.getContext();
+  StructType *EntryTy = StructType::getTypeByName(C, "__tgt_offload_entry");
+  if (!EntryTy)
+EntryTy = StructType::create("__tgt_offload_entry", Type::getInt8PtrTy(C),
+ Type::getInt8PtrTy(C), getSizeTTy(M),
+ Type::getInt32Ty(C), Type::getInt32Ty(C));
+  return EntryTy;
+}
+
+PointerType *getEntryPtrTy(Module &M) {
+  return PointerType::getUnqual(getEntryTy(M));
+}
+
+// struct __tgt_device_image {
+//   void *ImageStart;
+//   void *ImageEnd;
+//   __tgt_offload_entry *EntriesBegin;
+//   __tgt_offload_entry *EntriesEnd;
+// };
+StructType *getDeviceImageTy(Module &M) {
+  LLVMContext &C = M.getContext();
+  StructType *ImageTy = StructType::getTypeByName(C, "__tgt_device_image");
+  if (!ImageTy)
+ImageTy = StructType::create("__tgt_device_image", Type::getInt8PtrTy(C),
+ Type::getInt8PtrTy(C), getEntryPtrTy(M),
+ getEntryPtrTy(M));
+  return ImageTy;
+}
+
+PointerType *getDeviceImagePtrTy(Module &M) {
+  return PointerType::getUnqual(getDeviceImageTy(M));
+}
+
+// struct __tgt_bin_desc {
+//   int32_t NumDeviceImages;
+//   __tgt_device_image *DeviceImages;
+//   __tgt_offload_entry *HostEntriesBegin;
+//   __tgt_offload_entry *HostEntriesEnd;
+// };
+StructType *getBinDescTy(Module &M) {
+  LLVMContext &C = M.getContext();
+  StructType *DescTy = StructType::getTypeByName(C, "__tgt_bin_desc");
+  if (!DescTy)
+DescTy = StructType::create("__tgt_bin_desc", Type::getInt32Ty(C),
+getDeviceImagePtrTy(M), getEntryPtrTy(M),
+getEntryPtrTy(M));
+  return DescTy;
+}
+
+PointerType *getBinDescPtrTy(Module &M) {
+  return PointerType::getUnqual(getBinDescTy(M));
+}
+
+/// Creates binary descriptor for the given 
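
The struct layouts spelled out in the comments of the (truncated) diff above can be written as plain C++ to make the relationships easier to see. This is only a sketch: the type names `TgtOffloadEntry`, `TgtDeviceImage` and `TgtBinDesc` are renamed stand-ins for `__tgt_offload_entry`, `__tgt_device_image` and `__tgt_bin_desc` (double-underscore names are reserved in user code), and `imageSize` is a hypothetical helper that is not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirrors the commented layout of __tgt_offload_entry.
struct TgtOffloadEntry {
  void *addr;
  char *name;
  std::size_t size;
  std::int32_t flags;
  std::int32_t reserved;
};

// Mirrors __tgt_device_image: one embedded device binary plus its entries.
struct TgtDeviceImage {
  void *ImageStart;
  void *ImageEnd;
  TgtOffloadEntry *EntriesBegin;
  TgtOffloadEntry *EntriesEnd;
};

// Mirrors __tgt_bin_desc: the table of images handed to the runtime.
struct TgtBinDesc {
  std::int32_t NumDeviceImages;
  TgtDeviceImage *DeviceImages;
  TgtOffloadEntry *HostEntriesBegin;
  TgtOffloadEntry *HostEntriesEnd;
};

// Hypothetical helper: byte size of an image from its start/end pointers.
std::size_t imageSize(const TgtDeviceImage &Img) {
  return static_cast<std::size_t>(static_cast<const char *>(Img.ImageEnd) -
                                  static_cast<const char *>(Img.ImageStart));
}
```

Each image is described by a half-open `[ImageStart, ImageEnd)` byte range, and the descriptor points at an array of such images together with the host entry table.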

[PATCH] D116541: [OpenMP] Introduce new flag to change offloading driver pipeline

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

In D116541#3285927, @thakis wrote:

> Still failing: http://45.33.8.238/macm1/26873/step_7.txt

It seems that we are building host.bc twice here. That works, but isn't ideal. 
I prevent this manually by checking the cache to see whether one of the jobs 
was already created, but for some reason that doesn't seem to be happening 
here. I'll need to figure out how to reproduce this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116541/new/

https://reviews.llvm.org/D116541



[PATCH] D116541: [OpenMP] Introduce new flag to change offloading driver pipeline

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

In D116541#3285927, @thakis wrote:

> Still failing: http://45.33.8.238/macm1/26873/step_7.txt

Weird, can you show me what `-fopenmp -fopenmp-targets=nvptx64 
-fopenmp-new-driver -ccc-print-bindings` looks like there? It doesn't seem to 
be getting one of the expected arguments, but I'm not sure why or how to 
reproduce it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116541/new/

https://reviews.llvm.org/D116541



[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

In D116542#3285857, @cmtice wrote:

> This change introduces a circular dependency: BitcodeWriters now depends on 
> TransformUtils, but TransformUtils also depends on BitcodeWriters.  This 
> appears to be a layering violation.

That might explain why it wasn't included before. Should I just copy the 
function here and remove the dependency?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116542/new/

https://reviews.llvm.org/D116542



[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

In D116542#3285983, @MaskRay wrote:

> @jhuber6 Please don't do 4a780aa13ee5e1c8268de54ef946200a270127b9... OK, I 
> was late.
>
> See D118666 for the proper fix.
>
> It'd be better to revert the relevant changes if that doesn't make you feel 
> bad.
> I can prepare the revert.

That's fine, I put it here originally because it was grouped with another 
similar function. But it's likely that one should be moved as well.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116542/new/

https://reviews.llvm.org/D116542



[PATCH] D118666: [ModuleUtils] Move EmbedBufferInModule to LLVMTransformsUtils

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 accepted this revision.
jhuber6 added a comment.
This revision is now accepted and ready to land.

LGTM, thanks for fixing this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118666/new/

https://reviews.llvm.org/D118666



[PATCH] D116975: [OpenMP] Initial Implementation of LTO and bitcode linking in linker wrapper

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGc732c3df749b: [OpenMP] Initial Implementation of LTO and 
bitcode linking in linker wrapper (authored by jhuber6).

Changed prior to commit:
  https://reviews.llvm.org/D116975?vs=404528&id=404790#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116975/new/

https://reviews.llvm.org/D116975

Files:
  clang/lib/Driver/Driver.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/tools/clang-linker-wrapper/CMakeLists.txt
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -17,9 +17,12 @@
 #include "clang/Basic/Version.h"
 #include "llvm/BinaryFormat/Magic.h"
 #include "llvm/Bitcode/BitcodeWriter.h"
+#include "llvm/CodeGen/CommandFlags.h"
 #include "llvm/IR/Constants.h"
+#include "llvm/IR/DiagnosticPrinter.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IRReader/IRReader.h"
+#include "llvm/LTO/LTO.h"
 #include "llvm/Object/Archive.h"
 #include "llvm/Object/ArchiveWriter.h"
 #include "llvm/Object/Binary.h"
@@ -36,6 +39,7 @@
 #include "llvm/Support/Signals.h"
 #include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/StringSaver.h"
+#include "llvm/Support/TargetSelect.h"
 #include "llvm/Support/WithColor.h"
 #include "llvm/Support/raw_ostream.h"
 
@@ -58,6 +62,15 @@
cl::desc("Path of linker binary"),
cl::cat(ClangLinkerWrapperCategory));
 
+static cl::opt
+TargetFeatures("target-feature", cl::desc("Target features for triple"),
+   cl::cat(ClangLinkerWrapperCategory));
+
+static cl::opt OptLevel("opt-level",
+ cl::desc("Optimization level for LTO"),
+ cl::init("O0"),
+ cl::cat(ClangLinkerWrapperCategory));
+
 // Do not parse linker options.
 static cl::list
 HostLinkerArgs(cl::Sink, cl::desc("..."));
@@ -68,6 +81,9 @@
 /// Temporary files created by the linker wrapper.
 static SmallVector TempFiles;
 
+/// Codegen flags for LTO backend.
+static codegen::RegisterCodeGenFlags CodeGenFlags;
+
 /// Magic section string that marks the existence of offloading data. The
 /// section string will be formatted as `.llvm.offloading.<triple>.<arch>`.
 #define OFFLOAD_SECTION_MAGIC_STR ".llvm.offloading."
@@ -191,6 +207,28 @@
   if (ToBeStripped.empty())
 return None;
 
+  // If the object file to strip doesn't exist we need to write it so we can
+  // pass it to llvm-strip.
+  SmallString<128> StripFile = Obj.getFileName();
+  if (!sys::fs::exists(StripFile)) {
+SmallString<128> TempFile;
+if (std::error_code EC = sys::fs::createTemporaryFile(
+sys::path::stem(StripFile), "o", TempFile))
+  return createFileError(TempFile, EC);
+TempFiles.push_back(static_cast(TempFile));
+
+auto Contents = Obj.getMemoryBufferRef().getBuffer();
+Expected> OutputOrErr =
+FileOutputBuffer::create(TempFile, Contents.size());
+if (!OutputOrErr)
+  return OutputOrErr.takeError();
+std::unique_ptr Output = std::move(*OutputOrErr);
+std::copy(Contents.begin(), Contents.end(), Output->getBufferStart());
+if (Error E = Output->commit())
+  return E;
+StripFile = TempFile;
+  }
+
   // We will use llvm-strip to remove the now unneeded section containing the
   // offloading code.
   ErrorOr StripPath = sys::findProgramByName("llvm-strip");
@@ -207,7 +245,7 @@
   SmallVector StripArgs;
   StripArgs.push_back(*StripPath);
   StripArgs.push_back("--no-strip-all");
-  StripArgs.push_back(Obj.getFileName());
+  StripArgs.push_back(StripFile);
   for (auto  : ToBeStripped) {
 StripArgs.push_back("--remove-section");
 StripArgs.push_back(Section);
@@ -408,6 +446,44 @@
 
 // TODO: Move these to a separate file.
 namespace nvptx {
+Expected assemble(StringRef InputFile, Triple TheTriple,
+   StringRef Arch) {
+  // NVPTX uses the nvlink binary to link device object files.
+  ErrorOr PtxasPath =
+  sys::findProgramByName("ptxas", sys::path::parent_path(LinkerExecutable));
+  if (!PtxasPath)
+PtxasPath = sys::findProgramByName("ptxas");
+  if (!PtxasPath)
+return createStringError(PtxasPath.getError(),
+ "Unable to find 'ptxas' in path");
+
+  // Create a new file to write the linked device image to.
+  SmallString<128> TempFile;
+  if (std::error_code EC = sys::fs::createTemporaryFile(
+  TheTriple.getArchName() + "-" + Arch, "cubin", TempFile))
+return createFileError(TempFile, EC);
+  TempFiles.push_back(static_cast(TempFile));
+
+  // TODO: Pass 

[PATCH] D117048: [OpenMP] Link the bitcode library late for device LTO

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG3762111aa960: [OpenMP] Link the bitcode library late for 
device LTO (authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117048/new/

https://reviews.llvm.org/D117048

Files:
  clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/Cuda.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -68,9 +68,14 @@
 
 static cl::opt OptLevel("opt-level",
  cl::desc("Optimization level for LTO"),
- cl::init("O0"),
+ cl::init("O2"),
  cl::cat(ClangLinkerWrapperCategory));
 
+static cl::opt
+BitcodeLibrary("target-library",
+   cl::desc("Path for the target bitcode library"),
+   cl::cat(ClangLinkerWrapperCategory));
+
 // Do not parse linker options.
 static cl::list
 HostLinkerArgs(cl::Sink, cl::desc("..."));
@@ -197,7 +202,7 @@
   std::unique_ptr Output = std::move(*OutputOrErr);
   std::copy(Contents->begin(), Contents->end(), Output->getBufferStart());
   if (Error E = Output->commit())
-return E;
+return std::move(E);
 
   DeviceFiles.emplace_back(DeviceTriple, Arch, TempFile);
   ToBeStripped.push_back(*Name);
@@ -225,7 +230,7 @@
 std::unique_ptr Output = std::move(*OutputOrErr);
 std::copy(Contents.begin(), Contents.end(), Output->getBufferStart());
 if (Error E = Output->commit())
-  return E;
+  return std::move(E);
 StripFile = TempFile;
   }
 
@@ -307,7 +312,7 @@
 std::unique_ptr Output = std::move(*OutputOrErr);
 std::copy(Contents.begin(), Contents.end(), Output->getBufferStart());
 if (Error E = Output->commit())
-  return E;
+  return std::move(E);
 
 DeviceFiles.emplace_back(DeviceTriple, Arch, TempFile);
 ToBeDeleted.push_back();
@@ -318,7 +323,7 @@
 
   // We need to materialize the lazy module before we make any changes.
   if (Error Err = M->materializeAll())
-return Err;
+return std::move(Err);
 
   // Remove the global from the module and write it to a new file.
   for (GlobalVariable *GV : ToBeDeleted) {
@@ -392,7 +397,7 @@
   }
 
   if (Err)
-return Err;
+return std::move(Err);
 
   if (!NewMembers)
 return None;
@@ -406,9 +411,9 @@
 
   std::unique_ptr Buffer =
   MemoryBuffer::getMemBuffer(Library.getMemoryBufferRef(), false);
-  if (Error WriteErr = writeArchive(TempFile, Members, true, Library.kind(),
+  if (Error Err = writeArchive(TempFile, Members, true, Library.kind(),
 true, Library.isThin(), std::move(Buffer)))
-return WriteErr;
+return std::move(Err);
 
   return static_cast(TempFile);
 }
@@ -726,7 +731,7 @@
 
 // Add the bitcode file with its resolved symbols to the LTO job.
 if (Error Err = LTOBackend->add(std::move(BitcodeFile), Resolutions))
-  return Err;
+  return std::move(Err);
   }
 
   // Run the LTO job to compile the bitcode.
@@ -744,7 +749,7 @@
 std::make_unique(FD, true));
   };
   if (Error Err = LTOBackend->run(AddStream))
-return Err;
+return std::move(Err);
 
   for (auto  : Files) {
 if (!TheTriple.isNVPTX())
@@ -957,6 +962,17 @@
 }
   }
 
+  // Add the device bitcode library to the device files if it was passed in.
+  if (!BitcodeLibrary.empty()) {
+// FIXME: Hacky workaround to avoid a backend crash at O0.
+if (OptLevel[1] - '0' == 0)
+  OptLevel[1] = '1';
+auto DeviceAndPath = StringRef(BitcodeLibrary).split('=');
+auto TripleAndArch = DeviceAndPath.first.rsplit('-');
+DeviceFiles.emplace_back(TripleAndArch.first, TripleAndArch.second,
+ DeviceAndPath.second);
+  }
+
   // Link the device images extracted from the linker input.
   SmallVector LinkedImages;
   if (Error Err = linkDeviceFiles(DeviceFiles, LinkerArgs, LinkedImages))
Index: clang/lib/Driver/ToolChains/Cuda.cpp
===
--- clang/lib/Driver/ToolChains/Cuda.cpp
+++ clang/lib/Driver/ToolChains/Cuda.cpp
@@ -744,6 +744,10 @@
   return;
 }
 
+// Link the bitcode library late if we're using device LTO.
+if (getDriver().isUsingLTO(/* IsOffload */ true))
+  return;
+
 std::string BitcodeSuffix;
 if (DriverArgs.hasFlag(options::OPT_fopenmp_target_new_runtime,
options::OPT_fno_openmp_target_new_runtime, true))
Index: 

[PATCH] D117049: [OpenMP] Add support for embedding bitcode images in wrapper tool

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGf28c3153ee6d: [OpenMP] Add support for embedding bitcode 
images in wrapper tool (authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117049/new/

https://reviews.llvm.org/D117049

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -76,12 +76,18 @@
cl::desc("Path for the target bitcode library"),
cl::cat(ClangLinkerWrapperCategory));
 
+static cl::opt EmbedBC(
+"target-embed-bc", cl::ZeroOrMore,
+cl::desc("Embed linked bitcode instead of an executable device image."),
+cl::init(false), cl::cat(ClangLinkerWrapperCategory));
+
 // Do not parse linker options.
 static cl::list
-HostLinkerArgs(cl::Sink, cl::desc("..."));
+HostLinkerArgs(cl::Positional,
+   cl::desc("..."));
 
 /// Path of the current binary.
-static std::string LinkerExecutable;
+static const char *LinkerExecutable;
 
 /// Temporary files created by the linker wrapper.
 static SmallVector TempFiles;
@@ -411,8 +417,8 @@
 
   std::unique_ptr Buffer =
   MemoryBuffer::getMemBuffer(Library.getMemoryBufferRef(), false);
-  if (Error Err = writeArchive(TempFile, Members, true, Library.kind(),
-true, Library.isThin(), std::move(Buffer)))
+  if (Error Err = writeArchive(TempFile, Members, true, Library.kind(), true,
+   Library.isThin(), std::move(Buffer)))
 return std::move(Err);
 
   return static_cast(TempFile);
@@ -489,7 +495,7 @@
   return static_cast(TempFile);
 }
 
-Expected link(ArrayRef InputFiles,
+Expected link(ArrayRef InputFiles,
ArrayRef LinkerArgs, Triple TheTriple,
StringRef Arch) {
   // NVPTX uses the nvlink binary to link device object files.
@@ -520,7 +526,7 @@
   CmdArgs.push_back(Arg);
 
   // Add extracted input files.
-  for (auto Input : InputFiles)
+  for (StringRef Input : InputFiles)
 CmdArgs.push_back(Input);
 
   if (sys::ExecuteAndWait(*NvlinkPath, CmdArgs))
@@ -530,7 +536,7 @@
 }
 } // namespace nvptx
 
-Expected linkDevice(ArrayRef InputFiles,
+Expected linkDevice(ArrayRef InputFiles,
  ArrayRef LinkerArgs,
  Triple TheTriple, StringRef Arch) {
   switch (TheTriple.getArch()) {
@@ -597,8 +603,10 @@
   llvm_unreachable("Invalid optimization level");
 }
 
-std::unique_ptr createLTO(const Triple , StringRef Arch,
-bool WholeProgram) {
+template >
+std::unique_ptr createLTO(
+const Triple , StringRef Arch, bool WholeProgram,
+ModuleHook Hook = [](size_t, const Module &) { return true; }) {
   lto::Config Conf;
   lto::ThinBackend Backend;
   // TODO: Handle index-only thin-LTO
@@ -617,7 +625,7 @@
   Conf.PTO.LoopVectorization = Conf.OptLevel > 1;
   Conf.PTO.SLPVectorization = Conf.OptLevel > 1;
 
-  // TODO: Handle outputting bitcode using a module hook.
+  Conf.PostInternalizeModuleHook = Hook;
   if (TheTriple.isNVPTX())
 Conf.CGFileType = CGFT_AssemblyFile;
   else
@@ -637,11 +645,11 @@
  [](char C) { return C == '_' || isAlnum(C); });
 }
 
-Expected> linkBitcodeFiles(ArrayRef InputFiles,
- const Triple ,
- StringRef Arch) {
+Error linkBitcodeFiles(SmallVectorImpl ,
+   const Triple , StringRef Arch) {
   SmallVector, 4> SavedBuffers;
   SmallVector, 4> BitcodeFiles;
+  SmallVector NewInputFiles;
   StringMap UsedInRegularObj;
 
   // Search for bitcode files in the input and create an LTO input file. If it
@@ -660,6 +668,7 @@
   if (!ObjFile)
 return ObjFile.takeError();
 
+  NewInputFiles.push_back(File.str());
   for (auto  : (*ObjFile)->symbols()) {
 Expected Name = Sym.getName();
 if (!Name)
@@ -679,12 +688,36 @@
   }
 
   if (BitcodeFiles.empty())
-return None;
+return Error::success();
+
+  auto HandleError = [&](std::error_code EC) {
+logAllUnhandledErrors(errorCodeToError(EC),
+  WithColor::error(errs(), LinkerExecutable));
+exit(1);
+  };
+
+  // LTO Module hook to output bitcode without running the backend.
+  auto LinkOnly = [&](size_t Task, const Module ) {
+SmallString<128> TempFile;
+if (std::error_code EC = sys::fs::createTemporaryFile(
+"jit-" + TheTriple.getTriple(), "bc", TempFile))
+  HandleError(EC);
+std::error_code EC;
+

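Note on the module hook above: the diff threads a callback of shape (size_t, const Module &) -> bool into createLTO, with a default that returns true, and uses a "LinkOnly" hook to write bitcode instead of running the backend. Below is a minimal standalone sketch of that pattern. It assumes the hook contract is "returning false stops further processing for that module"; FakeModule, PipelineLog and runPipeline are hypothetical names for illustration, not part of the patch or of LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an LLVM module; it only carries a name.
struct FakeModule {
  std::string Name;
};

// Same shape as the hook in the diff: (task index, module) -> keep going?
using ModuleHook = std::function<bool(std::size_t, const FakeModule &)>;

// Records which stages ran so the effect of the hook is observable.
struct PipelineLog {
  std::vector<std::string> Stages;
};

// Runs "link", then the hook, then "codegen" only if the hook returned true.
// The default hook returns true, so the backend runs unless a caller opts out.
inline void runPipeline(
    const FakeModule &M, PipelineLog &Log,
    ModuleHook Hook = [](std::size_t, const FakeModule &) { return true; }) {
  assert(Hook && "hook must be callable");
  Log.Stages.push_back("link:" + M.Name);
  if (!Hook(0, M))
    return; // The hook asked to stop: no backend for this module.
  Log.Stages.push_back("codegen:" + M.Name);
}
```

A caller that only wants the linked module would pass a hook that saves the module and returns false, which is the role LinkOnly appears to play in the patch.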
[PATCH] D116545: [OpenMP] Add support for extracting device code in linker wrapper

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGb8239af0eeed: [OpenMP] Add support for extracting device 
code in linker wrapper (authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116545/new/

https://reviews.llvm.org/D116545

Files:
  clang/tools/clang-linker-wrapper/CMakeLists.txt
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -5,23 +5,41 @@
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===-===//
-///
+//
+// This tool works as a wrapper over a linking job. This tool is used to create
+// linked device images for offloading. It scans the linker's input for embedded
+// device offloading data stored in sections `.llvm.offloading..`
+// and extracts it as a temporary file. The extracted device files will then be
+// passed to a device linking job to create a final device image.
+//
 //===-===//
 
 #include "clang/Basic/Version.h"
+#include "llvm/BinaryFormat/Magic.h"
+#include "llvm/Bitcode/BitcodeWriter.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IRReader/IRReader.h"
 #include "llvm/Object/Archive.h"
+#include "llvm/Object/ArchiveWriter.h"
+#include "llvm/Object/Binary.h"
+#include "llvm/Object/ObjectFile.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/Errc.h"
+#include "llvm/Support/FileOutputBuffer.h"
 #include "llvm/Support/FileSystem.h"
+#include "llvm/Support/InitLLVM.h"
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/Program.h"
 #include "llvm/Support/Signals.h"
+#include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/StringSaver.h"
 #include "llvm/Support/WithColor.h"
 #include "llvm/Support/raw_ostream.h"
 
 using namespace llvm;
+using namespace llvm::object;
 
 static cl::opt Help("h", cl::desc("Alias for -help"), cl::Hidden);
 
@@ -30,16 +48,42 @@
 static cl::OptionCategory
 ClangLinkerWrapperCategory("clang-linker-wrapper options");
 
+static cl::opt StripSections(
+"strip-sections", cl::ZeroOrMore,
+cl::desc("Strip offloading sections from the host object file."),
+cl::init(true), cl::cat(ClangLinkerWrapperCategory));
+
 static cl::opt LinkerUserPath("linker-path",
cl::desc("Path of linker binary"),
cl::cat(ClangLinkerWrapperCategory));
 
-// Do not parse linker options
+// Do not parse linker options.
 static cl::list
 LinkerArgs(cl::Sink, cl::desc("..."));
 
-static Error runLinker(std::string LinkerPath,
-   SmallVectorImpl ) {
+/// Path of the current binary.
+static std::string LinkerExecutable;
+
+/// Magic section string that marks the existence of offloading data. The
+/// section string will be formatted as `.llvm.offloading..`.
+#define OFFLOAD_SECTION_MAGIC_STR ".llvm.offloading"
+
+struct DeviceFile {
+  DeviceFile(StringRef TheTriple, StringRef Arch, StringRef Filename)
+  : TheTriple(TheTriple), Arch(Arch), Filename(Filename) {}
+
+  const Triple TheTriple;
+  const std::string Arch;
+  const std::string Filename;
+};
+
+namespace {
+
+Expected>
+extractFromBuffer(std::unique_ptr Buffer,
+  SmallVectorImpl );
+
+Error runLinker(std::string , SmallVectorImpl ) {
   std::vector LinkerArgs;
   LinkerArgs.push_back(LinkerPath);
   for (auto  : Args)
@@ -50,11 +94,301 @@
   return Error::success();
 }
 
-static void PrintVersion(raw_ostream ) {
+void PrintVersion(raw_ostream ) {
   OS << clang::getClangToolFullVersion("clang-linker-wrapper") << '\n';
 }
 
+void removeFromCompilerUsed(Module , GlobalValue ) {
+  GlobalVariable *GV = M.getGlobalVariable("llvm.compiler.used");
+  Type *Int8PtrTy = Type::getInt8PtrTy(M.getContext());
+  Constant *ValueToRemove =
+  ConstantExpr::getPointerBitCastOrAddrSpaceCast(, Int8PtrTy);
+  SmallPtrSet InitAsSet;
+  SmallVector Init;
+  if (GV) {
+if (GV->hasInitializer()) {
+  auto *CA = cast(GV->getInitializer());
+  for (auto  : CA->operands()) {
+Constant *C = cast_or_null(Op);
+if (C != ValueToRemove && InitAsSet.insert(C).second)
+  Init.push_back(C);
+  }
+}
+GV->eraseFromParent();
+  }
+
+  if (Init.empty())
+return;
+
+  ArrayType *ATy = ArrayType::get(Int8PtrTy, Init.size());
+  GV = new llvm::GlobalVariable(M, ATy, false, GlobalValue::AppendingLinkage,
+ConstantArray::get(ATy, Init),
+

[PATCH] D116627: [Clang] Initial support for linking offloading code in tool

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGd0744585f9ea: [Clang] Initial support for linking offloading 
code in tool (authored by jhuber6).

Changed prior to commit:
  https://reviews.llvm.org/D116627?vs=404598&id=404788#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116627/new/

https://reviews.llvm.org/D116627

Files:
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -28,6 +28,7 @@
 #include "llvm/Support/Errc.h"
 #include "llvm/Support/FileOutputBuffer.h"
 #include "llvm/Support/FileSystem.h"
+#include "llvm/Support/Host.h"
 #include "llvm/Support/InitLLVM.h"
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/Path.h"
@@ -59,22 +60,26 @@
 
 // Do not parse linker options.
 static cl::list
-LinkerArgs(cl::Sink, cl::desc("..."));
+HostLinkerArgs(cl::Sink, cl::desc("..."));
 
 /// Path of the current binary.
 static std::string LinkerExecutable;
 
+static SmallVector TempFiles;
 /// Magic section string that marks the existence of offloading data. The
 /// section string will be formatted as `.llvm.offloading..`.
-#define OFFLOAD_SECTION_MAGIC_STR ".llvm.offloading"
+#define OFFLOAD_SECTION_MAGIC_STR ".llvm.offloading."
 
+/// Information for a device offloading file extracted from the host.
 struct DeviceFile {
   DeviceFile(StringRef TheTriple, StringRef Arch, StringRef Filename)
   : TheTriple(TheTriple), Arch(Arch), Filename(Filename) {}
 
-  const Triple TheTriple;
+  const std::string TheTriple;
   const std::string Arch;
   const std::string Filename;
+
+  operator std::string() const { return TheTriple + "-" + Arch; }
 };
 
 namespace {
@@ -83,6 +88,16 @@
 extractFromBuffer(std::unique_ptr Buffer,
   SmallVectorImpl );
 
+static StringRef getDeviceFileExtension(StringRef DeviceTriple,
+bool IsBitcode = false) {
+  Triple TheTriple(DeviceTriple);
+  if (TheTriple.isAMDGPU() || IsBitcode)
+return "bc";
+  if (TheTriple.isNVPTX())
+return "cubin";
+  return "o";
+}
+
 Error runLinker(std::string , SmallVectorImpl ) {
   std::vector LinkerArgs;
   LinkerArgs.push_back(LinkerPath);
@@ -150,9 +165,12 @@
 
 if (Expected Contents = Sec.getContents()) {
   SmallString<128> TempFile;
+  StringRef DeviceExtension = getDeviceFileExtension(
+  DeviceTriple, identify_magic(*Contents) == file_magic::bitcode);
   if (std::error_code EC = sys::fs::createTemporaryFile(
-  Prefix + "-device-" + DeviceTriple, Extension, TempFile))
+  Prefix + "-device-" + DeviceTriple, DeviceExtension, TempFile))
 return createFileError(TempFile, EC);
+  TempFiles.push_back(static_cast(TempFile));
 
   Expected> OutputOrErr =
   FileOutputBuffer::create(TempFile, Sec.getSize());
@@ -173,10 +191,7 @@
 
   // We will use llvm-strip to remove the now unneeded section containing the
   // offloading code.
-  ErrorOr StripPath = sys::findProgramByName(
-  "llvm-strip", sys::path::parent_path(LinkerExecutable));
-  if (!StripPath)
-StripPath = sys::findProgramByName("llvm-strip");
+  ErrorOr StripPath = sys::findProgramByName("llvm-strip");
   if (!StripPath)
 return createStringError(StripPath.getError(),
  "Unable to find 'llvm-strip' in path");
@@ -185,6 +200,7 @@
   if (std::error_code EC =
   sys::fs::createTemporaryFile(Prefix + "-host", Extension, TempFile))
 return createFileError(TempFile, EC);
+  TempFiles.push_back(static_cast(TempFile));
 
   SmallVector StripArgs;
   StripArgs.push_back(*StripPath);
@@ -237,9 +253,12 @@
 
 StringRef Contents = CDS->getAsString();
 SmallString<128> TempFile;
+StringRef DeviceExtension = getDeviceFileExtension(
+DeviceTriple, identify_magic(Contents) == file_magic::bitcode);
 if (std::error_code EC = sys::fs::createTemporaryFile(
-Prefix + "-device-" + DeviceTriple, Extension, TempFile))
+Prefix + "-device-" + DeviceTriple, DeviceExtension, TempFile))
   return createFileError(TempFile, EC);
+TempFiles.push_back(static_cast(TempFile));
 
 Expected> OutputOrErr =
 FileOutputBuffer::create(TempFile, Contents.size());
@@ -271,6 +290,8 @@
   if (std::error_code EC =
   sys::fs::createTemporaryFile(Prefix + "-host", Extension, TempFile))
 return createFileError(TempFile, EC);
+  TempFiles.push_back(static_cast(TempFile));
+
   std::error_code EC;
   raw_fd_ostream HostOutput(TempFile, EC, sys::fs::OF_None);
   if (EC)
@@ -341,6 +362,7 @@
   if (std::error_code EC =
   

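Note on getDeviceFileExtension above: the extension of the extracted temporary file is chosen from the device triple and from whether the section contents look like LLVM bitcode. A rough standalone sketch of that decision follows. It is a simplification under stated assumptions: the triple is classified by a plain string prefix rather than llvm::Triple, only the raw bitcode magic ('B', 'C', 0xC0, 0xDE) is recognised rather than everything identify_magic accepts, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <string>

// True if the buffer starts with the raw LLVM bitcode magic 'B' 'C' 0xC0 0xDE.
// The bitcode wrapper header form is deliberately not handled in this sketch.
inline bool looksLikeBitcode(const std::string &Data) {
  static const unsigned char Magic[] = {'B', 'C', 0xC0, 0xDE};
  if (Data.size() < 4)
    return false;
  for (int I = 0; I < 4; ++I)
    if (static_cast<unsigned char>(Data[I]) != Magic[I])
      return false;
  return true;
}

// Hypothetical helper: prefix test on a triple string.
inline bool startsWith(const std::string &S, const std::string &Prefix) {
  return S.compare(0, Prefix.size(), Prefix) == 0;
}

// Mirrors the order of checks in the diff: bitcode (or AMDGPU) wins, then
// NVPTX gets "cubin", and everything else is treated as a plain object.
inline std::string deviceFileExtension(const std::string &Triple,
                                       bool IsBitcode) {
  assert(!Triple.empty() && "expected a device triple");
  if (startsWith(Triple, "amdgcn") || IsBitcode)
    return "bc";
  if (startsWith(Triple, "nvptx"))
    return "cubin";
  return "o";
}
```

Checking the bitcode case first means an NVPTX input that is still bitcode is written as ".bc" rather than ".cubin".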
[PATCH] D117156: [OpenMP] Add extra flag handling to linker wrapper

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
jhuber6 marked an inline comment as done.
Closed by commit rGcb7cfaec7185: [OpenMP] Add extra flag handling to linker 
wrapper (authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117156/new/

https://reviews.llvm.org/D117156

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -48,6 +48,12 @@
 
 static cl::opt Help("h", cl::desc("Alias for -help"), cl::Hidden);
 
+enum DebugKind {
+  NoDebugInfo,
+  DirectivesOnly,
+  FullDebugInfo,
+};
+
 // Mark all our options with this category, everything else (except for -help)
 // will be hidden.
 static cl::OptionCategory
@@ -58,29 +64,53 @@
 cl::desc("Strip offloading sections from the host object file."),
 cl::init(true), cl::cat(ClangLinkerWrapperCategory));
 
-static cl::opt LinkerUserPath("linker-path",
+static cl::opt LinkerUserPath("linker-path", cl::Required,
cl::desc("Path of linker binary"),
cl::cat(ClangLinkerWrapperCategory));
 
 static cl::opt
-TargetFeatures("target-feature", cl::desc("Target features for triple"),
+TargetFeatures("target-feature", cl::ZeroOrMore,
+   cl::desc("Target features for triple"),
cl::cat(ClangLinkerWrapperCategory));
 
-static cl::opt OptLevel("opt-level",
+static cl::opt OptLevel("opt-level", cl::ZeroOrMore,
  cl::desc("Optimization level for LTO"),
  cl::init("O2"),
  cl::cat(ClangLinkerWrapperCategory));
 
 static cl::opt
-BitcodeLibrary("target-library",
+BitcodeLibrary("target-library", cl::ZeroOrMore,
cl::desc("Path for the target bitcode library"),
cl::cat(ClangLinkerWrapperCategory));
 
 static cl::opt EmbedBC(
 "target-embed-bc", cl::ZeroOrMore,
-cl::desc("Embed linked bitcode instead of an executable device image."),
+cl::desc("Embed linked bitcode instead of an executable device image"),
 cl::init(false), cl::cat(ClangLinkerWrapperCategory));
 
+static cl::opt
+HostTriple("host-triple", cl::ZeroOrMore,
+   cl::desc("Triple to use for the host compilation"),
+   cl::init(sys::getDefaultTargetTriple()),
+   cl::cat(ClangLinkerWrapperCategory));
+
+static cl::opt
+PtxasOption("ptxas-option", cl::ZeroOrMore,
+cl::desc("Argument to pass to the ptxas invocation"),
+cl::cat(ClangLinkerWrapperCategory));
+
+static cl::opt Verbose("v", cl::ZeroOrMore,
+ cl::desc("Verbose output from tools"),
+ cl::init(false),
+ cl::cat(ClangLinkerWrapperCategory));
+
+static cl::opt DebugInfo(
+cl::desc("Choose debugging level:"), cl::init(NoDebugInfo),
+cl::values(clEnumValN(NoDebugInfo, "g0", "No debug information"),
+   clEnumValN(DirectivesOnly, "gline-directives-only",
+  "Direction information"),
+   clEnumValN(FullDebugInfo, "g", "Full debugging support")));
+
 // Do not parse linker options.
 static cl::list
 HostLinkerArgs(cl::Positional,
@@ -480,6 +510,14 @@
   std::string Opt = "-" + OptLevel;
   CmdArgs.push_back(*PtxasPath);
   CmdArgs.push_back(TheTriple.isArch64Bit() ? "-m64" : "-m32");
+  if (Verbose)
+CmdArgs.push_back("-v");
+  if (DebugInfo == DirectivesOnly && OptLevel[1] == '0')
+CmdArgs.push_back("-lineinfo");
+  else if (DebugInfo == FullDebugInfo && OptLevel[1] == '0')
+CmdArgs.push_back("-g");
+  if (!PtxasOption.empty())
+CmdArgs.push_back(PtxasOption);
   CmdArgs.push_back("-o");
   CmdArgs.push_back(TempFile);
   CmdArgs.push_back(Opt);
@@ -511,10 +549,13 @@
 return createFileError(TempFile, EC);
   TempFiles.push_back(static_cast(TempFile));
 
-  // TODO: Pass in arguments like `-g` and `-v` from the driver.
   SmallVector CmdArgs;
   CmdArgs.push_back(*NvlinkPath);
   CmdArgs.push_back(TheTriple.isArch64Bit() ? "-m64" : "-m32");
+  if (Verbose)
+CmdArgs.push_back("-v");
+  if (DebugInfo != NoDebugInfo)
+CmdArgs.push_back("-g");
   CmdArgs.push_back("-o");
   CmdArgs.push_back(TempFile);
   CmdArgs.push_back("-arch");
@@ -563,16 +604,16 @@
 
   switch (DI.getSeverity()) {
   case DS_Error:
-WithColor::error(errs(), LinkerExecutable) << ErrStorage;
+WithColor::error(errs(), LinkerExecutable) << ErrStorage << "\n";
 break;
   case DS_Warning:
-

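Note on the ptxas flag handling above: verbosity is forwarded unconditionally, while the debug flags are only forwarded when the optimization level is O0, and any user-supplied ptxas option is appended afterwards. The sketch below restates that mapping as a standalone function. The function name and parameters are hypothetical, and it stops before the "-o" / output / opt-level arguments that the real code appends next.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum DebugKind { NoDebugInfo, DirectivesOnly, FullDebugInfo };

// Builds the leading options of a ptxas command line in the order used by the
// diff. OptLevel is expected in the form "O0".."O3".
inline std::vector<std::string> ptxasFlags(bool Is64Bit, bool Verbose,
                                           DebugKind Debug,
                                           const std::string &OptLevel,
                                           const std::string &ExtraOption) {
  assert(OptLevel.size() >= 2 && "expected something like \"O2\"");
  std::vector<std::string> Args;
  Args.push_back(Is64Bit ? "-m64" : "-m32");
  if (Verbose)
    Args.push_back("-v");
  // Debug flags are only passed for unoptimized builds.
  const bool Unoptimized = OptLevel[1] == '0';
  if (Debug == DirectivesOnly && Unoptimized)
    Args.push_back("-lineinfo");
  else if (Debug == FullDebugInfo && Unoptimized)
    Args.push_back("-g");
  if (!ExtraOption.empty())
    Args.push_back(ExtraOption);
  return Args;
}
```

For example, full debug info at O2 yields no debug flag at all in this scheme, which matches the OptLevel[1] == '0' guard in the patch.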
[PATCH] D117246: [OpenMP] Add support for linking AMDGPU images

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGce16ca3c7419: [OpenMP] Add support for linking AMDGPU images 
(authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117246/new/

https://reviews.llvm.org/D117246

Files:
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp


Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -501,7 +501,7 @@
   // Create a new file to write the linked device image to.
   SmallString<128> TempFile;
   if (std::error_code EC = sys::fs::createTemporaryFile(
-  TheTriple.getArchName() + "-" + Arch, "cubin", TempFile))
+  "lto-" + TheTriple.getArchName() + "-" + Arch, "cubin", TempFile))
 return createFileError(TempFile, EC);
   TempFiles.push_back(static_cast(TempFile));
 
@@ -576,6 +576,50 @@
   return static_cast(TempFile);
 }
 } // namespace nvptx
+namespace amdgcn {
+Expected link(ArrayRef InputFiles,
+   ArrayRef LinkerArgs, Triple TheTriple,
+   StringRef Arch) {
+  // AMDGPU uses the lld binary to link device object files.
+  ErrorOr LLDPath =
+  sys::findProgramByName("lld", sys::path::parent_path(LinkerExecutable));
+  if (!LLDPath)
+LLDPath = sys::findProgramByName("lld");
+  if (!LLDPath)
+return createStringError(LLDPath.getError(),
+ "Unable to find 'lld' in path");
+
+  // Create a new file to write the linked device image to.
+  SmallString<128> TempFile;
+  if (std::error_code EC = sys::fs::createTemporaryFile(
+  TheTriple.getArchName() + "-" + Arch + "-image", "out", TempFile))
+return createFileError(TempFile, EC);
+  TempFiles.push_back(static_cast(TempFile));
+
+  SmallVector CmdArgs;
+  CmdArgs.push_back(*LLDPath);
+  CmdArgs.push_back("-flavor");
+  CmdArgs.push_back("gnu");
+  CmdArgs.push_back("--no-undefined");
+  CmdArgs.push_back("-shared");
+  CmdArgs.push_back("-o");
+  CmdArgs.push_back(TempFile);
+
+  // Copy system library paths used by the host linker.
+  for (StringRef Arg : LinkerArgs)
+if (Arg.startswith("-L"))
+  CmdArgs.push_back(Arg);
+
+  // Add extracted input files.
+  for (StringRef Input : InputFiles)
+CmdArgs.push_back(Input);
+
+  if (sys::ExecuteAndWait(*LLDPath, CmdArgs))
+return createStringError(inconvertibleErrorCode(), "'lld' failed");
+
+  return static_cast(TempFile);
+}
+} // namespace amdgcn
 
 Expected linkDevice(ArrayRef InputFiles,
  ArrayRef LinkerArgs,
@@ -585,7 +629,7 @@
   case Triple::nvptx64:
 return nvptx::link(InputFiles, LinkerArgs, TheTriple, Arch);
   case Triple::amdgcn:
-// TODO: AMDGCN linking support.
+return amdgcn::link(InputFiles, LinkerArgs, TheTriple, Arch);
   case Triple::x86:
   case Triple::x86_64:
 // TODO: x86 linking support.
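
Note on amdgcn::link above: the lld invocation is assembled from a fixed prefix, the "-L" search paths copied from the host linker arguments, and the extracted device inputs. A small standalone sketch of that assembly, with hypothetical names and plain std::string in place of StringRef:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assembles an lld command line the way the diff does: fixed options first,
// then only the -L arguments from the host link, then the device inputs.
inline std::vector<std::string>
lldCommand(const std::string &LLDPath, const std::string &Output,
           const std::vector<std::string> &HostLinkerArgs,
           const std::vector<std::string> &InputFiles) {
  assert(!LLDPath.empty() && "expected a path to lld");
  std::vector<std::string> Cmd = {LLDPath,  "-flavor", "gnu", "--no-undefined",
                                  "-shared", "-o",     Output};
  // Copy library search paths; every other host option is dropped.
  for (const std::string &Arg : HostLinkerArgs)
    if (Arg.compare(0, 2, "-L") == 0)
      Cmd.push_back(Arg);
  for (const std::string &Input : InputFiles)
    Cmd.push_back(Input);
  return Cmd;
}
```

Only the search paths survive the filter, so host-only options such as "-o a.out" or "-lm" never reach the device link in this sketch.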

[PATCH] D118155: [OpenMP] Improve symbol resolution for OpenMP Offloading LTO

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG46d019041cd9: [OpenMP] Improve symbol resolution for OpenMP 
Offloading LTO (authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118155/new/

https://reviews.llvm.org/D118155

Files:
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -736,6 +736,7 @@
   SmallVector, 4> BitcodeFiles;
   SmallVector NewInputFiles;
   StringMap UsedInRegularObj;
+  StringMap UsedInSharedLib;
 
   // Search for bitcode files in the input and create an LTO input file. If it
   // is not a bitcode file, scan its symbol table for symbols we need to
@@ -759,7 +760,11 @@
 if (!Name)
   return Name.takeError();
 
-UsedInRegularObj[*Name] = true;
+// Record if we've seen these symbols in any object or shared libraries.
+if ((*ObjFile)->isRelocatableObject()) {
+  UsedInRegularObj[*Name] = true;
+} else
+  UsedInSharedLib[*Name] = true;
   }
 } else {
   Expected> InputFileOrErr =
@@ -767,6 +772,7 @@
   if (!InputFileOrErr)
 return InputFileOrErr.takeError();
 
+  // Save the input file and the buffer associated with its memory.
   BitcodeFiles.push_back(std::move(*InputFileOrErr));
   SavedBuffers.push_back(std::move(*BufferOrErr));
 }
@@ -797,22 +803,16 @@
 return false;
   };
 
-  // We have visibility of the whole program if every input is bitcode, all
-  // inputs are statically linked so there should be no external references.
+  // We assume visibility of the whole program if every input file was bitcode.
   bool WholeProgram = BitcodeFiles.size() == InputFiles.size();
   auto LTOBackend = (EmbedBC)
 ? createLTO(TheTriple, Arch, WholeProgram, LinkOnly)
 : createLTO(TheTriple, Arch, WholeProgram);
 
-  // TODO: Run more tests to verify that this is correct.
-  // Create the LTO instance with the necessary config and add the bitcode files
-  // to it after resolving symbols. We make a few assumptions about symbol
-  // resolution.
-  // 1. The target is going to be a stand-alone executable file.
-  // 2. We do not support relocatable object files.
-  // 3. All inputs are relocatable object files extracted from host binaries, so
-  //there is no resolution to a dynamic library.
-  StringMap PrevailingSymbols;
+  // We need to resolve the symbols so the LTO backend knows which symbols need
+  // to be kept or can be internalized. This is a simplified symbol resolution
+  // scheme to approximate the full resolution a linker would do.
+  DenseSet PrevailingSymbols;
   for (auto  : BitcodeFiles) {
 const auto Symbols = BitcodeFile->symbols();
 SmallVector Resolutions(Symbols.size());
@@ -821,35 +821,43 @@
   lto::SymbolResolution  = Resolutions[Idx++];
 
   // We will use this as the prevailing symbol definition in LTO unless
-  // it is undefined in the module or another symbol has already been used.
-  Res.Prevailing = !Sym.isUndefined() && !PrevailingSymbols[Sym.getName()];
-
-  // We need LTO to preserve symbols referenced in other object files, or
-  // are needed by the rest of the toolchain.
+  // it is undefined or another definition has already been used.
+  Res.Prevailing =
+  !Sym.isUndefined() && PrevailingSymbols.insert(Sym.getName()).second;
+
+  // We need LTO to preserve the following global symbols:
+  // 1) Symbols used in regular objects.
+  // 2) Sections that will be given a __start/__stop symbol.
+  // 3) Prevailing symbols that need to be visible to external libraries.
   Res.VisibleToRegularObj =
   UsedInRegularObj[Sym.getName()] ||
   isValidCIdentifier(Sym.getSectionName()) ||
-  (Res.Prevailing && Sym.getName().startswith("__omp"));
-
-  // We do not currently support shared libraries, so no symbols will be
-  // referenced externally by shared libraries.
-  Res.ExportDynamic = false;
-
-  // The result will currently always be an executable, so the only time the
-  // definition will not reside in this link unit is if it's undefined.
-  Res.FinalDefinitionInLinkageUnit = !Sym.isUndefined();
+  (Res.Prevailing &&
+   (Sym.getVisibility() != GlobalValue::HiddenVisibility &&
+!Sym.canBeOmittedFromSymbolTable()));
+
+  // Identify symbols that must be exported dynamically and can be
+  // referenced by other files.
+  Res.ExportDynamic =
+  Sym.getVisibility() != GlobalValue::HiddenVisibility &&
+  

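Note on the prevailing-symbol logic above: replacing the StringMap lookup with DenseSet::insert(...).second gives a "first definition wins" rule in a single expression, because insert reports whether the name was newly added. The same idea in standalone form, using std::unordered_set and a hypothetical Symbol struct in place of the LTO symbol type:

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, reduced view of an input symbol.
struct Symbol {
  std::string Name;
  bool Undefined;
};

// Returns one flag per symbol: true if this occurrence is the prevailing
// definition, i.e. it is defined and no earlier definition used the name.
inline std::vector<bool> markPrevailing(const std::vector<Symbol> &Symbols) {
  std::unordered_set<std::string> Seen;
  std::vector<bool> Prevailing;
  Prevailing.reserve(Symbols.size());
  for (const Symbol &Sym : Symbols) {
    // Short-circuit matters: undefined symbols never claim the name.
    Prevailing.push_back(!Sym.Undefined && Seen.insert(Sym.Name).second);
  }
  assert(Prevailing.size() == Symbols.size());
  return Prevailing;
}
```

Because of the short-circuit, an undefined reference seen first does not block a later definition of the same name from prevailing.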
[PATCH] D118197: [OpenMP] Replace sysmtem call to `llc` with target machine

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGeb6ddf288cd0: [OpenMP] Replace sysmtem call to `llc` with 
target machine (authored by jhuber6).

Changed prior to commit:
  https://reviews.llvm.org/D118197?vs=403042&id=404796#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118197/new/

https://reviews.llvm.org/D118197

Files:
  clang/tools/clang-linker-wrapper/CMakeLists.txt
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -23,6 +23,7 @@
 #include "llvm/IR/Module.h"
 #include "llvm/IRReader/IRReader.h"
 #include "llvm/LTO/LTO.h"
+#include "llvm/MC/TargetRegistry.h"
 #include "llvm/Object/Archive.h"
 #include "llvm/Object/ArchiveWriter.h"
 #include "llvm/Object/Binary.h"
@@ -42,6 +43,7 @@
 #include "llvm/Support/TargetSelect.h"
 #include "llvm/Support/WithColor.h"
 #include "llvm/Support/raw_ostream.h"
+#include "llvm/Target/TargetMachine.h"
 
 using namespace llvm;
 using namespace llvm::object;
@@ -958,6 +960,49 @@
   return Error::success();
 }
 
+// Compile the module to an object file using the appropriate target machine for
+// the host triple.
+Expected compileModule(Module ) {
+  if (M.getTargetTriple().empty())
+M.setTargetTriple(HostTriple);
+
+  std::string Msg;
+  const Target *T = TargetRegistry::lookupTarget(M.getTargetTriple(), Msg);
+  if (!T)
+return createStringError(inconvertibleErrorCode(), Msg);
+
+  auto Options =
+  codegen::InitTargetOptionsFromCodeGenFlags(Triple(M.getTargetTriple()));
+  StringRef CPU = "";
+  StringRef Features = "";
+  std::unique_ptr TM(T->createTargetMachine(
+  HostTriple, CPU, Features, Options, Reloc::PIC_, M.getCodeModel()));
+
+  if (M.getDataLayout().isDefault())
+M.setDataLayout(TM->createDataLayout());
+
+  SmallString<128> ObjectFile;
+  int FD = -1;
+  if (Error Err = createOutputFile(sys::path::filename(ExecutableName) +
+   "offload-wrapper",
+   "o", ObjectFile))
+return std::move(Err);
+  if (std::error_code EC = sys::fs::openFileForWrite(ObjectFile, FD))
+return errorCodeToError(EC);
+
+  auto OS = std::make_unique(FD, true);
+
+  legacy::PassManager CodeGenPasses;
+  TargetLibraryInfoImpl TLII(Triple(M.getTargetTriple()));
+  CodeGenPasses.add(new TargetLibraryInfoWrapperPass(TLII));
+  if (TM->addPassesToEmitFile(CodeGenPasses, *OS, nullptr, CGFT_ObjectFile))
+return createStringError(inconvertibleErrorCode(),
+ "Failed to execute host backend");
+  CodeGenPasses.run(M);
+
+  return static_cast(ObjectFile);
+}
+
 /// Creates an object file containing the device image stored in the filename \p
 /// ImageFile that can be linked with the host.
 Expected wrapDeviceImage(StringRef ImageFile) {
@@ -987,30 +1032,11 @@
 return createStringError(inconvertibleErrorCode(),
  "'clang-offload-wrapper' failed");
 
-  ErrorOr CompilerPath = sys::findProgramByName("llc");
-  if (!WrapperPath)
-return createStringError(WrapperPath.getError(),
- "Unable to find 'llc' in path");
-
-  // Create a new file to write the wrapped bitcode file to.
-  SmallString<128> ObjectFile;
-  if (Error Err = createOutputFile(sys::path::filename(ExecutableName) +
-   "-offload-wrapper",
-   "o", ObjectFile))
-return std::move(Err);
-
-  SmallVector CompilerArgs;
-  CompilerArgs.push_back(*CompilerPath);
-  CompilerArgs.push_back("--filetype=obj");
-  CompilerArgs.push_back("--relocation-model=pic");
-  CompilerArgs.push_back("-o");
-  CompilerArgs.push_back(ObjectFile);
-  CompilerArgs.push_back(BitcodeFile);
-
-  if (sys::ExecuteAndWait(*CompilerPath, CompilerArgs))
-return createStringError(inconvertibleErrorCode(), "'llc' failed");
+  LLVMContext Context;
+  SMDiagnostic Err;
+  std::unique_ptr M = parseIRFile(BitcodeFile, Err, Context);
 
-  return static_cast(ObjectFile);
+  return compileModule(*M);
 }
 
 Optional findFile(StringRef Dir, const Twine ) {
Index: clang/tools/clang-linker-wrapper/CMakeLists.txt
===
--- clang/tools/clang-linker-wrapper/CMakeLists.txt
+++ clang/tools/clang-linker-wrapper/CMakeLists.txt
@@ -4,6 +4,8 @@
   Core
   BinaryFormat
   MC
+  Target
+  Analysis
   Passes
   IRReader
   Object


[PATCH] D116675: [OpenMP] Search for static libraries in offload linker tool

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG0e82c7553be9: [OpenMP] Search for static libraries in 
offload linker tool (authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116675/new/

https://reviews.llvm.org/D116675

Files:
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp


Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -65,7 +65,9 @@
 /// Path of the current binary.
 static std::string LinkerExecutable;
 
+/// Temporary files created by the linker wrapper.
 static SmallVector TempFiles;
+
 /// Magic section string that marks the existence of offloading data. The
 /// section string will be formatted as `.llvm.offloading..`.
 #define OFFLOAD_SECTION_MAGIC_STR ".llvm.offloading."
@@ -551,6 +553,44 @@
   return static_cast(ObjectFile);
 }
 
+Optional<std::string> findFile(StringRef Dir, const Twine &Name) {
+  SmallString<128> Path;
+  // TODO: Parse `--sysroot` somewhere and use it here.
+  sys::path::append(Path, Dir, Name);
+  if (sys::fs::exists(Path))
+    return static_cast<std::string>(Path);
+  return None;
+}
+
+Optional<std::string> findFromSearchPaths(StringRef Name,
+                                          ArrayRef<StringRef> SearchPaths) {
+  for (StringRef Dir : SearchPaths)
+    if (Optional<std::string> File = findFile(Dir, Name))
+      return File;
+  return None;
+}
+
+Optional<std::string> searchLibraryBaseName(StringRef Name,
+                                            ArrayRef<StringRef> SearchPaths) {
+  for (StringRef Dir : SearchPaths) {
+    if (Optional<std::string> File = findFile(Dir, "lib" + Name + ".a"))
+      return File;
+  }
+  return None;
+}
+
+/// Search for static libraries in the linker's library path given input like
+/// `-lfoo` or `-l:libfoo.a`.
+Optional<std::string> searchLibrary(StringRef Input,
+                                    ArrayRef<StringRef> SearchPaths) {
+  if (!Input.startswith("-l"))
+    return None;
+  StringRef Name = Input.drop_front(2);
+  if (Name.startswith(":"))
+    return findFromSearchPaths(Name.drop_front(), SearchPaths);
+  return searchLibraryBaseName(Name, SearchPaths);
+}
+
 } // namespace
 
 int main(int argc, const char **argv) {
@@ -581,16 +621,26 @@
   for (const std::string &Arg : HostLinkerArgs)
     LinkerArgs.push_back(Arg);
 
+  SmallVector LibraryPaths;
+  for (const StringRef Arg : LinkerArgs)
+if (Arg.startswith("-L"))
+  LibraryPaths.push_back(Arg.drop_front(2));
+
   // Try to extract device code from the linker input and replace the linker
   // input with a new file that has the device section stripped.
   SmallVector DeviceFiles;
   for (std::string &Arg : LinkerArgs) {
-    if (sys::path::extension(Arg) == ".o" ||
-        sys::path::extension(Arg) == ".a") {
+    // Search for static libraries in the library link path.
+    std::string Filename = Arg;
+    if (Optional<std::string> Library = searchLibrary(Arg, LibraryPaths))
+      Filename = *Library;
+
+    if (sys::path::extension(Filename) == ".o" ||
+        sys::path::extension(Filename) == ".a") {
       ErrorOr<std::unique_ptr<MemoryBuffer>> BufferOrErr =
-          MemoryBuffer::getFileOrSTDIN(Arg);
+          MemoryBuffer::getFileOrSTDIN(Filename);
       if (std::error_code EC = BufferOrErr.getError())
-        return reportError(createFileError(Arg, EC));
+        return reportError(createFileError(Filename, EC));
 
   auto NewFileOrErr =
   extractFromBuffer(std::move(*BufferOrErr), DeviceFiles);



[PATCH] D116541: [OpenMP] Introduce new flag to change offloading driver pipeline

2022-02-01 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

In D116541#3287379, @thakis wrote:

> Just build and run tests on any mac. This fails on 3 different macs I tried 
> (2x arm, 1x intel), in a bunch of different build configs.
>
> For the particular build I sent the output from, the cmake invocation looked 
> like `/Applications/CMake.app/Contents/bin/cmake -GNinja 
> -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON 
> -DLLVM_ENABLE_PROJECTS='compiler-rt;libcxx;clang' -DLLVM_APPEND_VC_REV=NO 
> -DCMAKE_C_COMPILER=$HOME/src/chrome/src/third_party/llvm-build/Release+Asserts/bin/clang
>  
> -DCMAKE_CXX_COMPILER=$HOME/src/chrome/src/third_party/llvm-build/Release+Asserts/bin/clang++
>  -DCMAKE_OSX_SYSROOT=/Users/thakis/src/llvm-project/sysroot/MacOSX.sdk 
> -DDARWIN_macosx_CACHED_SYSROOT=/Users/thakis/src/llvm-project/sysroot/MacOSX.sdk
>  
> -DDARWIN_iphoneos_CACHED_SYSROOT=/Users/thakis/src/llvm-project/sysroot/iPhoneOS.sdk
>  
> -DDARWIN_iphonesimulator_CACHED_SYSROOT=/Users/thakis/src/llvm-project/sysroot/iPhoneSimulator.sdk
>  ../llvm`

I don't have access to a Mac right now. I'm going to remove the problematic 
check line so the test passes, and add it back once I figure out what's going 
on. The output you're getting should still work; it's just not ideal because 
we're regenerating the bitcode file, so this isn't a breaking issue.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116541/new/

https://reviews.llvm.org/D116541



[PATCH] D116541: [OpenMP] Introduce new flag to change offloading driver pipeline

2022-02-01 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

Removed the two lines in rG28c15341368b. Let me know if this lets the tests 
pass. I'll look into getting access to a Mac somehow so I can reproduce this 
and figure it out.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116541/new/

https://reviews.llvm.org/D116541



[PATCH] D116541: [OpenMP] Introduce new flag to change offloading driver pipeline

2022-02-01 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

In D116541#3287330, @thakis wrote:

> Tests have been failing on Mac for over 20 hours now. Time to revert and fix 
> async?
>
>% bin/clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda 
> -fopenmp-new-driver -no-canonical-prefixes -ccc-print-bindings 
> /Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c -o 
> openmp-offload-gpu -fopenmp -fopenmp-targets=nvptx64 -fopenmp-new-driver 
> -ccc-print-bindings
>   clang version 14.0.0
>   Target: x86_64-apple-darwin19.6.0
>   Thread model: posix
>   InstalledDir: bin
>   # "x86_64-apple-darwin19.6.0" - "clang", inputs: 
> ["/Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c"], 
> output: 
> "/var/folders/qt/hxckwtm545l643cnk200wzt0gn/T/openmp-offload-gpu-729b05.bc"
>   # "nvptx64-nvidia-cuda" - "clang", inputs: 
> ["/Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c", 
> "/var/folders/qt/hxckwtm545l643cnk200wzt0gn/T/openmp-offload-gpu-729b05.bc"],
>  output: 
> "/var/folders/qt/hxckwtm545l643cnk200wzt0gn/T/openmp-offload-gpu-a35969.s"
>   # "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: 
> ["/var/folders/qt/hxckwtm545l643cnk200wzt0gn/T/openmp-offload-gpu-a35969.s"],
>  output: 
> "/var/folders/qt/hxckwtm545l643cnk200wzt0gn/T/openmp-offload-gpu-a3f7f0.o"
>   # "x86_64-apple-darwin19.6.0" - "clang", inputs: 
> ["/Users/thakis/src/llvm-project/clang/test/Driver/openmp-offload-gpu.c", 
> "/var/folders/qt/hxckwtm545l643cnk200wzt0gn/T/openmp-offload-gpu-a3f7f0.o"],
>  output: 
> "/var/folders/qt/hxckwtm545l643cnk200wzt0gn/T/openmp-offload-gpu-f53552.bc"
>   # "x86_64-apple-darwin19.6.0" - "clang", inputs: 
> ["/var/folders/qt/hxckwtm545l643cnk200wzt0gn/T/openmp-offload-gpu-f53552.bc"],
>  output: 
> "/var/folders/qt/hxckwtm545l643cnk200wzt0gn/T/openmp-offload-gpu-86f846.o"
>   # "x86_64-apple-darwin19.6.0" - "Offload::Linker", inputs: 
> ["/var/folders/qt/hxckwtm545l643cnk200wzt0gn/T/openmp-offload-gpu-86f846.o"],
>  output: "openmp-offload-gpu"

I'd rather just disable the test. About 20 other patches depend on this one, so 
we'd need to revert all of those too. It's definitely not grabbing the bitcode 
from the cache as expected. I can disable this part of the test for now, but do 
you know how I could reproduce this? I've been tracking some other buildbots 
and they don't seem to have the same issue, so I'm not sure what's special 
about this one.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116541/new/

https://reviews.llvm.org/D116541



[PATCH] D118495: [OpenMP] Accept shortened triples for -Xopenmp-target=

2022-01-28 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG24f88f57de58: [OpenMP] Accept shortened triples for 
-Xopenmp-target= (authored by jhuber6).
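
The expansion rule that this patch factors into `ToolChain::getOpenMPTriple` can be illustrated without LLVM. The sketch below is an approximation under stated assumptions: it splits the triple on `-` and treats a component as unknown when it is absent, empty, or the literal `unknown`, whereas `llvm::Triple` also treats unrecognized vendor and OS names as unknown. The helper names `splitTriple`, `isMissing` and `expandOpenMPTriple` are hypothetical.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split "arch-vendor-os[-env]" into its dash-separated components.
std::vector<std::string> splitTriple(const std::string &Triple) {
  std::vector<std::string> Parts;
  std::stringstream Stream(Triple);
  std::string Part;
  while (std::getline(Stream, Part, '-'))
    Parts.push_back(Part);
  return Parts;
}

// A component counts as unknown if it is absent, empty, or "unknown".
bool isMissing(const std::vector<std::string> &Parts, std::size_t Index) {
  return Index >= Parts.size() || Parts[Index].empty() ||
         Parts[Index] == "unknown";
}

// Expand shortened offload triples to the full names used for the bitcode
// libraries; anything else is returned unchanged.
std::string expandOpenMPTriple(const std::string &Triple) {
  std::vector<std::string> Parts = splitTriple(Triple);
  if (Parts.empty())
    return Triple;
  if (isMissing(Parts, 1) || isMissing(Parts, 2)) {
    if (Parts[0] == "nvptx")
      return "nvptx-nvidia-cuda";
    if (Parts[0] == "nvptx64")
      return "nvptx64-nvidia-cuda";
    if (Parts[0] == "amdgcn")
      return "amdgcn-amd-amdhsa";
  }
  return Triple;
}
```

Because both `-fopenmp-targets=` and `-Xopenmp-target=` values go through the same expansion, `-Xopenmp-target=nvptx64` compares equal to the toolchain's `nvptx64-nvidia-cuda` triple, which is what the new `TRIPLE` check lines in the test exercise.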

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D118495/new/

https://reviews.llvm.org/D118495

Files:
  clang/include/clang/Driver/ToolChain.h
  clang/lib/Driver/Driver.cpp
  clang/lib/Driver/ToolChain.cpp
  clang/test/Driver/openmp-offload-gpu.c


Index: clang/test/Driver/openmp-offload-gpu.c
===
--- clang/test/Driver/openmp-offload-gpu.c
+++ clang/test/Driver/openmp-offload-gpu.c
@@ -343,3 +343,10 @@
 // RUN:   | FileCheck -check-prefix=SAVE_TEMPS_NAMES %s
 
 // SAVE_TEMPS_NAMES-NOT: "GNU::Linker"{{.*}}["[[SAVE_TEMPS_INPUT1:.*\.o]]", "[[SAVE_TEMPS_INPUT1]]"]
+
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64 -march=sm_35 \
+// RUN:  -save-temps -no-canonical-prefixes %s -o openmp-offload-gpu 2>&1 \
+// RUN:   | FileCheck -check-prefix=TRIPLE %s
+
+// TRIPLE: "-triple" "nvptx64-nvidia-cuda"
+// TRIPLE: "-target-cpu" "sm_35"
Index: clang/lib/Driver/ToolChain.cpp
===
--- clang/lib/Driver/ToolChain.cpp
+++ clang/lib/Driver/ToolChain.cpp
@@ -1129,8 +1129,10 @@
 A->getOption().matches(options::OPT_Xopenmp_target);
 
 if (A->getOption().matches(options::OPT_Xopenmp_target_EQ)) {
+  llvm::Triple TT(getOpenMPTriple(A->getValue(0)));
+
   // Passing device args: -Xopenmp-target= -opt=val.
-  if (A->getValue(0) == getTripleString())
+  if (TT.getTriple() == getTripleString())
 Index = Args.getBaseArgs().MakeIndex(A->getValue(1));
   else
 continue;
Index: clang/lib/Driver/Driver.cpp
===
--- clang/lib/Driver/Driver.cpp
+++ clang/lib/Driver/Driver.cpp
@@ -792,21 +792,9 @@
   if (HasValidOpenMPRuntime) {
 llvm::StringMap FoundNormalizedTriples;
 for (const char *Val : OpenMPTargets->getValues()) {
-  llvm::Triple TT(Val);
+  llvm::Triple TT(ToolChain::getOpenMPTriple(Val));
   std::string NormalizedName = TT.normalize();
 
-  // We want to expand the shortened versions of the triples passed in to
-  // the values used for the bitcode libraries for convenience.
-  if (TT.getVendor() == llvm::Triple::UnknownVendor ||
-  TT.getOS() == llvm::Triple::UnknownOS) {
-if (TT.getArch() == llvm::Triple::nvptx)
-  TT = llvm::Triple("nvptx-nvidia-cuda");
-else if (TT.getArch() == llvm::Triple::nvptx64)
-  TT = llvm::Triple("nvptx64-nvidia-cuda");
-else if (TT.getArch() == llvm::Triple::amdgcn)
-  TT = llvm::Triple("amdgcn-amd-amdhsa");
-  }
-
   // Make sure we don't have a duplicate triple.
   auto Duplicate = FoundNormalizedTriples.find(NormalizedName);
   if (Duplicate != FoundNormalizedTriples.end()) {
Index: clang/include/clang/Driver/ToolChain.h
===
--- clang/include/clang/Driver/ToolChain.h
+++ clang/include/clang/Driver/ToolChain.h
@@ -711,6 +711,22 @@
   const llvm::fltSemantics *FPType = nullptr) const {
 return llvm::DenormalMode::getIEEE();
   }
+
+  // We want to expand the shortened versions of the triples passed in to
+  // the values used for the bitcode libraries.
+  static llvm::Triple getOpenMPTriple(StringRef TripleStr) {
+llvm::Triple TT(TripleStr);
+if (TT.getVendor() == llvm::Triple::UnknownVendor ||
+TT.getOS() == llvm::Triple::UnknownOS) {
+  if (TT.getArch() == llvm::Triple::nvptx)
+return llvm::Triple("nvptx-nvidia-cuda");
+  if (TT.getArch() == llvm::Triple::nvptx64)
+return llvm::Triple("nvptx64-nvidia-cuda");
+  if (TT.getArch() == llvm::Triple::amdgcn)
+return llvm::Triple("amdgcn-amd-amdhsa");
+}
+return TT;
+  }
 };
 
 /// Set a ToolChain's effective triple. Reset it when the registration object



[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 404506.
jhuber6 added a comment.

Add error handling routine to ensure that the embedding string is always a pair 
separated by a single ','.
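
For illustration, the check described here (the `-fembed-offload-object=` value must be exactly a `file,section` pair) could look like the sketch below. It is written against the standard library only; `parseEmbedArg` is a hypothetical name, and the real routine in BackendUtil.cpp reports failures through the diagnostics engine rather than returning an empty optional.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

// Split "file,section" into its two halves. Returns std::nullopt unless the
// argument contains exactly one ','.
std::optional<std::pair<std::string, std::string>>
parseEmbedArg(const std::string &Arg) {
  std::size_t Comma = Arg.find(',');
  if (Comma == std::string::npos)
    return std::nullopt; // no separator at all
  if (Arg.find(',', Comma + 1) != std::string::npos)
    return std::nullopt; // more than one separator
  return std::make_pair(Arg.substr(0, Comma), Arg.substr(Comma + 1));
}
```

Given `empty.h,section1`, the second half is the suffix that ends up in the section name, which the test in this revision expects as `.llvm.offloading.section1`.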


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116542/new/

https://reviews.llvm.org/D116542

Files:
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/CodeGen/BackendUtil.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/CodeGen/CodeGenAction.cpp
  clang/test/Frontend/embed-object.ll
  llvm/include/llvm/Bitcode/BitcodeWriter.h
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/Bitcode/Writer/CMakeLists.txt

Index: llvm/lib/Bitcode/Writer/CMakeLists.txt
===
--- llvm/lib/Bitcode/Writer/CMakeLists.txt
+++ llvm/lib/Bitcode/Writer/CMakeLists.txt
@@ -11,6 +11,7 @@
   Analysis
   Core
   MC
+  TransformUtils
   Object
   Support
   )
Index: llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
===
--- llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -69,6 +69,7 @@
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/SHA1.h"
 #include "llvm/Support/raw_ostream.h"
+#include "llvm/Transforms/Utils/ModuleUtils.h"
 #include 
 #include 
 #include 
@@ -4973,3 +4974,22 @@
   llvm::ConstantArray::get(ATy, UsedArray), "llvm.compiler.used");
   NewUsed->setSection("llvm.metadata");
 }
+
+void llvm::EmbedBufferInModule(llvm::Module &M, llvm::MemoryBufferRef Buf,
+                               StringRef SectionName) {
+  ArrayRef<char> ModuleData =
+      ArrayRef<char>(Buf.getBufferStart(), Buf.getBufferSize());
+
+  // Embed the buffer into the module. These sections are not supposed to be
+  // merged by the linker, so we set the variable name to prevent linking if
+  // they would otherwise be merged.
+  llvm::Constant *ModuleConstant =
+  llvm::ConstantDataArray::get(M.getContext(), ModuleData);
+  llvm::GlobalVariable *GV = new llvm::GlobalVariable(
+  M, ModuleConstant->getType(), true, llvm::GlobalValue::ExternalLinkage,
+  ModuleConstant, SectionName);
+  GV->setVisibility(GlobalValue::HiddenVisibility);
+  GV->setSection(SectionName);
+
+  appendToCompilerUsed(M, GV);
+}
Index: llvm/include/llvm/Bitcode/BitcodeWriter.h
===
--- llvm/include/llvm/Bitcode/BitcodeWriter.h
+++ llvm/include/llvm/Bitcode/BitcodeWriter.h
@@ -165,6 +165,11 @@
 bool EmbedCmdline,
 const std::vector );
 
+  /// Embeds the memory buffer \p Buf into the module \p M as a global using the
+  /// section name \p SectionName.
+  void EmbedBufferInModule(Module &M, MemoryBufferRef Buf,
+                           StringRef SectionName);
+
 } // end namespace llvm
 
 #endif // LLVM_BITCODE_BITCODEWRITER_H
Index: clang/test/Frontend/embed-object.ll
===
--- /dev/null
+++ clang/test/Frontend/embed-object.ll
@@ -0,0 +1,15 @@
+; RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \
+; RUN:-fembed-offload-object=%S/Inputs/empty.h,section1 \
+; RUN:-fembed-offload-object=%S/Inputs/empty.h,section2 -x ir %s -o - \
+; RUN:| FileCheck %s -check-prefix=CHECK
+
+; CHECK: @[[OBJECT1:.+]] = hidden constant [0 x i8] zeroinitializer, section ".llvm.offloading.section1"
+; CHECK: @[[OBJECT2:.+]] = hidden constant [0 x i8] zeroinitializer, section ".llvm.offloading.section2"
+; CHECK: @llvm.compiler.used = appending global [3 x i8*] [i8* @x, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @[[OBJECT1]], i32 0, i32 0), i8* getelementptr inbounds ([0 x i8], [0 x i8]* @[[OBJECT2]], i32 0, i32 0)], section "llvm.metadata"
+
+@x = private constant i8 1
+@llvm.compiler.used = appending global [1 x i8*] [i8* @x], section "llvm.metadata"
+
+define i32 @foo() {
+  ret i32 0
+}
Index: clang/lib/CodeGen/CodeGenAction.cpp
===
--- clang/lib/CodeGen/CodeGenAction.cpp
+++ clang/lib/CodeGen/CodeGenAction.cpp
@@ -1134,6 +1134,7 @@
 TheModule->setTargetTriple(TargetOpts.Triple);
   }
 
+  EmbedObject(TheModule.get(), CodeGenOpts, Diagnostics);
   EmbedBitcode(TheModule.get(), CodeGenOpts, *MainFile);
 
   LLVMContext  = TheModule->getContext();
Index: clang/lib/CodeGen/BackendUtil.cpp
===
--- clang/lib/CodeGen/BackendUtil.cpp
+++ clang/lib/CodeGen/BackendUtil.cpp
@@ -1750,3 +1750,31 @@
   CGOpts.getEmbedBitcode() != CodeGenOptions::Embed_Bitcode,
   CGOpts.CmdArgs);
 }
+
+void clang::EmbedObject(llvm::Module *M, const CodeGenOptions ,
+DiagnosticsEngine ) {
+  if (CGOpts.OffloadObjects.empty())
+return;
+
+  for (StringRef OffloadObject : CGOpts.OffloadObjects) {
+if 

[PATCH] D116543: [OpenMP] Embed device files into the host IR

2022-01-31 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

In D116543#3283874, @JonChesterfield wrote:

> Description and test have slightly diverged from implementation - filename is 
> appended to disambiguate, but the filecheck regex only looks at the prefix 
> and the name described in the commit message is missing the filename

Thanks, I changed it in the local commit but forgot to copy it over here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116543/new/

https://reviews.llvm.org/D116543



[PATCH] D117362: [OpenMP] Remove hidden visibility for declare target variables

2022-01-14 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision.
jhuber6 added reviewers: jdoerfert, tianshilei1992.
Herald added subscribers: guansong, yaxunl.
jhuber6 requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

This patch changes the visibility of variables declared within a declare
target directive. Variable declarations within a declare target
directive need to be externally visible to the plugin for initialization
or reading. Previously this would cause runtime errors where the named
global could not be found because it was not included in the symbol
table.
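
As a hedged illustration of the kind of code affected, the sketch below declares a global inside a declare target region and then synchronizes it with `target update`. The names `Counter` and `addOnDevice` are hypothetical. When built with offloading, the runtime plugin has to look up the device copy of `Counter` by name, which is why the symbol must not be hidden; without `-fopenmp` the pragmas are ignored and the same code simply runs on the host.

```cpp
// A global that lives on both host and device. The offload plugin locates the
// device copy through the device image's symbol table.
#pragma omp declare target
int Counter = 0;
#pragma omp end declare target

// Push the host value to the device, modify it there, and read it back.
int addOnDevice(int Value) {
#pragma omp target update to(Counter)
#pragma omp target
  { Counter += Value; }
#pragma omp target update from(Counter)
  return Counter;
}
```

The two `target update` directives are the "initialization or reading" steps mentioned above: each needs the plugin to resolve the address of the device-side `Counter`.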


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D117362

Files:
  clang/lib/AST/Decl.cpp
  clang/test/OpenMP/declare_target_codegen.cpp
  clang/test/OpenMP/declare_target_only_one_side_compilation.cpp


Index: clang/test/OpenMP/declare_target_only_one_side_compilation.cpp
===
--- clang/test/OpenMP/declare_target_only_one_side_compilation.cpp
+++ clang/test/OpenMP/declare_target_only_one_side_compilation.cpp
@@ -57,11 +57,11 @@
 
 // TODO: It is odd, probably wrong, that we don't mangle all variables.
 
-// DEVICE-DAG: @G1 = hidden {{.*}}global i32 0, align 4
+// DEVICE-DAG: @G1 = {{.*}}global i32 0, align 4
 // DEVICE-DAG: @_ZL2G2 = internal {{.*}}global i32 0, align 4
-// DEVICE-DAG: @G3 = hidden {{.*}}global i32 0, align 4
+// DEVICE-DAG: @G3 = {{.*}}global i32 0, align 4
 // DEVICE-DAG: @_ZL2G4 = internal {{.*}}global i32 0, align 4
-// DEVICE-DAG: @G5 = hidden {{.*}}global i32 0, align 4
+// DEVICE-DAG: @G5 = {{.*}}global i32 0, align 4
 // DEVICE-DAG: @_ZL2G6 = internal {{.*}}global i32 0, align 4
 // DEVICE-NOT: ref
 // DEVICE-NOT: llvm.used
Index: clang/test/OpenMP/declare_target_codegen.cpp
===
--- clang/test/OpenMP/declare_target_codegen.cpp
+++ clang/test/OpenMP/declare_target_codegen.cpp
@@ -26,25 +26,25 @@
 // CHECK-NOT: define {{.*}}{{baz1|baz4|maini1|Base|virtual_}}
 // CHECK-DAG: Bake
 // CHECK-NOT: @{{hhh|ggg|fff|eee}} =
-// CHECK-DAG: @flag = hidden global i8 undef,
+// CHECK-DAG: @flag = global i8 undef,
 // CHECK-DAG: @aaa = external global i32,
-// CHECK-DAG: @bbb ={{ hidden | }}global i32 0,
+// CHECK-DAG: @bbb = global i32 0,
 // CHECK-DAG: weak constant %struct.__tgt_offload_entry { i8* bitcast (i32* @bbb to i8*),
 // CHECK-DAG: @ccc = external global i32,
-// CHECK-DAG: @ddd ={{ hidden | }}global i32 0,
+// CHECK-DAG: @ddd = global i32 0,
 // CHECK-DAG: @hhh_decl_tgt_ref_ptr = weak global i32* null
 // CHECK-DAG: @ggg_decl_tgt_ref_ptr = weak global i32* null
 // CHECK-DAG: @fff_decl_tgt_ref_ptr = weak global i32* null
 // CHECK-DAG: @eee_decl_tgt_ref_ptr = weak global i32* null
 // CHECK-DAG: @{{.*}}maini1{{.*}}aaa = internal global i64 23,
 // CHECK-DAG: @pair = {{.*}}addrspace(3) global %struct.PAIR undef
-// CHECK-DAG: @b ={{ hidden | }}global i32 15,
-// CHECK-DAG: @d ={{ hidden | }}global i32 0,
+// CHECK-DAG: @b = global i32 15,
+// CHECK-DAG: @d = global i32 0,
 // CHECK-DAG: @c = external global i32,
-// CHECK-DAG: @globals ={{ hidden | }}global %struct.S zeroinitializer,
+// CHECK-DAG: @globals = hidden global %struct.S zeroinitializer,
 // CHECK-DAG: [[STAT:@.+stat]] = internal global %struct.S zeroinitializer,
 // CHECK-DAG: [[STAT_REF:@.+]] = internal constant %struct.S* [[STAT]]
-// CHECK-DAG: @out_decl_target ={{ hidden | }}global i32 0,
+// CHECK-DAG: @out_decl_target = global i32 0,
 // CHECK-DAG: @llvm.used = appending global [2 x i8*] [i8* bitcast (void ()* @__omp_offloading__{{.+}}_globals_l[[@LINE+84]]_ctor to i8*), i8* bitcast (void ()* @__omp_offloading__{{.+}}_stat_l[[@LINE+85]]_ctor to i8*)],
 // CHECK-DAG: @llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (%struct.S** [[STAT_REF]] to i8*)],
 
Index: clang/lib/AST/Decl.cpp
===
--- clang/lib/AST/Decl.cpp
+++ clang/lib/AST/Decl.cpp
@@ -912,8 +912,11 @@
   if (!isExternallyVisible(LV.getLinkage()))
 return LinkageInfo(LV.getLinkage(), DefaultVisibility, false);
 
-  // Mark the symbols as hidden when compiling for the device.
-  if (Context.getLangOpts().OpenMP && Context.getLangOpts().OpenMPIsDevice)
+  // Mark the symbols as hidden when compiling for the device unless it is a
+  // declare target definition.
+  const VarDecl* VD = dyn_cast<VarDecl>(D);
+  if (Context.getLangOpts().OpenMP && Context.getLangOpts().OpenMPIsDevice &&
+  !(VD && OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD)))
 LV.mergeVisibility(HiddenVisibility, /*newExplicit=*/false);
 
   return LV;



[PATCH] D117320: [OpenMP] Mark device RTL variables as hidden

2022-01-14 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 400137.
jhuber6 added a comment.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Fix the test and note in the comment that the flag is hidden.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117320/new/

https://reviews.llvm.org/D117320

Files:
  clang/test/OpenMP/target_globals_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
  llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp


Index: llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
===
--- llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -254,6 +254,7 @@
   new GlobalVariable(M, I32Ty,
  /* isConstant = */ true, GlobalValue::WeakODRLinkage,
  ConstantInt::get(I32Ty, Value), Name);
+  GV->setVisibility(GlobalValue::HiddenVisibility);
 
   return GV;
 }
Index: llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
===
--- llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+++ llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
@@ -689,7 +689,8 @@
  omp::IdentFlag Flags = omp::IdentFlag(0),
  unsigned Reserve2Flags = 0);
 
-  /// Create a global flag \p Namein the module with initial value \p Value.
+  /// Create a hidden global flag \p Name in the module with initial value \p
+  /// Value.
   GlobalValue *createGlobalFlag(unsigned Value, StringRef Name);
 
   /// Generate control flow and cleanup for cancellation.
Index: clang/test/OpenMP/target_globals_codegen.cpp
===
--- clang/test/OpenMP/target_globals_codegen.cpp
+++ clang/test/OpenMP/target_globals_codegen.cpp
@@ -12,25 +12,25 @@
 #define HEADER
 
 //.
-// CHECK: @__omp_rtl_debug_kind = weak_odr constant i32 1
-// CHECK: @__omp_rtl_assume_teams_oversubscription = weak_odr constant i32 0
-// CHECK: @__omp_rtl_assume_threads_oversubscription = weak_odr constant i32 0
+// CHECK: @__omp_rtl_debug_kind = weak_odr hidden constant i32 1
+// CHECK: @__omp_rtl_assume_teams_oversubscription = weak_odr hidden constant i32 0
+// CHECK: @__omp_rtl_assume_threads_oversubscription = weak_odr hidden constant i32 0
 //.
-// CHECK-EQ: @__omp_rtl_debug_kind = weak_odr constant i32 111
-// CHECK-EQ: @__omp_rtl_assume_teams_oversubscription = weak_odr constant i32 0
-// CHECK-EQ: @__omp_rtl_assume_threads_oversubscription = weak_odr constant i32 0
+// CHECK-EQ: @__omp_rtl_debug_kind = weak_odr hidden constant i32 111
+// CHECK-EQ: @__omp_rtl_assume_teams_oversubscription = weak_odr hidden constant i32 0
+// CHECK-EQ: @__omp_rtl_assume_threads_oversubscription = weak_odr hidden constant i32 0
 //.
-// CHECK-DEFAULT: @__omp_rtl_debug_kind = weak_odr constant i32 0
-// CHECK-DEFAULT: @__omp_rtl_assume_teams_oversubscription = weak_odr constant i32 0
-// CHECK-DEFAULT: @__omp_rtl_assume_threads_oversubscription = weak_odr constant i32 0
+// CHECK-DEFAULT: @__omp_rtl_debug_kind = weak_odr hidden constant i32 0
+// CHECK-DEFAULT: @__omp_rtl_assume_teams_oversubscription = weak_odr hidden constant i32 0
+// CHECK-DEFAULT: @__omp_rtl_assume_threads_oversubscription = weak_odr hidden constant i32 0
 //.
-// CHECK-THREADS: @__omp_rtl_debug_kind = weak_odr constant i32 0
-// CHECK-THREADS: @__omp_rtl_assume_teams_oversubscription = weak_odr constant i32 0
-// CHECK-THREADS: @__omp_rtl_assume_threads_oversubscription = weak_odr constant i32 1
+// CHECK-THREADS: @__omp_rtl_debug_kind = weak_odr hidden constant i32 0
+// CHECK-THREADS: @__omp_rtl_assume_teams_oversubscription = weak_odr hidden constant i32 0
+// CHECK-THREADS: @__omp_rtl_assume_threads_oversubscription = weak_odr hidden constant i32 1
 //.
-// CHECK-TEAMS: @__omp_rtl_debug_kind = weak_odr constant i32 0
-// CHECK-TEAMS: @__omp_rtl_assume_teams_oversubscription = weak_odr constant i32 1
-// CHECK-TEAMS: @__omp_rtl_assume_threads_oversubscription = weak_odr constant i32 0
+// CHECK-TEAMS: @__omp_rtl_debug_kind = weak_odr hidden constant i32 0
+// CHECK-TEAMS: @__omp_rtl_assume_teams_oversubscription = weak_odr hidden constant i32 1
+// CHECK-TEAMS: @__omp_rtl_assume_threads_oversubscription = weak_odr hidden constant i32 0
 //.
 void foo() {
 #pragma omp target



[PATCH] D117320: [OpenMP] Mark device RTL variables as hidden

2022-01-18 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGdcb83b236421: [OpenMP] Mark device RTL variables as hidden 
(authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117320/new/

https://reviews.llvm.org/D117320

Files:
  clang/test/OpenMP/target_globals_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
  llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp


Index: llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
===
--- llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -254,6 +254,7 @@
   new GlobalVariable(M, I32Ty,
  /* isConstant = */ true, GlobalValue::WeakODRLinkage,
  ConstantInt::get(I32Ty, Value), Name);
+  GV->setVisibility(GlobalValue::HiddenVisibility);
 
   return GV;
 }
Index: llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
===
--- llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+++ llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
@@ -689,7 +689,8 @@
  omp::IdentFlag Flags = omp::IdentFlag(0),
  unsigned Reserve2Flags = 0);
 
-  /// Create a global flag \p Namein the module with initial value \p Value.
+  /// Create a hidden global flag \p Name in the module with initial value \p
+  /// Value.
   GlobalValue *createGlobalFlag(unsigned Value, StringRef Name);
 
   /// Generate control flow and cleanup for cancellation.
Index: clang/test/OpenMP/target_globals_codegen.cpp
===
--- clang/test/OpenMP/target_globals_codegen.cpp
+++ clang/test/OpenMP/target_globals_codegen.cpp
@@ -12,25 +12,25 @@
 #define HEADER
 
 //.
-// CHECK: @__omp_rtl_debug_kind = weak_odr constant i32 1
-// CHECK: @__omp_rtl_assume_teams_oversubscription = weak_odr constant i32 0
-// CHECK: @__omp_rtl_assume_threads_oversubscription = weak_odr constant i32 0
+// CHECK: @__omp_rtl_debug_kind = weak_odr hidden constant i32 1
+// CHECK: @__omp_rtl_assume_teams_oversubscription = weak_odr hidden constant i32 0
+// CHECK: @__omp_rtl_assume_threads_oversubscription = weak_odr hidden constant i32 0
 //.
-// CHECK-EQ: @__omp_rtl_debug_kind = weak_odr constant i32 111
-// CHECK-EQ: @__omp_rtl_assume_teams_oversubscription = weak_odr constant i32 0
-// CHECK-EQ: @__omp_rtl_assume_threads_oversubscription = weak_odr constant i32 0
+// CHECK-EQ: @__omp_rtl_debug_kind = weak_odr hidden constant i32 111
+// CHECK-EQ: @__omp_rtl_assume_teams_oversubscription = weak_odr hidden constant i32 0
+// CHECK-EQ: @__omp_rtl_assume_threads_oversubscription = weak_odr hidden constant i32 0
 //.
-// CHECK-DEFAULT: @__omp_rtl_debug_kind = weak_odr constant i32 0
-// CHECK-DEFAULT: @__omp_rtl_assume_teams_oversubscription = weak_odr constant i32 0
-// CHECK-DEFAULT: @__omp_rtl_assume_threads_oversubscription = weak_odr constant i32 0
+// CHECK-DEFAULT: @__omp_rtl_debug_kind = weak_odr hidden constant i32 0
+// CHECK-DEFAULT: @__omp_rtl_assume_teams_oversubscription = weak_odr hidden constant i32 0
+// CHECK-DEFAULT: @__omp_rtl_assume_threads_oversubscription = weak_odr hidden constant i32 0
 //.
-// CHECK-THREADS: @__omp_rtl_debug_kind = weak_odr constant i32 0
-// CHECK-THREADS: @__omp_rtl_assume_teams_oversubscription = weak_odr constant i32 0
-// CHECK-THREADS: @__omp_rtl_assume_threads_oversubscription = weak_odr constant i32 1
+// CHECK-THREADS: @__omp_rtl_debug_kind = weak_odr hidden constant i32 0
+// CHECK-THREADS: @__omp_rtl_assume_teams_oversubscription = weak_odr hidden constant i32 0
+// CHECK-THREADS: @__omp_rtl_assume_threads_oversubscription = weak_odr hidden constant i32 1
 //.
-// CHECK-TEAMS: @__omp_rtl_debug_kind = weak_odr constant i32 0
-// CHECK-TEAMS: @__omp_rtl_assume_teams_oversubscription = weak_odr constant i32 1
-// CHECK-TEAMS: @__omp_rtl_assume_threads_oversubscription = weak_odr constant i32 0
+// CHECK-TEAMS: @__omp_rtl_debug_kind = weak_odr hidden constant i32 0
+// CHECK-TEAMS: @__omp_rtl_assume_teams_oversubscription = weak_odr hidden constant i32 1
+// CHECK-TEAMS: @__omp_rtl_assume_threads_oversubscription = weak_odr hidden constant i32 0
 //.
 void foo() {
 #pragma omp target



[PATCH] D117362: [OpenMP] Remove hidden visibility for declare target variables

2022-01-18 Thread Joseph Huber via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGd081bfcd17c1: [OpenMP] Remove hidden visibility for declare 
target variables (authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117362/new/

https://reviews.llvm.org/D117362

Files:
  clang/lib/AST/Decl.cpp
  clang/test/OpenMP/declare_target_codegen.cpp
  clang/test/OpenMP/declare_target_only_one_side_compilation.cpp


Index: clang/test/OpenMP/declare_target_only_one_side_compilation.cpp
===
--- clang/test/OpenMP/declare_target_only_one_side_compilation.cpp
+++ clang/test/OpenMP/declare_target_only_one_side_compilation.cpp
@@ -57,11 +57,11 @@
 
 // TODO: It is odd, probably wrong, that we don't mangle all variables.
 
-// DEVICE-DAG: @G1 = hidden {{.*}}global i32 0, align 4
+// DEVICE-DAG: @G1 = {{.*}}global i32 0, align 4
 // DEVICE-DAG: @_ZL2G2 = internal {{.*}}global i32 0, align 4
-// DEVICE-DAG: @G3 = hidden {{.*}}global i32 0, align 4
+// DEVICE-DAG: @G3 = {{.*}}global i32 0, align 4
 // DEVICE-DAG: @_ZL2G4 = internal {{.*}}global i32 0, align 4
-// DEVICE-DAG: @G5 = hidden {{.*}}global i32 0, align 4
+// DEVICE-DAG: @G5 = {{.*}}global i32 0, align 4
 // DEVICE-DAG: @_ZL2G6 = internal {{.*}}global i32 0, align 4
 // DEVICE-NOT: ref
 // DEVICE-NOT: llvm.used
Index: clang/test/OpenMP/declare_target_codegen.cpp
===
--- clang/test/OpenMP/declare_target_codegen.cpp
+++ clang/test/OpenMP/declare_target_codegen.cpp
@@ -26,25 +26,26 @@
 // CHECK-NOT: define {{.*}}{{baz1|baz4|maini1|Base|virtual_}}
 // CHECK-DAG: Bake
 // CHECK-NOT: @{{hhh|ggg|fff|eee}} =
-// CHECK-DAG: @flag = hidden global i8 undef,
+// CHECK-DAG: @flag = global i8 undef,
 // CHECK-DAG: @aaa = external global i32,
-// CHECK-DAG: @bbb ={{ hidden | }}global i32 0,
+// CHECK-DAG: @bbb = global i32 0,
 // CHECK-DAG: weak constant %struct.__tgt_offload_entry { i8* bitcast (i32* @bbb to i8*),
 // CHECK-DAG: @ccc = external global i32,
-// CHECK-DAG: @ddd ={{ hidden | }}global i32 0,
+// CHECK-DAG: @ddd = global i32 0,
 // CHECK-DAG: @hhh_decl_tgt_ref_ptr = weak global i32* null
 // CHECK-DAG: @ggg_decl_tgt_ref_ptr = weak global i32* null
 // CHECK-DAG: @fff_decl_tgt_ref_ptr = weak global i32* null
 // CHECK-DAG: @eee_decl_tgt_ref_ptr = weak global i32* null
 // CHECK-DAG: @{{.*}}maini1{{.*}}aaa = internal global i64 23,
 // CHECK-DAG: @pair = {{.*}}addrspace(3) global %struct.PAIR undef
-// CHECK-DAG: @b ={{ hidden | }}global i32 15,
-// CHECK-DAG: @d ={{ hidden | }}global i32 0,
+// CHECK-DAG: @_ZN2SS3SSSE = global i32 1,
+// CHECK-DAG: @b = global i32 15,
+// CHECK-DAG: @d = global i32 0,
 // CHECK-DAG: @c = external global i32,
-// CHECK-DAG: @globals ={{ hidden | }}global %struct.S zeroinitializer,
+// CHECK-DAG: @globals = global %struct.S zeroinitializer,
 // CHECK-DAG: [[STAT:@.+stat]] = internal global %struct.S zeroinitializer,
 // CHECK-DAG: [[STAT_REF:@.+]] = internal constant %struct.S* [[STAT]]
-// CHECK-DAG: @out_decl_target ={{ hidden | }}global i32 0,
+// CHECK-DAG: @out_decl_target = global i32 0,
 // CHECK-DAG: @llvm.used = appending global [2 x i8*] [i8* bitcast (void ()* @__omp_offloading__{{.+}}_globals_l[[@LINE+84]]_ctor to i8*), i8* bitcast (void ()* @__omp_offloading__{{.+}}_stat_l[[@LINE+85]]_ctor to i8*)],
 // CHECK-DAG: @llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (%struct.S** [[STAT_REF]] to i8*)],
 
@@ -283,4 +284,11 @@
   X->emitted();
 }
 #pragma omp end declare target
+
+struct SS {
+#pragma omp declare target
+  static int SSS;
+#pragma omp end declare target
+};
+int SS::SSS = 1;
 #endif
Index: clang/lib/AST/Decl.cpp
===
--- clang/lib/AST/Decl.cpp
+++ clang/lib/AST/Decl.cpp
@@ -786,6 +786,11 @@
 //
 // Note that we don't want to make the variable non-external
 // because of this, but unique-external linkage suits us.
+
+// We need variables inside OpenMP declare target directives to be visible.
+if (OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(Var))
+  return LinkageInfo::external();
+
 if (Context.getLangOpts().CPlusPlus && !isFirstInExternCContext(Var) &&
 !IgnoreVarTypeLinkage) {
   LinkageInfo TypeLV = getLVForType(*Var->getType(), computation);
@@ -1069,6 +1074,12 @@
 
   // Finally, merge in information from the class.
   LV.mergeMaybeWithVisibility(classLV, considerClassVisibility);
+
+  // We need variables inside OpenMP declare target directives to be visible.
+  if (const VarDecl *VD = dyn_cast<VarDecl>(D))
+    if (OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD))
+      return LinkageInfo(LV.getLinkage(), DefaultVisibility, false);
+
   return LV;
 }
 


Index: clang/test/OpenMP/declare_target_only_one_side_compilation.cpp

[PATCH] D117806: [OpenMP] Change default visibility to protected for device declarations

2022-01-20 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/test/OpenMP/declare_target_codegen.cpp:293
-};
-int SS::SSS = 1;
 #endif

jdoerfert wrote:
> What happened here?
That was a special case I added that was only necessary when we were trying to 
remove things from being hidden. Now that hidden is not the default it's not 
necessary anymore.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117806/new/

https://reviews.llvm.org/D117806

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D117777: [OpenMP] Don't pass empty files to nvlink

2022-01-20 Thread Joseph Huber via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGaf5600420b93: [OpenMP] Dont pass empty files to nvlink 
(authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D11/new/

https://reviews.llvm.org/D11

Files:
  clang/test/Driver/Inputs/openmp_static_device_link/empty.o
  clang/test/Driver/Inputs/openmp_static_device_link/lib.bc
  clang/test/Driver/fat_archive_nvptx.cpp
  clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp


Index: clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp
===
--- clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp
+++ clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp
@@ -55,12 +55,22 @@
 static cl::list<std::string>
     NVArgs(cl::Sink, cl::desc("..."));
 
+static bool isEmptyFile(StringRef Filename) {
+  ErrorOr<std::unique_ptr<MemoryBuffer>> BufOrErr =
+      MemoryBuffer::getFileOrSTDIN(Filename, false, false);
+  if (std::error_code EC = BufOrErr.getError())
+    return false;
+  return (*BufOrErr)->getBuffer().empty();
+}
+
 static Error runNVLink(std::string NVLinkPath,
                        SmallVectorImpl<std::string> &Args) {
   std::vector<StringRef> NVLArgs;
   NVLArgs.push_back(NVLinkPath);
+  StringRef Output = *(llvm::find(Args, "-o") + 1);
   for (auto &Arg : Args) {
-    NVLArgs.push_back(Arg);
+    if (!(sys::fs::exists(Arg) && Arg != Output && isEmptyFile(Arg)))
+      NVLArgs.push_back(Arg);
   }
 
   if (sys::ExecuteAndWait(NVLinkPath, NVLArgs))
Index: clang/test/Driver/fat_archive_nvptx.cpp
===
--- clang/test/Driver/fat_archive_nvptx.cpp
+++ clang/test/Driver/fat_archive_nvptx.cpp
@@ -10,7 +10,8 @@
 // CHECK: clang{{.*}}"-cc1"{{.*}}"-triple" "nvptx64-nvidia-cuda"{{.*}}"-target-cpu" "[[GPU:sm_[0-9]+]]"{{.*}}"-o" "[[HOSTBC:.*.s]]" "-x" "c++"{{.*}}.cpp
 // CHECK: clang-offload-bundler" "-unbundle" "-type=a" "-inputs={{.*}}/Inputs/openmp_static_device_link/libFatArchive.a" "-targets=openmp-nvptx64-nvidia-cuda-[[GPU]]" "-outputs=[[DEVICESPECIFICARCHIVE:.*.a]]" "-allow-missing-bundles"
 // CHECK: clang-nvlink-wrapper{{.*}}"-o" "{{.*}}.out" "-arch" "[[GPU]]" "{{.*}}[[DEVICESPECIFICARCHIVE]]"
-// expected-no-diagnostics
+// RUN: not %clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda %s %S/Inputs/openmp_static_device_link/empty.o --libomptarget-nvptx-bc-path=%S/Inputs/openmp_static_device_link/lib.bc 2>&1 | FileCheck %s --check-prefix=EMPTY
+// EMPTY-NOT: Could not open input file
 
 #ifndef HEADER
 #define HEADER




[PATCH] D117806: [OpenMP] Remove overriding visibility for device declarations

2022-01-20 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 401716.
jhuber6 added a comment.

Changing to use default `protected` visibility instead of passing `-Bsymbolic`.
This should be more portable and make the intentions clearer.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117806/new/

https://reviews.llvm.org/D117806

Files:
  clang/lib/AST/Decl.cpp
  clang/test/OpenMP/declare_target_codegen.cpp
  clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
  clang/test/OpenMP/nvptx_target_pure_deleted_codegen.cpp
  clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
  clang/test/OpenMP/target_attribute_convergent.cpp

Index: clang/test/OpenMP/target_attribute_convergent.cpp
===
--- clang/test/OpenMP/target_attribute_convergent.cpp
+++ clang/test/OpenMP/target_attribute_convergent.cpp
@@ -9,5 +9,5 @@
 #pragma omp end declare target
 
 // CHECK: Function Attrs: {{.*}}convergent{{.*}}
-// CHECK: define hidden void @_Z3foov() [[ATTRIBUTE_NUMBER:#[0-9]+]]
+// CHECK: define protected void @_Z3foov() [[ATTRIBUTE_NUMBER:#[0-9]+]]
 // CHECK: attributes [[ATTRIBUTE_NUMBER]] = { {{.*}}convergent{{.*}} }
Index: clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
===
--- clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
+++ clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
@@ -34,18 +34,18 @@
 #pragma omp declare target
 T a = T();
 T f = a;
-// CHECK: define{{ hidden | }}void @{{.+}}foo{{.+}}([[T]]* noundef byval([[T]]) align {{.+}})
+// CHECK: define{{ protected | }}void @{{.+}}foo{{.+}}([[T]]* noundef byval([[T]]) align {{.+}})
 void foo(T a = T()) {
   return;
 }
-// CHECK: define{{ hidden | }}[6 x i64] @{{.+}}bar{{.+}}()
+// CHECK: define{{ protected | }}[6 x i64] @{{.+}}bar{{.+}}()
 T bar() {
 // CHECK:  bitcast [[T]]* %{{.+}} to [6 x i64]*
 // CHECK-NEXT: load [6 x i64], [6 x i64]* %{{.+}},
 // CHECK-NEXT: ret [6 x i64]
   return T();
 }
-// CHECK: define{{ hidden | }}void @{{.+}}baz{{.+}}()
+// CHECK: define{{ protected | }}void @{{.+}}baz{{.+}}()
 void baz() {
 // CHECK:  call [6 x i64] @{{.+}}bar{{.+}}()
 // CHECK-NEXT: bitcast [[T]]* %{{.+}} to [6 x i64]*
@@ -54,17 +54,17 @@
 }
 T1 a1 = T1();
 T1 f1 = a1;
-// CHECK: define{{ hidden | }}void @{{.+}}foo1{{.+}}([[T1]]* noundef byval([[T1]]) align {{.+}})
+// CHECK: define{{ protected | }}void @{{.+}}foo1{{.+}}([[T1]]* noundef byval([[T1]]) align {{.+}})
 void foo1(T1 a = T1()) {
   return;
 }
-// CHECK: define{{ hidden | }}[[T1]] @{{.+}}bar1{{.+}}()
+// CHECK: define{{ protected | }}[[T1]] @{{.+}}bar1{{.+}}()
 T1 bar1() {
 // CHECK:  load [[T1]], [[T1]]*
 // CHECK-NEXT: ret [[T1]]
   return T1();
 }
-// CHECK: define{{ hidden | }}void @{{.+}}baz1{{.+}}()
+// CHECK: define{{ protected | }}void @{{.+}}baz1{{.+}}()
 void baz1() {
 // CHECK: call [[T1]] @{{.+}}bar1{{.+}}()
   T1 t = bar1();
Index: clang/test/OpenMP/nvptx_target_pure_deleted_codegen.cpp
===
--- clang/test/OpenMP/nvptx_target_pure_deleted_codegen.cpp
+++ clang/test/OpenMP/nvptx_target_pure_deleted_codegen.cpp
@@ -10,8 +10,8 @@
 #define HEADER
 
 // CHECK-NOT: class_type_info
-// CHECK-DAG: @_ZTV7Derived = linkonce_odr hidden unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (void (%class.Derived*)* @_ZN7Derived3fooEv to i8*)] }
-// CHECK-DAG: @_ZTV4Base = linkonce_odr hidden unnamed_addr constant { [3 x i8*] } zeroinitializer
+// CHECK-DAG: @_ZTV7Derived = linkonce_odr protected unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (void (%class.Derived*)* @_ZN7Derived3fooEv to i8*)] }
+// CHECK-DAG: @_ZTV4Base = linkonce_odr protected unnamed_addr constant { [3 x i8*] } zeroinitializer
 // CHECK-NOT: class_type_info
 class Base {
   public:
Index: clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
===
--- clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
+++ clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
@@ -16,9 +16,9 @@
 // SIMD-ONLY-NOT: {{__kmpc|__tgt}}
 
 // DEVICE-DAG: [[C_ADDR:.+]] = internal global i32 0,
-// DEVICE-DAG: [[CD_ADDR:@.+]] ={{ hidden | }}global %struct.S zeroinitializer,
+// DEVICE-DAG: [[CD_ADDR:@.+]] ={{ protected | }}global %struct.S zeroinitializer,
 // HOST-DAG: @[[C_ADDR:.+]] = internal global i32 0,
-// HOST-DAG: @[[CD_ADDR:.+]] ={{( hidden | dso_local)?}} global %struct.S zeroinitializer,
+// HOST-DAG: @[[CD_ADDR:.+]] ={{( protected | dso_local)?}} global %struct.S zeroinitializer,
 
 #pragma omp declare target
 int foo() { return 0; }
@@ -34,12 +34,12 @@
 #pragma omp declare target (bar)
 int caz() { return 0; }
 
-// DEVICE-DAG: define{{ hidden | }}noundef i32 [[FOO:@.*foo.*]]()
-// DEVICE-DAG: define{{ hidden | }}noundef i32 [[BAR:@.*bar.*]]()
-// DEVICE-DAG: 

[PATCH] D117806: [OpenMP] Remove overriding visibility for device declarations

2022-01-20 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision.
jhuber6 added reviewers: jdoerfert, JonChesterfield, ABataev.
Herald added subscribers: asavonic, guansong, yaxunl.
jhuber6 requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

This patch removes the special-case handling of visibility when
compiling for an OpenMP target offloading device. This was originally
added as a precaution against the bug encountered in PR41826, when
symbols in the device were being preempted by shared library symbols.
This should instead be done by more specifically disabling symbol
preemption on the device and allowing the user to control device
visibility more directly. This is done by passing the `-Bsymbolic` flag
to the toolchain, indicating that all symbols bind locally and cannot
bind to another symbol at runtime. It is assumed now that when we compile
for the device, no symbol addresses should be preempted by pending shared
library loads.
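
As a rough illustration of the visibility levels discussed above, the sketch below marks three functions with the GCC/Clang visibility attribute. The function names are hypothetical and appear nowhere in the patch, and the comments describe general ELF behaviour when the code is linked into a shared object, not anything specific to the OpenMP device toolchain.

```cpp
// Default visibility: the symbol is exported, and in a shared object a
// definition of the same name loaded earlier may preempt this one.
__attribute__((visibility("default"))) int preemptible() { return 1; }

// Protected visibility: the symbol is still exported, but references from
// inside the defining module always bind to this definition.
__attribute__((visibility("protected"))) int boundLocally() { return 2; }

// Hidden visibility: the symbol is not exported from the module at all.
__attribute__((visibility("hidden"))) int internalOnly() { return 3; }

// Calls all three; the visibility attributes do not change the results.
int sum() { return preemptible() + boundLocally() + internalOnly(); }
```

The attribute affects only how the linker and loader resolve the symbols, so the program computes the same values either way; the difference shows up in the IR as the `hidden` or `protected` keywords checked by the tests in this revision.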


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D117806

Files:
  clang/lib/AST/Decl.cpp
  clang/lib/Driver/ToolChains/Gnu.cpp
  clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
  clang/test/OpenMP/nvptx_target_pure_deleted_codegen.cpp
  clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
  clang/test/OpenMP/target_attribute_convergent.cpp

Index: clang/test/OpenMP/target_attribute_convergent.cpp
===
--- clang/test/OpenMP/target_attribute_convergent.cpp
+++ clang/test/OpenMP/target_attribute_convergent.cpp
@@ -9,5 +9,5 @@
 #pragma omp end declare target
 
 // CHECK: Function Attrs: {{.*}}convergent{{.*}}
-// CHECK: define hidden void @_Z3foov() [[ATTRIBUTE_NUMBER:#[0-9]+]]
+// CHECK: define dso_local void @_Z3foov() [[ATTRIBUTE_NUMBER:#[0-9]+]]
 // CHECK: attributes [[ATTRIBUTE_NUMBER]] = { {{.*}}convergent{{.*}} }
Index: clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
===
--- clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
+++ clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
@@ -34,18 +34,18 @@
 #pragma omp declare target
 T a = T();
 T f = a;
-// CHECK: define{{ hidden | }}void @{{.+}}foo{{.+}}([[T]]* noundef byval([[T]]) align {{.+}})
+// CHECK: define{{ dso_local | }}void @{{.+}}foo{{.+}}([[T]]* noundef byval([[T]]) align {{.+}})
 void foo(T a = T()) {
   return;
 }
-// CHECK: define{{ hidden | }}[6 x i64] @{{.+}}bar{{.+}}()
+// CHECK: define{{ dso_local | }}[6 x i64] @{{.+}}bar{{.+}}()
 T bar() {
 // CHECK:  bitcast [[T]]* %{{.+}} to [6 x i64]*
 // CHECK-NEXT: load [6 x i64], [6 x i64]* %{{.+}},
 // CHECK-NEXT: ret [6 x i64]
   return T();
 }
-// CHECK: define{{ hidden | }}void @{{.+}}baz{{.+}}()
+// CHECK: define{{ dso_local | }}void @{{.+}}baz{{.+}}()
 void baz() {
 // CHECK:  call [6 x i64] @{{.+}}bar{{.+}}()
 // CHECK-NEXT: bitcast [[T]]* %{{.+}} to [6 x i64]*
@@ -54,17 +54,17 @@
 }
 T1 a1 = T1();
 T1 f1 = a1;
-// CHECK: define{{ hidden | }}void @{{.+}}foo1{{.+}}([[T1]]* noundef byval([[T1]]) align {{.+}})
+// CHECK: define{{ dso_local | }}void @{{.+}}foo1{{.+}}([[T1]]* noundef byval([[T1]]) align {{.+}})
 void foo1(T1 a = T1()) {
   return;
 }
-// CHECK: define{{ hidden | }}[[T1]] @{{.+}}bar1{{.+}}()
+// CHECK: define{{ dso_local | }}[[T1]] @{{.+}}bar1{{.+}}()
 T1 bar1() {
 // CHECK:  load [[T1]], [[T1]]*
 // CHECK-NEXT: ret [[T1]]
   return T1();
 }
-// CHECK: define{{ hidden | }}void @{{.+}}baz1{{.+}}()
+// CHECK: define{{ dso_local | }}void @{{.+}}baz1{{.+}}()
 void baz1() {
 // CHECK: call [[T1]] @{{.+}}bar1{{.+}}()
   T1 t = bar1();
Index: clang/test/OpenMP/nvptx_target_pure_deleted_codegen.cpp
===
--- clang/test/OpenMP/nvptx_target_pure_deleted_codegen.cpp
+++ clang/test/OpenMP/nvptx_target_pure_deleted_codegen.cpp
@@ -10,8 +10,8 @@
 #define HEADER
 
 // CHECK-NOT: class_type_info
-// CHECK-DAG: @_ZTV7Derived = linkonce_odr hidden unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (void (%class.Derived*)* @_ZN7Derived3fooEv to i8*)] }
-// CHECK-DAG: @_ZTV4Base = linkonce_odr hidden unnamed_addr constant { [3 x i8*] } zeroinitializer
+// CHECK-DAG: @_ZTV7Derived = linkonce_odr unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* null, i8* bitcast (void (%class.Derived*)* @_ZN7Derived3fooEv to i8*)] }
+// CHECK-DAG: @_ZTV4Base = linkonce_odr unnamed_addr constant { [3 x i8*] } zeroinitializer
 // CHECK-NOT: class_type_info
 class Base {
   public:
Index: clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
===
--- clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
+++ clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
@@ -16,7 +16,7 @@
 // SIMD-ONLY-NOT: {{__kmpc|__tgt}}
 
 // DEVICE-DAG: [[C_ADDR:.+]] = internal global i32 0,
-// DEVICE-DAG: 

[PATCH] D117806: [OpenMP] Change default visibility to protected for device declarations

2022-01-20 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 401766.
jhuber6 added a comment.

Changing to use '-fvisibility=protected' when we construct the job. This is 
much more transparent and leaves the option open for the user to override it if 
they need default visibility.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117806/new/

https://reviews.llvm.org/D117806

Files:
  clang/lib/AST/Decl.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/OpenMP/declare_target_codegen.cpp
  clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
  clang/test/OpenMP/nvptx_target_pure_deleted_codegen.cpp
  clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
  clang/test/OpenMP/target_attribute_convergent.cpp

Index: clang/test/OpenMP/target_attribute_convergent.cpp
===
--- clang/test/OpenMP/target_attribute_convergent.cpp
+++ clang/test/OpenMP/target_attribute_convergent.cpp
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -debug-info-kind=limited -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -o - | FileCheck %s
-// RUN: %clang_cc1 -debug-info-kind=limited -verify -fopenmp -x c++ -triple nvptx-unknown-unknown -fopenmp-targets=nvptx-nvidia-cuda -emit-llvm %s -fopenmp-is-device -o - | FileCheck %s
+// RUN: %clang_cc1 -debug-info-kind=limited -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -o - | FileCheck %s
+// RUN: %clang_cc1 -debug-info-kind=limited -verify -fopenmp -x c++ -triple nvptx-unknown-unknown -fopenmp-targets=nvptx-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -o - | FileCheck %s
 // expected-no-diagnostics
 
 #pragma omp declare target
@@ -9,5 +9,5 @@
 #pragma omp end declare target
 
 // CHECK: Function Attrs: {{.*}}convergent{{.*}}
-// CHECK: define hidden void @_Z3foov() [[ATTRIBUTE_NUMBER:#[0-9]+]]
+// CHECK: define protected void @_Z3foov() [[ATTRIBUTE_NUMBER:#[0-9]+]]
 // CHECK: attributes [[ATTRIBUTE_NUMBER]] = { {{.*}}convergent{{.*}} }
Index: clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
===
--- clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
+++ clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
@@ -1,8 +1,8 @@
 // Test target codegen - host bc file has to be created first.
 // RUN: %clang_cc1 -verify -fopenmp -x c++ -triple x86_64-unknown-linux -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-host.bc
-// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple x86_64-unknown-linux -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-host.bc -o - | FileCheck %s
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple x86_64-unknown-linux -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -fopenmp-host-ir-file-path %t-host.bc -o - | FileCheck %s
 // RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-linux-gnu -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-host.bc
-// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-linux-gnu -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-host.bc -o - | FileCheck %s
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-linux-gnu -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -fopenmp-host-ir-file-path %t-host.bc -o - | FileCheck %s
 // expected-no-diagnostics
 
 // CHECK-DAG: [[T:%.+]] = type {{.+}}, {{fp128|ppc_fp128}},
@@ -34,18 +34,18 @@
 #pragma omp declare target
 T a = T();
 T f = a;
-// CHECK: define{{ hidden | }}void @{{.+}}foo{{.+}}([[T]]* noundef byval([[T]]) align {{.+}})
+// CHECK: define{{ protected | }}void @{{.+}}foo{{.+}}([[T]]* noundef byval([[T]]) align {{.+}})
 void foo(T a = T()) {
   return;
 }
-// CHECK: define{{ hidden | }}[6 x i64] @{{.+}}bar{{.+}}()
+// CHECK: define{{ protected | }}[6 x i64] @{{.+}}bar{{.+}}()
 T bar() {
 // CHECK:  bitcast [[T]]* %{{.+}} to [6 x i64]*
 // CHECK-NEXT: load [6 x i64], [6 x i64]* %{{.+}},
 // CHECK-NEXT: ret [6 x i64]
   return T();
 }
-// CHECK: define{{ hidden | }}void @{{.+}}baz{{.+}}()
+// CHECK: define{{ protected | }}void @{{.+}}baz{{.+}}()
 void baz() {
 // CHECK:  call [6 x i64] @{{.+}}bar{{.+}}()
 // CHECK-NEXT: bitcast [[T]]* %{{.+}} to [6 x i64]*
@@ -54,17 +54,17 @@
 }
 T1 a1 = T1();
 T1 f1 = a1;
-// CHECK: define{{ hidden | }}void @{{.+}}foo1{{.+}}([[T1]]* noundef byval([[T1]]) align {{.+}})
+// CHECK: define{{ protected | }}void @{{.+}}foo1{{.+}}([[T1]]* noundef byval([[T1]]) align {{.+}})
 void foo1(T1 a = T1()) {
   return;
 

[PATCH] D117806: [OpenMP] Change default visibility to protected for device declarations

2022-01-20 Thread Joseph Huber via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
jhuber6 marked an inline comment as done.
Closed by commit rG0dfe953294ba: [OpenMP] Change default visibility to 
protected for device declarations (authored by jhuber6).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117806/new/

https://reviews.llvm.org/D117806

Files:
  clang/lib/AST/Decl.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/OpenMP/declare_target_codegen.cpp
  clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
  clang/test/OpenMP/nvptx_target_pure_deleted_codegen.cpp
  clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
  clang/test/OpenMP/target_attribute_convergent.cpp

Index: clang/test/OpenMP/target_attribute_convergent.cpp
===
--- clang/test/OpenMP/target_attribute_convergent.cpp
+++ clang/test/OpenMP/target_attribute_convergent.cpp
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -debug-info-kind=limited -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -o - | FileCheck %s
-// RUN: %clang_cc1 -debug-info-kind=limited -verify -fopenmp -x c++ -triple nvptx-unknown-unknown -fopenmp-targets=nvptx-nvidia-cuda -emit-llvm %s -fopenmp-is-device -o - | FileCheck %s
+// RUN: %clang_cc1 -debug-info-kind=limited -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -o - | FileCheck %s
+// RUN: %clang_cc1 -debug-info-kind=limited -verify -fopenmp -x c++ -triple nvptx-unknown-unknown -fopenmp-targets=nvptx-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -o - | FileCheck %s
 // expected-no-diagnostics
 
 #pragma omp declare target
@@ -9,5 +9,5 @@
 #pragma omp end declare target
 
 // CHECK: Function Attrs: {{.*}}convergent{{.*}}
-// CHECK: define hidden void @_Z3foov() [[ATTRIBUTE_NUMBER:#[0-9]+]]
+// CHECK: define protected void @_Z3foov() [[ATTRIBUTE_NUMBER:#[0-9]+]]
 // CHECK: attributes [[ATTRIBUTE_NUMBER]] = { {{.*}}convergent{{.*}} }
Index: clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
===
--- clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
+++ clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
@@ -1,8 +1,8 @@
 // Test target codegen - host bc file has to be created first.
 // RUN: %clang_cc1 -verify -fopenmp -x c++ -triple x86_64-unknown-linux -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-host.bc
-// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple x86_64-unknown-linux -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-host.bc -o - | FileCheck %s
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple x86_64-unknown-linux -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -fopenmp-host-ir-file-path %t-host.bc -o - | FileCheck %s
 // RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-linux-gnu -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-host.bc
-// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-linux-gnu -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-host.bc -o - | FileCheck %s
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-linux-gnu -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -fopenmp-host-ir-file-path %t-host.bc -o - | FileCheck %s
 // expected-no-diagnostics
 
 // CHECK-DAG: [[T:%.+]] = type {{.+}}, {{fp128|ppc_fp128}},
@@ -34,18 +34,18 @@
 #pragma omp declare target
 T a = T();
 T f = a;
-// CHECK: define{{ hidden | }}void @{{.+}}foo{{.+}}([[T]]* noundef byval([[T]]) align {{.+}})
+// CHECK: define{{ protected | }}void @{{.+}}foo{{.+}}([[T]]* noundef byval([[T]]) align {{.+}})
 void foo(T a = T()) {
   return;
 }
-// CHECK: define{{ hidden | }}[6 x i64] @{{.+}}bar{{.+}}()
+// CHECK: define{{ protected | }}[6 x i64] @{{.+}}bar{{.+}}()
 T bar() {
 // CHECK:  bitcast [[T]]* %{{.+}} to [6 x i64]*
 // CHECK-NEXT: load [6 x i64], [6 x i64]* %{{.+}},
 // CHECK-NEXT: ret [6 x i64]
   return T();
 }
-// CHECK: define{{ hidden | }}void @{{.+}}baz{{.+}}()
+// CHECK: define{{ protected | }}void @{{.+}}baz{{.+}}()
 void baz() {
 // CHECK:  call [6 x i64] @{{.+}}bar{{.+}}()
 // CHECK-NEXT: bitcast [[T]]* %{{.+}} to [6 x i64]*
@@ -54,17 +54,17 @@
 }
 T1 a1 = T1();
 T1 f1 = a1;
-// CHECK: define{{ hidden | }}void @{{.+}}foo1{{.+}}([[T1]]* noundef byval([[T1]]) align {{.+}})
+// CHECK: define{{ protected | }}void @{{.+}}foo1{{.+}}([[T1]]* noundef byval([[T1]]) align {{.+}})
 void foo1(T1 a = T1()) {
   return;
 }
-// CHECK: 

[PATCH] D117806: [OpenMP] Change default visibility to protected for device declarations

2022-01-20 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 marked an inline comment as done.
jhuber6 added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:5830
+  // host, makes the system more robust, and improves performance.
+  if (IsOpenMPDevice) {
+CmdArgs.push_back("-fvisibility");

JonChesterfield wrote:
> I think we need some more code here to detect if the user has already 
> specified a value for fvisibility so that we don't clobber it
I put the code there with the intent of putting the else there, but forgot to 
actually put it there.
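
The behaviour asked for in this comment can be sketched as a small standalone function. This is not the driver code from Clang.cpp: visibilityArgs is a hypothetical helper, plain strings stand in for the driver's argument list, and only the "-fvisibility=<value>" spelling is recognised.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the visibility arguments to forward to the
// device compile job. A user-supplied -fvisibility=<value> always wins;
// otherwise OpenMP device compilations default to protected visibility.
std::vector<std::string>
visibilityArgs(const std::vector<std::string> &UserArgs, bool IsOpenMPDevice) {
  const std::string Prefix = "-fvisibility=";

  // Walk backwards so that the last occurrence on the command line is used.
  for (auto It = UserArgs.rbegin(); It != UserArgs.rend(); ++It)
    if (It->compare(0, Prefix.size(), Prefix) == 0)
      return {"-fvisibility", It->substr(Prefix.size())};

  // No user choice was found: apply the device default, if any.
  if (IsOpenMPDevice)
    return {"-fvisibility", "protected"};
  return {};
}
```

Putting the default in the fallback branch is what keeps the two cases mutually exclusive: the protected default is only emitted when no user value was found.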


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117806/new/

https://reviews.llvm.org/D117806



[PATCH] D117806: [OpenMP] Change default visibility to protected for device declarations

2022-01-20 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 401811.
jhuber6 added a comment.

Forgot to make this mutually exclusive with a user-defined visibility value.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117806/new/

https://reviews.llvm.org/D117806

Files:
  clang/lib/AST/Decl.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/OpenMP/declare_target_codegen.cpp
  clang/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp
  clang/test/OpenMP/nvptx_target_pure_deleted_codegen.cpp
  clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
  clang/test/OpenMP/target_attribute_convergent.cpp

Index: clang/test/OpenMP/target_attribute_convergent.cpp
===
--- clang/test/OpenMP/target_attribute_convergent.cpp
+++ clang/test/OpenMP/target_attribute_convergent.cpp
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -debug-info-kind=limited -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -o - | FileCheck %s
-// RUN: %clang_cc1 -debug-info-kind=limited -verify -fopenmp -x c++ -triple nvptx-unknown-unknown -fopenmp-targets=nvptx-nvidia-cuda -emit-llvm %s -fopenmp-is-device -o - | FileCheck %s
+// RUN: %clang_cc1 -debug-info-kind=limited -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -o - | FileCheck %s
+// RUN: %clang_cc1 -debug-info-kind=limited -verify -fopenmp -x c++ -triple nvptx-unknown-unknown -fopenmp-targets=nvptx-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -o - | FileCheck %s
 // expected-no-diagnostics
 
 #pragma omp declare target
@@ -9,5 +9,5 @@
 #pragma omp end declare target
 
 // CHECK: Function Attrs: {{.*}}convergent{{.*}}
-// CHECK: define hidden void @_Z3foov() [[ATTRIBUTE_NUMBER:#[0-9]+]]
+// CHECK: define protected void @_Z3foov() [[ATTRIBUTE_NUMBER:#[0-9]+]]
 // CHECK: attributes [[ATTRIBUTE_NUMBER]] = { {{.*}}convergent{{.*}} }
Index: clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
===
--- clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
+++ clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp
@@ -1,8 +1,8 @@
 // Test target codegen - host bc file has to be created first.
 // RUN: %clang_cc1 -verify -fopenmp -x c++ -triple x86_64-unknown-linux -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-host.bc
-// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple x86_64-unknown-linux -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-host.bc -o - | FileCheck %s
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple x86_64-unknown-linux -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -fopenmp-host-ir-file-path %t-host.bc -o - | FileCheck %s
 // RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-linux-gnu -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-host.bc
-// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-linux-gnu -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-host.bc -o - | FileCheck %s
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-linux-gnu -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fvisibility protected -fopenmp-host-ir-file-path %t-host.bc -o - | FileCheck %s
 // expected-no-diagnostics
 
 // CHECK-DAG: [[T:%.+]] = type {{.+}}, {{fp128|ppc_fp128}},
@@ -34,18 +34,18 @@
 #pragma omp declare target
 T a = T();
 T f = a;
-// CHECK: define{{ hidden | }}void @{{.+}}foo{{.+}}([[T]]* noundef byval([[T]]) align {{.+}})
+// CHECK: define{{ protected | }}void @{{.+}}foo{{.+}}([[T]]* noundef byval([[T]]) align {{.+}})
 void foo(T a = T()) {
   return;
 }
-// CHECK: define{{ hidden | }}[6 x i64] @{{.+}}bar{{.+}}()
+// CHECK: define{{ protected | }}[6 x i64] @{{.+}}bar{{.+}}()
 T bar() {
 // CHECK:  bitcast [[T]]* %{{.+}} to [6 x i64]*
 // CHECK-NEXT: load [6 x i64], [6 x i64]* %{{.+}},
 // CHECK-NEXT: ret [6 x i64]
   return T();
 }
-// CHECK: define{{ hidden | }}void @{{.+}}baz{{.+}}()
+// CHECK: define{{ protected | }}void @{{.+}}baz{{.+}}()
 void baz() {
 // CHECK:  call [6 x i64] @{{.+}}bar{{.+}}()
 // CHECK-NEXT: bitcast [[T]]* %{{.+}} to [6 x i64]*
@@ -54,17 +54,17 @@
 }
 T1 a1 = T1();
 T1 f1 = a1;
-// CHECK: define{{ hidden | }}void @{{.+}}foo1{{.+}}([[T1]]* noundef byval([[T1]]) align {{.+}})
+// CHECK: define{{ protected | }}void @{{.+}}foo1{{.+}}([[T1]]* noundef byval([[T1]]) align {{.+}})
 void foo1(T1 a = T1()) {
   return;
 }
-// CHECK: define{{ hidden | }}[[T1]] @{{.+}}bar1{{.+}}()
+// CHECK: define{{ protected | }}[[T1]] 

[PATCH] D116910: [OpenMP][3/3] Introduce the KernelEnvironment into Clang tests

2022-01-21 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 accepted this revision.
jhuber6 added a comment.
This revision is now accepted and ready to land.

LGTM if it passes all the tests.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116910/new/

https://reviews.llvm.org/D116910


[PATCH] D117706: [openmp] Unconditionally set march commandline argument

2022-01-19 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 accepted this revision.
jhuber6 added a comment.
This revision is now accepted and ready to land.

LGTM. With this you should be able to replace the calls that query the AMDGPU 
arch with a lookup in the ToolChain args, e.g. 
`TCArgs.getLastArgValue(options::OPT_march_EQ)`
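
As a rough illustration of the "last occurrence wins" lookup this suggestion relies on, here is a standalone sketch. It does not use LLVM's option library; `lastArgValue` and the prefix-based matching are assumptions made for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a "last value wins" lookup; it does not use
// LLVM's option library. Returns an empty string when no argument matches.
std::string lastArgValue(const std::vector<std::string> &Args,
                         const std::string &Prefix) {
  std::string Result;
  for (const std::string &Arg : Args) {
    // A later match overwrites an earlier one, so the last one is returned.
    if (Arg.compare(0, Prefix.size(), Prefix) == 0)
      Result = Arg.substr(Prefix.size());
  }
  return Result;
}
```

Once the driver always sets `-march`, a lookup of this kind would be enough to recover the architecture without probing the system again.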


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117706/new/

https://reviews.llvm.org/D117706


[PATCH] D117049: [OpenMP] Add support for embedding bitcode images in wrapper tool

2022-01-19 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

In D117049#3255748, @saiislam wrote:

> It seems that this patch along with D117156 and D117246 is giving 
> `patch application failed` error 
> [https://buildkite.com/llvm-project/diff-checks/builds/82688].
> `arc patch` is also giving the same error.

I think what happened is that a small fix patch I had locally was left out of 
the stack, so the diffs no longer apply. I could probably try to squash them 
and reapply if it's important.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117049/new/

https://reviews.llvm.org/D117049


[PATCH] D117246: [OpenMP] Add support for linking AMDGPU images

2022-01-19 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:297
+
+std::string Arch = DAL->getLastArgValue(options::OPT_march_EQ).str();
+if (Arch.empty()) {

JonChesterfield wrote:
> This part is valuable as-is and probably independently testable, in that I 
> think -### will have no march= before this change and will contain 
> march=gfx*** after it. Though for that test to work reliably, I think it 
> would have to run on a machine where amdgpu-arch succeeded, which is probably 
> more bother than it's worth. Will check locally.
Yes, I believe with this we could remove the other calls to 
`checkSystemForAMDGPU` and just reference the `-march` in the driver arguments. 
Not sure about testing because it requires the binary of course.



Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:623
+  // Copy system library paths used by the host linker.
+  for (StringRef Arg : LinkerArgs)
+if (Arg.startswith("-L"))

JonChesterfield wrote:
> Does this mean to take the library paths the host used to link, and pass them 
> unchanged for the device link as well? Doesn't seem a given that this is the 
> right behaviour. Do we have a mechanism for passing arguments to the device 
> linker? Might be ugly, e.g. `-Xopenmp-target=sm_70 
> -Xopenmp-target=-Wl,whatever`
That's a good idea for general control over the arguments to the linker 
wrapper. I don't think the `-L` arguments are strictly necessary here. We could 
probably move to your approach if needed; the `-L` arguments were already 
present in the `clang-nvlink-wrapper`, so I just replicated that without 
checking whether they were necessary.
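
To make the two options concrete, a small standalone sketch: it copies `-L...` paths from the host link line and then appends any arguments given explicitly for the device link. `deviceLinkArgs` and the idea of a separate explicit-argument list are hypothetical, not something the wrapper provides in this patch.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: build the extra arguments for a device link step.
// Library search paths ("-L<dir>") are copied from the host link line, and
// any explicitly supplied device arguments are appended afterwards.
// Limitation: a separate "-L" "<dir>" pair would only have "-L" copied.
std::vector<std::string>
deviceLinkArgs(const std::vector<std::string> &HostLinkerArgs,
               const std::vector<std::string> &ExplicitDeviceArgs) {
  std::vector<std::string> Result;
  for (const std::string &Arg : HostLinkerArgs)
    if (Arg.rfind("-L", 0) == 0)
      Result.push_back(Arg);
  Result.insert(Result.end(), ExplicitDeviceArgs.begin(),
                ExplicitDeviceArgs.end());
  return Result;
}
```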


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117246/new/

https://reviews.llvm.org/D117246


[PATCH] D117634: [OpenMP] Expand short verisions of OpenMP offloading triples

2022-01-19 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 401357.
jhuber6 edited the summary of this revision.
jhuber6 added a comment.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Changing approach to simply expand the triple where we parse it for OpenMP.
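
A simplified standalone sketch of the expansion, for readers who want the idea without the driver context. It works on plain strings rather than `llvm::Triple`, only handles a bare architecture name, and `expandOffloadTriple` is a hypothetical name; the target triples are the ones used in the diff below.

```cpp
#include <string>

// Hypothetical helper that mirrors the idea of the patch without using
// llvm::Triple: expand a bare architecture name to the full triple used for
// the device bitcode libraries. Anything else is returned unchanged.
std::string expandOffloadTriple(const std::string &Triple) {
  if (Triple == "nvptx")
    return "nvptx-nvidia-cuda";
  if (Triple == "nvptx64")
    return "nvptx64-nvidia-cuda";
  if (Triple == "amdgcn")
    return "amdgcn-amd-amdhsa";
  return Triple;
}
```

The actual change checks for an unknown vendor or OS on the parsed triple, so it also covers partially specified triples that this string comparison would miss.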


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117634/new/

https://reviews.llvm.org/D117634

Files:
  clang/lib/Driver/Driver.cpp
  clang/test/Driver/fat_archive_nvptx.cpp
  openmp/libomptarget/DeviceRTL/CMakeLists.txt


Index: openmp/libomptarget/DeviceRTL/CMakeLists.txt
===
--- openmp/libomptarget/DeviceRTL/CMakeLists.txt
+++ openmp/libomptarget/DeviceRTL/CMakeLists.txt
@@ -227,7 +227,7 @@
 
 # Generate a Bitcode library for all the compute capabilities the user requested
 foreach(sm ${nvptx_sm_list})
-  compileDeviceRTLLibrary(sm_${sm} nvptx -target nvptx64 -Xclang -target-feature -Xclang +ptx61 "-D__CUDA_ARCH__=${sm}0")
+  compileDeviceRTLLibrary(sm_${sm} nvptx -target nvptx64-nvidia-cuda -Xclang -target-feature -Xclang +ptx61 "-D__CUDA_ARCH__=${sm}0")
 endforeach()
 
 foreach(mcpu ${amdgpu_mcpus})
Index: clang/test/Driver/fat_archive_nvptx.cpp
===
--- clang/test/Driver/fat_archive_nvptx.cpp
+++ clang/test/Driver/fat_archive_nvptx.cpp
@@ -6,9 +6,9 @@
 
 // Given a FatArchive, clang-offload-bundler should be called to create a
 // device specific archive, which should be passed to clang-nvlink-wrapper.
-// RUN: %clang -O2 -### -fopenmp -fopenmp-targets=nvptx64 %s -L%S/Inputs/openmp_static_device_link -lFatArchive 2>&1 | FileCheck %s
-// CHECK: clang{{.*}}"-cc1"{{.*}}"-triple" "nvptx64"{{.*}}"-target-cpu" "[[GPU:sm_[0-9]+]]"{{.*}}"-o" "[[HOSTBC:.*.s]]" "-x" "c++"{{.*}}.cpp
-// CHECK: clang-offload-bundler" "-unbundle" "-type=a" "-inputs={{.*}}/Inputs/openmp_static_device_link/libFatArchive.a" "-targets=openmp-nvptx64-[[GPU]]" "-outputs=[[DEVICESPECIFICARCHIVE:.*.a]]" "-allow-missing-bundles"
+// RUN: %clang -O2 -### -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda %s -L%S/Inputs/openmp_static_device_link -lFatArchive 2>&1 | FileCheck %s
+// CHECK: clang{{.*}}"-cc1"{{.*}}"-triple" "nvptx64-nvidia-cuda"{{.*}}"-target-cpu" "[[GPU:sm_[0-9]+]]"{{.*}}"-o" "[[HOSTBC:.*.s]]" "-x" "c++"{{.*}}.cpp
+// CHECK: clang-offload-bundler" "-unbundle" "-type=a" "-inputs={{.*}}/Inputs/openmp_static_device_link/libFatArchive.a" "-targets=openmp-nvptx64-nvidia-cuda-[[GPU]]" "-outputs=[[DEVICESPECIFICARCHIVE:.*.a]]" "-allow-missing-bundles"
 // CHECK: clang-nvlink-wrapper{{.*}}"-o" "{{.*}}.out" "-arch" "[[GPU]]" "{{.*}}[[DEVICESPECIFICARCHIVE]]"
 // expected-no-diagnostics
 
@@ -72,8 +72,8 @@
 clang -O2 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx908 -c func_1.c -o func_1_gfx908.o
 clang -O2 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx906 -c func_2.c -o func_2_gfx906.o
 clang -O2 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx908 -c func_2.c -o func_2_gfx908.o
-clang -O2 -fopenmp -fopenmp-targets=nvptx64 -c func_1.c -o func_1_nvptx.o
-clang -O2 -fopenmp -fopenmp-targets=nvptx64 -c func_2.c -o func_2_nvptx.o
+clang -O2 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -c func_1.c -o func_1_nvptx.o
+clang -O2 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -c func_2.c -o func_2_nvptx.o
 
 2. Create a fat archive by combining all the object file(s)
 llvm-ar cr libFatArchive.a func_1_gfx906.o func_1_gfx908.o func_2_gfx906.o func_2_gfx908.o func_1_nvptx.o func_2_nvptx.o
Index: clang/lib/Driver/Driver.cpp
===
--- clang/lib/Driver/Driver.cpp
+++ clang/lib/Driver/Driver.cpp
@@ -774,6 +774,18 @@
   llvm::Triple TT(Val);
   std::string NormalizedName = TT.normalize();
 
+  // We want to normalize the shortened versions of triples passed in to
+  // the values used for the bitcode libraries.
+  if (TT.getVendor() == llvm::Triple::UnknownVendor ||
+  TT.getOS() == llvm::Triple::UnknownOS) {
+if (TT.getArch() == llvm::Triple::nvptx)
+  TT = llvm::Triple("nvptx-nvidia-cuda");
+else if (TT.getArch() == llvm::Triple::nvptx64)
+  TT = llvm::Triple("nvptx64-nvidia-cuda");
+else if (TT.getArch() == llvm::Triple::amdgcn)
+  TT = llvm::Triple("amdgcn-amd-amdhsa");
+  }
+
   // Make sure we don't have a duplicate triple.
   auto Duplicate = FoundNormalizedTriples.find(NormalizedName);
   if (Duplicate != FoundNormalizedTriples.end()) {


[PATCH] D117362: [OpenMP] Remove hidden visibility for declare target variables

2022-01-19 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a subscriber: ronlieb.
jhuber6 added a comment.

In D117362#3254681, @JonChesterfield wrote:

> If I'm following correctly, this broke the amdgpu buildbot and it has been 
> moved into staging as a workaround. I haven't debugged what breaks but 
> 'DefaultVisibility' is relatively likely to behave differently on cuda vs hsa 
> systems so that's my first guess.

Yes, this patch is necessary because NVIDIA and AMDGPU handle `hidden` 
visibility differently. Without this patch we cannot run any code that uses 
`#pragma omp declare target` on AMDGPU, because the declared variables are all 
hidden. The runtime will try to load them and fail, so we crash.




Comment at: clang/lib/AST/Decl.cpp:792
+if (OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(Var))
+  return LinkageInfo::external();
+

JonChesterfield wrote:
> Would this change static variables to non-static and thus introduce multiple 
> definition errors? Not immediately obvious to me that the variables have to 
> be directly visible to other translation units
That was my first guess. I had @ronlieb try an alternate approach where we run 
everything as normal, but at the end inherit the linkage and set the 
visibility to default. He said it still caused problems.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117362/new/

https://reviews.llvm.org/D117362


[PATCH] D117634: [OpenMP] Expand short verisions of OpenMP offloading triples

2022-01-19 Thread Joseph Huber via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG28d718602ad2: [OpenMP] Expand short verisions of OpenMP 
offloading triples (authored by jhuber6).

Changed prior to commit:
  https://reviews.llvm.org/D117634?vs=401357=401454#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117634/new/

https://reviews.llvm.org/D117634

Files:
  clang/lib/Driver/Driver.cpp
  clang/test/Driver/fat_archive_nvptx.cpp
  openmp/libomptarget/DeviceRTL/CMakeLists.txt


Index: openmp/libomptarget/DeviceRTL/CMakeLists.txt
===
--- openmp/libomptarget/DeviceRTL/CMakeLists.txt
+++ openmp/libomptarget/DeviceRTL/CMakeLists.txt
@@ -227,7 +227,7 @@
 
 # Generate a Bitcode library for all the compute capabilities the user requested
 foreach(sm ${nvptx_sm_list})
-  compileDeviceRTLLibrary(sm_${sm} nvptx -target nvptx64 -Xclang -target-feature -Xclang +ptx61 "-D__CUDA_ARCH__=${sm}0")
+  compileDeviceRTLLibrary(sm_${sm} nvptx -target nvptx64-nvidia-cuda -Xclang -target-feature -Xclang +ptx61 "-D__CUDA_ARCH__=${sm}0")
 endforeach()
 
 foreach(mcpu ${amdgpu_mcpus})
Index: clang/test/Driver/fat_archive_nvptx.cpp
===
--- clang/test/Driver/fat_archive_nvptx.cpp
+++ clang/test/Driver/fat_archive_nvptx.cpp
@@ -6,9 +6,9 @@
 
 // Given a FatArchive, clang-offload-bundler should be called to create a
 // device specific archive, which should be passed to clang-nvlink-wrapper.
-// RUN: %clang -O2 -### -fopenmp -fopenmp-targets=nvptx64 %s -L%S/Inputs/openmp_static_device_link -lFatArchive 2>&1 | FileCheck %s
-// CHECK: clang{{.*}}"-cc1"{{.*}}"-triple" "nvptx64"{{.*}}"-target-cpu" "[[GPU:sm_[0-9]+]]"{{.*}}"-o" "[[HOSTBC:.*.s]]" "-x" "c++"{{.*}}.cpp
-// CHECK: clang-offload-bundler" "-unbundle" "-type=a" "-inputs={{.*}}/Inputs/openmp_static_device_link/libFatArchive.a" "-targets=openmp-nvptx64-[[GPU]]" "-outputs=[[DEVICESPECIFICARCHIVE:.*.a]]" "-allow-missing-bundles"
+// RUN: %clang -O2 -### -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda %s -L%S/Inputs/openmp_static_device_link -lFatArchive 2>&1 | FileCheck %s
+// CHECK: clang{{.*}}"-cc1"{{.*}}"-triple" "nvptx64-nvidia-cuda"{{.*}}"-target-cpu" "[[GPU:sm_[0-9]+]]"{{.*}}"-o" "[[HOSTBC:.*.s]]" "-x" "c++"{{.*}}.cpp
+// CHECK: clang-offload-bundler" "-unbundle" "-type=a" "-inputs={{.*}}/Inputs/openmp_static_device_link/libFatArchive.a" "-targets=openmp-nvptx64-nvidia-cuda-[[GPU]]" "-outputs=[[DEVICESPECIFICARCHIVE:.*.a]]" "-allow-missing-bundles"
 // CHECK: clang-nvlink-wrapper{{.*}}"-o" "{{.*}}.out" "-arch" "[[GPU]]" "{{.*}}[[DEVICESPECIFICARCHIVE]]"
 // expected-no-diagnostics
 
@@ -72,8 +72,8 @@
 clang -O2 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx908 -c func_1.c -o func_1_gfx908.o
 clang -O2 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx906 -c func_2.c -o func_2_gfx906.o
 clang -O2 -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx908 -c func_2.c -o func_2_gfx908.o
-clang -O2 -fopenmp -fopenmp-targets=nvptx64 -c func_1.c -o func_1_nvptx.o
-clang -O2 -fopenmp -fopenmp-targets=nvptx64 -c func_2.c -o func_2_nvptx.o
+clang -O2 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -c func_1.c -o func_1_nvptx.o
+clang -O2 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -c func_2.c -o func_2_nvptx.o
 
 2. Create a fat archive by combining all the object file(s)
 llvm-ar cr libFatArchive.a func_1_gfx906.o func_1_gfx908.o func_2_gfx906.o func_2_gfx908.o func_1_nvptx.o func_2_nvptx.o
Index: clang/lib/Driver/Driver.cpp
===
--- clang/lib/Driver/Driver.cpp
+++ clang/lib/Driver/Driver.cpp
@@ -774,6 +774,18 @@
   llvm::Triple TT(Val);
   std::string NormalizedName = TT.normalize();
 
+  // We want to expand the shortened versions of the triples passed in to
+  // the values used for the bitcode libraries for convenience.
+  if (TT.getVendor() == llvm::Triple::UnknownVendor ||
+  TT.getOS() == llvm::Triple::UnknownOS) {
+if (TT.getArch() == llvm::Triple::nvptx)
+  TT = llvm::Triple("nvptx-nvidia-cuda");
+else if (TT.getArch() == llvm::Triple::nvptx64)
+  TT = llvm::Triple("nvptx64-nvidia-cuda");
+else if (TT.getArch() == llvm::Triple::amdgcn)
+  TT = llvm::Triple("amdgcn-amd-amdhsa");
+  }
+
   // Make sure we don't have a duplicate triple.
   auto Duplicate = FoundNormalizedTriples.find(NormalizedName);
   if (Duplicate != FoundNormalizedTriples.end()) {


[PATCH] D117246: [OpenMP] Add support for linking AMDGPU images

2022-01-19 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 401466.
jhuber6 added a comment.

Updating after upstreaming a portion of this patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117246/new/

https://reviews.llvm.org/D117246

Files:
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp


Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -512,7 +512,7 @@
   // Create a new file to write the linked device image to.
   SmallString<128> TempFile;
   if (std::error_code EC = sys::fs::createTemporaryFile(
-  TheTriple.getArchName() + "-" + Arch, "cubin", TempFile))
+  "lto-" + TheTriple.getArchName() + "-" + Arch, "cubin", TempFile))
 return createFileError(TempFile, EC);
   TempFiles.push_back(static_cast(TempFile));
 
@@ -590,6 +590,50 @@
   return static_cast(TempFile);
 }
 } // namespace nvptx
+namespace amdgcn {
+Expected link(ArrayRef InputFiles,
+   ArrayRef LinkerArgs, Triple TheTriple,
+   StringRef Arch) {
+  // AMDGPU uses the lld binary to link device object files.
+  ErrorOr LLDPath =
+  sys::findProgramByName("lld", sys::path::parent_path(LinkerExecutable));
+  if (!LLDPath)
+LLDPath = sys::findProgramByName("lld");
+  if (!LLDPath)
+return createStringError(LLDPath.getError(),
+ "Unable to find 'lld' in path");
+
+  // Create a new file to write the linked device image to.
+  SmallString<128> TempFile;
+  if (std::error_code EC = sys::fs::createTemporaryFile(
+  TheTriple.getArchName() + "-" + Arch + "-image", "out", TempFile))
+return createFileError(TempFile, EC);
+  TempFiles.push_back(static_cast(TempFile));
+
+  SmallVector CmdArgs;
+  CmdArgs.push_back(*LLDPath);
+  CmdArgs.push_back("-flavor");
+  CmdArgs.push_back("gnu");
+  CmdArgs.push_back("--no-undefined");
+  CmdArgs.push_back("-shared");
+  CmdArgs.push_back("-o");
+  CmdArgs.push_back(TempFile);
+
+  // Copy system library paths used by the host linker.
+  for (StringRef Arg : LinkerArgs)
+if (Arg.startswith("-L"))
+  CmdArgs.push_back(Arg);
+
+  // Add extracted input files.
+  for (StringRef Input : InputFiles)
+CmdArgs.push_back(Input);
+
+  if (sys::ExecuteAndWait(*LLDPath, CmdArgs))
+return createStringError(inconvertibleErrorCode(), "'lld' failed");
+
+  return static_cast(TempFile);
+}
+} // namespace amdgcn
 
 Expected linkDevice(ArrayRef InputFiles,
  ArrayRef LinkerArgs,
@@ -599,7 +643,7 @@
   case Triple::nvptx64:
 return nvptx::link(InputFiles, LinkerArgs, TheTriple, Arch);
   case Triple::amdgcn:
-// TODO: AMDGCN linking support.
+return amdgcn::link(InputFiles, LinkerArgs, TheTriple, Arch);
   case Triple::x86:
   case Triple::x86_64:
 // TODO: x86 linking support.


[PATCH] D117777: [OpenMP] Don't pass empty files to nvlink

2022-01-20 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision.
jhuber6 added reviewers: jdoerfert, ye-luo, lechenyu.
Herald added subscribers: guansong, yaxunl.
jhuber6 requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

This patch adds an exception to the nvlink wrapper tool to not pass
empty cubin files to the nvlink job. If an empty file is passed to
nvlink it will cause an error indicating that the file could not be
opened. This would occur if the user tried to link object files that
contained offloading code with a file that didn't. This will act as a 
workaround until the new OpenMP offloading driver becomes the default.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D11

Files:
  clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp


Index: clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp
===
--- clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp
+++ clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp
@@ -55,12 +55,22 @@
 static cl::list<std::string>
     NVArgs(cl::Sink, cl::desc("..."));
 
+static bool isEmptyFile(StringRef Filename) {
+  ErrorOr<std::unique_ptr<MemoryBuffer>> BufOrErr =
+      MemoryBuffer::getFileOrSTDIN(Filename, false, false);
+  if (std::error_code EC = BufOrErr.getError())
+    return false;
+  return (*BufOrErr)->getBuffer().empty();
+}
+
 static Error runNVLink(std::string NVLinkPath,
                        SmallVectorImpl<std::string> &Args) {
   std::vector<StringRef> NVLArgs;
   NVLArgs.push_back(NVLinkPath);
+  StringRef Output = *(llvm::find(Args, "-o") + 1);
   for (auto &Arg : Args) {
-    NVLArgs.push_back(Arg);
+    if (!(sys::fs::exists(Arg) && Arg != Output && isEmptyFile(Arg)))
+      NVLArgs.push_back(Arg);
   }
 
   if (sys::ExecuteAndWait(NVLinkPath, NVLArgs))




[PATCH] D117777: [OpenMP] Don't pass empty files to nvlink

2022-01-20 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 401646.
jhuber6 added a comment.
Herald added a subscriber: asavonic.

Adding test


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D11/new/

https://reviews.llvm.org/D11

Files:
  clang/test/Driver/Inputs/openmp_static_device_link/empty.o
  clang/test/Driver/Inputs/openmp_static_device_link/lib.bc
  clang/test/Driver/fat_archive_nvptx.cpp
  clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp


Index: clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp
===
--- clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp
+++ clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp
@@ -55,12 +55,22 @@
 static cl::list<std::string>
     NVArgs(cl::Sink, cl::desc("..."));
 
+static bool isEmptyFile(StringRef Filename) {
+  ErrorOr<std::unique_ptr<MemoryBuffer>> BufOrErr =
+      MemoryBuffer::getFileOrSTDIN(Filename, false, false);
+  if (std::error_code EC = BufOrErr.getError())
+    return false;
+  return (*BufOrErr)->getBuffer().empty();
+}
+
 static Error runNVLink(std::string NVLinkPath,
                        SmallVectorImpl<std::string> &Args) {
   std::vector<StringRef> NVLArgs;
   NVLArgs.push_back(NVLinkPath);
+  StringRef Output = *(llvm::find(Args, "-o") + 1);
   for (auto &Arg : Args) {
-    NVLArgs.push_back(Arg);
+    if (!(sys::fs::exists(Arg) && Arg != Output && isEmptyFile(Arg)))
+      NVLArgs.push_back(Arg);
   }
 
   if (sys::ExecuteAndWait(NVLinkPath, NVLArgs))
Index: clang/test/Driver/fat_archive_nvptx.cpp
===
--- clang/test/Driver/fat_archive_nvptx.cpp
+++ clang/test/Driver/fat_archive_nvptx.cpp
@@ -10,7 +10,8 @@
 // CHECK: clang{{.*}}"-cc1"{{.*}}"-triple" "nvptx64-nvidia-cuda"{{.*}}"-target-cpu" "[[GPU:sm_[0-9]+]]"{{.*}}"-o" "[[HOSTBC:.*.s]]" "-x" "c++"{{.*}}.cpp
 // CHECK: clang-offload-bundler" "-unbundle" "-type=a" "-inputs={{.*}}/Inputs/openmp_static_device_link/libFatArchive.a" "-targets=openmp-nvptx64-nvidia-cuda-[[GPU]]" "-outputs=[[DEVICESPECIFICARCHIVE:.*.a]]" "-allow-missing-bundles"
 // CHECK: clang-nvlink-wrapper{{.*}}"-o" "{{.*}}.out" "-arch" "[[GPU]]" "{{.*}}[[DEVICESPECIFICARCHIVE]]"
-// expected-no-diagnostics
+// RUN: not %clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda %s %S/Inputs/openmp_static_device_link/empty.o --libomptarget-nvptx-bc-path=%S/Inputs/openmp_static_device_link/lib.bc 2>&1 | FileCheck %s --check-prefix=EMPTY
+// EMPTY-NOT: Could not open input file
 
 #ifndef HEADER
 #define HEADER


[PATCH] D117246: [OpenMP] Add support for linking AMDGPU images

2022-01-25 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 402929.
jhuber6 added a comment.

Update commits.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117246/new/

https://reviews.llvm.org/D117246

Files:
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp


Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -512,7 +512,7 @@
   // Create a new file to write the linked device image to.
   SmallString<128> TempFile;
   if (std::error_code EC = sys::fs::createTemporaryFile(
-  TheTriple.getArchName() + "-" + Arch, "cubin", TempFile))
+  "lto-" + TheTriple.getArchName() + "-" + Arch, "cubin", TempFile))
 return createFileError(TempFile, EC);
   TempFiles.push_back(static_cast(TempFile));
 
@@ -590,6 +590,50 @@
   return static_cast(TempFile);
 }
 } // namespace nvptx
+namespace amdgcn {
+Expected link(ArrayRef InputFiles,
+   ArrayRef LinkerArgs, Triple TheTriple,
+   StringRef Arch) {
+  // AMDGPU uses the lld binary to link device object files.
+  ErrorOr LLDPath =
+  sys::findProgramByName("lld", sys::path::parent_path(LinkerExecutable));
+  if (!LLDPath)
+LLDPath = sys::findProgramByName("lld");
+  if (!LLDPath)
+return createStringError(LLDPath.getError(),
+ "Unable to find 'lld' in path");
+
+  // Create a new file to write the linked device image to.
+  SmallString<128> TempFile;
+  if (std::error_code EC = sys::fs::createTemporaryFile(
+  TheTriple.getArchName() + "-" + Arch + "-image", "out", TempFile))
+return createFileError(TempFile, EC);
+  TempFiles.push_back(static_cast(TempFile));
+
+  SmallVector CmdArgs;
+  CmdArgs.push_back(*LLDPath);
+  CmdArgs.push_back("-flavor");
+  CmdArgs.push_back("gnu");
+  CmdArgs.push_back("--no-undefined");
+  CmdArgs.push_back("-shared");
+  CmdArgs.push_back("-o");
+  CmdArgs.push_back(TempFile);
+
+  // Copy system library paths used by the host linker.
+  for (StringRef Arg : LinkerArgs)
+if (Arg.startswith("-L"))
+  CmdArgs.push_back(Arg);
+
+  // Add extracted input files.
+  for (StringRef Input : InputFiles)
+CmdArgs.push_back(Input);
+
+  if (sys::ExecuteAndWait(*LLDPath, CmdArgs))
+return createStringError(inconvertibleErrorCode(), "'lld' failed");
+
+  return static_cast(TempFile);
+}
+} // namespace amdgcn
 
 Expected linkDevice(ArrayRef InputFiles,
  ArrayRef LinkerArgs,
@@ -599,7 +643,7 @@
   case Triple::nvptx64:
 return nvptx::link(InputFiles, LinkerArgs, TheTriple, Arch);
   case Triple::amdgcn:
-// TODO: AMDGCN linking support.
+return amdgcn::link(InputFiles, LinkerArgs, TheTriple, Arch);
   case Triple::x86:
   case Triple::x86_64:
 // TODO: x86 linking support.



[PATCH] D118155: [OpenMP] Improve symbol resolution for OpenMP Offloading LTO

2022-01-25 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision.
jhuber6 added reviewers: jdoerfert, JonChesterfield, ronlieb, saiislam.
Herald added subscribers: guansong, inglorion, yaxunl.
jhuber6 requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

This patch improves the symbol resolution done for LTO with offloading
applications. The symbol resolution done here allows the LTO backend to
internalize more functions. The symbol resolution done is a simplified
view that does not take into account various options like `--wrap` or
`--dynamic-list` and always assumes we are creating a shared object.
The actual target may be an executable, but semantically it is used as a
shared object because certain objects need to be visible outside of the
executable when they are read by the OpenMP plugin.

Depends on D117246 
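
For readers skimming the diff, the resolution scheme described above can be sketched in isolation. This is only an illustration under stated assumptions: `Sym`, `Resolution` and `resolve` are hypothetical stand-ins rather than the LLVM types, and the `ExportDynamic` rule is a guess, since the corresponding hunk is cut off in this message.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the symbol information read from a bitcode file.
struct Sym {
  std::string Name;
  bool Undefined;
};

// Hypothetical stand-in for the per-symbol resolution handed to LTO.
struct Resolution {
  bool Prevailing = false;
  bool VisibleToRegularObj = false;
  bool ExportDynamic = false;
};

std::vector<Resolution> resolve(const std::vector<Sym> &Symbols,
                                std::set<std::string> &PrevailingSymbols,
                                const std::set<std::string> &UsedInRegularObj,
                                const std::set<std::string> &UsedInSharedLib) {
  std::vector<Resolution> Out;
  for (const Sym &S : Symbols) {
    Resolution R;
    // The first definition seen prevails; insert().second is false if the
    // name was already recorded. Undefined symbols never prevail.
    R.Prevailing = !S.Undefined && PrevailingSymbols.insert(S.Name).second;
    // Symbols referenced from regular objects must be kept by LTO.
    R.VisibleToRegularObj = UsedInRegularObj.count(S.Name) != 0;
    // Assumption: symbols seen in shared libraries are exported dynamically.
    R.ExportDynamic = UsedInSharedLib.count(S.Name) != 0;
    Out.push_back(R);
  }
  return Out;
}
```

Anything that is neither prevailing nor visible elsewhere is a candidate for internalization, which is where the extra optimization opportunity would come from.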


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D118155

Files:
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -750,6 +750,7 @@
   SmallVector, 4> BitcodeFiles;
   SmallVector NewInputFiles;
   StringMap UsedInRegularObj;
+  StringMap UsedInSharedLib;
 
   // Search for bitcode files in the input and create an LTO input file. If it
   // is not a bitcode file, scan its symbol table for symbols we need to
@@ -773,7 +774,11 @@
 if (!Name)
   return Name.takeError();
 
-UsedInRegularObj[*Name] = true;
+// Record if we've seen these symbols in any object or shared libraries.
+if ((*ObjFile)->isRelocatableObject()) {
+  UsedInRegularObj[*Name] = true;
+} else
+  UsedInSharedLib[*Name] = true;
   }
 } else {
   Expected> InputFileOrErr =
@@ -781,6 +786,7 @@
   if (!InputFileOrErr)
 return InputFileOrErr.takeError();
 
+  // Save the input file and the buffer associated with its memory.
   BitcodeFiles.push_back(std::move(*InputFileOrErr));
   SavedBuffers.push_back(std::move(*BufferOrErr));
 }
@@ -811,22 +817,16 @@
 return false;
   };
 
-  // We have visibility of the whole program if every input is bitcode, all
-  // inputs are statically linked so there should be no external references.
+  // We assume visibility of the whole program if every input file was bitcode.
   bool WholeProgram = BitcodeFiles.size() == InputFiles.size();
   auto LTOBackend = (EmbedBC)
 ? createLTO(TheTriple, Arch, WholeProgram, LinkOnly)
 : createLTO(TheTriple, Arch, WholeProgram);
 
-  // TODO: Run more tests to verify that this is correct.
-  // Create the LTO instance with the necessary config and add the bitcode files
-  // to it after resolving symbols. We make a few assumptions about symbol
-  // resolution.
-  // 1. The target is going to be a stand-alone executable file.
-  // 2. We do not support relocatable object files.
-  // 3. All inputs are relocatable object files extracted from host binaries, so
-  //there is no resolution to a dynamic library.
-  StringMap PrevailingSymbols;
+  // We need to resolve the symbols so the LTO backend knows which symbols need
+  // to be kept or can be internalized. This is a simplified symbol resolution
+  // scheme to approximate the full resolution a linker would do.
+  DenseSet PrevailingSymbols;
   for (auto  : BitcodeFiles) {
 const auto Symbols = BitcodeFile->symbols();
 SmallVector Resolutions(Symbols.size());
@@ -835,35 +835,43 @@
   lto::SymbolResolution  = Resolutions[Idx++];
 
   // We will use this as the prevailing symbol definition in LTO unless
-  // it is undefined in the module or another symbol has already been used.
-  Res.Prevailing = !Sym.isUndefined() && !PrevailingSymbols[Sym.getName()];
-
-  // We need LTO to preserve symbols referenced in other object files, or
-  // are needed by the rest of the toolchain.
+  // it is undefined or another definition has already been used.
+  Res.Prevailing =
+  !Sym.isUndefined() && PrevailingSymbols.insert(Sym.getName()).second;
+
+  // We need LTO to preserve the following global symbols:
+  // 1) Symbols used in regular objects.
+  // 2) Sections that will be given a __start/__stop symbol.
+  // 3) Prevailing symbols that need to be visible to external libraries.
   Res.VisibleToRegularObj =
   UsedInRegularObj[Sym.getName()] ||
   isValidCIdentifier(Sym.getSectionName()) ||
-  (Res.Prevailing && Sym.getName().startswith("__omp"));
-
-  // We do not currently support shared libraries, so no symbols will be
-  // referenced externally by shared libraries.
-  Res.ExportDynamic = false;
-
-  // The result will 

[PATCH] D117049: [OpenMP] Add support for embedding bitcode images in wrapper tool

2022-01-25 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 402926.
jhuber6 added a comment.

Rework commits.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117049/new/

https://reviews.llvm.org/D117049

Files:
  clang/include/clang/Basic/DiagnosticDriverKinds.td
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -76,12 +76,18 @@
cl::desc("Path for the target bitcode library"),
cl::cat(ClangLinkerWrapperCategory));
 
+static cl::opt EmbedBC(
+"target-embed-bc", cl::ZeroOrMore,
+cl::desc("Embed linked bitcode instead of an executable device image."),
+cl::init(false), cl::cat(ClangLinkerWrapperCategory));
+
 // Do not parse linker options.
 static cl::list
-HostLinkerArgs(cl::Sink, cl::desc("..."));
+HostLinkerArgs(cl::Positional,
+   cl::desc("..."));
 
 /// Path of the current binary.
-static std::string LinkerExecutable;
+static const char *LinkerExecutable;
 
 /// Temporary files created by the linker wrapper.
 static SmallVector TempFiles;
@@ -422,8 +428,8 @@
 
   std::unique_ptr Buffer =
   MemoryBuffer::getMemBuffer(Library.getMemoryBufferRef(), false);
-  if (Error Err = writeArchive(TempFile, Members, true, Library.kind(),
-true, Library.isThin(), std::move(Buffer)))
+  if (Error Err = writeArchive(TempFile, Members, true, Library.kind(), true,
+   Library.isThin(), std::move(Buffer)))
 return std::move(Err);
 
   return static_cast(TempFile);
@@ -500,7 +506,7 @@
   return static_cast(TempFile);
 }
 
-Expected link(ArrayRef InputFiles,
+Expected link(ArrayRef InputFiles,
ArrayRef LinkerArgs, Triple TheTriple,
StringRef Arch) {
   // NVPTX uses the nvlink binary to link device object files.
@@ -534,7 +540,7 @@
   CmdArgs.push_back(Arg);
 
   // Add extracted input files.
-  for (auto Input : InputFiles)
+  for (StringRef Input : InputFiles)
 CmdArgs.push_back(Input);
 
   if (sys::ExecuteAndWait(*NvlinkPath, CmdArgs))
@@ -544,7 +550,7 @@
 }
 } // namespace nvptx
 
-Expected linkDevice(ArrayRef InputFiles,
+Expected linkDevice(ArrayRef InputFiles,
  ArrayRef LinkerArgs,
  Triple TheTriple, StringRef Arch) {
   switch (TheTriple.getArch()) {
@@ -611,8 +617,10 @@
   llvm_unreachable("Invalid optimization level");
 }
 
-std::unique_ptr createLTO(const Triple , StringRef Arch,
-bool WholeProgram) {
+template >
+std::unique_ptr createLTO(
+const Triple , StringRef Arch, bool WholeProgram,
+ModuleHook Hook = [](size_t, const Module &) { return true; }) {
   lto::Config Conf;
   lto::ThinBackend Backend;
   // TODO: Handle index-only thin-LTO
@@ -631,7 +639,7 @@
   Conf.PTO.LoopVectorization = Conf.OptLevel > 1;
   Conf.PTO.SLPVectorization = Conf.OptLevel > 1;
 
-  // TODO: Handle outputting bitcode using a module hook.
+  Conf.PostInternalizeModuleHook = Hook;
   if (TheTriple.isNVPTX())
 Conf.CGFileType = CGFT_AssemblyFile;
   else
@@ -651,11 +659,11 @@
  [](char C) { return C == '_' || isAlnum(C); });
 }
 
-Expected> linkBitcodeFiles(ArrayRef InputFiles,
- const Triple ,
- StringRef Arch) {
+Error linkBitcodeFiles(SmallVectorImpl ,
+   const Triple , StringRef Arch) {
   SmallVector, 4> SavedBuffers;
   SmallVector, 4> BitcodeFiles;
+  SmallVector NewInputFiles;
   StringMap UsedInRegularObj;
 
   // Search for bitcode files in the input and create an LTO input file. If it
@@ -674,6 +682,7 @@
   if (!ObjFile)
 return ObjFile.takeError();
 
+  NewInputFiles.push_back(File.str());
   for (auto  : (*ObjFile)->symbols()) {
 Expected Name = Sym.getName();
 if (!Name)
@@ -693,12 +702,36 @@
   }
 
   if (BitcodeFiles.empty())
-return None;
+return Error::success();
+
+  auto HandleError = [&](std::error_code EC) {
+logAllUnhandledErrors(errorCodeToError(EC),
+  WithColor::error(errs(), LinkerExecutable));
+exit(1);
+  };
+
+  // LTO Module hook to output bitcode without running the backend.
+  auto LinkOnly = [&](size_t Task, const Module ) {
+SmallString<128> TempFile;
+if (std::error_code EC = sys::fs::createTemporaryFile(
+"jit-" + TheTriple.getTriple(), "bc", TempFile))
+  HandleError(EC);
+std::error_code EC;
+raw_fd_ostream LinkedBitcode(TempFile, EC, sys::fs::OF_None);
+if (EC)
+  

[PATCH] D117048: [OpenMP] Link the bitcode library late for device LTO

2022-01-25 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 402925.
jhuber6 added a comment.

Squash other uncommitted changes.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117048/new/

https://reviews.llvm.org/D117048

Files:
  clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/Cuda.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -68,9 +68,14 @@
 
 static cl::opt OptLevel("opt-level",
  cl::desc("Optimization level for LTO"),
- cl::init("O0"),
+ cl::init("O2"),
  cl::cat(ClangLinkerWrapperCategory));
 
+static cl::opt
+BitcodeLibrary("target-library",
+   cl::desc("Path for the target bitcode library"),
+   cl::cat(ClangLinkerWrapperCategory));
+
 // Do not parse linker options.
 static cl::list
 HostLinkerArgs(cl::Sink, cl::desc("..."));
@@ -201,7 +206,7 @@
   std::unique_ptr Output = std::move(*OutputOrErr);
   std::copy(Contents->begin(), Contents->end(), Output->getBufferStart());
   if (Error E = Output->commit())
-return E;
+return std::move(E);
 
   DeviceFiles.emplace_back(DeviceTriple, Arch, TempFile);
   ToBeStripped.push_back(*Name);
@@ -229,7 +234,7 @@
 std::unique_ptr Output = std::move(*OutputOrErr);
 std::copy(Contents.begin(), Contents.end(), Output->getBufferStart());
 if (Error E = Output->commit())
-  return E;
+  return std::move(E);
 StripFile = TempFile;
   }
 
@@ -318,7 +323,7 @@
 std::unique_ptr Output = std::move(*OutputOrErr);
 std::copy(Contents.begin(), Contents.end(), Output->getBufferStart());
 if (Error E = Output->commit())
-  return E;
+  return std::move(E);
 
 DeviceFiles.emplace_back(DeviceTriple, Arch, TempFile);
 ToBeDeleted.push_back();
@@ -329,7 +334,7 @@
 
   // We need to materialize the lazy module before we make any changes.
   if (Error Err = M->materializeAll())
-return Err;
+return std::move(Err);
 
   // Remove the global from the module and write it to a new file.
   for (GlobalVariable *GV : ToBeDeleted) {
@@ -403,7 +408,7 @@
   }
 
   if (Err)
-return Err;
+return std::move(Err);
 
   if (!NewMembers)
 return None;
@@ -417,9 +422,9 @@
 
   std::unique_ptr Buffer =
   MemoryBuffer::getMemBuffer(Library.getMemoryBufferRef(), false);
-  if (Error WriteErr = writeArchive(TempFile, Members, true, Library.kind(),
+  if (Error Err = writeArchive(TempFile, Members, true, Library.kind(),
 true, Library.isThin(), std::move(Buffer)))
-return WriteErr;
+return std::move(Err);
 
   return static_cast(TempFile);
 }
@@ -740,7 +745,7 @@
 
 // Add the bitcode file with its resolved symbols to the LTO job.
 if (Error Err = LTOBackend->add(std::move(BitcodeFile), Resolutions))
-  return Err;
+  return std::move(Err);
   }
 
   // Run the LTO job to compile the bitcode.
@@ -758,7 +763,7 @@
 std::make_unique(FD, true));
   };
   if (Error Err = LTOBackend->run(AddStream))
-return Err;
+return std::move(Err);
 
   for (auto  : Files) {
 if (!TheTriple.isNVPTX())
@@ -976,6 +981,17 @@
 }
   }
 
+  // Add the device bitcode library to the device files if it was passed in.
+  if (!BitcodeLibrary.empty()) {
+// FIXME: Hacky workaround to avoid a backend crash at O0.
+if (OptLevel[1] - '0' == 0)
+  OptLevel[1] = '1';
+auto DeviceAndPath = StringRef(BitcodeLibrary).split('=');
+auto TripleAndArch = DeviceAndPath.first.rsplit('-');
+DeviceFiles.emplace_back(TripleAndArch.first, TripleAndArch.second,
+ DeviceAndPath.second);
+  }
+
   // Link the device images extracted from the linker input.
   SmallVector LinkedImages;
   if (Error Err = linkDeviceFiles(DeviceFiles, LinkerArgs, LinkedImages))
Index: clang/lib/Driver/ToolChains/Cuda.cpp
===
--- clang/lib/Driver/ToolChains/Cuda.cpp
+++ clang/lib/Driver/ToolChains/Cuda.cpp
@@ -744,6 +744,10 @@
   return;
 }
 
+// Link the bitcode library late if we're using device LTO.
+if (getDriver().isUsingLTO(/* IsOffload */ true))
+  return;
+
 std::string BitcodeSuffix;
 if (DriverArgs.hasFlag(options::OPT_fopenmp_target_new_runtime,
options::OPT_fno_openmp_target_new_runtime, true))
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ 

[PATCH] D117156: [OpenMP] Add extra flag handling to linker wrapper

2022-01-25 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 402928.
jhuber6 added a comment.

Rework commits


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117156/new/

https://reviews.llvm.org/D117156

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -48,6 +48,12 @@
 
 static cl::opt Help("h", cl::desc("Alias for -help"), cl::Hidden);
 
+enum DebugKind {
+  NoDebugInfo,
+  DirectivesOnly,
+  FullDebugInfo,
+};
+
 // Mark all our options with this category, everything else (except for -help)
 // will be hidden.
 static cl::OptionCategory
@@ -58,29 +64,53 @@
 cl::desc("Strip offloading sections from the host object file."),
 cl::init(true), cl::cat(ClangLinkerWrapperCategory));
 
-static cl::opt LinkerUserPath("linker-path",
+static cl::opt LinkerUserPath("linker-path", cl::Required,
cl::desc("Path of linker binary"),
cl::cat(ClangLinkerWrapperCategory));
 
 static cl::opt
-TargetFeatures("target-feature", cl::desc("Target features for triple"),
+TargetFeatures("target-feature", cl::ZeroOrMore,
+   cl::desc("Target features for triple"),
cl::cat(ClangLinkerWrapperCategory));
 
-static cl::opt OptLevel("opt-level",
+static cl::opt OptLevel("opt-level", cl::ZeroOrMore,
  cl::desc("Optimization level for LTO"),
  cl::init("O2"),
  cl::cat(ClangLinkerWrapperCategory));
 
 static cl::opt
-BitcodeLibrary("target-library",
+BitcodeLibrary("target-library", cl::ZeroOrMore,
cl::desc("Path for the target bitcode library"),
cl::cat(ClangLinkerWrapperCategory));
 
 static cl::opt EmbedBC(
 "target-embed-bc", cl::ZeroOrMore,
-cl::desc("Embed linked bitcode instead of an executable device image."),
+cl::desc("Embed linked bitcode instead of an executable device image"),
 cl::init(false), cl::cat(ClangLinkerWrapperCategory));
 
+static cl::opt
+HostTriple("host-triple", cl::ZeroOrMore,
+   cl::desc("Triple to use for the host compilation"),
+   cl::init(sys::getDefaultTargetTriple()),
+   cl::cat(ClangLinkerWrapperCategory));
+
+static cl::opt
+PtxasOption("ptxas-option", cl::ZeroOrMore,
+cl::desc("Argument to pass to the ptxas invocation"),
+cl::cat(ClangLinkerWrapperCategory));
+
+static cl::opt Verbose("v", cl::ZeroOrMore,
+ cl::desc("Verbose output from tools"),
+ cl::init(false),
+ cl::cat(ClangLinkerWrapperCategory));
+
+static cl::opt DebugInfo(
+cl::desc("Choose debugging level:"), cl::init(NoDebugInfo),
+cl::values(clEnumValN(NoDebugInfo, "g0", "No debug information"),
+   clEnumValN(DirectivesOnly, "gline-directives-only",
+  "Direction information"),
+   clEnumValN(FullDebugInfo, "g", "Full debugging support")));
+
 // Do not parse linker options.
 static cl::list
 HostLinkerArgs(cl::Positional,
@@ -491,6 +521,14 @@
   std::string Opt = "-" + OptLevel;
   CmdArgs.push_back(*PtxasPath);
   CmdArgs.push_back(TheTriple.isArch64Bit() ? "-m64" : "-m32");
+  if (Verbose)
+CmdArgs.push_back("-v");
+  if (DebugInfo == DirectivesOnly && OptLevel[1] == '0')
+CmdArgs.push_back("-lineinfo");
+  else if (DebugInfo == FullDebugInfo && OptLevel[1] == '0')
+CmdArgs.push_back("-g");
+  if (!PtxasOption.empty())
+CmdArgs.push_back(PtxasOption);
   CmdArgs.push_back("-o");
   CmdArgs.push_back(TempFile);
   CmdArgs.push_back(Opt);
@@ -525,10 +563,13 @@
 return createFileError(TempFile, EC);
   TempFiles.push_back(static_cast(TempFile));
 
-  // TODO: Pass in arguments like `-g` and `-v` from the driver.
   SmallVector CmdArgs;
   CmdArgs.push_back(*NvlinkPath);
   CmdArgs.push_back(TheTriple.isArch64Bit() ? "-m64" : "-m32");
+  if (Verbose)
+CmdArgs.push_back("-v");
+  if (DebugInfo != NoDebugInfo)
+CmdArgs.push_back("-g");
   CmdArgs.push_back("-o");
   CmdArgs.push_back(TempFile);
   CmdArgs.push_back("-arch");
@@ -577,16 +618,16 @@
 
   switch (DI.getSeverity()) {
   case DS_Error:
-WithColor::error(errs(), LinkerExecutable) << ErrStorage;
+WithColor::error(errs(), LinkerExecutable) << ErrStorage << "\n";
 break;
   case DS_Warning:
-WithColor::warning(errs(), LinkerExecutable) << ErrStorage;
+WithColor::warning(errs(), LinkerExecutable) << ErrStorage << "\n";
 break;
   case DS_Note:
-WithColor::note(errs(), 

[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-26 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 403404.
jhuber6 added a comment.
Herald added a subscriber: mgorny.

Updated the approach to use a vector of string pairs. Multiple files are simply
passed by repeating the flag. I will add the filename to the offloading section
name later, as similar sections could be merged when performing a relocatable link.
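
As a rough sketch of what "a vector of string pairs" could look like when each flag value has the form `file,section`: the helper names below are hypothetical and not taken from the patch; only the `.llvm.offloading.` prefix is grounded in the CHECK line of the test in this revision.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: split one "file,section" value at the first comma.
// A value without a comma yields an empty section name.
std::pair<std::string, std::string>
splitFileAndSection(const std::string &Arg) {
  std::string::size_type Comma = Arg.find(',');
  if (Comma == std::string::npos)
    return std::make_pair(Arg, std::string());
  return std::make_pair(Arg.substr(0, Comma), Arg.substr(Comma + 1));
}

// Hypothetical helper: one pair per occurrence of the flag, in order.
std::vector<std::pair<std::string, std::string>>
collectOffloadObjects(const std::vector<std::string> &Values) {
  std::vector<std::pair<std::string, std::string>> Out;
  for (const std::string &V : Values)
    Out.push_back(splitFileAndSection(V));
  return Out;
}

// Section name as it appears in the test: ".llvm.offloading." + section.
std::string sectionName(const std::string &Section) {
  return ".llvm.offloading." + Section;
}
```

Parsing once when the flag is handled means later stages never have to re-split the string or re-validate that both halves are present.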


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116542/new/

https://reviews.llvm.org/D116542

Files:
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/CodeGen/BackendUtil.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/CodeGen/CodeGenAction.cpp
  clang/test/Frontend/embed-object.ll
  llvm/include/llvm/Bitcode/BitcodeWriter.h
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/Bitcode/Writer/CMakeLists.txt

Index: llvm/lib/Bitcode/Writer/CMakeLists.txt
===
--- llvm/lib/Bitcode/Writer/CMakeLists.txt
+++ llvm/lib/Bitcode/Writer/CMakeLists.txt
@@ -11,6 +11,7 @@
   Analysis
   Core
   MC
+  TransformUtils
   Object
   Support
   )
Index: llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
===
--- llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -69,6 +69,7 @@
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/SHA1.h"
 #include "llvm/Support/raw_ostream.h"
+#include "llvm/Transforms/Utils/ModuleUtils.h"
 #include 
 #include 
 #include 
@@ -4972,3 +4973,18 @@
   llvm::ConstantArray::get(ATy, UsedArray), "llvm.compiler.used");
   NewUsed->setSection("llvm.metadata");
 }
+
+void llvm::EmbedObjectInModule(llvm::Module , llvm::MemoryBufferRef Buf,
+   StringRef SectionName) {
+  ArrayRef ModuleData = ArrayRef(Buf.getBufferStart(), Buf.getBufferSize());
+
+  // Embed the data in the
+  llvm::Constant *ModuleConstant =
+  llvm::ConstantDataArray::get(M.getContext(), ModuleData);
+  llvm::GlobalVariable *GV = new llvm::GlobalVariable(
+  M, ModuleConstant->getType(), true, llvm::GlobalValue::PrivateLinkage,
+  ModuleConstant, "llvm.embedded.object");
+  GV->setSection(SectionName);
+
+  appendToCompilerUsed(M, GV);
+}
Index: llvm/include/llvm/Bitcode/BitcodeWriter.h
===
--- llvm/include/llvm/Bitcode/BitcodeWriter.h
+++ llvm/include/llvm/Bitcode/BitcodeWriter.h
@@ -165,6 +165,11 @@
 bool EmbedCmdline,
 const std::vector );
 
+  /// Embeds the memory buffer \p Buf into the module \p M as a global using the
+  /// section name \p SectionName.
+  void EmbedObjectInModule(Module , MemoryBufferRef Buf,
+   StringRef SectionName);
+
 } // end namespace llvm
 
 #endif // LLVM_BITCODE_BITCODEWRITER_H
Index: clang/test/Frontend/embed-object.ll
===
--- /dev/null
+++ clang/test/Frontend/embed-object.ll
@@ -0,0 +1,13 @@
+; RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \
+; RUN:-fembed-offload-object=%S/Inputs/empty.h,section -x ir %s -o - \
+; RUN:| FileCheck %s -check-prefix=CHECK
+
+; CHECK: @llvm.embedded.object = private constant [0 x i8] zeroinitializer, section ".llvm.offloading.section"
+; CHECK: @llvm.compiler.used = appending global [2 x i8*] [i8* @x, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @llvm.embedded.object, i32 0, i32 0)], section "llvm.metadata"
+
+@x = private constant i8 1
+@llvm.compiler.used = appending global [1 x i8*] [i8* @x], section "llvm.metadata"
+
+define i32 @foo() {
+  ret i32 0
+}
Index: clang/lib/CodeGen/CodeGenAction.cpp
===
--- clang/lib/CodeGen/CodeGenAction.cpp
+++ clang/lib/CodeGen/CodeGenAction.cpp
@@ -1134,6 +1134,7 @@
 TheModule->setTargetTriple(TargetOpts.Triple);
   }
 
+  EmbedBinary(TheModule.get(), CodeGenOpts, Diagnostics);
   EmbedBitcode(TheModule.get(), CodeGenOpts, *MainFile);
 
   LLVMContext  = TheModule->getContext();
Index: clang/lib/CodeGen/BackendUtil.cpp
===
--- clang/lib/CodeGen/BackendUtil.cpp
+++ clang/lib/CodeGen/BackendUtil.cpp
@@ -1745,8 +1745,31 @@
  llvm::MemoryBufferRef Buf) {
   if (CGOpts.getEmbedBitcode() == CodeGenOptions::Embed_Off)
 return;
+
   llvm::EmbedBitcodeInModule(
   *M, Buf, CGOpts.getEmbedBitcode() != CodeGenOptions::Embed_Marker,
   CGOpts.getEmbedBitcode() != CodeGenOptions::Embed_Bitcode,
   CGOpts.CmdArgs);
 }
+
+void clang::EmbedBinary(llvm::Module *M, const CodeGenOptions ,
+DiagnosticsEngine ) {
+  if (CGOpts.OffloadObjects.empty())
+return;
+
+  for (StringRef OffloadObject : CGOpts.OffloadObjects) {
+auto FilenameAndSection = OffloadObject.split(',');
+

[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-26 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: llvm/lib/Bitcode/Writer/CMakeLists.txt:14
   MC
+  TransformUtils
   Object

I'm not sure if it's worth linking TransformUtils just for 
`appendToCompilerUsed`; I can copy the implementation locally if not.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116542/new/

https://reviews.llvm.org/D116542



[PATCH] D116543: [OpenMP] Embed device files into the host IR

2022-01-26 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 403413.
jhuber6 added a comment.

Updating after changing flag in D116542 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116543/new/

https://reviews.llvm.org/D116543

Files:
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/Driver/openmp-offload-gpu.c


Index: clang/test/Driver/openmp-offload-gpu.c
===
--- clang/test/Driver/openmp-offload-gpu.c
+++ clang/test/Driver/openmp-offload-gpu.c
@@ -353,3 +353,10 @@
 // NEW_DRIVER: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_ASM]]"], output: "[[DEVICE_OBJ:.+]]" 
 // NEW_DRIVER: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[DEVICE_OBJ]]"], output: "[[HOST_OBJ:.+]]" 
 // NEW_DRIVER: "x86_64-unknown-linux-gnu" - "[[LINKER:.+]]", inputs: ["[[HOST_OBJ]]"], output: "openmp-offload-gpu"
+
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvida-cuda -march=sm_70 \
+// RUN:  --libomptarget-nvptx-bc-path=%S/Inputs/libomptarget/libomptarget-new-nvptx-test.bc \
+// RUN:  -fopenmp-new-driver -no-canonical-prefixes %s -o openmp-offload-gpu 2>&1 \
+// RUN:   | FileCheck -check-prefix=NEW_DRIVER_EMBEDDING %s
+
+// NEW_DRIVER_EMBEDDING: -fembed-offload-object=[[CUBIN:.*\.cubin]],nvptx64-nvidia-cuda.sm_70
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -4365,9 +4365,9 @@
   IsHeaderModulePrecompile ? HeaderModuleInput : Inputs[0];
 
   InputInfoList ModuleHeaderInputs;
+  InputInfoList OpenMPHostInputs;
   const InputInfo *CudaDeviceInput = nullptr;
   const InputInfo *OpenMPDeviceInput = nullptr;
-  const InputInfo *OpenMPHostInput = nullptr;
   for (const InputInfo  : Inputs) {
 if ( == ) {
   // This is the primary input.
@@ -4384,8 +4384,8 @@
   CudaDeviceInput = 
 } else if (IsOpenMPDevice && !OpenMPDeviceInput) {
   OpenMPDeviceInput = 
-} else if (IsOpenMPHost && !OpenMPHostInput) {
-  OpenMPHostInput = 
+} else if (IsOpenMPHost) {
+  OpenMPHostInputs.push_back(I);
 } else {
   llvm_unreachable("unexpectedly given multiple inputs");
 }
@@ -6891,6 +6891,24 @@
 }
   }
 
+  // Host-side OpenMP offloading receives the device object files and embeds it
+  // in a named section including the associated target triple and architecture.
+  if (IsOpenMPHost && !OpenMPHostInputs.empty()) {
+auto InputFile = OpenMPHostInputs.begin();
+auto OpenMPTCs = C.getOffloadToolChains();
+for (auto TI = OpenMPTCs.first, TE = OpenMPTCs.second; TI != TE;
+ ++TI, ++InputFile) {
+  const ToolChain *TC = TI->second;
+  const ArgList &TCArgs = C.getArgsForToolChain(TC, "", Action::OFK_OpenMP);
+  StringRef File =
+  C.getArgs().MakeArgString(TC->getInputFilename(*InputFile));
+
+  CmdArgs.push_back(Args.MakeArgString(
+  "-fembed-offload-object=" + File + "," + TC->getTripleString() + "." +
+  TCArgs.getLastArgValue(options::OPT_march_EQ)));
+}
+  }
+
   if (Triple.isAMDGPU()) {
 handleAMDGPUCodeObjectVersionOptions(D, Args, CmdArgs);
 



[PATCH] D116541: [OpenMP] Introduce new flag to change offloading driver pipeline

2022-01-26 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/include/clang/Driver/Driver.h:45
+class Command;
+class Compilation;
+class JobList;

JonChesterfield wrote:
> This looks like it should be a breaking change - InputInfo is no longer 
> forward declared. Would it be reasonable to keep the forward declaration and 
> put the typedef between class statements and the LTOKind enum?
We can't forward declare a struct and use it by value in a container; I would 
need to change it to a pointer. It's doable, but I don't think it's ideal. I'm 
not sure why this would break anything: the forward declarations were there 
only to avoid including more files here. I could be wrong, however.
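
To illustrate the point about forward declarations, here is a minimal sketch in
plain C++. The `InputInfo` body and the `Job` and `firstInput` names are
hypothetical stand-ins, not the real clang types; `std::array` is used only
because, like a small-size-optimized vector, it stores its elements inline.

```cpp
#include <array>
#include <vector>

struct InputInfo; // forward declaration: the type is incomplete from here on

// A container of pointers only needs the declaration above.
using InputInfoPtrList = std::vector<const InputInfo *>;

// Hypothetical stand-in for the real definition, which lives in its own header.
struct InputInfo {
  int Id;
};

// std::array stores its elements inline, so InputInfo must be a complete type
// by the time this struct is defined. Moving the definition of InputInfo below
// this point would make the member declaration ill-formed.
struct Job {
  std::array<InputInfo, 4> Inputs;
};

// Return a pointer to the first by-value element of a job.
const InputInfo *firstInput(const Job &J) { return &J.Inputs[0]; }
```

This is why holding the inputs by value means including the defining header,
while holding pointers would allow the forward declaration to stay.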


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116541/new/

https://reviews.llvm.org/D116541



[PATCH] D116542: [OpenMP] Add a flag for embedding a file into the module

2022-01-26 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.



Comment at: clang/include/clang/Basic/CodeGenOptions.h:279
 
+  /// List of file passed with -fembed-offload-binary option to embed
+  /// device-side offloading binaries in the host object file.

JonChesterfield wrote:
> This is unclear. List in what sense? It's a std::string, which suggests it's 
> a freeform argument that gets parsed somewhere else.
> 
> The relation between the two strings is also unclear. Are the lists meant to 
> be the same length, implying a one to one mapping? If there is a strong 
> relation between the two we can probably remove a bunch of error handling by 
> taking one argument instead of two.
> 
> Perhaps the variable should be something like a vector section>>?
It's a list of comma separated values, but you're right. This should be parsed 
out when we handle the flags.
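
As a rough sketch of what parsing the value up front could look like, the
snippet below splits a string of the form `<file>,<triple>.<arch>`, which is
the shape checked by the driver test (`...cubin,nvptx64-nvidia-cuda.sm_70`).
The `EmbedSpec` struct and `parseEmbedSpec` function are hypothetical names,
and the choice to split at the last `,` and the last `.` is an assumption, not
something the patch states.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical holder for one parsed value; not a type from the patch.
struct EmbedSpec {
  std::string File;   // path of the device binary to embed
  std::string Triple; // e.g. "nvptx64-nvidia-cuda"
  std::string Arch;   // e.g. "sm_70"
};

// Split "<file>,<triple>.<arch>" into its parts. Assumption: the last ','
// ends the file name and the last '.' after it separates triple from arch.
// Returns std::nullopt when the value does not have that shape.
std::optional<EmbedSpec> parseEmbedSpec(const std::string &Value) {
  const std::size_t Comma = Value.rfind(',');
  if (Comma == std::string::npos || Comma == 0)
    return std::nullopt;
  const std::size_t Dot = Value.rfind('.');
  if (Dot == std::string::npos || Dot < Comma + 2 || Dot + 1 >= Value.size())
    return std::nullopt;
  return EmbedSpec{Value.substr(0, Comma),
                   Value.substr(Comma + 1, Dot - Comma - 1),
                   Value.substr(Dot + 1)};
}
```

Returning an empty optional instead of asserting keeps a malformed command
line a reportable error rather than a crash.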



Comment at: clang/lib/CodeGen/BackendUtil.cpp:1757
+
+  assert(BinaryFilenames.size() == BinarySections.size() &&
+ "Different number of filenames and section names in embedding");

JonChesterfield wrote:
> Definitely don't want to assert on invalid commandline argument
Will remove.



Comment at: clang/lib/CodeGen/BackendUtil.cpp:1774
+  SectionName += ".";
+  SectionName += *BinarySection;
+}

JonChesterfield wrote:
> This looks lossy - if two files use the same section name, they'll end up 
> appended in an order that is probably an implementation quirk of llvm-link, 
> and I think we've thrown away the filename info so can't get back to where we 
> were.
> 
> Would .llvm.offloading.filename be a reasonable name for each section, with 
> either error on duplicates or warning + discard?
We only care about the sections per file, right? When I extract these in the 
`linker-wrapper`, I simply look at each file's sections and put them into a 
list of device inputs. We don't need them to be unique as long as there aren't 
multiple in the same file.
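
A minimal sketch of that per-file extraction, in plain C++ with no LLVM
dependencies. The in-memory `ObjectFile` model, the `DeviceInput` struct, and
the `.llvm.offloading.` prefix (taken from the reviewer's suggested naming) are
assumptions for illustration only.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory view of one object file: (section name, contents).
using Section = std::pair<std::string, std::string>;
using ObjectFile = std::vector<Section>;

// One extracted device image together with the suffix of its section name.
struct DeviceInput {
  std::string Target;
  std::string Image;
};

// Walk every file's sections and collect those carrying the offloading prefix.
// Two files may use the same section name; each still yields its own entry.
std::vector<DeviceInput>
collectDeviceInputs(const std::vector<ObjectFile> &Files) {
  static const std::string Prefix = ".llvm.offloading.";
  std::vector<DeviceInput> Inputs;
  for (const ObjectFile &File : Files)
    for (const Section &S : File)
      if (S.first.compare(0, Prefix.size(), Prefix) == 0)
        Inputs.push_back({S.first.substr(Prefix.size()), S.second});
  return Inputs;
}
```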



Comment at: llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:4982
+  Type *UsedElementType = Type::getInt8Ty(M.getContext())->getPointerTo(0);
+  GlobalVariable *Used = collectUsedGlobalVariables(M, UsedGlobals, true);
+  for (auto *GV : UsedGlobals) {

JonChesterfield wrote:
> I think I've written some handling very like this in an LDS pass that I meant 
> to factor out for reuse but haven't got around to - we're removing a value 
> from compiler.used?
This handling of the compiler.used variable was copied from the implementation 
of `-fembed-bitcode` above. I wasn't sure about their methodology, but I 
figured it worked for them. I can tidy it up to be more straightforward.



Comment at: llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:4994
+
+  // Embed the data in the
+  llvm::Constant *ModuleConstant =

JonChesterfield wrote:
> missing end of comment
Will fix.



Comment at: llvm/lib/Bitcode/Writer/BitcodeWriter.cpp:5003
+  // sections after linking.
+  GV->setAlignment(Align(1));
+  UsedArray.push_back(

JonChesterfield wrote:
> Is this necessary? 1 seems a likely default for a uint8_t array
Will remove.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116542/new/

https://reviews.llvm.org/D116542



[PATCH] D118198: [OpenMP] Remove call to 'clang-offload-wrapper' binary

2022-01-25 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision.
jhuber6 added reviewers: jdoerfert, JonChesterfield, ronlieb, saiislam.
Herald added subscribers: guansong, yaxunl, mgorny.
jhuber6 requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

This patch removes the system call to the `clang-offload-wrapper` tool
by replicating its functionality in a new file. This improves
performance and makes the future wrapping functionality easier to
change.

Depends on D118197 


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D118198

Files:
  clang/tools/clang-linker-wrapper/CMakeLists.txt
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
  clang/tools/clang-linker-wrapper/OffloadWrapper.cpp
  clang/tools/clang-linker-wrapper/OffloadWrapper.h

Index: clang/tools/clang-linker-wrapper/OffloadWrapper.h
===
--- /dev/null
+++ clang/tools/clang-linker-wrapper/OffloadWrapper.h
@@ -0,0 +1,20 @@
+//===- OffloadWrapper.h -*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CLANG_TOOLS_CLANG_LINKER_WRAPPER_OFFLOAD_WRAPPER_H
+#define LLVM_CLANG_TOOLS_CLANG_LINKER_WRAPPER_OFFLOAD_WRAPPER_H
+
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/IR/Module.h"
+
+/// Wrap the input device images into the module \p M as global symbols and
+/// registers the images with the OpenMP Offloading runtime libomptarget.
+llvm::Error wrapBinaries(llvm::Module &M,
+ llvm::ArrayRef> Images);
+
+#endif
Index: clang/tools/clang-linker-wrapper/OffloadWrapper.cpp
===
--- /dev/null
+++ clang/tools/clang-linker-wrapper/OffloadWrapper.cpp
@@ -0,0 +1,267 @@
+//===- OffloadWrapper.cpp ---*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#include "OffloadWrapper.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/Triple.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/GlobalVariable.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/LLVMContext.h"
+#include "llvm/IR/Module.h"
+#include "llvm/Transforms/Utils/ModuleUtils.h"
+
+using namespace llvm;
+
+namespace {
+
+IntegerType *getSizeTTy(Module &M) {
+  LLVMContext &C = M.getContext();
+  switch (M.getDataLayout().getPointerTypeSize(Type::getInt8PtrTy(C))) {
+  case 4u:
+return Type::getInt32Ty(C);
+  case 8u:
+return Type::getInt64Ty(C);
+  }
+  llvm_unreachable("unsupported pointer type size");
+}
+
+// struct __tgt_offload_entry {
+//   void *addr;
+//   char *name;
+//   size_t size;
+//   int32_t flags;
+//   int32_t reserved;
+// };
+StructType *getEntryTy(Module &M) {
+  LLVMContext &C = M.getContext();
+  StructType *EntryTy = StructType::getTypeByName(C, "__tgt_offload_entry");
+  if (!EntryTy)
+EntryTy = StructType::create("__tgt_offload_entry", Type::getInt8PtrTy(C),
+ Type::getInt8PtrTy(C), getSizeTTy(M),
+ Type::getInt32Ty(C), Type::getInt32Ty(C));
+  return EntryTy;
+}
+
+PointerType *getEntryPtrTy(Module &M) {
+  return PointerType::getUnqual(getEntryTy(M));
+}
+
+// struct __tgt_device_image {
+//   void *ImageStart;
+//   void *ImageEnd;
+//   __tgt_offload_entry *EntriesBegin;
+//   __tgt_offload_entry *EntriesEnd;
+// };
+StructType *getDeviceImageTy(Module &M) {
+  LLVMContext &C = M.getContext();
+  StructType *ImageTy = StructType::getTypeByName(C, "__tgt_device_image");
+  if (!ImageTy)
+ImageTy = StructType::create("__tgt_device_image", Type::getInt8PtrTy(C),
+ Type::getInt8PtrTy(C), getEntryPtrTy(M),
+ getEntryPtrTy(M));
+  return ImageTy;
+}
+
+PointerType *getDeviceImagePtrTy(Module &M) {
+  return PointerType::getUnqual(getDeviceImageTy(M));
+}
+
+// struct __tgt_bin_desc {
+//   int32_t NumDeviceImages;
+//   __tgt_device_image *DeviceImages;
+//   __tgt_offload_entry *HostEntriesBegin;
+//   __tgt_offload_entry *HostEntriesEnd;
+// };
+StructType *getBinDescTy(Module &M) {
+  LLVMContext &C = M.getContext();
+  StructType *DescTy = StructType::getTypeByName(C, "__tgt_bin_desc");
+  if (!DescTy)
+DescTy = StructType::create("__tgt_bin_desc", Type::getInt32Ty(C),
+getDeviceImagePtrTy(M), getEntryPtrTy(M),
+getEntryPtrTy(M));
+  return 

[PATCH] D118197: [OpenMP] Replace system call to `llc` with target machine

2022-01-25 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 created this revision.
jhuber6 added reviewers: jdoerfert, JonChesterfield, ronlieb, saiislam.
Herald added subscribers: mikhail.ramalho, guansong, yaxunl, mgorny.
jhuber6 requested review of this revision.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

This patch replaces the system call to the `llc` binary with a library
call to the target machine interface. This should be faster than
relying on an external system call to compile the final wrapper binary.

Depends on D118155 


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D118197

Files:
  clang/tools/clang-linker-wrapper/CMakeLists.txt
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -23,6 +23,7 @@
 #include "llvm/IR/Module.h"
 #include "llvm/IRReader/IRReader.h"
 #include "llvm/LTO/LTO.h"
+#include "llvm/MC/TargetRegistry.h"
 #include "llvm/Object/Archive.h"
 #include "llvm/Object/ArchiveWriter.h"
 #include "llvm/Object/Binary.h"
@@ -42,6 +43,7 @@
 #include "llvm/Support/TargetSelect.h"
 #include "llvm/Support/WithColor.h"
 #include "llvm/Support/raw_ostream.h"
+#include "llvm/Target/TargetMachine.h"
 
 using namespace llvm;
 using namespace llvm::object;
@@ -969,6 +971,49 @@
   return Error::success();
 }
 
+// Compile the module to an object file using the appropriate target machine for
+// the host triple.
+Expected compileModule(Module &M) {
+  if (M.getTargetTriple().empty())
+M.setTargetTriple(HostTriple);
+
+  std::string Msg;
+  const Target *T = TargetRegistry::lookupTarget(M.getTargetTriple(), Msg);
+  if (!T)
+return createStringError(inconvertibleErrorCode(), Msg);
+
+  auto Options =
+  codegen::InitTargetOptionsFromCodeGenFlags(Triple(M.getTargetTriple()));
+  StringRef CPU = "";
+  StringRef Features = "";
+  std::unique_ptr TM(T->createTargetMachine(
+  HostTriple, CPU, Features, Options, Reloc::PIC_, M.getCodeModel()));
+
+  if (M.getDataLayout().isDefault())
+M.setDataLayout(TM->createDataLayout());
+
+  SmallString<128> ObjectFile;
+  int FD = -1;
+  if (Error Err = createOutputFile(sys::path::filename(ExecutableName) +
+   "offload-wrapper",
+   "o", ObjectFile))
+return std::move(Err);
+  if (std::error_code EC = sys::fs::openFileForWrite(ObjectFile, FD))
+return errorCodeToError(EC);
+
+  auto OS = std::make_unique(FD, true);
+
+  legacy::PassManager CodeGenPasses;
+  TargetLibraryInfoImpl TLII(Triple(M.getTargetTriple()));
+  CodeGenPasses.add(new TargetLibraryInfoWrapperPass(TLII));
+  if (TM->addPassesToEmitFile(CodeGenPasses, *OS, nullptr, CGFT_ObjectFile))
+return createStringError(inconvertibleErrorCode(),
+ "Failed to execute host backend");
+  CodeGenPasses.run(M);
+
+  return static_cast(ObjectFile);
+}
+
 /// Creates an object file containing the device image stored in the filename \p
 /// ImageFile that can be linked with the host.
 Expected wrapDeviceImage(StringRef ImageFile) {
@@ -1000,33 +1045,11 @@
 return createStringError(inconvertibleErrorCode(),
  "'clang-offload-wrapper' failed");
 
-  ErrorOr CompilerPath =
-  sys::findProgramByName("llc", sys::path::parent_path(LinkerExecutable));
-  if (!WrapperPath)
-WrapperPath = sys::findProgramByName("llc");
-  if (!WrapperPath)
-return createStringError(WrapperPath.getError(),
- "Unable to find 'llc' in path");
-
-  // Create a new file to write the wrapped bitcode file to.
-  SmallString<128> ObjectFile;
-  if (Error Err = createOutputFile(sys::path::filename(ExecutableName) +
-   "-offload-wrapper",
-   "o", ObjectFile))
-return std::move(Err);
-
-  SmallVector CompilerArgs;
-  CompilerArgs.push_back(*CompilerPath);
-  CompilerArgs.push_back("--filetype=obj");
-  CompilerArgs.push_back("--relocation-model=pic");
-  CompilerArgs.push_back("-o");
-  CompilerArgs.push_back(ObjectFile);
-  CompilerArgs.push_back(BitcodeFile);
-
-  if (sys::ExecuteAndWait(*CompilerPath, CompilerArgs))
-return createStringError(inconvertibleErrorCode(), "'llc' failed");
+  LLVMContext Context;
+  SMDiagnostic Err;
+  std::unique_ptr M = parseIRFile(BitcodeFile, Err, Context);
 
-  return static_cast(ObjectFile);
+  return compileModule(*M);
 }
 
 Optional findFile(StringRef Dir, const Twine ) {
Index: clang/tools/clang-linker-wrapper/CMakeLists.txt
===
--- clang/tools/clang-linker-wrapper/CMakeLists.txt
+++ clang/tools/clang-linker-wrapper/CMakeLists.txt
@@ -4,6 +4,8 @@
   Core
   BinaryFormat
   MC
+  

[PATCH] D122760: [OpenMP] Add OpenMP variant extension to keep the unmangled name

2022-04-05 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 420604.
jhuber6 added a comment.

Updating to stop erroring if a function with the same mangled name exists but 
is never emitted.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122760/new/

https://reviews.llvm.org/D122760

Files:
  clang/include/clang/Basic/Attr.td
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/Parse/ParseOpenMP.cpp
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/declare_variant_no_mangling.cpp
  clang/test/OpenMP/nvptx_declare_variant_name_mangling.cpp
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def

Index: llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
===
--- llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
+++ llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
@@ -1157,6 +1157,7 @@
 __OMP_TRAIT_PROPERTY(implementation, extension, match_none)
 __OMP_TRAIT_PROPERTY(implementation, extension, disable_implicit_base)
 __OMP_TRAIT_PROPERTY(implementation, extension, allow_templates)
+__OMP_TRAIT_PROPERTY(implementation, extension, keep_original_name)
 
 __OMP_TRAIT_SET(user)
 
Index: clang/test/OpenMP/nvptx_declare_variant_name_mangling.cpp
===
--- clang/test/OpenMP/nvptx_declare_variant_name_mangling.cpp
+++ clang/test/OpenMP/nvptx_declare_variant_name_mangling.cpp
@@ -1,13 +1,15 @@
 // RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-ppc-host.bc
-// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-unknown -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s --implicit-check-not='call i32 {@_Z3bazv|@_Z3barv}'
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-unknown -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s --implicit-check-not='call i32 {@_Z3bazv|@_Z3barv|@_ZL3foov}'
 // RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-unknown -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -emit-pch -o %t
-// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-unknown -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -include-pch %t -o - | FileCheck %s --implicit-check-not='call i32 {@_Z3bazv|@_Z3barv}'
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-unknown -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -include-pch %t -o - | FileCheck %s --implicit-check-not='call i32 {@_Z3bazv|@_Z3barv|@_ZL3foov}'
 // expected-no-diagnostics
 
 // CHECK-DAG: @_Z3barv
 // CHECK-DAG: @_Z3bazv
+// CHECK-DAG: call noundef i32 @_ZL3foov()
 // CHECK-DAG: define{{.*}} @"_Z53bar$ompvariant$S2$s7$Pnvptx$Pnvptx64$S3$s9$Pmatch_anyv"
 // CHECK-DAG: define{{.*}} @"_Z53baz$ompvariant$S2$s7$Pnvptx$Pnvptx64$S3$s9$Pmatch_anyv"
+// CHECK-DAG: define{{.*}} @_ZL3foov
 // CHECK-DAG: call noundef i32 @"_Z53bar$ompvariant$S2$s7$Pnvptx$Pnvptx64$S3$s9$Pmatch_anyv"()
 // CHECK-DAG: call noundef i32 @"_Z53baz$ompvariant$S2$s7$Pnvptx$Pnvptx64$S3$s9$Pmatch_anyv"()
 
@@ -20,6 +22,8 @@
 
 int baz() { return 5; }
 
+static int foo() { return 3; }
+
 #pragma omp begin declare variant match(device = {arch(nvptx, nvptx64)}, implementation = {extension(match_any)})
 
 int bar() { return 2; }
@@ -28,13 +32,19 @@
 
 #pragma omp end declare variant
 
+#pragma omp begin declare variant match(device = {arch(nvptx, nvptx64)}, implementation = {extension(match_any, keep_original_name)})
+
+static int foo() { return 4; }
+
+#pragma omp end declare variant
+
 #pragma omp end declare target
 
 int main() {
   int res;
 #pragma omp target map(from \
: res)
-  res = bar() + baz();
+  res = bar() + baz() + foo();
   return res;
 }
 
Index: clang/test/OpenMP/declare_variant_no_mangling.cpp
===
--- /dev/null
+++ clang/test/OpenMP/declare_variant_no_mangling.cpp
@@ -0,0 +1,53 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --function-signature --include-generated-funcs
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -emit-llvm %s -o - | FileCheck %s
+// expected-no-diagnostics
+
+#ifndef HEADER
+#define HEADER
+
+static int foo() { return 0; }
+extern inline int bar() { return 0; }
+int baz() { return 0; }
+
+#pragma omp begin declare variant match( \
+device = {arch(ppc64le, ppc64)}, \
+implementation = {extension(match_any, keep_original_name)})
+
+static int foo() { return 1; }
+extern inline int bar() { return 

[PATCH] D122760: [OpenMP] Add OpenMP variant extension to keep the unmangled name

2022-04-05 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 420608.
jhuber6 added a comment.

Make suggested changes.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122760/new/

https://reviews.llvm.org/D122760

Files:
  clang/include/clang/Basic/Attr.td
  clang/include/clang/Basic/AttrDocs.td
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/Parse/ParseOpenMP.cpp
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/declare_variant_no_mangling.cpp
  clang/test/OpenMP/declare_variant_no_mangling_messages.cpp
  clang/test/OpenMP/nvptx_declare_variant_name_mangling.cpp
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def

Index: llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
===
--- llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
+++ llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
@@ -1157,6 +1157,7 @@
 __OMP_TRAIT_PROPERTY(implementation, extension, match_none)
 __OMP_TRAIT_PROPERTY(implementation, extension, disable_implicit_base)
 __OMP_TRAIT_PROPERTY(implementation, extension, allow_templates)
+__OMP_TRAIT_PROPERTY(implementation, extension, keep_original_name)
 
 __OMP_TRAIT_SET(user)
 
Index: clang/test/OpenMP/nvptx_declare_variant_name_mangling.cpp
===
--- clang/test/OpenMP/nvptx_declare_variant_name_mangling.cpp
+++ clang/test/OpenMP/nvptx_declare_variant_name_mangling.cpp
@@ -1,13 +1,15 @@
 // RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-ppc-host.bc
-// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-unknown -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s --implicit-check-not='call i32 {@_Z3bazv|@_Z3barv}'
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-unknown -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s --implicit-check-not='call i32 {@_Z3bazv|@_Z3barv|@_ZL3foov}'
 // RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-unknown -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -emit-pch -o %t
-// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-unknown -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -include-pch %t -o - | FileCheck %s --implicit-check-not='call i32 {@_Z3bazv|@_Z3barv}'
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -aux-triple powerpc64le-unknown-unknown -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -include-pch %t -o - | FileCheck %s --implicit-check-not='call i32 {@_Z3bazv|@_Z3barv|@_ZL3foov}'
 // expected-no-diagnostics
 
 // CHECK-DAG: @_Z3barv
 // CHECK-DAG: @_Z3bazv
+// CHECK-DAG: call noundef i32 @_ZL3foov()
 // CHECK-DAG: define{{.*}} @"_Z53bar$ompvariant$S2$s7$Pnvptx$Pnvptx64$S3$s9$Pmatch_anyv"
 // CHECK-DAG: define{{.*}} @"_Z53baz$ompvariant$S2$s7$Pnvptx$Pnvptx64$S3$s9$Pmatch_anyv"
+// CHECK-DAG: define{{.*}} @_ZL3foov
 // CHECK-DAG: call noundef i32 @"_Z53bar$ompvariant$S2$s7$Pnvptx$Pnvptx64$S3$s9$Pmatch_anyv"()
 // CHECK-DAG: call noundef i32 @"_Z53baz$ompvariant$S2$s7$Pnvptx$Pnvptx64$S3$s9$Pmatch_anyv"()
 
@@ -20,6 +22,8 @@
 
 int baz() { return 5; }
 
+static int foo() { return 3; }
+
 #pragma omp begin declare variant match(device = {arch(nvptx, nvptx64)}, implementation = {extension(match_any)})
 
 int bar() { return 2; }
@@ -28,13 +32,19 @@
 
 #pragma omp end declare variant
 
+#pragma omp begin declare variant match(device = {arch(nvptx, nvptx64)}, implementation = {extension(match_any, keep_original_name)})
+
+static int foo() { return 4; }
+
+#pragma omp end declare variant
+
 #pragma omp end declare target
 
 int main() {
   int res;
 #pragma omp target map(from \
: res)
-  res = bar() + baz();
+  res = bar() + baz() + foo();
   return res;
 }
 
Index: clang/test/OpenMP/declare_variant_no_mangling_messages.cpp
===
--- /dev/null
+++ clang/test/OpenMP/declare_variant_no_mangling_messages.cpp
@@ -0,0 +1,18 @@
+// RUN: not %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -emit-llvm %s -o - 2>&1 | FileCheck %s
+
+// CHECK: definition with same mangled name '_Z3foov' as another definition
+
+#ifndef HEADER
+#define HEADER
+
+int foo() { return 0; }
+
+#pragma omp begin declare variant match( \
+device = {arch(ppc64le, ppc64)}, \
+implementation = {extension(match_any, keep_original_name)})
+
+int foo() { return 1; }
+
+#pragma omp end declare variant
+
+#endif
Index: clang/test/OpenMP/declare_variant_no_mangling.cpp

[PATCH] D122760: [OpenMP] Add OpenMP variant extension to keep the unmangled name

2022-04-05 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 marked an inline comment as done.
jhuber6 added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:3062
+if (auto *A = Global->getAttr())
+  VariantGlobalsEmitted.insert(A->getFunction());
   }

jdoerfert wrote:
> This looks like you now disable the diagnostic for pretty much everything, no?
This should only get called if we plan to emit this global. If the global has 
the attribute stating that it should not be mangled, we're basically just 
asserting that its associated non-variant declaration should not be found. So 
even if we hit a name mangling conflict, as long as we haven't tried to emit 
that global we should be fine. I can add a test for this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122760/new/

https://reviews.llvm.org/D122760



[PATCH] D123325: [Clang] Make enabling the new driver more generic

2022-04-08 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 421595.
jhuber6 added a comment.

Fix misplaced logic symbol that broke tests.
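
The revision does not say which symbol was misplaced, so the following is only
a sketch of why the grouping of the two flags matters. The function names are
hypothetical; the first two mirror the conditions in the diff, and the third
shows one possible mistake in which the negation covers only the first flag.

```cpp
// Mirrors the hoisted condition: host offloading plus either spelling of the
// flag selects the new driver.
constexpr bool useNewOffloadingDriver(bool IsHostOffloading,
                                      bool OffloadNewDriver,
                                      bool OpenMPNewDriver) {
  return IsHostOffloading && (OffloadNewDriver || OpenMPNewDriver);
}

// Mirrors the LTO check: it applies only when neither flag is present.
constexpr bool needsLegacyLTOCheck(bool IsUsingLTO, bool OpenMPNewDriver,
                                   bool OffloadNewDriver) {
  return IsUsingLTO && !(OpenMPNewDriver || OffloadNewDriver);
}

// Hypothetical mistake: the negation applies to the first flag only.
constexpr bool negatesFirstFlagOnly(bool IsUsingLTO, bool OpenMPNewDriver,
                                    bool OffloadNewDriver) {
  return IsUsingLTO && (!OpenMPNewDriver || OffloadNewDriver);
}

// With only the generic flag set, the two LTO forms disagree.
static_assert(!needsLegacyLTOCheck(true, false, true),
              "new driver requested, so the legacy check is skipped");
static_assert(negatesFirstFlagOnly(true, false, true),
              "the mis-grouped form would still run the legacy check");
```

Computing the combined condition once, as the diff does with
`UseNewOffloadingDriver`, avoids repeating this grouping at every use.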


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123325/new/

https://reviews.llvm.org/D123325

Files:
  clang/include/clang/Driver/Compilation.h
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/Driver.cpp
  clang/lib/Driver/ToolChains/Clang.cpp

Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -4400,7 +4400,8 @@
  JA.isDeviceOffloading(Action::OFK_Host));
   bool IsHostOffloadingAction =
   JA.isHostOffloading(C.getActiveOffloadKinds()) &&
-  Args.hasArg(options::OPT_fopenmp_new_driver);
+  (Args.hasArg(options::OPT_fopenmp_new_driver) ||
+   Args.hasArg(options::OPT_foffload_new_driver));
   bool IsUsingLTO = D.isUsingLTO(IsDeviceOffloadAction);
   auto LTOMode = D.getLTOMode(IsDeviceOffloadAction);
 
@@ -4688,7 +4689,8 @@
 if (JA.getType() == types::TY_LLVM_BC)
   CmdArgs.push_back("-emit-llvm-uselists");
 
-if (IsUsingLTO && !Args.hasArg(options::OPT_fopenmp_new_driver)) {
+if (IsUsingLTO && !(Args.hasArg(options::OPT_fopenmp_new_driver) ||
+Args.hasArg(options::OPT_foffload_new_driver))) {
   // Only AMDGPU supports device-side LTO.
   if (IsDeviceOffloadAction && !Triple.isAMDGPU()) {
 D.Diag(diag::err_drv_unsupported_opt_for_target)
Index: clang/lib/Driver/Driver.cpp
===
--- clang/lib/Driver/Driver.cpp
+++ clang/lib/Driver/Driver.cpp
@@ -3882,6 +3882,11 @@
   // Builder to be used to build offloading actions.
   OffloadingActionBuilder OffloadBuilder(C, Args, Inputs);
 
+  bool UseNewOffloadingDriver =
+  C.isOffloadingHostKind(C.getActiveOffloadKinds()) &&
+  (Args.hasArg(options::OPT_foffload_new_driver) ||
+   Args.hasArg(options::OPT_fopenmp_new_driver));
+
   // Construct the actions to perform.
   HeaderModulePrecompileJobAction *HeaderModuleAction = nullptr;
   ExtractAPIJobAction *ExtractAPIAction = nullptr;
@@ -3903,14 +3908,14 @@
 
 // Use the current host action in any of the offloading actions, if
 // required.
-if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+if (!UseNewOffloadingDriver)
   if (OffloadBuilder.addHostDependenceToDeviceActions(Current, InputArg))
 break;
 
 for (phases::ID Phase : PL) {
 
   // Add any offload action the host action depends on.
-  if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+  if (!UseNewOffloadingDriver)
 Current = OffloadBuilder.addDeviceDependencesToHostAction(
 Current, InputArg, Phase, PL.back(), FullPL);
   if (!Current)
@@ -3953,7 +3958,7 @@
 
   // Try to build the offloading actions and add the result as a dependency
   // to the host.
-  if (Args.hasArg(options::OPT_fopenmp_new_driver))
+  if (UseNewOffloadingDriver)
 Current = BuildOffloadingActions(C, Args, I, Current);
 
   // FIXME: Should we include any prior module file outputs as inputs of
@@ -3975,7 +3980,7 @@
 
   // Use the current host action in any of the offloading actions, if
   // required.
-  if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+  if (!UseNewOffloadingDriver)
 if (OffloadBuilder.addHostDependenceToDeviceActions(Current, InputArg))
   break;
 
@@ -3988,7 +3993,7 @@
   Actions.push_back(Current);
 
 // Add any top level actions generated for offloading.
-if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+if (!UseNewOffloadingDriver)
   OffloadBuilder.appendTopLevelActions(Actions, Current, InputArg);
 else if (Current)
   Current->propagateHostOffloadInfo(C.getActiveOffloadKinds(),
@@ -4004,14 +4009,14 @@
   }
 
   if (!LinkerInputs.empty()) {
-if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+if (!UseNewOffloadingDriver)
   if (Action *Wrapper = OffloadBuilder.makeHostLinkAction())
 LinkerInputs.push_back(Wrapper);
 Action *LA;
 // Check if this Linker Job should emit a static library.
 if (ShouldEmitStaticLibrary(Args)) {
   LA = C.MakeAction(LinkerInputs, types::TY_Image);
-} else if (Args.hasArg(options::OPT_fopenmp_new_driver) &&
+} else if (UseNewOffloadingDriver &&
C.getActiveOffloadKinds() != Action::OFK_None) {
   LA = C.MakeAction(LinkerInputs, types::TY_Image);
   LA->propagateHostOffloadInfo(C.getActiveOffloadKinds(),
@@ -4019,7 +4024,7 @@
 } else {
   LA = C.MakeAction(LinkerInputs, types::TY_Image);
 }
-if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+if (!UseNewOffloadingDriver)
   LA = OffloadBuilder.processHostLinkAction(LA);
 Actions.push_back(LA);
   }
Index: 

[PATCH] D123325: [Clang] Make enabling the new driver more generic

2022-04-08 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 421607.
jhuber6 added a comment.

Rebase.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123325/new/

https://reviews.llvm.org/D123325

Files:
  clang/include/clang/Driver/Compilation.h
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/Driver.cpp
  clang/lib/Driver/ToolChains/Clang.cpp

Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -4400,7 +4400,8 @@
  JA.isDeviceOffloading(Action::OFK_Host));
   bool IsHostOffloadingAction =
   JA.isHostOffloading(C.getActiveOffloadKinds()) &&
-  Args.hasArg(options::OPT_fopenmp_new_driver);
+  (Args.hasArg(options::OPT_fopenmp_new_driver) ||
+   Args.hasArg(options::OPT_foffload_new_driver));
   bool IsUsingLTO = D.isUsingLTO(IsDeviceOffloadAction);
   auto LTOMode = D.getLTOMode(IsDeviceOffloadAction);
 
@@ -4688,7 +4689,8 @@
 if (JA.getType() == types::TY_LLVM_BC)
   CmdArgs.push_back("-emit-llvm-uselists");
 
-if (IsUsingLTO && !Args.hasArg(options::OPT_fopenmp_new_driver)) {
+if (IsUsingLTO && !(Args.hasArg(options::OPT_fopenmp_new_driver) ||
+Args.hasArg(options::OPT_foffload_new_driver))) {
   // Only AMDGPU supports device-side LTO.
   if (IsDeviceOffloadAction && !Triple.isAMDGPU()) {
 D.Diag(diag::err_drv_unsupported_opt_for_target)
Index: clang/lib/Driver/Driver.cpp
===
--- clang/lib/Driver/Driver.cpp
+++ clang/lib/Driver/Driver.cpp
@@ -3882,6 +3882,11 @@
   // Builder to be used to build offloading actions.
   OffloadingActionBuilder OffloadBuilder(C, Args, Inputs);
 
+  bool UseNewOffloadingDriver =
+  C.isOffloadingHostKind(C.getActiveOffloadKinds()) &&
+  (Args.hasArg(options::OPT_foffload_new_driver) ||
+   Args.hasArg(options::OPT_fopenmp_new_driver));
+
   // Construct the actions to perform.
   HeaderModulePrecompileJobAction *HeaderModuleAction = nullptr;
   ExtractAPIJobAction *ExtractAPIAction = nullptr;
@@ -3903,14 +3908,14 @@
 
 // Use the current host action in any of the offloading actions, if
 // required.
-if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+if (!UseNewOffloadingDriver)
   if (OffloadBuilder.addHostDependenceToDeviceActions(Current, InputArg))
 break;
 
 for (phases::ID Phase : PL) {
 
   // Add any offload action the host action depends on.
-  if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+  if (!UseNewOffloadingDriver)
 Current = OffloadBuilder.addDeviceDependencesToHostAction(
 Current, InputArg, Phase, PL.back(), FullPL);
   if (!Current)
@@ -3953,7 +3958,7 @@
 
   // Try to build the offloading actions and add the result as a dependency
   // to the host.
-  if (Args.hasArg(options::OPT_fopenmp_new_driver))
+  if (UseNewOffloadingDriver)
 Current = BuildOffloadingActions(C, Args, I, Current);
 
   // FIXME: Should we include any prior module file outputs as inputs of
@@ -3975,7 +3980,7 @@
 
   // Use the current host action in any of the offloading actions, if
   // required.
-  if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+  if (!UseNewOffloadingDriver)
 if (OffloadBuilder.addHostDependenceToDeviceActions(Current, InputArg))
   break;
 
@@ -3988,7 +3993,7 @@
   Actions.push_back(Current);
 
 // Add any top level actions generated for offloading.
-if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+if (!UseNewOffloadingDriver)
   OffloadBuilder.appendTopLevelActions(Actions, Current, InputArg);
 else if (Current)
   Current->propagateHostOffloadInfo(C.getActiveOffloadKinds(),
@@ -4004,14 +4009,14 @@
   }
 
   if (!LinkerInputs.empty()) {
-if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+if (!UseNewOffloadingDriver)
   if (Action *Wrapper = OffloadBuilder.makeHostLinkAction())
 LinkerInputs.push_back(Wrapper);
 Action *LA;
 // Check if this Linker Job should emit a static library.
 if (ShouldEmitStaticLibrary(Args)) {
   LA = C.MakeAction<StaticLibJobAction>(LinkerInputs, types::TY_Image);
-} else if (Args.hasArg(options::OPT_fopenmp_new_driver) &&
+} else if (UseNewOffloadingDriver &&
C.getActiveOffloadKinds() != Action::OFK_None) {
   LA = C.MakeAction<LinkerWrapperJobAction>(LinkerInputs, types::TY_Image);
   LA->propagateHostOffloadInfo(C.getActiveOffloadKinds(),
@@ -4019,7 +4024,7 @@
 } else {
   LA = C.MakeAction<LinkJobAction>(LinkerInputs, types::TY_Image);
 }
-if (!Args.hasArg(options::OPT_fopenmp_new_driver))
+if (!UseNewOffloadingDriver)
   LA = OffloadBuilder.processHostLinkAction(LA);
 Actions.push_back(LA);
   }
Index: clang/include/clang/Driver/Options.td

[PATCH] D120273: [OpenMP] Allow CUDA to be linked with OpenMP using the new driver

2022-04-08 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 421555.
jhuber6 added a comment.

Update handling for fatbinaries.
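
As a rough illustration of the fatbinary handling in this diff, the extension is now chosen from the detected file type first, with the target triple only deciding between `cubin` and `o` for relocatable ELF objects. The sketch below is a standalone model, not the code in the patch: the `Magic` enum and `deviceFileExtension` are hypothetical stand-ins for `llvm::file_magic` and `getDeviceFileExtension`, and the triple is reduced to a single `IsNVPTX` flag.

```cpp
#include <string>

// Hypothetical stand-in for the subset of llvm::file_magic used by the patch.
enum class Magic { Unknown, Bitcode, CudaFatbinary, ElfRelocatable };

// Follows the order of checks in the updated getDeviceFileExtension():
// the detected file type is consulted first, and the target only matters
// for relocatable ELF objects.
std::string deviceFileExtension(bool IsNVPTX, Magic M) {
  if (M == Magic::Bitcode)
    return "bc";
  if (M == Magic::CudaFatbinary)
    return "fatbin";
  if (M == Magic::Unknown)
    return "s"; // Unrecognized contents are treated as assembly.
  if (IsNVPTX && M == Magic::ElfRelocatable)
    return "cubin";
  return "o";
}
```

One visible consequence compared with the old code is that an AMDGPU triple no longer forces the `bc` extension; only contents identified as bitcode do.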


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120273/new/

https://reviews.llvm.org/D120273

Files:
  clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

Index: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
===
--- clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -149,6 +149,10 @@
 /// section string will be formatted as `.llvm.offloading.<triple>.<arch>`.
 #define OFFLOAD_SECTION_MAGIC_STR ".llvm.offloading."
 
+/// The magic offset for the first object inside CUDA's fatbinary. This can be
+/// different but it should work for what is passed here.
+static constexpr unsigned FatbinaryOffset = 0x50;
+
 /// Information for a device offloading file extracted from the host.
 struct DeviceFile {
   DeviceFile(StringRef Kind, StringRef TheTriple, StringRef Arch,
@@ -162,7 +166,10 @@
 };
 
 namespace llvm {
-/// Helper that allows DeviceFile to be used as a key in a DenseMap.
+/// Helper that allows DeviceFile to be used as a key in a DenseMap. For now we
+/// assume device files with matching architectures and triples but different
+/// offloading kinds should be handled together, this may not be true in the
+/// future.
 template <> struct DenseMapInfo<DeviceFile> {
   static DeviceFile getEmptyKey() {
 return {DenseMapInfo<StringRef>::getEmptyKey(),
@@ -202,11 +209,15 @@
 }
 
 static StringRef getDeviceFileExtension(StringRef DeviceTriple,
-bool IsBitcode = false) {
+file_magic Magic) {
   Triple TheTriple(DeviceTriple);
-  if (TheTriple.isAMDGPU() || IsBitcode)
+  if (Magic == file_magic::bitcode)
 return "bc";
-  if (TheTriple.isNVPTX())
+  if (Magic == file_magic::cuda_fatbinary)
+return "fatbin";
+  if (Magic == file_magic::unknown)
+return "s";
+  if (TheTriple.isNVPTX() && Magic == file_magic::elf_relocatable)
 return "cubin";
   return "o";
 }
@@ -310,8 +321,8 @@
 
 if (Expected<StringRef> Contents = Sec.getContents()) {
   SmallString<128> TempFile;
-  StringRef DeviceExtension = getDeviceFileExtension(
-  DeviceTriple, identify_magic(*Contents) == file_magic::bitcode);
+  StringRef DeviceExtension =
+  getDeviceFileExtension(DeviceTriple, identify_magic(*Contents));
   if (Error Err = createOutputFile(Prefix + "-" + Kind + "-" +
DeviceTriple + "-" + Arch,
DeviceExtension, TempFile))
@@ -424,8 +435,8 @@
 
 StringRef Contents = CDS->getAsString();
 SmallString<128> TempFile;
-StringRef DeviceExtension = getDeviceFileExtension(
-DeviceTriple, identify_magic(Contents) == file_magic::bitcode);
+StringRef DeviceExtension =
+getDeviceFileExtension(DeviceTriple, identify_magic(Contents));
 if (Error Err = createOutputFile(Prefix + "-" + Kind + "-" + DeviceTriple +
  "-" + Arch,
  DeviceExtension, TempFile))
@@ -933,11 +944,34 @@
 MemoryBuffer::getFileOrSTDIN(File);
 if (std::error_code EC = BufferOrErr.getError())
   return createFileError(File, EC);
+MemoryBufferRef Buffer = **BufferOrErr;
 
 file_magic Type = identify_magic((*BufferOrErr)->getBuffer());
-if (Type != file_magic::bitcode) {
+switch (Type) {
+case file_magic::bitcode: {
+  Expected<std::unique_ptr<lto::InputFile>> InputFileOrErr =
+  llvm::lto::InputFile::create(Buffer);
+  if (!InputFileOrErr)
+return InputFileOrErr.takeError();
+
+  // Save the input file and the buffer associated with its memory.
+  BitcodeFiles.push_back(std::move(*InputFileOrErr));
+  SavedBuffers.push_back(std::move(*BufferOrErr));
+  continue;
+}
+case file_magic::cuda_fatbinary: {
+  // Cuda fatbinaries almost always have an object eighty bytes from the
+  // beginning. This should be sufficient to identify the symbols.
+  Buffer = MemoryBufferRef(
+  (*BufferOrErr)->getBuffer().drop_front(FatbinaryOffset), "FatBinary");
+  LLVM_FALLTHROUGH;
+}
+case file_magic::elf_relocatable:
+case file_magic::elf_shared_object:
+case file_magic::macho_object:
+case file_magic::coff_object: {
   Expected<std::unique_ptr<ObjectFile>> ObjFile =
-  ObjectFile::createObjectFile(**BufferOrErr, Type);
+  ObjectFile::createObjectFile(Buffer);
   if (!ObjFile)
 return ObjFile.takeError();
 
@@ -953,15 +987,10 @@
 else
   UsedInSharedLib.insert(Saver.save(*Name));
   }
-} else {
-  Expected<std::unique_ptr<lto::InputFile>> InputFileOrErr =
-  llvm::lto::InputFile::create(**BufferOrErr);
-  if (!InputFileOrErr)
-return InputFileOrErr.takeError();
-
-  // Save the input file and the buffer 

[PATCH] D123471: [CUDA] Create offloading entries when using the new driver

2022-04-12 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

In D123471#3446464, @tra wrote:

> I've mentioned in D123441 that it would be 
> useful to have a list of GPU-side symbols needed by the host and this offload 
> info is pretty close to what we need. The only remaining feature is being 
> able to extract them with an external tool, so we could pass them to the GPU-side 
> linker. Perhaps we could just generate a GPU-side stub file which would only 
> have an array of needed GPU-side references, compile and add it to the 
> GPU-side linker as yet another input which would ensure we do link in the 
> exact set of GPU objects from the static libraries.

These will probably only be valid once the final executable is linked. Since 
the structure contains pointers to other symbols, they'll only have non-null 
values after the final link. After linking for the host you should be able 
to just use something like `objdump -s -j cuda_offloading_entries` to get all 
of them. For my use case I only need to be able to iterate these symbols when 
the program is run. If we want to use this for something else it would be good 
to keep them synced up to avoid duplicating error. Also, the patches say "CUDA" 
but the vast majority will also apply to HIP without much change.
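
To make the "iterate these symbols when the program is run" part concrete, here is a minimal sketch of walking a range of offload entries. It rests on several assumptions: `OffloadEntry` is a hypothetical copy of the entry layout quoted in this thread (without any new fields), `entryNames` is an illustrative helper, and the range is passed in as plain pointers. In a linked ELF image the bounds would typically be the linker-defined `__start_<section>` and `__stop_<section>` symbols for the entries section.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical mirror of the offload entry layout discussed in this thread.
struct OffloadEntry {
  void *addr;       // Address of the function or global.
  char *name;       // Name of the function or global.
  size_t size;      // Size of the entry (0 for a function).
  int32_t flags;
  int32_t reserved;
};

// Collect the names of all usable entries in [Begin, End). Entries whose
// pointers are still null are skipped, on the assumption that they have not
// been resolved by the final link yet.
std::vector<std::string> entryNames(const OffloadEntry *Begin,
                                    const OffloadEntry *End) {
  assert(Begin <= End && "section bounds must be ordered");
  std::vector<std::string> Names;
  for (const OffloadEntry *E = Begin; E != End; ++E)
    if (E->addr && E->name)
      Names.push_back(E->name);
  return Names;
}
```

A runtime would register each entry with the device image rather than collect names; the loop shape is the point here.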


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123471/new/

https://reviews.llvm.org/D123471



[PATCH] D123471: [CUDA] Create offloading entries when using the new driver

2022-04-12 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment.

In D123471#3446751, @yaxunl wrote:

> HIP is considering a unified device binary embedding scheme with OpenMP. 
> However, some large MI frameworks are compiled with -fno-gpu-rdc. If 
> compiling with -fgpu-rdc, the linking time will significantly increase since 
> the post-linking optimizations take much longer time with the large linked 
> IR. Therefore, it would be desirable if the new OpenMP device binary 
> embedding scheme supports -fno-gpu-rdc mode.

This work should be very close to that. The new driver allows us to link 
everything together so OpenMP can call HIP / CUDA functions and vice versa. I 
have done some preliminary tests with registering CUDA device variables with 
OpenMP; the only change required is to store these offloading sections at 
`omp_offloading_entries`, and the OpenMP runtime will pick them up and try to 
register them. This method allows us to compile HIP / CUDA with OpenMP, but 
since we're going to be registering two different images they'll have unique 
state. For full interoperability we'd need some way to make either HIP / CUDA 
or OpenMP "borrow" the other one's registered image so they can share the state.

> That said, I think this new scheme may work for -fno-gpu-rdc, probably with 
> some minor changes.

My understanding is that non-RDC builds do all the registration per-TU. Since 
that's the case then we should just be able to link them as we do now and they 
won't emit any device code that needs to be linked. So individual files could 
specify no-rdc and then they wouldn't be touched by the device linker run later.

> For -fno-gpu-rdc, each TU has its own device binary, so the device binaries 
> in the final image would be per GPU and per TU. That seems not a big problem 
> since they can be post-fixed with a unique ID for each TU.
>
> Different offload entries may have the same name in different TU's, therefore 
> an offload entry may not be uniquely identified by its name. To uniquely 
> identify an offload entry, it needs its name and the pointer to its belonging 
> device binary. Therefore, it would be desirable to have one extra field 
> 'owner':
>
>   struct __tgt_offload_entry {
>     void    *addr;     // Pointer to the offload entry info
>                        // (function or global).
>     char    *name;     // Name of the function or global.
>     size_t   size;     // Size of the entry info (0 if it is a function).
>     int32_t  flags;
>     void    *owner;    // Pointer to the device binary containing this
>                        // offload entry.
>     int32_t  reserved;
>   };
>
> It may be possible to use the `reserved` field for that purpose. However, it 
> is unclear whether `reserved` will be used for some other purpose later.

For OpenMP we use an `exec_mode` global to control some kernel execution, so 
there's a possibility we'd want to put it in the reserved field instead. We 
could add more fields to this, but it would break the ABI. We could work around 
that, but it would add some complexity.

> Another choice is to let addr point to a struct which contains owner info. 
> However, that would introduce another level of indirection.

Yeah, I think for arbitrary extensions that would be the easiest way without 
breaking the ABI. We could use the reserved field to indicate if we have some 
"extension" there.

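The indirection idea can be sketched roughly as follows. Everything here is an assumption for illustration: `OffloadEntry` is a hypothetical copy of the existing layout, `ExtendedInfo` and the `HasExtension` marker are invented names, and nothing like this exists in the runtime today. The point is only that the entry itself keeps its current fields, so its layout does not change.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the existing entry layout; no fields are added.
struct OffloadEntry {
  void *addr;
  char *name;
  size_t size;
  int32_t flags;
  int32_t reserved;
};

// Hypothetical extension record that `addr` points to for extended entries.
struct ExtendedInfo {
  void *addr;  // The actual function or global.
  void *owner; // Device binary this entry belongs to.
};

// Hypothetical marker stored in `reserved` to signal the extra indirection.
constexpr int32_t HasExtension = 1;

// Resolve the real address, following the indirection only when it is marked.
void *entryAddress(const OffloadEntry &E) {
  if (E.reserved == HasExtension)
    return static_cast<const ExtendedInfo *>(E.addr)->addr;
  return E.addr;
}

// Return the owning device binary, or nullptr for entries without extension.
void *entryOwner(const OffloadEntry &E) {
  if (E.reserved == HasExtension)
    return static_cast<const ExtendedInfo *>(E.addr)->owner;
  return nullptr;
}
```

Existing consumers that never set the marker would keep reading `addr` directly, at the cost of one extra load for extended entries.
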
I think we're working through some similar stuff. I haven't worked much with 
HIP but I think there would be some benefit to bringing this all under the new 
driver I've been working on for OpenMP. Let me know if you want to collaborate 
on something for getting this to work with HIP.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123471/new/

https://reviews.llvm.org/D123471



[PATCH] D120272: [CUDA] Add driver support for compiling CUDA with the new driver

2022-04-07 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 updated this revision to Diff 421270.
jhuber6 added a comment.

Make `-foffload-new-driver` imply GPU-RDC mode; it won't work otherwise. Also 
adjust tests.
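
A small standalone model of what "imply GPU-RDC mode" means for the `Relocatable` computation in this diff is sketched below. The helpers are simplified stand-ins written for this example, not the real `ArgList` API: `hasFlag` models the "last of the positive/negative pair wins, otherwise use the default" behaviour, and arguments are plain strings.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Simplified model of a flag pair lookup: the last occurrence of Pos or Neg
// wins, and Default applies when neither is present.
bool hasFlag(const std::vector<std::string> &Args, const std::string &Pos,
             const std::string &Neg, bool Default) {
  bool Result = Default;
  for (const std::string &A : Args) {
    if (A == Pos)
      Result = true;
    else if (A == Neg)
      Result = false;
  }
  return Result;
}

// Simplified model of a presence check for a single option.
bool hasArg(const std::vector<std::string> &Args, const std::string &Opt) {
  return std::find(Args.begin(), Args.end(), Opt) != Args.end();
}

// Same shape as the updated condition: the new driver forces relocatable
// device code regardless of the -fgpu-rdc / -fno-gpu-rdc pair.
bool isRelocatable(const std::vector<std::string> &Args) {
  return hasArg(Args, "-foffload-new-driver") ||
         hasFlag(Args, "-fgpu-rdc", "-fno-gpu-rdc", /*Default=*/false);
}
```

Under this model an explicit `-fno-gpu-rdc` has no effect once `-foffload-new-driver` is given, which matches the shape of the condition in the diff.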


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120272/new/

https://reviews.llvm.org/D120272

Files:
  clang/include/clang/Basic/Cuda.h
  clang/lib/Driver/Driver.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Driver/ToolChains/Cuda.cpp
  clang/lib/Driver/ToolChains/HIPAMD.cpp
  clang/test/Driver/cuda-openmp-driver.cu

Index: clang/test/Driver/cuda-openmp-driver.cu
===
--- /dev/null
+++ clang/test/Driver/cuda-openmp-driver.cu
@@ -0,0 +1,18 @@
+// REQUIRES: x86-registered-target
+// REQUIRES: nvptx-registered-target
+
+// RUN: %clang -### -target x86_64-linux-gnu -nocudalib -ccc-print-bindings -fgpu-rdc \
+// RUN:-foffload-new-driver --offload-arch=sm_35 --offload-arch=sm_70 %s 2>&1 \
+// RUN: | FileCheck -check-prefix BINDINGS %s
+
+// BINDINGS: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT:.+]]"], output: "[[PTX_SM_35:.+]]"
+// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[PTX_SM_35]]"], output: "[[CUBIN_SM_35:.+]]"
+// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "NVPTX::Linker", inputs: ["[[CUBIN_SM_35]]", "[[PTX_SM_35]]"], output: "[[FATBIN_SM_35:.+]]"
+// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]"], output: "[[PTX_SM_70:.+]]"
+// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[PTX_SM_70:.+]]"], output: "[[CUBIN_SM_70:.+]]"
+// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "NVPTX::Linker", inputs: ["[[CUBIN_SM_70]]", "[[PTX_SM_70:.+]]"], output: "[[FATBIN_SM_70:.+]]"
+// BINDINGS-NEXT: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT]]", "[[FATBIN_SM_35]]", "[[FATBIN_SM_70]]"], output: "[[HOST_OBJ:.+]]"
+// BINDINGS-NEXT: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"
+
+// RUN: %clang -### -nocudalib -foffload-new-driver %s 2>&1 | FileCheck -check-prefix RDC %s
+// RDC: ptxas{{.*}}-c
Index: clang/lib/Driver/ToolChains/HIPAMD.cpp
===
--- clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -188,7 +188,7 @@
 CC1Args.push_back("-fcuda-approx-transcendentals");
 
   if (!DriverArgs.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc,
-  false))
+  false) || DriverArgs.hasArg(options::OPT_foffload_new_driver))
 CC1Args.append({"-mllvm", "-amdgpu-internalize-symbols"});
 
   StringRef MaxThreadsPerBlock =
Index: clang/lib/Driver/ToolChains/Cuda.cpp
===
--- clang/lib/Driver/ToolChains/Cuda.cpp
+++ clang/lib/Driver/ToolChains/Cuda.cpp
@@ -461,7 +461,8 @@
options::OPT_fnoopenmp_relocatable_target,
/*Default=*/true);
   else if (JA.isOffloading(Action::OFK_Cuda))
-Relocatable = Args.hasFlag(options::OPT_fgpu_rdc,
+Relocatable = Args.hasArg(options::OPT_foffload_new_driver) || 
+  Args.hasFlag(options::OPT_fgpu_rdc,
options::OPT_fno_gpu_rdc, /*Default=*/false);
 
   if (Relocatable)
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -6248,6 +6248,8 @@
   if (IsCuda || IsHIP) {
 if (Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc, false))
   CmdArgs.push_back("-fgpu-rdc");
+if (Args.hasArg(options::OPT_foffload_new_driver))
+  CmdArgs.push_back("-fgpu-rdc");
 if (Args.hasFlag(options::OPT_fgpu_defer_diag,
  options::OPT_fno_gpu_defer_diag, false))
   CmdArgs.push_back("-fgpu-defer-diag");
@@ -6930,6 +6932,8 @@
   CmdArgs.push_back(CudaDeviceInput->getFilename());
   if (Args.hasFlag(options::OPT_fgpu_rdc, options::OPT_fno_gpu_rdc, false))
 CmdArgs.push_back("-fgpu-rdc");
+  if (Args.hasArg(options::OPT_foffload_new_driver))
+CmdArgs.push_back("-fgpu-rdc");
   }
 
   if (IsCuda) {
@@ -8250,14 +8254,17 @@
   ArgStringList CmdArgs;
 
   // Pass the CUDA path to the linker wrapper tool.
-  for (auto &I : llvm::make_range(OpenMPTCRange.first, OpenMPTCRange.second)) {
-const ToolChain *TC = I.second;
-if (TC->getTriple().isNVPTX()) {
-  CudaInstallationDetector CudaInstallation(D, TheTriple, Args);
-  if (CudaInstallation.isValid())
-CmdArgs.push_back(Args.MakeArgString(
-"--cuda-path=" + CudaInstallation.getInstallPath()));
-  break;
+  for (Action::OffloadKind Kind : {Action::OFK_Cuda, Action::OFK_OpenMP}) {
+auto TCRange = C.getOffloadToolChains(Kind);
+for (auto &I : llvm::make_range(TCRange.first, TCRange.second)) {
+ 
