jhuber6 added a comment.

Where is the code to change the target registration? We need a new 
initialization runtime call to use and change the old one to reallocate the 
`__tgt_device_image` array. Did you work around that some other way?



================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:613-614
   CmdArgs.push_back("-shared");
+  std::string ArchArg = std::string("-plugin-opt=mcpu=").append(Arch.str());
+  CmdArgs.push_back(ArchArg);
   CmdArgs.push_back("-o");
----------------
Does this just pass the architecture to the AMD link? Is this related? If not 
move it to a separate patch and I'll review it.


================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1069
     bool WholeProgram = false;
+    std::string TheArch = File.Arch;
 
----------------
Is this necessary?


================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1153
   SmallVector<ArrayRef<char>, 4> ImagesToWrap;
+  SmallVector<ArrayRef<char>, 4> OffloadArchs;
+  std::string Arch;
----------------
This should be a StringRef, the data is stable. When you create the struct just 
use this and it will add the null terminator for you.
```
Constant *ConstantStr = ConstantDataArray::getString(M.getContext(), Str);
```


================
Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:1165
+
+    if (Arch.empty()) {
+      Arch = std::string(File.Arch);
----------------
An empty `StringRef` should already create a null string when you create a 
ConstantDataArray, should be fine to just push it back.


================
Comment at: clang/tools/clang-linker-wrapper/OffloadWrapper.cpp:62-63
+//   int32_t version;
+//   int32_t image_number;
+//   int32_t number_images;
+//   char* offload_arch;
----------------
The struct should be contained within the image itself so we can query it 
during registration, why do we need to know the number?


================
Comment at: clang/tools/clang-linker-wrapper/OffloadWrapper.cpp:64-65
+//   int32_t number_images;
+//   char* offload_arch;
+//   char* target_compile_opts;
+// };
----------------



================
Comment at: clang/tools/clang-linker-wrapper/OffloadWrapper.cpp:191-194
+  auto *NullPtr = llvm::ConstantPointerNull::get(Type::getInt8PtrTy(C));
+  unsigned int ImgCount = 0;
+  std::string OffloadArchBase = "__offload_arch";
+  std::string OffloadImageBase = "offload_image_info";
----------------
You should be able to pass the name to the constructor, if it's internal LLVM 
will give you a new name.


================
Comment at: clang/tools/clang-linker-wrapper/OffloadWrapper.cpp:214
 
-    ImagesInits.push_back(ConstantStruct::get(getDeviceImageTy(M), ImageB,
-                                              ImageE, EntriesB, EntriesE));
+    auto OArch = OffloadArchs[ImgCount];
+    Constant *OArchV = ConstantDataArray::get(C, OArch);
----------------
I would just assert that the metadata vector's size matches the number of 
images, we're always going to generate these now so it makes sense to generate 
them in pairs. Then just use `llvm::zip` or just an iterator to go through both 
of these at the same time.


================
Comment at: clang/tools/clang-linker-wrapper/OffloadWrapper.cpp:215-227
+    Constant *OArchV = ConstantDataArray::get(C, OArch);
+    std::string OffloadArchGV(OffloadArchBase),
+        OffloadImageGV(OffloadImageBase);
+    if (ImgCount) {
+      auto Suffix = std::to_string(ImgCount);
+      OffloadArchGV.append(".").append(Suffix);
+      OffloadImageGV.append(".").append(Suffix);
----------------
Why do we need all these strings? Should just be generating a constant 
initializer.


================
Comment at: clang/tools/clang-linker-wrapper/OffloadWrapper.cpp:229-232
+    // store value of these variables (i.e. offload archs) into a custom
+    // section which will be used by "offload-arch -f". It won't be
+    // removed during binary stripping.
+    GV->setSection(".offload_arch_list");
----------------
Why does this need a custom section? We should just use it like this, not sure 
why we need these to  be anything but some internal struct.
```
for (auto *Image : BinDesc->Images) {
  if (Image->Info)
   // Use Info
}
```


================
Comment at: clang/tools/clang-linker-wrapper/OffloadWrapper.cpp:240-241
+        getImageInfoTy(M), ConstantInt::get(Type::getInt32Ty(C), 1),
+        ConstantInt::get(Type::getInt32Ty(C), ImgCount++),
+        ConstantInt::get(Type::getInt32Ty(C), (uint32_t)OffloadArchs.size()),
+        RequirementVPtr,
----------------
Why do we need this when we should already know which image we're looking at?


================
Comment at: openmp/libomptarget/include/omptarget.h:122
 
+/// __tgt_image_info:
+///
----------------
This comment is extremely long. It should be sufficient to just say what it 
does, 3 lines max. Specify that it may be null in the device image struct.


================
Comment at: openmp/libomptarget/include/omptarget.h:145-146
+  int32_t version;           // The version of this struct
+  int32_t image_number;      // Image number in image library starting from 0
+  int32_t number_images;     // Number of images, used for initial allocation
+  char *offload_arch;        // e.g. sm_30, sm_70, gfx906, includes features
----------------
I still feel like we don't need these, correct me if I'm wrong.


================
Comment at: openmp/libomptarget/include/omptarget.h:157
   __tgt_offload_entry *EntriesEnd;   // End of table (non inclusive)
+  __tgt_image_info *ImageInfo;       // Metadata about the image
 };
----------------



================
Comment at: openmp/libomptarget/include/omptarget.h:169-176
+/// __tgt_active_offload_env
+///
+/// This structure is created by __tgt_get_active_offload_env and is used
+/// to determine compatibility of the images with the current environment
+/// that is "in play".
+struct __tgt_active_offload_env {
+char *capabilities; // string returned by offload-arch -c
----------------
Shouldn't the plugin handle this? We should be able to get the architecture in 
the plugin and select the info struct that matches it, if it exists. Then if 
the plugin finds one that matches we indicate that it's compatible like we do 
now. Correct me if I'm wrong.


================
Comment at: openmp/libomptarget/src/rtl.cpp:444
+
+  RTLInfoTy *FoundRTL = NULL;
   PM->RTLsMtx.lock();
----------------
Shouldn't we be able to have the plugin handle its own architecture detection, 
then return from `R.is_valid_binary`? That makes more sense to me than 
introducing a new tool to libomptarget, since we can already query the 
architecture from the CUDA runtime for example.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124525/new/

https://reviews.llvm.org/D124525

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to