JonChesterfield added a comment.

An example of a function, as opposed to an intrinsic, is
__kmpc_get_hardware_num_threads_in_block, just above the code you modified.
It corresponds to a function in the device runtime, e.g.

  int __kmpc_get_hardware_num_threads_in_block() {
    return get_workgroup_dim(__builtin_amdgcn_workgroup_id_x(),
                             __builtin_amdgcn_grid_size_x(),
                             __builtin_amdgcn_workgroup_size_x());
  }

and

  int __kmpc_get_hardware_num_threads_in_block() {
    return __nvvm_read_ptx_sreg_ntid_x();
  }



================
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp:3968
+
+  if (Triple.isNVPTX()) {
+    llvm::Function *F = llvm::Intrinsic::getDeclaration(
----------------
This does work. The benefit of adding functions to the device runtime
(which contain these intrinsic calls) is that we get uniform generated IR,
modulo the unfortunate addrspace casts, so we can do nice things like pattern
match on the name of the device runtime function.
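
To illustrate the name-matching point, here is a minimal sketch. It does not use the real LLVM API: the Call struct and both helper functions are hypothetical stand-ins, and only the runtime function name comes from the comment above. It assumes a pass sees a list of callee names and wants to find block-size queries regardless of target.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a call instruction; real code would inspect
// llvm::CallInst and its called function instead.
struct Call {
  std::string Callee;
};

// Device runtime function name, taken from the review comment.
const char *const RuntimeFn = "__kmpc_get_hardware_num_threads_in_block";

// With a runtime wrapper, a single name identifies the query on every
// target, so no per-target intrinsic list is needed.
bool isBlockSizeQuery(const Call &C) { return C.Callee == RuntimeFn; }

// Count the calls in a (toy) function body that query the block size.
int countBlockSizeQueries(const std::vector<Call> &Body) {
  int N = 0;
  for (const Call &C : Body)
    if (isBlockSizeQuery(C))
      ++N;
  return N;
}
```

If the frontend emitted the intrinsics directly, the matcher would instead need one pattern per target, and on AMDGPU the query is a short sequence of operations rather than a single call.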


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:245
   case llvm::Triple::nvptx:
   case llvm::Triple::nvptx64:
     assert(getLangOpts().OpenMPIsDevice &&
----------------
Looks like we could fold these cases by rewording the assert message.
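
A rough sketch of what the folded switch might look like. This is not the actual CodeGenModule code: the Arch and Runtime enums, selectRuntime, and the assert text are all hypothetical, and it assumes a neighbouring amdgcn case has the same body as the nvptx cases quoted above.

```cpp
#include <cassert>

// Hypothetical stand-ins for llvm::Triple::ArchType and the runtime choice.
enum class Arch { NVPTX, NVPTX64, AMDGCN, Other };
enum class Runtime { GPU, Host };

// Folded form: all GPU targets share one case body and one
// target-neutral assert message.
Runtime selectRuntime(Arch A, bool IsDevice) {
  switch (A) {
  case Arch::NVPTX:
  case Arch::NVPTX64:
  case Arch::AMDGCN:
    assert(IsDevice && "OpenMP GPU codegen only handles device code.");
    (void)IsDevice; // avoids an unused-parameter warning when NDEBUG is set
    return Runtime::GPU;
  default:
    return Runtime::Host;
  }
}
```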


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113421/new/

https://reviews.llvm.org/D113421
