JonChesterfield added a comment.

An example of using a function rather than intrinsics is __kmpc_get_hardware_num_threads_in_block, just above the code you've modified. It corresponds to a function in the device runtime, e.g.

  int __kmpc_get_hardware_num_threads_in_block() {
    return get_workgroup_dim(__builtin_amdgcn_workgroup_id_x(),
                             __builtin_amdgcn_grid_size_x(),
                             __builtin_amdgcn_workgroup_size_x());
  }

and

  int __kmpc_get_hardware_num_threads_in_block() {
    return __nvvm_read_ptx_sreg_ntid_x();
  }


================
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp:3968
+
+  if (Triple.isNVPTX()) {
+    llvm::Function *F = llvm::Intrinsic::getDeclaration(
----------------
This does work. The benefit of adding the functions to the device runtime (which contain these intrinsic calls) is we get uniformity of the generated IR, modulo the unfortunate addrspace casts, so we can do nice things like pattern match on the name of the device runtime function.


================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:245
   case llvm::Triple::nvptx:
   case llvm::Triple::nvptx64:
     assert(getLangOpts().OpenMPIsDevice &&
----------------
Looks like we could fold these cases by renaming the assert.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113421/new/

https://reviews.llvm.org/D113421

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
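
To illustrate the point about matching on the name of the device runtime function, here is a minimal sketch. It does not use the LLVM API: ToyCall, countNumThreadsQueries and checkUniformity are hypothetical names invented for this example, and a "call" is modelled as nothing more than its callee name. The assumption is only that both targets emit a call to the same runtime symbol, so one string comparison covers either module, whereas an inlined target intrinsic would need a per-target check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a call instruction; only the callee name is modelled.
// This is NOT the LLVM API, just an illustration.
struct ToyCall {
  std::string Callee;
};

// Name of the device runtime entry point quoted in the review comment.
static const char *const NumThreadsFn =
    "__kmpc_get_hardware_num_threads_in_block";

// Hypothetical helper: count block-size queries by callee name alone.
std::size_t countNumThreadsQueries(const std::vector<ToyCall> &Body) {
  std::size_t N = 0;
  for (const ToyCall &C : Body)
    if (C.Callee == NumThreadsFn)
      ++N;
  return N;
}

// Small self-check contrasting the two code generation strategies.
void checkUniformity() {
  // Both targets call the runtime function: one name matches either module.
  std::vector<ToyCall> Amdgcn = {ToyCall{NumThreadsFn},
                                 ToyCall{"__kmpc_barrier"}};
  std::vector<ToyCall> Nvptx = {ToyCall{NumThreadsFn}};
  assert(countNumThreadsQueries(Amdgcn) == 1);
  assert(countNumThreadsQueries(Nvptx) == 1);

  // Intrinsic emitted inline: the name-based matcher no longer sees it.
  std::vector<ToyCall> NvptxInline = {
      ToyCall{"llvm.nvvm.read.ptx.sreg.ntid.x"}};
  assert(countNumThreadsQueries(NvptxInline) == 0);
}
```

Calling checkUniformity() exercises all three cases. A real pass would compare the called function's name on actual IR call instructions, but the shape of the check is the same.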