AlexVlx wrote:

> > > > > We added sema check @
> > > > > https://github.com/llvm/llvm-project/blob/8378a6fa4f5c83298fb0b5e240bb7f254f7b1137/clang/lib/Sema/SemaCUDA.cpp#L83
> > > > >
> > > > > to generate error message on HIP based on Sam's request as HIP
> > > > > currently doesn't support device-side kernel calls. I don't follow
> > > > > how we could have `CUDAKernelCallExpr` in the device compilation.
> > > > > Could you elaborate in details?
> > > >
> > > > The sema check doesn't work as is for `hipstdpar`, because it's gated
> > > > on the current target being either a `__global__` function or a
> > > > `__device__` function. What happens is that we do the parsing on a
> > > > normal function, the <<<>>> expression is semantically valid, and then
> > > > we try to `EmitCUDAKernelCallExpr`, because at CodeGen that is gated on
> > > > whether the entire compilation is host or device, not on whether or not
> > > > the caller is `__global__` or `__device__`. So either the latter check
> > > > should actually establish the caller's context, or we should bypass
> > > > this altogether when compiling for hipstdpar. This is the simplest NFC
> > > > workaround to unbreak things.
> > >
> > > Why not add `getLangOpts().HIPStdPar` check in sema to skip generating
> > > device-side kernel call? So that we have a central place to make that
> > > decision?
> >
> > Because, as far as I can ascertain, the `Sema` check is insufficient / the
> > separate assert in `EmitCUDAKernelCallExpr` is disjoint. Here's what would
> > happen:
> >
> > 1. In Sema what we see is that `IsDeviceKernelCall` is false - this is
> > fine, but we still would emit a `CudaKernelCallExpr` for the `<<<>>>`
> > callsite, which was the case anyways before this change;
>
> You mean that so far we could generate `CudaKernelCallExpr` in the device
> compilation but it's not a device-side kernel call. I don't follow how that
> could happen. You mean, under hipstdpar, `<<<>>>` could be used in the device
> side but not being treated as a device kernel call. What's the semantics of
> that?
>
> > 2. Later on, when we get to `CodeGen`, we see the `CudaKernelCallExpr`, and
> > try to handle it, except now the assumption is that if we're compiling for
> > device and we see that, it must be a device side launch, and go look up a
> > non-existent symbol, and run into the bug.
The semantics of that is: dead code. The host-side call scaffolding obtained via `EmitCUDAKernelCallExpr` is emitted in a function that subsequently gets pruned. `hipstdpar` works based on reachability rather than explicit `__device__` / `__global__` / `__host__` attributes, so when compiling for device any function is a tentative device function, including e.g. `main`.

https://github.com/llvm/llvm-project/pull/171043
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
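
To make the reachability point concrete, here is a rough toy model in plain C++. It is not how Clang or the `hipstdpar` passes are actually implemented; the call graph, the function names (`reachableFrom`, `prunedFunctions`, `exampleGraph`) and the choice of kernels as the only roots are all assumptions made for illustration. It only shows why a `<<<>>>` call site inside `main` ends up as dead code in the device compilation: nothing reachable from a kernel calls `main`, so `main` is pruned.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call graph: function name -> names of the functions it calls.
// Hypothetical stand-in for what a compiler would derive from the module.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Collect every function reachable from the given roots (iterative DFS).
std::set<std::string> reachableFrom(const CallGraph &graph,
                                    const std::vector<std::string> &roots) {
  std::set<std::string> seen;
  std::vector<std::string> worklist(roots.begin(), roots.end());
  while (!worklist.empty()) {
    std::string fn = worklist.back();
    worklist.pop_back();
    if (!seen.insert(fn).second)
      continue; // already visited
    auto it = graph.find(fn);
    if (it == graph.end())
      continue; // no recorded callees
    for (const std::string &callee : it->second)
      worklist.push_back(callee);
  }
  return seen;
}

// Functions present in the graph but not reachable from any root.
// In this model these are the "tentative device functions" that get dropped.
std::vector<std::string> prunedFunctions(const CallGraph &graph,
                                         const std::vector<std::string> &roots) {
  std::set<std::string> live = reachableFrom(graph, roots);
  std::vector<std::string> pruned;
  for (const auto &entry : graph)
    if (live.count(entry.first) == 0)
      pruned.push_back(entry.first);
  return pruned;
}

// Hypothetical translation unit: `main` launches `kernel` via <<<>>>,
// and `kernel` calls `helper`. Nothing on the device side calls `main`.
CallGraph exampleGraph() {
  return {
      {"main", {"kernel"}},
      {"kernel", {"helper"}},
      {"helper", {}},
  };
}
```

With `kernel` as the only root (again, an assumption of this sketch), `prunedFunctions(exampleGraph(), {"kernel"})` returns just `main`, so any launch scaffolding emitted inside it never survives into the final device code.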
