AlexVlx wrote:

> > > > > We added sema check @ 
> > > > > https://github.com/llvm/llvm-project/blob/8378a6fa4f5c83298fb0b5e240bb7f254f7b1137/clang/lib/Sema/SemaCUDA.cpp#L83
> > > > > 
> > > > > to generate error message on HIP based on Sam's request as HIP 
> > > > > currently doesn't support device-side kernel calls. I don't follow 
> > > > > how we could have `CUDAKernelCallExpr` in the device compilation. 
> > > > > Could you elaborate in detail?
> > > > 
> > > > 
> > > > The sema check doesn't work as is for `hipstdpar`, because it's gated 
> > > > on the current target being either a `__global__` function or a 
> > > > `__device__` function. What happens is that we do the parsing on a 
> > > > normal function, the <<<>>> expression is semantically valid, and then 
> > > > we try to `EmitCUDAKernelCallExpr`, because at CodeGen that is gated on 
> > > > whether the entire compilation is host or device, not on whether or not 
> > > > the caller is `__global__` or `__device__`. So either the latter check 
> > > > should actually establish the caller's context, or we should bypass 
> > > > this altogether when compiling for hipstdpar. This is the simplest NFC 
> > > > workaround to unbreak things.
> > > 
> > > 
> > > Why not add `getLangOpts().HIPStdPar` check in sema to skip generating 
> > > device-side kernel call? So that we have a central place to make that 
> > > decision?
> > 
> > 
> > Because, as far as I can ascertain, the `Sema` check is insufficient / the 
> > separate assert in `EmitCUDAKernelCallExpr` is disjoint. Here's what would 
> > happen:
> > 
> > 1. In Sema what we see is that `IsDeviceKernelCall` is false - this is 
> > fine, but we would still emit a `CUDAKernelCallExpr` for the `<<<>>>` 
> > callsite, which was the case anyway before this change;
> 
> You mean that so far we could generate `CUDAKernelCallExpr` in the device 
> compilation without it being a device-side kernel call? I don't follow how 
> that could happen. You mean that, under hipstdpar, `<<<>>>` could be used on 
> the device side without being treated as a device kernel call? What are the 
> semantics of that?
> 
> > 2. Later on, when we get to `CodeGen`, we see the `CUDAKernelCallExpr` and 
> > try to handle it, except now the assumption is that if we're compiling for 
> > device and we see that, it must be a device-side launch, so we go look up a 
> > non-existent symbol and run into the bug.

The semantics of that are: dead code. The host-side call scaffolding obtained 
via `EmitCUDAKernelCallExpr` is emitted in a function that subsequently gets 
pruned. `hipstdpar` operates based on reachability rather than on explicit 
`__device__` / `__global__` / `__host__` attributes, so when compiling for 
device any function is tentatively a device function, including e.g. `main`.

https://github.com/llvm/llvm-project/pull/171043
