[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

Alex Voicu via cfe-commits Tue, 17 Jun 2025 13:42:13 -0700

================
@@ -9102,6 +9102,15 @@ bool InitializationSequence::Diagnose(Sema &S,
 
   case FK_ConversionFailed: {
     QualType FromType = OnlyArg->getType();
+    // __amdgpu_feature_predicate_t can be explicitly cast to the logical op
+    // type, although this is almost always an error and we advise against it
----------------
AlexVlx wrote:


Very good questions. Optimisation matters because of e.g. inlining - it would 
not really do much much in the case I outlined, except if you had `foo` inlined 
into `bar`, then statically resolved the predicate value and picked just that. 
This is doable, but not guaranteed, and wouldn't happen at O0, which induces a 
dependence we do not want. There is a dedicated pass we are adding which does 
basic const folding of a predicate, so if that doesn't work (because of wading 
into one of these intricacies), the user gets a fairly obnoxious error - we do 
try to tell them where there's a problem and what happened, but it's still a 
bit opaque.

But let's step back a bit to the first case i.e. why would stashing these in a 
vector / container be dangerous (but let's generalise to squirrelling in 
general - I picked a fairly banal example, but substitute, say, an 
`atomic<bool>` in there). Let's recall what these are meant for: controlling 
the lowering of target specific sequences. Now, irrespective if the target is 
concrete (`gfxSMTH`), i.e. we expand the predicate value in the FE, or abstract 
(`amdgcnspirv`), i.e. we only know the concrete target at run time, we 
absolutely MUST remove sequences that are not viable on a particular target, if 
it's not present. I.e. if we're on `gfx900`, we have to be able to statically 
and unambiguously remove sequences that are absolutely not viable there. 
Otherwise, what obtains is at best a very opaque BE error or, more dangerously, 
an ISA sequence being executed to unknowable consequences. If we just allow 
users to for arbitrarily complex squirrelling / accessing schemes, we lose that 
ability:

- even if we are on a concrete target, the FE cannot drop expressions with 
potential side-effects even if by some stroke of luck an arbitrarily complex 
stash + access chain were otherwise viable for full evaluation;
- on an abstract target the dedicated ME pass isn't going to look through 
calls, and if we tried to look through calls then we run into having to 
consider things like alloca removal, CSE etc. -> essentially we have to run 
pretty intrusive optimisation to even have hope, which messes up O0.

Overall, the intention is to make things difficult to misuse - these are 
definitely not just cute booleans, they have teeth and can bite! So mandating 
needing an explicit cast / making the type irregular gives an easy tell / code 
smell when some gnarly bug unavoidably shows up - "oh, you seem to have cast 
this here, what was the intention?".

https://github.com/llvm/llvm-project/pull/134016
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

Reply via email to