================
@@ -1,10 +1,22 @@
 ; RUN: opt -S -dxil-op-lower -mtriple=dxil-pc-shadermodel6.3-compute %s | FileCheck %s
 
-define noundef <4 x i32> @wave_ballot_simple(i1 noundef %p1) {
+%dx.types.fouri32 = type { i32, i32, i32, i32 }
+
+define <4 x i32> @wave_ballot_simple(i1 noundef %p1) {
 entry:
-; CHECK: call <4 x i32> @dx.op.waveBallot.void(i32 118, i1 %p1)
-  %ret = call <4 x i32> @llvm.dx.wave.ballot(i1 %p1)
-  ret <4 x i32> %ret
+; CHECK: call %dx.types.fouri32 @dx.op.waveActiveBallot(i32 116, i1 %p1)
+; CHECK-NOT: ret %dx.types.fouri32
+; CHECK: ret <4 x i32>
----------------
bob80905 wrote:

WaveActiveBallot is expected to return a uint4: a vector of four 32-bit ints
that together form a bitmask, with one bit per lane recording that lane's
evaluation of the boolean argument.
The point of the custom struct type, IIRC from @farzonl, is that for both DXC
and clang, the backends that handle the intrinsic cannot return vectors; they
can only return structs.
So, when we lower the builtin to an intrinsic, we also add code to extract the
elements of the returned struct and rebuild them into a vector, which is what
actually gets returned. That is why the vector sticks around.
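
To illustrate, here is a rough sketch of what the lowered IR could look like.
The struct type, op name, and opcode 116 come from the CHECK line in the test;
the function name @wave_ballot_sketch, the value names, and the exact
extractvalue/insertelement sequence are my assumptions, not output copied from
the pass.

```llvm
; Struct type the DXIL op returns (taken from the test above).
%dx.types.fouri32 = type { i32, i32, i32, i32 }

; Declaration written out for the sketch; any attributes the pass adds are omitted.
declare %dx.types.fouri32 @dx.op.waveActiveBallot(i32, i1)

; Hypothetical function name, used only for this illustration.
define <4 x i32> @wave_ballot_sketch(i1 %p) {
entry:
  ; The DXIL op hands back the ballot as a four-field struct.
  %s = call %dx.types.fouri32 @dx.op.waveActiveBallot(i32 116, i1 %p)

  ; Pull each field out of the struct.
  %e0 = extractvalue %dx.types.fouri32 %s, 0
  %e1 = extractvalue %dx.types.fouri32 %s, 1
  %e2 = extractvalue %dx.types.fouri32 %s, 2
  %e3 = extractvalue %dx.types.fouri32 %s, 3

  ; Rebuild the <4 x i32> that the caller expects.
  %v0 = insertelement <4 x i32> poison, i32 %e0, i32 0
  %v1 = insertelement <4 x i32> %v0, i32 %e1, i32 1
  %v2 = insertelement <4 x i32> %v1, i32 %e2, i32 2
  %v3 = insertelement <4 x i32> %v2, i32 %e3, i32 3

  ; The function still returns a vector, never the struct.
  ret <4 x i32> %v3
}
```

A shape like this would be consistent with the test expecting `ret <4 x i32>`
and rejecting `ret %dx.types.fouri32`.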
Does this clarify things?

https://github.com/llvm/llvm-project/pull/175105