================
@@ -61,6 +65,78 @@ static uint32_t gpu_irregular_simd_reduce(void *reduce_data,
   return (logical_lane_id == 0);
 }
 
+// Reduction within a block on the GPU.
+//
+// Template parameters:
+// - checkLiveness: Whether to check the liveness of the lanes. This is only
+//                  useful if gpu_block_reduce is called in a context where
+//                  L2 parallel regions are possible.
----------------
jdoerfert wrote:

L2 parallel regions are sequentialized, no? That should be the trivial case of 
num_threads == 1 handled in nvptx_parallel_reduce_nowait. Am I missing 
something?
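
To make the question concrete, here is a minimal host-side sketch of the
early exit the comment refers to. It is a model under stated assumptions, not
the device runtime code: `ReduceContext`, `block_reduce_slow_path`, and
`parallel_reduce_nowait_model` are hypothetical names, and the "slow path" is
a stand-in that only records that it ran. The assumption being illustrated is
that a serialized nested (L2) parallel region reports one thread, so the
reduction returns before any lane-liveness logic is reached.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side model of the call site; not the DeviceRTL types.
struct ReduceContext {
  uint32_t num_threads; // threads in the innermost active parallel region
  uint32_t thread_id;   // id of the calling thread within that region
};

// Counts how often the cross-lane path runs, so the early exit is observable.
static int slow_path_calls = 0;

// Stand-in for the real block reduction (shuffles, inter-warp copy).
// Here it only reports whether the caller would hold the final value.
static int32_t block_reduce_slow_path(const ReduceContext &ctx) {
  ++slow_path_calls;
  return ctx.thread_id == 0 ? 1 : 0;
}

// Returns 1 if the calling thread holds the reduced value, 0 otherwise.
static int32_t parallel_reduce_nowait_model(const ReduceContext &ctx) {
  assert(ctx.num_threads >= 1 && "a parallel region has at least one thread");

  // Assumed trivial case: a serialized (e.g. nested) region has one thread,
  // whose private copy already is the reduced value.
  if (ctx.num_threads == 1)
    return 1;

  // Only regions with more than one thread reach the cross-lane reduction.
  return block_reduce_slow_path(ctx);
}
```

If that assumption holds, a call with `num_threads == 1` never enters the
slow path, which is why a liveness check there would only matter if some
caller can reach the block reduction from a nested region without going
through this early exit.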

https://github.com/llvm/llvm-project/pull/195102
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
