================
@@ -61,6 +65,78 @@ static uint32_t gpu_irregular_simd_reduce(void *reduce_data,
   return (logical_lane_id == 0);
 }
 
+// Reduction within a block on the GPU.
+//
+// Template parameters:
+// - checkLiveness: Whether to check the liveness of the lanes. This is only
+//                  useful if gpu_block_reduce is called in a context where
+//                  L2 parallel regions are possible.
----------------
ro-i wrote:

L2 parallel regions can have dispersed lanes, afaiu? In that case num_threads > 1, but the lanes are not contiguous. However, checkLiveness is also used for contiguous partial warps (see the previous parallel_reduce code), so the comment is too narrow. I fixed that in the commits I'm about to push.

https://github.com/llvm/llvm-project/pull/195102
