On Tue, Oct 20, 2015 at 09:34:31PM +0300, Alexander Monakov wrote:
> + asm ("bar.sync 0, %0;" : : "r"(32*bar->total));

Formatting: there should be a space between "r" and (, and spaces around *
(the same applies in many other places).

As for re-convergence of threads in a warp: if we use threads in the warp
other than thread 0 only for simd regions, I'd strongly hope that the end of
a simd region (or "vectorized" loop) is always a convergence point.

	Jakub