On Mon, May 03, 2021 at 12:24:10PM +0200, Tom de Vries wrote:
> The test-case included in this patch contains this target region:
> ...
>   for (int i0 = 0 ; i0 < N0 ; i0++ )
>     counter_N0.i += 1;
> ...
>
> When running with nvptx accelerator, the counter variable is expected to
> be N0 after the region, but instead is N0 / 32.  The problem is that rather
> than getting the result for all warp lanes, we get it for just one lane.
>
> This is caused by the implementation of SIMT being incomplete.  It handles
> regular reductions, but apparently not user-defined reductions.
>
> For now, make this explicit by erroring out for nvptx, like this:
> ...
> target-44.c: In function 'main':
> target-44.c:20:9: error: SIMT reduction not fully implemented
> ...
>
> Tested libgomp on x86_64-linux with and without nvptx accelerator.
>
> Any comments?

If you want a workaround, the workaround should be to disable SIMT if
user-defined reductions (UDRs) are seen, rather than erroring out.
So e.g. in lower_rec_simd_input_clauses, for sctx->is_simt, if sctx->max_vf
isn't 1, look for OMP_CLAUSE_REDUCTION with OMP_CLAUSE_REDUCTION_PLACEHOLDER
and punt (set max_vf = 1) in that case.

The right thing is to implement it properly of course.

	Jakub
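
To make the suggested check concrete, here is a rough standalone model of it in plain C.  It is only a sketch: struct clause, struct simd_ctx and punt_simt_on_udr are hypothetical stand-ins invented for illustration, not GCC's tree or omp-low types, and the has_placeholder flag merely models "OMP_CLAUSE_REDUCTION_PLACEHOLDER is set" on a reduction clause.  The real change would walk the actual clause chain inside lower_rec_simd_input_clauses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT GCC's real data structures.  */
enum clause_kind { CLAUSE_REDUCTION, CLAUSE_OTHER };

struct clause
{
  enum clause_kind kind;
  bool has_placeholder;	/* Models a set OMP_CLAUSE_REDUCTION_PLACEHOLDER,
			   i.e. a user-defined reduction.  */
  struct clause *next;	/* Models the clause chain.  */
};

struct simd_ctx
{
  bool is_simt;		/* Lowering for a SIMT target.  */
  unsigned max_vf;	/* Maximum vectorization factor; 1 means no SIMT.  */
};

/* If SCTX is a SIMT context that has not already been reduced to
   max_vf == 1, and CLAUSES contains a user-defined reduction, punt by
   setting max_vf to 1.  Return true if max_vf was changed.  */
bool
punt_simt_on_udr (struct simd_ctx *sctx, const struct clause *clauses)
{
  assert (sctx != NULL);

  if (!sctx->is_simt || sctx->max_vf == 1)
    return false;

  for (const struct clause *c = clauses; c != NULL; c = c->next)
    if (c->kind == CLAUSE_REDUCTION && c->has_placeholder)
      {
	sctx->max_vf = 1;
	return true;
      }

  return false;
}
```

With max_vf forced to 1 the loop would simply not be run across the lanes of a warp, so the user-defined combiner would never have to merge per-lane partial results; that costs parallelism but avoids both the wrong N0 / 32 result and the hard error.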