On Mon, May 03, 2021 at 12:24:10PM +0200, Tom de Vries wrote:
> The test-case included in this patch contains this target region:
> ...
>   for (int i0 = 0 ; i0 < N0 ; i0++ )
>     counter_N0.i += 1;
> ...
> 
> When running with nvptx accelerator, the counter variable is expected to
> be N0 after the region, but instead is N0 / 32.  The problem is that rather
> than getting the result for all warp lanes, we get it for just one lane.
> 
> This is caused by the implementation of SIMT being incomplete.  It handles
> regular reductions, but apparently not user-defined reductions.
> 
> For now, make this explicit by erroring out for nvptx, like this:
> ...
> target-44.c: In function 'main':
> target-44.c:20:9: error: SIMT reduction not fully implemented
> ...
> 
> Tested libgomp on x86_64-linux with and without nvptx accelerator.
> 
> Any comments?

If you want a workaround, it should be to disable SIMT when UDRs are
seen, rather than erroring out.
So e.g. in lower_rec_simd_input_clauses, when sctx->is_simt is set and
sctx->max_vf isn't 1, look for an OMP_CLAUSE_REDUCTION with
OMP_CLAUSE_REDUCTION_PLACEHOLDER and punt (set max_vf = 1) if one is found.
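
To make the shape of that check concrete, here is a rough standalone
model of it.  It is only a sketch: struct clause, struct simd_ctx,
has_udr_reduction and maybe_disable_simt are hypothetical stand-ins
invented for illustration, not GCC's tree API or the actual code in
omp-low.c.  The has_placeholder flag models "OMP_CLAUSE_REDUCTION_PLACEHOLDER
is non-NULL" and the next pointer models walking the clause chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the clause chain; not the real tree API.  */
enum clause_code { CLAUSE_REDUCTION, CLAUSE_PRIVATE, CLAUSE_OTHER };

struct clause
{
  enum clause_code code;
  bool has_placeholder;	/* Models a non-NULL reduction placeholder (UDR).  */
  struct clause *next;	/* Models the link to the next clause.  */
};

/* Hypothetical model of the two context fields mentioned above.  */
struct simd_ctx
{
  bool is_simt;
  int max_vf;
};

/* Return true if any reduction clause in the chain is user-defined.  */
static bool
has_udr_reduction (const struct clause *c)
{
  for (; c != NULL; c = c->next)
    if (c->code == CLAUSE_REDUCTION && c->has_placeholder)
      return true;
  return false;
}

/* Punt on SIMT (max_vf = 1) if a UDR is present; otherwise leave the
   context untouched.  */
static void
maybe_disable_simt (struct simd_ctx *sctx, const struct clause *clauses)
{
  assert (sctx != NULL);
  if (sctx->is_simt && sctx->max_vf != 1 && has_udr_reduction (clauses))
    sctx->max_vf = 1;
}
```

With max_vf forced to 1 the loop is effectively not spread across lanes,
so the UDR case costs performance rather than producing a wrong count or
a compile error.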

The right thing is to implement it properly of course.
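
For reference, the N0 / 32 symptom and what a complete implementation has
to deliver can be modelled in plain C.  This is a simulation under stated
assumptions, not what the nvptx backend generates: it assumes 32 lanes
per warp, iterations dealt out round-robin, and a combiner equivalent to
omp_out.i += omp_in.i.  The names struct counter, combine, run_lanes,
result_one_lane and result_all_lanes are made up for the example.

```c
#include <assert.h>

#define WARP_LANES 32

struct counter { int i; };

/* Models the UDR combiner: omp_out.i += omp_in.i.  */
static void
combine (struct counter *out, const struct counter *in)
{
  out->i += in->i;
}

/* Each lane runs iterations lane, lane + 32, ... on its private copy.  */
static void
run_lanes (int n0, struct counter priv[WARP_LANES])
{
  for (int lane = 0; lane < WARP_LANES; lane++)
    {
      priv[lane].i = 0;
      for (int i0 = lane; i0 < n0; i0 += WARP_LANES)
	priv[lane].i += 1;
    }
}

/* The reported behaviour: only one lane's private copy is used.  */
static int
result_one_lane (int n0)
{
  struct counter priv[WARP_LANES];
  assert (n0 >= 0);
  run_lanes (n0, priv);
  return priv[0].i;
}

/* The expected behaviour: all lanes are merged with the combiner.  */
static int
result_all_lanes (int n0)
{
  struct counter priv[WARP_LANES];
  struct counter total = { 0 };
  assert (n0 >= 0);
  run_lanes (n0, priv);
  for (int lane = 0; lane < WARP_LANES; lane++)
    combine (&total, &priv[lane]);
  return total.i;
}
```

For N0 = 1024, result_one_lane gives 32 (that is, N0 / 32) while
result_all_lanes gives 1024, which is the value the test-case expects.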

        Jakub
