https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100408
Bug ID: 100408
Summary: [nvptx][OpenMP] Enable SIMT for user-defined reduction
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Keywords: missed-optimization, openmp
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org
Target Milestone: ---
Target: nvptx-none

Follow-up to PR100321, which is about the same issue but worked around it by disabling SIMT.

From the patch:

The test-case included in this patch contains this target region:
...
  for (int i0 = 0 ; i0 < N0 ; i0++ )
    counter_N0.i += 1;
...

When running with the nvptx accelerator, the counter variable is expected to be N0 after the region, but instead is N0 / 32.

The problem is that rather than getting the result for all warp lanes, we get it for just one lane.

This is caused by the implementation of SIMT being incomplete. It handles regular reductions, but apparently not user-defined reductions.

For now, handle this by disabling SIMT in this case, specifically by setting sctx->max_vf to 1.

For details (code location, longer testcases, etc.) see PR100321.
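
To make the N0 / 32 figure concrete, here is a minimal sketch in C. It is not the test case from the patch. The struct declaration, the declare reduction line, the directive on the loop, and the function names count_with_udr and model_warp_count are assumptions made for illustration. The second function is a simplified host-side model, assuming iterations are dealt round-robin to the 32 lanes of a single warp, of what happens when only one lane's partial result survives.

```c
#include <assert.h>

#define WARP_LANES 32 /* lanes per nvptx warp */

/* Hypothetical counter type: the report only shows the member access
   counter_N0.i, not the declaration. */
struct counter { int i; };

/* User-defined reduction: merge two partial counters by adding them.
   With no initializer clause, C zero-initializes the private copies. */
#pragma omp declare reduction(+ : struct counter : omp_out.i += omp_in.i)

/* Count n0 iterations in a target region using the reduction above.
   The directive is an assumed example, not the one from the patch.
   Returns n0 when the reduction is combined correctly. */
int count_with_udr(int n0)
{
    struct counter counter_N0 = { 0 };

#pragma omp target parallel for simd map(tofrom: counter_N0) reduction(+ : counter_N0)
    for (int i0 = 0; i0 < n0; i0++)
        counter_N0.i += 1;

    return counter_N0.i;
}

/* Host-side model of the symptom: each lane keeps a private partial
   count. If all_lanes is zero, only lane 0's partial count is returned,
   mimicking the "just one lane" result described in the report. */
int model_warp_count(int n0, int all_lanes)
{
    int partial[WARP_LANES] = { 0 };

    assert(n0 >= 0);

    /* Deal iterations round-robin across the lanes (an assumption). */
    for (int i0 = 0; i0 < n0; i0++)
        partial[i0 % WARP_LANES] += 1;

    if (!all_lanes)
        return partial[0];

    /* Correct behaviour: combine the partial counts of every lane. */
    int total = 0;
    for (int lane = 0; lane < WARP_LANES; lane++)
        total += partial[lane];
    return total;
}
```

With N0 = 1024, model_warp_count(1024, 1) returns 1024, while model_warp_count(1024, 0) returns 32, that is N0 / 32. Built without -fopenmp, the pragmas are ignored and count_with_udr runs as a serial loop on the host. Reproducing the actual bug would need a GCC build with nvptx offloading and SIMT enabled for this loop.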