https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100408

            Bug ID: 100408
           Summary: [nvptx][OpenMP] Enable SIMT for user-defined reduction
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: missed-optimization, openmp
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: burnus at gcc dot gnu.org
  Target Milestone: ---
            Target: nvptx-none

Follow-up to PR100321, which is about the same issue but solved it by
disabling SIMT.

From the patch:

    The test-case included in this patch contains this target region:
    ...
      for (int i0 = 0 ; i0 < N0 ; i0++ )
        counter_N0.i += 1;
    ...

    When running with nvptx accelerator, the counter variable is expected to
    be N0 after the region, but instead is N0 / 32.  The problem is that rather
    than getting the result for all warp lanes, we get it for just one lane.

    This is caused by the implementation of SIMT being incomplete.  It handles
    regular reductions, but apparently not user-defined reductions.

    For now, handle this by disabling SIMT in this case, specifically by
    setting sctx->max_vf to 1.

For details (code location, longer test cases, etc.), see PR100321.
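
For illustration only, here is a minimal sketch of the kind of code affected.
It is not the test case from the patch. The struct type, the reduction
identifier sum_counter, the function name count_iterations, the value of N0
and the exact combined construct are all assumptions; only the loop body and
the names counter_N0 and N0 come from the excerpt above.

```c
#define N0 1024   /* assumed trip count, not taken from the PR */

/* Hypothetical struct; only the field name 'i' appears in the excerpt. */
struct counter { int i; };

/* User-defined reduction: combine two partial counters by adding them.
   The identifier 'sum_counter' is made up for this sketch. */
#pragma omp declare reduction (sum_counter : struct counter : \
                               omp_out.i += omp_in.i) \
  initializer (omp_priv = { 0 })

/* Counts the loop iterations through the user-defined reduction.
   The result should be N0 whether or not the region is offloaded. */
int
count_iterations (void)
{
  struct counter counter_N0 = { 0 };

  /* Assumed directive; the original test may use a different construct. */
#pragma omp target parallel for simd map(tofrom: counter_N0) \
  reduction(sum_counter : counter_N0)
  for (int i0 = 0 ; i0 < N0 ; i0++ )
    counter_N0.i += 1;

  return counter_N0.i;
}
```

Built with something like "gcc -fopenmp -foffload=nvptx-none", a region of
this shape is where the N0 / 32 result described above would show up if SIMT
were left enabled, since an nvptx warp has 32 lanes. With host fallback, or
with SIMT disabled as in PR100321, count_iterations () should return N0.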
