https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88706
--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #0)
> I think the same problem exists for the other work around in
> nvptx_adjust_parallelism, this one:
> ...
>   /* FIXME: This is overly conservative; worker and vector loop will
>      eventually be combined.  */
>   if (wv)
>     return inner_mask & ~GOMP_DIM_MASK (GOMP_DIM_WORKER);
> ...
> It's just harder to spot because the workaround doesn't affect vector length.

Confirmed.

With this additional patch:
...
@@ -5695,7 +5696,10 @@ nvptx_adjust_parallelism (unsigned inner_mask, unsigned outer_mask)
   /* FIXME: This is overly conservative; worker and vector loop will
      eventually be combined.  */
   if (wv)
-    return inner_mask & ~GOMP_DIM_MASK (GOMP_DIM_WORKER);
+    {
+      fprintf (stderr, "worker-vector loop workaround applied in %s\n", current_function_name ());
+      return inner_mask & ~GOMP_DIM_MASK (GOMP_DIM_WORKER);
+    }
 
   /* It's difficult to guarantee that warps in large vector_lengths
      will remain convergent when a vector loop is nested inside a
...
we see for the first case (vector_length set on parallel directive, no -fopenacc-dim=):
...
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
worker-vector loop workaround applied in test2._omp_fn.1
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 128
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
...
and for the second case (no vector_length set on parallel directive, using -fopenacc-dim=):
...
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
oa.vector_length in nvptx_adjust_parallelism: 32
...
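
For reference, a standalone sketch of what the workaround's return expression does to the partitioning mask. It is not code from the nvptx backend: the SKETCH_* names and both functions are hypothetical, the constants are local stand-ins assumed to match gomp-constants.h (gang = 0, worker = 1, vector = 2, mask = 1u << dim), and the test for "worker and vector both used" is only an assumption about what wv stands for.

```c
#include <stdbool.h>

/* Local stand-ins for GOMP_DIM_GANG/WORKER/VECTOR and GOMP_DIM_MASK.
   Assumption: these mirror the values in gomp-constants.h.  */
enum
{
  SKETCH_DIM_GANG = 0,
  SKETCH_DIM_WORKER = 1,
  SKETCH_DIM_VECTOR = 2
};

#define SKETCH_DIM_MASK(X) (1u << (X))

/* Assumed meaning of 'wv': the loop is partitioned over both the
   worker and the vector dimension.  */
bool
sketch_uses_worker_and_vector (unsigned mask)
{
  return ((mask & SKETCH_DIM_MASK (SKETCH_DIM_WORKER)) != 0
          && (mask & SKETCH_DIM_MASK (SKETCH_DIM_VECTOR)) != 0);
}

/* Same expression as the workaround's return statement: clear the
   worker bit and leave every other dimension untouched.  */
unsigned
sketch_drop_worker (unsigned inner_mask)
{
  return inner_mask & ~SKETCH_DIM_MASK (SKETCH_DIM_WORKER);
}
```

With these stand-ins, a worker+vector mask (binary 110) comes back as vector-only (binary 100). The vector bit is never touched, which fits the quoted remark that this workaround does not affect vector length.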