Hi,

Continuing the thread from here:

https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00198.html

On Wed, 3 Feb 2016 19:52:09 +0300
Alexander Monakov <amona...@ispras.ru> wrote:

> On Wed, 3 Feb 2016, Nathan Sidwell wrote:
> > You can only override at runtime those dimensions that you said
> > you'd override at runtime when you compiled your program.
>
> Ah, I see. That's not obvious to me, so perhaps added documentation
> can be expanded to explain that? (I now see that the plugin silently
> drops user-provided dimensions where a value recorded at compile time
> is present; not sure if that'd be worth a runtime diagnostic, could
> be very noisy)

This version of the patch has slightly-expanded documentation.

> > > I don't see why you say that because cuDeviceGetAttribute provides
> > > CU_DEVICE_ATTRIBUTE_WARP_SIZE,
> > > CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK,
> > > CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_X (which is not too useful for
> > > this case) and cuFuncGetAttribute that allows to get a
> > > per-function thread limit. There's a patch on gomp-nvptx branch
> > > that adds querying some of those to the plugin.
> >
> > thanks. There doesn't appear to be one for number of physical CTAs
> > though, right?
>
> Sorry, I don't understand the question: CTA is a logical entity. One
> could derive limit of possible concurrent CTAs from number of SMs
> (CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT) multiplied by how many
> CTAs fit on one multiprocessor. The latter figure can be taken as a
> rough worst-case value, or semi-intelligent per-kernel estimate based
> on register limits (there's code on gomp-nvptx branch that does
> this), or one can use the cuOcc* API to ask the driver for a precise
> per-kernel figure.

The runtime part of the patch appears to have been committed already,
as part of the following patch:

https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01589.html

The compile-time part of the patch has not made it upstream yet.
Thus, this rebased and retested patch consists of the parsing changes
(for -fopenacc-dim=X:Y:Z, allowing '-') and warning changes (for
strange partitioning choices), plus associated testsuite adjustments.

Tested with offloading to NVPTX and bootstrapped. OK for trunk?

Thanks,

Julian

20xx-xx-xx  Nathan Sidwell  <nat...@acm.org>
            Tom de Vries  <tdevr...@suse.de>
            Thomas Schwinge  <tho...@codesourcery.com>
            Julian Brown  <jul...@codesourcery.com>

        gcc/
        * doc/invoke.texi (fopenacc-dim): Update.
        * omp-offload.c (oacc_parse_default_dims): Update.
        (oacc_validate_dims): Emit warnings about strange partitioning
        choices.

        gcc/testsuite/
        * c-c++-common/goacc/acc-icf.c: Update.
        * c-c++-common/goacc/parallel-dims-1.c: Likewise.
        * c-c++-common/goacc/parallel-reduction.c: Likewise.
        * c-c++-common/goacc/pr70688.c: Likewise.
        * c-c++-common/goacc/routine-1.c: Likewise.
        * c-c++-common/goacc/uninit-dim-clause.c: Likewise.
        * gfortran.dg/goacc/parallel-tree.f95: Likewise.
        * gfortran.dg/goacc/routine-4.f90: Likewise.
        * gfortran.dg/goacc/routine-level-of-parallelism-1.f90: Likewise.
        * gfortran.dg/goacc/uninit-dim-clause.f95: Likewise.

        libgomp/
        * testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: Add -w.
        * testsuite/libgomp.oacc-c-c++-common/loop-g-2.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/loop-warn-1.c: New.
        * testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Update.
        * testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/mode-transitions.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/private-variables.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/routine-g-1.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Likewise.
        * testsuite/libgomp.oacc-fortran/par-reduction-2-1.f: Likewise.
        * testsuite/libgomp.oacc-fortran/par-reduction-2-2.f: Likewise.
        * testsuite/libgomp.oacc-fortran/pr84028.f90: Likewise.
        * testsuite/libgomp.oacc-fortran/private-variables.f90: Likewise.
        * testsuite/libgomp.oacc-fortran/routine-7.f90: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c: New.
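
For anyone skimming the thread, the two compile-time pieces (parsing
-fopenacc-dim=X:Y:Z with '-' allowed, and the partitioning sanity check)
can be roughly sketched in standalone C as below. This is only an
illustration and not the code in omp-offload.c: the names parse_dims,
check_axis, the DIM_* constants and the PW_* results are hypothetical,
and the sentinel values (-1 for an omitted size, 0 for '-') are an
assumption made for the sketch.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical axis indices, in the documented gang:worker:vector order.  */
enum { DIM_GANG, DIM_WORKER, DIM_VECTOR, DIM_MAX };

/* Sentinels assumed for this sketch only.  */
#define DIM_TARGET_DEFAULT (-1) /* Size omitted: use a target default.  */
#define DIM_RUNTIME 0           /* '-': size deferred until execution.  */

/* Parse "X:Y:Z" into DIMS.  Each field may be a positive integer, '-',
   or empty.  Returns 0 on success, -1 if the string is malformed.  */
int
parse_dims (const char *s, int dims[DIM_MAX])
{
  const char *pos = s;

  for (int ix = 0; ix < DIM_MAX; ix++)
    dims[ix] = DIM_TARGET_DEFAULT;

  for (int ix = 0; *pos && ix < DIM_MAX; ix++)
    {
      if (ix)
        {
          /* Every field after the first must be introduced by ':'.  */
          if (*pos != ':')
            return -1;
          pos++;
        }

      if (*pos == '-')
        {
          dims[ix] = DIM_RUNTIME;
          pos++;
        }
      else if (*pos && *pos != ':')
        {
          char *end;
          long val;

          errno = 0;
          val = strtol (pos, &end, 10);
          if (errno || end == pos || val <= 0 || val > INT_MAX)
            return -1;
          dims[ix] = (int) val;
          pos = end;
        }
      /* Otherwise the field is empty and keeps the target default.  */
    }

  /* Anything left over (for example a fourth field) is an error.  */
  return *pos ? -1 : 0;
}

/* Possible outcomes of the per-axis partitioning check.  */
enum partition_warning
{
  PW_NONE,
  PW_USED_BUT_SIZE_ONE, /* "contains %s partitioned code but is not ..."  */
  PW_SIZED_BUT_UNUSED   /* "is %s partitioned but does not contain ..."  */
};

/* SIZE is the requested size for one axis (negative when defaulted);
   USED is nonzero if the region contains code partitioned on that axis.  */
enum partition_warning
check_axis (int size, int used)
{
  if (size < 0)
    return PW_NONE; /* Defaulting axis: nothing to complain about.  */
  if (used && size == 1)
    return PW_USED_BUT_SIZE_ONE;
  if (!used && size != 1)
    return PW_SIZED_BUT_UNUSED;
  return PW_NONE;
}
```

With this sketch, parse_dims ("16:16", d) leaves the vector size at the
target default, and parse_dims ("-:32:", d) marks the gang size as
deferred to runtime. check_axis (1, 1) corresponds to the "is not
partitioned" warning and check_axis (8, 0) to the "does not contain
partitioned code" warning.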
commit a918a8739ae7652250c978b0ececa181a587b0c0
Author: Julian Brown <jul...@codesourcery.com>
Date:   Fri Oct 5 11:11:47 2018 -0700

    OpenACC default compute dimensions

    20xx-xx-xx  Nathan Sidwell  <nat...@acm.org>
                Tom de Vries  <tdevr...@suse.de>
                Thomas Schwinge  <tho...@codesourcery.com>
                Julian Brown  <jul...@codesourcery.com>

            gcc/
            * doc/invoke.texi (fopenacc-dim): Update.
            * omp-offload.c (oacc_parse_default_dims): Update.
            (oacc_validate_dims): Emit warnings about strange partitioning
            choices.

            gcc/testsuite/
            * c-c++-common/goacc/acc-icf.c: Update.
            * c-c++-common/goacc/parallel-dims-1.c: Likewise.
            * c-c++-common/goacc/parallel-reduction.c: Likewise.
            * c-c++-common/goacc/pr70688.c: Likewise.
            * c-c++-common/goacc/routine-1.c: Likewise.
            * c-c++-common/goacc/uninit-dim-clause.c: Likewise.
            * gfortran.dg/goacc/parallel-tree.f95: Likewise.
            * gfortran.dg/goacc/routine-4.f90: Likewise.
            * gfortran.dg/goacc/routine-level-of-parallelism-1.f90: Likewise.
            * gfortran.dg/goacc/uninit-dim-clause.f95: Likewise.

            libgomp/
            * testsuite/libgomp.oacc-c-c++-common/loop-g-1.c: Add -w.
            * testsuite/libgomp.oacc-c-c++-common/loop-g-2.c: Likewise.
            * testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c: Likewise.
            * testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c: Likewise.
            * testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c: Likewise.
            * testsuite/libgomp.oacc-c-c++-common/loop-warn-1.c: New.
            * testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c: Update.
            * testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Likewise.
            * testsuite/libgomp.oacc-c-c++-common/loop-w-1.c: Likewise.
            * testsuite/libgomp.oacc-c-c++-common/mode-transitions.c: Likewise.
            * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise.
            * testsuite/libgomp.oacc-c-c++-common/private-variables.c: Likewise.
            * testsuite/libgomp.oacc-c-c++-common/reduction-7.c: Likewise.
            * testsuite/libgomp.oacc-c-c++-common/routine-g-1.c: Likewise.
            * testsuite/libgomp.oacc-c-c++-common/routine-w-1.c: Likewise.
            * testsuite/libgomp.oacc-fortran/par-reduction-2-1.f: Likewise.
            * testsuite/libgomp.oacc-fortran/par-reduction-2-2.f: Likewise.
            * testsuite/libgomp.oacc-fortran/pr84028.f90: Likewise.
            * testsuite/libgomp.oacc-fortran/private-variables.f90: Likewise.
            * testsuite/libgomp.oacc-fortran/routine-7.f90: Likewise.
            * testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c: New.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 167eef5..21ec028 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2167,8 +2167,12 @@ have support for @option{-pthread}.
 @cindex OpenACC accelerator programming
 Specify default compute dimensions for parallel offload regions that do
 not explicitly specify.  The @var{geom} value is a triple of
-':'-separated sizes, in order 'gang', 'worker' and, 'vector'.  A size
-can be omitted, to use a target-specific default value.
+':'-separated sizes, in order 'gang', 'worker' and, 'vector'.  If a size
+is to be deferred until execution '-' can be used, alternatively a size
+can be omitted to use a target-specific default value.  When deferring
+to runtime, the environment variable @var{GOMP_OPENACC_DIM} can be set.
+It has the same format as the option value, except that '-' is not
+permitted.
 
 @item -fopenmp
 @opindex fopenmp
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 0abf028..48d1a42 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -574,8 +574,9 @@ static int oacc_default_dims[GOMP_DIM_MAX];
 static int oacc_min_dims[GOMP_DIM_MAX];
 
 /* Parse the default dimension parameter.  This is a set of
-   :-separated optional compute dimensions.  Each specified dimension
-   is a positive integer.  When device type support is added, it is
+   :-separated optional compute dimensions.  Each dimension is either
+   a positive integer, or '-' for a dynamic value computed at
+   runtime.  When device type support is added, it is
    planned to be a comma separated list of such compute dimensions,
    with all but the first prefixed by the colon-terminated device
    type.  */
@@ -610,14 +611,20 @@ oacc_parse_default_dims (const char *dims)
 
           if (*pos != ':')
             {
-              long val;
-              const char *eptr;
+              long val = 0;
 
-              errno = 0;
-              val = strtol (pos, CONST_CAST (char **, &eptr), 10);
-              if (errno || val <= 0 || (int) val != val)
-                goto malformed;
-              pos = eptr;
+              if (*pos == '-')
+                pos++;
+              else
+                {
+                  const char *eptr;
+
+                  errno = 0;
+                  val = strtol (pos, CONST_CAST (char **, &eptr), 10);
+                  if (errno || val <= 0 || (int) val != val)
+                    goto malformed;
+                  pos = eptr;
+                }
               oacc_default_dims[ix] = (int) val;
             }
         }
@@ -659,6 +666,34 @@ oacc_validate_dims (tree fn, tree attrs, int *dims, int level, unsigned used)
       pos = TREE_CHAIN (pos);
     }
 
+  bool check = true;
+#ifdef ACCEL_COMPILER
+  check = false;
+#endif
+  if (check
+      && !lookup_attribute ("oacc kernels", DECL_ATTRIBUTES (fn)))
+    {
+      static char const *const axes[] =
+      /* Must be kept in sync with GOMP_DIM enumeration.  */
+        {"gang", "worker", "vector" };
+      for (ix = level >= 0 ? level : 0; ix != GOMP_DIM_MAX; ix++)
+        if (dims[ix] < 0)
+          ; /* Defaulting axis.  */
+        else if ((used & GOMP_DIM_MASK (ix)) && dims[ix] == 1)
+          /* There is partitioned execution, but the user requested a
+             dimension size of 1.  They're probably confused.  */
+          warning_at (DECL_SOURCE_LOCATION (fn), 0,
+                      "region contains %s partitoned code but"
+                      " is not %s partitioned", axes[ix], axes[ix]);
+        else if (!(used & GOMP_DIM_MASK (ix)) && dims[ix] != 1)
+          /* The dimension is explicitly partitioned to non-unity, but
+             no use is made within the region.  */
+          warning_at (DECL_SOURCE_LOCATION (fn), 0,
+                      "region is %s partitioned but"
+                      " does not contain %s partitioned code",
+                      axes[ix], axes[ix]);
+    }
+
   bool changed = targetm.goacc.validate_dims (fn, dims, level);
 
   /* Default anything left to 1 or a partitioned default.
*/ diff --git a/gcc/testsuite/c-c++-common/goacc/acc-icf.c b/gcc/testsuite/c-c++-common/goacc/acc-icf.c index ecfe3f2..fb2c791 100644 --- a/gcc/testsuite/c-c++-common/goacc/acc-icf.c +++ b/gcc/testsuite/c-c++-common/goacc/acc-icf.c @@ -4,7 +4,7 @@ #pragma acc routine gang int -routine1 (int n) +routine1 (int n) /* { dg-bogus "region is worker partitioned but does not contain worker partitioned code" "" { xfail *-*-* } } */ { int i; @@ -17,7 +17,7 @@ routine1 (int n) #pragma acc routine gang int -routine2 (int n) +routine2 (int n) /* { dg-bogus "region is worker partitioned but does not contain worker partitioned code" "" { xfail *-*-* } } */ { int i; diff --git a/gcc/testsuite/c-c++-common/goacc/parallel-dims-1.c b/gcc/testsuite/c-c++-common/goacc/parallel-dims-1.c index 57f682f..6cdbebe 100644 --- a/gcc/testsuite/c-c++-common/goacc/parallel-dims-1.c +++ b/gcc/testsuite/c-c++-common/goacc/parallel-dims-1.c @@ -3,9 +3,11 @@ void f(int i) { -#pragma acc kernels num_gangs(i) num_workers(i) vector_length(i) +#pragma acc kernels \ + num_gangs(i) num_workers(i) vector_length(i) ; -#pragma acc parallel num_gangs(i) num_workers(i) vector_length(i) +#pragma acc parallel /* { dg-bogus "region is (gang|worker|vector) partitioned" "" { xfail *-*-* } } */ \ + num_gangs(i) num_workers(i) vector_length(i) ; } diff --git a/gcc/testsuite/c-c++-common/goacc/parallel-reduction.c b/gcc/testsuite/c-c++-common/goacc/parallel-reduction.c index d7cc947..9a142c4 100644 --- a/gcc/testsuite/c-c++-common/goacc/parallel-reduction.c +++ b/gcc/testsuite/c-c++-common/goacc/parallel-reduction.c @@ -6,7 +6,7 @@ main () #pragma acc data copy (dummy) { -#pragma acc parallel num_gangs (10) copy (sum) reduction (+:sum) +#pragma acc parallel num_gangs (10) copy (sum) reduction (+:sum) /* { dg-warning "gang partitioned" } */ { int v = 5; sum += 10 + v; diff --git a/gcc/testsuite/c-c++-common/goacc/pr70688.c b/gcc/testsuite/c-c++-common/goacc/pr70688.c index 5a23665..3f5584a 100644 --- 
a/gcc/testsuite/c-c++-common/goacc/pr70688.c +++ b/gcc/testsuite/c-c++-common/goacc/pr70688.c @@ -21,7 +21,7 @@ parallel_reduction () #pragma acc data copy (dummy) { -#pragma acc parallel num_gangs (10) copy (sum) reduction (+:sum) +#pragma acc parallel num_gangs (10) copy (sum) reduction (+:sum) /* { dg-warning "region is gang partitioned" } */ { int v = 5; sum += 10 + v; @@ -36,11 +36,11 @@ main () { int i, s = 0; -#pragma acc parallel num_gangs (10) copy (s) reduction (+:s) +#pragma acc parallel num_gangs (10) copy (s) reduction (+:s) /* { dg-warning "region is gang partitioned" } */ for (i = 0; i < n; i++) s += i+1; -#pragma acc parallel num_gangs (10) reduction (+:s) copy (s) +#pragma acc parallel num_gangs (10) reduction (+:s) copy (s) /* { dg-warning "region is gang partitioned" } */ for (i = 0; i < n; i++) s += i+1; diff --git a/gcc/testsuite/c-c++-common/goacc/routine-1.c b/gcc/testsuite/c-c++-common/goacc/routine-1.c index a756922..b90e2c1 100644 --- a/gcc/testsuite/c-c++-common/goacc/routine-1.c +++ b/gcc/testsuite/c-c++-common/goacc/routine-1.c @@ -1,16 +1,16 @@ #pragma acc routine gang -void gang (void) +void gang (void) /* { dg-warning "partitioned" 3 } */ { } #pragma acc routine worker -void worker (void) +void worker (void) /* { dg-warning "partitioned" 2 } */ { } #pragma acc routine vector -void vector (void) +void vector (void) /* { dg-warning "partitioned" 1 } */ { } diff --git a/gcc/testsuite/c-c++-common/goacc/uninit-dim-clause.c b/gcc/testsuite/c-c++-common/goacc/uninit-dim-clause.c index 9f11196..72aacd7 100644 --- a/gcc/testsuite/c-c++-common/goacc/uninit-dim-clause.c +++ b/gcc/testsuite/c-c++-common/goacc/uninit-dim-clause.c @@ -4,14 +4,17 @@ void acc_parallel() { int i, j, k; - #pragma acc parallel num_gangs(i) /* { dg-warning "is used uninitialized in this function" } */ - ; + #pragma acc parallel loop gang num_gangs(i) /* { dg-warning "is used uninitialized in this function" } */ + for (i = 0; i < 1; i++) + ; - #pragma acc parallel 
num_workers(j) /* { dg-warning "is used uninitialized in this function" } */ - ; + #pragma acc parallel loop worker num_workers(j) /* { dg-warning "is used uninitialized in this function" } */ + for (j = 0; j < 1; j++) + ; - #pragma acc parallel vector_length(k) /* { dg-warning "is used uninitialized in this function" } */ - ; + #pragma acc parallel loop vector vector_length(k) /* { dg-warning "is used uninitialized in this function" } */ + for (k = 0; k < 1; k++) + ; } void acc_kernels() diff --git a/gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95 b/gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95 index 2697bb7..aaa1bfd 100644 --- a/gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95 @@ -11,6 +11,9 @@ program test !$acc reduction(max:q), copy(i), copyin(j), copyout(k), create(m) & !$acc present(o), pcopy(p), pcopyin(r), pcopyout(s), pcreate(t) & !$acc deviceptr(u), private(v), firstprivate(w) + ! { dg-warning "region is gang partitioned but does not contain gang partitioned code" "" { target *-*-* } .-1 } + ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "" { target *-*-* } .-2 } + ! 
{ dg-warning "region is vector partitioned but does not contain vector partitioned code" "" { target *-*-* } .-3 } !$acc end parallel end program test diff --git a/gcc/testsuite/gfortran.dg/goacc/routine-4.f90 b/gcc/testsuite/gfortran.dg/goacc/routine-4.f90 index 6714c7b..3fb60e7 100644 --- a/gcc/testsuite/gfortran.dg/goacc/routine-4.f90 +++ b/gcc/testsuite/gfortran.dg/goacc/routine-4.f90 @@ -123,6 +123,7 @@ contains integer, intent (inout) :: a(N) integer :: i + !$acc loop gang worker vector do i = 1, N a(i) = a(i) - a(i) end do @@ -133,6 +134,7 @@ contains integer, intent (inout) :: a(N) integer :: i + !$acc loop worker vector do i = 1, N a(i) = a(i) - a(i) end do @@ -143,6 +145,7 @@ contains integer, intent (inout) :: a(N) integer :: i + !$acc loop vector do i = 1, N a(i) = a(i) - a(i) end do @@ -153,6 +156,7 @@ contains integer, intent (inout) :: a(N) integer :: i + !$acc loop seq do i = 1, N a(i) = a(i) - a(i) end do diff --git a/gcc/testsuite/gfortran.dg/goacc/routine-level-of-parallelism-1.f90 b/gcc/testsuite/gfortran.dg/goacc/routine-level-of-parallelism-1.f90 index 75dd1b0..1b41a68 100644 --- a/gcc/testsuite/gfortran.dg/goacc/routine-level-of-parallelism-1.f90 +++ b/gcc/testsuite/gfortran.dg/goacc/routine-level-of-parallelism-1.f90 @@ -2,8 +2,10 @@ ! parallelism with the OpenACC routine directive. The Fortran counterpart is ! c-c++-common/goacc/routine-level-of-parallelism-2.c -subroutine g_1 +subroutine g_1 ! { dg-warning "region is gang partitioned but does not contain gang partitioned code" } !$acc routine gang +! { dg-bogus "region is worker partitioned but does not contain worker partitioned code" "worker partitioned" { xfail *-*-* } .-2 } +! 
{ dg-bogus "region is vector partitioned but does not contain vector partitioned code" "worker partitioned" { xfail *-*-* } .-3 } end subroutine g_1 subroutine s_1_2a diff --git a/gcc/testsuite/gfortran.dg/goacc/uninit-dim-clause.f95 b/gcc/testsuite/gfortran.dg/goacc/uninit-dim-clause.f95 index 5dea42b..8551140 100644 --- a/gcc/testsuite/gfortran.dg/goacc/uninit-dim-clause.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/uninit-dim-clause.f95 @@ -4,14 +4,20 @@ subroutine acc_parallel implicit none integer :: i, j, k - !$acc parallel num_gangs(i) ! { dg-warning "is used uninitialized in this function" } - !$acc end parallel - - !$acc parallel num_workers(j) ! { dg-warning "is used uninitialized in this function" } - !$acc end parallel - - !$acc parallel vector_length(k) ! { dg-warning "is used uninitialized in this function" } - !$acc end parallel + !$acc parallel loop gang num_gangs(i) ! { dg-warning "is used uninitialized in this function" } + do i = 0, 1 + end do + !$acc end parallel loop + + !$acc parallel loop worker num_workers(j) ! { dg-warning "is used uninitialized in this function" } + do j = 0, 1 + end do + !$acc end parallel loop + + !$acc parallel loop vector vector_length(k) ! 
{ dg-warning "is used uninitialized in this function" } + do k = 0, 1 + end do + !$acc end parallel loop end subroutine acc_parallel subroutine acc_kernels diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c index 689a443..14bc3af 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/firstprivate-1.c @@ -117,6 +117,8 @@ void t4 () arr[i] = 3; #pragma acc parallel firstprivate(x) copy(arr) num_gangs(32) num_workers(8) vector_length(32) + /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 119 } */ + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 119 } */ { #pragma acc loop gang for (i = 0; i < 32; i++) diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c index 34bc57e..8e2c1c9 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c @@ -147,7 +147,7 @@ int gang_1 (int *ary, int size) { clear (ary, size); -#pragma acc parallel num_gangs (32) num_workers (32) vector_length(32) copy(ary[0:size]) firstprivate (size) +#pragma acc parallel num_gangs (32) num_workers (32) vector_length(32) copy(ary[0:size]) firstprivate (size)/* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } } */ { #pragma acc loop auto for (int jx = 0; jx < size / 64; jx++) diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c new file mode 100644 index 0000000..6c479e4 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c @@ -0,0 +1,13 @@ +/* { dg-additional-options 
"-fopenacc-dim=16:16" } */ +/* This code uses nvptx inline assembly guarded with acc_on_device, which is + not optimized away at -O0, and then confuses the target assembler. + { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */ +/* { dg-set-target-env-var "GOMP_OPENACC_DIM" "8:8" } */ + +#include "loop-default.h" + +int main () +{ + /* Environment should be ignored. */ + return test_1 (16, 16, 32); +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c index 98f02e9..5831327 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-1.c @@ -1,3 +1,4 @@ +/* { dg-additional-options "-w" } */ #include <stdio.h> #include <openacc.h> #include <gomp-constants.h> diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c index 4152a4e..82e8aae 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-g-2.c @@ -1,3 +1,4 @@ +/* { dg-additional-options "-w" } */ #include <stdio.h> #include <openacc.h> #include <gomp-constants.h> diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c index 7107502..2f3a44f 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c @@ -1,3 +1,4 @@ +/* { dg-additional-options "-w" } */ #include <stdio.h> #include <openacc.h> #include <gomp-constants.h> diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c index 6bbd04f..a1bb845 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-1.c @@ -1,3 +1,4 @@ +/* { dg-additional-options "-w" } */ #include <stdio.h> #include 
<openacc.h> #include <gomp-constants.h> diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c index c63a5d4..ae43bb4 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-w-2.c @@ -1,3 +1,4 @@ +/* { dg-additional-options "-w" } */ #include <stdio.h> #include <openacc.h> #include <gomp-constants.h> diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c index fa6fb91..10b80f1 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-w-1.c @@ -14,6 +14,7 @@ int main () ary[ix] = -1; #pragma acc parallel num_workers(32) vector_length(32) copy(ary) copy(ondev) + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 16 } */ { #pragma acc loop worker for (unsigned ix = 0; ix < N; ix++) diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-warn-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-warn-1.c new file mode 100644 index 0000000..20a022f --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-warn-1.c @@ -0,0 +1,37 @@ + +/* Check warnings about suboptimal partitioning choices. 
*/ + +int main () +{ + int ary[10]; + +#pragma acc parallel copy(ary) num_gangs (1) /* { dg-warning "is not gang partitioned" } */ + { + #pragma acc loop gang + for (int i = 0; i < 10; i++) + ary[i] = i; + } + +#pragma acc parallel copy(ary) num_workers (1) /* { dg-warning "is not worker partitioned" } */ + { + #pragma acc loop worker + for (int i = 0; i < 10; i++) + ary[i] = i; + } + +#pragma acc parallel copy(ary) num_gangs (8) /* { dg-warning "is gang partitioned" } */ + { + #pragma acc loop worker + for (int i = 0; i < 10; i++) + ary[i] = i; + } + +#pragma acc parallel copy(ary) num_workers (8) /* { dg-warning "is worker partitioned" } */ + { + #pragma acc loop gang + for (int i = 0; i < 10; i++) + ary[i] = i; + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/mode-transitions.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/mode-transitions.c index 4474c12..f62daf0 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/mode-transitions.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/mode-transitions.c @@ -287,6 +287,7 @@ void t7() int n = 0; #pragma acc parallel copy(n) \ num_gangs(1) num_workers(1) vector_length(32) + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 288 } */ { n++; } @@ -310,6 +311,7 @@ void t8() #pragma acc parallel copy(arr) \ num_gangs(gangs) num_workers(1) vector_length(32) + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 312 } */ { int j; #pragma acc loop gang @@ -339,6 +341,7 @@ void t9() #pragma acc parallel copy(arr) \ num_gangs(gangs) num_workers(1) vector_length(32) + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 342 } */ { int j; #pragma acc loop gang @@ -371,6 +374,7 @@ void t10() #pragma acc parallel copy(arr) \ num_gangs(gangs) num_workers(1) vector_length(32) + /* { dg-warning "region is 
vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 375 } */ { int j; #pragma acc loop gang @@ -404,6 +408,7 @@ void t11() #pragma acc parallel copy(arr) \ num_gangs(1024) num_workers(1) vector_length(32) + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 409 } */ { int j; @@ -442,6 +447,7 @@ void t12() #pragma acc parallel copyout(fizz, buzz, fizzbuzz) \ num_gangs(NUM_GANGS) num_workers(1) vector_length(32) + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 448 } */ { int j; @@ -488,6 +494,7 @@ void t13() #pragma acc parallel copy(arr) \ num_gangs(8) num_workers(8) vector_length(32) + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 495 } */ { int j; #pragma acc loop gang @@ -613,6 +620,7 @@ void t16() #pragma acc parallel copy(n, arr) \ num_gangs(8) num_workers(16) vector_length(32) + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 621 } */ { int j; #pragma acc loop gang @@ -665,6 +673,7 @@ void t17() #pragma acc parallel copyin(arr_a) copyout(arr_b) \ num_gangs(num_gangs) num_workers(num_workers) vector_length(32) + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 674 } */ { int j; #pragma acc loop gang @@ -882,6 +891,8 @@ void t21() #pragma acc parallel copy(arr) \ num_gangs(8) num_workers(8) vector_length(32) + /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 892 } */ + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 892 } */ { int j; #pragma acc loop gang @@ -905,6 +916,8 @@ void t22() #pragma acc parallel copy(arr) \ num_gangs(8) num_workers(8) 
vector_length(32) + /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 917 } */ + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 917 } */ { int j; #pragma acc loop gang @@ -931,6 +944,8 @@ void t23() #pragma acc parallel copy(arr) \ num_gangs(8) num_workers(8) vector_length(32) + /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 945 } */ + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 945 } */ { int j; #pragma acc loop gang @@ -957,6 +972,8 @@ void t24() #pragma acc parallel copy(arr) \ num_gangs(8) num_workers(8) vector_length(32) + /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 973 } */ + /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 973 } */ { int j; #pragma acc loop gang @@ -988,6 +1005,7 @@ void t25() #pragma acc parallel copy(arr) \ num_gangs(8) num_workers(8) vector_length(32) + /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 1006 } */ { int j; #pragma acc loop gang @@ -1020,6 +1038,7 @@ void t26() #pragma acc parallel copy(arr) \ num_gangs(8) num_workers(8) vector_length(32) + /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 1039 } */ { int j; #pragma acc loop gang @@ -1070,6 +1089,8 @@ void t27() #pragma acc parallel copy(n, arr) copyout(ondev) \ num_gangs(ACTUAL_GANGS) num_workers(8) vector_length(32) + /* { dg-warning "region is gang partitioned but does not contain gang partitioned code" "gang" { target *-*-* } 1090 } */ + /* { dg-warning "region is worker partitioned but does not contain worker 
partitioned code" "worker" { target *-*-* } 1090 } */ { int j; diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c index 7781b32..ebcb760 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-dims.c @@ -96,7 +96,7 @@ int main () int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max; gangs_min = workers_min = vectors_min = INT_MAX; gangs_max = workers_max = vectors_max = INT_MIN; -#pragma acc parallel copy (gangs_actual) \ +#pragma acc parallel copy (gangs_actual) /* { dg-warning "region contains gang partitoned code but is not gang partitioned" } */ \ num_gangs (GANGS) /* { dg-warning "'num_gangs' value must be positive" "" { target c++ } } */ { /* We're actually executing with num_gangs (1). */ @@ -125,7 +125,7 @@ int main () int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max; gangs_min = workers_min = vectors_min = INT_MAX; gangs_max = workers_max = vectors_max = INT_MIN; -#pragma acc parallel copy (workers_actual) \ +#pragma acc parallel copy (workers_actual) /* { dg-warning "region contains worker partitoned code but is not worker partitioned" } */ \ num_workers (WORKERS) /* { dg-warning "'num_workers' value must be positive" "" { target c++ } } */ { /* We're actually executing with num_workers (1). 
*/
@@ -154,7 +154,8 @@ int main ()
 int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
 gangs_min = workers_min = vectors_min = INT_MAX;
 gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (vectors_actual) /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } } */ \
+#pragma acc parallel copy (vectors_actual) /* { dg-warning "region contains vector partitoned code but is not vector partitioned" } */ \
+ /* { dg-warning "using vector_length \\(32\\), ignoring 1" "" { target openacc_nvidia_accel_selected } 157 } */ \
 vector_length (VECTORS) /* { dg-warning "'vector_length' value must be positive" "" { target c++ } } */
 {
 /* We're actually executing with vector_length (1), just the GCC nvptx
@@ -198,7 +199,7 @@ int main ()
 int gangs_min, gangs_max, workers_min, workers_max, vectors_min, vectors_max;
 gangs_min = workers_min = vectors_min = INT_MAX;
 gangs_max = workers_max = vectors_max = INT_MIN;
-#pragma acc parallel copy (gangs_actual) \
+#pragma acc parallel copy (gangs_actual) /* { dg-warning "region is gang partitioned but does not contain gang partitioned code" } */ \
 reduction (min: gangs_min, workers_min, vectors_min) reduction (max: gangs_max, workers_max, vectors_max) \
 num_gangs (gangs)
 {
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/private-variables.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-variables.c
index 53f03d1..f0c3447 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/private-variables.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-variables.c
@@ -22,6 +22,8 @@ void local_g_1()
 arr[i] = 3;
 #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 24 } */
+ /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 24 } */
 {
 int x;
@@ -295,6 +297,8 @@ void loop_g_1()
 arr[i] = i;
 #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 299 } */
+ /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 299 } */
 {
 #pragma acc loop gang private(x)
 for (i = 0; i < 32; i++)
@@ -320,6 +324,7 @@ void loop_g_2()
 arr[i] = i;
 #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 326 } */
 {
 #pragma acc loop gang private(x)
 for (i = 0; i < 32; i++)
@@ -348,6 +353,7 @@ void loop_g_3()
 arr[i] = i;
 #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 355 } */
 {
 #pragma acc loop gang private(x)
 for (i = 0; i < 32; i++)
@@ -376,6 +382,7 @@ void loop_g_4()
 arr[i] = i;
 #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 384 } */
 {
 #pragma acc loop gang private(x)
 for (i = 0; i < 32; i++)
@@ -408,6 +415,7 @@ void loop_g_5()
 arr[i] = i;
 #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 417 } */
 {
 #pragma acc loop gang private(x)
 for (i = 0; i < 32; i++)
@@ -438,6 +446,7 @@ void loop_g_6()
 arr[i] = i;
 #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 448 } */
 {
 #pragma acc loop gang private(pt)
 for (i = 0; i < 32; i++)
@@ -559,6 +568,7 @@ void loop_w_1()
 arr[i] = i;
 #pragma acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 570 } */
 {
 int j;
@@ -875,6 +885,8 @@ void parallel_g_1()
 arr[i] = 3;
 #pragma acc parallel private(x) copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 887 } */
+ /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 887 } */
 {
 #pragma acc loop gang(static:1)
 for (i = 0; i < 32; i++)
@@ -904,6 +916,7 @@ void parallel_g_2()
 arr[i] = i;
 #pragma acc parallel private(x) copy(arr) num_gangs(32) num_workers(2) vector_length(32)
+ /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 918 } */
 {
 #pragma acc loop gang
 for (i = 0; i < 32; i++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-7.c
index c4940b8..68ae919 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-7.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-7.c
@@ -14,6 +14,8 @@ void g_np_1()
 arr[i] = i;
 #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32)
+ /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 16 } */
+ /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 16 } */
 {
 #pragma acc loop gang reduction(+:res)
 for (i = 0; i < 1024; i++)
@@ -28,6 +30,8 @@ void g_np_1()
 res = hres = 1;
 #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32)
+ /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 32 } */
+ /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 32 } */
 {
 #pragma acc loop gang reduction(*:res)
 for (i = 0; i < 12; i++)
@@ -52,6 +56,7 @@ void gv_np_1()
 arr[i] = i;
 #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32)
+ /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 58 } */
 {
 #pragma acc loop gang vector reduction(+:res)
 for (i = 0; i < 1024; i++)
@@ -76,6 +81,7 @@ void gw_np_1()
 arr[i] = i;
 #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32)
+ /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 83 } */
 {
 #pragma acc loop gang worker reduction(+:res)
 for (i = 0; i < 1024; i++)
@@ -239,6 +245,7 @@ void v_p_1()
 #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
 private(res) copyout(out)
+ /* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 246 } */
 {
 #pragma acc loop gang
 for (j = 0; j < 32; j++)
@@ -315,6 +322,7 @@ void w_p_1()
 #pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
 private(res) copyout(out)
+ /* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 323 } */
 {
 #pragma acc loop gang
 for (j = 0; j < 32; j++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-g-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-g-1.c
index a164f57..8c3b938 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-g-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-g-1.c
@@ -6,6 +6,8 @@
 #pragma acc routine gang
 void __attribute__ ((noinline)) gang (int ary[N])
+/* { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 8 } */
+/* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 8 } */
 {
 #pragma acc loop gang
 for (unsigned ix = 0; ix < N; ix++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c
index 81f1e03..e14947c 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-w-1.c
@@ -6,6 +6,7 @@
 #pragma acc routine worker
 void __attribute__ ((noinline)) worker (int ary[N])
+/* { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 8 } */
 {
 #pragma acc loop worker
 for (unsigned ix = 0; ix < N; ix++)
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-1.f b/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-1.f
index aa1bb63..ff31116 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-1.f
+++ b/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-1.f
@@ -14,7 +14,7 @@
 RES2 = 0
 !$ACC PARALLEL NUM_GANGS(256) NUM_WORKERS(32) VECTOR_LENGTH(32)
-!$ACC& REDUCTION(+:RES1) COPY(RES1, RES2) ASYNC(1)
+!$ACC& REDUCTION(+:RES1) COPY(RES1, RES2) ASYNC(1) ! { dg-warning "region is (gang|worker|vector) partitioned" }
 res1 = res1 + 5
 !$ACC ATOMIC
@@ -36,7 +36,7 @@
 RES2 = 1
 !$ACC PARALLEL NUM_GANGS(8) NUM_WORKERS(32) VECTOR_LENGTH(32)
-!$ACC& REDUCTION(*:RES1) COPY(RES1, RES2) ASYNC(1)
+!$ACC& REDUCTION(*:RES1) COPY(RES1, RES2) ASYNC(1) ! { dg-warning "region is (gang|worker|vector) partitioned" }
 res1 = res1 * 5
 !$ACC ATOMIC
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-2.f b/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-2.f
index 5694de1..47c5ff3 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-2.f
+++ b/libgomp/testsuite/libgomp.oacc-fortran/par-reduction-2-2.f
@@ -14,7 +14,7 @@
 RES2 = 0
 !$ACC PARALLEL NUM_GANGS(256) NUM_WORKERS(32) VECTOR_LENGTH(32)
-!$ACC& REDUCTION(+:RES1) COPY(RES1, RES2) ASYNC(1)
+!$ACC& REDUCTION(+:RES1) COPY(RES1, RES2) ASYNC(1) ! { dg-warning "region is (gang|worker|vector) partitioned" }
 res1 = res1 + 5
 !$ACC ATOMIC
@@ -36,7 +36,7 @@
 RES2 = 1
 !$ACC PARALLEL NUM_GANGS(8) NUM_WORKERS(32) VECTOR_LENGTH(32)
-!$ACC& REDUCTION(*:RES1) COPY(RES1, RES2) ASYNC(1)
+!$ACC& REDUCTION(*:RES1) COPY(RES1, RES2) ASYNC(1) ! { dg-warning "region is (gang|worker|vector) partitioned" }
 res1 = res1 * 5
 !$ACC ATOMIC
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/pr84028.f90 b/libgomp/testsuite/libgomp.oacc-fortran/pr84028.f90
index 2b36122..8cb76a9 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/pr84028.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/pr84028.f90
@@ -5,7 +5,7 @@ program foo
 a = 1
- !$acc parallel num_gangs(1) num_workers(2)
+ !$acc parallel num_gangs(1) num_workers(2) ! { dg-warning "region is worker partitioned" }
 if (any(a(1:3,1:3,1:3).ne.1)) STOP 1
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90 b/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90
index 472a6a1..fbff5cc 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/private-variables.f90
@@ -13,6 +13,8 @@ subroutine t1()
 end do
 !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 15 }
+ ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 15 }
 !$acc loop gang private(x)
 do i = 1, 32
 x = i * 2;
@@ -37,6 +39,7 @@ subroutine t2()
 end do
 !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 41 }
 !$acc loop gang private(x)
 do i = 0, 31
 x = i * 2;
@@ -65,6 +68,7 @@ subroutine t3()
 end do
 !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 70 }
 !$acc loop gang private(x)
 do i = 0, 31
 x = i * 2;
@@ -98,6 +102,7 @@ subroutine t4()
 end do
 !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 104 }
 !$acc loop gang private(pt)
 do i = 0, 31
 pt%x = i
@@ -208,6 +213,7 @@ subroutine t7()
 end do
 !$acc parallel copy(arr) num_gangs(32) num_workers(8) vector_length(32)
+ ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 215 }
 !$acc loop gang private(x)
 do i = 0, 31
 !$acc loop worker private(x)
@@ -507,6 +513,8 @@ subroutine t14()
 end do
 !$acc parallel private(x) copy(arr) num_gangs(n) num_workers(8) vector_length(32)
+ ! { dg-warning "region is worker partitioned but does not contain worker partitioned code" "worker" { target *-*-* } 515 }
+ ! { dg-warning "region is vector partitioned but does not contain vector partitioned code" "vector" { target *-*-* } 515 }
 !$acc loop gang(static:1)
 do i = 1, n
 x = i * 2;
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
index f58a95f..a83e92a 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
@@ -1,4 +1,3 @@
-
 ! { dg-do run }
 ! { dg-additional-options "-cpp" }
@@ -100,7 +99,7 @@ subroutine gang (a)
 integer, intent (inout) :: a(N)
 integer :: i
- !$acc loop gang
+ !$acc loop gang worker vector
 do i = 1, N
 a(i) = a(i) - i
 end do