Hi!

Will this patch be acceptable for GCC trunk in the current development stage?

In its current incarnation, this patch depends on my 'Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid offloading"' patch, <http://news.gmane.org/find-root.php?message_id=%3C87zivg8rcy.fsf%40hertz.schwinge.homeip.net%3E>, which Bernd suggested "has to be considered after gcc-6". So I'll have to rework this patch; before doing that, I'm checking whether it generally meets approval.

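For readers who don't want to dig through the diff quoted below: here is a minimal standalone sketch of how a dimension of 0 ("to be determined later") is meant to be treated, as far as the patch goes. This is only an illustration under assumptions, not the GCC or libgomp code: the names DIM_MAX, FALLBACK_DIM, avoid_offloading_p and resolve_launch_dims are made up for this sketch, and the fallback of 32 merely mirrors the TODO placeholder in the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins, for illustration only: the real code uses the
   GOMP_DIM_* constants and the launch dimensions attached to the target
   function. */
enum { DIM_GANG, DIM_WORKER, DIM_VECTOR, DIM_MAX };

/* Placeholder fallback, mirroring the "TODO 32" in the patch. */
#define FALLBACK_DIM 32

/* Compile-time view: offloading is only "avoided" if no dimension can be
   parallel.  A 0 means "not decided yet", so it must not be counted as
   un-parallelized. */
bool
avoid_offloading_p (const int dims[DIM_MAX])
{
  for (int i = 0; i != DIM_MAX; i++)
    if (dims[i] > 1 || dims[i] == 0)
      return false;
  return true;
}

/* Run-time view: values fixed by the compiler override the caller's
   request; anything that is still 0 afterwards gets the fallback. */
void
resolve_launch_dims (const int compiled[DIM_MAX], int dims[DIM_MAX])
{
  for (int i = 0; i != DIM_MAX; i++)
    {
      if (compiled[i])
        dims[i] = compiled[i];
      if (!dims[i])
        dims[i] = FALLBACK_DIM;
    }
}
```

With compiled dimensions {0, 1, 32} and a request of all zeros, resolve_launch_dims yields {32, 1, 32}; avoid_offloading_p is true for {1, 1, 1} but false for {0, 1, 1}.
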
On Fri, 5 Feb 2016 13:06:17 +0100, I wrote:
> On Mon, 9 Nov 2015 18:39:19 +0100, Tom de Vries <tom_devr...@mentor.com> wrote:
> > On 09/11/15 16:35, Tom de Vries wrote:
> > > this patch series for stage1 trunk adds support to:
> > > - parallelize oacc kernels regions using parloops, and
> > > - map the loops onto the oacc gang dimension.
> >
> > Atm, the parallelization behaviour for the kernels region is controlled
> > by flag_tree_parallelize_loops, which is also used to control generic
> > auto-parallelization by autopar using omp. That is not ideal, and we may
> > want a separate flag (or param) to control the behaviour for oacc
> > kernels, f.i. -foacc-kernels-gang-parallelize=<n>. I'm open to suggestions.
>
> I suggest to use plain -fopenacc to enable OpenACC kernels processing
> (which just makes sense, I hope) ;-) and have later processing stages
> determine the actual parametrization (currently: number of gangs) (that
> is, Nathan's recent "Default compute dimensions" patches).
>
> The code changes are simple enough; OK for trunk? (This patch depends on
> my 'Un-parallelized OpenACC kernels constructs with nvptx offloading:
> "avoid offloading"' pending review,
> <http://news.gmane.org/find-root.php?message_id=%3C87zivg8rcy.fsf%40hertz.schwinge.homeip.net%3E>.)
>
> Originally, I wanted to use:
>
>     OMP_CLAUSE_NUM_GANGS_EXPR (clause) = build_int_cst (integer_type_node,
>                                                         n_threads == 0 ? -1 : n_threads);
>
> ... to store -1 "have the compiler decide" (instead of now 0 "have the
> run-time decide", which might prevent some code optimizations, as I
> understand it) for the n_threads == 0 case, but it seems that for an
> offloaded OpenACC kernels region, gcc/omp-low.c:oacc_validate_dims is
> called with the parameter "used" set to 0 instead of "gang", and then the
> "Default anything left to 1 or a partitioned default" logic will default
> dims["gang"] to oacc_min_dims["gang"] (that is, 1) instead of the
> oacc_default_dims["gang"] (that is, 32). Nathan, does that smell like a
> bug (and could you look into that)?
>
> diff --git gcc/tree-parloops.c gcc/tree-parloops.c
> index 139e38c..e498e5b 100644
> --- gcc/tree-parloops.c
> +++ gcc/tree-parloops.c
> @@ -2016,7 +2016,8 @@ transform_to_exit_first_loop (struct loop *loop,
>  /* Create the parallel constructs for LOOP as described in gen_parallel_loop.
>     LOOP_FN and DATA are the arguments of GIMPLE_OMP_PARALLEL.
>     NEW_DATA is the variable that should be initialized from the argument
> -   of LOOP_FN. N_THREADS is the requested number of threads. */
> +   of LOOP_FN. N_THREADS is the requested number of threads, which can be 0 if
> +   that number is to be determined later. */
>
>  static void
>  create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
> @@ -2049,6 +2050,7 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
>    basic_block paral_bb = single_pred (bb);
>    gsi = gsi_last_bb (paral_bb);
>
> +  gcc_checking_assert (n_threads != 0);
>    t = build_omp_clause (loc, OMP_CLAUSE_NUM_THREADS);
>    OMP_CLAUSE_NUM_THREADS_EXPR (t)
>      = build_int_cst (integer_type_node, n_threads);
> @@ -2221,7 +2223,8 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
>  }
>
>  /* Generates code to execute the iterations of LOOP in N_THREADS
> -   threads in parallel.
> +   threads in parallel, which can be 0 if that number is to be determined
> +   later.
>
>     NITER describes number of iterations of LOOP.
>     REDUCTION_LIST describes the reductions existent in the LOOP. */
> @@ -2318,6 +2321,7 @@ gen_parallel_loop (struct loop *loop,
>    else
>      m_p_thread=MIN_PER_THREAD;
>
> +  gcc_checking_assert (n_threads != 0);
>    many_iterations_cond =
>      fold_build2 (GE_EXPR, boolean_type_node,
>                   nit, build_int_cst (type, m_p_thread * n_threads));
> @@ -3177,7 +3181,7 @@ oacc_entry_exit_ok (struct loop *loop,
>  static bool
>  parallelize_loops (bool oacc_kernels_p)
>  {
> -  unsigned n_threads = flag_tree_parallelize_loops;
> +  unsigned n_threads;
>    bool changed = false;
>    struct loop *loop;
>    struct loop *skip_loop = NULL;
> @@ -3199,6 +3203,13 @@ parallelize_loops (bool oacc_kernels_p)
>    if (cfun->has_nonlocal_label)
>      return false;
>
> +  /* For OpenACC kernels, n_threads will be determined later; otherwise, it's
> +     the argument to -ftree-parallelize-loops. */
> +  if (oacc_kernels_p)
> +    n_threads = 0;
> +  else
> +    n_threads = flag_tree_parallelize_loops;
> +
>    gcc_obstack_init (&parloop_obstack);
>    reduction_info_table_type reduction_list (10);
>
> @@ -3361,7 +3372,13 @@ public:
>    {}
>
>    /* opt_pass methods: */
> -  virtual bool gate (function *) { return flag_tree_parallelize_loops > 1; }
> +  virtual bool gate (function *)
> +  {
> +    if (oacc_kernels_p)
> +      return flag_openacc;
> +    else
> +      return flag_tree_parallelize_loops > 1;
> +  }
>    virtual unsigned int execute (function *);
>    opt_pass * clone () { return new pass_parallelize_loops (m_ctxt); }
>    void set_pass_param (unsigned int n, bool param)
> diff --git gcc/tree-ssa-loop.c gcc/tree-ssa-loop.c
> index bdbade5..4c39fbc 100644
> --- gcc/tree-ssa-loop.c
> +++ gcc/tree-ssa-loop.c
> @@ -148,7 +148,7 @@ make_pass_tree_loop (gcc::context *ctxt)
>  static bool
>  gate_oacc_kernels (function *fn)
>  {
> -  if (flag_tree_parallelize_loops <= 1)
> +  if (!flag_openacc)
>      return false;
>
>    tree oacc_function_attr = get_oacc_fn_attrib (fn->decl);
> @@ -230,10 +230,9 @@ public:
>    virtual bool gate (function *)
>    {
>      return (optimize
> -            /* Don't bother doing anything if the program has errors. */
> -            && !seen_error ()
>              && flag_openacc
> -            && flag_tree_parallelize_loops > 1);
> +            /* Don't bother doing anything if the program has errors. */
> +            && !seen_error ());
>    }
>
>  }; // class pass_ipa_oacc
> diff --git gcc/config/nvptx/nvptx.c gcc/config/nvptx/nvptx.c
> index fe28154..2fd3d52 100644
> --- gcc/config/nvptx/nvptx.c
> +++ gcc/config/nvptx/nvptx.c
> @@ -4140,7 +4140,7 @@ nvptx_goacc_validate_dims (tree decl, int dims[], int fn_level)
>    bool avoid_offloading_p = true;
>    for (unsigned ix = 0; ix != GOMP_DIM_MAX; ix++)
>      {
> -      if (dims[ix] > 1)
> +      if (dims[ix] > 1 || dims[ix] == 0)
>          {
>            avoid_offloading_p = false;
>            break;
> diff --git libgomp/oacc-parallel.c libgomp/oacc-parallel.c
> index bc24651..f795bf7 100644
> --- libgomp/oacc-parallel.c
> +++ libgomp/oacc-parallel.c
> @@ -103,6 +103,10 @@ GOACC_parallel_keyed (int device, void (*fn) (void *),
>        return;
>      }
>
> +  /* Default: let the runtime choose. */
> +  for (i = 0; i != GOMP_DIM_MAX; i++)
> +    dims[i] = 0;
> +
>    va_start (ap, kinds);
>    /* TODO: This will need amending when device_type is implemented. */
>    while ((tag = va_arg (ap, unsigned)) != 0)
> diff --git libgomp/plugin/plugin-nvptx.c libgomp/plugin/plugin-nvptx.c
> index 7ec1810..3f1bb6d 100644
> --- libgomp/plugin/plugin-nvptx.c
> +++ libgomp/plugin/plugin-nvptx.c
> @@ -894,9 +894,21 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
>    /* Initialize the launch dimensions. Typically this is constant,
>       provided by the device compiler, but we must permit runtime
>       values. */
> -  for (i = 0; i != 3; i++)
> -    if (targ_fn->launch->dim[i])
> -      dims[i] = targ_fn->launch->dim[i];
> +  int seen_zero = 0;
> +  for (i = 0; i != GOMP_DIM_MAX; i++)
> +    {
> +      if (targ_fn->launch->dim[i])
> +        dims[i] = targ_fn->launch->dim[i];
> +      if (!dims[i])
> +        seen_zero = 1;
> +    }
> +
> +  if (seen_zero)
> +    {
> +      for (i = 0; i != GOMP_DIM_MAX; i++)
> +        if (!dims[i])
> +          dims[i] = /* TODO */ 32;
> +    }
>
>    /* This reserves a chunk of a pre-allocated page of memory mapped on both
>       the host and the device. HP is a host pointer to the new chunk, and DP is
>
> The TODO in libgomp/plugin/plugin-nvptx.c:nvptx_exec will be resolved by
> Nathan's "Default compute dimensions (runtime)",
> <http://news.gmane.org/find-root.php?message_id=%3C56B21D23.5060209%40acm.org%3E>.
>
> The remainder is just "mechanical" updates to the test cases:
>
> diff --git gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
> index e8b5357..17f240e 100644
> --- gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
> +++ gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
> @@ -1,5 +1,4 @@
>  /* { dg-additional-options "-O2" } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>  /* { dg-additional-options "-fdump-tree-optimized" } */
>
> @@ -51,4 +50,4 @@ main (void)
>  /* Check that the loop has been split off into a function. */
>  /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
> diff --git gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
> index c39d674..750f576 100644
> --- gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
> +++ gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
> @@ -1,5 +1,4 @@
>  /* { dg-additional-options "-O2" } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>  /* { dg-additional-options "-fdump-tree-optimized" } */
>
> @@ -34,4 +33,4 @@ foo (unsigned int n)
>  /* Check that the loop has been split off into a function. */
>  /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
> diff --git gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c
> index 3501d0d..df60d6a 100644
> --- gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c
> +++ gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c
> @@ -1,5 +1,4 @@
>  /* { dg-additional-options "-O2" } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>  /* { dg-additional-options "-fdump-tree-optimized" } */
>
> @@ -34,4 +33,4 @@ foo (void)
>  /* Check that the loop has been split off into a function. */
>  /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c
> index f97584d..913d91f 100644
> --- gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c
> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c
> @@ -1,5 +1,4 @@
>  /* { dg-additional-options "-O2" } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>  /* { dg-additional-options "-fdump-tree-optimized" } */
>
> @@ -67,4 +66,4 @@ main (void)
>  /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.1" 1 "optimized" } } */
>  /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.2" 1 "optimized" } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 3 "parloops1" } } */
> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 3 "parloops1" } } */
> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c
> index 530d62a..1822d2a 100644
> --- gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c
> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c
> @@ -1,5 +1,4 @@
>  /* { dg-additional-options "-O2" } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>  /* { dg-additional-options "-fdump-tree-optimized" } */
>
> @@ -45,5 +44,4 @@ main (void)
>  /* Check that the loop has been split off into a function. */
>  /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
> -
> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c
> index 4f1c2c5..e946319 100644
> --- gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c
> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c
> @@ -1,6 +1,5 @@
>  /* { dg-additional-options "-O2" } */
>  /* { dg-additional-options "-g" } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>  /* { dg-additional-options "-fdump-tree-optimized" } */
>
> @@ -13,5 +12,4 @@
>  /* Check that the loop has been split off into a function. */
>  /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
> -
> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c
> index 151db51..9b63b45 100644
> --- gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c
> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c
> @@ -1,5 +1,4 @@
>  /* { dg-additional-options "-O2" } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>  /* { dg-additional-options "-fdump-tree-optimized" } */
>
> @@ -49,4 +48,4 @@ main (void)
>  /* Check that the loop has been split off into a function. */
>  /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c
> index bee5f5a..279f797 100644
> --- gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c
> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c
> @@ -1,5 +1,4 @@
>  /* { dg-additional-options "-O2" } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>  /* { dg-additional-options "-fdump-tree-optimized" } */
>
> @@ -52,5 +51,4 @@ foo (COUNTERTYPE n)
>  /* Check that the loop has been split off into a function. */
>  /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
> -
> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c
> index ea0e342..db1071f 100644
> --- gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c
> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c
> @@ -1,5 +1,4 @@
>  /* { dg-additional-options "-O2" } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>  /* { dg-additional-options "-fdump-tree-optimized" } */
>
> @@ -36,4 +35,4 @@ main (void)
>  /* Check that the loop has been split off into a function. */
>  /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop.c gcc/testsuite/c-c++-common/goacc/kernels-loop.c
> index ab5dfb9..abf7a3c 100644
> --- gcc/testsuite/c-c++-common/goacc/kernels-loop.c
> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop.c
> @@ -1,5 +1,4 @@
>  /* { dg-additional-options "-O2" } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>  /* { dg-additional-options "-fdump-tree-optimized" } */
>
> @@ -52,5 +51,4 @@ main (void)
>  /* Check that the loop has been split off into a function. */
>  /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
> -
> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
> diff --git gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c
> index b16a8cd..95f4817 100644
> --- gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c
> +++ gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c
> @@ -1,5 +1,4 @@
>  /* { dg-additional-options "-O2" } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>  /* { dg-additional-options "-fdump-tree-optimized" } */
>
> @@ -50,5 +49,4 @@ main (void)
>  /* Check that the loop has been split off into a function. */
>  /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
> -
> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
> diff --git gcc/testsuite/c-c++-common/goacc/kernels-reduction.c gcc/testsuite/c-c++-common/goacc/kernels-reduction.c
> index 61c5df3..6f5a418 100644
> --- gcc/testsuite/c-c++-common/goacc/kernels-reduction.c
> +++ gcc/testsuite/c-c++-common/goacc/kernels-reduction.c
> @@ -1,5 +1,4 @@
>  /* { dg-additional-options "-O2" } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-fdump-tree-parloops1-all" } */
>  /* { dg-additional-options "-fdump-tree-optimized" } */
>
> @@ -32,5 +31,4 @@ foo (void)
>  /* Check that the loop has been split off into a function. */
>  /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */
>
> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
> -
> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
> diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95
> index 4db3a50..3334741 100644
> --- gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95
> +++ gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95
> @@ -1,5 +1,4 @@
>  ! { dg-additional-options "-O2" }
> -! { dg-additional-options "-ftree-parallelize-loops=32" }
>
>  program main
>    implicit none
> diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95
> index fef3d10..fb92da8 100644
> --- gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95
> +++ gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95
> @@ -1,5 +1,4 @@
>  ! { dg-additional-options "-O2" }
> -! { dg-additional-options "-ftree-parallelize-loops=10" }
>
>  program main
>    implicit none
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c
> index 08745fc..366b4f5 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c
> @@ -1,6 +1,5 @@
>  /* Test that the compiler decides to "avoid offloading". */
>
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* The ACC_DEVICE_TYPE environment variable gets set in the testing
>     framework, and that overrides the "avoid offloading" flag at run time.
>     { dg-xfail-run-if "TODO" { openacc_nvidia_accel_selected } } */
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c
> index 724228a..a63ec97 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c
> @@ -1,8 +1,6 @@
>  /* Test that a user can override the compiler's "avoid offloading"
>     decision at run time. */
>
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <openacc.h>
>
>  int main(void)
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c
> index 2fb5196..da01d02 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c
> @@ -1,7 +1,6 @@
>  /* Test that a user can override the compiler's "avoid offloading"
>     decision at compile time. */
>
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* Override the compiler's "avoid offloading" decision.
>     { dg-additional-options "-foffload-force" } */
>
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c
> index 87ca378..39899ab 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c
> @@ -1,7 +1,5 @@
>  /* This test exercises combined directives. */
>
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  int
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c
> index 8f0144c..31da8b1 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c
> @@ -1,5 +1,3 @@
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <openacc.h>
>
>  int test_parallel ()
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c
> index 3ef6f9b..51745ba 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c
> @@ -1,5 +1,4 @@
>  /* { dg-do run { target openacc_nvidia_accel_selected } } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-lcuda -lcublas -lcudart" } */
>
>  #include <stdlib.h>
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c
> index 614ad33..588e864 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c
> @@ -1,5 +1,3 @@
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  int i;
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c
> index 13e57bd..c7592d6 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c
> @@ -1,6 +1,3 @@
> -/* { dg-do run } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define N (1024 * 512)
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c
> index f61a74a..31114ac 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c
> @@ -1,6 +1,3 @@
> -/* { dg-do run } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define N (1024 * 512)
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c
> index 5cdc200..3ffdfe2 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c
> @@ -1,5 +1,3 @@
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define N 32
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
> index 2e4d4d2..a554d66 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
> @@ -1,5 +1,3 @@
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define N 32
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
> index 5bf00db..f0144b4 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
> @@ -1,5 +1,3 @@
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define N 32
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c
> index d39b667..4719edd 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c
> @@ -1,5 +1,3 @@
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define N 32
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c
> index bb2e85b..ca4f638 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c
> @@ -1,5 +1,3 @@
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define N 32
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c
> index e513827..d2fff38 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c
> @@ -1,5 +1,3 @@
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define N 32
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c
> index c4791a4..0df4b3f 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c
> @@ -1,5 +1,3 @@
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define N 100
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c
> index 96b6e4e..88258be 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c
> @@ -1,5 +1,3 @@
> -/* { dg-do run } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
>  /* { dg-additional-options "-g" } */
>
>  #include "kernels-loop.c"
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c
> index 1433cb2..147ebb5 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c
> @@ -1,6 +1,3 @@
> -/* { dg-do run } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define N ((1024 * 512) + 1)
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c
> index fd0d5b1..9a3eaca 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c
> @@ -1,6 +1,3 @@
> -/* { dg-do run } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define N ((1024 * 512) + 1)
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c
> index 21d2599..28c725a 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c
> @@ -1,6 +1,3 @@
> -/* { dg-do run } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define N 1000
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c
> index 3762e5a..355123c 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c
> @@ -1,6 +1,3 @@
> -/* { dg-do run } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define N (1024 * 512)
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c
> index 511e25f..8647a94 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c
> @@ -1,6 +1,3 @@
> -/* { dg-do run } */
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  #define n 10000
> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c
> index 94a5ae2..83cddb5 100644
> --- libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c
> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c
> @@ -1,5 +1,3 @@
> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */
> -
>  #include <stdlib.h>
>
>  int
> diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f
> index 5f18b94..ca5cd01 100644
> --- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f
> +++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f
> @@ -2,7 +2,6 @@
>
>  ! { dg-do run }
>  ! { dg-additional-options "-cpp" }
> -! { dg-additional-options "-ftree-parallelize-loops=32" }
>  ! The "avoid offloading" warning is only triggered for -O2 and higher.
>  ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } }
>  ! The ACC_DEVICE_TYPE environment variable gets set in the testing
> diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f
> index 51801ad..6200b37 100644
> --- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f
> +++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f
> @@ -3,7 +3,6 @@
>
>  ! { dg-do run }
>  ! { dg-additional-options "-cpp" }
> -! { dg-additional-options "-ftree-parallelize-loops=32" }
>  ! The "avoid offloading" warning is only triggered for -O2 and higher.
>  ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } }
>
> diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f
> index bea6ab8..865d09f 100644
> --- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f
> +++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f
> @@ -3,7 +3,6 @@
>
>  ! { dg-do run }
>  ! { dg-additional-options "-cpp" }
> -! { dg-additional-options "-ftree-parallelize-loops=32" }
>  ! Override the compiler's "avoid offloading" decision.
>  ! { dg-additional-options "-foffload-force" }
>
> diff --git libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90
> index 4b52579..12ff36c 100644
> --- libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90
> +++ libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90
> @@ -1,7 +1,6 @@
>  ! This test exercises combined directives.
>
>  ! { dg-do run }
> -! { dg-additional-options "-ftree-parallelize-loops=32" }
>  ! The "avoid offloading" warning is only triggered for -O2 and higher.
>  ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } }
>
> diff --git libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90
> index b9298c7..0643e89 100644
> --- libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90
> +++ libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90
> @@ -2,7 +2,6 @@
>  ! offloaded regions are properly mapped using present_or_copy.
>
>  ! { dg-do run }
> -! { dg-additional-options "-ftree-parallelize-loops=32" }
>  ! The "avoid offloading" warning is only triggered for -O2 and higher.
>  ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } }

Regards,
 Thomas