On 11/11/15 12:03, Richard Biener wrote:
On Mon, 9 Nov 2015, Tom de Vries wrote:
On 09/11/15 16:35, Tom de Vries wrote:
Hi,
this patch series for stage1 trunk adds support to:
- parallelize oacc kernels regions using parloops, and
- map the loops onto the oacc gang dimension.
The patch series contains these patches:
1 Insert new exit block only when needed in
transform_to_exit_first_loop_alt
2 Make create_parallel_loop return void
3 Ignore reduction clause on kernels directive
4 Implement -foffload-alias
5 Add in_oacc_kernels_region in struct loop
6 Add pass_oacc_kernels
7 Add pass_dominator_oacc_kernels
8 Add pass_ch_oacc_kernels
9 Add pass_parallelize_loops_oacc_kernels
10 Add pass_oacc_kernels pass group in passes.def
11 Update testcases after adding kernels pass group
12 Handle acc loop directive
13 Add c-c++-common/goacc/kernels-*.c
14 Add gfortran.dg/goacc/kernels-*.f95
15 Add libgomp.oacc-c-c++-common/kernels-*.c
16 Add libgomp.oacc-fortran/kernels-*.f95
The first 9 patches are more or less independent, but patches 10-16 are
intended to be committed at the same time.
Bootstrapped and reg-tested on x86_64.
Built and reg-tested with nvidia accelerator, in combination with a
patch that enables accelerator testing (which is submitted at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
I'll post the individual patches in reply to this message.
This patch updates existing testcases with new pass numbers, given the passes
that were added in the pass list in patch 10.
I think it would be nice to be able to specify the number in the .def
file instead, so we can avoid this kind of churn every time we do this.
How about something along the lines of:
...
      /* pass_build_ealias is a dummy pass that ensures that we
         execute TODO_rebuild_alias at this point.  */
      NEXT_PASS (pass_build_ealias);
      /* Pass group that runs when there are oacc kernels in the
         function.  */
      NEXT_PASS (pass_oacc_kernels);
      PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
        PUSH_ID ("oacc_kernels")
          ...
        POP_ID ()
      POP_INSERT_PASSES ()
      NEXT_PASS (pass_fre);
...
where the PUSH_ID/POP_ID pair has the effect that, for all the
contained passes:
- the id is prefixed to the dump file name, so the dump file of pass_ch,
which normally is "ch", becomes "oacc_kernels_ch", and
- the pass name in pass_instances.def becomes pass_oacc_kernels_ch, such
that it doesn't count as a numbered instance of pass_ch
?
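
To make the intended naming concrete, here is a minimal standalone sketch
in C of the prefixing rule described above. It is not GCC code and does not
touch passes.def or the script that generates pass_instances.def; the names
push_id, pop_id, build_name and id_stack are hypothetical and exist only for
this illustration. It assumes the ids of nested PUSH_ID groups are joined
with '_' in nesting order, which the proposal does not spell out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 8

/* Hypothetical model of the ids made active by PUSH_ID.  */
const char *id_stack[MAX_DEPTH];
int id_depth = 0;

/* Models PUSH_ID ("id"): the id applies to all passes that follow.  */
void push_id(const char *id)
{
    assert(id_depth < MAX_DEPTH);
    id_stack[id_depth++] = id;
}

/* Models POP_ID (): the innermost id stops applying.  */
void pop_id(void)
{
    assert(id_depth > 0);
    id_depth--;
}

/* Write LEAD, then every active id followed by '_', then BASE into BUF.
   LEAD is "" for a dump file name and "pass_" for a pass name.  */
const char *build_name(const char *lead, const char *base,
                       char *buf, size_t size)
{
    size_t used;
    int n;

    assert(size > 0);
    n = snprintf(buf, size, "%s", lead);
    assert(n >= 0 && (size_t) n < size);
    used = (size_t) n;

    for (int i = 0; i < id_depth; i++) {
        n = snprintf(buf + used, size - used, "%s_", id_stack[i]);
        assert(n >= 0 && (size_t) n < size - used);
        used += (size_t) n;
    }

    n = snprintf(buf + used, size - used, "%s", base);
    assert(n >= 0 && (size_t) n < size - used);
    return buf;
}
```

In this model, after push_id ("oacc_kernels"), build_name ("", "ch", ...)
gives "oacc_kernels_ch" and build_name ("pass_", "ch", ...) gives
"pass_oacc_kernels_ch"; after pop_id () the names are plain "ch" and
"pass_ch" again, so passes outside the group keep their current instance
numbering.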
Thanks,
- Tom