This patch series provides support for worker partitioning in the middle end. The OpenACC device-lowering pass (oaccdevlow) is split into three passes: the first assigns parallelism levels to loops; the second, which is new, rewrites basic blocks to implement a neutering/broadcasting scheme for the OpenACC worker-partitioned execution mode; and the third performs the rest of the previous device-lowering pass.
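
For illustration only (this example is not taken from the patches, and the function name scaled_row_sums is hypothetical), here is a minimal sketch of the kind of source code where worker-single and worker-partitioned execution meet. The comments describe what a neutering/broadcasting scheme would presumably have to do at each point. They are an assumption about the general approach, not a description of the code in omp-sese.c.

```c
#include <stddef.h>

/* Compute out[i] = scale[i] * (sum of row i of the row-major matrix m).
   Hypothetical example: without -fopenacc the pragmas are ignored and
   this runs as ordinary sequential C.  */
void
scaled_row_sums (const float *m, const float *scale, float *out,
                 int rows, int cols)
{
#pragma acc parallel loop gang \
  copyin (m[0:rows * cols], scale[0:rows]) copyout (out[0:rows])
  for (int i = 0; i < rows; i++)
    {
      /* Worker-single code.  Assumption: under a neutering scheme only
         one worker per gang does this work while the others are
         "neutered" (skip it).  's' and 'sum' are local to the gang
         loop body, so there is one logical copy per gang.  */
      float s = scale[i];
      float sum = 0.0f;

      /* Worker-partitioned code.  Every worker needs the value of 's'
         computed above, so it would have to be broadcast to the
         previously idle workers before the loop starts.  */
#pragma acc loop worker reduction (+:sum)
      for (int j = 0; j < cols; j++)
        sum += s * m[(size_t) i * cols + j];

      /* Back in worker-single mode: one store per gang iteration.  */
      out[i] = sum;
    }
}
```

Compiled without OpenACC support, the function simply computes the scaled row sums sequentially, which makes it easy to check the offloaded result against a host reference.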

Also included are patches to add support for placing gang-private variables in special memory (e.g. LDS, "local-data share", on AMD GCN), and to rewrite reductions targeting reference variables to use temporary local scalar variables instead.

Further commentary is provided alongside individual patches.

Tested with offloading to AMD GCN. I will apply to the openacc-gcc-9-branch shortly.

Thanks,

Julian

Julian Brown (6):
  [og9] Target-dependent gang-private variable decl rewriting
  [og9] OpenACC middle-end worker-partitioning support
  [og9] AMD GCN adjustments for middle-end worker partitioning
  [og9] Fix up tests for oaccdevlow pass splitting
  [og9] Reference reduction localization
  [og9] Enable worker partitioning for AMD GCN

 gcc/ChangeLog.openacc                         |   83 +
 gcc/Makefile.in                               |    1 +
 gcc/config/gcn/gcn-protos.h                   |    2 +-
 gcc/config/gcn/gcn-tree.c                     |    6 +-
 gcc/config/gcn/gcn.c                          |   15 +-
 gcc/config/gcn/gcn.opt                        |    2 +-
 gcc/doc/tm.texi                               |   14 +
 gcc/doc/tm.texi.in                            |    6 +
 gcc/gimplify.c                                |  102 +
 gcc/omp-builtins.def                          |    8 +
 gcc/omp-low.c                                 |   47 +-
 gcc/omp-offload.c                             |  290 ++-
 gcc/omp-offload.h                             |    1 +
 gcc/omp-sese.c                                | 2036 +++++++++++++++++
 gcc/omp-sese.h                                |   26 +
 gcc/passes.def                                |    2 +
 gcc/target.def                                |   19 +
 gcc/targhooks.h                               |    1 +
 gcc/testsuite/ChangeLog.openacc               |   12 +
 .../goacc/classify-kernels-unparallelized.c   |    8 +-
 .../c-c++-common/goacc/classify-kernels.c     |    8 +-
 .../c-c++-common/goacc/classify-parallel.c    |    8 +-
 .../c-c++-common/goacc/classify-routine.c     |    8 +-
 .../goacc/classify-kernels-unparallelized.f95 |    8 +-
 .../gfortran.dg/goacc/classify-kernels.f95    |    8 +-
 .../gfortran.dg/goacc/classify-parallel.f95   |    8 +-
 .../gfortran.dg/goacc/classify-routine.f95    |    8 +-
 gcc/tree-core.h                               |    4 +-
 gcc/tree-pass.h                               |    2 +
 gcc/tree.c                                    |   11 +-
 gcc/tree.h                                    |    2 +
 libgomp/ChangeLog.openacc                     |    5 +
 libgomp/plugin/plugin-gcn.c                   |    4 +-
 33 files changed, 2660 insertions(+), 105 deletions(-)
 create mode 100644 gcc/omp-sese.c
 create mode 100644 gcc/omp-sese.h

-- 
2.22.0