This patch series brings together support for worker partitioning on AMD GCN, along with the supporting patches needed to avoid testsuite regressions.

Some of these patches have been sent upstream previously. Most are present on the openacc-gcc-9-branch, and have been tested with both AMD GCN and nVidia GPUs. The series has been tested as a whole with offloading to AMD GCN.

Further commentary is provided alongside individual patches.

OK for trunk?

Thanks,

Julian

Julian Brown (13):
  Add support for gang local storage allocation in shared memory
  Target-dependent gang-private variable decl rewriting
  Rewrite OpenACC private or reduction reference variables
  OpenACC middle-end worker-partitioning support
  AMD GCN adjustments for middle-end worker partitioning
  Fix up tests for oaccdevlow pass split
  Fix OpenACC "ephemeral" asynchronous host-to-device copies
  Fix host-to-device copies from rodata for AMD GCN
  AMD GCN libgomp plugin queue-full condition locking fix
  Race conditions in OpenACC async tests
  AMD GCN symbol output with null cfun
  Fix parallel-dims.f90 for AMD GCN
  Enable worker partitioning for AMD GCN

 gcc/Makefile.in                               |    1 +
 gcc/config/gcn/gcn-protos.h                   |    4 +-
 gcc/config/gcn/gcn-tree.c                     |   11 +-
 gcc/config/gcn/gcn.c                          |   25 +-
 gcc/config/gcn/gcn.opt                        |    2 +-
 gcc/config/nvptx/nvptx.c                      |  699 +-----
 gcc/doc/tm.texi                               |   23 +
 gcc/doc/tm.texi.in                            |    8 +
 gcc/expr.c                                    |   13 +-
 gcc/gimplify.c                                |  116 +
 gcc/internal-fn.c                             |    2 +
 gcc/internal-fn.h                             |    3 +-
 gcc/omp-builtins.def                          |    8 +
 gcc/omp-low.c                                 |  172 +-
 gcc/omp-offload.c                             |  322 ++-
 gcc/omp-offload.h                             |    1 +
 gcc/omp-sese.c                                | 2086 +++++++++++++++++
 gcc/omp-sese.h                                |   32 +
 gcc/passes.def                                |    2 +
 gcc/target.def                                |   30 +
 gcc/targhooks.h                               |    1 +
 .../goacc/classify-kernels-unparallelized.c   |    8 +-
 .../c-c++-common/goacc/classify-kernels.c     |    8 +-
 .../c-c++-common/goacc/classify-parallel.c    |    8 +-
 .../c-c++-common/goacc/classify-routine.c     |    8 +-
 .../gcc.dg/goacc/loop-processing-1.c          |    4 +-
 .../goacc/classify-kernels-unparallelized.f95 |    8 +-
 .../gfortran.dg/goacc/classify-kernels.f95    |    8 +-
 .../gfortran.dg/goacc/classify-parallel.f95   |    8 +-
 .../gfortran.dg/goacc/classify-routine.f95    |    8 +-
 gcc/tree-core.h                               |    4 +-
 gcc/tree-pass.h                               |    2 +
 gcc/tree.c                                    |   11 +-
 gcc/tree.h                                    |    2 +
 libgomp/libgomp-plugin.h                      |    3 +-
 libgomp/libgomp.h                             |    2 +-
 libgomp/oacc-host.c                           |    1 +
 libgomp/oacc-mem.c                            |    4 +-
 libgomp/plugin/plugin-gcn.c                   |   82 +-
 libgomp/plugin/plugin-nvptx.c                 |   13 +-
 libgomp/target.c                              |   92 +-
 .../libgomp.oacc-c++/privatized-ref-2.C       |   64 +
 .../libgomp.oacc-c++/privatized-ref-3.C       |   64 +
 .../gang-private-1.c                          |   38 +
 .../libgomp.oacc-c-c++-common/lib-94.c        |    4 +-
 .../libgomp.oacc-c-c++-common/loop-gwv-2.c    |   95 +
 .../gangprivate-attrib-1.f90                  |   25 +
 .../gangprivate-attrib-2.f90                  |   25 +
 .../libgomp.oacc-fortran/lib-16-2.f90         |    5 +
 .../libgomp.oacc-fortran/parallel-dims-aux.c  |    9 +-
 .../libgomp.oacc-fortran/privatized-ref-1.f95 |   71 +
 51 files changed, 3426 insertions(+), 819 deletions(-)
 create mode 100644 gcc/omp-sese.c
 create mode 100644 gcc/omp-sese.h
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/privatized-ref-2.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/privatized-ref-3.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-1.f95

-- 
2.23.0