This series adds support for accelerated 2D/3D memory copies for omp_target_memcpy_rect and "target update" directives, using routines provided by the underlying API (CUDA for NVIDIA, or an AMD-specific extension for HSA).
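For readers less familiar with rectangular copies, here is a minimal host-only sketch of what a 2D sub-block copy does when performed row by row. This is the kind of loop that a single accelerated 2D copy call is meant to replace. The helper names copy_rect_2d and demo_copy_rect_2d and their parameter layout are hypothetical; none of this is code from libgomp or from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libgomp): copy a ROWS x COLS block of
   elements between two row-major 2D arrays whose row lengths (pitches,
   counted in elements) may differ.  */
static void
copy_rect_2d (void *dst, const void *src, size_t elem_size,
	      size_t rows, size_t cols,
	      size_t dst_row, size_t dst_col, size_t dst_pitch,
	      size_t src_row, size_t src_col, size_t src_pitch)
{
  assert (dst != NULL && src != NULL);
  /* The block must fit within one row of each array.  */
  assert (dst_col + cols <= dst_pitch);
  assert (src_col + cols <= src_pitch);

  char *d = (char *) dst;
  const char *s = (const char *) src;

  /* One memcpy per row: each row of the block is contiguous, but
     consecutive rows are separated by the array's full pitch.  */
  for (size_t r = 0; r < rows; r++)
    {
      size_t d_off = ((dst_row + r) * dst_pitch + dst_col) * elem_size;
      size_t s_off = ((src_row + r) * src_pitch + src_col) * elem_size;
      memcpy (d + d_off, s + s_off, cols * elem_size);
    }
}

/* Copy the 2x3 block at (1,2) of a 4x6 source to (0,1) of a 3x5
   destination, then return the sum of two copied corner elements.  */
static int
demo_copy_rect_2d (void)
{
  int src[4][6];
  int dst[3][5];

  for (int i = 0; i < 4; i++)
    for (int j = 0; j < 6; j++)
      src[i][j] = i * 10 + j;
  memset (dst, 0, sizeof dst);

  copy_rect_2d (dst, src, sizeof (int), 2, 3, 0, 1, 5, 1, 2, 6);

  /* dst[0][1] holds src[1][2]; dst[1][3] holds src[2][4].  */
  return dst[0][1] + dst[1][3];
}
```

The sketch issues one transfer per row. With a device on the other end, each of those transfers would be a separate copy operation, which is the overhead a single 2D/3D copy routine avoids.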
One of the patches (by Tobias) is already on mainline, so this is a backport, and another is a bug fix that has been submitted for mainline but not yet reviewed. The final patch is new (the AMD support for 2D/3D copies).

Tested with offloading to both NVPTX and AMD GCN. I will push (to the og13 branch) shortly.

Julian Brown (2):
  [og13] OpenMP, NVPTX: memcpy[23]D bias correction
  [og13] OpenMP: Support accelerated 2D/3D memory copies for AMD GCN

Tobias Burnus (1):
  [og13] OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for
    omp_target_memcpy_rect

 include/cuda/cuda.h                           |  87 +++
 libgomp/libgomp-plugin.h                      |   7 +
 libgomp/libgomp.h                             |   2 +
 libgomp/libgomp.texi                          |   5 +
 libgomp/oacc-host.c                           |   2 +
 libgomp/plugin/cuda-lib.def                   |   3 +
 libgomp/plugin/plugin-gcn.c                   | 359 ++++++++++++
 libgomp/plugin/plugin-nvptx.c                 | 185 ++++++
 libgomp/target.c                              | 164 +++++-
 libgomp/testsuite/libgomp.c/target-12.c       |   6 +-
 .../testsuite/libgomp.fortran/target-12.f90   |   6 +-
 .../libgomp.fortran/target-memcpy-rect-1.f90  | 531 ++++++++++++++++++
 12 files changed, 1332 insertions(+), 25 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.fortran/target-memcpy-rect-1.f90

-- 
2.25.1