This series adds support for accelerated 2D/3D memory copies for
omp_target_memcpy_rect and "target update" directives, using underlying
API-provided routines (CUDA for NVIDIA, or an AMD-specific extension
for HSA).
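
For readers unfamiliar with rectangular copies, here is a minimal
host-only sketch of what a 2D copy of this kind does: it moves a
rows x cols sub-block between two row-major arrays that may have
different row lengths.  The helper name rect_copy_2d and its parameter
layout are hypothetical and are not taken from libgomp or from these
patches; omp_target_memcpy_rect expresses the same idea with volume,
offset and dimension arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libgomp: copy a ROWS x COLS block of
   ELEM_SIZE-byte elements from SRC to DST.  Both arrays are row-major;
   DST_NCOLS and SRC_NCOLS are their full row lengths in elements, and
   the *_ROW0 / *_COL0 arguments give the top-left corner of the block
   in each array.  */
void
rect_copy_2d (void *dst, const void *src, size_t elem_size,
	      size_t rows, size_t cols,
	      size_t dst_row0, size_t dst_col0, size_t dst_ncols,
	      size_t src_row0, size_t src_col0, size_t src_ncols)
{
  char *d = (char *) dst;
  const char *s = (const char *) src;

  /* The block must fit within a row of each array.  */
  assert (dst_col0 + cols <= dst_ncols);
  assert (src_col0 + cols <= src_ncols);

  /* One contiguous memcpy per row of the block.  */
  for (size_t r = 0; r < rows; r++)
    memcpy (d + ((dst_row0 + r) * dst_ncols + dst_col0) * elem_size,
	    s + ((src_row0 + r) * src_ncols + src_col0) * elem_size,
	    cols * elem_size);
}
```

A row-by-row loop like this issues one transfer per row.  Handing the
whole rectangle to a single 2D/3D copy routine avoids that, which is the
motivation for using the API-provided routines.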

One of the patches (by Tobias) is already on mainline and is backported
here; another is a bug fix that has been submitted for mainline but not
yet reviewed.  The final patch, which adds AMD support for 2D/3D copies,
is new.

Tested with offloading to both NVPTX and AMD GCN.  I will push (to the
og13 branch) shortly.

Julian Brown (2):
  [og13] OpenMP, NVPTX: memcpy[23]D bias correction
  [og13] OpenMP: Support accelerated 2D/3D memory copies for AMD GCN

Tobias Burnus (1):
  [og13] OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for
    omp_target_memcpy_rect

 include/cuda/cuda.h                           |  87 +++
 libgomp/libgomp-plugin.h                      |   7 +
 libgomp/libgomp.h                             |   2 +
 libgomp/libgomp.texi                          |   5 +
 libgomp/oacc-host.c                           |   2 +
 libgomp/plugin/cuda-lib.def                   |   3 +
 libgomp/plugin/plugin-gcn.c                   | 359 ++++++++++++
 libgomp/plugin/plugin-nvptx.c                 | 185 ++++++
 libgomp/target.c                              | 164 +++++-
 libgomp/testsuite/libgomp.c/target-12.c       |   6 +-
 .../testsuite/libgomp.fortran/target-12.f90   |   6 +-
 .../libgomp.fortran/target-memcpy-rect-1.f90  | 531 ++++++++++++++++++
 12 files changed, 1332 insertions(+), 25 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.fortran/target-memcpy-rect-1.f90

-- 
2.25.1
