Re: [PATCH] openmp, nvptx: ompx_gnu_managed_mem_alloc

Tobias Burnus Tue, 11 Nov 2025 03:57:58 -0800

Andrew Stubbs wrote:

On 10/11/2025 10:44, Tobias Burnus wrote:

There is still no clear note in the documentation that
this allocator uses the default device when doing the
allocation – and even less so that it must be the same
device (actually: device runtime) as for the allocation.


The documentation is there now.

Thanks.

and pass it to target.c as
'gomp_managed_alloc (size, &used_device);' and
'gomp_managed_free (ptr, used_device);'

(Note: With some handling to avoid races.)
I don't think I like the idea of hidden magic that a) might preventclever solutions, and b) once we start doing it we can never stop. Iprefer the clear documentation together with diagnosing the cases thatwe can.

Well, I am not sure whether one hidden logic is better than the other;we could also store the value device value in the descriptor.


But that solution is also fine if sufficiently documented.

* * *

--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -1265,11 +1271,13 @@ parse_allocator (const char *env, const char *val, void 
*const params[])
    C (omp_pteam_mem_alloc, false)
    C (omp_thread_mem_alloc, false)
    C (ompx_gnu_pinned_mem_alloc, false)
+  C (ompx_gnu_managed_mem_alloc, false)
    C (omp_default_mem_space, true)
    C (omp_large_cap_mem_space, true)
    C (omp_const_mem_space, true)
    C (omp_high_bw_mem_space, true)
    C (omp_low_lat_mem_space, true)
+  C (ompx_gnu_managed_mem_space, false)
  #undef C


Not quite. The second argument is:
#define C(v, m) \
...
      memspace = m;                                     \

Thus: please use ', true)' for the last line.

* * *

+@item @code{ompx_gnu_managed_mem_space} is a GNU extension that provides

...

+      already accessible on the device.  If managed memory is not supported by
+      the default device, as configured at the moment the allocator is called,
+      then the allocator will use the fall-back setting.


Actually, on non-Linux, the normal 'malloc' is used – and not the fallback.

Example:

omp target device(dev)
{
  const omp_alloctrait_t traits[]
     = { { omp_atk_alignment, 256},
         {omp_atk_fallback, omp_atv_abort_fb} };
  omp_allocator_handle_t myalloc
    = omp_init_allocator(ompx_gnu_managed_mem_space, 2, traits);
  void *ptr = omp_alloc(n, ompx_gnu_managed_mem_alloc);
}


For dev == omp_initial_device (on a Linux host):
* default device == an nvidia device
  → managed memory is used (honoring the alignment)
* default device != nvidia device (e.g. the host)
  → fall back is used → ABORT
  (with default_fb, 'malloc' would be used but
   without the 265-bit alignment).

For dev == any non-host device
* default mem space is used ('malloc'), honoring
  the alignment.

The current wording does not really make his clear:

+@item @code{ompx_gnu_managed_mem_space} is a GNU extension that provides
+      managed memory accessible by both host and device (as determined by the
+      @var{default-device-var} ICV); it is only available for supported offload
+      targets (see @ref{Offload-Target Specifics}).  This memory is accessible


Actually, the "and device" is also not quite right. For Nvidia devices, all
Nvidia devices can access that memory – it is not device specific, either.

* * *

How about, e.g. (note also 'devices' [-s]):

@item @code{ompx_gnu_managed_mem_space} is a GNU extension that provides
      managed memory accessible by both host and devices.  The memory space
      is available if the offload target associated with the
      @var{default-device-var} ICV supports managed memory (see
      @ref{Offload-Target Specifics}).  Otherwise, on Linux the
      fall-back setting of the allocator is used and on other systems
      the default memory space.

and continuing as in the patch:

+      This memory is accessible
+      by both the host and the device at the same address, so it need not be
+      mapped with @code{map} clauses.  Instead, use the @code{is_device_ptr}
+      clause or @code{has_device_addr} clause to indicate that the pointer is
+      already accessible on the device.  If managed memory is not supported by
+      the default device, as configured at the moment the allocator is called,
+      then the allocator will use the fall-back setting. If the default device
+      is configured differently when the memory is freed, via @code{omp_free}
+      or @code{omp_realloc}, the result may be undefined.

Albeit the "If managed ... fall-back setting." can be removed.

* * *

[AMD GPUs]

+@item Managed memory allocated with the OpenMP
+      @code{ompx_gnu_managed_mem_alloc} allocator or in the
+      @code{ompx_gnu_managed_mem_space} is not currently supported on AMD GPU
+      devices; attempting to use it in an allocator will trigger the fall-back
+      trait.


I think we need again "for" instead of "on" – as using
it on the device is fine, except that it will not be
managed but the default mem space. (Possibly adding 'on the host', but
it is not really needed.)

* * *

[Nvidia GPUs]

+@item Managed memory allocated with the @code{ompx_gnu_managed_mem_alloc}

Maybe change this to  'Managed memory allocated *on* *the* *host*'?

+      allocator or in the @code{ompx_gnu_managed_mem_space} (both GNU
+      extensions) allocate memory in the CUDA Managed Memory space using
+      @code{cuMemAllocManaged}.  This memory is accessible by both the host and
+      the device at the same address, so it need not be mapped with @code{map}
+      clauses.  Instead, use the @code{is_device_ptr} clause or
+      @code{has_device_addr} clause to indicate that the pointer is already
+      accessible on the device.  The CUDA runtime will automatically handle
+      data migration between host and device as needed.

+      If managed memory is not supported by the default device, as configured
+      at the moment the allocator is called, then the allocator will use the
+      fall-back setting.

I wonder whether this is needed – or the wording in the other section isenough? I am slightly inclined of removing it.

+ If the default device is configured differently when
+      the memory is freed, via @code{omp_free} or @code{omp_realloc}, the
+      result may be undefined.


While this one is redundant, I think it is sensible to keep it – to
avoid user surprises. It also hints at the default device in case
the user missed it in the other section.

* * *

All in all: LGTM with the env.c issue fixed and considering some
.texi wording changes.

Thanks for the patch!

Tobias

Re: [PATCH] openmp, nvptx: ompx_gnu_managed_mem_alloc

Reply via email to