[patch] libgomp: Enable USM for some nvptx devices

2024-05-28 Thread Tobias Burnus
While most of the nvptx systems I have access to don't have the support for CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS_USES_HOST_PAGE_TABLES, one has: Tesla V100-SXM2-16GB (as installed, e.g., on ORNL's Summit) does support this feature. And with that feature, unified-shared memory support doe

Re: [patch] libgomp: Enable USM for some nvptx devices

2024-05-28 Thread Tobias Burnus
Tobias Burnus wrote:
> While most of the nvptx systems I have access to don't have the support for CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS_USES_HOST_PAGE_TABLES, one has:
Actually, CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS is sufficient. And I finally also found the proper webpage for this

Re: [patch] libgomp: Enable USM for some nvptx devices

2024-05-29 Thread Jakub Jelinek
On Wed, May 29, 2024 at 08:20:01AM +0200, Tobias Burnus wrote:
> +  if (num_devices > 0
> +      && (omp_requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY))
> +    for (int dev = 0; dev < num_devices; dev++)
> +      {
> +        int pi;
> +        CUresult r;
> +        r = CUDA_CALL_NOCHECK (cuDeviceGet
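The quoted loop walks over every device once unified shared memory has been requested. A minimal sketch of that kind of check follows, assuming the intent is "every device must report pageable memory access". To stay self-contained it does not call the CUDA driver API: `query_pageable_fn`, `all_devices_support_usm`, and the two mock functions are hypothetical names introduced only for illustration. In a real plugin the callback's role would presumably be played by `cuDeviceGetAttribute` with `CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback (not part of libgomp or CUDA): stores 1 in *VALUE
   if device DEV reports pageable memory access and 0 otherwise.  Returns 0
   on success and nonzero on a query error.  */
typedef int (*query_pageable_fn) (int dev, int *value);

/* Return true only if there is at least one device and every device
   reports pageable memory access.  A failed query is treated as
   "not supported"; that policy is an assumption of this sketch.  */
bool
all_devices_support_usm (int num_devices, query_pageable_fn query)
{
  assert (query != NULL);
  if (num_devices <= 0)
    return false;
  for (int dev = 0; dev < num_devices; dev++)
    {
      int pi = 0;
      if (query (dev, &pi) != 0 || pi == 0)
        return false;
    }
  return true;
}

/* Mock query: every device reports support.  */
int
mock_all_capable (int dev, int *value)
{
  (void) dev;
  *value = 1;
  return 0;
}

/* Mock query: device 1 lacks support, all others have it.  */
int
mock_second_incapable (int dev, int *value)
{
  *value = (dev != 1);
  return 0;
}
```

With these mocks, `all_devices_support_usm (2, mock_all_capable)` is true, while `all_devices_support_usm (2, mock_second_incapable)` is false because device 1 fails the check.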

Re: [patch] libgomp: Enable USM for some nvptx devices

2024-06-03 Thread Andrew Stubbs
On 28/05/2024 23:33, Tobias Burnus wrote: While most of the nvptx systems I have access to don't have the support for CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS_USES_HOST_PAGE_TABLES, one has: Tesla V100-SXM2-16GB (as installed, e.g., on ORNL's Summit) does support this feature. And with that

Re: [patch] libgomp: Enable USM for some nvptx devices

2024-06-03 Thread Tobias Burnus
Andrew Stubbs wrote:
+    /* If USM has been requested and is supported by all devices
+       of this type, set the capability accordingly.  */
+    if (omp_requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY)
+      current_device.capabilities |= GOMP_OFFLOAD_CAP_SHARED_MEM;
+
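The quoted comment describes setting a shared-memory capability bit when USM was requested and all devices support it. A minimal sketch of that bit manipulation, under stated assumptions: `DEMO_REQUIRES_USM` and `DEMO_CAP_SHARED_MEM` are placeholder values standing in for `GOMP_REQUIRES_UNIFIED_SHARED_MEMORY` and `GOMP_OFFLOAD_CAP_SHARED_MEM` (they are not the real libgomp constants), and `demo_device_capabilities` is a hypothetical helper, not a function from the patch.

```c
#include <stdbool.h>

/* Placeholder bit values for illustration only; the real constants live
   in the libgomp headers and may differ.  */
#define DEMO_REQUIRES_USM    (1u << 0)
#define DEMO_CAP_SHARED_MEM  (1u << 0)

/* Return BASE_CAPS with the shared-memory capability bit added when the
   program requested USM and every device of this type supports it.
   Combining both conditions in one helper is an assumption of this
   sketch.  */
unsigned int
demo_device_capabilities (unsigned int base_caps,
                          unsigned int requires_mask,
                          bool all_devices_capable)
{
  unsigned int caps = base_caps;
  if ((requires_mask & DEMO_REQUIRES_USM) && all_devices_capable)
    caps |= DEMO_CAP_SHARED_MEM;
  return caps;
}
```

Keeping the decision in a pure function like this makes the two inputs explicit: the `requires` mask comes from the program, the capability answer from the device query.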

Re: [patch] libgomp: Enable USM for some nvptx devices

2024-06-03 Thread Andrew Stubbs
On 03/06/2024 17:46, Tobias Burnus wrote:
> Andrew Stubbs wrote:
>> +    /* If USM has been requested and is supported by all devices
>> +       of this type, set the capability accordingly.  */
>> +    if (omp_requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY)
>> +      current_device.capab

Re: [patch] libgomp: Enable USM for some nvptx devices

2024-06-03 Thread Tobias Burnus
Andrew Stubbs wrote:
> On 03/06/2024 17:46, Tobias Burnus wrote:
>> Andrew Stubbs wrote:
>>> +    /* If USM has been requested and is supported by all devices
>>> +       of this type, set the capability accordingly. */
>>> +    if (omp_requires_mask & GOMP_REQUIRES_UNIFIED_SHARED_MEMORY)
>>> +

Re: [patch] libgomp: Enable USM for some nvptx devices

2024-06-04 Thread Andrew Stubbs
On 03/06/2024 21:40, Tobias Burnus wrote:
> Andrew Stubbs wrote:
>> On 03/06/2024 17:46, Tobias Burnus wrote:
>>> Andrew Stubbs wrote:
>>>> +    /* If USM has been requested and is supported by all devices
>>>> +       of this type, set the capability accordingly. */
>>>> +    if (omp_requires_mask & GOMP

Re: [patch] libgomp: Enable USM for some nvptx devices

2024-06-04 Thread Tobias Burnus
Andrew Stubbs wrote: PS: I would love to do some comparisons [...] Actually, I think testing only data transfer is fine for this, but we might like to try some different access patterns, besides straight linear copies. I have now tried it on my laptop with BabelStream, https://github.com/UoB-

Re: [patch] libgomp: Enable USM for some nvptx devices

2024-06-05 Thread Tobias Burnus
Hi Andrew, hello world, Now with AMD Instinct MI200 data - see below. And a better look at the numbers. In terms of USM, there does not seem to be a clear winner between the two approaches. If we want to draw conclusions, more runs are definitely needed (statistics): The runs below show that the diff