https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109837
Bug ID: 109837
Summary: [OpenMP] despite 'requires unified_address' there is segfault when 'is_device_ptr' is not used
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: libgomp
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
Target Milestone: ---

Cf. https://github.com/SOLLVE/sollve_vv/pull/734 for a testcase.

For nvptx, we currently have:

  GOMP_OFFLOAD_get_num_devices (unsigned int omp_requires_mask)
  ...
      && ((omp_requires_mask
           & ~(GOMP_REQUIRES_UNIFIED_ADDRESS
               | GOMP_REQUIRES_REVERSE_OFFLOAD)) != 0))

That is: for nvptx, we accept
  omp requires unified_address

However, while the address space is the same, the following is not handled:

> the is_device_ptr clause is not necessary to obtain device addresses from
> device pointers for use inside target regions.

Expected:
(A) Mapping related: is_device_ptr can be left out.
(B) omp_target_is_accessible works properly for such pointers.

For nvptx, the check can be done via cudaPointerGetAttributes, if I understand
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__UNIFIED.html
correctly.

* * *

NOTE: Something similar is needed for GCN, except that its
GOMP_OFFLOAD_get_num_devices currently returns -1 when
GOMP_REQUIRES_UNIFIED_ADDRESS has been requested.

It seems as if hsa_amd_pointer_info is the function to be used, cf.
https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/master/src/inc/hsa_ext_amd.h
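
For illustration, the mask test quoted above can be sketched as a standalone C helper. This is not the libgomp code: the DEMO_* names, their bit values, and both function signatures are assumptions made up for this sketch. Only the bit test itself and the -1 result for an unsupported request follow the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for this sketch only; the real flags are defined
   in libgomp and their values are not taken from this report.  */
enum
{
  DEMO_REQUIRES_UNIFIED_ADDRESS = 0x10,
  DEMO_REQUIRES_UNIFIED_SHARED_MEMORY = 0x20,
  DEMO_REQUIRES_REVERSE_OFFLOAD = 0x80
};

/* True when every bit set in REQUESTED is also set in SUPPORTED.  */
static bool
demo_requires_ok (unsigned int requested, unsigned int supported)
{
  return (requested & ~supported) == 0;
}

/* Hypothetical stand-in for a plugin's device-count hook: report -1 when
   devices exist but a requested requirement is not supported.  */
static int
demo_get_num_devices (int num_devices, unsigned int requested,
                      unsigned int supported)
{
  assert (num_devices >= 0);
  if (num_devices > 0 && !demo_requires_ok (requested, supported))
    return -1;
  return num_devices;
}
```

With a supported set of DEMO_REQUIRES_UNIFIED_ADDRESS | DEMO_REQUIRES_REVERSE_OFFLOAD (the nvptx-like case), a unified_address request keeps the devices visible. With an empty supported set, used here as a stand-in for the GCN behaviour described in the note, the same request yields -1. Passing this check only makes the device available. It does not by itself provide the pointer handling expected in (A) and (B).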