Hi PA,

Not a review but a heads-up (or heads-down, I am currently confused …).

Paul-Antoine Arras wrote:
Consider the following OMP directive, assuming tiles is allocatable:

!$omp target enter data &
!$omp   map(to: chunk%tiles(1)%field%density0) &
!$omp   map(to: chunk%left_rcv_buffer)

libgomp reports an illegal memory access error at runtime. This is because
density0 is referenced through tiles, which requires its descriptor to be mapped
along its content.
This patch ensures that such intervening allocatable in a reference chain is
properly mapped. For the above example, the frontend has to create the following
three additional map clauses:

(1) map (alloc: *(struct tile_type[0:] * restrict) chunk.tiles.data [len: 0])
(2) map (to: chunk.tiles [pointer set, len: 64])
(3) map (attach_detach: (struct tile_type[0:] * restrict) chunk.tiles.data [bias: -1])

I think I need to think about this a bit more.

To have something lighter, I tried:
----------------------
integer, allocatable :: aa(:)
integer, pointer :: pp(:)

pp => null()
!$omp target enter data map(aa)
!$omp target enter data map(pp)

!$omp target map(present, alloc: aa, pp)
  if (associated(pp) .or. allocated(aa)) i = 1
!$omp end target

!$omp target exit data map(pp)
!$omp target exit data map(aa)

allocate(aa, pp, source=[1,2,3])
!$omp target enter data map(pp)
!$omp target enter data map(aa)

!$omp target map(always, to: aa) map(to: pp)
! GCC + ftn: 'map(present, alloc:' -> 'aa' and 'pp' not in the present table
!  if (associated(pp) .or. allocated(aa)) stop 1
!  if (any (pp /= [1,2,3])) stop 1
!  if (any (aa /= [1,2,3])) stop 1
  pp = pp * 2
  aa = aa * 3
!$omp end target

!$omp target exit data map(from: aa)
!$omp target exit data map(from: pp)

print *, aa ! 3,6,9
print *, pp ! 0 (?) w/ cray, 2,4,6 with GCC.
end
-----------------

I think the GCC result for that program (as currently written)
makes sense, but I am not sure I understand why 'present' leads to a
'not present' error. I guess I need to re-read the specification here.

If one comments out the inner enter/exit for 'aa', only 'to' is
active and the result is the outer '1,2,3', which kind of makes sense
if 'aa' is not in the present table. (But should it?)

* * *

The original starting point is the program below. For the first
'target' region, Cray ftn works (accepting the 'present'), but GCC
(with the patches hopefully applied correctly) fails with:
  libgomp: Trying to map into
    device [0x62d1a50..0x62d1b00) object
    when   [0x62d1a50..0x62d1aa8) is already mapped

which I find rather odd. If one uses 'to' instead of 'present'
(+ some ignored map type, let's pick: 'alloc'), it works with
GCC as well.

(BTW: Cray ftn rejects 'print' inside target and gives an ICE
when using 'stop'.)

* * *

Likewise for the second target region, where GCC does not
like the 'present' either. Using

  'alloc: ... density0'
  'always, to: density1'

it fails differently:
  libgomp: cuCtxSynchronize error: an illegal memory access was encountered

With
  'to: density0'
  'always, to: density1'
the program compiles and runs past this target region.

At runtime, however, 'from:' in target exit data doesn't bring the data back
for 'density1' (though it does for 'density0'), while 'always, from' (for
'density1') causes:
  libgomp: cuCtxSynchronize error: an illegal memory access was encountered

Again, this message is a bit surprising, whereas the failing copy-back seems
to be due to 'density1' not being in the present table, I'd guess.

As Cray ftn shares the not-in-present-table behavior for the scalar
case, it is not surprising that it also uses the host value for 'density1'.
But it doesn't have the odd crash GCC has. It behaves identically for
'always, from' and 'from', contrary to GCC:

-------------------------
module m
  implicit none
  type field_type
    real(kind=8), allocatable :: density0(:,:), density1(:,:)
  end type field_type

  type tile_type
    type(field_type) :: field
  end type tile_type

  type chunk_type
    real(kind=8), allocatable :: left_rcv_buffer(:)
    type(tile_type), allocatable :: tiles(:)
  end type chunk_type

  type(chunk_type) :: chunk
end

use m
implicit none
allocate(chunk%tiles(1))
chunk%tiles(1)%field%density0 = reshape([1,2,3,4],[2,2])

!$omp target enter data &
!$omp   map(to: chunk%tiles(1)%field%density0) &
!$omp   map(to: chunk%tiles(1)%field%density1)

!$omp target map(present, alloc: chunk%tiles(1)%field%density0)
!  if (.not. allocated(chunk%tiles(1)%field%density0)) stop 1
!  if (any (chunk%tiles(1)%field%density0 /= reshape([1,2,3,4],[2,2]))) stop 1
  chunk%tiles(1)%field%density0 = chunk%tiles(1)%field%density0 * 2
!$omp end target

chunk%tiles(1)%field%density1 = reshape([11,22,33,44],[2,2])

!$omp target map(present, alloc: chunk%tiles(1)%field%density0) &
!$omp        map(always, present, to: chunk%tiles(1)%field%density1)
!  if (.not. allocated(chunk%tiles(1)%field%density0)) stop 1
!  if (.not. allocated(chunk%tiles(1)%field%density1)) stop 1
!  if (any (chunk%tiles(1)%field%density0 /= 2*reshape([1,2,3,4],[2,2]))) stop 1
!  if (any (chunk%tiles(1)%field%density1 /= reshape([11,22,33,44],[2,2]))) stop 1
  chunk%tiles(1)%field%density0 = chunk%tiles(1)%field%density0 * 7
  chunk%tiles(1)%field%density1 = chunk%tiles(1)%field%density1 * 3
!$omp end target

!$omp target exit data &
!$omp   map(from: chunk%tiles(1)%field%density0) &
!$omp   map(from: chunk%tiles(1)%field%density1)

print *, chunk%tiles(1)%field%density0
print *, chunk%tiles(1)%field%density1

if (any (chunk%tiles(1)%field%density0 /= 7*2*reshape([1,2,3,4],[2,2]))) stop 1
if (any (chunk%tiles(1)%field%density1 /= 3*reshape([11,22,33,44],[2,2]))) stop 2

end
-------------------------

* * *

Tobias,
who is now trying to understand when things are supposed to end up in the
present table and when only the data and when the pointed-to data gets
mapped.

OpenMP 6.0 added ref_ptee, ref_ptr, and ref_ptr_ptee as map modifiers, which
might help to explain some fine print a bit better, as those are for mapping
the pointer vs. the pointee. (I think some fine print might have been
fixed in TR14 or post-TR14, i.e. reading the newest version possible might
help.)
