Hi all,
Comment to reviewers:
* Fortran: Except for ensuring that the version field in array descriptors
is always set to the default (zero), the generated code should only be
affected when -fopenmp-allocators is set, even though several files are
touched.
* Middle-end: BUILT_IN_GOMP_REALLOC has been added - otherwise untouched.
* Otherwise: smaller libgomp changes, conditions added to the (de)allocation
code in fortran/trans*.cc, and some checking updates (mostly openmp.cc).
* * *
GCC supports OpenMP's allocators, which typically work as follows:
my_c_ptr = omp_alloc (byte_size, my_allocator)
...
call omp_free (my_c_ptr, omp_null_allocator)
where (if called as such) the runtime has to find the allocator that was used
in order to handle the 'free' (and likewise: omp_realloc) correctly. libgomp
implements this by allocating a few more bytes and using the first bytes
to store the handle of the allocator, such that 'my_c_ptr minus size of handle'
is the address where the handle is stored. See also the OpenMP spec and:
https://gcc.gnu.org/onlinedocs/libgomp/OMP_005fALLOCATOR.html
https://gcc.gnu.org/onlinedocs/libgomp/Memory-Management-Routines.html
https://gcc.gnu.org/onlinedocs/libgomp/Memory-allocation.html
and https://gcc.gnu.org/wiki/cauldron2023 (OpenMP BoF; video recordings not
yet available, slide is)
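
To illustrate the "store the handle in front of the returned pointer" idea
described above, here is a minimal C sketch. It is not libgomp's actual code or
layout; the sketch_* names and the header union are hypothetical, and plain
malloc/free stand in for whatever the selected allocator would really do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocator handle type (an assumption for this sketch). */
typedef uintptr_t sketch_allocator_t;

/* Header placed directly in front of the user-visible memory.  The union
   with max_align_t keeps the user pointer suitably aligned.  */
typedef union {
  sketch_allocator_t allocator;
  max_align_t align_;
} sketch_header_t;

/* Allocate 'size' bytes plus room for the header and return the address
   just past the header.  */
void *sketch_alloc(size_t size, sketch_allocator_t allocator)
{
  sketch_header_t *hdr = malloc(sizeof(sketch_header_t) + size);
  if (hdr == NULL)
    return NULL;
  hdr->allocator = allocator;
  return hdr + 1;
}

/* Recover the handle: step back by one header from the user pointer.  */
sketch_allocator_t sketch_allocator_of(void *ptr)
{
  assert(ptr != NULL);
  return ((sketch_header_t *)ptr - 1)->allocator;
}

/* Free the whole block, i.e. starting at the header, not at 'ptr'.  */
void sketch_free(void *ptr)
{
  if (ptr != NULL)
    free((sketch_header_t *)ptr - 1);
}
```

The sketch also shows why mixing the two families is fatal: passing the pointer
returned by sketch_alloc directly to 'free' hands the C library an address it
never returned.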
FOR FORTRAN, OpenMP also permits allocating ALLOCATABLES and POINTERS as
follows:
!$omp allocators allocate(allocator(my_alloc), align(128) : A)
allocate(A(10), B)
A = [1,2,3] ! reallocate with same allocator
call intent_out_function(B) ! Has to use proper deallocator
deallocate(A) ! Likewise.
! end of scope deallocation: Likewise.
(Side remark: In 5.{1,2}, '!$omp allocate(A,B) allocator(my_alloc) align(123)'
is the syntax to use. It has nearly the same effect, except that for
non-specified variables, 'omp allocators' uses the normal Fortran allocation,
while an 'omp allocate' without a variable list uses that OpenMP allocator
for the nonlisted variables.)
* * *
The problem is really that 'malloc'ed memory has to be freed/realloced by 'free'
and 'realloc', while 'omp_alloc'ed memory has to be handled by 'omp_free'
and 'omp_realloc' - getting this wrong will nearly always crash the program.
I assume that the propagation depth is rather low, i.e. most likely all
deallocation will happen in the same file as the allocation, but that's not
guaranteed, and I bet that a few "leaks" to other files are likely in every
other software package.
* * *
ASSUMPTIONS for the attached implementation:
* Most OpenMP code will not use '!$omp allocators'
(Note: Using the API routines, 'allocate' clauses on block-associated
directives (like: 'omp parallel firstprivate(a) allocate(allocator(my_alloc) : a)'),
or 'omp allocate' for stack variables is separate and poses no problems.)
* The (de,re)allocation will not happen in hot code
* And, if used, the number of scalar variables of this kind will be small
SOLUTION as implemented:
* All code that uses 'omp allocators' and all code that might deallocate such
memory must be compiled with a special flag:
-fopenmp-allocators
This solves the issues:
- Always having an overhead even if -fopenmp code does not need it
- Permitting (de,re)allocation of such a variable from code which is not
compiled with -fopenmp
While -fopenmp-allocators could be auto-enabled when 'omp allocators' shows
up in a file, I decided to require it explicitly from the user in order to
highlight that other files might require the same flag, as they might do
(de,re)allocation on such memory.
* For ARRAYS, we fortunately can encode it in the descriptor. I (mis)use the
version field for this: version = 0 means the standard Fortran way, while
version = 1 means using omp_alloc and friends.
* For SCALARS, there is no way to store this. As (see assumptions) this is
neither in a hot path nor are there very many variables, we simply keep track
of such variables in a separate way (O(log2 N)) in libgomp, by keeping track
of the pointer address.
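
As a rough illustration of the two mechanisms above, here is a small C sketch.
All sketch_* names, the simplified descriptor and the fixed-size table are
assumptions made for this example; they are not the gfortran descriptor layout
or libgomp's actual data structure. The scalar part uses a sorted array with
binary search, so the lookup is O(log2 N); insertion and removal shift
elements, which a balanced tree would avoid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified array descriptor.  */
typedef struct {
  void *base_addr;
  int version; /* 0: standard Fortran allocation, 1: omp_alloc and friends */
} sketch_descriptor_t;

/* Arrays: the descriptor itself tells which deallocator applies.  */
bool sketch_array_uses_omp(const sketch_descriptor_t *desc)
{
  return desc->version == 1;
}

/* Scalars: sorted table of tracked addresses (fixed capacity for brevity).  */
enum { SKETCH_MAX_TRACKED = 64 };
static const void *sketch_tracked[SKETCH_MAX_TRACKED];
static size_t sketch_count;

/* Binary search: index of the first entry that is >= ptr.  */
static size_t sketch_lower_bound(const void *ptr)
{
  size_t lo = 0, hi = sketch_count;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if ((uintptr_t)sketch_tracked[mid] < (uintptr_t)ptr)
      lo = mid + 1;
    else
      hi = mid;
  }
  return lo;
}

/* Was this scalar allocated via the OpenMP allocator?  */
bool sketch_is_tracked(const void *ptr)
{
  size_t i = sketch_lower_bound(ptr);
  return i < sketch_count && sketch_tracked[i] == ptr;
}

/* Record an address; returns false only if the table is full.  */
bool sketch_track(const void *ptr)
{
  size_t i = sketch_lower_bound(ptr);
  if (i < sketch_count && sketch_tracked[i] == ptr)
    return true; /* already recorded */
  if (sketch_count == SKETCH_MAX_TRACKED)
    return false;
  memmove(&sketch_tracked[i + 1], &sketch_tracked[i],
          (sketch_count - i) * sizeof sketch_tracked[0]);
  sketch_tracked[i] = ptr;
  sketch_count++;
  return true;
}

/* Forget an address on deallocation; returns true if it was tracked,
   i.e. if the caller has to use the OpenMP deallocator.  */
bool sketch_untrack(const void *ptr)
{
  size_t i = sketch_lower_bound(ptr);
  if (i == sketch_count || sketch_tracked[i] != ptr)
    return false;
  memmove(&sketch_tracked[i], &sketch_tracked[i + 1],
          (sketch_count - i - 1) * sizeof sketch_tracked[0]);
  sketch_count--;
  return true;
}
```

In this sketch, deallocation code would first ask sketch_untrack (scalars) or
sketch_array_uses_omp (arrays) and only then choose between the regular and
the OpenMP deallocation routine.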
Disclaimer:
* I am not 100% sure that I have caught all corner cases for
deallocation/reallocation; however, it should cover most.
* One area that is probably not fully covered is BIND(C). A Fortran actual
argument to a BIND(C) intent(out) dummy should work (dealloced on the caller
side); however, once converted to a CFI descriptor, all deallocations will
likely fail, be it a later intrinsic-assignment realloc, CFI_deallocate or
'deallocate' after conversion to Fortran.
This can be fixed but requires (a) adding the how-allocated information to
the CFI descriptor, but not as version (as that is user visible), and
(b) handling it in CFI_deallocate. The latter will add a dependency on
'omp_free', which somehow has to be resolved.
(For instance via weak symbols, which are likely not supported on all
platforms.)
Thus, as a very special case, it has been left out - but it could be added. If
user code hits it, it should cause a repro