Hi,

On 15.12.22 20:42, Tobias Burnus wrote:
If the libgomp plugin doesn't request special
'host_to_dev_cpy'/'dev_to_host_cpy' for 'gomp_target_rev', then standard
'gomp_copy_host2dev'/'gomp_copy_dev2host' are used, which use
'gomp_device_copy', which expects the device to be locked.  (As can be
told by the unconditional 'gomp_mutex_unlock (&devicep->lock);' before
'gomp_fatal'.)  However, in a number of the
'gomp_copy_host2dev'/'gomp_copy_dev2host' calls from 'gomp_target_rev',
the device definitely is not locked; see

Actually, reading it + the source code again, I think it makes sense to
return a boolean – similar to devicep->host2dev_func and
devicep->dev2host_func — and possibly wrap it into some convenience
function, similar to gomp_device_copy – at least a bare exit() without
further diagnostic does not seem to userfriendly.

BTW: In line with the other code, you could use CUDA_CALL instead of
CUDA_CALL_ERET; the fomer already calls the latter with 'false' as first
argument + is used elsewhere.

Regarding the lock: It seems the problem is the copying of
devaddrs/sizes/kinds; this does not need any lock as the stack variables
are on the device and only used for this reverse offload. Thus, there is
no need for a lock as there are no races.

However, as the existing gomp_copy_dev2host removes the lock, we could
simply keep this lock – and probably should move it down to just before
the user-function call – removing all (non-error) locks and unlocks on
the way. — I mean something like the attached patch.

Finally, I think we need to find a solution for the issue Andrew tried
to address. — The current code invokes CUDA_CALL_ASSERT – which calls
GOMP_PLUGIN_fatal.

Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
diff --git a/libgomp/target.c b/libgomp/target.c
index e38cc3b6f1c..4b7233307cd 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -3319,5 +3319,6 @@ gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, uint64_t devaddrs_ptr,
   gomp_mutex_lock (&devicep->lock);
   n = gomp_map_lookup_rev (&devicep->mem_map_rev, &k);
-  gomp_mutex_unlock (&devicep->lock);
+  if (devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
+    gomp_mutex_unlock (&devicep->lock);
 
   if (n == NULL)
@@ -3409,5 +3410,4 @@ gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, uint64_t devaddrs_ptr,
       cdata = gomp_alloca (sizeof (*cdata) * mapnum);
       memset (cdata, '\0', sizeof (*cdata) * mapnum);
-      gomp_mutex_lock (&devicep->lock);
       for (uint64_t i = 0; i < mapnum; i++)
 	{
@@ -3643,4 +3643,5 @@ gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, uint64_t devaddrs_ptr,
       uint64_t struct_cpy = 0;
       bool clean_struct = false;
+      gomp_mutex_lock (&devicep->lock);
       for (uint64_t i = 0; i < mapnum; i++)
 	{
@@ -3695,5 +3696,5 @@ gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, uint64_t devaddrs_ptr,
 	      gomp_aligned_free ((void *) (uintptr_t) devaddrs[i]);
 	    }
-
+      gomp_mutex_unlock (&devicep->lock);
       free (devaddrs);
       free (sizes);

Reply via email to