Ping.
On 2015/8/27 09:44 PM, Chung-Lin Tang wrote:
> We've discovered that, for several of the libgomp plugin interface routines,
> if the target specific routine calls exit() (usually upon a fatal condition),
> deadlock ensues. We found this using nvptx, but it's possible on intelmic as
> well.
>
> This is because many of the plugin routines are called with the device lock
> held. When exit() is called inside the plugin code, the GOMP_unregister_var()
> destructor tries to iterate through and acquire all device locks to clean up.
> Since we already hold one of the device locks, this just gets stuck. Also,
> because gomp_mutex_t is a simple futex-based lock implementation (instead of
> pthreads), we don't have a trylock mechanism to use either.
>
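To make the failure mode concrete, here is a minimal sketch of the self-deadlock described above. It is not libgomp code: `device_lock`, `cleanup_would_deadlock` and `plugin_call_with_lock_held` are hypothetical names, and a pthread mutex stands in for gomp_mutex_t. pthread_mutex_trylock() is used only so the example can report the problem instead of hanging; as noted above, gomp_mutex_t has no such operation.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for one device's lock (libgomp uses gomp_mutex_t).  */
static pthread_mutex_t device_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models the cleanup destructor, which needs the device lock.  Returns true
   if a plain blocking lock here would never return.  */
static bool
cleanup_would_deadlock (void)
{
  int err = pthread_mutex_trylock (&device_lock);
  if (err == EBUSY)
    return true;              /* Lock already held: a blocking lock hangs.  */
  assert (err == 0);
  pthread_mutex_unlock (&device_lock);
  return false;
}

/* Models a plugin routine that hits a fatal condition while the device lock
   is held.  Instead of really calling exit (), it invokes the cleanup
   directly to show what the exit-time destructor would run into.  */
static bool
plugin_call_with_lock_held (void)
{
  pthread_mutex_lock (&device_lock);
  bool stuck = cleanup_would_deadlock ();
  pthread_mutex_unlock (&device_lock);
  return stuck;
}
```

With the lock held, the cleanup step reports that it would block; once the lock has been released, the same cleanup step succeeds.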
> So this patch tries to alleviate the problem by changing the plugin interface:
> the plugin routines that are called while holding the device lock are adjusted
> to never exit fatally, but instead return a value to libgomp proper to
> indicate the execution result. The core libgomp code may then unlock and call
> gomp_fatal().
>
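The calling convention described above could look roughly like the sketch below. It is an illustration under assumptions, not the patch itself: `struct device`, `plugin_alloc`, `map_buffer` and `fatal_stub` are hypothetical names, a pthread mutex stands in for gomp_mutex_t, and `fatal_stub` only counts errors where the real code would call gomp_fatal(). The allocation hook returns bool and hands back the pointer through an out parameter, as the ChangeLog describes for alloc_func.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced device descriptor.  */
struct device
{
  pthread_mutex_t lock;
  bool (*alloc_func) (size_t size, void **ptr);  /* bool + out parameter.  */
};

/* Hypothetical plugin hook: reports failure instead of calling exit ().
   A zero-sized request is treated as an error purely for demonstration.  */
static bool
plugin_alloc (size_t size, void **ptr)
{
  *ptr = size ? malloc (size) : NULL;
  return *ptr != NULL;
}

static struct device dev = { PTHREAD_MUTEX_INITIALIZER, plugin_alloc };
static int fatal_count;

/* Stand-in for gomp_fatal (): checks that the device lock is free at this
   point, then records the error instead of terminating the process.  */
static void
fatal_stub (struct device *d)
{
  int err = pthread_mutex_trylock (&d->lock);
  assert (err == 0);
  pthread_mutex_unlock (&d->lock);
  fatal_count++;
}

/* Core-side caller: on failure, unlock first, then report the fatal error.  */
static void *
map_buffer (struct device *d, size_t size)
{
  void *ptr = NULL;
  pthread_mutex_lock (&d->lock);
  if (!d->alloc_func (size, &ptr))
    {
      pthread_mutex_unlock (&d->lock);
      fatal_stub (d);
      return NULL;
    }
  pthread_mutex_unlock (&d->lock);
  return ptr;
}
```

The point of the ordering in `map_buffer` is that the error is only reported after the unlock, so any exit-time cleanup that takes the device locks can proceed.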
> We believe this is the right route to solve the problem, since there are only
> two accel target plugins so far. Besides the nvptx plugin, I have made some
> effort to update the intelmic plugin as well, though it's not as thoroughly
> audited. Intel folks might want to further make sure your plugin code is free
> of this problem as well.
>
> This patch contains the libgomp proper changes. The nvptx and intelmic
> patches follow. I have tested the libgomp testsuite without regressions for
> both accel targets. Is this okay for trunk?
>
> Thanks,
> Chung-Lin
>
> 2015-08-27 Chung-Lin Tang <clt...@codesourcery.com>
>
> * oacc-host.c (host_init_device): Change return type to bool.
> (host_fini_device): Likewise.
> (host_dev2host): Likewise.
> (host_host2dev): Likewise.
> (host_free): Likewise.
> (host_alloc): Change return type to bool, change to use out
> parameter to return allocated pointer.
> * oacc-mem.c (acc_malloc): Adjust for plugin hook declaration change,
> handle fatal error.
> (acc_free): Likewise.
> (acc_memcpy_to_device): Likewise.
> (acc_memcpy_from_device): Likewise.
> * oacc-init.c (acc_init_1): Handle gomp_init_device return code,
> handle fatal error.
> (acc_set_device_type): Likewise.
> (acc_set_device_num): Likewise.
> * target.c (gomp_map_vars): Adjust alloc_func plugin hook call,
> add device unlock, handle fatal error.
> (gomp_unmap_tgt): Change return type to bool, adjust free_func
> plugin call.
> (gomp_copy_from_async): Handle dev2host_func return code, handle
> fatal error.
> (gomp_unmap_vars): Likewise.
> (gomp_init_device): Change return type to bool, adjust call to
> init_device_func plugin hook.
> (GOMP_target): Adjust call to gomp_init_device, handle fatal error.
> (GOMP_target_data): Likewise.
> (GOMP_target_update): Likewise.
> * libgomp.h (gomp_device_descr.init_device_func): Change return
> type to bool.
> (gomp_device_descr.fini_device_func): Likewise.
> (gomp_device_descr.free_func): Likewise.
> (gomp_device_descr.dev2host_func): Likewise.
> (gomp_device_descr.host2dev_func): Likewise.
> (gomp_device_descr.alloc_func): Change return type to bool, use out
> parameter to return pointer.
> (gomp_init_device): Change return type to bool.
>