[gomp4] Re: acc_on_device for device_type_host_nonshm
On 28/05/15 15:16, Julian Brown wrote: On Thu, 28 May 2015 04:48:58 -0700 H.J. Lu hjl.to...@gmail.com wrote: On Thu, May 21, 2015 at 4:10 AM, Jakub Jelinek ja...@redhat.com wrote: On Thu, May 21, 2015 at 01:02:12PM +0200, Thomas Schwinge wrote: Hi! On Thu, 7 May 2015 19:32:26 +0100, Julian Brown jul...@codesourcery.com wrote: Here's a new version of the patch [...] OK for trunk? Makes sense to me (with just a request to drop the testsuite changes, see below), to get the existing regressions under control. Jakub? Ok for trunk. PR libgomp/65742 gcc/ * builtins.c (expand_builtin_acc_on_device): Don't use open-coded sequence for !ACCEL_COMPILER. It breaks bootstrap on x86: https://gcc.gnu.org/ml/gcc-regression/2015-05/msg00389.html I checked in this to fix it. Apologies, and thanks! And backported it to gomp-4_0-branch. Thanks, - Tom Mark parameters with ATTRIBUTE_UNUSED 2015-06-04 Tom de Vries t...@codesourcery.com backport from trunk: 2015-05-28 H.J. Lu hongjiu...@intel.com * builtins.c (expand_builtin_acc_on_device): Mark parameters with ATTRIBUTE_UNUSED. --- gcc/builtins.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/builtins.c b/gcc/builtins.c index bfa9832..6574413 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -5915,7 +5915,8 @@ expand_stack_save (void) acceleration device (ACCEL_COMPILER conditional). */ static rtx -expand_builtin_acc_on_device (tree exp, rtx target) +expand_builtin_acc_on_device (tree exp ATTRIBUTE_UNUSED, + rtx target ATTRIBUTE_UNUSED) { #ifdef ACCEL_COMPILER if (!validate_arglist (exp, INTEGER_TYPE, VOID_TYPE)) -- 1.9.1
[PR libgomp/65742, PR middle-end/66332] XFAIL acc_on_device compile-time evaluation (was: acc_on_device for device_type_host_nonshm)
Hi! On Thu, 7 May 2015 19:32:26 +0100, Julian Brown jul...@codesourcery.com wrote: On Fri, 17 Apr 2015 15:16:19 +0200 Jakub Jelinek ja...@redhat.com wrote: On Tue, Apr 14, 2015 at 05:43:26PM +0200, Thomas Schwinge wrote: Really, acc_on_device is implemented as a compiler builtin (which is just disabled for a few libgomp test cases, in order to test the acc_on_device library function in libgomp), and I never understood why the fallback implementation in libgomp (cited above) should be doing anything different from the GCC builtin. Is the problem actually, that some The question is if the builtin expansion isn't wrong, at least as long as the host_nonshm device is meant to be supported. The #ifdef ACCEL_COMPILER case is easier, at least as long as ACCEL_COMPILER compiled code is not meant to be able to offload to other devices (or host again), but the non-ACCEL_COMPILER case means the code is either on the host, or host_nonshm, or e.g. with Intel MIC you could have some shared library be compiled by the host compiler, but then actuall linked into the MIC offloaded path. In all those cases, I think it is just the library that can determine the return value. E.g. OpenMP omp_is_initial_device function is also only implemented in the library, perhaps at some point I could expand it for #ifdef ACCEL_COMPILER as builtin, but not for the host code, at least not due to the host-nonshm plugin. Here's a new version of the patch that doesn't use the open-coded expansion for acc_on_device for the host compiler at all. This means that the host and the host_nonshm plugin should DTRT without any special compiler options (which have thus been removed from the libgomp tests that set them or refer to them). So now, for the host, acc_on_device returns: acc_on_device (acc_device_none): true acc_on_device (acc_device_host): true otherwise: false When the host_nonshm plugin is active, acc_on_device returns: acc_on_device (acc_device_host_nonshm): true (except when host fallback is in effect, i.e. because of a false if clause). acc_on_device (acc_device_not_host): likewise. otherwise: false In particular, the host_nonshm plugin doesn't consider itself to be running code on the host. PR libgomp/65742 gcc/ * builtins.c (expand_builtin_acc_on_device): Don't use open-coded sequence for !ACCEL_COMPILER. As reported in https://gcc.gnu.org/PR66332, this caused the following regression (C testing): PASS: c-c++-common/goacc/acc_on_device-2.c (test for excess errors) [-PASS:-]{+FAIL:+} c-c++-common/goacc/acc_on_device-2.c scan-rtl-dump-times expand \\(call [^\\n]* acc_on_device 0 Committed to trunk in r224028: commit 1c2d9da9cee04516151b3894edb107e3cdf2c8b9 Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Tue Jun 2 11:48:56 2015 + [PR libgomp/65742, PR middle-end/66332] XFAIL acc_on_device compile-time evaluation The OpenACC 2.0a specification mandates differently, but we currently do get a library call in the host code. PR libgomp/65742 PR middle-end/66332 gcc/testsuite/ * c-c++-common/goacc/acc_on_device-2.c: XFAIL for C, too. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@224028 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/testsuite/ChangeLog| 6 ++ gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c | 10 +- 2 files changed, 15 insertions(+), 1 deletion(-) diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog index d91cf7c..3f51b10 100644 --- gcc/testsuite/ChangeLog +++ gcc/testsuite/ChangeLog @@ -1,3 +1,9 @@ +2015-06-02 Thomas Schwinge tho...@codesourcery.com + + PR libgomp/65742 + PR middle-end/66332 + * c-c++-common/goacc/acc_on_device-2.c: XFAIL for C, too. + 2015-06-02 Uros Bizjak ubiz...@gmail.com * g++.dg/abi/mangle-regparm.C (dg-do): Fix x86_32 target selector. diff --git gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c index 8db0a66..6e3d292 100644 --- gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c +++ gcc/testsuite/c-c++-common/goacc/acc_on_device-2.c @@ -20,9 +20,17 @@ f (void) } /* With -fopenacc, we're expecting the builtin to be expanded, so no calls. + TODO: in C++, even under extern C, the use of enum for acc_device_t perturbs expansion as a builtin, which expects an int parameter. It's fine when changing acc_device_t to plain int, but that's not what we're doing in openacc.h. - { dg-final { scan-rtl-dump-times \\\(call \[^\\n\]* acc_on_device 0 expand { xfail c++ } } } */ + + TODO: given that we can't expand acc_on_device in + gcc/builtins.c:expand_builtin_acc_on_device for in the !ACCEL_COMPILER case + (because at that point we don't know whether we're acc_device_host or + acc_device_host_nonshm), we'll (erroneously) get a
Re: acc_on_device for device_type_host_nonshm
On Thu, May 21, 2015 at 4:10 AM, Jakub Jelinek ja...@redhat.com wrote: On Thu, May 21, 2015 at 01:02:12PM +0200, Thomas Schwinge wrote: Hi! On Thu, 7 May 2015 19:32:26 +0100, Julian Brown jul...@codesourcery.com wrote: Here's a new version of the patch [...] OK for trunk? Makes sense to me (with just a request to drop the testsuite changes, see below), to get the existing regressions under control. Jakub? Ok for trunk. PR libgomp/65742 gcc/ * builtins.c (expand_builtin_acc_on_device): Don't use open-coded sequence for !ACCEL_COMPILER. It breaks bootstrap on x86: https://gcc.gnu.org/ml/gcc-regression/2015-05/msg00389.html I checked in this to fix it. -- H.J. --- Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 223804) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2015-05-28 H.J. Lu hongjiu...@intel.com + + * builtins.c (expand_builtin_acc_on_device): Mark parameters + with ATTRIBUTE_UNUSED. + 2015-05-28 Julian Brown jul...@codesourcery.com PR libgomp/65742 Index: gcc/builtins.c === --- gcc/builtins.c (revision 223804) +++ gcc/builtins.c (working copy) @@ -5911,7 +5911,8 @@ acceleration device (ACCEL_COMPILER conditional). */ static rtx -expand_builtin_acc_on_device (tree exp, rtx target) +expand_builtin_acc_on_device (tree exp ATTRIBUTE_UNUSED, + rtx target ATTRIBUTE_UNUSED) { #ifdef ACCEL_COMPILER if (!validate_arglist (exp, INTEGER_TYPE, VOID_TYPE))
Re: acc_on_device for device_type_host_nonshm
On Thu, 28 May 2015 04:48:58 -0700 H.J. Lu hjl.to...@gmail.com wrote: On Thu, May 21, 2015 at 4:10 AM, Jakub Jelinek ja...@redhat.com wrote: On Thu, May 21, 2015 at 01:02:12PM +0200, Thomas Schwinge wrote: Hi! On Thu, 7 May 2015 19:32:26 +0100, Julian Brown jul...@codesourcery.com wrote: Here's a new version of the patch [...] OK for trunk? Makes sense to me (with just a request to drop the testsuite changes, see below), to get the existing regressions under control. Jakub? Ok for trunk. PR libgomp/65742 gcc/ * builtins.c (expand_builtin_acc_on_device): Don't use open-coded sequence for !ACCEL_COMPILER. It breaks bootstrap on x86: https://gcc.gnu.org/ml/gcc-regression/2015-05/msg00389.html I checked in this to fix it. Apologies, and thanks! Julian
Re: acc_on_device for device_type_host_nonshm
On Thu, May 21, 2015 at 01:02:12PM +0200, Thomas Schwinge wrote: Hi! On Thu, 7 May 2015 19:32:26 +0100, Julian Brown jul...@codesourcery.com wrote: Here's a new version of the patch [...] OK for trunk? Makes sense to me (with just a request to drop the testsuite changes, see below), to get the existing regressions under control. Jakub? Ok for trunk. PR libgomp/65742 gcc/ * builtins.c (expand_builtin_acc_on_device): Don't use open-coded sequence for !ACCEL_COMPILER. libgomp/ * oacc-init.c (plugin/plugin-host.h): Include. (acc_on_device): Check whether we're in an offloaded region for host_nonshm plugin. Don't use __builtin_acc_on_device. * plugin/plugin-host.c (GOMP_OFFLOAD_openacc_parallel): Set nonshm_exec flag in thread-local data. (GOMP_OFFLOAD_openacc_create_thread_data): Allocate thread-local data for host_nonshm plugin. (GOMP_OFFLOAD_openacc_destroy_thread_data): Free thread-local data for host_nonshm plugin. * plugin/plugin-host.h: New. * testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: Remove -fno-builtin-acc_on_device flag. * testsuite/libgomp.oacc-c-c++-common/if-1.c: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Remove comment re: acc_on_device builtin. * testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise. commit adccf2e7d313263d585f63e752a4d36653d47811 Jakub
Re: acc_on_device for device_type_host_nonshm
Hi! On Thu, 7 May 2015 19:32:26 +0100, Julian Brown jul...@codesourcery.com wrote: Here's a new version of the patch [...] OK for trunk? Makes sense to me (with just a request to drop the testsuite changes, see below), to get the existing regressions under control. Jakub? PR libgomp/65742 gcc/ * builtins.c (expand_builtin_acc_on_device): Don't use open-coded sequence for !ACCEL_COMPILER. libgomp/ * oacc-init.c (plugin/plugin-host.h): Include. (acc_on_device): Check whether we're in an offloaded region for host_nonshm plugin. Don't use __builtin_acc_on_device. * plugin/plugin-host.c (GOMP_OFFLOAD_openacc_parallel): Set nonshm_exec flag in thread-local data. (GOMP_OFFLOAD_openacc_create_thread_data): Allocate thread-local data for host_nonshm plugin. (GOMP_OFFLOAD_openacc_destroy_thread_data): Free thread-local data for host_nonshm plugin. * plugin/plugin-host.h: New. * testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: Remove -fno-builtin-acc_on_device flag. * testsuite/libgomp.oacc-c-c++-common/if-1.c: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Remove comment re: acc_on_device builtin. * testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise. commit adccf2e7d313263d585f63e752a4d36653d47811 Author: Julian Brown jul...@codesourcery.com Date: Tue Apr 21 12:40:45 2015 -0700 Non-SHM acc_on_device fixes diff --git a/gcc/builtins.c b/gcc/builtins.c index 6fe1456..5930fe4 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -5917,6 +5917,7 @@ expand_stack_save (void) static rtx expand_builtin_acc_on_device (tree exp, rtx target) { +#ifdef ACCEL_COMPILER if (!validate_arglist (exp, INTEGER_TYPE, VOID_TYPE)) return NULL_RTX; @@ -5925,13 +5926,8 @@ expand_builtin_acc_on_device (tree exp, rtx target) /* Return (arg == v1 || arg == v2) ? 1 : 0. */ machine_mode v_mode = TYPE_MODE (TREE_TYPE (arg)); rtx v = expand_normal (arg), v1, v2; -#ifdef ACCEL_COMPILER v1 = GEN_INT (GOMP_DEVICE_NOT_HOST); v2 = GEN_INT (ACCEL_COMPILER_acc_device); -#else - v1 = GEN_INT (GOMP_DEVICE_NONE); - v2 = GEN_INT (GOMP_DEVICE_HOST); -#endif machine_mode target_mode = TYPE_MODE (integer_type_node); if (!target || !register_operand (target, target_mode)) target = gen_reg_rtx (target_mode); @@ -5945,6 +5941,9 @@ expand_builtin_acc_on_device (tree exp, rtx target) emit_label (done_label); return target; +#else + return NULL; +#endif } diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c index 335ffd4..157147a 100644 --- a/libgomp/oacc-init.c +++ b/libgomp/oacc-init.c @@ -29,6 +29,7 @@ #include libgomp.h #include oacc-int.h #include openacc.h +#include plugin/plugin-host.h #include assert.h #include stdlib.h #include strings.h @@ -611,11 +612,18 @@ ialias (acc_set_device_num) int acc_on_device (acc_device_t dev) { - if (acc_get_device_type () == acc_device_host_nonshm) + struct goacc_thread *thr = goacc_thread (); + + /* We only want to appear to be the host_nonshm plugin from offloaded + code -- i.e. within a parallel region. Test a flag set by the + openacc_parallel hook of the host_nonshm plugin to determine that. */ + if (acc_get_device_type () == acc_device_host_nonshm + thr thr-target_tls + ((struct nonshm_thread *)thr-target_tls)-nonshm_exec) return dev == acc_device_host_nonshm || dev == acc_device_not_host; - /* Just rely on the compiler builtin. */ - return __builtin_acc_on_device (dev); + /* For OpenACC, libgomp is only built for the host, so this is sufficient. */ + return dev == acc_device_host || dev == acc_device_none; } ialias (acc_on_device) diff --git a/libgomp/plugin/plugin-host.c b/libgomp/plugin/plugin-host.c index 1faf5bc..3cb4dab 100644 --- a/libgomp/plugin/plugin-host.c +++ b/libgomp/plugin/plugin-host.c @@ -44,6 +44,7 @@ #include stdlib.h #include string.h #include stdio.h +#include stdbool.h #ifdef HOST_NONSHM_PLUGIN #define STATIC @@ -55,6 +56,10 @@ #define SELF host: #endif +#ifdef HOST_NONSHM_PLUGIN +#include plugin-host.h +#endif + STATIC const char * GOMP_OFFLOAD_get_name (void) { @@ -174,7 +179,10 @@ GOMP_OFFLOAD_openacc_parallel (void (*fn) (void *), void *targ_mem_desc __attribute__ ((unused))) { #ifdef HOST_NONSHM_PLUGIN + struct nonshm_thread *thd = GOMP_PLUGIN_acc_thread (); + thd-nonshm_exec = true; fn (devaddrs); + thd-nonshm_exec = false; #else fn (hostaddrs); #endif @@ -232,11 +240,20 @@ STATIC void * GOMP_OFFLOAD_openacc_create_thread_data (int ord __attribute__ ((unused))) { +#ifdef HOST_NONSHM_PLUGIN + struct nonshm_thread *thd += GOMP_PLUGIN_malloc
Re: acc_on_device for device_type_host_nonshm (was: libgomp nvptx plugin: rework initialisation and support the proposed load/unload hooks) (PR65742)
On Fri, 17 Apr 2015 15:16:19 +0200 Jakub Jelinek ja...@redhat.com wrote: On Tue, Apr 14, 2015 at 05:43:26PM +0200, Thomas Schwinge wrote: On Tue, 14 Apr 2015 15:15:02 +0100, Julian Brown jul...@codesourcery.com wrote: On Wed, 8 Apr 2015 17:58:56 +0300 Ilya Verbin iver...@gmail.com wrote: I see several regressions: FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_on_device-1.c -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/if-1.c -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test I think there may be multiple issues here. The attached patch addresses one -- acc_device_type not distinguishing between offloaded and host code with the host_nonshm plugin. (You mean acc_on_device?) --- libgomp/oacc-init.c (revision 221922) +++ libgomp/oacc-init.c (working copy) @@ -548,7 +549,14 @@ ialias (acc_set_device_num) int acc_on_device (acc_device_t dev) { - if (acc_get_device_type () == acc_device_host_nonshm) + struct goacc_thread *thr = goacc_thread (); + + /* We only want to appear to be the host_nonshm plugin from offloaded + code -- i.e. within a parallel region. Test a flag set by the + openacc_parallel hook of the host_nonshm plugin to determine that. */ + if (acc_get_device_type () == acc_device_host_nonshm + thr thr-target_tls + ((struct nonshm_thread *)thr-target_tls)-nonshm_exec) return dev == acc_device_host_nonshm || dev == acc_device_not_host; /* Just rely on the compiler builtin. */ Really, acc_on_device is implemented as a compiler builtin (which is just disabled for a few libgomp test cases, in order to test the acc_on_device library function in libgomp), and I never understood why the fallback implementation in libgomp (cited above) should be doing anything different from the GCC builtin. Is the problem actually, that some The question is if the builtin expansion isn't wrong, at least as long as the host_nonshm device is meant to be supported. The #ifdef ACCEL_COMPILER case is easier, at least as long as ACCEL_COMPILER compiled code is not meant to be able to offload to other devices (or host again), but the non-ACCEL_COMPILER case means the code is either on the host, or host_nonshm, or e.g. with Intel MIC you could have some shared library be compiled by the host compiler, but then actuall linked into the MIC offloaded path. In all those cases, I think it is just the library that can determine the return value. E.g. OpenMP omp_is_initial_device function is also only implemented in the library, perhaps at some point I could expand it for #ifdef ACCEL_COMPILER as builtin, but not for the host code, at least not due to the host-nonshm plugin. Here's a new version of the patch that doesn't use the open-coded expansion for acc_on_device for the host compiler at all. This means that the host and the host_nonshm plugin should DTRT without any special compiler options (which have thus been removed from the libgomp tests that set them or refer to them). So now, for the host, acc_on_device returns: acc_on_device (acc_device_none): true acc_on_device (acc_device_host): true otherwise: false When the host_nonshm plugin is active, acc_on_device returns: acc_on_device (acc_device_host_nonshm): true (except when host fallback is in effect, i.e. because of a false if clause). acc_on_device (acc_device_not_host): likewise. otherwise: false In particular, the host_nonshm plugin doesn't consider itself to be running code on the host. OK for trunk? Julian ChangeLog PR libgomp/65742 gcc/ * builtins.c (expand_builtin_acc_on_device): Don't use open-coded sequence for !ACCEL_COMPILER. libgomp/ * oacc-init.c (plugin/plugin-host.h): Include. (acc_on_device): Check whether we're in an offloaded region for host_nonshm plugin. Don't use __builtin_acc_on_device. * plugin/plugin-host.c (GOMP_OFFLOAD_openacc_parallel): Set nonshm_exec flag in thread-local data. (GOMP_OFFLOAD_openacc_create_thread_data): Allocate thread-local data for host_nonshm plugin. (GOMP_OFFLOAD_openacc_destroy_thread_data): Free thread-local data for host_nonshm plugin. * plugin/plugin-host.h: New. * testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: Remove -fno-builtin-acc_on_device flag. * testsuite/libgomp.oacc-c-c++-common/if-1.c: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Remove comment re: acc_on_device builtin. * testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise. * testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise.commit adccf2e7d313263d585f63e752a4d36653d47811 Author: Julian Brown jul...@codesourcery.com Date: Tue Apr 21 12:40:45 2015 -0700 Non-SHM acc_on_device fixes diff --git a/gcc/builtins.c
Re: acc_on_device for device_type_host_nonshm (was: libgomp nvptx plugin: rework initialisation and support the proposed load/unload hooks)
On Tue, Apr 14, 2015 at 05:43:26PM +0200, Thomas Schwinge wrote: On Tue, 14 Apr 2015 15:15:02 +0100, Julian Brown jul...@codesourcery.com wrote: On Wed, 8 Apr 2015 17:58:56 +0300 Ilya Verbin iver...@gmail.com wrote: I see several regressions: FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_on_device-1.c -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/if-1.c -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test I think there may be multiple issues here. The attached patch addresses one -- acc_device_type not distinguishing between offloaded and host code with the host_nonshm plugin. (You mean acc_on_device?) --- libgomp/oacc-init.c (revision 221922) +++ libgomp/oacc-init.c (working copy) @@ -548,7 +549,14 @@ ialias (acc_set_device_num) int acc_on_device (acc_device_t dev) { - if (acc_get_device_type () == acc_device_host_nonshm) + struct goacc_thread *thr = goacc_thread (); + + /* We only want to appear to be the host_nonshm plugin from offloaded + code -- i.e. within a parallel region. Test a flag set by the + openacc_parallel hook of the host_nonshm plugin to determine that. */ + if (acc_get_device_type () == acc_device_host_nonshm + thr thr-target_tls + ((struct nonshm_thread *)thr-target_tls)-nonshm_exec) return dev == acc_device_host_nonshm || dev == acc_device_not_host; /* Just rely on the compiler builtin. */ Really, acc_on_device is implemented as a compiler builtin (which is just disabled for a few libgomp test cases, in order to test the acc_on_device library function in libgomp), and I never understood why the fallback implementation in libgomp (cited above) should be doing anything different from the GCC builtin. Is the problem actually, that some The question is if the builtin expansion isn't wrong, at least as long as the host_nonshm device is meant to be supported. The #ifdef ACCEL_COMPILER case is easier, at least as long as ACCEL_COMPILER compiled code is not meant to be able to offload to other devices (or host again), but the non-ACCEL_COMPILER case means the code is either on the host, or host_nonshm, or e.g. with Intel MIC you could have some shared library be compiled by the host compiler, but then actuall linked into the MIC offloaded path. In all those cases, I think it is just the library that can determine the return value. E.g. OpenMP omp_is_initial_device function is also only implemented in the library, perhaps at some point I could expand it for #ifdef ACCEL_COMPILER as builtin, but not for the host code, at least not due to the host-nonshm plugin. Jakub
acc_on_device for device_type_host_nonshm (was: libgomp nvptx plugin: rework initialisation and support the proposed load/unload hooks)
Hi! On Tue, 14 Apr 2015 15:15:02 +0100, Julian Brown jul...@codesourcery.com wrote: On Wed, 8 Apr 2015 17:58:56 +0300 Ilya Verbin iver...@gmail.com wrote: I see several regressions: FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_on_device-1.c -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/if-1.c -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test I think there may be multiple issues here. The attached patch addresses one -- acc_device_type not distinguishing between offloaded and host code with the host_nonshm plugin. (You mean acc_on_device?) --- libgomp/oacc-init.c (revision 221922) +++ libgomp/oacc-init.c (working copy) @@ -548,7 +549,14 @@ ialias (acc_set_device_num) int acc_on_device (acc_device_t dev) { - if (acc_get_device_type () == acc_device_host_nonshm) + struct goacc_thread *thr = goacc_thread (); + + /* We only want to appear to be the host_nonshm plugin from offloaded + code -- i.e. within a parallel region. Test a flag set by the + openacc_parallel hook of the host_nonshm plugin to determine that. */ + if (acc_get_device_type () == acc_device_host_nonshm + thr thr-target_tls + ((struct nonshm_thread *)thr-target_tls)-nonshm_exec) return dev == acc_device_host_nonshm || dev == acc_device_not_host; /* Just rely on the compiler builtin. */ Really, acc_on_device is implemented as a compiler builtin (which is just disabled for a few libgomp test cases, in order to test the acc_on_device library function in libgomp), and I never understood why the fallback implementation in libgomp (cited above) should be doing anything different from the GCC builtin. Is the problem actually, that some libgomp test cases are expecting from acc_on_device for acc_device_host_nonshm a different answer than the one they're currently getting? What is the expected answer? Given that the OpenACC specification doesn't talk about a host_nonshm device type, can we accordingly define what the expected behavior is, so that we can just have libgomp/oacc-init.c:acc_on_device »rely on the compiler builtin«? Grüße, Thomas pgpBZ3vu24nLn.pgp Description: PGP signature