Re: Merge DEF_GOACC_BUILTIN into DEF_GOMP_BUILTIN? (was: OpenACC middle end changes)
On Thu, Jul 09, 2015 at 05:52:20PM +0200, Thomas Schwinge wrote:
> --- gcc/builtins.def
> +++ gcc/builtins.def
> @@ -182,7 +182,9 @@ along with GCC; see the file COPYING3.  If not see
>  #define DEF_GOMP_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
>    DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE, \
>                 false, true, true, ATTRS, false, \
> -               (flag_openmp || flag_tree_parallelize_loops \
> +               (flag_openmp \
> +                || flag_tree_parallelize_loops > 1 \
> +                || flag_cilkplus \
>                  || flag_offload_abi != OFFLOAD_ABI_UNSET))
>
>  /* Builtin used by implementation of Cilk Plus.  Most of these are decomposed
>
> Before this patch, all DEF_GOMP_BUILTINs (erroneously) had always been
> available, due to flag_tree_parallelize_loops's default value of 1.
> With gcc/omp-low.c:lower_reduction_clauses using a
> BUILT_IN_GOMP_ATOMIC_START/BUILT_IN_GOMP_ATOMIC_END sequence as a last
> resort, and that being chosen for some kind of OpenACC reduction
> clauses (which is present on gomp-4_0-branch only), we're then running
> into ICEs, as those two DEF_GOMP_BUILTINs are not available with plain
> -fopenacc.
>
> Now, is there actually a good reason to have separate
> DEF_GOACC_BUILTIN and DEF_GOMP_BUILTIN directives (which I basically
> just initially did to be least intrusive,
> http://news.gmane.org/find-root.php?message_id=%3C1383766943-8863-6-git-send-email-thomas%40codesourcery.com%3E),
> or should I just add flag_openacc to DEF_GOMP_BUILTIN, and change all
> DEF_GOACC_BUILTIN instantiations to DEF_GOMP_BUILTIN?  Merging them
> definitely makes sense to me now, so OK to do the obvious?

Having DEF_GOMP_BUILTIN and DEF_GOACC_BUILTIN is nice, it tells you
whether it is an OpenMP or an OpenACC builtin.  I'd say just add
|| flag_openacc to DEF_GOMP_BUILTIN if you need it.  E.g. for
-ftree-parallelize-loops or -fcilkplus I doubt you want the OpenACC
builtins ;).

	Jakub
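To make the gating discussed in this message concrete, here is a minimal sketch of the availability condition as plain C. The `struct flags` type and both function names are hypothetical; only the flag names and the `> 1` comparison come from the quoted patch, and the `flag_openacc` term models the suggestion in the reply rather than anything confirmed as committed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the compiler's option state.  */
struct flags {
  bool flag_openmp;
  int flag_tree_parallelize_loops;  /* defaults to 1, per the thread */
  bool flag_cilkplus;
  bool offload_abi_set;  /* models flag_offload_abi != OFFLOAD_ABI_UNSET */
  bool flag_openacc;
};

/* Condition from the quoted patch (trunk r224745).  */
static bool gomp_builtin_available(const struct flags *f)
{
  return f->flag_openmp
         || f->flag_tree_parallelize_loops > 1
         || f->flag_cilkplus
         || f->offload_abi_set;
}

/* Same condition with the suggested "|| flag_openacc" term added.  */
static bool gomp_builtin_available_with_openacc(const struct flags *f)
{
  return gomp_builtin_available(f) || f->flag_openacc;
}
```

With only `flag_openacc` set and the parallelize-loops value left at its default of 1, the first predicate is false and the second is true, which is the plain -fopenacc situation described above.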
Merge DEF_GOACC_BUILTIN into DEF_GOMP_BUILTIN? (was: OpenACC middle end changes)
Hi!

On Thu, 13 Nov 2014 19:09:49 +0100, Jakub Jelinek ja...@redhat.com wrote:
> On Thu, Nov 13, 2014 at 05:59:11PM +0100, Thomas Schwinge wrote:
> > * should gcc/oacc-builtins.def just be merged into
> >   gcc/omp-builtins.def;
>
> Why not.  The reason why they aren't in gcc/builtins.def is that the
> Fortran FE doesn't source those, but OpenACC supports the same
> languages as OpenMP.

(We've done that,
http://news.gmane.org/find-root.php?message_id=%3C871tnxubgm.fsf%40schwinge.name%3E.)

Now, trying to merge trunk into gomp-4_0-branch, I've hit the problem
that Tom applied
http://news.gmane.org/find-root.php?message_id=%3C5583E052.2050207%40mentor.com%3E
in trunk r224745:

    --- gcc/builtins.def
    +++ gcc/builtins.def
    @@ -182,7 +182,9 @@ along with GCC; see the file COPYING3.  If not see
     #define DEF_GOMP_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
       DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE, \
                    false, true, true, ATTRS, false, \
    -               (flag_openmp || flag_tree_parallelize_loops \
    +               (flag_openmp \
    +                || flag_tree_parallelize_loops > 1 \
    +                || flag_cilkplus \
                     || flag_offload_abi != OFFLOAD_ABI_UNSET))

     /* Builtin used by implementation of Cilk Plus.  Most of these are decomposed

Before this patch, all DEF_GOMP_BUILTINs (erroneously) had always been
available, due to flag_tree_parallelize_loops's default value of 1.
With gcc/omp-low.c:lower_reduction_clauses using a
BUILT_IN_GOMP_ATOMIC_START/BUILT_IN_GOMP_ATOMIC_END sequence as a last
resort, and that being chosen for some kind of OpenACC reduction
clauses (which is present on gomp-4_0-branch only), we're then running
into ICEs, as those two DEF_GOMP_BUILTINs are not available with plain
-fopenacc.

Now, is there actually a good reason to have separate
DEF_GOACC_BUILTIN and DEF_GOMP_BUILTIN directives (which I basically
just initially did to be least intrusive,
http://news.gmane.org/find-root.php?message_id=%3C1383766943-8863-6-git-send-email-thomas%40codesourcery.com%3E),
or should I just add flag_openacc to DEF_GOMP_BUILTIN, and change all
DEF_GOACC_BUILTIN instantiations to DEF_GOMP_BUILTIN?  Merging them
definitely makes sense to me now, so OK to do the obvious?

Grüße,
 Thomas
Re: OpenACC middle end changes
On Wed, Nov 19, 2014 at 08:52:40PM +0100, Bernd Schmidt wrote:
> Another change that's required is (something like) the following.  For
> ptx, we need to know whether to output something as a .func (callable
> from ptx code) or a .kernel (callable from the host).  That means we
> need to mark the kernel functions somehow in omp-low.c, and the
> following does that by way of a new attribute (already recognized by
> the nvptx backend).

On second thought, I guess this is ok.  Adding a cgraph bit that is
interesting to just a single target and is quite rare is probably a
waste, especially when it would need to be streamed in and out in every
cgraph node.

As the nvptx backend already recognizes it and we have the
"omp declare target" attribute already, this is ok for trunk.

> 	* omp-low.c (create_omp_child_function): Tag entrypoint
> 	functions with a special attribute.
>
> diff --git a/gcc/omp-low.c b/gcc/omp-low.c
> index 42ba317..8408025 100644
> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -2228,6 +2228,12 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
>  	    break;
>  	  }
>      }
> +  if (cgraph_node::get_create (decl)->offloadable
> +      && !lookup_attribute ("omp declare target",
> +			    DECL_ATTRIBUTES (current_function_decl)))
> +    DECL_ATTRIBUTES (decl)
> +      = tree_cons (get_identifier ("omp target entrypoint"),
> +		   NULL_TREE, DECL_ATTRIBUTES (decl));
>    t = build_decl (DECL_SOURCE_LOCATION (decl),
>  		  RESULT_DECL, NULL_TREE, void_type_node);

	Jakub
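The condition in the patch above can be sketched in isolation as follows. This is a simplified model, not GCC code: `struct child_fn_info`, `enum ptx_fn_kind` and both functions are hypothetical names, and the two booleans stand in for the `offloadable` bit and the "omp declare target" attribute lookup.

```c
#include <stdbool.h>

/* Hypothetical summary of the two properties the patch inspects.  */
struct child_fn_info {
  bool offloadable;               /* models cgraph_node::offloadable */
  bool parent_is_declare_target;  /* enclosing function carries
                                     "omp declare target" */
};

/* How a backend might then classify the function.  */
enum ptx_fn_kind { PTX_DEVICE_FUNCTION, PTX_HOST_ENTRYPOINT };

/* Mirrors the patch: tag only offloadable child functions whose
   enclosing function is not itself "omp declare target".  */
static bool wants_entrypoint_attribute(const struct child_fn_info *fn)
{
  return fn->offloadable && !fn->parent_is_declare_target;
}

static enum ptx_fn_kind classify(const struct child_fn_info *fn)
{
  return wants_entrypoint_attribute(fn) ? PTX_HOST_ENTRYPOINT
                                        : PTX_DEVICE_FUNCTION;
}
```

Keeping the decision in an attribute rather than a per-node bit matches the reasoning in the reply: only one target cares about it, so nothing extra has to be streamed for every cgraph node.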
nvptx offloading: Tag entrypoint functions with a special attribute (was: OpenACC middle end changes)
Hi!

On Fri, 20 Feb 2015 10:47:13 +0100, Jakub Jelinek ja...@redhat.com wrote:
> On Wed, Nov 19, 2014 at 08:52:40PM +0100, Bernd Schmidt wrote:
> > Another change that's required is (something like) the following.
> > For ptx, we need to know whether to output something as a .func
> > (callable from ptx code) or a .kernel (callable from the host).
> > That means we need to mark the kernel functions somehow in
> > omp-low.c, and the following does that by way of a new attribute
> > (already recognized by the nvptx backend).
>
> On second thought, I guess this is ok.  Adding a cgraph bit that is
> interesting to just a single target and is quite rare is probably a
> waste, especially when it would need to be streamed in and out in
> every cgraph node.

Heh, that's precisely the question I had just drafted in an email, just
about to send, when your email arrived.  ;-)

> As the nvptx backend already recognizes it and we have the
> "omp declare target" attribute already, this is ok for trunk.

Bernd, are you going to commit this, and the other approved changes of
yours?

> > 	* omp-low.c (create_omp_child_function): Tag entrypoint
> > 	functions with a special attribute.
> >
> > diff --git a/gcc/omp-low.c b/gcc/omp-low.c
> > index 42ba317..8408025 100644
> > --- a/gcc/omp-low.c
> > +++ b/gcc/omp-low.c
> > @@ -2228,6 +2228,12 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
> >  	    break;
> >  	  }
> >      }
> > +  if (cgraph_node::get_create (decl)->offloadable
> > +      && !lookup_attribute ("omp declare target",
> > +			    DECL_ATTRIBUTES (current_function_decl)))
> > +    DECL_ATTRIBUTES (decl)
> > +      = tree_cons (get_identifier ("omp target entrypoint"),
> > +		   NULL_TREE, DECL_ATTRIBUTES (decl));
> >    t = build_decl (DECL_SOURCE_LOCATION (decl),
> >  		  RESULT_DECL, NULL_TREE, void_type_node);

Grüße,
 Thomas
Re: OpenACC middle end changes
Hi!

On Thu, 18 Dec 2014 14:16:52 +0100, I wrote:
> --- /dev/null
> +++ gcc/config/i386/intelmic-offload.h
> +#define ACCEL_COMPILER_acc_device GOMP_DEVICE_INTEL_MIC

This one I got right...

> --- /dev/null
> +++ gcc/config/nvptx/offload.h
> @@ -0,0 +1,35 @@
> +#define ACCEL_COMPILER_acc_device GOMP_TARGET_NVIDIA_PTX

..., but not this one.  Committed to trunk in r220686:

commit 8fbeb4361af9e77c57d3b15c7be11759a4f608c0
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Fri Feb 13 16:20:01 2015 +

    GOMP_TARGET_* have been renamed to GOMP_DEVICE_* some time ago.

    	gcc/
    	* config/nvptx/offload.h (ACCEL_COMPILER_acc_device): Define to
    	GOMP_DEVICE_NVIDIA_PTX.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@220686 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog              |    5 +
 gcc/config/nvptx/offload.h |    2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git gcc/ChangeLog gcc/ChangeLog
index e06f69a..d9c58b9 100644
--- gcc/ChangeLog
+++ gcc/ChangeLog
@@ -1,3 +1,8 @@
+2015-02-13  Thomas Schwinge  tho...@codesourcery.com
+
+	* config/nvptx/offload.h (ACCEL_COMPILER_acc_device): Define to
+	GOMP_DEVICE_NVIDIA_PTX.
+
 2015-02-13  Jakub Jelinek  ja...@redhat.com

 	PR ipa/65034
diff --git gcc/config/nvptx/offload.h gcc/config/nvptx/offload.h
index 02c5e8b..9a749a2 100644
--- gcc/config/nvptx/offload.h
+++ gcc/config/nvptx/offload.h
@@ -30,6 +30,6 @@

 #include "gomp-constants.h"

-#define ACCEL_COMPILER_acc_device GOMP_TARGET_NVIDIA_PTX
+#define ACCEL_COMPILER_acc_device GOMP_DEVICE_NVIDIA_PTX

 #endif

Grüße,
 Thomas
Re: OpenACC middle end changes
Hi!

On Thu, 13 Nov 2014 19:09:49 +0100, Jakub Jelinek ja...@redhat.com wrote:
> On Thu, Nov 13, 2014 at 05:59:11PM +0100, Thomas Schwinge wrote:
> > --- gcc/builtins.c
> > +++ gcc/builtins.c
> > +/* Expand OpenACC acc_on_device.
> > +
> > +   This has to happen late (that is, not in early folding; expand_builtin_*,
> > +   rather than fold_builtin_*), as we have to act differently for host and
> > +   acceleration device (ACCEL_COMPILER conditional).  */
> > +
> > +static rtx
> > +expand_builtin_acc_on_device (tree exp, rtx target ATTRIBUTE_UNUSED)
> > +{
> > +  if (!validate_arglist (exp, INTEGER_TYPE, VOID_TYPE))
> > +    return NULL_RTX;
> > +
> > +  tree arg, v1, v2, ret;
> > +  location_t loc;
> > +
> > +  arg = CALL_EXPR_ARG (exp, 0);
> > +  arg = builtin_save_expr (arg);
> > +  loc = EXPR_LOCATION (exp);
> > +
> > +  /* Build: (arg == v1 || arg == v2) ? 1 : 0.  */
> > +
> > +#ifdef ACCEL_COMPILER
> > +  v1 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_not_host */ 3);
> > +  v2 = build_int_cst (TREE_TYPE (arg), ACCEL_COMPILER_acc_device);
> > +#else
> > +  v1 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_none */ 0);
> > +  v2 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_host */ 2);
> > +#endif
> > +
> > +  v1 = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, v1);
> > +  v2 = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, v2);
> > +
> > +  /* Can't use TRUTH_ORIF_EXPR, as that is not supported by
> > +     expand_expr_real*.  */
> > +  ret = fold_build3_loc (loc, COND_EXPR, integer_type_node, v1, v1, v2);
> > +  ret = fold_build3_loc (loc, COND_EXPR, integer_type_node,
> > +			 ret, integer_one_node, integer_zero_node);
> > +
> > +  return expand_normal (ret);
>
> If you can't fold it late (which is indeed a problem for -O0),
> then I'd suggest to implement this more RTL-ish.
> So, avoid the builtin_save_expr, instead
>   rtx op = expand_normal (arg);
> Don't build v1/v2 as trees (and, please fix the TODOs), but rtxes,

(acc_device_* TODOs already resolved earlier on.)

> just
>   rtx v1 = GEN_INT (...);
>   rtx v2 = GEN_INT (...);
>   machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
>   rtx ret = gen_reg_rtx (TYPE_MODE (integer_type_node));
>   emit_move_insn (ret, const0_rtx);
>   rtx_code_label *done_label = gen_label_rtx ();
>   emit_cmp_and_jump_insns (op, v1, NE, NULL_RTX, mode,
> 			   false, done_label, PROB_EVEN);
>   emit_cmp_and_jump_insns (op, v2, NE, NULL_RTX, mode,
> 			   false, done_label, PROB_EVEN);
>   emit_move_insn (ret, const1_rtx);
>   emit_label (done_label);
>   return ret;
> or similar.

Thanks for the review/suggestion/code!

> Note, it would still be worthwhile to fold the builtin, at least
> when optimizing, after IPA.  Dunno if we have some property you can
> check, and Richard B. could suggest where it would be most
> appropriate (if GIMPLE guarded match.pd entry, or what), gimple_fold,
> etc.

I'll make a note to have a look at that later on.

> > I bet I should handle omp_is_initial_device (); similarly.
>
> Yeah.

Committed to gomp-4_0-branch in r218858:

commit da5ad5aec1c0f9b230ecb2dc00620a5598de5066
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Thu Dec 18 10:42:30 2014 +

    OpenACC acc_on_device: Make builtin expansion more RTXy.

    	gcc/
    	* builtins.c (expand_builtin_acc_on_device): Make more RTXy.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@218858 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |  5 +
 gcc/builtins.c     | 44 +---
 2 files changed, 26 insertions(+), 23 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index b370616..a3650c5 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2014-12-18  Thomas Schwinge  tho...@codesourcery.com
+	    Jakub Jelinek  ja...@redhat.com
+
+	* builtins.c (expand_builtin_acc_on_device): Make more RTXy.
+
 2014-12-17  Thomas Schwinge  tho...@codesourcery.com
 	    Bernd Schmidt  ber...@codesourcery.com

diff --git gcc/builtins.c gcc/builtins.c
index fcf3f53..e946521 100644
--- gcc/builtins.c
+++ gcc/builtins.c
@@ -5889,38 +5889,36 @@ expand_stack_save (void)
    acceleration device (ACCEL_COMPILER conditional).  */

 static rtx
-expand_builtin_acc_on_device (tree exp, rtx target ATTRIBUTE_UNUSED)
+expand_builtin_acc_on_device (tree exp, rtx target)
 {
   if (!validate_arglist (exp, INTEGER_TYPE, VOID_TYPE))
     return NULL_RTX;

-  tree arg, v1, v2, ret;
-  location_t loc;
-
-  arg = CALL_EXPR_ARG (exp, 0);
-  arg = builtin_save_expr (arg);
-  loc = EXPR_LOCATION (exp);
-
-  /* Build: (arg == v1 || arg == v2) ? 1 : 0.  */
+  tree arg = CALL_EXPR_ARG (exp, 0);

+  /* Return (arg == v1 || arg == v2) ? 1 : 0.  */
+  machine_mode v_mode = TYPE_MODE (TREE_TYPE (arg));
+  rtx v = expand_normal (arg), v1, v2;
 #ifdef ACCEL_COMPILER
-  v1 = build_int_cst (TREE_TYPE (arg), GOMP_DEVICE_NOT_HOST);
-  v2 = build_int_cst (TREE_TYPE (arg),
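As a reading aid for the message above, the value that the acc_on_device expansion is meant to produce can be written as ordinary C. This is only a model of the stated `(arg == v1 || arg == v2) ? 1 : 0` comment: the enum names are made up, the numeric values are the ones given in the TODO comments of the first version of the patch, and the accelerator's own device code is passed as a parameter because its value is target specific.

```c
/* Values as given in the TODO comments of the quoted patch.  */
enum {
  ACC_DEVICE_NONE = 0,
  ACC_DEVICE_HOST = 2,
  ACC_DEVICE_NOT_HOST = 3
};

/* What the host compiler's expansion should evaluate to.  */
static int acc_on_device_host(int arg)
{
  return (arg == ACC_DEVICE_NONE || arg == ACC_DEVICE_HOST) ? 1 : 0;
}

/* What an offload compiler's expansion should evaluate to.
   accel_device models ACCEL_COMPILER_acc_device.  */
static int acc_on_device_accel(int arg, int accel_device)
{
  return (arg == ACC_DEVICE_NOT_HOST || arg == accel_device) ? 1 : 0;
}
```

The two functions correspond to the two arms of the `#ifdef ACCEL_COMPILER` conditional, which is why the decision cannot be folded before the host and offload compilations diverge.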
Re: OpenACC middle end changes
> directives I find the
>   gcc_assert (!is_gimple_omp_oacc_specifically (ctx->stmt));
> completely unnecessary.

Now all removed.  My thinking was that some of those clauses are
parsed/generated not only in the front ends, but also synthesized in
middle end processing, and I wanted to catch those.

> > @@ -1625,13 +1799,41 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
> >  	case OMP_CLAUSE_DIST_SCHEDULE:
> >  	case OMP_CLAUSE_DEPEND:
> >  	case OMP_CLAUSE__CILK_FOR_COUNT_:
> > +	  gcc_assert (!is_gimple_omp_oacc_specifically (ctx->stmt));
> > +	  /* FALLTHRU */
> > +	case OMP_CLAUSE_IF:
>
> [...] if there are some spots you want to keep them in for now,
> consider gcc_checking_assert instead.

Now using this a few times.

> > --- gcc/tree-nested.c
> > +++ gcc/tree-nested.c
> > @@ -627,6 +627,8 @@ walk_gimple_omp_for (gimple for_stmt,
> >  		     walk_stmt_fn callback_stmt, walk_tree_fn callback_op,
> >  		     struct nesting_info *info)
> >  {
> > +  gcc_assert (!is_gimple_omp_oacc_specifically (for_stmt));
> > +
>
> That surely can be reached and you can easily construct testcase,
> can't you?
>
> > @@ -1323,6 +1325,10 @@ convert_nonlocal_reference_stmt (gimple_stmt_iterator *gsi, bool *handled_ops_p,
> >  	}
> >        break;
> >
> > +    case GIMPLE_OACC_KERNELS:
> > +    case GIMPLE_OACC_PARALLEL:
> > +      gcc_unreachable ();
> > +
>
> Ditto etc.

Same reasoning as for gimple_copy given above.  (And,
asserts/gcc_unreachable now all gone.)

Do you want me to repost the OpenACC Middle End changes patch, or would
you be OK with reviewing the code on gomp-4_0-branch, diffing against
the last trunk merge point, 0fcfaa33cbf333ac69cc2b01a7277e5272ff8a3d,
r218679?

Grüße,
 Thomas
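For readers unfamiliar with the distinction drawn above between an always-on assert and a checking-only assert, here is a small illustrative sketch. The `sketch_*` macros and `SKETCH_ENABLE_CHECKING` are hypothetical names invented for this example and are not GCC's actual definitions; `scan_clause` is likewise a placeholder.

```c
#include <assert.h>

/* Hypothetical switch standing in for a "checking enabled" build.  */
#ifndef SKETCH_ENABLE_CHECKING
#define SKETCH_ENABLE_CHECKING 1
#endif

/* Always evaluated, like a plain assert.  */
#define sketch_assert(expr) assert(expr)

/* Evaluated only in checking builds.  In other builds the expression
   is still parsed and type-checked but never executed.  */
#if SKETCH_ENABLE_CHECKING
#define sketch_checking_assert(expr) assert(expr)
#else
#define sketch_checking_assert(expr) ((void) (0 && (expr)))
#endif

/* Placeholder for a clause-scanning routine that wants to catch
   unexpected OpenACC statements during development only.  */
static int scan_clause(int is_oacc_stmt, int clause_code)
{
  sketch_checking_assert(!is_oacc_stmt);
  return clause_code + 1;  /* stands in for the real work */
}
```

The design point is that the check costs nothing in a release configuration while still documenting the expectation in the source.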
Re: OpenACC middle end changes
On Thu, Dec 18, 2014 at 11:46:00AM +0100, Thomas Schwinge wrote:
> > just
> >   rtx v1 = GEN_INT (...);
> >   rtx v2 = GEN_INT (...);
> >   machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
> >   rtx ret = gen_reg_rtx (TYPE_MODE (integer_type_node));
> >   emit_move_insn (ret, const0_rtx);
> >   rtx_code_label *done_label = gen_label_rtx ();
> >   emit_cmp_and_jump_insns (op, v1, NE, NULL_RTX, mode,
> > 			   false, done_label, PROB_EVEN);
> >   emit_cmp_and_jump_insns (op, v2, NE, NULL_RTX, mode,
> > 			   false, done_label, PROB_EVEN);
> >   emit_move_insn (ret, const1_rtx);
> >   emit_label (done_label);
> >   return ret;
> > or similar.
>
> Thanks for the review/suggestion/code!

Note, as I found later, emit_cmp_and_jump_insns is good enough only for
certain modes on certain architectures (in particular, for cases where
can_compare_p returns true).  So it is better to use
do_compare_rtx_and_jump instead of emit_cmp_and_jump_insns, because it
also handles the cases which emit_cmp_and_jump_insns silently
mishandles.  You'll need to reorder the arguments a little bit and add
one NULL_RTX argument.

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63848#c4

	Jakub
Re: OpenACC middle end changes
On Thu, Dec 18, 2014 at 12:07:01PM +0100, Thomas Schwinge wrote:
> > > -	case GF_OMP_FOR_KIND_SIMD:
> > > -	  kind = " simd";
> > > -	  break;
> > > -	case GF_OMP_FOR_KIND_CILKSIMD:
> > > -	  kind = " cilksimd";
> > > -	  break;
> > >  	case GF_OMP_FOR_KIND_DISTRIBUTE:
> > >  	  kind = " distribute";
> > >  	  break;
> > >  	case GF_OMP_FOR_KIND_CILKFOR:
> > >  	  kind = " _Cilk_for";
> > >  	  break;
> > > +	case GF_OMP_FOR_KIND_OACC_LOOP:
> > > +	  kind = " oacc_loop";
> > > +	  break;
> > > +	case GF_OMP_FOR_KIND_SIMD:
> > > +	  kind = " simd";
> > > +	  break;
> > > +	case GF_OMP_FOR_KIND_CILKSIMD:
> > > +	  kind = " cilksimd";
> > > +	  break;
> >
> > Why the reshuffling?  The result isn't alphabetically sorted anyway.
> > I'd just add new stuff at the end ;)
>
> It's the order in which the GF_OMP_FOR_KIND_* are defined.  At least
> for my mind ;-) that makes it very easy to grasp that all of them are
> covered.

Ok.

> > > +/* Return true if STMT is any of the OpenACC types specifically.  */
> > > +
> > > +static inline bool
> > > +is_gimple_omp_oacc_specifically (const_gimple stmt)
> >
> > Why not is_gimple_oacc or gimple_oacc_p ?
>
> The idea is to make it clear in the name that STMT must be an OMP one.
> Now renamed to the shorter is_gimple_omp_oacc.

Ok.

> > If you want to shift from bitmasks in the enum to extra on the side
> > bits (why?), then combined for parallel is another thing.
>
> Right, but I've now dropped (reverted) this and further gimplification
> changes.  Maybe this is material for the next stage 1, but maybe not
> useful enough.

Ack.

> Do you want me to repost the OpenACC Middle End changes patch, or
> would you be OK with reviewing the code on gomp-4_0-branch, diffing
> against the last trunk merge point,
> 0fcfaa33cbf333ac69cc2b01a7277e5272ff8a3d, r218679?

So, is what is on the gomp-4_0-branch now all that you'd like to merge
to trunk now?
Has it been tested on nvptx?
I guess we should test it with XeonPhi offloading too to make sure it
doesn't break.
And then you or together with your coworkers should write the summary
ChangeLogs (i.e. what changed compared to trunk in a single giant entry
for each ChangeLog file as opposed to many ChangeLog.gomp change
entries).

	Jakub
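The point made above about keeping the switch cases in the order in which the enumerators are defined can be illustrated with a small sketch. The enum, its enumerator names and the returned strings are simplified stand-ins for the GF_OMP_FOR_KIND_* constants, not the GCC definitions.

```c
#include <stddef.h>

/* Hypothetical enum, listed in the same order as the reshuffled
   switch in the quoted diff.  */
enum for_kind {
  KIND_DISTRIBUTE,
  KIND_CILKFOR,
  KIND_OACC_LOOP,
  KIND_SIMD,
  KIND_CILKSIMD
};

enum { N_FOR_KINDS = KIND_CILKSIMD + 1 };

/* Cases follow the enum's definition order, so a reader can compare
   the two lists line by line and spot a missing entry.  */
static const char *for_kind_name(enum for_kind k)
{
  switch (k) {
  case KIND_DISTRIBUTE: return "distribute";
  case KIND_CILKFOR:    return "_Cilk_for";
  case KIND_OACC_LOOP:  return "oacc_loop";
  case KIND_SIMD:       return "simd";
  case KIND_CILKSIMD:   return "cilksimd";
  }
  return NULL;  /* reached only for a value outside the enum */
}

/* Returns 1 if every enumerator has a name.  */
static int all_kinds_named(void)
{
  for (int k = 0; k < N_FOR_KINDS; k++)
    if (for_kind_name((enum for_kind) k) == NULL)
      return 0;
  return 1;
}
```

Whether new entries go at the end or in definition order is a matter of taste, as the exchange above shows; the sketch only demonstrates why definition order makes coverage easy to eyeball.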
Re: OpenACC middle end changes
On Thu, Dec 18, 2014 at 12:38:53PM +0100, Jakub Jelinek wrote:
> So, is what is on the gomp-4_0-branch now all that you'd like to merge
> to trunk now?
> Has it been tested on nvptx?
> I guess we should test it with XeonPhi offloading too to make sure it
> doesn't break.
> And then you or together with your coworkers should write the summary
> ChangeLogs (i.e. what changed compared to trunk in a single giant
> entry for each ChangeLog file as opposed to many ChangeLog.gomp change
> entries).

Also, it would be nice to update wiki/Offloading to give details on all
the steps how to configure nvptx offloading (how to grab nvptx-newlib,
nvptx-tools, how to configure nvptx-none compiler, in what order to
build those etc.).

	Jakub
Re: OpenACC middle end changes
On Thu, Dec 18, 2014 at 01:02:22PM +0100, Jakub Jelinek wrote:
> On Thu, Dec 18, 2014 at 12:38:53PM +0100, Jakub Jelinek wrote:
> > So, is what is on the gomp-4_0-branch now all that you'd like to
> > merge to trunk now?
> > Has it been tested on nvptx?
> > I guess we should test it with XeonPhi offloading too to make sure
> > it doesn't break.
> > And then you or together with your coworkers should write the
> > summary ChangeLogs (i.e. what changed compared to trunk in a single
> > giant entry for each ChangeLog file as opposed to many
> > ChangeLog.gomp change entries).
>
> Also, it would be nice to update wiki/Offloading to give details on
> all the steps how to configure nvptx offloading (how to grab
> nvptx-newlib, nvptx-tools, how to configure nvptx-none compiler, in
> what order to build those etc.).

FYI, just tried to build gomp-4_0-branch with:

../configure --build=x86_64-intelmicemul-linux-gnu --host=x86_64-intelmicemul-linux-gnu --target=x86_64-intelmicemul-linux-gnu --enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-bootstrap
make -j16

and the build failed with:

../../gcc/builtins.c: In function ‘rtx_def* expand_builtin_acc_on_device(tree, rtx)’:
../../gcc/builtins.c:5904:17: error: ‘ACCEL_COMPILER_acc_device’ was not declared in this scope
   v2 = GEN_INT (ACCEL_COMPILER_acc_device);
                 ^
../../gcc/rtl.h:3186:51: note: in definition of macro ‘GEN_INT’
 #define GEN_INT(N)  gen_rtx_CONST_INT (VOIDmode, (N))
                                                   ^

Where is the ACCEL_COMPILER_acc_device macro supposed to be defined?

	Jakub
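The failure reported above comes from a macro that only an offload target's header is expected to provide. A minimal sketch of that dependency follows; `ON_DEVICE_SECOND_VALUE` and `second_match_value` are invented for illustration, and the `#error` guard is not something proposed in the thread, it merely shows where a missing per-target definition could be diagnosed more directly.

```c
/* Compile with -DACCEL_COMPILER to model an offload compiler build.
   In that configuration some target header must also define
   ACCEL_COMPILER_acc_device.  */
#ifdef ACCEL_COMPILER
# ifndef ACCEL_COMPILER_acc_device
#  error "offload target must define ACCEL_COMPILER_acc_device"
# endif
# define ON_DEVICE_SECOND_VALUE ACCEL_COMPILER_acc_device
#else
/* Host build: 2 is the acc_device_host value given in the thread.  */
# define ON_DEVICE_SECOND_VALUE 2
#endif

/* The second constant the acc_on_device expansion compares against.  */
static int second_match_value(void)
{
  return ON_DEVICE_SECOND_VALUE;
}
```

In a host build the function simply yields the host constant; in an offload build without the target header the translation unit stops at the `#error` line instead of failing later inside a macro expansion.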
Re: OpenACC middle end changes
Hi Jakub!

On Thu, 18 Dec 2014 13:15:38 +0100, Jakub Jelinek ja...@redhat.com wrote:
> On Thu, Dec 18, 2014 at 01:02:22PM +0100, Jakub Jelinek wrote:
> > On Thu, Dec 18, 2014 at 12:38:53PM +0100, Jakub Jelinek wrote:
> > > So, is what is on the gomp-4_0-branch now all that you'd like to
> > > merge to trunk now?

Basically, yes.  Only basically, because there are still a few
unaddressed review issues in the front ends -- which I'll look into
now.  (Meaning really: now.)  :-)

Doing the merge as one big commit on trunk will be the easiest
approach, of course.  Is that OK, or is there any requirement to single
out any of the changes, such as the
libgomp/testsuite/libgomp-test-support.exp file just discussed, or the
libgomp »Offloading and Multi Processing Runtime Library« renaming, or
anything else?

> > > Has it been tested on nvptx?

I have always been testing on gomp-4_0-branch with ACC_DEVICE_TYPE=host
and ACC_DEVICE_TYPE=host_nonshm, plus with ACC_DEVICE_TYPE=nvidia in an
internal branch.  This branch corresponds to gomp-4_0-branch, but also
includes a few patches related to offloading that Bernd has posted for
trunk approval, but has not yet gotten approved.

> > > I guess we should test it with XeonPhi offloading too to make sure
> > > it doesn't break.

Right.  Do you happen to be set up for such testing?  I have not yet
managed to properly change my build/test scripts for
x86_64-intelmicemul-linux-gnu.

> > > And then you or together with your coworkers should write the
> > > summary ChangeLogs (i.e. what changed compared to trunk in a
> > > single giant entry for each ChangeLog file as opposed to many
> > > ChangeLog.gomp change entries).

Right, I'll do that once it's time to merge -- otherwise it'll be too
cumbersome to keep those up to date.

> > Also, it would be nice to update wiki/Offloading to give details on
> > all the steps how to configure nvptx offloading (how to grab
> > nvptx-newlib, nvptx-tools, how to configure nvptx-none compiler, in
> > what order to build those etc.).

Right.

> FYI, just tried to build gomp-4_0-branch with:
>
> ../configure --build=x86_64-intelmicemul-linux-gnu --host=x86_64-intelmicemul-linux-gnu --target=x86_64-intelmicemul-linux-gnu --enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-bootstrap
> make -j16
>
> and the build failed with:
>
> ../../gcc/builtins.c: In function ‘rtx_def* expand_builtin_acc_on_device(tree, rtx)’:
> ../../gcc/builtins.c:5904:17: error: ‘ACCEL_COMPILER_acc_device’ was not declared in this scope
>    v2 = GEN_INT (ACCEL_COMPILER_acc_device);
>                  ^
> ../../gcc/rtl.h:3186:51: note: in definition of macro ‘GEN_INT’
>  #define GEN_INT(N)  gen_rtx_CONST_INT (VOIDmode, (N))
>                                                    ^
>
> Where is the ACCEL_COMPILER_acc_device macro supposed to be defined?

From b6781092de7cc9fc8c24600815e0a1223e1241f5 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge tho...@codesourcery.com
Date: Wed, 17 Dec 2014 10:10:53 +0100
Subject: [PATCH] Intel MIC offloading.

---
 gcc/config.gcc                     |  1 +
 gcc/config/i386/intelmic-offload.h | 35 +++
 2 files changed, 36 insertions(+)
 create mode 100644 gcc/config/i386/intelmic-offload.h

diff --git gcc/config.gcc gcc/config.gcc
index 8541274..faad47d 100644
--- gcc/config.gcc
+++ gcc/config.gcc
@@ -2906,6 +2906,7 @@ esac
 case ${target} in
 *-intelmic-* | *-intelmicemul-*)
 	tmake_file="${tmake_file} i386/t-intelmic"
+	tm_file="${tm_file} i386/intelmic-offload.h"
 	;;
 esac

diff --git gcc/config/i386/intelmic-offload.h gcc/config/i386/intelmic-offload.h
new file mode 100644
index 000..dc346c7
--- /dev/null
+++ gcc/config/i386/intelmic-offload.h
@@ -0,0 +1,35 @@
+/* Definitions for Intel MIC offloading.
+
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   http://www.gnu.org/licenses/.  */
+
+#ifndef INTELMIC_OFFLOAD_H
+#define INTELMIC_OFFLOAD_H
+
+/* Support for OpenACC acc_on_device.  */
+
+#include "gomp-constants.h"
+
+#define ACCEL_COMPILER_acc_device GOMP_DEVICE_INTEL_MIC
+
+#endif
--
1.9.1

Grüße,
 Thomas
Re: OpenACC middle end changes
On Thu, Dec 18, 2014 at 01:24:20PM +0100, Thomas Schwinge wrote:
> > > > So, is what is on the gomp-4_0-branch now all that you'd like to
> > > > merge to trunk now?
>
> Basically, yes.  Only basically, because there are still a few
> unaddressed review issues in the front ends -- which I'll look into
> now.  (Meaning really: now.)  :-)
>
> Doing the merge as one big commit on trunk will be the easiest
> approach, of course.  Is that OK, or is there any requirement to
> single out any of the changes, such as the
> libgomp/testsuite/libgomp-test-support.exp file just discussed, or the
> libgomp »Offloading and Multi Processing Runtime Library« renaming, or
> anything else?

Doing one big merge is ok with me.  If one needs to bisect something,
they can look at the gomp-4_0-branch.

> > > > Has it been tested on nvptx?
>
> I have always been testing on gomp-4_0-branch with
> ACC_DEVICE_TYPE=host and ACC_DEVICE_TYPE=host_nonshm, plus with
> ACC_DEVICE_TYPE=nvidia in an internal branch.  This branch corresponds
> to gomp-4_0-branch, but also includes a few patches related to
> offloading that Bernd has posted for trunk approval, but has not yet
> gotten approved.

Do you have a list of them (URLs)?  Have they been pinged?  Are they
show stoppers for the offloading, or do just some tests fail because of
that?

> > > > I guess we should test it with XeonPhi offloading too to make
> > > > sure it doesn't break.
>
> Right.  Do you happen to be set up for such testing?  I have not yet
> managed to properly change my build/test scripts for
> x86_64-intelmicemul-linux-gnu.

Anyone with x86_64-linux should be able to test that (i.e. the
emulation); for real offloading (that is, x86_64-intelmic-linux-gnu)
you supposedly need 2 libraries, some kernel module, some distro
installed on the offloading device and most importantly the hw.

> --- gcc/config.gcc
> +++ gcc/config.gcc
> @@ -2906,6 +2906,7 @@ esac
>  case ${target} in
>  *-intelmic-* | *-intelmicemul-*)
>  	tmake_file="${tmake_file} i386/t-intelmic"
> +	tm_file="${tm_file} i386/intelmic-offload.h"
>  	;;
>  esac
>
> diff --git gcc/config/i386/intelmic-offload.h gcc/config/i386/intelmic-offload.h
> new file mode 100644
> index 000..dc346c7
> --- /dev/null
> +++ gcc/config/i386/intelmic-offload.h
> @@ -0,0 +1,35 @@
> +/* Definitions for Intel MIC offloading.
> +
> +   Copyright (C) 2014 Free Software Foundation, Inc.
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; either version 3, or (at your option)
> +   any later version.
> +
> +   GCC is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   Under Section 7 of GPL version 3, you are granted additional
> +   permissions described in the GCC Runtime Library Exception, version
> +   3.1, as published by the Free Software Foundation.
> +
> +   You should have received a copy of the GNU General Public License and
> +   a copy of the GCC Runtime Library Exception along with this program;
> +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +   http://www.gnu.org/licenses/.  */
> +
> +#ifndef INTELMIC_OFFLOAD_H
> +#define INTELMIC_OFFLOAD_H
> +
> +/* Support for OpenACC acc_on_device.  */
> +
> +#include "gomp-constants.h"
> +
> +#define ACCEL_COMPILER_acc_device GOMP_DEVICE_INTEL_MIC
> +
> +#endif

LGTM.

	Jakub
Re: OpenACC middle end changes
On Thu, Dec 18, 2014 at 01:31:45PM +0100, Jakub Jelinek wrote:
> > --- gcc/config.gcc
> > +++ gcc/config.gcc
> > @@ -2906,6 +2906,7 @@ esac
> >  case ${target} in
> >  *-intelmic-* | *-intelmicemul-*)
> >  	tmake_file="${tmake_file} i386/t-intelmic"
> > +	tm_file="${tm_file} i386/intelmic-offload.h"
> >  	;;
> >  esac
>
> > +#define ACCEL_COMPILER_acc_device GOMP_DEVICE_INTEL_MIC
> > +
> > +#endif

Oh, and where is this defined for the nvptx-none target?

	Jakub
Re: OpenACC middle end changes
Hi Jakub!

On Thu, 18 Dec 2014 12:33:11 +0100, Jakub Jelinek ja...@redhat.com wrote:
> On Thu, Dec 18, 2014 at 11:46:00AM +0100, Thomas Schwinge wrote:
> > > just
> > >   rtx v1 = GEN_INT (...);
> > >   rtx v2 = GEN_INT (...);
> > >   machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
> > >   rtx ret = gen_reg_rtx (TYPE_MODE (integer_type_node));
> > >   emit_move_insn (ret, const0_rtx);
> > >   rtx_code_label *done_label = gen_label_rtx ();
> > >   emit_cmp_and_jump_insns (op, v1, NE, NULL_RTX, mode,
> > > 			   false, done_label, PROB_EVEN);
> > >   emit_cmp_and_jump_insns (op, v2, NE, NULL_RTX, mode,
> > > 			   false, done_label, PROB_EVEN);
> > >   emit_move_insn (ret, const1_rtx);
> > >   emit_label (done_label);
> > >   return ret;
> > > or similar.
> >
> > Thanks for the review/suggestion/code!
>
> Note, as I found later, emit_cmp_and_jump_insns is good enough only
> for certain modes on certain architectures (in particular, for cases
> where can_compare_p returns true).  So it is better to use
> do_compare_rtx_and_jump instead of emit_cmp_and_jump_insns, because it
> handles also the cases which emit_cmp_and_jump_insns silently
> mishandles.  You'll need to reorder the arguments a little bit and add
> one NULL_RTX argument.
>
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63848#c4

Thanks again; committed to gomp-4_0-branch in r218862:

commit a58e1475324e6dd6c34a95883f5efc854e204fde
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Thu Dec 18 13:13:06 2014 +

    OpenACC acc_on_device: Harden builtin expansion.

    	gcc/
    	* builtins.c (expand_builtin_acc_on_device): Use
    	do_compare_rtx_and_jump instead of emit_cmp_and_jump_insns.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@218862 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp | 3 +++
 gcc/builtins.c     | 8
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index a3650c5..1e6df5f 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,6 +1,9 @@
 2014-12-18  Thomas Schwinge  tho...@codesourcery.com
 	    Jakub Jelinek  ja...@redhat.com

+	* builtins.c (expand_builtin_acc_on_device): Use
+	do_compare_rtx_and_jump instead of emit_cmp_and_jump_insns.
+
 	* builtins.c (expand_builtin_acc_on_device): Make more RTXy.

 2014-12-17  Thomas Schwinge  tho...@codesourcery.com
diff --git gcc/builtins.c gcc/builtins.c
index e946521..33025a5 100644
--- gcc/builtins.c
+++ gcc/builtins.c
@@ -5911,10 +5911,10 @@ expand_builtin_acc_on_device (tree exp, rtx target)
     target = gen_reg_rtx (target_mode);
   emit_move_insn (target, const0_rtx);
   rtx_code_label *done_label = gen_label_rtx ();
-  emit_cmp_and_jump_insns (v, v1, NE, NULL_RTX, v_mode,
-			   false, done_label, PROB_EVEN);
-  emit_cmp_and_jump_insns (v, v2, NE, NULL_RTX, v_mode,
-			   false, done_label, PROB_EVEN);
+  do_compare_rtx_and_jump (v, v1, NE, false, v_mode, NULL_RTX,
+			   NULL_RTX, done_label, PROB_EVEN);
+  do_compare_rtx_and_jump (v, v2, NE, false, v_mode, NULL_RTX,
+			   NULL_RTX, done_label, PROB_EVEN);
   emit_move_insn (target, const1_rtx);
   emit_label (done_label);

Grüße,
 Thomas
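The compare-and-jump shape used in the hunk above can be modelled in plain C with `goto`. This is a hedged sketch, not RTL and not a transcription of the committed code: the function name is invented, and it presets the result to 1 and jumps on equality, which is one arrangement that produces the `(arg == v1 || arg == v2) ? 1 : 0` value named in the source comment. With the result preset to 0 and two not-equal jumps to a shared label, the 1 would be stored only when the value equals both constants.

```c
/* Plain-C model of a two-test compare-and-jump sequence.
   v is the argument; v1 and v2 are the constants to match.  */
static int on_device_lowered(int v, int v1, int v2)
{
  int target = 1;   /* models: emit_move_insn (target, const1_rtx) */

  if (v == v1)      /* models: compare v with v1, jump if EQ */
    goto done;
  if (v == v2)      /* models: compare v with v2, jump if EQ */
    goto done;

  target = 0;       /* fall through: neither constant matched */
done:
  return target;
}
```

For example, with the host constants 0 and 2 from the thread, arguments 0 and 2 yield 1 and argument 3 yields 0.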
Re: OpenACC middle end changes
Hi Jakub! On Thu, 18 Dec 2014 13:36:16 +0100, Jakub Jelinek ja...@redhat.com wrote: On Thu, Dec 18, 2014 at 01:31:45PM +0100, Jakub Jelinek wrote: --- gcc/config.gcc +++ gcc/config.gcc @@ -2906,6 +2906,7 @@ esac case ${target} in *-intelmic-* | *-intelmicemul-*) tmake_file=${tmake_file} i386/t-intelmic + tm_file=${tm_file} i386/intelmic-offload.h ;; esac +#define ACCEL_COMPILER_acc_device GOMP_DEVICE_INTEL_MIC + +#endif Oh, and where is this defined for nvptx-none target? One of the pending commits (together with the nvptx mkoffload); I now applied the following to gomp-4_0-branch in r218863: commit c3f63a62eb4332f651be5fd377d2d289c8c949f5 Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4 Date: Thu Dec 18 13:13:17 2014 + Support for OpenACC acc_on_device in offloading configurations. gcc/ * config/i386/intelmic-offload.h: New file. * config/nvptx/offload.h: Likewise. * config.gcc *-intelmic-*, *-intelmicemul-*, nvptx-*: Point to them via tm_file. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@218863 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog.gomp | 7 +++ gcc/config.gcc | 2 ++ gcc/config/i386/intelmic-offload.h | 35 +++ gcc/config/nvptx/offload.h | 35 +++ 4 files changed, 79 insertions(+) diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp index 1e6df5f..a744ebf 100644 --- gcc/ChangeLog.gomp +++ gcc/ChangeLog.gomp @@ -1,4 +1,11 @@ 2014-12-18 Thomas Schwinge tho...@codesourcery.com + + * config/i386/intelmic-offload.h: New file. + * config/nvptx/offload.h: Likewise. + * config.gcc *-intelmic-*, *-intelmicemul-*, nvptx-*: Point to + them via tm_file. 
+ +2014-12-18 Thomas Schwinge tho...@codesourcery.com Jakub Jelinek ja...@redhat.com * builtins.c (expand_builtin_acc_on_device): Use diff --git gcc/config.gcc gcc/config.gcc index 8541274..1e453e9 100644 --- gcc/config.gcc +++ gcc/config.gcc @@ -2178,6 +2178,7 @@ nios2-*-*) nvptx-*) tm_file=${tm_file} newlib-stdint.h tmake_file=nvptx/t-nvptx + tm_file=${tm_file} nvptx/offload.h ;; pdp11-*-*) tm_file=${tm_file} newlib-stdint.h @@ -2906,6 +2907,7 @@ esac case ${target} in *-intelmic-* | *-intelmicemul-*) tmake_file=${tmake_file} i386/t-intelmic + tm_file=${tm_file} i386/intelmic-offload.h ;; esac diff --git gcc/config/i386/intelmic-offload.h gcc/config/i386/intelmic-offload.h new file mode 100644 index 000..bea18ed --- /dev/null +++ gcc/config/i386/intelmic-offload.h @@ -0,0 +1,35 @@ +/* Support for Intel MIC offloading. + + Copyright (C) 2014 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software Foundation. + + You should have received a copy of the GNU General Public License and + a copy of the GCC Runtime Library Exception along with this program; + see the files COPYING3 and COPYING.RUNTIME respectively. If not, see + http://www.gnu.org/licenses/. */ + +#ifndef INTELMIC_OFFLOAD_H +#define INTELMIC_OFFLOAD_H + +/* Support for OpenACC acc_on_device. 
*/ + +#include gomp-constants.h + +#define ACCEL_COMPILER_acc_device GOMP_DEVICE_INTEL_MIC + +#endif diff --git gcc/config/nvptx/offload.h gcc/config/nvptx/offload.h new file mode 100644 index 000..63f9a02 --- /dev/null +++ gcc/config/nvptx/offload.h @@ -0,0 +1,35 @@ +/* Support for Nvidia PTX offloading. + + Copyright (C) 2014 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + Under Section 7 of GPL version 3, you are granted additional + permissions described in the GCC Runtime Library Exception, version + 3.1, as published by the Free Software
Re: OpenACC middle end changes
On Thu, Dec 18, 2014 at 12:07:01PM +0100, Thomas Schwinge wrote:
> Many thanks for the review comments!  The very most have been
> addressed, here are just a few comments.  If you feel
> strongly/differently about any, I'll address those, too.

So, with your latest change both compilers build:

mkdir objmic; cd objmic
../configure --build=x86_64-intelmicemul-linux-gnu --host=x86_64-intelmicemul-linux-gnu --target=x86_64-intelmicemul-linux-gnu --enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-bootstrap
make -j16
make DESTDIR=`pwd`/../objinst install
mkdir ../obj; cd ../obj
../configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --enable-offload-targets=x86_64-intelmicemul-linux-gnu=/usr/src/gomp-4.0/objmic --disable-bootstrap
make -j16

But there are issues during make check.  I first did:

make -j16 -k check RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} gomp.exp goacc.exp goacc-gomp.exp'

and that shows:

Making a new config file...
echo "set tmpdir /usr/src/gomp-4.0/obj/gcc/testsuite" >> ./site.tmp
rm -rf testsuite/gcc-parallel
rm -rf testsuite/g++-parallel
rm -rf testsuite/gfortran-parallel
rm -rf testsuite/objc-parallel
mkdir: cannot create directory ‘testsuite’: File exists
make[1]: Entering directory '/usr/src/gomp-4.0/obj/gcc'
make[1]: Entering directory '/usr/src/gomp-4.0/obj/gcc'
make[1]: Entering directory '/usr/src/gomp-4.0/obj/gcc'
make[1]: Entering directory '/usr/src/gomp-4.0/obj/gcc'
mkdir: cannot create directory ‘plugin’: File exists
mkdir: cannot create directory ‘plugin’mkdir: : File existscannot create directory ‘plugin’
: File exists
mkdir: cannot create directory ‘plugin’: File exists
Makefile:3787: recipe for target 'check-parallel-gcc_1' failed
make[1]: [check-parallel-gcc_1] Error 1 (ignored)
Makefile:3787: recipe for target 'check-parallel-gcc_2' failed
make[1]: [check-parallel-gcc_2] Error 1 (ignored)
Makefile:3787: recipe for target 'check-parallel-gcc_3' failed
make[1]: [check-parallel-gcc_3] Error 1 (ignored)
Makefile:3787: recipe for target 'check-parallel-gcc_4' failed
make[1]: [check-parallel-gcc_4] Error 1 (ignored)

Clearly preexisting problem even on trunk, so not a show stopper for
this.

And in libgomp the testing fails completely:

Making check in testsuite
make[1]: Entering directory '/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
make check-DEJAGNU
make[2]: Entering directory '/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
Making a new site.exp file...
srcdir=`CDPATH="${ZSH_VERSION+.}:" && cd ../../../../libgomp/testsuite && pwd`; export srcdir; \
EXPECT=expect; export EXPECT; \
runtest=runtest ; \
if /bin/sh -c "$runtest --version" > /dev/null 2>&1; then \
  exit_status=0; l='libgomp'; for tool in $l; do \
    if $runtest --tool $tool --srcdir $srcdir ; \
    then :; else exit_status=1; fi; \
  done; \
else echo "WARNING: could not find \`runtest'" 1>&2; :;\
fi; \
exit $exit_status
WARNING: Couldn't find the global config file.
ERROR: tcl error sourcing libgomp-test-support.exp.
can't read "(target_alias)": no such variable
    while executing
"set offload_additional_options "-B/usr/src/gomp-4.0/objmic/libexec/gcc/$(target_alias)/$(gcc_version) -B/usr/src/gomp-4.0/objmic/bin""
    (file "libgomp-test-support.exp" line 5)
    invoked from within
"source libgomp-test-support.exp"
    ("uplevel" body line 1)
    invoked from within
"uplevel #0 source libgomp-test-support.exp"
    invoked from within
"catch "uplevel #0 source $file""
Makefile:277: recipe for target 'check-DEJAGNU' failed
make[2]: *** [check-DEJAGNU] Error 1
make[2]: Leaving directory '/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
Makefile:314: recipe for target 'check-am' failed
make[1]: *** [check-am] Error 2
make[1]: Leaving directory '/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
Makefile:856: recipe for target 'check-recursive' failed
make: *** [check-recursive] Error 1

So clearly the *.exp files need to be taught where to look for
libgomp-test-support.exp.

	Jakub
libgomp offloading testing (was: OpenACC middle end changes)
Hi Jakub!

On Thu, 18 Dec 2014 15:20:42 +0100, Jakub Jelinek ja...@redhat.com wrote:
> So, with your latest change both compilers build:
>
> mkdir objmic; cd objmic
> ../configure --build=x86_64-intelmicemul-linux-gnu --host=x86_64-intelmicemul-linux-gnu --target=x86_64-intelmicemul-linux-gnu --enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-bootstrap
> make -j16
> make DESTDIR=`pwd`/../objinst install
> mkdir ../obj; cd ../obj
> ../configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --enable-offload-targets=x86_64-intelmicemul-linux-gnu=/usr/src/gomp-4.0/objmic --disable-bootstrap
> make -j16

Thanks; I'll look into reproducing such a build.

> And in libgomp the testing fails completely:

What happens, in my understanding, is:

> Making check in testsuite
> make[1]: Entering directory '/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
> make check-DEJAGNU
> make[2]: Entering directory '/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
> Making a new site.exp file...
> srcdir=`CDPATH="${ZSH_VERSION+.}:" && cd ../../../../libgomp/testsuite && pwd`; export srcdir; \
> EXPECT=expect; export EXPECT; \
> runtest=runtest ; \
> if /bin/sh -c "$runtest --version" > /dev/null 2>&1; then \
>   exit_status=0; l='libgomp'; for tool in $l; do \
>     if $runtest --tool $tool --srcdir $srcdir ; \
>     then :; else exit_status=1; fi; \
>   done; \
> else echo "WARNING: could not find \`runtest'" 1>&2; :;\
> fi; \
> exit $exit_status
> WARNING: Couldn't find the global config file.
> ERROR: tcl error sourcing libgomp-test-support.exp.
> can't read "(target_alias)": no such variable

The variable $(target_alias) is not available when in the line:

>     while executing
> "set offload_additional_options "-B/usr/src/gomp-4.0/objmic/libexec/gcc/$(target_alias)/$(gcc_version) -B/usr/src/gomp-4.0/objmic/bin""

... in the file:

>     (file "libgomp-test-support.exp" line 5)
>     invoked from within
> "source libgomp-test-support.exp"

... it is being parsed.

Should target_alias and gcc_version be instantiated (AC_SUBST) by
Autoconf already, when creating the libgomp-test-support.exp file from
libgomp/testsuite/libgomp-test-support.exp.in?  Or, should those be
written (in libgomp/plugin/configfrag.ac) in TCL syntax, and evaluated
only once libgomp-test-support.exp is sourced?  target_alias is being
provided in site.exp (as generated by libgomp/testsuite/Makefile), but
gcc_version is not.

>     ("uplevel" body line 1)
>     invoked from within
> "uplevel #0 source libgomp-test-support.exp"
>     invoked from within
> "catch "uplevel #0 source $file""
> Makefile:277: recipe for target 'check-DEJAGNU' failed
> make[2]: *** [check-DEJAGNU] Error 1
> make[2]: Leaving directory '/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
> Makefile:314: recipe for target 'check-am' failed
> make[1]: *** [check-am] Error 2
> make[1]: Leaving directory '/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
> Makefile:856: recipe for target 'check-recursive' failed
> make: *** [check-recursive] Error 1
>
> So clearly the *.exp files need to be taught where to look for
> libgomp-test-support.exp.

That's not the problem, if I'm understanding correctly.

Regards,
 Thomas
Re: OpenACC middle end changes
Hi Jakub!

On Thu, 13 Nov 2014 19:09:49 +0100, Jakub Jelinek ja...@redhat.com wrote:
> > --- gcc/builtins.c
> > +++ gcc/builtins.c
> > +static rtx
> > +expand_builtin_acc_on_device (tree exp, rtx target ATTRIBUTE_UNUSED)
> > +{
> > +  if (!validate_arglist (exp, INTEGER_TYPE, VOID_TYPE))
> > +    return NULL_RTX;
> > +
> > +  tree arg, v1, v2, ret;
> > +  location_t loc;
> > +
> > +  arg = CALL_EXPR_ARG (exp, 0);
> > +  arg = builtin_save_expr (arg);
> > +  loc = EXPR_LOCATION (exp);
> > +
> > +  /* Build: (arg == v1 || arg == v2) ? 1 : 0.  */
> > +
> > +#ifdef ACCEL_COMPILER
> > +  v1 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_not_host */ 3);
> > +  v2 = build_int_cst (TREE_TYPE (arg), ACCEL_COMPILER_acc_device);
> > +#else
> > +  v1 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_none */ 0);
> > +  v2 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_host */ 2);
> > +#endif
> > +
> > +  v1 = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, v1);
> > +  v2 = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, v2);
> > +
> > +  /* Can't use TRUTH_ORIF_EXPR, as that is not supported by
> > +     expand_expr_real*.  */
> > +  ret = fold_build3_loc (loc, COND_EXPR, integer_type_node, v1, v1, v2);
> > +  ret = fold_build3_loc (loc, COND_EXPR, integer_type_node,
> > +                         ret, integer_one_node, integer_zero_node);
> > +
> > +  return expand_normal (ret);
>
> If you can't fold it late (which is indeed a problem for -O0), then I'd
> suggest to implement this more RTL-ish.
> So, avoid the builtin_save_expr, instead
>   rtx op = expand_normal (arg);
> Don't build v1/v2 as trees (and, please fix the TODOs), but rtxes, just
>   rtx v1 = GEN_INT (...);
>   rtx v2 = GEN_INT (...);
>   machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
>   rtx ret = gen_reg_rtx (TYPE_MODE (integer_type_node));
>   emit_move_insn (ret, const0_rtx);
>   rtx_code_label *done_label = gen_label_rtx ();
>   emit_cmp_and_jump_insns (op, v1, NE, NULL_RTX, mode, false,
>                            done_label, PROB_EVEN);
>   emit_cmp_and_jump_insns (op, v2, NE, NULL_RTX, mode, false,
>                            done_label, PROB_EVEN);
>   emit_move_insn (ret, const1_rtx);
>   emit_label (done_label);
>   return ret;
> or similar.

;-) Yes, similar, as I've now found; committed to gomp-4_0-branch in
r218869:

commit cb37a039eb7a7375d074bc092457349312c5a2e2
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Thu Dec 18 16:07:23 2014 +

    OpenACC acc_on_device: Fix logic error introduced in an earlier change.

    ... but which didn't show up in testing until after a libgomp rebuild,
    because of the caching of the acc_on_device builtin that is being done
    in libgomp/oacc-init.c:acc_on_device.

    gcc/
    * builtins.c (expand_builtin_acc_on_device): Fix logic error.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@218869 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp | 2 ++
 gcc/builtins.c     | 8 ++++----
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index a744ebf..a21fd92 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,7 @@
 2014-12-18  Thomas Schwinge  tho...@codesourcery.com

+        * builtins.c (expand_builtin_acc_on_device): Fix logic error.
+
         * config/i386/intelmic-offload.h: New file.
         * config/nvptx/offload.h: Likewise.
         * config.gcc *-intelmic-*, *-intelmicemul-*, nvptx-*: Point to
diff --git gcc/builtins.c gcc/builtins.c
index 33025a5..6891229 100644
--- gcc/builtins.c
+++ gcc/builtins.c
@@ -5909,13 +5909,13 @@ expand_builtin_acc_on_device (tree exp, rtx target)
   machine_mode target_mode = TYPE_MODE (integer_type_node);
   if (!REG_P (target) || GET_MODE (target) != target_mode)
     target = gen_reg_rtx (target_mode);
-  emit_move_insn (target, const0_rtx);
+  emit_move_insn (target, const1_rtx);
   rtx_code_label *done_label = gen_label_rtx ();
-  do_compare_rtx_and_jump (v, v1, NE, false, v_mode, NULL_RTX,
+  do_compare_rtx_and_jump (v, v1, EQ, false, v_mode, NULL_RTX,
                            NULL_RTX, done_label, PROB_EVEN);
-  do_compare_rtx_and_jump (v, v2, NE, false, v_mode, NULL_RTX,
+  do_compare_rtx_and_jump (v, v2, EQ, false, v_mode, NULL_RTX,
                            NULL_RTX, done_label, PROB_EVEN);
-  emit_move_insn (target, const1_rtx);
+  emit_move_insn (target, const0_rtx);
   emit_label (done_label);

   return target;

Regards,
 Thomas
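The logic error fixed in r218869 is easier to see when the emitted jump sequence is written as ordinary C. The following is only a plain-C model, not GCC code: it assumes that done_label is the label taken when the comparison holds, and the function names on_device_before_fix and on_device_after_fix are made up for illustration.

```c
/* Plain-C model of the emitted jump sequence; not GCC code.  V stands
   for the acc_on_device argument, V1 and V2 for the two device codes
   that should count as a match.  */

/* Shape before the fix: start at 0 and leave early on NE.  The result
   is 1 only if V equals both V1 and V2.  */
static int
on_device_before_fix (int v, int v1, int v2)
{
  int target = 0;
  if (v != v1)
    goto done;
  if (v != v2)
    goto done;
  target = 1;
done:
  return target;
}

/* Shape after the fix: start at 1 and leave early on EQ.  The result
   is (v == v1 || v == v2) ? 1 : 0, as the original comment intended.  */
static int
on_device_after_fix (int v, int v1, int v2)
{
  int target = 1;
  if (v == v1)
    goto done;
  if (v == v2)
    goto done;
  target = 0;
done:
  return target;
}
```

With two distinct device codes the first variant can never return 1, which would match the commit message's remark that the problem only surfaced once libgomp had been rebuilt.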
Re: OpenACC middle end changes
On 11/20/2014 07:52 AM, Jakub Jelinek wrote:
> On Thu, Nov 20, 2014 at 03:19:11AM +0100, Bernd Schmidt wrote:
> > Thomas had apparently already pointed out an issue with the new
> > gomp_target class (there are multiple similar types of statements we
> > want to handle with OpenACC, they have different codes but we want to
> > have function pointers operating on any of them) back in July.  That
> > seems to have been ignored.  By necessity, some of David's changes
> > are reverted in the following patch.
>
> I thought the agreement was to use GIMPLE_OMP_TARGET gimple_code and
> just two new gimple_omp_target_kind GF_* flags.

If that's the case I'll leave it to Thomas to make these changes.  At
the moment I'm just trying to put together all the pieces into versions
that apply to trunk and can be made to work together.

Bernd
Re: OpenACC middle end changes
On 11/19/2014 02:50 AM, Bernd Schmidt wrote:
> @@ -8417,6 +8926,9 @@ expand_omp_target (struct omp_region *region)
>    /* Add the new function to the offload table.  */
>    vec_safe_push (offload_funcs, child_fn);
>
> +  /* Add the new function to the offload table.  */
> +  vec_safe_push (offload_funcs, child_fn);
> +
>    /* Fix the callgraph edges for child_cfun.  Those for cfun will be
>       fixed in a following pass.  */
>    push_cfun (child_cfun);

This hunk also needs to go away.

Bernd
Re: OpenACC middle end changes
Another change that's required is (something like) the following.  For
ptx, we need to know whether to output something as a .func (callable
from ptx code) or a .kernel (callable from the host).  That means we
need to mark the kernel functions somehow in omp-low.c, and the
following does that by way of a new attribute (already recognized by the
nvptx backend).

Bernd

	* omp-low.c (create_omp_child_function): Tag entrypoint functions
	with a special attribute.

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 42ba317..8408025 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -2228,6 +2228,12 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
 	    break;
 	  }
     }
+  if (cgraph_node::get_create (decl)->offloadable
+      && !lookup_attribute ("omp declare target",
+                            DECL_ATTRIBUTES (current_function_decl)))
+    DECL_ATTRIBUTES (decl)
+      = tree_cons (get_identifier ("omp target entrypoint"),
+                   NULL_TREE, DECL_ATTRIBUTES (decl));

   t = build_decl (DECL_SOURCE_LOCATION (decl),
 		  RESULT_DECL, NULL_TREE, void_type_node);
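To make the tagging rule in this patch concrete, here is a minimal sketch in plain C. It is not GCC code: struct attr, struct decl, find_attr, cons_attr, tag_entrypoint and is_kernel are hypothetical stand-ins for the tree attribute list, the cgraph node's offloadable flag, lookup_attribute and tree_cons. Only the two attribute strings are taken from the patch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for an attribute list and a function decl.  */
struct attr { const char *name; struct attr *next; };
struct decl { int offloadable; struct attr *attrs; };

/* Return the first attribute called NAME in LIST, or NULL.  */
static const struct attr *
find_attr (const struct attr *list, const char *name)
{
  for (; list != NULL; list = list->next)
    if (strcmp (list->name, name) == 0)
      return list;
  return NULL;
}

/* Prepend an attribute called NAME to REST.  */
static struct attr *
cons_attr (const char *name, struct attr *rest)
{
  struct attr *a = malloc (sizeof *a);
  if (a == NULL)
    abort ();
  a->name = name;
  a->next = rest;
  return a;
}

/* Tag CHILD as a host-callable entry point when it is offloadable and
   the enclosing function is not itself "omp declare target".  */
static void
tag_entrypoint (struct decl *child, const struct decl *enclosing)
{
  if (child->offloadable
      && find_attr (enclosing->attrs, "omp declare target") == NULL)
    child->attrs = cons_attr ("omp target entrypoint", child->attrs);
}

/* What a back end could ask when choosing between .kernel and .func.  */
static int
is_kernel (const struct decl *d)
{
  return find_attr (d->attrs, "omp target entrypoint") != NULL;
}
```

In this model a child function outlined from an ordinary host function becomes a kernel, while one outlined inside a function that is already "omp declare target" stays an ordinary device function.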
Re: OpenACC middle end changes
On Wed, Nov 19, 2014 at 08:52:40PM +0100, Bernd Schmidt wrote:
> Another change that's required is (something like) the following.  For
> ptx, we need to know whether to output something as a .func (callable
> from ptx code) or a .kernel (callable from the host).  That means we
> need to mark the kernel functions somehow in omp-low.c, and the
> following does that by way of a new attribute (already recognized by
> the nvptx backend).

I think Richard's and Honza's preference in this case is a flag in
cgraph_node instead of an attribute.

> 	* omp-low.c (create_omp_child_function): Tag entrypoint functions
> 	with a special attribute.
>
> diff --git a/gcc/omp-low.c b/gcc/omp-low.c
> index 42ba317..8408025 100644
> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -2228,6 +2228,12 @@ create_omp_child_function (omp_context *ctx, bool task_copy)
>  	    break;
>  	  }
>      }
> +  if (cgraph_node::get_create (decl)->offloadable
> +      && !lookup_attribute ("omp declare target",
> +                            DECL_ATTRIBUTES (current_function_decl)))
> +    DECL_ATTRIBUTES (decl)
> +      = tree_cons (get_identifier ("omp target entrypoint"),
> +                   NULL_TREE, DECL_ATTRIBUTES (decl));
>
>    t = build_decl (DECL_SOURCE_LOCATION (decl),
>  		  RESULT_DECL, NULL_TREE, void_type_node);

	Jakub
Re: OpenACC middle end changes
On Thu, Nov 20, 2014 at 03:19:11AM +0100, Bernd Schmidt wrote:
> Thomas had apparently already pointed out an issue with the new
> gomp_target class (there are multiple similar types of statements we
> want to handle with OpenACC, they have different codes but we want to
> have function pointers operating on any of them) back in July.  That
> seems to have been ignored.  By necessity, some of David's changes are
> reverted in the following patch.

I thought the agreement was to use GIMPLE_OMP_TARGET gimple_code and
just two new gimple_omp_target_kind GF_* flags.

	Jakub
Re: OpenACC middle end changes
On Thursday 2014-11-13 17:59, Thomas Schwinge wrote:
> Here is our current set of OpenACC middle end changes.  As discussed
> before, this is not yet all of OpenACC 2.0 -- we shall a) document what
> is working already, and b) continue to work on closing the gap.

As David wrote in a different context, strchrnul is a GNU extension and
not present at least on AIX and FreeBSD 8 (and possibly 9).

Gerald

PS: Sorry, this mail got stuck in my outbox.
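For hosts without strchrnul, one common way out is a small local replacement. The sketch below is only an illustration of the documented behaviour of the GNU function (return the first match, or a pointer to the terminating NUL when there is none); the name portable_strchrnul is made up here and is not something the patch or GCC provides.

```c
/* Portable stand-in for GNU strchrnul.  Returns a pointer to the first
   occurrence of C in S, or to the terminating NUL if C does not occur.
   The name is hypothetical.  */
static const char *
portable_strchrnul (const char *s, int c)
{
  while (*s != '\0' && *s != (char) c)
    s++;
  return s;
}
```

Unlike strchr, the result is never NULL, so callers can compute the length of the leading segment as the difference between the result and s without a separate check.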
Re: OpenACC middle end changes
On Fri, Nov 14, 2014 at 11:28:15AM +0100, Richard Biener wrote: This patch is based on the last merge of trunk into gomp-4_0-branch, 9be82689 (trunk r216846, 2014-10-29), and still includes an old version of the offloading patches, as currently present on gomp-4_0-branch. We're already working on rebasing onto the set of offloading patches that has just been committed to trunk, but I didn't want to have this delay any further (it seems, the rebase/merge is not always trivial) the * ChangeLog snippets still need to be written. Badly needed - I wonder why you need changes to LTO files at all. I think he doesn't, but the LTO changes that were committed to trunk by Intel haven't been integrated yet into the branch AFAIK; at least I've skipped all those bits I expect to be in already. See above Thomas' comment. Jakub
Re: OpenACC middle end changes
On Thu, Nov 13, 2014 at 05:59:11PM +0100, Thomas Schwinge wrote: * should gcc/oacc-builtins.def just be merged into gcc/omp-builtins.def; Why not. The reason why they aren't in gcc/builtins.def is that the Fortran FE doesn't source those, but OpenACC supports the same languages as OpenMP. --- gcc/builtins.c +++ gcc/builtins.c @@ -5751,6 +5751,49 @@ expand_stack_save (void) return ret; } + +/* Expand OpenACC acc_on_device. + + This has to happen late (that is, not in early folding; expand_builtin_*, + rather than fold_builtin_*), as we have to act differently for host and + acceleration device (ACCEL_COMPILER conditional). */ + +static rtx +expand_builtin_acc_on_device (tree exp, rtx target ATTRIBUTE_UNUSED) +{ + if (!validate_arglist (exp, INTEGER_TYPE, VOID_TYPE)) +return NULL_RTX; + + tree arg, v1, v2, ret; + location_t loc; + + arg = CALL_EXPR_ARG (exp, 0); + arg = builtin_save_expr (arg); + loc = EXPR_LOCATION (exp); + + /* Build: (arg == v1 || arg == v2) ? 1 : 0. */ + +#ifdef ACCEL_COMPILER + v1 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_not_host */ 3); + v2 = build_int_cst (TREE_TYPE (arg), ACCEL_COMPILER_acc_device); +#else + v1 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_none */ 0); + v2 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_host */ 2); +#endif + + v1 = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, v1); + v2 = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, v2); + + /* Can't use TRUTH_ORIF_EXPR, as that is not supported by + expand_expr_real*. */ + ret = fold_build3_loc (loc, COND_EXPR, integer_type_node, v1, v1, v2); + ret = fold_build3_loc (loc, COND_EXPR, integer_type_node, + ret, integer_one_node, integer_zero_node); + + return expand_normal (ret); If you can't fold it late (which is indeed a problem for -O0), then I'd suggest to implement this more RTL-ish. 
So, avoid the builtin_save_expr, instead rtx op = expand_normal (arg); Don't build v1/v2 as trees (and, please fix the TODOs), but rtxes, just rtx v1 = GEN_INT (...); rtx v2 = GEN_INT (...); machine_mode mode = TYPE_MODE (TREE_TYPE (arg)); rtx ret = gen_reg_rtx (TYPE_MODE (integer_type_node)); emit_move_insn (ret, const0_rtx); rtx_code_label *done_label = gen_label_rtx (); emit_cmp_and_jump_insns (op, v1, NE, NULL_RTX, mode, false, done_label, PROB_EVEN); emit_cmp_and_jump_insns (op, v2, NE, NULL_RTX, mode, false, done_label, PROB_EVEN); emit_move_insn (ret, const1_rtx); emit_label (done_label); return ret; or similar. Note, it would still be worthwhile to fold the builtin, at least when optimizing, after IPA. Dunno if we have some property you can check, and Richard B. could suggest where it would be most appropriate (if GIMPLE guarded match.pd entry, or what), gimple_fold, etc. I bet I should handle omp_is_initial_device (); similarly. @@ -1818,7 +1818,7 @@ There are also several varieties of complex statements. * Empty Statements:: * Jumps:: * Cleanups:: -* OpenMP:: +* OpenACC and OpenMP:: I think it might be better just to have separate sections for each, not put them into the same. Start with OpenMP, and in OpenACC section put the OACC specific stuff and say what is shared with OpenMP (clauses, etc.). --- gcc/doc/gimple.texi +++ gcc/doc/gimple.texi @@ -439,6 +439,8 @@ The following table briefly describes the GIMPLE instruction set. @item @code{GIMPLE_GOTO} @tab x @tab x @item @code{GIMPLE_LABEL}@tab x @tab x @item @code{GIMPLE_NOP} @tab x @tab x +@item @code{GIMPLE_OACC_KERNELS} @tab x @tab x +@item @code{GIMPLE_OACC_PARALLEL}@tab x @tab x @item @code{GIMPLE_OMP_ATOMIC_LOAD} @tab x @tab x @item @code{GIMPLE_OMP_ATOMIC_STORE} @tab x @tab x @item @code{GIMPLE_OMP_CONTINUE} @tab x @tab x @@ -1006,6 +1008,8 @@ Return a deep copy of statement @code{STMT}. 
* @code{GIMPLE_EH_FILTER}:: * @code{GIMPLE_LABEL}:: * @code{GIMPLE_NOP}:: +* @code{GIMPLE_OACC_KERNELS}:: +* @code{GIMPLE_OACC_PARALLEL}:: * @code{GIMPLE_OMP_ATOMIC_LOAD}:: * @code{GIMPLE_OMP_ATOMIC_STORE}:: * @code{GIMPLE_OMP_CONTINUE}:: This will likely change, right? --- gcc/gimple-pretty-print.c +++ gcc/gimple-pretty-print.c @@ -1136,18 +1136,21 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags) case GF_OMP_FOR_KIND_FOR: kind = ; break; - case GF_OMP_FOR_KIND_SIMD: - kind = simd; - break; - case GF_OMP_FOR_KIND_CILKSIMD: - kind = cilksimd; - break; case GF_OMP_FOR_KIND_DISTRIBUTE: kind = distribute; break; case GF_OMP_FOR_KIND_CILKFOR: kind = _Cilk_for;
Re: Re: OpenACC middle end changes
I'll try to respond to the reduction stuff.  It's been a while since I
started working on it, so I may have lost some state.

On 11/13/2014 10:09 AM, Jakub Jelinek wrote:
> > @@ -233,6 +242,90 @@ static tree scan_omp_1_op (tree *, int *, void *);
> >        *handled_ops_p = false; \
> >        break;
> >
> > +/* Helper function to get the reduction array name */
> > +static const char *
> > +omp_get_id (tree node)
>
> Be more specific in the function name what it is for?

It's the name of the array containing the partial reductions for
original reduction variable.

> > +{
> > +  const char *id = IDENTIFIER_POINTER (DECL_NAME (node));
> > +  int len = strlen ("omp$") + strlen (id);
> > +  char *temp_name = (char *)alloca (len+1);
> > +  snprintf (temp_name, len+1, "gfc$%s", id);
>
> gfc$ ?

It's just a semi-random prefix I used to make the partial reduction
array identifier unique to aid with debugging.  I was working on the
fortran front end at the time.  Maybe s/gfc/oacc/?

> Use char *temp_name = XALLOCAVEC (char, len + 1); instead?
>
> > +  return IDENTIFIER_POINTER(get_identifier (temp_name));
>
> Formatting (missing space before ( ).
>
> > @@ -868,6 +981,25 @@ maybe_lookup_field (tree var, omp_context *ctx)
> >    return n ? (tree) n->value : NULL_TREE;
> >  }
> >
> > +static inline tree
> > +lookup_reduction (const char *id, omp_context *ctx)
>
> Can't you use oacc_ in the name of OpenACC specific functions?

Sure.

[snip]

> > @@ -8834,6 +9492,397 @@ make_pass_expand_omp (gcc::context *ctxt)
> >
> >  /* Routines to lower OpenMP directives into OMP-GIMPLE.  */
> >
> > +/* Helper function to preform, potentially COMPLEX_TYPE, operation and
> > +   convert it to gimple.  */
> > +static void
> > +omp_gimple_assign_with_ops (tree_code op, tree dest, tree src, gimple_seq *seq)
>
> Makes me wonder why don't you put the reduction code earlier into
> reduction clause GENERIC and then lower into clauses' GIMPLE seq.
> If there is some reason, please name it oacc at least.

I probably was trying to reuse as much of the existing code as possible.
I've swapped out too much state on this.  This can be renamed too.

> > +static void
> > +initialize_reduction_data (tree clauses, tree nthreads, gimple_seq *stmt_seqp,
> > +                           omp_context *ctx)
>
> Likewise.
>
> > +/* Helper function to process the array of partial reductions.  Nthreads
> > +   indicates the number of threads.  Unfortunately, GOACC_GET_NUM_THREADS
> > +   cannot be used here, because nthreads on the host may be different than
> > +   on the accelerator.  */
> > +
> > +static void
> > +finalize_reduction_data (tree clauses, tree nthreads, gimple_seq *stmt_seqp,
> > +                         omp_context *ctx)
>
> Likewise.
>
> > +/* Scan through all of the gimple stmts searching for an OMP_FOR_EXPR, and
> > +   scan that for reductions.  */
> > +
> > +static void
> > +process_reduction_data (gimple_seq *body, gimple_seq *in_stmt_seqp,
> > +                        gimple_seq *out_stmt_seqp, omp_context *ctx)
>
> Likewise.

Thomas, would you like me to handle the renaming, or will you?  I could
make those changes to gomp-4_0-branch if you like.

Cesar
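The "array containing the partial reductions" can be pictured with a small sequential model. This is only a conceptual sketch of the general technique for a "+" reduction, not what omp-low.c generates: init_partials, thread_body and finalize_partials are hypothetical names, the strided work split is an assumption, and the threads are simulated by calling thread_body once per thread id.

```c
#include <stddef.h>

/* Conceptual model only; none of these names come from omp-low.c.  */

/* Set every slot of the partial-result array to the identity of "+".  */
static void
init_partials (int *partials, size_t nthreads)
{
  for (size_t t = 0; t < nthreads; t++)
    partials[t] = 0;
}

/* Work done by thread TID: accumulate its strided share of DATA into
   its own slot, so no two threads write to the same location.  */
static void
thread_body (int *partials, size_t tid, size_t nthreads,
             const int *data, size_t n)
{
  for (size_t i = tid; i < n; i += nthreads)
    partials[tid] += data[i];
}

/* Fold the partial results into the original reduction variable,
   whose incoming value is INITIAL.  */
static int
finalize_partials (const int *partials, size_t nthreads, int initial)
{
  int result = initial;
  for (size_t t = 0; t < nthreads; t++)
    result += partials[t];
  return result;
}
```

The model also shows why the thread count has to be passed around explicitly: the array is sized and folded with the same nthreads value that was used to fill it.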
Re: Re: OpenACC middle end changes
On Thu, Nov 13, 2014 at 11:03:47AM -0800, Cesar Philippidis wrote:
> > > @@ -233,6 +242,90 @@ static tree scan_omp_1_op (tree *, int *, void *);
> > >        *handled_ops_p = false; \
> > >        break;
> > >
> > > +/* Helper function to get the reduction array name */
> > > +static const char *
> > > +omp_get_id (tree node)
> >
> > Be more specific in the function name what it is for?
>
> It's the name of the array containing the partial reductions for
> original reduction variable.
>
> > > +{
> > > +  const char *id = IDENTIFIER_POINTER (DECL_NAME (node));
> > > +  int len = strlen ("omp$") + strlen (id);
> > > +  char *temp_name = (char *)alloca (len+1);
> > > +  snprintf (temp_name, len+1, "gfc$%s", id);
> >
> > gfc$ ?
>
> It's just a semi-random prefix I used to make the partial reduction
> array identifier unique to aid with debugging.  I was working on the
> fortran front end at the time.  Maybe s/gfc/oacc/?

Yeah, something (and please use the same string in the strlen and
sprintf).  If the symbol is emitted into assembly, you need to check for
NO_DOLLARS_IN_LABELS and similar.  Oh, and please use spaces around +.
And name the function so that it is clear what it is for.

	Jakub
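The point about using the same string for the length computation and for the formatting can be shown in a few lines of standalone C. This is a sketch under stated assumptions rather than the omp-low.c code: REDUCTION_ARRAY_PREFIX and make_reduction_array_name are hypothetical names, the prefix value is chosen without a '$' so that NO_DOLLARS_IN_LABELS targets are not a concern, and malloc stands in for the alloca/XALLOCAVEC allocation.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical prefix, defined once so that the length computation and
   the formatting below cannot drift apart.  */
#define REDUCTION_ARRAY_PREFIX "OACC"

/* Return a malloc'ed "<prefix><id>" string, or NULL if allocation
   fails.  The caller frees the result.  */
static char *
make_reduction_array_name (const char *id)
{
  size_t len = strlen (REDUCTION_ARRAY_PREFIX) + strlen (id);
  char *name = malloc (len + 1);
  if (name == NULL)
    return NULL;
  snprintf (name, len + 1, REDUCTION_ARRAY_PREFIX "%s", id);
  return name;
}
```

With two different literals, as in the quoted hunk ("omp$" for strlen, "gfc$%s" for snprintf), the buffer size is only right by coincidence of the two prefixes having the same length.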
Re: OpenACC middle end changes
On Thu, 13 Nov 2014, Thomas Schwinge wrote:
> gcc/doc/invoke.texi | 14

You're adding documentation for -fopenacc, but I don't see any .opt file
changes in this patch, and I'd expect the option to be added in the same
patch as its documentation.

-- 
Joseph S. Myers
jos...@codesourcery.com
Re: OpenACC middle end changes
On 11/13/2014 11:09 AM, Jakub Jelinek wrote:
> On Thu, Nov 13, 2014 at 11:03:47AM -0800, Cesar Philippidis wrote:
>>>> @@ -233,6 +242,90 @@ static tree scan_omp_1_op (tree *, int *, void *);
>>>>        *handled_ops_p = false; \
>>>>        break;
>>>>
>>>> +/* Helper function to get the reduction array name */
>>>> +static const char *
>>>> +omp_get_id (tree node)
>>>
>>> Be more specific in the function name what it is for?
>>
>> It's the name of the array containing the partial reductions for the
>> original reduction variable.
>>
>>>> +{
>>>> +  const char *id = IDENTIFIER_POINTER (DECL_NAME (node));
>>>> +  int len = strlen ("omp$") + strlen (id);
>>>> +  char *temp_name = (char *)alloca (len+1);
>>>> +  snprintf (temp_name, len+1, "gfc$%s", id);
>>>
>>> gfc$ ?
>>
>> It's just a semi-random prefix I used to make the partial reduction
>> array identifier unique to aid with debugging. I was working on the
>> fortran front end at the time. Maybe s/gfc/oacc/?
>
> Yeah, something (and please use the same string in the strlen and
> sprintf). If the symbol is emitted into assembly, you need to check
> for NO_DOLLARS_IN_LABELS and similar.
> Oh, and please use spaces around +.
> And name the function so that it is clear what is it for.

The attached patch cleans up the various reduction functions and their
usages. Thomas, I've applied this to gomp-4_0-branch.

Cesar

2014-11-13  Cesar Philippidis  ce...@codesourcery.com

	gcc/
	* omp-low.c (omp_get_id): Rename to...
	(oacc_get_reduction_array_id): ... this.
	(lookup_reduction): Rename to...
	(lookup_oacc_reduction): ... this.
	(maybe_lookup_reduction): Rename to...
	(maybe_lookup_oacc_reduction): ... this.
	(scan_sharing_clauses): Update calls to renamed fns.
	(lower_reduction_var_helper): Rename to...
	(oacc_lower_reduction_var_helper): ... this.
	(lower_reduction_clauses): Rename to...
	(oacc_lower_reduction_clauses): ... this.
	(omp_gimple_assign_with_ops): Rename to...
	(oacc_gimple_assign_with_ops): ... this.
	(initialize_reduction_data): Rename to...
	(oacc_initialize_reduction_data): ... this.
	(finalize_reduction_data): Rename to...
	(oacc_finalize_reduction_data): ... this.
	(process_reduction_data): Rename to...
	(oacc_process_reduction_data): ... this.
	(lower_omp_target): Update calls to renamed fns.

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index e511846..da9c5a5 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -242,15 +242,16 @@ static tree scan_omp_1_op (tree *, int *, void *);
       *handled_ops_p = false; \
       break;
 
-/* Helper function to get the reduction array name */
+/* Helper function to get the name of the array containing the partial
+   reductions for OpenACC reductions.  */
 static const char *
-omp_get_id (tree node)
+oacc_get_reduction_array_id (tree node)
 {
   const char *id = IDENTIFIER_POINTER (DECL_NAME (node));
-  int len = strlen ("omp$") + strlen (id);
-  char *temp_name = (char *)alloca (len+1);
-  snprintf (temp_name, len+1, "gfc$%s", id);
-  return IDENTIFIER_POINTER(get_identifier (temp_name));
+  int len = strlen ("OACC") + strlen (id);
+  char *temp_name = XALLOCAVEC (char, len + 1);
+  snprintf (temp_name, len+1, "OACC%s", id);
+  return IDENTIFIER_POINTER (get_identifier (temp_name));
 }
 
 /* Determine the number of threads OpenACC threads used to determine the
@@ -983,7 +984,7 @@ maybe_lookup_field (tree var, omp_context *ctx)
 }
 
 static inline tree
-lookup_reduction (const char *id, omp_context *ctx)
+lookup_oacc_reduction (const char *id, omp_context *ctx)
 {
   gcc_assert (is_gimple_omp_oacc_specifically (ctx->stmt));
 
@@ -993,7 +994,7 @@ lookup_reduction (const char *id, omp_context *ctx)
 }
 
 static inline tree
-maybe_lookup_reduction (tree var, omp_context *ctx)
+maybe_lookup_oacc_reduction (tree var, omp_context *ctx)
 {
   splay_tree_node n = NULL;
   if (ctx->reduction_map)
@@ -1759,14 +1760,15 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
               tree var = OMP_CLAUSE_DECL (c);
               tree type = get_base_type (var);
               tree ptype = build_pointer_type (type);
-              tree array = create_tmp_var (ptype, omp_get_id (var));
+              tree array = create_tmp_var (ptype,
+                                           oacc_get_reduction_array_id (var));
               omp_context *c = (ctx->field_map ? ctx : ctx->outer);
               install_var_field (array, true, 3, c);
               install_var_local (array, c);
 
               /* Insert it into the current context.  */
-              splay_tree_insert (ctx->reduction_map,
-                                 (splay_tree_key) omp_get_id(var),
+              splay_tree_insert (ctx->reduction_map, (splay_tree_key)
+                                 oacc_get_reduction_array_id (var),
                                  (splay_tree_value) array);
               splay_tree_insert (ctx->reduction_map,
                                  (splay_tree_key) array,
@@ -4419,8 +4421,8 @@ lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *stmt_list,
 }
 
 static void
-lower_reduction_var_helper (gimple_seq *stmt_seqp, omp_context *ctx, tree tid,
-                            tree var, tree new_var)
+oacc_lower_reduction_var_helper (gimple_seq *stmt_seqp, omp_context *ctx,
+                                 tree tid, tree var, tree new_var)
 {
   /* The atomic add at the end