Re: Merge DEF_GOACC_BUILTIN into DEF_GOMP_BUILTIN? (was: OpenACC middle end changes)

2015-07-09 Thread Jakub Jelinek
On Thu, Jul 09, 2015 at 05:52:20PM +0200, Thomas Schwinge wrote:
 --- gcc/builtins.def
 +++ gcc/builtins.def
 @@ -182,7 +182,9 @@ along with GCC; see the file COPYING3.  If not see
  #define DEF_GOMP_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
    DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,\
                 false, true, true, ATTRS, false, \
 -               (flag_openmp || flag_tree_parallelize_loops \
 +               (flag_openmp \
 +                || flag_tree_parallelize_loops > 1 \
 +                || flag_cilkplus \
                  || flag_offload_abi != OFFLOAD_ABI_UNSET))
  
  /* Builtin used by implementation of Cilk Plus.  Most of these are decomposed
 
 Before this patch, all DEF_GOMP_BUILTINs (erroneously) had always been
 available, due to flag_tree_parallelize_loops's default value of 1.
 
 With gcc/omp-low.c:lower_reduction_clauses using a
 BUILT_IN_GOMP_ATOMIC_START/BUILT_IN_GOMP_ATOMIC_END sequence as a last
 resort, and that being chosen for some kind of OpenACC reduction clauses
 (which is present on gomp-4_0-branch only), we're then running into ICEs,
 as those two DEF_GOMP_BUILTINs are not available with plain -fopenacc.
 
 Now, is there actually a good reason to have separate DEF_GOACC_BUILTIN
 and DEF_GOMP_BUILTIN directives (which I basically just initially did to
 be least intrusive,
 http://news.gmane.org/find-root.php?message_id=%3C1383766943-8863-6-git-send-email-thomas%40codesourcery.com%3E),
 or should I just add flag_openacc to DEF_GOMP_BUILTIN, and change all
 DEF_GOACC_BUILTIN instantiations to DEF_GOMP_BUILTIN?  Merging them
 definitely makes sense to me now, so OK to do the obvious?

Having both DEF_GOMP_BUILTIN and DEF_GOACC_BUILTIN is nice: it tells you
whether a builtin is an OpenMP or an OpenACC one.  I'd say just add
|| flag_openacc to DEF_GOMP_BUILTIN if you need it.  E.g. for
-ftree-parallelize-loops or -fcilkplus I doubt you want the OpenACC
builtins ;).
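
As a rough illustration of the gating discussed here, the following standalone C model evaluates the three variants of the condition: the old one, the one after r224745, and the one with "|| flag_openacc" added. The struct, the helper names and the enum are hypothetical stand-ins for GCC's option variables, not actual GCC code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's option variables.  */
enum offload_abi { OFFLOAD_ABI_UNSET, OFFLOAD_ABI_LP64, OFFLOAD_ABI_ILP32 };

struct flags
{
  int flag_openmp;
  int flag_openacc;
  int flag_cilkplus;
  int flag_tree_parallelize_loops;  /* Default value is 1.  */
  enum offload_abi flag_offload_abi;
};

/* Old condition: always true, because the default value 1 is nonzero.  */
static bool
gomp_builtins_enabled_old (const struct flags *f)
{
  return f->flag_openmp || f->flag_tree_parallelize_loops;
}

/* Condition after r224745, as in the quoted hunk.  */
static bool
gomp_builtins_enabled (const struct flags *f)
{
  return f->flag_openmp
         || f->flag_tree_parallelize_loops > 1
         || f->flag_cilkplus
         || f->flag_offload_abi != OFFLOAD_ABI_UNSET;
}

/* The same condition with "|| flag_openacc" added.  */
static bool
gomp_builtins_enabled_with_openacc (const struct flags *f)
{
  return gomp_builtins_enabled (f) || f->flag_openacc;
}

/* Flags as they would look for plain -fopenacc in this model.  */
static struct flags
plain_openacc (void)
{
  struct flags f = { 0, 1, 0, 1, OFFLOAD_ABI_UNSET };
  return f;
}
```

With the flags returned by plain_openacc, the old condition holds, the r224745 condition does not, and the variant with flag_openacc holds again.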

Jakub


Merge DEF_GOACC_BUILTIN into DEF_GOMP_BUILTIN? (was: OpenACC middle end changes)

2015-07-09 Thread Thomas Schwinge
Hi!

On Thu, 13 Nov 2014 19:09:49 +0100, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, Nov 13, 2014 at 05:59:11PM +0100, Thomas Schwinge wrote:
* should gcc/oacc-builtins.def just be merged into
  gcc/omp-builtins.def;
 
 Why not.  The reason why they aren't in gcc/builtins.def is that
 the Fortran FE doesn't source those, but OpenACC supports the same
 languages as OpenMP.

(We've done that,
http://news.gmane.org/find-root.php?message_id=%3C871tnxubgm.fsf%40schwinge.name%3E.)

Now, trying to merge trunk into gomp-4_0-branch, I've hit the problem
that Tom applied
http://news.gmane.org/find-root.php?message_id=%3C5583E052.2050207%40mentor.com%3E
in trunk r224745:

--- gcc/builtins.def
+++ gcc/builtins.def
@@ -182,7 +182,9 @@ along with GCC; see the file COPYING3.  If not see
 #define DEF_GOMP_BUILTIN(ENUM, NAME, TYPE, ATTRS) \
   DEF_BUILTIN (ENUM, "__builtin_" NAME, BUILT_IN_NORMAL, TYPE, TYPE,\
                false, true, true, ATTRS, false, \
-               (flag_openmp || flag_tree_parallelize_loops \
+               (flag_openmp \
+                || flag_tree_parallelize_loops > 1 \
+                || flag_cilkplus \
                 || flag_offload_abi != OFFLOAD_ABI_UNSET))
 
 /* Builtin used by implementation of Cilk Plus.  Most of these are decomposed

Before this patch, all DEF_GOMP_BUILTINs (erroneously) had always been
available, due to flag_tree_parallelize_loops's default value of 1.

With gcc/omp-low.c:lower_reduction_clauses using a
BUILT_IN_GOMP_ATOMIC_START/BUILT_IN_GOMP_ATOMIC_END sequence as a last
resort, and that being chosen for some kind of OpenACC reduction clauses
(which is present on gomp-4_0-branch only), we're then running into ICEs,
as those two DEF_GOMP_BUILTINs are not available with plain -fopenacc.

Now, is there actually a good reason to have separate DEF_GOACC_BUILTIN
and DEF_GOMP_BUILTIN directives (which I basically just initially did to
be least intrusive,
http://news.gmane.org/find-root.php?message_id=%3C1383766943-8863-6-git-send-email-thomas%40codesourcery.com%3E),
or should I just add flag_openacc to DEF_GOMP_BUILTIN, and change all
DEF_GOACC_BUILTIN instantiations to DEF_GOMP_BUILTIN?  Merging them
definitely makes sense to me now, so OK to do the obvious?


Regards,
 Thomas



Re: OpenACC middle end changes

2015-02-20 Thread Jakub Jelinek
On Wed, Nov 19, 2014 at 08:52:40PM +0100, Bernd Schmidt wrote:
 Another change that's required is (something like) the following. For ptx,
 we need to know whether to output something as a .func (callable from ptx
 code) or a .kernel (callable from the host). That means we need to mark the
 kernel functions somehow in omp-low.c, and the following does that by way of
 a new attribute (already recognized by the nvptx backend).

On second thought, I guess this is OK.  Adding a cgraph bit that is
interesting to just a single target and is quite rare is probably a waste,
especially when it would need to be streamed in and out in every cgraph
node.
As the nvptx backend already recognizes it and we already have the
"omp declare target" attribute, this is OK for trunk.

   * omp-low.c (create_omp_child_function): Tag entrypoint
 functions with a special attribute.
 
 diff --git a/gcc/omp-low.c b/gcc/omp-low.c
 index 42ba317..8408025 100644
 --- a/gcc/omp-low.c
 +++ b/gcc/omp-low.c
 @@ -2228,6 +2228,12 @@ create_omp_child_function (omp_context *ctx, bool
 task_copy)
 break;
   }
  }
 +  if (cgraph_node::get_create (decl)->offloadable
 +      && !lookup_attribute ("omp declare target",
 +                            DECL_ATTRIBUTES (current_function_decl)))
 +    DECL_ATTRIBUTES (decl)
 +      = tree_cons (get_identifier ("omp target entrypoint"),
 +                   NULL_TREE, DECL_ATTRIBUTES (decl));
 
t = build_decl (DECL_SOURCE_LOCATION (decl),
   RESULT_DECL, NULL_TREE, void_type_node);
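
To show the tagging rule from the hunk in isolation, here is a toy C model: a child function gets the "omp target entrypoint" attribute only when it is offloadable and the enclosing function does not itself carry "omp declare target". The struct layouts and the attr_cons, attr_lookup and tag_entrypoint helpers are hypothetical simplifications of GCC's tree lists, not the real API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy attribute chain; a stand-in for GCC's TREE_LIST of attributes.  */
struct attr
{
  const char *name;
  struct attr *next;
};

/* Toy declaration carrying only what the rule needs.  */
struct decl
{
  bool offloadable;
  struct attr *attributes;
};

/* Prepend NAME to CHAIN, similar in spirit to tree_cons.  */
static struct attr *
attr_cons (const char *name, struct attr *chain)
{
  assert (name != NULL);
  struct attr *a = malloc (sizeof *a);
  if (a == NULL)
    abort ();
  a->name = name;
  a->next = chain;
  return a;
}

/* Return the first entry called NAME, or NULL if there is none.  */
static const struct attr *
attr_lookup (const char *name, const struct attr *list)
{
  for (; list != NULL; list = list->next)
    if (strcmp (list->name, name) == 0)
      return list;
  return NULL;
}

/* Tag CHILD as an entrypoint if it is offloadable and PARENT is not
   itself an "omp declare target" function.  */
static void
tag_entrypoint (struct decl *child, const struct decl *parent)
{
  if (child->offloadable
      && attr_lookup ("omp declare target", parent->attributes) == NULL)
    child->attributes
      = attr_cons ("omp target entrypoint", child->attributes);
}
```

A backend could then decide between a host-callable kernel and a device-only function by looking up the entrypoint attribute, which is the use described for nvptx above.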

Jakub


nvptx offloading: Tag entrypoint functions with a special attribute (was: OpenACC middle end changes)

2015-02-20 Thread Thomas Schwinge
Hi!

On Fri, 20 Feb 2015 10:47:13 +0100, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Nov 19, 2014 at 08:52:40PM +0100, Bernd Schmidt wrote:
  Another change that's required is (something like) the following. For ptx,
  we need to know whether to output something as a .func (callable from ptx
  code) or a .kernel (callable from the host). That means we need to mark the
  kernel functions somehow in omp-low.c, and the following does that by way of
  a new attribute (already recognized by the nvptx backend).
 
 On second thought, I guess this is OK.  Adding a cgraph bit that is
 interesting to just a single target and is quite rare is probably a waste,
 especially when it would need to be streamed in and out in every cgraph
 node.

Heh, that's precisely the question I had just drafted in an email, just
about to send, when your email arrived.  ;-)

 As the nvptx backend already recognizes it and we already have the
 "omp declare target" attribute, this is OK for trunk.

Bernd, are you going to commit this, and the other approved changes of
yours?

  * omp-low.c (create_omp_child_function): Tag entrypoint
  functions with a special attribute.
  
  diff --git a/gcc/omp-low.c b/gcc/omp-low.c
  index 42ba317..8408025 100644
  --- a/gcc/omp-low.c
  +++ b/gcc/omp-low.c
  @@ -2228,6 +2228,12 @@ create_omp_child_function (omp_context *ctx, bool
  task_copy)
  break;
}
   }
  +  if (cgraph_node::get_create (decl)->offloadable
  +      && !lookup_attribute ("omp declare target",
  +                            DECL_ATTRIBUTES (current_function_decl)))
  +    DECL_ATTRIBUTES (decl)
  +      = tree_cons (get_identifier ("omp target entrypoint"),
  +                   NULL_TREE, DECL_ATTRIBUTES (decl));
  
 t = build_decl (DECL_SOURCE_LOCATION (decl),
RESULT_DECL, NULL_TREE, void_type_node);


Regards,
 Thomas



Re: OpenACC middle end changes

2015-02-13 Thread Thomas Schwinge
Hi!

On Thu, 18 Dec 2014 14:16:52 +0100, I wrote:
 --- /dev/null
 +++ gcc/config/i386/intelmic-offload.h

 +#define ACCEL_COMPILER_acc_device GOMP_DEVICE_INTEL_MIC

This one I got right...

 --- /dev/null
 +++ gcc/config/nvptx/offload.h
 @@ -0,0 +1,35 @@

 +#define ACCEL_COMPILER_acc_device GOMP_TARGET_NVIDIA_PTX

..., but not this one.  Committed to trunk in r220686:

commit 8fbeb4361af9e77c57d3b15c7be11759a4f608c0
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Fri Feb 13 16:20:01 2015 +

GOMP_TARGET_* have been renamed to GOMP_DEVICE_* some time ago.

gcc/
* config/nvptx/offload.h (ACCEL_COMPILER_acc_device): Define to
GOMP_DEVICE_NVIDIA_PTX.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@220686 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog  |5 +
 gcc/config/nvptx/offload.h |2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git gcc/ChangeLog gcc/ChangeLog
index e06f69a..d9c58b9 100644
--- gcc/ChangeLog
+++ gcc/ChangeLog
@@ -1,3 +1,8 @@
+2015-02-13  Thomas Schwinge  tho...@codesourcery.com
+
+   * config/nvptx/offload.h (ACCEL_COMPILER_acc_device): Define to
+   GOMP_DEVICE_NVIDIA_PTX.
+
 2015-02-13  Jakub Jelinek  ja...@redhat.com
 
PR ipa/65034
diff --git gcc/config/nvptx/offload.h gcc/config/nvptx/offload.h
index 02c5e8b..9a749a2 100644
--- gcc/config/nvptx/offload.h
+++ gcc/config/nvptx/offload.h
@@ -30,6 +30,6 @@
 
 #include "gomp-constants.h"
 
-#define ACCEL_COMPILER_acc_device GOMP_TARGET_NVIDIA_PTX
+#define ACCEL_COMPILER_acc_device GOMP_DEVICE_NVIDIA_PTX
 
 #endif


Regards,
 Thomas



Re: OpenACC middle end changes

2014-12-18 Thread Thomas Schwinge
Hi!

On Thu, 13 Nov 2014 19:09:49 +0100, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, Nov 13, 2014 at 05:59:11PM +0100, Thomas Schwinge wrote:
  --- gcc/builtins.c
  +++ gcc/builtins.c

  +/* Expand OpenACC acc_on_device.
  +
  +   This has to happen late (that is, not in early folding; 
  expand_builtin_*,
  +   rather than fold_builtin_*), as we have to act differently for host and
  +   acceleration device (ACCEL_COMPILER conditional).  */
  +
  +static rtx
  +expand_builtin_acc_on_device (tree exp, rtx target ATTRIBUTE_UNUSED)
  +{
  +  if (!validate_arglist (exp, INTEGER_TYPE, VOID_TYPE))
  +return NULL_RTX;
  +
  +  tree arg, v1, v2, ret;
  +  location_t loc;
  +
  +  arg = CALL_EXPR_ARG (exp, 0);
  +  arg = builtin_save_expr (arg);
  +  loc = EXPR_LOCATION (exp);
  +
  +  /* Build: (arg == v1 || arg == v2) ? 1 : 0.  */
  +
  +#ifdef ACCEL_COMPILER
  +  v1 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_not_host */ 3);
  +  v2 = build_int_cst (TREE_TYPE (arg), ACCEL_COMPILER_acc_device);
  +#else
  +  v1 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_none */ 0);
  +  v2 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_host */ 2);
  +#endif
  +
  +  v1 = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, v1);
  +  v2 = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, v2);
  +
  +  /* Can't use TRUTH_ORIF_EXPR, as that is not supported by
  + expand_expr_real*.  */
  +  ret = fold_build3_loc (loc, COND_EXPR, integer_type_node, v1, v1, v2);
  +  ret = fold_build3_loc (loc, COND_EXPR, integer_type_node,
  +ret, integer_one_node, integer_zero_node);
  +
  +  return expand_normal (ret);
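
The two nested COND_EXPRs in the hunk stand in for a logical OR. A minimal C model of that shape, with a hypothetical function name and plain ints in place of trees, might look like this:

```c
/* Model of the nested conditionals: (e1 ? e1 : e2) ? 1 : 0, where
   e1 and e2 are the two equality tests.  */
static int
or_via_cond (int arg, int v1, int v2)
{
  int e1 = (arg == v1);
  int e2 = (arg == v2);
  int ret = e1 ? e1 : e2;   /* First COND_EXPR: e1 if true, else e2.  */
  return ret ? 1 : 0;       /* Second COND_EXPR: normalize to 0 or 1.  */
}
```

For any argument this gives the same result as (arg == v1 || arg == v2) ? 1 : 0, which is the expression named in the comment of the patch.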
 
 If you can't fold it late (which is indeed a problem for -O0),
 then I'd suggest to implement this more RTL-ish.
 So, avoid the builtin_save_expr, instead
   rtx op = expand_normal (arg);
 Don't build v1/v2 as trees (and, please fix the TODOs), but rtxes,

(acc_device_* TODOs already resolved earlier on.)

 just
   rtx v1 = GEN_INT (...);
   rtx v2 = GEN_INT (...);
   machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
   rtx ret = gen_reg_rtx (TYPE_MODE (integer_type_node));
   emit_move_insn (ret, const0_rtx);
   rtx_code_label *done_label = gen_label_rtx ();
   emit_cmp_and_jump_insns (op, v1, NE, NULL_RTX, mode,
  false, done_label, PROB_EVEN);
   emit_cmp_and_jump_insns (op, v2, NE, NULL_RTX, mode,
  false, done_label, PROB_EVEN);
   emit_move_insn (ret, const1_rtx);
   emit_label (done_label);
   return ret;
 or similar.
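
For readers who want the control flow of such a compare-and-jump sequence spelled out, here is a plain C model using gotos. The function names are hypothetical, and the sketch assumes that a compare-and-jump takes the branch when its comparison holds. The first function computes (arg == v1 || arg == v2) ? 1 : 0 by starting from 1 and jumping on equality. The second is a literal transcription of the shape above (start from 0, jump on inequality); read that way, it yields 1 only when arg equals both values.

```c
/* Starts from 1 and jumps to the end as soon as one value matches.  */
static int
on_device_eq_form (int arg, int v1, int v2)
{
  int ret = 1;
  if (arg == v1)
    goto done;
  if (arg == v2)
    goto done;
  ret = 0;
done:
  return ret;
}

/* Literal transcription: starts from 0 and jumps on inequality.  */
static int
on_device_ne_form (int arg, int v1, int v2)
{
  int ret = 0;
  if (arg != v1)
    goto done;
  if (arg != v2)
    goto done;
  ret = 1;
done:
  return ret;
}
```

With v1 = 0 and v2 = 2, the first form returns 1 for arg 0 and arg 2, while the second returns 0 for both.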

Thanks for the review/suggestion/code!

 Note, it would still be worthwhile to fold the builtin, at least
 when optimizing, after IPA.  Dunno if we have some property you can check,
 and Richard B. could suggest where it would be most appropriate (if GIMPLE
 guarded match.pd entry, or what), gimple_fold, etc.

I'll make a note to have a look at that later on.

 I bet I should handle omp_is_initial_device (); similarly.

Yeah.

Committed to gomp-4_0-branch in r218858:

commit da5ad5aec1c0f9b230ecb2dc00620a5598de5066
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Thu Dec 18 10:42:30 2014 +

OpenACC acc_on_device: Make builtin expansion more RTXy.

gcc/
* builtins.c (expand_builtin_acc_on_device): Make more RTXy.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@218858 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |  5 +
 gcc/builtins.c | 44 +---
 2 files changed, 26 insertions(+), 23 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index b370616..a3650c5 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2014-12-18  Thomas Schwinge  tho...@codesourcery.com
+   Jakub Jelinek  ja...@redhat.com
+
+   * builtins.c (expand_builtin_acc_on_device): Make more RTXy.
+
 2014-12-17  Thomas Schwinge  tho...@codesourcery.com
Bernd Schmidt  ber...@codesourcery.com
 
diff --git gcc/builtins.c gcc/builtins.c
index fcf3f53..e946521 100644
--- gcc/builtins.c
+++ gcc/builtins.c
@@ -5889,38 +5889,36 @@ expand_stack_save (void)
acceleration device (ACCEL_COMPILER conditional).  */
 
 static rtx
-expand_builtin_acc_on_device (tree exp, rtx target ATTRIBUTE_UNUSED)
+expand_builtin_acc_on_device (tree exp, rtx target)
 {
   if (!validate_arglist (exp, INTEGER_TYPE, VOID_TYPE))
 return NULL_RTX;
 
-  tree arg, v1, v2, ret;
-  location_t loc;
-
-  arg = CALL_EXPR_ARG (exp, 0);
-  arg = builtin_save_expr (arg);
-  loc = EXPR_LOCATION (exp);
-
-  /* Build: (arg == v1 || arg == v2) ? 1 : 0.  */
+  tree arg = CALL_EXPR_ARG (exp, 0);
 
+  /* Return (arg == v1 || arg == v2) ? 1 : 0.  */
+  machine_mode v_mode = TYPE_MODE (TREE_TYPE (arg));
+  rtx v = expand_normal (arg), v1, v2;
 #ifdef ACCEL_COMPILER
-  v1 = build_int_cst (TREE_TYPE (arg), GOMP_DEVICE_NOT_HOST);
-  v2 = build_int_cst (TREE_TYPE (arg), 

Re: OpenACC middle end changes

2014-12-18 Thread Thomas Schwinge
 directives I find
 the gcc_assert (!is_gimple_omp_oacc_specifically (ctx->stmt));
 completely unnecessary.

Now all removed.  My thinking was that some of those clauses are
parsed/generated not only in the front ends, but also synthesized in
middle end processing, and I wanted to catch those.

  @@ -1625,13 +1799,41 @@ scan_sharing_clauses (tree clauses, omp_context 
  *ctx)
  case OMP_CLAUSE_DIST_SCHEDULE:
  case OMP_CLAUSE_DEPEND:
  case OMP_CLAUSE__CILK_FOR_COUNT_:
  + gcc_assert (!is_gimple_omp_oacc_specifically (ctx->stmt));
  + /* FALLTHRU */
  +   case OMP_CLAUSE_IF:
 
 [...] if there are
 some spots you want to keep them in for now, consider gcc_checking_assert
 instead.

Now using this a few times.
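
The difference between an always-on assert and a checking-only assert can be sketched in standalone C as follows. The MODEL_CHECKING switch and the model_* macro names are hypothetical; they only imitate the idea that the checking variant is compiled out unless checking is enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical switch standing in for a build-time checking option.  */
#ifndef MODEL_CHECKING
#define MODEL_CHECKING 0
#endif

/* Counts how often a check expression was actually evaluated.  */
static int checks_evaluated;

static bool
note_check (bool ok)
{
  checks_evaluated++;
  return ok;
}

#define model_assert(EXPR) assert (EXPR)

#if MODEL_CHECKING
#define model_checking_assert(EXPR) model_assert (EXPR)
#else
/* EXPR is still type-checked, but never evaluated.  */
#define model_checking_assert(EXPR) ((void) (0 && (EXPR)))
#endif

/* Returns the number of checks evaluated so far.  */
static int
scan_clause (bool is_oacc)
{
  model_checking_assert (note_check (!is_oacc));
  return checks_evaluated;
}
```

With MODEL_CHECKING left at 0, scan_clause (true) returns 0 and does not abort; built with -DMODEL_CHECKING=1, the same call would trip the assertion.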


  --- gcc/tree-nested.c
  +++ gcc/tree-nested.c
  @@ -627,6 +627,8 @@ walk_gimple_omp_for (gimple for_stmt,
   walk_stmt_fn callback_stmt, walk_tree_fn 
  callback_op,
   struct nesting_info *info)
   {
  +  gcc_assert (!is_gimple_omp_oacc_specifically (for_stmt));
  +
 
 That surely can be reached and you can easily construct testcase, can't you?
 
  @@ -1323,6 +1325,10 @@ convert_nonlocal_reference_stmt 
  (gimple_stmt_iterator *gsi, bool *handled_ops_p,
  }
 break;
   
  +case GIMPLE_OACC_KERNELS:
  +case GIMPLE_OACC_PARALLEL:
  +  gcc_unreachable ();
  +
 
 Ditto etc.

Same reasoning as for gimple_copy given above.  (And,
asserts/gcc_unreachable now all gone.)


Do you want me to repost the OpenACC Middle End changes patch, or would
you be OK with reviewing the code on gomp-4_0-branch, diffing against the
last trunk merge point, 0fcfaa33cbf333ac69cc2b01a7277e5272ff8a3d,
r218679?


Regards,
 Thomas



Re: OpenACC middle end changes

2014-12-18 Thread Jakub Jelinek
On Thu, Dec 18, 2014 at 11:46:00AM +0100, Thomas Schwinge wrote:
  just
rtx v1 = GEN_INT (...);
rtx v2 = GEN_INT (...);
machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
rtx ret = gen_reg_rtx (TYPE_MODE (integer_type_node));
emit_move_insn (ret, const0_rtx);
rtx_code_label *done_label = gen_label_rtx ();
emit_cmp_and_jump_insns (op, v1, NE, NULL_RTX, mode,
 false, done_label, PROB_EVEN);
emit_cmp_and_jump_insns (op, v2, NE, NULL_RTX, mode,
 false, done_label, PROB_EVEN);
emit_move_insn (ret, const1_rtx);
emit_label (done_label);
return ret;
  or similar.
 
 Thanks for the review/suggestion/code!

Note, as I found later, emit_cmp_and_jump_insns is good enough only
for certain modes on certain architectures (in particular, for
cases where can_compare_p returns true).
So it is better to use do_compare_rtx_and_jump instead of
emit_cmp_and_jump_insns, because it handles also the cases which
emit_cmp_and_jump_insns silently mishandles.  You'll need to reorder
the arguments a little bit and add one NULL_RTX argument.
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63848#c4

Jakub


Re: OpenACC middle end changes

2014-12-18 Thread Jakub Jelinek
On Thu, Dec 18, 2014 at 12:07:01PM +0100, Thomas Schwinge wrote:
   - case GF_OMP_FOR_KIND_SIMD:
   -   kind = " simd";
   -   break;
   - case GF_OMP_FOR_KIND_CILKSIMD:
   -   kind = " cilksimd";
   -   break;
     case GF_OMP_FOR_KIND_DISTRIBUTE:
       kind = " distribute";
       break;
     case GF_OMP_FOR_KIND_CILKFOR:
       kind = " _Cilk_for";
       break;
   + case GF_OMP_FOR_KIND_OACC_LOOP:
   +   kind = " oacc_loop";
   +   break;
   + case GF_OMP_FOR_KIND_SIMD:
   +   kind = " simd";
   +   break;
   + case GF_OMP_FOR_KIND_CILKSIMD:
   +   kind = " cilksimd";
   +   break;
  
  Why the reshuffling?  The result isn't alphabetically sorted
  anyway.  I'd just add new stuff at the end ;)
 
 It's the order in which the GF_OMP_FOR_KIND_* are defined.  At least for
 my mind ;-) that makes it very easy to grasp that all of them are
 covered.

Ok.
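
The argument for keeping the cases in definition order can be illustrated with a small C sketch. The enum below is hypothetical and only mirrors the case order of the quoted hunk; its names and values are not GCC's GF_OMP_FOR_KIND_* definitions.

```c
#include <stddef.h>

/* Hypothetical enum in the same order as the cases in the hunk.  */
enum for_kind
{
  KIND_DISTRIBUTE,
  KIND_CILKFOR,
  KIND_OACC_LOOP,
  KIND_SIMD,
  KIND_CILKSIMD
};

/* Map a kind to the suffix printed after the construct name.  */
static const char *
for_kind_suffix (enum for_kind kind)
{
  /* No default label: with -Wswitch a compiler can report an
     enumerator that has no case here.  */
  switch (kind)
    {
    case KIND_DISTRIBUTE:
      return " distribute";
    case KIND_CILKFOR:
      return " _Cilk_for";
    case KIND_OACC_LOOP:
      return " oacc_loop";
    case KIND_SIMD:
      return " simd";
    case KIND_CILKSIMD:
      return " cilksimd";
    }
  return NULL;  /* Not reached for valid enumerators.  */
}
```

Because the switch follows the enum line by line, a reader can compare the two lists side by side, and the missing default lets the compiler help when a new kind is added.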

   +/* Return true if STMT is any of the OpenACC types specifically.  */
   +
   +static inline bool
   +is_gimple_omp_oacc_specifically (const_gimple stmt)
  
  Why not is_gimple_oacc or gimple_oacc_p ?
 
 The idea is to make it clear in the name that STMT must be an OMP one.
 Now renamed to the shorter is_gimple_omp_oacc.

Ok.

  If you want to shift from bitmasks in the enum
  to extra on the side bits (why?), then combined
  for parallel is another thing.
 
 Right, but I've now dropped (reverted) this and further gimplification
 changes.  Maybe this is material for the next stage 1, but maybe not
 useful enough.

Ack.

 Do you want me to repost the OpenACC Middle End changes patch, or would
 you be OK with reviewing the code on gomp-4_0-branch, diffing against the
 last trunk merge point, 0fcfaa33cbf333ac69cc2b01a7277e5272ff8a3d,
 r218679?

So, is what is on the gomp-4_0-branch now all that you'd like to merge to
trunk now?  Has it been tested on nvptx?  I guess we should test it with
XeonPhi offloading too to make sure it doesn't break.
And then you or together with your coworkers should write the summary
ChangeLogs (i.e. what changed compared to trunk in a single giant entry
for each ChangeLog file as opposed to many ChangeLog.gomp change entries).

Jakub


Re: OpenACC middle end changes

2014-12-18 Thread Jakub Jelinek
On Thu, Dec 18, 2014 at 12:38:53PM +0100, Jakub Jelinek wrote:
 So, is what is on the gomp-4_0-branch now all that you'd like to merge to
 trunk now?  Has it been tested on nvptx?  I guess we should test it with
 XeonPhi offloading too to make sure it doesn't break.
 And then you or together with your coworkers should write the summary
 ChangeLogs (i.e. what changed compared to trunk in a single giant entry
 for each ChangeLog file as opposed to many ChangeLog.gomp change entries).

Also, it would be nice to update wiki/Offloading to give details on all the
steps how to configure nvptx offloading (how to grab nvptx-newlib,
nvptx-tools, how to configure nvptx-none compiler, in what order to build
those etc.).

Jakub


Re: OpenACC middle end changes

2014-12-18 Thread Jakub Jelinek
On Thu, Dec 18, 2014 at 01:02:22PM +0100, Jakub Jelinek wrote:
 On Thu, Dec 18, 2014 at 12:38:53PM +0100, Jakub Jelinek wrote:
  So, is what is on the gomp-4_0-branch now all that you'd like to merge to
  trunk now?  Has it been tested on nvptx?  I guess we should test it with
  XeonPhi offloading too to make sure it doesn't break.
  And then you or together with your coworkers should write the summary
  ChangeLogs (i.e. what changed compared to trunk in a single giant entry
  for each ChangeLog file as opposed to many ChangeLog.gomp change entries).
 
 Also, it would be nice to update wiki/Offloading to give details on all the
 steps how to configure nvptx offloading (how to grab nvptx-newlib,
 nvptx-tools, how to configure nvptx-none compiler, in what order to build
 those etc.).

FYI, just tried to build gomp-4_0-branch with:
../configure --build=x86_64-intelmicemul-linux-gnu 
--host=x86_64-intelmicemul-linux-gnu --target=x86_64-intelmicemul-linux-gnu 
--enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-bootstrap
make -j16
and the build failed with:
../../gcc/builtins.c: In function ‘rtx_def* expand_builtin_acc_on_device(tree, 
rtx)’:
../../gcc/builtins.c:5904:17: error: ‘ACCEL_COMPILER_acc_device’ was not 
declared in this scope
   v2 = GEN_INT (ACCEL_COMPILER_acc_device);
 ^
../../gcc/rtl.h:3186:51: note: in definition of macro ‘GEN_INT’
 #define GEN_INT(N)  gen_rtx_CONST_INT (VOIDmode, (N))
   ^

Where is ACCEL_COMPILER_acc_device macro supposed to be defined?
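
One possible way to make a missing definition fail with a clearer message is a preprocessor guard next to the use. The sketch below is only an illustration of that idea: MODEL_DEVICE_ID and its value are placeholders, the ACCEL_COMPILER define is set by hand here, and accel_device_id is a hypothetical helper, not a function in GCC.

```c
/* Pretend this translation unit is built as an offload compiler.  */
#define ACCEL_COMPILER 1

/* Normally a per-target header would supply this; the value is a
   placeholder chosen for the example.  */
#define MODEL_DEVICE_ID 42
#define ACCEL_COMPILER_acc_device MODEL_DEVICE_ID

/* Fail early, with a readable message, if the target header forgot
   to provide the macro.  */
#if defined ACCEL_COMPILER && !defined ACCEL_COMPILER_acc_device
# error "offload target header must define ACCEL_COMPILER_acc_device"
#endif

/* Device id seen by the offload compiler, or -1 on the host.  */
static int
accel_device_id (void)
{
#ifdef ACCEL_COMPILER
  return ACCEL_COMPILER_acc_device;
#else
  return -1;
#endif
}
```

Removing the ACCEL_COMPILER_acc_device define from this sketch triggers the #error instead of a "not declared in this scope" diagnostic at the point of use.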

Jakub


Re: OpenACC middle end changes

2014-12-18 Thread Thomas Schwinge
Hi Jakub!

On Thu, 18 Dec 2014 13:15:38 +0100, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, Dec 18, 2014 at 01:02:22PM +0100, Jakub Jelinek wrote:
  On Thu, Dec 18, 2014 at 12:38:53PM +0100, Jakub Jelinek wrote:
   So, is what is on the gomp-4_0-branch now all that you'd like to merge to
   trunk now?

Basically, yes.  Only basically, because there are still a few
unaddressed review issues in the front ends -- which I'll look into now.
(Meaning really: now.)  :-)


Doing the merge as one big commit on trunk will be the easiest
approach, of course.  Is that OK, or is there any requirement to single
out any of the changes, such as the
libgomp/testsuite/libgomp-test-support.exp file just discussed, or the
libgomp »Offloading and Multi Processing Runtime Library« renaming, or
anything else?


   Has it been tested on nvptx?

I have always been testing on gomp-4_0-branch with ACC_DEVICE_TYPE=host
and ACC_DEVICE_TYPE=host_nonshm, plus with ACC_DEVICE_TYPE=nvidia in an
internal branch.  This branch corresponds to gomp-4_0-branch, but also
includes a few patches related to offloading that Bernd has posted for
trunk approval, but has not yet gotten approved.


   I guess we should test it with
   XeonPhi offloading too to make sure it doesn't break.

Right.  Do you happen to be set up for such testing?  I have not yet
managed to properly change my build/test scripts for
x86_64-intelmicemul-linux-gnu.


   And then you or together with your coworkers should write the summary
   ChangeLogs (i.e. what changed compared to trunk in a single giant entry
   for each ChangeLog file as opposed to many ChangeLog.gomp change entries).

Right, I'll do that once it's time to merge -- otherwise it'll be too
cumbersome to keep those up to date.


  Also, it would be nice to update wiki/Offloading to give details on all the
  steps how to configure nvptx offloading (how to grab nvptx-newlib,
  nvptx-tools, how to configure nvptx-none compiler, in what order to build
  those etc.).

Right.


 FYI, just tried to build gomp-4_0-branch with:
 ../configure --build=x86_64-intelmicemul-linux-gnu 
 --host=x86_64-intelmicemul-linux-gnu --target=x86_64-intelmicemul-linux-gnu 
 --enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-bootstrap
 make -j16
 and the build failed with:
 ../../gcc/builtins.c: In function ‘rtx_def* 
 expand_builtin_acc_on_device(tree, rtx)’:
 ../../gcc/builtins.c:5904:17: error: ‘ACCEL_COMPILER_acc_device’ was not 
 declared in this scope
v2 = GEN_INT (ACCEL_COMPILER_acc_device);
  ^
 ../../gcc/rtl.h:3186:51: note: in definition of macro ‘GEN_INT’
  #define GEN_INT(N)  gen_rtx_CONST_INT (VOIDmode, (N))
^
 
 Where is ACCEL_COMPILER_acc_device macro supposed to be defined?

From b6781092de7cc9fc8c24600815e0a1223e1241f5 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge tho...@codesourcery.com
Date: Wed, 17 Dec 2014 10:10:53 +0100
Subject: [PATCH] Intel MIC offloading.

---
 gcc/config.gcc |  1 +
 gcc/config/i386/intelmic-offload.h | 35 +++
 2 files changed, 36 insertions(+)
 create mode 100644 gcc/config/i386/intelmic-offload.h

diff --git gcc/config.gcc gcc/config.gcc
index 8541274..faad47d 100644
--- gcc/config.gcc
+++ gcc/config.gcc
@@ -2906,6 +2906,7 @@ esac
 case ${target} in
 *-intelmic-* | *-intelmicemul-*)
        tmake_file="${tmake_file} i386/t-intelmic"
+       tm_file="${tm_file} i386/intelmic-offload.h"
        ;;
 esac
 
diff --git gcc/config/i386/intelmic-offload.h gcc/config/i386/intelmic-offload.h
new file mode 100644
index 000..dc346c7
--- /dev/null
+++ gcc/config/i386/intelmic-offload.h
@@ -0,0 +1,35 @@
+/* Definitions for Intel MIC offloading.
+
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   http://www.gnu.org/licenses/.  */
+
+#ifndef INTELMIC_OFFLOAD_H
+#define INTELMIC_OFFLOAD_H
+
+/* Support for OpenACC acc_on_device.  */
+
+#include "gomp-constants.h"
+
+#define ACCEL_COMPILER_acc_device GOMP_DEVICE_INTEL_MIC
+
+#endif
-- 
1.9.1


Regards,
 Thomas



Re: OpenACC middle end changes

2014-12-18 Thread Jakub Jelinek
On Thu, Dec 18, 2014 at 01:24:20PM +0100, Thomas Schwinge wrote:
 Hi Jakub!
 
 On Thu, 18 Dec 2014 13:15:38 +0100, Jakub Jelinek ja...@redhat.com wrote:
  On Thu, Dec 18, 2014 at 01:02:22PM +0100, Jakub Jelinek wrote:
   On Thu, Dec 18, 2014 at 12:38:53PM +0100, Jakub Jelinek wrote:
So, is what is on the gomp-4_0-branch now all that you'd like to merge 
to
trunk now?
 
 Basically, yes.  Only basically, because there are still a few
 unaddressed review issues in the front ends -- which I'll look into now.
 (Meaning really: now.)  :-)
 
 
 Doing the merge as one big commit on trunk will be the easiest
 approach, of course.  Is that OK, or is there any requirement to single
 out any of the changes, such as the
 libgomp/testsuite/libgomp-test-support.exp file just discussed, or the
 libgomp »Offloading and Multi Processing Runtime Library« renaming, or
 anything else?

Doing one big merge is ok with me.  If one needs to bisect something, they
can look at the gomp-4_0-branch.

Has it been tested on nvptx?
 
 I have always been testing on gomp-4_0-branch with ACC_DEVICE_TYPE=host
 and ACC_DEVICE_TYPE=host_nonshm, plus with ACC_DEVICE_TYPE=nvidia in an
 internal branch.  This branch corresponds to gomp-4_0-branch, but also
 includes a few patches related to offloading that Bernd has posted for
 trunk approval, but has not yet gotten approved.

Do you have a list of them (URLs)?  Have they been pinged?
Are they show stoppers for the offloading, or just some tests fail because
of that?

I guess we should test it with
XeonPhi offloading too to make sure it doesn't break.
 
 Right.  Do you happen to be set up for such testing?  I have not yet
 managed to properly change my build/test scripts for
 x86_64-intelmicemul-linux-gnu.

Anyone with x86_64-linux should be able to test that (i.e. the emulation),
for real offloading (that is x86_64-intelmic-linux-gnu) you supposedly need
2 libraries, some kernel module, some distro installed on the offloading
device and most importantly the hw.

 --- gcc/config.gcc
 +++ gcc/config.gcc
 @@ -2906,6 +2906,7 @@ esac
  case ${target} in
  *-intelmic-* | *-intelmicemul-*)
         tmake_file="${tmake_file} i386/t-intelmic"
 +       tm_file="${tm_file} i386/intelmic-offload.h"
   ;;
  esac
  
 diff --git gcc/config/i386/intelmic-offload.h 
 gcc/config/i386/intelmic-offload.h
 new file mode 100644
 index 000..dc346c7
 --- /dev/null
 +++ gcc/config/i386/intelmic-offload.h
 @@ -0,0 +1,35 @@
 +/* Definitions for Intel MIC offloading.
 +
 +   Copyright (C) 2014 Free Software Foundation, Inc.
 +
 +   This file is part of GCC.
 +
 +   GCC is free software; you can redistribute it and/or modify
 +   it under the terms of the GNU General Public License as published by
 +   the Free Software Foundation; either version 3, or (at your option)
 +   any later version.
 +
 +   GCC is distributed in the hope that it will be useful,
 +   but WITHOUT ANY WARRANTY; without even the implied warranty of
 +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 +   GNU General Public License for more details.
 +
 +   Under Section 7 of GPL version 3, you are granted additional
 +   permissions described in the GCC Runtime Library Exception, version
 +   3.1, as published by the Free Software Foundation.
 +
 +   You should have received a copy of the GNU General Public License and
 +   a copy of the GCC Runtime Library Exception along with this program;
 +   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 +   http://www.gnu.org/licenses/.  */
 +
 +#ifndef INTELMIC_OFFLOAD_H
 +#define INTELMIC_OFFLOAD_H
 +
 +/* Support for OpenACC acc_on_device.  */
 +
 +#include "gomp-constants.h"
 +
 +#define ACCEL_COMPILER_acc_device GOMP_DEVICE_INTEL_MIC
 +
 +#endif

LGTM.

Jakub


Re: OpenACC middle end changes

2014-12-18 Thread Jakub Jelinek
On Thu, Dec 18, 2014 at 01:31:45PM +0100, Jakub Jelinek wrote:
  --- gcc/config.gcc
  +++ gcc/config.gcc
  @@ -2906,6 +2906,7 @@ esac
   case ${target} in
   *-intelmic-* | *-intelmicemul-*)
          tmake_file="${tmake_file} i386/t-intelmic"
  +       tm_file="${tm_file} i386/intelmic-offload.h"
  ;;
   esac
   
  +#define ACCEL_COMPILER_acc_device GOMP_DEVICE_INTEL_MIC
  +
  +#endif

Oh, and where is this defined for nvptx-none target?

Jakub


Re: OpenACC middle end changes

2014-12-18 Thread Thomas Schwinge
Hi Jakub!

On Thu, 18 Dec 2014 12:33:11 +0100, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, Dec 18, 2014 at 11:46:00AM +0100, Thomas Schwinge wrote:
   just
 rtx v1 = GEN_INT (...);
 rtx v2 = GEN_INT (...);
 machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
 rtx ret = gen_reg_rtx (TYPE_MODE (integer_type_node));
 emit_move_insn (ret, const0_rtx);
 rtx_code_label *done_label = gen_label_rtx ();
 emit_cmp_and_jump_insns (op, v1, NE, NULL_RTX, mode,
false, done_label, PROB_EVEN);
 emit_cmp_and_jump_insns (op, v2, NE, NULL_RTX, mode,
false, done_label, PROB_EVEN);
 emit_move_insn (ret, const1_rtx);
 emit_label (done_label);
 return ret;
   or similar.
  
  Thanks for the review/suggestion/code!
 
 Note, as I found later, emit_cmp_and_jump_insns is good enough only
 for certain modes on certain architectures (in particular, for
 cases where can_compare_p returns true).
 So it is better to use do_compare_rtx_and_jump instead of
 emit_cmp_and_jump_insns, because it handles also the cases which
 emit_cmp_and_jump_insns silently mishandles.  You'll need to reorder
 the arguments a little bit and add one NULL_RTX argument.
 See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63848#c4

Thanks again; committed to gomp-4_0-branch in r218862:

commit a58e1475324e6dd6c34a95883f5efc854e204fde
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Thu Dec 18 13:13:06 2014 +

OpenACC acc_on_device: Harden builtin expansion.

gcc/
* builtins.c (expand_builtin_acc_on_device): Use
do_compare_rtx_and_jump instead of emit_cmp_and_jump_insns.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@218862 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp | 3 +++
 gcc/builtins.c | 8 
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index a3650c5..1e6df5f 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,6 +1,9 @@
 2014-12-18  Thomas Schwinge  tho...@codesourcery.com
Jakub Jelinek  ja...@redhat.com
 
+   * builtins.c (expand_builtin_acc_on_device): Use
+   do_compare_rtx_and_jump instead of emit_cmp_and_jump_insns.
+
* builtins.c (expand_builtin_acc_on_device): Make more RTXy.
 
 2014-12-17  Thomas Schwinge  tho...@codesourcery.com
diff --git gcc/builtins.c gcc/builtins.c
index e946521..33025a5 100644
--- gcc/builtins.c
+++ gcc/builtins.c
@@ -5911,10 +5911,10 @@ expand_builtin_acc_on_device (tree exp, rtx target)
 target = gen_reg_rtx (target_mode);
   emit_move_insn (target, const0_rtx);
   rtx_code_label *done_label = gen_label_rtx ();
-  emit_cmp_and_jump_insns (v, v1, NE, NULL_RTX, v_mode,
-  false, done_label, PROB_EVEN);
-  emit_cmp_and_jump_insns (v, v2, NE, NULL_RTX, v_mode,
-  false, done_label, PROB_EVEN);
+  do_compare_rtx_and_jump (v, v1, NE, false, v_mode, NULL_RTX,
+  NULL_RTX, done_label, PROB_EVEN);
+  do_compare_rtx_and_jump (v, v2, NE, false, v_mode, NULL_RTX,
+  NULL_RTX, done_label, PROB_EVEN);
   emit_move_insn (target, const1_rtx);
   emit_label (done_label);
 


Grüße,
 Thomas




Re: OpenACC middle end changes

2014-12-18 Thread Thomas Schwinge
Hi Jakub!

On Thu, 18 Dec 2014 13:36:16 +0100, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, Dec 18, 2014 at 01:31:45PM +0100, Jakub Jelinek wrote:
   --- gcc/config.gcc
   +++ gcc/config.gcc
   @@ -2906,6 +2906,7 @@ esac
case ${target} in
*-intelmic-* | *-intelmicemul-*)
 tmake_file=${tmake_file} i386/t-intelmic
   + tm_file=${tm_file} i386/intelmic-offload.h
 ;;
esac

   +#define ACCEL_COMPILER_acc_device GOMP_DEVICE_INTEL_MIC
   +
   +#endif
 
 Oh, and where is this defined for nvptx-none target?

One of the pending commits (together with the nvptx mkoffload); I now
applied the following to gomp-4_0-branch in r218863:

commit c3f63a62eb4332f651be5fd377d2d289c8c949f5
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Thu Dec 18 13:13:17 2014 +

Support for OpenACC acc_on_device in offloading configurations.

gcc/
* config/i386/intelmic-offload.h: New file.
* config/nvptx/offload.h: Likewise.
* config.gcc *-intelmic-*, *-intelmicemul-*, nvptx-*: Point to
them via tm_file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@218863 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |  7 +++
 gcc/config.gcc |  2 ++
 gcc/config/i386/intelmic-offload.h | 35 +++
 gcc/config/nvptx/offload.h | 35 +++
 4 files changed, 79 insertions(+)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 1e6df5f..a744ebf 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,4 +1,11 @@
 2014-12-18  Thomas Schwinge  tho...@codesourcery.com
+
+   * config/i386/intelmic-offload.h: New file.
+   * config/nvptx/offload.h: Likewise.
+   * config.gcc *-intelmic-*, *-intelmicemul-*, nvptx-*: Point to
+   them via tm_file.
+
+2014-12-18  Thomas Schwinge  tho...@codesourcery.com
Jakub Jelinek  ja...@redhat.com
 
* builtins.c (expand_builtin_acc_on_device): Use
diff --git gcc/config.gcc gcc/config.gcc
index 8541274..1e453e9 100644
--- gcc/config.gcc
+++ gcc/config.gcc
@@ -2178,6 +2178,7 @@ nios2-*-*)
 nvptx-*)
tm_file=${tm_file} newlib-stdint.h
tmake_file=nvptx/t-nvptx
+   tm_file=${tm_file} nvptx/offload.h
;;
 pdp11-*-*)
tm_file=${tm_file} newlib-stdint.h
@@ -2906,6 +2907,7 @@ esac
 case ${target} in
 *-intelmic-* | *-intelmicemul-*)
tmake_file=${tmake_file} i386/t-intelmic
+   tm_file=${tm_file} i386/intelmic-offload.h
;;
 esac
 
diff --git gcc/config/i386/intelmic-offload.h gcc/config/i386/intelmic-offload.h
new file mode 100644
index 000..bea18ed
--- /dev/null
+++ gcc/config/i386/intelmic-offload.h
@@ -0,0 +1,35 @@
+/* Support for Intel MIC offloading.
+
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   http://www.gnu.org/licenses/.  */
+
+#ifndef INTELMIC_OFFLOAD_H
+#define INTELMIC_OFFLOAD_H
+
+/* Support for OpenACC acc_on_device.  */
+
+#include "gomp-constants.h"
+
+#define ACCEL_COMPILER_acc_device GOMP_DEVICE_INTEL_MIC
+
+#endif
diff --git gcc/config/nvptx/offload.h gcc/config/nvptx/offload.h
new file mode 100644
index 000..63f9a02
--- /dev/null
+++ gcc/config/nvptx/offload.h
@@ -0,0 +1,35 @@
+/* Support for Nvidia PTX offloading.
+
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software 

Re: OpenACC middle end changes

2014-12-18 Thread Jakub Jelinek
On Thu, Dec 18, 2014 at 12:07:01PM +0100, Thomas Schwinge wrote:
 Many thanks for the review comments!  Most of them have been addressed;
 here are just a few comments.  If you feel strongly/differently about
 any, I'll address those, too.

So, with your latest change both compilers build:
mkdir objmic; cd objmic
../configure --build=x86_64-intelmicemul-linux-gnu 
--host=x86_64-intelmicemul-linux-gnu --target=x86_64-intelmicemul-linux-gnu 
--enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-bootstrap
make -j16
make DESTDIR=`pwd`/../objinst install
mkdir ../obj; cd ../obj
../configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu 
--target=x86_64-pc-linux-gnu 
--enable-offload-targets=x86_64-intelmicemul-linux-gnu=/usr/src/gomp-4.0/objmic 
--disable-bootstrap
make -j16

But there are issues during make check.
I first did:
make -j16 -k check RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} gomp.exp 
goacc.exp goacc-gomp.exp'
and that shows:
Making a new config file...
echo set tmpdir /usr/src/gomp-4.0/obj/gcc/testsuite  ./site.tmp
rm -rf testsuite/gcc-parallel
rm -rf testsuite/g++-parallel
rm -rf testsuite/gfortran-parallel
rm -rf testsuite/objc-parallel
mkdir: cannot create directory ‘testsuite’: File exists
make[1]: Entering directory '/usr/src/gomp-4.0/obj/gcc'
make[1]: Entering directory '/usr/src/gomp-4.0/obj/gcc'
make[1]: Entering directory '/usr/src/gomp-4.0/obj/gcc'
make[1]: Entering directory '/usr/src/gomp-4.0/obj/gcc'
mkdir: cannot create directory ‘plugin’: File exists
mkdir: cannot create directory ‘plugin’mkdir: : File existscannot create 
directory ‘plugin’
: File exists
mkdir: cannot create directory ‘plugin’: File exists
Makefile:3787: recipe for target 'check-parallel-gcc_1' failed
make[1]: [check-parallel-gcc_1] Error 1 (ignored)
Makefile:3787: recipe for target 'check-parallel-gcc_2' failed
make[1]: [check-parallel-gcc_2] Error 1 (ignored)
Makefile:3787: recipe for target 'check-parallel-gcc_3' failed
make[1]: [check-parallel-gcc_3] Error 1 (ignored)
Makefile:3787: recipe for target 'check-parallel-gcc_4' failed
make[1]: [check-parallel-gcc_4] Error 1 (ignored)

Clearly preexisting problem even on trunk, so not a show stopper for this.

And in libgomp the testing fails completely:

Making check in testsuite
make[1]: Entering directory 
'/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
make  check-DEJAGNU
make[2]: Entering directory 
'/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
Making a new site.exp file...
srcdir=`CDPATH="${ZSH_VERSION+.}:" && cd ../../../../libgomp/testsuite && pwd`; 
export srcdir; \
EXPECT=expect; export EXPECT; \
runtest=runtest ; \
if /bin/sh -c "$runtest --version" > /dev/null 2>&1; then \
  exit_status=0; l='libgomp'; for tool in $l; do \
    if $runtest  --tool $tool --srcdir $srcdir ; \
    then :; else exit_status=1; fi; \
  done; \
else echo "WARNING: could not find \`runtest'" 1>&2; :;\
fi; \
exit $exit_status
WARNING: Couldn't find the global config file.
ERROR: tcl error sourcing libgomp-test-support.exp.
can't read (target_alias): no such variable
while executing
set offload_additional_options  
-B/usr/src/gomp-4.0/objmic/libexec/gcc/$(target_alias)/$(gcc_version) 
-B/usr/src/gomp-4.0/objmic/bin
(file libgomp-test-support.exp line 5)
invoked from within
source libgomp-test-support.exp
(uplevel body line 1)
invoked from within
uplevel #0 source libgomp-test-support.exp
invoked from within
catch uplevel #0 source $file
Makefile:277: recipe for target 'check-DEJAGNU' failed
make[2]: *** [check-DEJAGNU] Error 1
make[2]: Leaving directory 
'/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
Makefile:314: recipe for target 'check-am' failed
make[1]: *** [check-am] Error 2
make[1]: Leaving directory 
'/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
Makefile:856: recipe for target 'check-recursive' failed
make: *** [check-recursive] Error 1

So clearly the *.exp files need to be taught where to look for 
libgomp-test-support.exp.

Jakub


libgomp offloading testing (was: OpenACC middle end changes)

2014-12-18 Thread Thomas Schwinge
Hi Jakub!

On Thu, 18 Dec 2014 15:20:42 +0100, Jakub Jelinek ja...@redhat.com wrote:
 So, with your latest change both compilers build:
 mkdir objmic; cd objmic
 ../configure --build=x86_64-intelmicemul-linux-gnu 
 --host=x86_64-intelmicemul-linux-gnu --target=x86_64-intelmicemul-linux-gnu 
 --enable-as-accelerator-for=x86_64-pc-linux-gnu --disable-bootstrap
 make -j16
 make DESTDIR=`pwd`/../objinst install
 mkdir ../obj; cd ../obj
 ../configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu 
 --target=x86_64-pc-linux-gnu 
 --enable-offload-targets=x86_64-intelmicemul-linux-gnu=/usr/src/gomp-4.0/objmic
  --disable-bootstrap
 make -j16

Thanks; I'll look into reproducing such a build.


 And in libgomp the testing fails completely:

What happens, in my understanding, is:

 Making check in testsuite
 make[1]: Entering directory 
 '/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
 make  check-DEJAGNU
 make[2]: Entering directory 
 '/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
 Making a new site.exp file...
 srcdir=`CDPATH="${ZSH_VERSION+.}:" && cd ../../../../libgomp/testsuite && 
 pwd`; export srcdir; \
 EXPECT=expect; export EXPECT; \
 runtest=runtest ; \
 if /bin/sh -c "$runtest --version" > /dev/null 2>&1; then \
   exit_status=0; l='libgomp'; for tool in $l; do \
     if $runtest  --tool $tool --srcdir $srcdir ; \
     then :; else exit_status=1; fi; \
   done; \
 else echo "WARNING: could not find \`runtest'" 1>&2; :;\
 fi; \
 exit $exit_status
 WARNING: Couldn't find the global config file.
 ERROR: tcl error sourcing libgomp-test-support.exp.
 can't read (target_alias): no such variable

The variable $(target_alias) is not available when in the line:

 while executing
 set offload_additional_options  
 -B/usr/src/gomp-4.0/objmic/libexec/gcc/$(target_alias)/$(gcc_version) 
 -B/usr/src/gomp-4.0/objmic/bin

... in the file:

 (file libgomp-test-support.exp line 5)
 invoked from within
 source libgomp-test-support.exp

... it is being parsed.

Should target_alias and gcc_version be instantiated (AC_SUBST) by
Autoconf already, when creating the libgomp-test-support.exp file from
libgomp/testsuite/libgomp-test-support.exp.in?  Or, should those be
written (in libgomp/plugin/configfrag.ac) in TCL syntax, and evaluated
only once libgomp-test-support.exp is sourced?  target_alias is being
provided in site.exp (as generated by libgomp/testsuite/Makefile), but
gcc_version is not.

 (uplevel body line 1)
 invoked from within
 uplevel #0 source libgomp-test-support.exp
 invoked from within
 catch uplevel #0 source $file
 Makefile:277: recipe for target 'check-DEJAGNU' failed
 make[2]: *** [check-DEJAGNU] Error 1
 make[2]: Leaving directory 
 '/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
 Makefile:314: recipe for target 'check-am' failed
 make[1]: *** [check-am] Error 2
 make[1]: Leaving directory 
 '/usr/src/gomp-4.0/obj/x86_64-pc-linux-gnu/libgomp/testsuite'
 Makefile:856: recipe for target 'check-recursive' failed
 make: *** [check-recursive] Error 1

 So clearly the *.exp files need to be taught where to look for 
 libgomp-test-support.exp.

That's not the problem, if I'm understanding correctly.


Grüße,
 Thomas




Re: OpenACC middle end changes

2014-12-18 Thread Thomas Schwinge
Hi Jakub!

On Thu, 13 Nov 2014 19:09:49 +0100, Jakub Jelinek ja...@redhat.com wrote:
  --- gcc/builtins.c
  +++ gcc/builtins.c

  +static rtx
  +expand_builtin_acc_on_device (tree exp, rtx target ATTRIBUTE_UNUSED)
  +{
  +  if (!validate_arglist (exp, INTEGER_TYPE, VOID_TYPE))
  +return NULL_RTX;
  +
  +  tree arg, v1, v2, ret;
  +  location_t loc;
  +
  +  arg = CALL_EXPR_ARG (exp, 0);
  +  arg = builtin_save_expr (arg);
  +  loc = EXPR_LOCATION (exp);
  +
  +  /* Build: (arg == v1 || arg == v2) ? 1 : 0.  */
  +
  +#ifdef ACCEL_COMPILER
  +  v1 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_not_host */ 3);
  +  v2 = build_int_cst (TREE_TYPE (arg), ACCEL_COMPILER_acc_device);
  +#else
  +  v1 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_none */ 0);
  +  v2 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_host */ 2);
  +#endif
  +
  +  v1 = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, v1);
  +  v2 = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, v2);
  +
  +  /* Can't use TRUTH_ORIF_EXPR, as that is not supported by
  + expand_expr_real*.  */
  +  ret = fold_build3_loc (loc, COND_EXPR, integer_type_node, v1, v1, v2);
  +  ret = fold_build3_loc (loc, COND_EXPR, integer_type_node,
  +ret, integer_one_node, integer_zero_node);
  +
  +  return expand_normal (ret);
 
 If you can't fold it late (which is indeed a problem for -O0),
 then I'd suggest to implement this more RTL-ish.
 So, avoid the builtin_save_expr, instead
   rtx op = expand_normal (arg);
 Don't build v1/v2 as trees (and, please fix the TODOs), but rtxes,
 just
   rtx v1 = GEN_INT (...);
   rtx v2 = GEN_INT (...);
   machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
   rtx ret = gen_reg_rtx (TYPE_MODE (integer_type_node));
   emit_move_insn (ret, const0_rtx);
   rtx_code_label *done_label = gen_label_rtx ();
   emit_cmp_and_jump_insns (op, v1, NE, NULL_RTX, mode,
  false, done_label, PROB_EVEN);
   emit_cmp_and_jump_insns (op, v2, NE, NULL_RTX, mode,
  false, done_label, PROB_EVEN);
   emit_move_insn (ret, const1_rtx);
   emit_label (done_label);
   return ret;
 or similar.

;-) Yes, similar, as I've now found; committed to gomp-4_0-branch in
r218869:

commit cb37a039eb7a7375d074bc092457349312c5a2e2
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Thu Dec 18 16:07:23 2014 +

OpenACC acc_on_device: Fix logic error introduced in an earlier change.

... but which didn't show up in testing until after a libgomp rebuild, 
because
of the caching of the acc_on_device builtin that is being done in
libgomp/oacc-init.c:acc_on_device.

gcc/
* builtins.c (expand_builtin_acc_on_device): Fix logic error.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@218869 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp | 2 ++
 gcc/builtins.c | 8 
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index a744ebf..a21fd92 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,7 @@
 2014-12-18  Thomas Schwinge  tho...@codesourcery.com
 
+   * builtins.c (expand_builtin_acc_on_device): Fix logic error.
+
* config/i386/intelmic-offload.h: New file.
* config/nvptx/offload.h: Likewise.
* config.gcc *-intelmic-*, *-intelmicemul-*, nvptx-*: Point to
diff --git gcc/builtins.c gcc/builtins.c
index 33025a5..6891229 100644
--- gcc/builtins.c
+++ gcc/builtins.c
@@ -5909,13 +5909,13 @@ expand_builtin_acc_on_device (tree exp, rtx target)
   machine_mode target_mode = TYPE_MODE (integer_type_node);
   if (!REG_P (target) || GET_MODE (target) != target_mode)
 target = gen_reg_rtx (target_mode);
-  emit_move_insn (target, const0_rtx);
+  emit_move_insn (target, const1_rtx);
   rtx_code_label *done_label = gen_label_rtx ();
-  do_compare_rtx_and_jump (v, v1, NE, false, v_mode, NULL_RTX,
+  do_compare_rtx_and_jump (v, v1, EQ, false, v_mode, NULL_RTX,
   NULL_RTX, done_label, PROB_EVEN);
-  do_compare_rtx_and_jump (v, v2, NE, false, v_mode, NULL_RTX,
+  do_compare_rtx_and_jump (v, v2, EQ, false, v_mode, NULL_RTX,
   NULL_RTX, done_label, PROB_EVEN);
-  emit_move_insn (target, const1_rtx);
+  emit_move_insn (target, const0_rtx);
   emit_label (done_label);
 
   return target;
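
For readers following the NE-to-EQ change above, here is a minimal C sketch of the two branch structures. It is not GCC code: plain ints and gotos stand in for the RTL compare-and-jump sequence, and both function names are made up for illustration. It assumes that each compare-and-jump branches to done_label when its condition holds, which is how the calls in the diff read.

```c
/* Shape emitted before r218869: start at 0, leave early when v differs
   from either constant, otherwise store 1.  The result is 1 only when
   v equals both v1 and v2, so this computes AND instead of OR.  */
static int
on_device_before_fix (int v, int v1, int v2)
{
  int target = 0;
  if (v != v1)
    goto done;
  if (v != v2)
    goto done;
  target = 1;
done:
  return target;
}

/* Shape emitted after r218869: start at 1, leave early when v matches
   either constant, otherwise store 0.  This is the intended
   (v == v1 || v == v2) ? 1 : 0.  */
static int
on_device_after_fix (int v, int v1, int v2)
{
  int target = 1;
  if (v == v1)
    goto done;
  if (v == v2)
    goto done;
  target = 0;
done:
  return target;
}
```

With the host-side constants 0 and 2 from the original patch, the first variant returns 0 for both of them, while the second returns 1 for either and 0 for anything else.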


Grüße,
 Thomas




Re: OpenACC middle end changes

2014-11-20 Thread Bernd Schmidt

On 11/20/2014 07:52 AM, Jakub Jelinek wrote:

On Thu, Nov 20, 2014 at 03:19:11AM +0100, Bernd Schmidt wrote:

Thomas had apparently already pointed out an issue with the new gomp_target
class (there are multiple similar types of statements we want to handle with
OpenACC, they have different codes but we want to have function pointers
operating on any of them) back in July. That seems to have been ignored. By
necessity, some of David's changes are reverted in the following patch.


I thought the agreement was to use GIMPLE_OMP_TARGET gimple_code and just
two new gimple_omp_target_kind GF_* flags.


If that's the case I'll leave it to Thomas to make these changes. At the 
moment I'm just trying to put together all the pieces into versions that 
apply to trunk and can be made to work together.



Bernd




Re: OpenACC middle end changes

2014-11-19 Thread Bernd Schmidt

On 11/19/2014 02:50 AM, Bernd Schmidt wrote:


@@ -8417,6 +8926,9 @@ expand_omp_target (struct omp_region *region)
/* Add the new function to the offload table.  */
vec_safe_push (offload_funcs, child_fn);

+  /* Add the new function to the offload table.  */
+  vec_safe_push (offload_funcs, child_fn);
+
/* Fix the callgraph edges for child_cfun.  Those for cfun will be
 fixed in a following pass.  */
push_cfun (child_cfun);


This hunk also needs to go away.


Bernd




Re: OpenACC middle end changes

2014-11-19 Thread Bernd Schmidt
Another change that's required is (something like) the following. For 
ptx, we need to know whether to output something as a .func (callable 
from ptx code) or a .kernel (callable from the host). That means we need 
to mark the kernel functions somehow in omp-low.c, and the following 
does that by way of a new attribute (already recognized by the nvptx 
backend).



Bernd

* omp-low.c (create_omp_child_function): Tag entrypoint
functions with a special attribute.

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 42ba317..8408025 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -2228,6 +2228,12 @@ create_omp_child_function (omp_context *ctx, bool 
task_copy)

break;
  }
 }
+  if (cgraph_node::get_create (decl)->offloadable
+      && !lookup_attribute ("omp declare target",
+                            DECL_ATTRIBUTES (current_function_decl)))
+    DECL_ATTRIBUTES (decl)
+      = tree_cons (get_identifier ("omp target entrypoint"),
+                   NULL_TREE, DECL_ATTRIBUTES (decl));

   t = build_decl (DECL_SOURCE_LOCATION (decl),
  RESULT_DECL, NULL_TREE, void_type_node);



Re: OpenACC middle end changes

2014-11-19 Thread Jakub Jelinek
On Wed, Nov 19, 2014 at 08:52:40PM +0100, Bernd Schmidt wrote:
 Another change that's required is (something like) the following. For ptx,
 we need to know whether to output something as a .func (callable from ptx
 code) or a .kernel (callable from the host). That means we need to mark the
 kernel functions somehow in omp-low.c, and the following does that by way of
 a new attribute (already recognized by the nvptx backend).

I think Richard's and Honza's preference in this case is a flag in
cgraph_node instead of an attribute.

   * omp-low.c (create_omp_child_function): Tag entrypoint
 functions with a special attribute.
 
 diff --git a/gcc/omp-low.c b/gcc/omp-low.c
 index 42ba317..8408025 100644
 --- a/gcc/omp-low.c
 +++ b/gcc/omp-low.c
 @@ -2228,6 +2228,12 @@ create_omp_child_function (omp_context *ctx, bool
 task_copy)
 break;
   }
  }
 +  if (cgraph_node::get_create (decl)->offloadable
 +      && !lookup_attribute ("omp declare target",
 +                            DECL_ATTRIBUTES (current_function_decl)))
 +    DECL_ATTRIBUTES (decl)
 +      = tree_cons (get_identifier ("omp target entrypoint"),
 +                   NULL_TREE, DECL_ATTRIBUTES (decl));
 
t = build_decl (DECL_SOURCE_LOCATION (decl),
   RESULT_DECL, NULL_TREE, void_type_node);

Jakub


Re: OpenACC middle end changes

2014-11-19 Thread Jakub Jelinek
On Thu, Nov 20, 2014 at 03:19:11AM +0100, Bernd Schmidt wrote:
 Thomas had apparently already pointed out an issue with the new gomp_target
 class (there are multiple similar types of statements we want to handle with
 OpenACC, they have different codes but we want to have function pointers
 operating on any of them) back in July. That seems to have been ignored. By
 necessity, some of David's changes are reverted in the following patch.

I thought the agreement was to use GIMPLE_OMP_TARGET gimple_code and just
two new gimple_omp_target_kind GF_* flags.

Jakub


Re: OpenACC middle end changes

2014-11-15 Thread Gerald Pfeifer
On Thursday 2014-11-13 17:59, Thomas Schwinge wrote:
 Here is our current set of OpenACC middle end changes.  As discussed
 before, this is not yet all of OpenACC 2.0 -- we shall a) document what
 is working already, and b) continue to work on closing the gap.

As David wrote in a different context, strchrnul is a GNU extension and 
not present at least on AIX and FreeBSD 8 (and possibly 9).
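
A portable replacement is short. The sketch below is only an illustration of what such a fallback could look like; the name fallback_strchrnul is hypothetical, chosen so it cannot clash with a libc declaration, and nothing here is taken from the patch under discussion.

```c
/* Hypothetical fallback for hosts without strchrnul: return a pointer to
   the first occurrence of C in S, or to the terminating NUL if C does
   not occur.  Unlike strchr, the result is never NULL.  */
static char *
fallback_strchrnul (const char *s, int c)
{
  while (*s != '\0' && *s != (char) c)
    s++;
  return (char *) s;
}
```

A caller can then test `*p == '\0'` instead of `p == NULL` to detect "not found", which is the usual reason for preferring strchrnul over strchr in the first place.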

Gerald

PS: Sorry, this mail got stuck in my outbox.


Re: OpenACC middle end changes

2014-11-14 Thread Jakub Jelinek
On Fri, Nov 14, 2014 at 11:28:15AM +0100, Richard Biener wrote:
  This patch is based on the last merge of trunk into gomp-4_0-branch,
  9be82689 (trunk r216846, 2014-10-29), and still includes an old version
  of the offloading patches, as currently present on gomp-4_0-branch.
  We're already working on rebasing onto the set of offloading patches that
  has just been committed to trunk, but I didn't want to have this delay
  any further (it seems, the rebase/merge is not always trivial) the


* ChangeLog snippets still need to be written.
 
 Badly needed - I wonder why you need changes to LTO files at all.

I think he doesn't, but the LTO changes that were committed to trunk by
Intel haven't been integrated yet into the branch AFAIK; at least
I've skipped all those bits I expect to be in already.  See above
Thomas' comment.

Jakub


Re: OpenACC middle end changes

2014-11-13 Thread Jakub Jelinek
On Thu, Nov 13, 2014 at 05:59:11PM +0100, Thomas Schwinge wrote:
   * should gcc/oacc-builtins.def just be merged into
 gcc/omp-builtins.def;

Why not.  The reason why they aren't in gcc/builtins.def is that
the Fortran FE doesn't source those, but OpenACC supports the same
languages as OpenMP.

 --- gcc/builtins.c
 +++ gcc/builtins.c
 @@ -5751,6 +5751,49 @@ expand_stack_save (void)
return ret;
  }
  
 +
 +/* Expand OpenACC acc_on_device.
 +
 +   This has to happen late (that is, not in early folding; expand_builtin_*,
 +   rather than fold_builtin_*), as we have to act differently for host and
 +   acceleration device (ACCEL_COMPILER conditional).  */
 +
 +static rtx
 +expand_builtin_acc_on_device (tree exp, rtx target ATTRIBUTE_UNUSED)
 +{
 +  if (!validate_arglist (exp, INTEGER_TYPE, VOID_TYPE))
 +return NULL_RTX;
 +
 +  tree arg, v1, v2, ret;
 +  location_t loc;
 +
 +  arg = CALL_EXPR_ARG (exp, 0);
 +  arg = builtin_save_expr (arg);
 +  loc = EXPR_LOCATION (exp);
 +
 +  /* Build: (arg == v1 || arg == v2) ? 1 : 0.  */
 +
 +#ifdef ACCEL_COMPILER
 +  v1 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_not_host */ 3);
 +  v2 = build_int_cst (TREE_TYPE (arg), ACCEL_COMPILER_acc_device);
 +#else
 +  v1 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_none */ 0);
 +  v2 = build_int_cst (TREE_TYPE (arg), /* TODO: acc_device_host */ 2);
 +#endif
 +
 +  v1 = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, v1);
 +  v2 = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg, v2);
 +
 +  /* Can't use TRUTH_ORIF_EXPR, as that is not supported by
 + expand_expr_real*.  */
 +  ret = fold_build3_loc (loc, COND_EXPR, integer_type_node, v1, v1, v2);
 +  ret = fold_build3_loc (loc, COND_EXPR, integer_type_node,
 +  ret, integer_one_node, integer_zero_node);
 +
 +  return expand_normal (ret);

If you can't fold it late (which is indeed a problem for -O0),
then I'd suggest to implement this more RTL-ish.
So, avoid the builtin_save_expr, instead
  rtx op = expand_normal (arg);
Don't build v1/v2 as trees (and, please fix the TODOs), but rtxes,
just
  rtx v1 = GEN_INT (...);
  rtx v2 = GEN_INT (...);
  machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
  rtx ret = gen_reg_rtx (TYPE_MODE (integer_type_node));
  emit_move_insn (ret, const0_rtx);
  rtx_code_label *done_label = gen_label_rtx ();
  emit_cmp_and_jump_insns (op, v1, NE, NULL_RTX, mode,
   false, done_label, PROB_EVEN);
  emit_cmp_and_jump_insns (op, v2, NE, NULL_RTX, mode,
   false, done_label, PROB_EVEN);
  emit_move_insn (ret, const1_rtx);
  emit_label (done_label);
  return ret;
or similar.
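
To make the host/accelerator split concrete, here is a small C model of the value the builtin is meant to produce. It is a sketch under assumptions, not GCC code: the function name and the two extra parameters are invented for illustration (in GCC the choice is made at build time by the ACCEL_COMPILER conditional, not at run time), and the numeric codes 0, 2 and 3 are simply the ones written next to the TODO comments in the quoted patch.

```c
/* Device-type codes as they appear in the quoted patch.  */
enum { DEV_NONE = 0, DEV_HOST = 2, DEV_NOT_HOST = 3 };

/* Hypothetical model of the builtin's result.  ACCEL_COMPILER is nonzero
   when modelling the offload compiler; ACCEL_DEVICE stands in for
   ACCEL_COMPILER_acc_device, whose actual value is not assumed here.  */
static int
model_acc_on_device (int arg, int accel_compiler, int accel_device)
{
  int v1 = accel_compiler ? DEV_NOT_HOST : DEV_NONE;
  int v2 = accel_compiler ? accel_device : DEV_HOST;
  return (arg == v1 || arg == v2) ? 1 : 0;
}
```

The same source-level call therefore answers differently depending on which compiler expands it, which is why the expansion cannot happen in early folding.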

Note, it would still be worthwhile to fold the builtin, at least
when optimizing, after IPA.  Dunno if we have some property you can check,
and Richard B. could suggest where it would be most appropriate (if GIMPLE
guarded match.pd entry, or what), gimple_fold, etc.

I bet I should handle omp_is_initial_device (); similarly.

 @@ -1818,7 +1818,7 @@ There are also several varieties of complex statements.
  * Empty Statements::
  * Jumps::
  * Cleanups::
 -* OpenMP::
 +* OpenACC and OpenMP::

I think it might be better just to have separate sections for each, not
put them into the same.  Start with OpenMP, and in OpenACC section put the
OACC specific stuff and say what is shared with OpenMP (clauses, etc.).

 --- gcc/doc/gimple.texi
 +++ gcc/doc/gimple.texi
 @@ -439,6 +439,8 @@ The following table briefly describes the GIMPLE 
 instruction set.
  @item @code{GIMPLE_GOTO} @tab x  @tab x
  @item @code{GIMPLE_LABEL}@tab x  @tab x
  @item @code{GIMPLE_NOP}  @tab x  @tab x
 +@item @code{GIMPLE_OACC_KERNELS} @tab x  @tab x
 +@item @code{GIMPLE_OACC_PARALLEL}@tab x  @tab x
  @item @code{GIMPLE_OMP_ATOMIC_LOAD}  @tab x  @tab x
  @item @code{GIMPLE_OMP_ATOMIC_STORE} @tab x  @tab x
  @item @code{GIMPLE_OMP_CONTINUE} @tab x  @tab x
 @@ -1006,6 +1008,8 @@ Return a deep copy of statement @code{STMT}.
  * @code{GIMPLE_EH_FILTER}::
  * @code{GIMPLE_LABEL}::
  * @code{GIMPLE_NOP}::
 +* @code{GIMPLE_OACC_KERNELS}::
 +* @code{GIMPLE_OACC_PARALLEL}::
  * @code{GIMPLE_OMP_ATOMIC_LOAD}::
  * @code{GIMPLE_OMP_ATOMIC_STORE}::
  * @code{GIMPLE_OMP_CONTINUE}::

This will likely change, right?

 --- gcc/gimple-pretty-print.c
 +++ gcc/gimple-pretty-print.c
 @@ -1136,18 +1136,21 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple 
 gs, int spc, int flags)
      case GF_OMP_FOR_KIND_FOR:
        kind = "";
        break;
 -    case GF_OMP_FOR_KIND_SIMD:
 -      kind = " simd";
 -      break;
 -    case GF_OMP_FOR_KIND_CILKSIMD:
 -      kind = " cilksimd";
 -      break;
      case GF_OMP_FOR_KIND_DISTRIBUTE:
        kind = " distribute";
        break;
      case GF_OMP_FOR_KIND_CILKFOR:
        kind = " _Cilk_for";
  

Re: Re: OpenACC middle end changes

2014-11-13 Thread Cesar Philippidis
I'll try to respond to the reduction stuff. It's been a while since I
started working on it, so I may have lost some state.

On 11/13/2014 10:09 AM, Jakub Jelinek wrote:

 @@ -233,6 +242,90 @@ static tree scan_omp_1_op (tree *, int *, void *);
*handled_ops_p = false; \
break;
  
 +/* Helper function to get the reduction array name */
 +static const char *
 +omp_get_id (tree node)
 
 Be more specific in the function name what it is for?

It returns the name of the array that holds the partial reductions for
the original reduction variable.

 +{
 +  const char *id = IDENTIFIER_POINTER (DECL_NAME (node));
 +  int len = strlen ("omp$") + strlen (id);
 +  char *temp_name = (char *)alloca (len+1);
 +  snprintf (temp_name, len+1, "gfc$%s", id);
 
 gfc$ ?

It's just a semi-random prefix I used to make the partial reduction
array identifier unique to aid with debugging. I was working on the
fortran front end at the time. Maybe s/gfc/oacc/?

 Use
   char *temp_name = XALLOCAVEC (char, len + 1);
 instead?
 
 +  return IDENTIFIER_POINTER(get_identifier (temp_name));
 
 Formatting (missing space before ( ).
 
 @@ -868,6 +981,25 @@ maybe_lookup_field (tree var, omp_context *ctx)
    return n ? (tree) n->value : NULL_TREE;
  }
  
 +static inline tree
 +lookup_reduction (const char *id, omp_context *ctx)
 
 Can't you use oacc_ in the name of OpenACC specific functions?

Sure.

[snip]

 @@ -8834,6 +9492,397 @@ make_pass_expand_omp (gcc::context *ctxt)

  /* Routines to lower OpenMP directives into OMP-GIMPLE.  */
  
 +/* Helper function to preform, potentially COMPLEX_TYPE, operation and
 +   convert it to gimple.  */
 +static void
 +omp_gimple_assign_with_ops (tree_code op, tree dest, tree src, gimple_seq 
 *seq)
 
 Makes me wonder why don't you put the reduction code earlier into reduction
 clause GENERIC and then lower into clauses' GIMPLE seq.
 If there is some reason, please name it oacc at least.

I probably was trying to reuse as much of the existing code as possible.
I've swapped out too much state on this. This can be renamed too.

 +static void
 +initialize_reduction_data (tree clauses, tree nthreads, gimple_seq 
 *stmt_seqp,
 +   omp_context *ctx)
 
 Likewise.
 
 +/* Helper function to process the array of partial reductions.  Nthreads
 +   indicates the number of threads.  Unfortunately, GOACC_GET_NUM_THREADS
 +   cannot be used here, because nthreads on the host may be different than
 +   on the accelerator. */
 +
 +static void
 +finalize_reduction_data (tree clauses, tree nthreads, gimple_seq *stmt_seqp,
 + omp_context *ctx)
 
 Likewise.
 
 +/* Scan through all of the gimple stmts searching for an OMP_FOR_EXPR, and
 +   scan that for reductions.  */
 +
 +static void
 +process_reduction_data (gimple_seq *body, gimple_seq *in_stmt_seqp,
 +gimple_seq *out_stmt_seqp, omp_context *ctx)
 
 Likewise.

Thomas, would you like me to handle the renaming, or will you? I could
make those changes to gomp-4_0-branch if you like.

Cesar
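
For readers following the review, here is a minimal sketch of the "array of partial reductions" idea that the quoted comments describe. It is plain C, not the omp-low.c code. The names store_partial and combine_partials are hypothetical, and the thread count is passed in explicitly, mirroring the remark that the host cannot simply query it from the accelerator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration: worker TID sums DATA[BEGIN, END) and stores
   the result in its own slot of PARTIALS, so no atomics are needed.  */
void
store_partial (double *partials, size_t tid, const double *data,
	       size_t begin, size_t end)
{
  assert (partials != NULL && data != NULL);
  double acc = 0.0;
  for (size_t i = begin; i < end; i++)
    acc += data[i];
  partials[tid] = acc;
}

/* Hypothetical illustration: fold the NTHREADS partial results back into
   the original reduction value INIT.  NTHREADS is supplied by the caller.  */
double
combine_partials (double init, const double *partials, size_t nthreads)
{
  assert (partials != NULL);
  double result = init;
  for (size_t t = 0; t < nthreads; t++)
    result += partials[t];
  return result;
}
```

Only the sum operator is shown; a real lowering would have to pick the combiner and the initial value per reduction operator.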



Re: Re: OpenACC middle end changes

2014-11-13 Thread Jakub Jelinek
On Thu, Nov 13, 2014 at 11:03:47AM -0800, Cesar Philippidis wrote:
  @@ -233,6 +242,90 @@ static tree scan_omp_1_op (tree *, int *, void *);
 *handled_ops_p = false; \
 break;
   
  +/* Helper function to get the reduction array name */
  +static const char *
  +omp_get_id (tree node)
  
  Be more specific in the function name what it is for?
 
 It's the name of the array containing the partial reductions for
 original reduction variable.
 
  +{
  +  const char *id = IDENTIFIER_POINTER (DECL_NAME (node));
  +  int len = strlen ("omp$") + strlen (id);
  +  char *temp_name = (char *)alloca (len+1);
  +  snprintf (temp_name, len+1, "gfc$%s", id);
  
  gfc$ ?
 
 It's just a semi-random prefix I used to make the partial reduction
 array identifier unique to aid with debugging. I was working on the
 fortran front end at the time. Maybe s/gfc/oacc/?

Yeah, something like that (and please use the same string in the strlen
and the snprintf).  If the symbol is emitted into assembly, you need to
check for NO_DOLLARS_IN_LABELS and similar.  Oh, and please use spaces
around +.  And name the function so that it is clear what it is for.

Jakub
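
A minimal sketch of the point about using the same string for the length and for the format, written as standalone C rather than GCC internals. The macro REDUCTION_ARRAY_PREFIX and the function make_reduction_array_name are hypothetical names, and malloc stands in for the alloca/get_identifier pair used in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical prefix, defined once so that the length computation and
   the formatted output cannot drift apart.  It avoids '$', which some
   targets do not accept in labels.  */
#define REDUCTION_ARRAY_PREFIX "OACC"

/* Return a newly allocated "<prefix><id>" string, or NULL if allocation
   fails.  The caller owns the result and must free it.  */
char *
make_reduction_array_name (const char *id)
{
  assert (id != NULL);
  size_t len = strlen (REDUCTION_ARRAY_PREFIX) + strlen (id);
  char *name = malloc (len + 1);
  if (name == NULL)
    return NULL;
  snprintf (name, len + 1, "%s%s", REDUCTION_ARRAY_PREFIX, id);
  return name;
}
```

With the prefix in one place, changing it later cannot leave the buffer one character short, which is what "omp$" versus "gfc$" only avoided by both being four characters long.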


Re: OpenACC middle end changes

2014-11-13 Thread Joseph Myers
On Thu, 13 Nov 2014, Thomas Schwinge wrote:

  gcc/doc/invoke.texi  |   14 

You're adding documentation for -fopenacc, but I don't see any .opt file 
changes in this patch, and I'd expect the option to be added in the same 
patch as its documentation.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: OpenACC middle end changes

2014-11-13 Thread Cesar Philippidis
On 11/13/2014 11:09 AM, Jakub Jelinek wrote:
 On Thu, Nov 13, 2014 at 11:03:47AM -0800, Cesar Philippidis wrote:
 @@ -233,6 +242,90 @@ static tree scan_omp_1_op (tree *, int *, void *);
*handled_ops_p = false; \
break;
  
 +/* Helper function to get the reduction array name */
 +static const char *
 +omp_get_id (tree node)

 Be more specific in the function name what it is for?

 It's the name of the array containing the partial reductions for
 original reduction variable.

 +{
 +  const char *id = IDENTIFIER_POINTER (DECL_NAME (node));
 +  int len = strlen ("omp$") + strlen (id);
 +  char *temp_name = (char *)alloca (len+1);
 +  snprintf (temp_name, len+1, "gfc$%s", id);

 gfc$ ?

 It's just a semi-random prefix I used to make the partial reduction
 array identifier unique to aid with debugging. I was working on the
 fortran front end at the time. Maybe s/gfc/oacc/?
 
 Yeah, something like that (and please use the same string in the strlen
 and the snprintf).  If the symbol is emitted into assembly, you need to
 check for NO_DOLLARS_IN_LABELS and similar.  Oh, and please use spaces
 around +.  And name the function so that it is clear what it is for.

The attached patch cleans up the various reduction functions and their
usages. Thomas, I've applied this to gomp-4_0-branch.

Cesar
2014-11-13  Cesar Philippidis  ce...@codesourcery.com

	gcc/
	* omp-low.c (omp_get_id): Rename to...
	(oacc_get_reduction_array_id): ... this.
	(lookup_reduction): Rename to...
	(lookup_oacc_reduction): ... this.
	(maybe_lookup_reduction): Rename to...
	(maybe_lookup_oacc_reduction): ... this.
	(scan_sharing_clauses): Update calls to renamed fns.
	(lower_reduction_var_helper): Rename to...
	(oacc_lower_reduction_var_helper): ... this.
	(lower_reduction_clauses): Rename to...
	(oacc_lower_reduction_clauses): ... this.
	(omp_gimple_assign_with_ops): Rename to...
	(oacc_gimple_assign_with_ops): ... this.
	(initialize_reduction_data): Rename to...
	(oacc_initialize_reduction_data): ... this.
	(finalize_reduction_data): Rename to...
	(oacc_finalize_reduction_data): ... this.
	(process_reduction_data): Rename to...
	(oacc_process_reduction_data): ... this.
	(lower_omp_target): Update calls to renamed fns.


diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index e511846..da9c5a5 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -242,15 +242,16 @@ static tree scan_omp_1_op (tree *, int *, void *);
   *handled_ops_p = false; \
   break;
 
-/* Helper function to get the reduction array name */
+/* Helper function to get the name of the array containing the partial
+   reductions for OpenACC reductions.  */
 static const char *
-omp_get_id (tree node)
+oacc_get_reduction_array_id (tree node)
 {
   const char *id = IDENTIFIER_POINTER (DECL_NAME (node));
-  int len = strlen ("omp$") + strlen (id);
-  char *temp_name = (char *)alloca (len+1);
-  snprintf (temp_name, len+1, "gfc$%s", id);
-  return IDENTIFIER_POINTER(get_identifier (temp_name));
+  int len = strlen ("OACC") + strlen (id);
+  char *temp_name = XALLOCAVEC (char, len + 1);
+  snprintf (temp_name, len+1, "OACC%s", id);
+  return IDENTIFIER_POINTER (get_identifier (temp_name));
 }
 
 /* Determine the number of threads OpenACC threads used to determine the
@@ -983,7 +984,7 @@ maybe_lookup_field (tree var, omp_context *ctx)
 }
 
 static inline tree
-lookup_reduction (const char *id, omp_context *ctx)
+lookup_oacc_reduction (const char *id, omp_context *ctx)
 {
   gcc_assert (is_gimple_omp_oacc_specifically (ctx->stmt));
 
@@ -993,7 +994,7 @@ lookup_reduction (const char *id, omp_context *ctx)
 }
 
 static inline tree
-maybe_lookup_reduction (tree var, omp_context *ctx)
+maybe_lookup_oacc_reduction (tree var, omp_context *ctx)
 {
   splay_tree_node n = NULL;
   if (ctx->reduction_map)
@@ -1759,14 +1760,15 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
 	  tree var = OMP_CLAUSE_DECL (c);
 	  tree type = get_base_type (var);
 	  tree ptype = build_pointer_type (type);
-	  tree array = create_tmp_var (ptype, omp_get_id (var));
+	  tree array = create_tmp_var (ptype,
+	   oacc_get_reduction_array_id (var));
 	  omp_context *c = (ctx->field_map ? ctx : ctx->outer);
 	  install_var_field (array, true, 3, c);
 	  install_var_local (array, c);
 
 	  /* Insert it into the current context.  */
-	  splay_tree_insert (ctx->reduction_map,
- (splay_tree_key) omp_get_id(var),
+	  splay_tree_insert (ctx->reduction_map, (splay_tree_key)
+ oacc_get_reduction_array_id (var),
  (splay_tree_value) array);
 	  splay_tree_insert (ctx->reduction_map,
  (splay_tree_key) array,
@@ -4419,8 +4421,8 @@ lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *stmt_list,
 }
 
 static void
-lower_reduction_var_helper (gimple_seq *stmt_seqp, omp_context *ctx, tree tid,
-			tree var, tree new_var)
+oacc_lower_reduction_var_helper (gimple_seq *stmt_seqp, omp_context *ctx,
+ tree tid, tree var, tree new_var)
 {
   /* The atomic add at the end