Re: New post-LTO OpenACC pass

2015-09-28 Thread Nathan Sidwell

On 09/25/15 09:19, Bernd Schmidt wrote:

On 09/25/2015 03:03 PM, Bernd Schmidt wrote:

182  else if (acc_device_type (acc_dev->type) == acc_device_host)
(gdb) p acc_dev->type
$1 = OFFLOAD_TARGET_TYPE_HOST
(gdb) next
184  fn (hostaddrs);

It's not running the offloaded version, so the testcase I think should
fail.


... and that's because my system was no longer set up to run CUDA binaries;
after I fixed that, the testcase passes.

So as far as I can tell almost everything here works as expected?


Hm, strange.  I will take another look this week.  Thanks for looking.

nathan



Re: New post-LTO OpenACC pass

2015-09-25 Thread Nathan Sidwell

On 09/25/15 06:28, Bernd Schmidt wrote:



This is the c-c++-common/goacc/acc_on_device-2.c testcase. Is that expected to
be handled? If I change it to use __builtin_acc_on_device, I can step right into

Breakpoint 8, fold_call_stmt (stmt=0x70736e10, ignore=false) at
../../git/gcc/builtins.c:12277
12277  tree ret = NULL_TREE;

Maybe you were compiling without optimization? In that case
expand_builtin_acc_on_device (which already exists) should still end up doing
the right thing. In no case should you see an RTL call to a function; that
indicates that something else went wrong.


I think I was reading more into the std than it intended, as it claims
on_device should evaluate 'to a constant' (no mention of 'when optimizing').
It can't mean 'be usable in an integral-constant-expression', as at the point we
need those, one doesn't know the value it should be.


Thinking about it, I don't think a user can tell.  The case I had in mind (and
have used it for) is something like


on_device (nvidia)  ? asm ("NVIDIA specific asm") : c-expr

and for that to work, one must turn the optimizer on to get the dead code
removal, regardless of where on_device expands.  So my goal of getting it
expanded regardless of optimization level is not needed.  Indeed, getting it
expanded in fold_call_stmt will mean the body of expand_on_device can go away (I
think).


What the programmer really cares about is that, when optimizing, the compiler
knows how to fold it.
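
To illustrate the pattern discussed here, a minimal C sketch follows. It
assumes a hypothetical macro ON_NVIDIA standing in for an already-folded
on_device result, and two made-up helper functions in place of the
NVIDIA-specific asm and the plain C expression; none of these names come
from GCC or OpenACC.

```c
#include <assert.h>

/* Hypothetical stand-in for a folded on_device (nvidia) result:
   0 models the host compiler, 1 would model the nvptx compiler.  */
#ifndef ON_NVIDIA
#define ON_NVIDIA 0
#endif

/* Stand-in for the arm that would hold NVIDIA-specific asm.  */
static int
nvidia_path (int x)
{
  return x << 1;
}

/* Portable fallback, the 'c-expr' arm of the conditional.  */
static int
generic_path (int x)
{
  return x * 2;
}

/* Once the condition is a compile-time constant, an optimizing
   compiler can drop the arm that is never taken.  */
int
twice (int x)
{
  assert (x >= 0 && x <= 1000);  /* keep the shift well defined */
  return ON_NVIDIA ? nvidia_path (x) : generic_path (x);
}
```

Both arms still have to be valid for the compiler that sees them; only the
optimizer's dead code removal makes the unused arm disappear from the
output, which is why the optimization level matters more than where the
fold happens.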



Can you send me the patch you tried (and possibly a testcase you expect to be
handled), I'll see if I can find out what's going on.


Thanks!  When things didn't work, I tried getting it working on the gomp4
branch, as I knew what to expect there.  So the patch is for that branch.


The fails I observed are:

FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/if-1.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none execution test
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/gang-static-2.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0 
execution test
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/gang-static-2.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2 
execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/if-1.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/gang-static-2.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0 
execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/gang-static-2.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2 
execution test



the diff I have is attached -- as you can see it's 'experimental'.

nathan
Index: builtins.c
===
--- builtins.c	(revision 228094)
+++ builtins.c	(working copy)
@@ -5866,6 +5866,8 @@ expand_stack_save (void)
 static rtx
 expand_builtin_acc_on_device (tree exp, rtx target)
 {
+   gcc_unreachable ();
+  
 #ifndef ACCEL_COMPILER
   gcc_assert (!get_oacc_fn_attrib (current_function_decl));
 #endif
@@ -10272,6 +10274,27 @@ fold_builtin_1 (location_t loc, tree fnd
 	return build_empty_stmt (loc);
   break;
 
+case BUILT_IN_ACC_ON_DEVICE:
+  /* Don't fold on_device until we know which compiler is active.  */
+  if (symtab->state == EXPANSION)
+	{
+	  unsigned val_host = GOMP_DEVICE_HOST;
+	  unsigned val_dev = GOMP_DEVICE_NONE;
+
+#ifdef ACCEL_COMPILER
+	  val_host = GOMP_DEVICE_NOT_HOST;
+	  val_dev = ACCEL_COMPILER_acc_device;
+#endif
+	  tree host = build2 (EQ_EXPR, boolean_type_node, arg0,
+			  build_int_cst (integer_type_node, val_host));
+	  tree dev = build2 (EQ_EXPR, boolean_type_node, arg0,
+			 build_int_cst (integer_type_node, val_dev));
+
+	  tree result = build2 (TRUTH_OR_EXPR, boolean_type_node, host, dev);
+	  return fold_convert (integer_type_node, result);
+	}
+  break;
+
 default:
   break;
 }
Index: omp-low.c
===
--- omp-low.c	(revision 228094)
+++ omp-low.c	(working copy)
@@ -14725,21 +14725,20 @@ static void
 oacc_xform_on_device (gcall *call)
 {
   tree arg = gimple_call_arg (call, 0);
-  unsigned val = GOMP_DEVICE_HOST;
-	  
-#ifdef ACCEL_COMPILER
-  val = GOMP_DEVICE_NOT_HOST;
-#endif
-  tree result = build2 (EQ_EXPR, boolean_type_node, arg,
-			build_int_cst (integer_type_node, val));
+  unsigned val_host = GOMP_DEVICE_HOST;
+  unsigned val_dev = GOMP_DEVICE_NONE;
+
 #ifdef ACCEL_COMPILER
-  {
-tree dev  = build2 (EQ_EXPR, boolean_type_node, arg,
-			build_int_cst (integer_type_node,
-   ACCEL_COMPILER_acc_device));
-result = build2 (TRUTH_OR_EXPR, boolean_type_node, result, dev);
-  }
+  val_host = GOMP_DEVICE_NOT_HOST;
+  val_dev = ACCEL_COMPILER_acc_device;
 #endif
+
+  tree host = build2 (EQ_EXPR, boolean_type_node, arg,
+		  

Re: New post-LTO OpenACC pass

2015-09-25 Thread Bernd Schmidt

On 09/25/2015 12:56 PM, Nathan Sidwell wrote:

On 09/25/15 06:28, Bernd Schmidt wrote:

Can you send me the patch you tried (and possibly a testcase you
expect to be
handled), I'll see if I can find out what's going on.


Thanks!  When things didn't work, I tried getting it working on the
gomp4 branch, as I knew what to expect there.  So the patch is for that
branch.

The fails I observed are:

FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/if-1.c
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none
execution test


Ok, I tried to compile this one. When using -O for host cc1 and ptx 
lto1, I see fold_builtin_1 being executed with state == EXPANSION.


In host cc1:

10294 return fold_convert (integer_type_node, result);
(gdb) p result
$16 = 
(gdb) pge
warning: Expression is not an assignment (and might have no effect)
2 == 2 || 2 == 0

In ptx lto1:

(gdb) p result
$1 = 
(gdb) pge
warning: Expression is not an assignment (and might have no effect)
2 == 4 || 2 == 5

I'm not really sure about the logic, but are the results maybe switched 
(returning false on the device and true on the host)?
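
As a way of checking the logic, here is a small stand-alone C model of the
folded expression. The numeric codes are the ones visible in the debugger
output above (2, 0, 4, 5); the enumerator and function names are
illustrative only and are not the GCC identifiers.

```c
#include <assert.h>

/* Device codes as they appear in the debugger output; the names are
   illustrative.  */
enum { DEV_NONE = 0, DEV_HOST = 2, DEV_NOT_HOST = 4, DEV_NVIDIA = 5 };

/* Model of the folded expression 'arg == val_host || arg == val_dev'.
   IS_ACCEL selects the pair of values used by the offload compiler
   (1) or by the host compiler (0).  */
int
model_on_device (int arg, int is_accel)
{
  int val_host = is_accel ? DEV_NOT_HOST : DEV_HOST;
  int val_dev = is_accel ? DEV_NVIDIA : DEV_NONE;

  assert (is_accel == 0 || is_accel == 1);
  return arg == val_host || arg == val_dev;
}
```

With an argument of 2 (the host code), this model yields 1 for the host
compiler and 0 for the offload compiler, matching the two expressions
printed above.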


I think the reason you're seeing calls to acc_on_device when not 
optimizing is this code:


5931  /* When not optimizing, generate calls to library functions for a certain
5932     set of builtins.  */
5933  if (!optimize
5934  && !called_as_built_in (fndecl)
5935  && fcode != BUILT_IN_FORK
[...]

which should probably have the acc_on_device code added to the list.
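
A rough stand-alone model of that guard, as a sketch only: the enum and
function names below are hypothetical, and the real condition in
builtins.c has more entries than shown here. It only illustrates what
adding one more exclusion to the list would do.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical function codes; not the GCC BUILT_IN_* enumerators.  */
enum fcode { FC_FORK, FC_ACC_ON_DEVICE, FC_OTHER };

/* Return true if, in this model, the call would be emitted as a plain
   library call rather than expanded as a builtin.  Codes named in the
   condition are exempt and are still expanded when not optimizing.  */
bool
emits_library_call (bool optimize, bool called_as_builtin, enum fcode fc)
{
  assert (fc <= FC_OTHER);
  return !optimize
	 && !called_as_builtin
	 && fc != FC_FORK
	 && fc != FC_ACC_ON_DEVICE;  /* the proposed extra exclusion */
}
```

In this model, an unoptimized acc_on_device call no longer turns into a
library call, while other builtins not named in the list still do.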


Bernd



Re: New post-LTO OpenACC pass

2015-09-25 Thread Bernd Schmidt

On 09/25/2015 02:30 PM, Bernd Schmidt wrote:


(gdb) p result
$1 = 
(gdb) pge
warning: Expression is not an assignment (and might have no effect)
2 == 4 || 2 == 5

I'm not really sure about the logic, but are the results maybe switched
(returning false on the device and true on the host)?


Eh, no, the testcase seems to want to know if it's running on the host, 
so that appears OK. But AFAICS it's doing the right thing. Stepping into 
libgomp:


182   else if (acc_device_type (acc_dev->type) == acc_device_host)
(gdb) p acc_dev->type
$1 = OFFLOAD_TARGET_TYPE_HOST
(gdb) next
184   fn (hostaddrs);

It's not running the offloaded version, so the testcase I think should fail.


Bernd


Re: New post-LTO OpenACC pass

2015-09-25 Thread Bernd Schmidt

On 09/25/2015 03:03 PM, Bernd Schmidt wrote:

182  else if (acc_device_type (acc_dev->type) == acc_device_host)
(gdb) p acc_dev->type
$1 = OFFLOAD_TARGET_TYPE_HOST
(gdb) next
184  fn (hostaddrs);

It's not running the offloaded version, so the testcase I think should
fail.


... and that's because my system was no longer set up to run CUDA
binaries; after I fixed that, the testcase passes.


So as far as I can tell almost everything here works as expected?


Bernd


Re: New post-LTO OpenACC pass

2015-09-25 Thread Bernd Schmidt

On 09/25/2015 12:38 AM, Nathan Sidwell wrote:

On 09/23/15 14:58, Nathan Sidwell wrote:

On 09/23/15 14:51, Bernd Schmidt wrote:

On 09/23/2015 08:42 PM, Nathan Sidwell wrote:

We have to defer folding until we know whether we're doing host or
device compilation.


Doesn't something like "symtab->state >= EXPANSION" give you that?


I've tried limiting expansion by checking symtab->state.  I have been
unable to succeed.

It either expands too early in the host compiler, or it doesn't get
expanded at all and one ends up with an RTL call to the library
function.  For instance, there doesn't appear to be a call to fold
builtins when state == EXPANSION.  Lesser values are present in the host
compiler before LTO write out, AFAICT.


That's a bit odd:

Breakpoint 5, (anonymous namespace)::pass_fold_builtins::execute 
(this=0x1ce89a0, fun=0x70858348) at ../../git/gcc/tree-ssa-ccp.c:2722

[...]
(gdb) p stmt
$3 = (gimple *) 0x70736d80
(gdb) pgg
warning: Expression is not an assignment (and might have no effect)
# .MEM_2 = VDEF <.MEM_1(D)>
_3 = acc_on_device (123);
(gdb) p symtab->state
$4 = EXPANSION

On the other hand, it's not considered a builtin:

(gdb) p gimple_call_builtin_p(stmt, BUILT_IN_ACC_ON_DEVICE)
$6 = false

This is the c-c++-common/goacc/acc_on_device-2.c testcase. Is that 
expected to be handled? If I change it to use __builtin_acc_on_device, I 
can step right into


Breakpoint 8, fold_call_stmt (stmt=0x70736e10, ignore=false) at 
../../git/gcc/builtins.c:12277

12277 tree ret = NULL_TREE;

Maybe you were compiling without optimization? In that case 
expand_builtin_acc_on_device (which already exists) should still end up 
doing the right thing. In no case should you see an RTL call to a
function; that indicates that something else went wrong.


Can you send me the patch you tried (and possibly a testcase you expect 
to be handled), I'll see if I can find out what's going on.



Bernd


Re: New post-LTO OpenACC pass

2015-09-24 Thread Nathan Sidwell

On 09/23/15 14:58, Nathan Sidwell wrote:

On 09/23/15 14:51, Bernd Schmidt wrote:

On 09/23/2015 08:42 PM, Nathan Sidwell wrote:


As I feared, builtin folding occurs in several places.  In particular
its first call is very early on in the host compiler, which is far too
soon.

We have to defer folding until we know whether we're doing host or
device compilation.


Doesn't something like "symtab->state >= EXPANSION" give you that?


I've tried limiting expansion by checking symtab->state.  I have been unable to 
succeed.


It either expands too early in the host compiler, or it doesn't get expanded at
all and one ends up with an RTL call to the library function.  For instance,
there doesn't appear to be a call to fold builtins when state == EXPANSION.
Lesser values are present in the host compiler before LTO write out, AFAICT.


nathan


Re: New post-LTO OpenACC pass

2015-09-23 Thread Bernd Schmidt

On 09/22/2015 05:16 PM, Nathan Sidwell wrote:

+   if (gimple_call_builtin_p (call, BUILT_IN_ACC_ON_DEVICE))
+ /* acc_on_device must be evaluated at compile time for
+constant arguments.  */
+ {
+   oacc_xform_on_device (call);
+   rescan = true;
+ }


Is there a reason this is not done as part of pass_fold_builtins? (It 
looks like maybe adding this to fold_call_stmt in builtins.c would be 
sufficient too).



Bernd


Re: New post-LTO OpenACC pass

2015-09-23 Thread Nathan Sidwell

On 09/23/15 06:59, Bernd Schmidt wrote:

On 09/22/2015 05:16 PM, Nathan Sidwell wrote:

+if (gimple_call_builtin_p (call, BUILT_IN_ACC_ON_DEVICE))
+  /* acc_on_device must be evaluated at compile time for
+ constant arguments.  */
+  {
+oacc_xform_on_device (call);
+rescan = true;
+  }


Is there a reason this is not done as part of pass_fold_builtins? (It looks like
maybe adding this to fold_call_stmt in builtins.c would be sufficient too).


Perhaps it could be.  I'll need to check where  that pass happens.  Anyway, the 
main thrust of this patch is the new pass, which I thought might be easier to 
review with minimal additional  clutter.


nathan


Re: New post-LTO OpenACC pass

2015-09-23 Thread Bernd Schmidt

On 09/23/2015 02:14 PM, Nathan Sidwell wrote:

On 09/23/15 06:59, Bernd Schmidt wrote:

On 09/22/2015 05:16 PM, Nathan Sidwell wrote:

+if (gimple_call_builtin_p (call, BUILT_IN_ACC_ON_DEVICE))
+  /* acc_on_device must be evaluated at compile time for
+ constant arguments.  */
+  {
+oacc_xform_on_device (call);
+rescan = true;
+  }


Is there a reason this is not done as part of pass_fold_builtins? (It
looks like
maybe adding this to fold_call_stmt in builtins.c would be sufficient
too).


Perhaps it could be.  I'll need to check where  that pass happens.
Anyway, the main thrust of this patch is the new pass, which I thought
might be easier to review with minimal additional  clutter.


There's no issue adding a new pass if there's a demonstrated need for 
it, but I think builtin folding doesn't quite meet that criterion given 
that we already have a pass that does that. Unless you really need it to 
happen very early in the pipeline - fold_builtins runs pretty late, but 
I checked and fold_call_stmt gets called from pass_forwprop and possibly 
from elsewhere too.



Bernd


Re: New post-LTO OpenACC pass

2015-09-23 Thread Nathan Sidwell

On 09/23/15 08:58, Bernd Schmidt wrote:

On 09/23/2015 02:14 PM, Nathan Sidwell wrote:

On 09/23/15 06:59, Bernd Schmidt wrote:

On 09/22/2015 05:16 PM, Nathan Sidwell wrote:

+if (gimple_call_builtin_p (call, BUILT_IN_ACC_ON_DEVICE))
+  /* acc_on_device must be evaluated at compile time for
+ constant arguments.  */
+  {
+oacc_xform_on_device (call);
+rescan = true;
+  }


Is there a reason this is not done as part of pass_fold_builtins? (It
looks like
maybe adding this to fold_call_stmt in builtins.c would be sufficient
too).



As I feared, builtin folding occurs in several places.  In particular its first 
call is very early on in the host compiler, which is far too soon.


We have to defer folding until we know whether we're doing host or device 
compilation.


nathan


Re: New post-LTO OpenACC pass

2015-09-23 Thread Bernd Schmidt

On 09/23/2015 08:42 PM, Nathan Sidwell wrote:


As I feared, builtin folding occurs in several places.  In particular
its first call is very early on in the host compiler, which is far too
soon.

We have to defer folding until we know whether we're doing host or
device compilation.


Doesn't something like "symtab->state >= EXPANSION" give you that?
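
To make the suggestion concrete, here is a stand-alone C sketch of a fold
that is gated on the compilation stage. The stage enum is a simplified,
hypothetical ordering (only the position of the expansion stage relative
to the earlier ones matters), and NOT_FOLDED is an invented sentinel; this
is not the symtab code itself.

```c
#include <assert.h>

/* Simplified, hypothetical ordering of compilation stages.  */
enum stage { STAGE_PARSING, STAGE_IPA, STAGE_EXPANSION, STAGE_FINISHED };

/* Invented sentinel meaning "leave the call alone for now".  */
#define NOT_FOLDED (-1)

/* Return the folded value of 'arg == host_code', or NOT_FOLDED when it
   is still too early to know which compiler is active.  */
int
fold_when_ready (enum stage state, int arg, int host_code)
{
  assert (host_code >= 0);
  if (state < STAGE_EXPANSION)
    return NOT_FOLDED;
  return arg == host_code;
}
```

The trade-off raised in the thread still applies: a fold written this way
gives different answers depending on when it is called.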


Bernd


Re: New post-LTO OpenACC pass

2015-09-23 Thread Nathan Sidwell

On 09/23/15 14:51, Bernd Schmidt wrote:

On 09/23/2015 08:42 PM, Nathan Sidwell wrote:


As I feared, builtin folding occurs in several places.  In particular
its first call is very early on in the host compiler, which is far too
soon.

We have to defer folding until we know whether we're doing host or
device compilation.


Doesn't something like "symtab->state >= EXPANSION" give you that?


I don't know.   It doesn't seem to me to be a good idea for the builtin 
expanders to be context-sensitive.


nathan


Re: New post-LTO OpenACC pass

2015-09-22 Thread Nathan Sidwell

On 09/21/15 16:39, Nathan Sidwell wrote:

On 09/21/15 16:30, Cesar Philippidis wrote:

On 09/21/2015 09:30 AM, Nathan Sidwell wrote:


+const pass_data pass_data_oacc_transform =
+{
+  GIMPLE_PASS, /* type */
+  "fold_oacc_transform", /* name */


Want to rename the tree dump file to oacc_xforms like I did in the
attached patch? Regardless, I think we need to document this flag in
invoke.texi.


Thanks for noticing the missing doc.  I'm not attached to any particular name.
'fold_oacc_transform' is rather generic, and a bit of a mouthful.  Perhaps
'oacclower', 'oaccdevlower' or something (I see there's 'lateomplower' for
guidance).


This updated patch includes Cesar's doc patch.  It also changes the name of the
pass to 'oaccdevlow'.


nathan
2015-09-22  Nathan Sidwell  
	Cesar Philippidis  

	* omp-low.h (get_oacc_fn_attrib): Declare.
	* omp-low.c (get_oacc_fn_attrib): New.
	(oacc_xform_on_device): New.
	(execute_oacc_device_lower): New pass.
	(pass_data_oacc_device_lower): New.
	(pass_oacc_device_lower): New.
	(make_pass_oacc_device_lower): New.
	* tree-pass.h (make_pass_oacc_device_lower): Declare.
	* passes.def: Add pass_oacc_device_lower.
	* doc/invoke.texi: Document -fdump-tree-oaccdevlow.

Index: tree-pass.h
===
--- tree-pass.h	(revision 227968)
+++ tree-pass.h	(working copy)
@@ -406,6 +406,7 @@ extern gimple_opt_pass *make_pass_lower_
 extern gimple_opt_pass *make_pass_diagnose_omp_blocks (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_expand_omp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_expand_omp_ssa (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_oacc_device_lower (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_object_sizes (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_strlen (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_fold_builtins (gcc::context *ctxt);
Index: passes.def
===
--- passes.def	(revision 227968)
+++ passes.def	(working copy)
@@ -148,6 +148,7 @@ along with GCC; see the file COPYING3.
   INSERT_PASSES_AFTER (all_passes)
   NEXT_PASS (pass_fixup_cfg);
   NEXT_PASS (pass_lower_eh_dispatch);
+  NEXT_PASS (pass_oacc_device_lower);
   NEXT_PASS (pass_all_optimizations);
   PUSH_INSERT_PASSES_WITHIN (pass_all_optimizations)
   NEXT_PASS (pass_remove_cgraph_callee_edges);
Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 227968)
+++ doc/invoke.texi	(working copy)
@@ -7179,6 +7179,11 @@ is made by appending @file{.slp} to the
 Dump each function after Value Range Propagation (VRP).  The file name
 is made by appending @file{.vrp} to the source file name.
 
+@item oaccdevlow
+@opindex fdump-tree-oaccdevlow
+Dump each function after applying device-specific OpenACC transformations.
+The file name is made by appending @file{.oaccdevlow} to the source file name.
+
 @item all
 @opindex fdump-tree-all
 Enable all the available tree dumps with the flags provided in this option.
Index: omp-low.c
===
--- omp-low.c	(revision 227968)
+++ omp-low.c	(working copy)
@@ -8860,6 +8860,16 @@ expand_omp_atomic (struct omp_region *re
   expand_omp_atomic_mutex (load_bb, store_bb, addr, loaded_val, stored_val);
 }
 
+#define OACC_FN_ATTRIB "oacc function"
+
+/* Retrieve the oacc function attrib and return it.  Non-oacc
+   functions will return NULL.  */
+
+tree
+get_oacc_fn_attrib (tree fn)
+{
+  return lookup_attribute (OACC_FN_ATTRIB, DECL_ATTRIBUTES (fn));
+}
 
 /* Expand the GIMPLE_OMP_TARGET starting at REGION.  */
 
@@ -13909,4 +13919,131 @@ omp_finish_file (void)
 }
 }
 
+/* Transform an acc_on_device call.  OpenACC 2.0a requires this folded at
+   compile time for constant operands.  We always fold it.  In an
+   offloaded function we're never 'none'.  */
+
+static void
+oacc_xform_on_device (gimple *call)
+{
+  tree arg = gimple_call_arg (call, 0);
+  unsigned val = GOMP_DEVICE_HOST;
+	  
+#ifdef ACCEL_COMPILER
+  val = GOMP_DEVICE_NOT_HOST;
+#endif
+  tree result = build2 (EQ_EXPR, boolean_type_node, arg,
+			build_int_cst (integer_type_node, val));
+#ifdef ACCEL_COMPILER
+  {
+tree dev  = build2 (EQ_EXPR, boolean_type_node, arg,
+			build_int_cst (integer_type_node,
+   ACCEL_COMPILER_acc_device));
+result = build2 (TRUTH_OR_EXPR, boolean_type_node, result, dev);
+  }
+#endif
+  result = fold_convert (integer_type_node, result);
+  tree lhs = gimple_call_lhs (call);
+  gimple_seq seq = NULL;
+
+  push_gimplify_context (true);
+  gimplify_assign (lhs, result, &seq);
+  pop_gimplify_context (NULL);
+
+  gimple_stmt_iterator gsi = gsi_for_stmt (call);
+  gsi_replace_with_seq (&gsi, seq, false);
+}
+
+/* Main entry point for oacc transformations which run on the device
+   compiler after LTO, so we know what the 

Re: New post-LTO OpenACC pass

2015-09-21 Thread Cesar Philippidis
On 09/21/2015 09:30 AM, Nathan Sidwell wrote:

> +const pass_data pass_data_oacc_transform =
> +{
> +  GIMPLE_PASS, /* type */
> +  "fold_oacc_transform", /* name */

Want to rename the tree dump file to oacc_xforms like I did in the
attached patch? Regardless, I think we need to document this flag in
invoke.texi.

> +  OPTGROUP_NONE, /* optinfo_flags */
> +  TV_NONE, /* tv_id */
> +  PROP_cfg, /* properties_required */
> +  0 /* Possibly PROP_gimple_eomp.  */, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  TODO_update_ssa | TODO_cleanup_cfg, /* todo_flags_finish */
> +};

Cesar
2015-09-21  Cesar Philippidis  

	gcc/
	* doc/invoke.texi: Document -fdump-tree-oacc_xforms.
	* omp-low.c (pass_data_oacc_transform): Rename the tree dump for
	oacc_transform as oacc_xforms.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 92f82d7..7406941 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -7158,6 +7158,11 @@ is made by appending @file{.slp} to the source file name.
 Dump each function after Value Range Propagation (VRP).  The file name
 is made by appending @file{.vrp} to the source file name.
 
+@item oacc_xforms
+@opindex fdump-tree-oacc_xforms
+Dump each function after applying target-specific OpenACC transformations.
+The file name is made by appending @file{.oacc_xforms} to the source file name.
+
 @item all
 @opindex fdump-tree-all
 Enable all the available tree dumps with the flags provided in this option.
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index e3dc160..f31e6cd 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -15086,7 +15086,7 @@ namespace {
 const pass_data pass_data_oacc_transform =
 {
   GIMPLE_PASS, /* type */
-  "fold_oacc_transform", /* name */
+  "oacc_xforms", /* name */
   OPTGROUP_NONE, /* optinfo_flags */
   TV_NONE, /* tv_id */
   PROP_cfg, /* properties_required */


Re: New post-LTO OpenACC pass

2015-09-21 Thread Nathan Sidwell

On 09/21/15 16:30, Cesar Philippidis wrote:

On 09/21/2015 09:30 AM, Nathan Sidwell wrote:


+const pass_data pass_data_oacc_transform =
+{
+  GIMPLE_PASS, /* type */
+  "fold_oacc_transform", /* name */


Want to rename the tree dump file to oacc_xforms like I did in the
attached patch? Regardless, I think we need to document this flag in
invoke.texi.


Thanks for noticing the missing doc.  I'm not attached to any particular name.
'fold_oacc_transform' is rather generic, and a bit of a mouthful.  Perhaps
'oacclower', 'oaccdevlower' or something (I see there's 'lateomplower' for
guidance).


nathan