[gomp4] Use pass_ch instead of pass_ch_oacc_kernels (was: [PATCH, 8/16] Add pass_ch_oacc_kernels)

2015-11-30 Thread Thomas Schwinge
Hi!

On Wed, 11 Nov 2015 21:29:10 +0100, Tom de Vries wrote:
> On 09/11/15 19:33, Tom de Vries wrote:
> > On 09/11/15 16:35, Tom de Vries wrote:
> > this patch adds a pass pass_ch_oacc_kernels, which is like pass_ch, but
> > only runs for loops with oacc_kernels_region set.
> >
> > [ But... thinking about it a bit more, I think that we could use a
> > regular pass_ch instead. We only use the kernels pass group for a single
> > loop nest in a kernels region, and we mark all the loops in the loop
> > nest with oacc_kernels_region. So I think that the oacc_kernels_region
> > test in pass_ch_oacc_kernels::process_loop_p evaluates to true. ]
> >
> > So, I'll try to confirm with retesting that we can drop this patch.
> >
> 
> That's confirmed. I can use pass_ch instead of pass_ch_oacc_kernels, so 
> I'm dropping this patch from the series.

Committed to gomp-4_0-branch in r231067:

commit 8249e606d83025092e3b0b227360f7e38fe591d4
Author: tschwinge 
Date:   Mon Nov 30 12:05:50 2015 +

Use pass_ch instead of pass_ch_oacc_kernels

gcc/
* passes.def: Use pass_ch instead of pass_ch_oacc_kernels.
* tree-pass.h (make_pass_ch_oacc_kernels): Remove.
* tree-ssa-loop-ch.c: Revert to trunk r230907 version.
gcc/testsuite/
* gcc.dg/tree-ssa/copy-headers.c: Update for new pass_ch.
* gcc.dg/tree-ssa/foldconst-2.c: Likewise.
* gcc.dg/tree-ssa/loop-40.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@231067 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp   |6 +++
 gcc/passes.def   |2 +-
 gcc/testsuite/ChangeLog.gomp |6 +++
 gcc/testsuite/gcc.dg/tree-ssa/copy-headers.c |4 +-
 gcc/testsuite/gcc.dg/tree-ssa/foldconst-2.c  |4 +-
 gcc/testsuite/gcc.dg/tree-ssa/loop-40.c  |4 +-
 gcc/tree-pass.h  |1 -
 gcc/tree-ssa-loop-ch.c   |   60 +++---
 8 files changed, 24 insertions(+), 63 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 54712ab..2c8f0c2 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,9 @@
+2015-11-30  Thomas Schwinge  
+
+   * passes.def: Use pass_ch instead of pass_ch_oacc_kernels.
+   * tree-pass.h (make_pass_ch_oacc_kernels): Remove.
+   * tree-ssa-loop-ch.c: Revert to trunk r230907 version.
+
 2015-11-18  Nathan Sidwell  
 
* config/nvptx/nvptx.c: Remove unneeded #includes. Backport
diff --git gcc/passes.def gcc/passes.def
index e44bfac..f4eb235 100644
--- gcc/passes.def
+++ gcc/passes.def
@@ -93,7 +93,7 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_oacc_kernels);
  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
  NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
- NEXT_PASS (pass_ch_oacc_kernels);
+ NEXT_PASS (pass_ch);
  NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
  NEXT_PASS (pass_tree_loop_init);
  NEXT_PASS (pass_lim);
diff --git gcc/testsuite/ChangeLog.gomp gcc/testsuite/ChangeLog.gomp
index dd3b1f5..59733bd 100644
--- gcc/testsuite/ChangeLog.gomp
+++ gcc/testsuite/ChangeLog.gomp
@@ -1,3 +1,9 @@
+2015-11-30  Thomas Schwinge  
+
+   * gcc.dg/tree-ssa/copy-headers.c: Update for new pass_ch.
+   * gcc.dg/tree-ssa/foldconst-2.c: Likewise.
+   * gcc.dg/tree-ssa/loop-40.c: Likewise.
+
 2015-11-19  Cesar Philippidis  
 
* gfortran.dg/goacc/routine-6.f90: Ensure that the device clause is
diff --git gcc/testsuite/gcc.dg/tree-ssa/copy-headers.c gcc/testsuite/gcc.dg/tree-ssa/copy-headers.c
index 4241b40..a5a8212 100644
--- gcc/testsuite/gcc.dg/tree-ssa/copy-headers.c
+++ gcc/testsuite/gcc.dg/tree-ssa/copy-headers.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fdump-tree-ch-details" } */
+/* { dg-options "-O2 -fdump-tree-ch2-details" } */
 
 extern int foo (int);
 
@@ -12,4 +12,4 @@ void bla (void)
 }
 
 /* There should be a header duplicated.  */
-/* { dg-final { scan-tree-dump-times "Duplicating header" 1 "ch"} } */
+/* { dg-final { scan-tree-dump-times "Duplicating header" 1 "ch2"} } */
diff --git gcc/testsuite/gcc.dg/tree-ssa/foldconst-2.c gcc/testsuite/gcc.dg/tree-ssa/foldconst-2.c
index eb1e6de..e9a6f87 100644
--- gcc/testsuite/gcc.dg/tree-ssa/foldconst-2.c
+++ gcc/testsuite/gcc.dg/tree-ssa/foldconst-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-ch" } */
+/* { dg-options "-O2 -fdump-tree-ch2" } */
 typedef union tree_node *tree;
 enum tree_code
 {
@@ -56,4 +56,4 @@ emit_support_tinfos (void)
 }
 /* We should copy loop header to fundamentals[0] and then fold it way into
    known value.  */
-/* { dg-final { scan-tree-dump-not "fundamentals.0" "ch"} } */
+/* { dg-final { scan-tree-dump-not "fundamentals.0" "ch2"} } */
diff --git gcc/testsuite/gcc.dg/tree-ssa/loo

Re: [PATCH, 8/16] Add pass_ch_oacc_kernels

2015-11-11 Thread Tom de Vries

On 09/11/15 19:33, Tom de Vries wrote:

> On 09/11/15 16:35, Tom de Vries wrote:
> > Hi,
> >
> > this patch series for stage1 trunk adds support to:
> > - parallelize oacc kernels regions using parloops, and
> > - map the loops onto the oacc gang dimension.
> >
> > The patch series contains these patches:
> >
> >   1  Insert new exit block only when needed in
> >      transform_to_exit_first_loop_alt
> >   2  Make create_parallel_loop return void
> >   3  Ignore reduction clause on kernels directive
> >   4  Implement -foffload-alias
> >   5  Add in_oacc_kernels_region in struct loop
> >   6  Add pass_oacc_kernels
> >   7  Add pass_dominator_oacc_kernels
> >   8  Add pass_ch_oacc_kernels
> >   9  Add pass_parallelize_loops_oacc_kernels
> >  10  Add pass_oacc_kernels pass group in passes.def
> >  11  Update testcases after adding kernels pass group
> >  12  Handle acc loop directive
> >  13  Add c-c++-common/goacc/kernels-*.c
> >  14  Add gfortran.dg/goacc/kernels-*.f95
> >  15  Add libgomp.oacc-c-c++-common/kernels-*.c
> >  16  Add libgomp.oacc-fortran/kernels-*.f95
> >
> > The first 9 patches are more or less independent, but patches 10-16 are
> > intended to be committed at the same time.
> >
> > Bootstrapped and reg-tested on x86_64.
> >
> > Built and reg-tested with nvidia accelerator, in combination with a
> > patch that enables accelerator testing (which is submitted at
> > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
> >
> > I'll post the individual patches in reply to this message.
>
> this patch adds a pass pass_ch_oacc_kernels, which is like pass_ch, but
> only runs for loops with oacc_kernels_region set.
>
> [ But... thinking about it a bit more, I think that we could use a
> regular pass_ch instead. We only use the kernels pass group for a single
> loop nest in a kernels region, and we mark all the loops in the loop
> nest with oacc_kernels_region. So I think that the oacc_kernels_region
> test in pass_ch_oacc_kernels::process_loop_p evaluates to true. ]
>
> So, I'll try to confirm with retesting that we can drop this patch.



That's confirmed. I can use pass_ch instead of pass_ch_oacc_kernels, so 
I'm dropping this patch from the series.


Thanks,
- Tom



[PATCH, 8/16] Add pass_ch_oacc_kernels

2015-11-09 Thread Tom de Vries

On 09/11/15 16:35, Tom de Vries wrote:

> Hi,
>
> this patch series for stage1 trunk adds support to:
> - parallelize oacc kernels regions using parloops, and
> - map the loops onto the oacc gang dimension.
>
> The patch series contains these patches:
>
>   1  Insert new exit block only when needed in
>      transform_to_exit_first_loop_alt
>   2  Make create_parallel_loop return void
>   3  Ignore reduction clause on kernels directive
>   4  Implement -foffload-alias
>   5  Add in_oacc_kernels_region in struct loop
>   6  Add pass_oacc_kernels
>   7  Add pass_dominator_oacc_kernels
>   8  Add pass_ch_oacc_kernels
>   9  Add pass_parallelize_loops_oacc_kernels
>  10  Add pass_oacc_kernels pass group in passes.def
>  11  Update testcases after adding kernels pass group
>  12  Handle acc loop directive
>  13  Add c-c++-common/goacc/kernels-*.c
>  14  Add gfortran.dg/goacc/kernels-*.f95
>  15  Add libgomp.oacc-c-c++-common/kernels-*.c
>  16  Add libgomp.oacc-fortran/kernels-*.f95
>
> The first 9 patches are more or less independent, but patches 10-16 are
> intended to be committed at the same time.
>
> Bootstrapped and reg-tested on x86_64.
>
> Built and reg-tested with nvidia accelerator, in combination with a
> patch that enables accelerator testing (which is submitted at
> https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
>
> I'll post the individual patches in reply to this message.


this patch adds a pass pass_ch_oacc_kernels, which is like pass_ch, but 
only runs for loops with oacc_kernels_region set.


[ But... thinking about it a bit more, I think that we could use a 
regular pass_ch instead. We only use the kernels pass group for a single 
loop nest in a kernels region, and we mark all the loops in the loop 
nest with oacc_kernels_region. So I think that the oacc_kernels_region 
test in pass_ch_oacc_kernels::process_loop_p evaluates to true. ]


So, I'll try to confirm with retesting that we can drop this patch.
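
To make the reasoning in the bracketed note concrete, here is a small
standalone sketch. It is not GCC code: Loop, HeaderCopyPass,
KernelsHeaderCopyPass and make_marked_nest are hypothetical stand-ins for
struct loop, pass_ch, pass_ch_oacc_kernels and the marked loop nest, and the
flag name is only borrowed from patch 5 of the series. It assumes, as the
note does, that every loop the pass sees is marked.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for GCC's struct loop; only the flag matters here.
struct Loop
{
  int num;
  bool in_oacc_kernels_region;
};

// Stand-in for pass_ch: considers every loop it is offered.
class HeaderCopyPass
{
public:
  virtual ~HeaderCopyPass () = default;

  // Returns the numbers of the loops this pass would transform.
  std::vector<int> run (const std::vector<Loop> &loops) const
  {
    std::vector<int> processed;
    for (const Loop &l : loops)
      if (process_loop_p (l))
        processed.push_back (l.num);
    return processed;
  }

protected:
  // Per-loop filter; the base pass accepts everything.
  virtual bool process_loop_p (const Loop &) const { return true; }
};

// Stand-in for pass_ch_oacc_kernels: same pass, restricted to marked loops.
class KernelsHeaderCopyPass : public HeaderCopyPass
{
protected:
  bool process_loop_p (const Loop &l) const override
  {
    return l.in_oacc_kernels_region;
  }
};

// Builds a loop nest of DEPTH loops, all marked, mirroring the assumption
// that the kernels pass group only ever sees marked loops.
std::vector<Loop> make_marked_nest (int depth)
{
  assert (depth >= 0);
  std::vector<Loop> loops;
  for (int i = 1; i <= depth; ++i)
    loops.push_back (Loop{i, true});
  return loops;
}
```

With a nest from make_marked_nest, both classes return the same list from
run, so the extra filter adds nothing. The two only diverge once an unmarked
loop is present, which by the argument above should not happen inside the
kernels pass group.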

Thanks,
- Tom

Add pass_ch_oacc_kernels

2015-11-09  Tom de Vries  

	* tree-pass.h (make_pass_ch_oacc_kernels): Declare.
	* tree-ssa-loop-ch.c (pass_ch::pass_ch (pass_data, gcc::context)): New
	constructor.
	(pass_data_ch_oacc_kernels): New pass_data.
	(class pass_ch_oacc_kernels): New pass.
	(pass_ch_oacc_kernels::process_loop_p): New function.
	(make_pass_ch_oacc_kernels): New function.
---
 gcc/tree-pass.h|  1 +
 gcc/tree-ssa-loop-ch.c | 54 +-
 2 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 2825aea..f95a820 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -389,6 +389,7 @@ extern gimple_opt_pass *make_pass_iv_optimize (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tree_loop_done (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ch (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ch_vect (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_ch_oacc_kernels (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ccp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_phi_only_cprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_build_ssa (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index 7e618bf..8bf47fe 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-inline.h"
 #include "tree-ssa-scopedtables.h"
 #include "tree-ssa-threadedge.h"
+#include "omp-low.h"
 
 /* Duplicates headers of loops if they are small enough, so that the statements
in the loop body are always executed when the loop is entered.  This
@@ -124,7 +125,7 @@ do_while_loop_p (struct loop *loop)
 
 namespace {
 
-/* Common superclass for both header-copying phases.  */
+/* Common superclass for header-copying phases.  */
 class ch_base : public gimple_opt_pass
 {
   protected:
@@ -159,6 +160,10 @@ public:
 : ch_base (pass_data_ch, ctxt)
   {}
 
+  pass_ch (pass_data data, gcc::context *ctxt)
+: ch_base (data, ctxt)
+  {}
+
   /* opt_pass methods: */
   virtual bool gate (function *) { return flag_tree_ch != 0; }
   
@@ -414,3 +419,50 @@ make_pass_ch (gcc::context *ctxt)
 {
   return new pass_ch (ctxt);
 }
+
+namespace {
+
+const pass_data pass_data_ch_oacc_kernels =
+{
+  GIMPLE_PASS, /* type */
+  "ch_oacc_kernels", /* name */
+  OPTGROUP_LOOP, /* optinfo_flags */
+  TV_TREE_CH, /* tv_id */
+  ( PROP_cfg | PROP_ssa ), /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  TODO_cleanup_cfg, /* todo_flags_finish */
+};
+
+class pass_ch_oacc_kernels : public pass_ch
+{
+public:
+  pass_ch_oacc_kernels (gcc::context *ctxt)
+: pass_ch (pass_data_ch_oacc_kernels, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *) { return true; }
+
+protected:
+  /* ch_base method: */
+  virtual bool pro