Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt

2015-06-09 Thread Tom de Vries

On 09/06/15 00:05, Tom de Vries wrote:

On 08/06/15 17:55, Thomas Schwinge wrote:

Hi Tom!

On Mon, 8 Jun 2015 12:43:01 +0200, Tom de Vries
tom_devr...@mentor.com wrote:

There are two problems in try_transform_to_exit_first_loop_alt:
1. In case the latch is not a singleton bb, the function should return
 false rather than true.
2. The check for singleton bb should ignore debug-insns.

Attached patch fixes these problems.



Fix try_transform_to_exit_first_loop_alt



PR tree-optimization/66442
* gimple-iterator.h (gimple_seq_nondebug_singleton_p): Add function.
* tree-parloops.c (try_transform_to_exit_first_loop_alt): Return
false
if the loop latch is not a singleton.  Use
gimple_seq_nondebug_singleton_p instead of gimple_seq_singleton_p.


Per my testing, the backport of this patch that you committed to
gomp-4_0-branch, r224219, introduces a number of regressions in your
OpenACC kernels test cases, specifically the »scan-tree-dump-times
parloops_oacc_kernels (?n)pragma omp target
oacc_parallel.*num_gangs\\(32\\) 1« tests.  Would you please have a
look?




Hi Thomas,

I seem to have committed (to both trunk and gomp-4_0-branch) an older
version of the patch, which contained an incorrect version of
gimple_seq_nondebug_singleton_p.

I'll correct the mistake tomorrow morning.


Committed attached patch to trunk and propagated to gomp-4_0-branch.

Committed as obvious, since it changes gimple_seq_nondebug_singleton_p 
into the tested and approved version.


Thanks,
- Tom

Fix gimple_seq_nondebug_singleton_p

2015-06-09  Tom de Vries  t...@codesourcery.com

	* gimple-iterator.h (gimple_seq_nondebug_singleton_p): Don't
	always return false.
---
 gcc/gimple-iterator.h | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h
index d08245e..76fa456 100644
--- a/gcc/gimple-iterator.h
+++ b/gcc/gimple-iterator.h
@@ -351,33 +351,27 @@ static inline bool
 gimple_seq_nondebug_singleton_p (gimple_seq seq)
 {
   gimple_stmt_iterator gsi;
+
+  /* Find a nondebug gimple.  */
   gsi.ptr = gimple_seq_first (seq);
   gsi.seq = seq;
   gsi.bb = NULL;
-
-  /* Not a singleton if the sequence is empty.  */
-  if (gsi_end_p (gsi))
-return false;
-
-  /* Find a nondebug gimple.  */
   while (!gsi_end_p (gsi)
 	  is_gimple_debug (gsi_stmt (gsi)))
 gsi_next (gsi);
 
-  /* Not a nondebug singleton if there's no nondebug gimple.  */
-  if (is_gimple_debug (gsi_stmt (gsi)))
+  /* No nondebug gimple found, not a singleton.  */
+  if (gsi_end_p (gsi))
 return false;
 
-  /* Find the next nondebug gimple.  */
+  /* Find a next nondebug gimple.  */
+  gsi_next (gsi);
   while (!gsi_end_p (gsi)
 	  is_gimple_debug (gsi_stmt (gsi)))
 gsi_next (gsi);
 
-  /* If there's a next nondebug gimple, it's not a nondebug singleton.  */
-  if (!gsi_end_p (gsi))
-return false;
-
-  return true;
+  /* Only a singleton if there's no next nondebug gimple.  */
+  return gsi_end_p (gsi);
 }
 
 #endif /* GCC_GIMPLE_ITERATOR_H */
-- 
1.9.1



Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt

2015-06-08 Thread Tom de Vries

On 08/06/15 17:55, Thomas Schwinge wrote:

Hi Tom!

On Mon, 8 Jun 2015 12:43:01 +0200, Tom de Vries tom_devr...@mentor.com wrote:

There are two problems in try_transform_to_exit_first_loop_alt:
1. In case the latch is not a singleton bb, the function should return
 false rather than true.
2. The check for singleton bb should ignore debug-insns.

Attached patch fixes these problems.



Fix try_transform_to_exit_first_loop_alt



PR tree-optimization/66442
* gimple-iterator.h (gimple_seq_nondebug_singleton_p): Add function.
* tree-parloops.c (try_transform_to_exit_first_loop_alt): Return false
if the loop latch is not a singleton.  Use
gimple_seq_nondebug_singleton_p instead of gimple_seq_singleton_p.


Per my testing, the backport of this patch that you committed to
gomp-4_0-branch, r224219, introduces a number of regressions in your
OpenACC kernels test cases, specifically the »scan-tree-dump-times
parloops_oacc_kernels (?n)pragma omp target
oacc_parallel.*num_gangs\\(32\\) 1« tests.  Would you please have a
look?




Hi Thomas,

I seem to have committed (to both trunk and gomp-4_0-branch) an older 
version of the patch, which contained an incorrect version of 
gimple_seq_nondebug_singleton_p.


I'll correct the mistake tomorrow morning.

Thanks,
- Tom


Grüße,
  Thomas



  gcc/gimple-iterator.h | 29 +
  gcc/tree-parloops.c   |  4 ++--
  2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h
index 87e943a..76fa456 100644
--- a/gcc/gimple-iterator.h
+++ b/gcc/gimple-iterator.h
@@ -345,4 +345,33 @@ gsi_seq (gimple_stmt_iterator i)
return *i.seq;
  }

+/* Determine whether SEQ is a nondebug singleton.  */
+
+static inline bool
+gimple_seq_nondebug_singleton_p (gimple_seq seq)
+{
+  gimple_stmt_iterator gsi;
+
+  /* Find a nondebug gimple.  */
+  gsi.ptr = gimple_seq_first (seq);
+  gsi.seq = seq;
+  gsi.bb = NULL;
+  while (!gsi_end_p (gsi)
+ is_gimple_debug (gsi_stmt (gsi)))
+gsi_next (gsi);
+
+  /* No nondebug gimple found, not a singleton.  */
+  if (gsi_end_p (gsi))
+return false;
+
+  /* Find a next nondebug gimple.  */
+  gsi_next (gsi);
+  while (!gsi_end_p (gsi)
+ is_gimple_debug (gsi_stmt (gsi)))
+gsi_next (gsi);
+
+  /* Only a singleton if there's no next nondebug gimple.  */
+  return gsi_end_p (gsi);
+}
+
  #endif /* GCC_GIMPLE_ITERATOR_H */
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 02f44eb..c4b83fe 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -1769,8 +1769,8 @@ try_transform_to_exit_first_loop_alt (struct loop *loop,
  tree nit)
  {
/* Check whether the latch contains a single statement.  */
-  if (!gimple_seq_singleton_p (bb_seq (loop-latch)))
-return true;
+  if (!gimple_seq_nondebug_singleton_p (bb_seq (loop-latch)))
+return false;

/* Check whether the latch contains the loop iv increment.  */
edge back = single_succ_edge (loop-latch);
--
1.9.1





Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt

2015-06-08 Thread Richard Biener
On Mon, 8 Jun 2015, Tom de Vries wrote:

 On 04/06/15 10:28, Tom de Vries wrote:
   I'm ok with the patch and count on you to fix eventual fallout ;)
   
  
  Great, will do.
 
 And here is the fallout:
 * PR66442 - [6 regression] FAIL: gcc.dg/autopar/pr46885.c (test for
   excess errors)
 
 There are two problems in try_transform_to_exit_first_loop_alt:
 1. In case the latch is not a singleton bb, the function should return
false rather than true.
 2. The check for singleton bb should ignore debug-insns.
 
 Attached patch fixes these problems.
 
 Bootstrapped and reg-tested on x86_64.
 
 Verified by Andreas to fix the problem on m68k.
 
 OK for trunk?

Ok.

Thanks,
Richard.


Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt

2015-06-08 Thread Tom de Vries

On 04/06/15 10:28, Tom de Vries wrote:

I'm ok with the patch and count on you to fix eventual fallout ;)



Great, will do.


And here is the fallout:
* PR66442 - [6 regression] FAIL: gcc.dg/autopar/pr46885.c (test for
  excess errors)

There are two problems in try_transform_to_exit_first_loop_alt:
1. In case the latch is not a singleton bb, the function should return
   false rather than true.
2. The check for singleton bb should ignore debug-insns.

Attached patch fixes these problems.

Bootstrapped and reg-tested on x86_64.

Verified by Andreas to fix the problem on m68k.

OK for trunk?

Thanks,
- Tom

Fix try_transform_to_exit_first_loop_alt

2015-06-06  Tom de Vries  t...@codesourcery.com

	PR tree-optimization/66442
	* gimple-iterator.h (gimple_seq_nondebug_singleton_p): Add function.
	* tree-parloops.c (try_transform_to_exit_first_loop_alt): Return false
	if the loop latch is not a singleton.  Use
	gimple_seq_nondebug_singleton_p instead of gimple_seq_singleton_p.
---
 gcc/gimple-iterator.h | 29 +
 gcc/tree-parloops.c   |  4 ++--
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h
index 87e943a..76fa456 100644
--- a/gcc/gimple-iterator.h
+++ b/gcc/gimple-iterator.h
@@ -345,4 +345,33 @@ gsi_seq (gimple_stmt_iterator i)
   return *i.seq;
 }
 
+/* Determine whether SEQ is a nondebug singleton.  */
+
+static inline bool
+gimple_seq_nondebug_singleton_p (gimple_seq seq)
+{
+  gimple_stmt_iterator gsi;
+
+  /* Find a nondebug gimple.  */
+  gsi.ptr = gimple_seq_first (seq);
+  gsi.seq = seq;
+  gsi.bb = NULL;
+  while (!gsi_end_p (gsi)
+	  is_gimple_debug (gsi_stmt (gsi)))
+gsi_next (gsi);
+
+  /* No nondebug gimple found, not a singleton.  */
+  if (gsi_end_p (gsi))
+return false;
+
+  /* Find a next nondebug gimple.  */
+  gsi_next (gsi);
+  while (!gsi_end_p (gsi)
+	  is_gimple_debug (gsi_stmt (gsi)))
+gsi_next (gsi);
+
+  /* Only a singleton if there's no next nondebug gimple.  */
+  return gsi_end_p (gsi);
+}
+
 #endif /* GCC_GIMPLE_ITERATOR_H */
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 02f44eb..c4b83fe 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -1769,8 +1769,8 @@ try_transform_to_exit_first_loop_alt (struct loop *loop,
   tree nit)
 {
   /* Check whether the latch contains a single statement.  */
-  if (!gimple_seq_singleton_p (bb_seq (loop-latch)))
-return true;
+  if (!gimple_seq_nondebug_singleton_p (bb_seq (loop-latch)))
+return false;
 
   /* Check whether the latch contains the loop iv increment.  */
   edge back = single_succ_edge (loop-latch);
-- 
1.9.1



Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt

2015-06-08 Thread Thomas Schwinge
Hi Tom!

On Mon, 8 Jun 2015 12:43:01 +0200, Tom de Vries tom_devr...@mentor.com wrote:
 There are two problems in try_transform_to_exit_first_loop_alt:
 1. In case the latch is not a singleton bb, the function should return
 false rather than true.
 2. The check for singleton bb should ignore debug-insns.
 
 Attached patch fixes these problems.

 Fix try_transform_to_exit_first_loop_alt

   PR tree-optimization/66442
   * gimple-iterator.h (gimple_seq_nondebug_singleton_p): Add function.
   * tree-parloops.c (try_transform_to_exit_first_loop_alt): Return false
   if the loop latch is not a singleton.  Use
   gimple_seq_nondebug_singleton_p instead of gimple_seq_singleton_p.

Per my testing, the backport of this patch that you committed to
gomp-4_0-branch, r224219, introduces a number of regressions in your
OpenACC kernels test cases, specifically the »scan-tree-dump-times
parloops_oacc_kernels (?n)pragma omp target
oacc_parallel.*num_gangs\\(32\\) 1« tests.  Would you please have a
look?


Grüße,
 Thomas


  gcc/gimple-iterator.h | 29 +
  gcc/tree-parloops.c   |  4 ++--
  2 files changed, 31 insertions(+), 2 deletions(-)
 
 diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h
 index 87e943a..76fa456 100644
 --- a/gcc/gimple-iterator.h
 +++ b/gcc/gimple-iterator.h
 @@ -345,4 +345,33 @@ gsi_seq (gimple_stmt_iterator i)
return *i.seq;
  }
  
 +/* Determine whether SEQ is a nondebug singleton.  */
 +
 +static inline bool
 +gimple_seq_nondebug_singleton_p (gimple_seq seq)
 +{
 +  gimple_stmt_iterator gsi;
 +
 +  /* Find a nondebug gimple.  */
 +  gsi.ptr = gimple_seq_first (seq);
 +  gsi.seq = seq;
 +  gsi.bb = NULL;
 +  while (!gsi_end_p (gsi)
 +   is_gimple_debug (gsi_stmt (gsi)))
 +gsi_next (gsi);
 +
 +  /* No nondebug gimple found, not a singleton.  */
 +  if (gsi_end_p (gsi))
 +return false;
 +
 +  /* Find a next nondebug gimple.  */
 +  gsi_next (gsi);
 +  while (!gsi_end_p (gsi)
 +   is_gimple_debug (gsi_stmt (gsi)))
 +gsi_next (gsi);
 +
 +  /* Only a singleton if there's no next nondebug gimple.  */
 +  return gsi_end_p (gsi);
 +}
 +
  #endif /* GCC_GIMPLE_ITERATOR_H */
 diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
 index 02f44eb..c4b83fe 100644
 --- a/gcc/tree-parloops.c
 +++ b/gcc/tree-parloops.c
 @@ -1769,8 +1769,8 @@ try_transform_to_exit_first_loop_alt (struct loop *loop,
 tree nit)
  {
/* Check whether the latch contains a single statement.  */
 -  if (!gimple_seq_singleton_p (bb_seq (loop-latch)))
 -return true;
 +  if (!gimple_seq_nondebug_singleton_p (bb_seq (loop-latch)))
 +return false;
  
/* Check whether the latch contains the loop iv increment.  */
edge back = single_succ_edge (loop-latch);
 -- 
 1.9.1
 


signature.asc
Description: PGP signature


Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt

2015-06-04 Thread Tom de Vries

On 26/05/15 12:39, Richard Biener wrote:

On Thu, 14 May 2015, Tom de Vries wrote:


On 20-04-15 14:25, Richard Biener wrote:

On Wed, 15 Apr 2015, Tom de Vries wrote:


On 03-04-15 14:39, Tom de Vries wrote:

On 27-03-15 15:10, Tom de Vries wrote:

Hi,

this patch fixes PR65443, a todo in the parloops pass for function
transform_to_exit_first_loop:
...
  TODO: the common case is that latch of the loop is empty and
immediately
  follows the loop exit.  In this case, it would be better not to
copy
the
  body of the loop, but only move the entry of the loop directly
before
the
  exit check and increase the number of iterations of the loop by
one.
  This may need some additional preconditioning in case NIT = ~0.
...

The current implementation of transform_to_exit_first_loop transforms
the
loop
by moving and duplicating the loop body. This patch transforms the
loop
into the
same normal form, but without the duplication, if that's possible
(determined by
try_transform_to_exit_first_loop_alt).

The actual transformation, done by transform_to_exit_first_loop_alt
transforms
this loop:
...
bb preheader:
...
goto bb header

bb header:
...
if (ivtmp  n)
  goto bb latch;
else
  goto bb exit;

bb latch:
ivtmp = ivtmp + inc;
goto bb header
...

into this one:
...
bb preheader:
...
goto bb newheader

bb header:
...
goto bb latch;

bb newheader:
if (ivtmp  n + 1)
  goto bb header;
else
  goto bb exit;

bb latch:
ivtmp = ivtmp + inc;
goto bb newheader
...



Updated patch, now using redirect_edge_var_map and flush_pending_stmts.

Bootstrapped and reg-tested on x86_64.

OK for stage1 trunk?



Ping.




Hi Richard,

thanks for the review.


Sorry for the delay (too many things to review, too much other work...)



Np, thanks for the review.


+static void
+replace_imm_uses (tree val, imm_use_iterator *imm_iter)
+{
+  use_operand_p use_p;
+
+  FOR_EACH_IMM_USE_ON_STMT (use_p, *imm_iter)
+SET_USE (use_p, val);

Use propagate_value.  Why this odd interface passing a imm_iter?!



The replace_imm_uses function is common code factored out of
replace_uses_in_bb_by and replace_uses_in_bbs_by. I'm not sure what is odd
about the interface of the replace_imm_uses function, but I've removed the
function.

I tried using propagate_value, but that one doesn't like virtuals.


In fact most of the repair SSA in transform_to_exit_first_loop_alt
looks too ad-hoc to me ... it also looks somewhat familiar to other
code ...

Unfortunately the comment before the function isn't in SSA form
so it's hard to follow the transform.



I've added the ssa bits in the transformation comment before the function, and
updated variable names and comments in the function.


I consider the parloops code bitrotten, no repair possible, so
I might be convinced to not care about new spaghetti in there...

+  /* Fix up loop structure.  TODO: Check whether this is sufficient.  */
+  loop-header = new_header;
+

no, surely not.  Number of iterations (and estimates) are off
after the transform


The loop is cancelled at the end of gen_parallel_loop, and the estimations are
removed.  We only seem to be using a limited set of the loop data until the
loop is cancelled. Updated comment accordingly.


and loop-latch might also need updating.



That one actually stays the same. Updated comment accordingly.


Safest is to simply schedule a fixup (but you'll lose any
loop annotation in that process).



Since the loop is cancelled, AFAIU we don't need that, but... You're
referring to the annotations mentioned in
replace_loop_annotate_in_block? Losing the annotations seems to be a
generic problem of scheduling such a fixup, not particular to this
patch. Do you have a concern related to the patch and these annotations?


For example such annotations (or others added by OpenMP like the
safe_len stuff).  Generally fixups preserve those _iff_ the loop
header stays the same.  So in your case doing

   loop-header = new_header;
   set_loops_state (LOOPS_NEED_FIXUP);

would probably work.  But yes, if the loop is cancelled anyway
there is no point jumping through hoops.


+  /* Figure out whether nit + 1 overflows.  */
+  if (TREE_CODE (nit) == INTEGER_CST)
+{
+  if (!tree_int_cst_equal (nit, TYPE_MAXVAL (nit_type)))

in case nit_type is a pointer type TYPE_MAXVAL will be NULL I think.



We've done ivcanon just before this point, so AFAIU we can assume we're
dealing with an unsigned integer.


Is the whole exercise for performance?


I'm trying to fix the todo in the code for parloops, which is about
performance, though I don't expect a large gain.

OTOH, my main focus is a related oacc kernels issue. For an oacc kernels
region, the entire loop is enclosed in a kernels region. Peeling of the last
iteration means I have to guard that iteration 

Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt

2015-05-26 Thread Richard Biener
On Thu, 14 May 2015, Tom de Vries wrote:

 On 20-04-15 14:25, Richard Biener wrote:
  On Wed, 15 Apr 2015, Tom de Vries wrote:
  
   On 03-04-15 14:39, Tom de Vries wrote:
On 27-03-15 15:10, Tom de Vries wrote:
 Hi,
 
 this patch fixes PR65443, a todo in the parloops pass for function
 transform_to_exit_first_loop:
 ...
  TODO: the common case is that latch of the loop is empty and
 immediately
  follows the loop exit.  In this case, it would be better not to
 copy
 the
  body of the loop, but only move the entry of the loop directly
 before
 the
  exit check and increase the number of iterations of the loop by
 one.
  This may need some additional preconditioning in case NIT = ~0.
 ...
 
 The current implementation of transform_to_exit_first_loop transforms
 the
 loop
 by moving and duplicating the loop body. This patch transforms the
 loop
 into the
 same normal form, but without the duplication, if that's possible
 (determined by
 try_transform_to_exit_first_loop_alt).
 
 The actual transformation, done by transform_to_exit_first_loop_alt
 transforms
 this loop:
 ...
bb preheader:
...
goto bb header
 
bb header:
...
if (ivtmp  n)
  goto bb latch;
else
  goto bb exit;
 
bb latch:
ivtmp = ivtmp + inc;
goto bb header
 ...
 
 into this one:
 ...
bb preheader:
...
goto bb newheader
 
bb header:
...
goto bb latch;
 
bb newheader:
if (ivtmp  n + 1)
  goto bb header;
else
  goto bb exit;
 
bb latch:
ivtmp = ivtmp + inc;
goto bb newheader
 ...
 

Updated patch, now using redirect_edge_var_map and flush_pending_stmts.

Bootstrapped and reg-tested on x86_64.

OK for stage1 trunk?

   
   Ping.
  
 
 Hi Richard,
 
 thanks for the review.

Sorry for the delay (too many things to review, too much other work...)

  +static void
  +replace_imm_uses (tree val, imm_use_iterator *imm_iter)
  +{
  +  use_operand_p use_p;
  +
  +  FOR_EACH_IMM_USE_ON_STMT (use_p, *imm_iter)
  +SET_USE (use_p, val);
  
  Use propagate_value.  Why this odd interface passing a imm_iter?!
  
 
 The replace_imm_uses function is common code factored out of
 replace_uses_in_bb_by and replace_uses_in_bbs_by. I'm not sure what is odd
 about the interface of the replace_imm_uses function, but I've removed the
 function.
 
 I tried using propagate_value, but that one doesn't like virtuals.

  In fact most of the repair SSA in transform_to_exit_first_loop_alt
  looks too ad-hoc to me ... it also looks somewhat familiar to other
  code ...
  
  Unfortunately the comment before the function isn't in SSA form
  so it's hard to follow the transform.
  
 
 I've added the ssa bits in the transformation comment before the function, and
 updated variable names and comments in the function.
 
  I consider the parloops code bitrotten, no repair possible, so
  I might be convinced to not care about new spaghetti in there...
  
  +  /* Fix up loop structure.  TODO: Check whether this is sufficient.  */
  +  loop-header = new_header;
  +
  
  no, surely not.  Number of iterations (and estimates) are off
  after the transform
 
 The loop is cancelled at the end of gen_parallel_loop, and the estimations are
 removed.  We only seem to be using a limited set of the loop data until the
 loop is cancelled. Updated comment accordingly.
 
  and loop-latch might also need updating.
  
 
 That one actually stays the same. Updated comment accordingly.
 
  Safest is to simply schedule a fixup (but you'll lose any
  loop annotation in that process).
  
 
 Since the loop is cancelled, AFAIU we don't need that, but... You're 
 referring to the annotations mentioned in 
 replace_loop_annotate_in_block? Losing the annotations seems to be a 
 generic problem of scheduling such a fixup, not particular to this 
 patch. Do you have a concern related to the patch and these annotations?

For example such annotations (or others added by OpenMP like the
safe_len stuff).  Generally fixups preserve those _iff_ the loop
header stays the same.  So in your case doing

  loop-header = new_header;
  set_loops_state (LOOPS_NEED_FIXUP);

would probably work.  But yes, if the loop is cancelled anyway
there is no point jumping through hoops.

  +  /* Figure out whether nit + 1 overflows.  */
  +  if (TREE_CODE (nit) == INTEGER_CST)
  +{
  +  if (!tree_int_cst_equal (nit, TYPE_MAXVAL (nit_type)))
  
  in case nit_type is a pointer type TYPE_MAXVAL will be NULL I think.
  
 
 We've done ivcanon just before this point, so AFAIU we can assume we're
 dealing with an 

Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt

2015-05-14 Thread Tom de Vries

On 20-04-15 14:25, Richard Biener wrote:

On Wed, 15 Apr 2015, Tom de Vries wrote:


On 03-04-15 14:39, Tom de Vries wrote:

On 27-03-15 15:10, Tom de Vries wrote:

Hi,

this patch fixes PR65443, a todo in the parloops pass for function
transform_to_exit_first_loop:
...
 TODO: the common case is that latch of the loop is empty and
immediately
 follows the loop exit.  In this case, it would be better not to copy
the
 body of the loop, but only move the entry of the loop directly before
the
 exit check and increase the number of iterations of the loop by one.
 This may need some additional preconditioning in case NIT = ~0.
...

The current implementation of transform_to_exit_first_loop transforms the
loop
by moving and duplicating the loop body. This patch transforms the loop
into the
same normal form, but without the duplication, if that's possible
(determined by
try_transform_to_exit_first_loop_alt).

The actual transformation, done by transform_to_exit_first_loop_alt
transforms
this loop:
...
   bb preheader:
   ...
   goto bb header

   bb header:
   ...
   if (ivtmp  n)
 goto bb latch;
   else
 goto bb exit;

   bb latch:
   ivtmp = ivtmp + inc;
   goto bb header
...

into this one:
...
   bb preheader:
   ...
   goto bb newheader

   bb header:
   ...
   goto bb latch;

   bb newheader:
   if (ivtmp  n + 1)
 goto bb header;
   else
 goto bb exit;

   bb latch:
   ivtmp = ivtmp + inc;
   goto bb newheader
...



Updated patch, now using redirect_edge_var_map and flush_pending_stmts.

Bootstrapped and reg-tested on x86_64.

OK for stage1 trunk?



Ping.




Hi Richard,

thanks for the review.


+static void
+replace_imm_uses (tree val, imm_use_iterator *imm_iter)
+{
+  use_operand_p use_p;
+
+  FOR_EACH_IMM_USE_ON_STMT (use_p, *imm_iter)
+SET_USE (use_p, val);

Use propagate_value.  Why this odd interface passing a imm_iter?!



The replace_imm_uses function is common code factored out of 
replace_uses_in_bb_by and replace_uses_in_bbs_by. I'm not sure what is odd about 
the interface of the replace_imm_uses function, but I've removed the function.


I tried using propagate_value, but that one doesn't like virtuals.


In fact most of the repair SSA in transform_to_exit_first_loop_alt
looks too ad-hoc to me ... it also looks somewhat familiar to other
code ...

Unfortunately the comment before the function isn't in SSA form
so it's hard to follow the transform.



I've added the ssa bits in the transformation comment before the function, and 
updated variable names and comments in the function.



I consider the parloops code bitrotten, no repair possible, so
I might be convinced to not care about new spaghetti in there...

+  /* Fix up loop structure.  TODO: Check whether this is sufficient.  */
+  loop-header = new_header;
+

no, surely not.  Number of iterations (and estimates) are off
after the transform


The loop is cancelled at the end of gen_parallel_loop, and the estimations are 
removed.  We only seem to be using a limited set of the loop data until the loop 
is cancelled. Updated comment accordingly.



and loop-latch might also need updating.



That one actually stays the same. Updated comment accordingly.


Safest is to simply schedule a fixup (but you'll lose any
loop annotation in that process).



Since the loop is cancelled, AFAIU we don't need that, but... You're referring 
to the annotations mentioned in replace_loop_annotate_in_block? Losing the 
annotations seems to be a generic problem of scheduling such a fixup, not 
particular to this patch. Do you have a concern related to the patch and these 
annotations?



+  /* Figure out whether nit + 1 overflows.  */
+  if (TREE_CODE (nit) == INTEGER_CST)
+{
+  if (!tree_int_cst_equal (nit, TYPE_MAXVAL (nit_type)))

in case nit_type is a pointer type TYPE_MAXVAL will be NULL I think.



We've done ivcanon just before this point, so AFAIU we can assume we're dealing 
with an unsigned integer.



Is the whole exercise for performance?


I'm trying to fix the todo in the code for parloops, which is about performance, 
though I don't expect a large gain.


OTOH, my main focus is a related oacc kernels issue. For an oacc kernels region, 
the entire loop is enclosed in a kernels region. Peeling of the last iteration 
means I have to guard that iteration to run on only one thread, which currently 
isn't done, so in that sense it's a correctness issue as well.



In that case using an
entirely new, unsigned IV, that runs from 0 to niter should be easiest and


Introducting such an IV would mean that we don't have to update the IV, but 
still we have to connect the new IV to the uses of the old IV.


The current special handling of the IV variable is actually not that elaborate:
...
+  /* Replace ivtmp_a with ivtmp_c in condition 'if (ivtmp_a  n)'.  */
+  replace_uses_in_bb_by (res_a, res_c, 

Re: [PING][PATCH][PR65443] Add transform_to_exit_first_loop_alt

2015-04-20 Thread Richard Biener
On Wed, 15 Apr 2015, Tom de Vries wrote:

 On 03-04-15 14:39, Tom de Vries wrote:
  On 27-03-15 15:10, Tom de Vries wrote:
   Hi,
   
   this patch fixes PR65443, a todo in the parloops pass for function
   transform_to_exit_first_loop:
   ...
   TODO: the common case is that latch of the loop is empty and
   immediately
   follows the loop exit.  In this case, it would be better not to copy
   the
   body of the loop, but only move the entry of the loop directly before
   the
   exit check and increase the number of iterations of the loop by one.
   This may need some additional preconditioning in case NIT = ~0.
   ...
   
   The current implementation of transform_to_exit_first_loop transforms the
   loop
   by moving and duplicating the loop body. This patch transforms the loop
   into the
   same normal form, but without the duplication, if that's possible
   (determined by
   try_transform_to_exit_first_loop_alt).
   
   The actual transformation, done by transform_to_exit_first_loop_alt
   transforms
   this loop:
   ...
 bb preheader:
 ...
 goto bb header
   
 bb header:
 ...
 if (ivtmp  n)
   goto bb latch;
 else
   goto bb exit;
   
 bb latch:
 ivtmp = ivtmp + inc;
 goto bb header
   ...
   
   into this one:
   ...
 bb preheader:
 ...
 goto bb newheader
   
 bb header:
 ...
 goto bb latch;
   
 bb newheader:
 if (ivtmp  n + 1)
   goto bb header;
 else
   goto bb exit;
   
 bb latch:
 ivtmp = ivtmp + inc;
 goto bb newheader
   ...
   
  
  Updated patch, now using redirect_edge_var_map and flush_pending_stmts.
  
  Bootstrapped and reg-tested on x86_64.
  
  OK for stage1 trunk?
  
 
 Ping.

+static void
+replace_imm_uses (tree val, imm_use_iterator *imm_iter)
+{
+  use_operand_p use_p;
+
+  FOR_EACH_IMM_USE_ON_STMT (use_p, *imm_iter)
+SET_USE (use_p, val);

Use propagate_value.  Why this odd interface passing a imm_iter?!

In fact most of the repair SSA in transform_to_exit_first_loop_alt
looks too ad-hoc to me ... it also looks somewhat familiar to other
code ...

Unfortunately the comment before the function isn't in SSA form
so it's hard to follow the transform.

I consider the parloops code bitrotten, no repair possible, so
I might be convinced to not care about new spaghetti in there...

+  /* Fix up loop structure.  TODO: Check whether this is sufficient.  */
+  loop-header = new_header;
+

no, surely not.  Number of iterations (and estimates) are off
after the transform and loop-latch might also need updating.

Safest is to simply schedule a fixup (but you'll lose any
loop annotation in that process).

+  /* Figure out whether nit + 1 overflows.  */
+  if (TREE_CODE (nit) == INTEGER_CST)
+{
+  if (!tree_int_cst_equal (nit, TYPE_MAXVAL (nit_type)))

in case nit_type is a pointer type TYPE_MAXVAL will be NULL I think.

Is the whole exercise for performance?  In that case using an
entirely new, unsigned IV, that runs from 0 to niter should
be easiest and just run-time guard that niter == +INF case?

For the graphite case, can't you make graphite generate the
loops exit-first in the first place?

The testcases are just correctness ones?  That is, there isn't
any testcase that checks whether the new code is exercised?
(no extra debugging dumping?)

Thanks,
Richard.