Re: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant

2015-03-05 Thread Richard Biener
On Thu, Mar 5, 2015 at 10:53 AM, Thomas Preud'homme
 wrote:
> Note: this is stage1 material.
>
> Currently loop2_invariant pass hoist instructions out of loop by creating a 
> new temporary for the destination register of that instruction and leaving 
> there a mov from new temporary to old register as shown below:
>
> loop header
> start of loop body
> //stuff
> (set (reg 128) (const_int 0))
> //other stuff
> end of loop body
>
> becomes:
>
> (set (reg 129) (const_int 0))
> loop header
> start of loop body
> //stuff
> (set (reg 128) (reg 128))
> //other stuff
> end of loop body
>
> This is one of the errors that led to a useless ldr ending up inside a loop 
> (PR64616). This patch fix this specific bit (some other bit was fixed in [1]) 
> by simply moving the instruction if it's known to be safe. This is decided by 
> looking at all the uses of the register set in the instruction and checking 
> that (i) they were all dominated by the instruction and (ii) there is no 
> other def in the loop that could end up reaching one of the use.

Why doesn't copy-propagation clean this up?  It's run after loop2.

Richard.

> [1] https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00933.html
>
> ChangeLog entries are as follows:
>
> *** gcc/ChangeLog ***
>
> 2015-02-16  Thomas Preud'homme  
>
> * dominance.c (nearest_common_dominator_for_set): Fix A_Dominated_by_B
> code in comment.
> * loop-invariant.c (can_move_invariant_reg): New.
> (move_invariant_reg): Call above new function to decide whether
> instruction can just be moved, skipping creation of temporary
> register.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2015-02-16  Thomas Preud'homme  
>
> * gcc.dg/loop-7.c: Run on all targets and check for loop2_invariant
> being able to move instructions without introducing new temporary
> register.
> * gcc.dg/loop-8.c: New test.
>
> diff --git a/gcc/dominance.c b/gcc/dominance.c
> index 33d4ae4..09c8c90 100644
> --- a/gcc/dominance.c
> +++ b/gcc/dominance.c
> @@ -982,7 +982,7 @@ nearest_common_dominator_for_set (enum cdi_direction dir, 
> bitmap blocks)
>
> A_Dominated_by_B (node A, node B)
> {
> - return DFS_Number_In(A) >= DFS_Number_In(A)
> + return DFS_Number_In(A) >= DFS_Number_In(B)
>  && DFS_Number_Out (A) <= DFS_Number_Out(B);
> }  */
>
> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
> index f79b497..ab2a45c 100644
> --- a/gcc/loop-invariant.c
> +++ b/gcc/loop-invariant.c
> @@ -1512,6 +1512,99 @@ replace_uses (struct invariant *inv, rtx reg, bool 
> in_group)
>return 1;
>  }
>
> +/* Whether invariant INV setting REG can be moved out of LOOP, at the end of
> +   the block preceding its header.  */
> +
> +static bool
> +can_move_invariant_reg (struct loop *loop, struct invariant *inv, rtx reg)
> +{
> +  df_ref def, use;
> +  bool ret = false;
> +  unsigned int i, dest_regno, defs_in_loop_count = 0;
> +  rtx_insn *insn = inv->insn;
> +  bitmap may_exit, has_exit, always_executed;
> +  basic_block *body, bb = BLOCK_FOR_INSN (inv->insn);
> +
> +  /* We ignore hard register and memory access for cost and complexity 
> reasons.
> + Hard register are few at this stage and expensive to consider as they
> + require building a separate data flow.  Memory access would require 
> using
> + df_simulate_* and can_move_insns_across functions and is more complex.  
> */
> +  if (!REG_P (reg) || HARD_REGISTER_P (reg))
> +return false;
> +
> +  /* Check whether the set is always executed.  We could omit this condition 
> if
> + we know that the register is unused outside of the loop, but it does not
> + seem worth finding out.  */
> +  may_exit = BITMAP_ALLOC (NULL);
> +  has_exit = BITMAP_ALLOC (NULL);
> +  always_executed = BITMAP_ALLOC (NULL);
> +  body = get_loop_body_in_dom_order (loop);
> +  find_exits (loop, body, may_exit, has_exit);
> +  compute_always_reached (loop, body, has_exit, always_executed);
> +  /* Find bit position for basic block bb.  */
> +  for (i = 0; i < loop->num_nodes && body[i] != bb; i++);
> +  if (!bitmap_bit_p (always_executed, i))
> +goto cleanup;
> +
> +  /* Check that all uses reached by the def in insn would still be reached
> + it.  */
> +  dest_regno = REGNO (reg);
> +  for (use = DF_REG_USE_CHAIN (dest_regno); use; use = DF_REF_NEXT_REG (use))
> +{
> +  rtx ref;
> +  basic_block use_bb;
> +
> +  ref = DF_REF_INSN (use);
> +  use_bb = BLOCK_FOR_INSN (ref);
> +
> +  /* Ignore instruction considered for moving.  */
> +  if (ref == insn)
> +   continue;
> +
> +  /* Don't consider uses outside loop.  */
> +  if (!flow_bb_inside_loop_p (loop, use_bb))
> +   continue;
> +
> +  /* Don't move if a use is not dominated by def in insn.  */
> +  if (use_bb == bb && DF_INSN_LUID (insn) > DF_INSN_LUID (ref))
> +   goto cleanup;
> +  if (!dominated_by_p (CDI_DOMINATORS, use_bb, bb))
> +   goto clea

RE: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant

2015-03-06 Thread Thomas Preud'homme
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Thursday, March 05, 2015 7:12 PM
> >
> > loop header
> > start of loop body
> > //stuff
> > (set (reg 128) (const_int 0))
> > //other stuff
> > end of loop body
> >
> > becomes:
> >
> > (set (reg 129) (const_int 0))
> > loop header
> > start of loop body
> > //stuff
> > (set (reg 128) (reg 128))
> > //other stuff
> > end of loop body
> >
> 
> Why doesn't copy-propagation clean this up?  It's run after loop2.

Actually cprop3 is what makes the situation worse in this case as it
will copy the constant that is set outside the loop in the mov that is in
the loop. In the case or PR64616 the constant is a symbol_ref which
makes it a memory access so it propagates the memory access in the
loop, making the load executed many times.

Note that as I said in the intro this bug is also solved by [1] which is
the first thing that goes wrong for this example. That being said,
loop invariant pass ought to simply move instructions if it can safely
do so.

[1] https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00933.html

Best regards,

Thomas






Re: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant

2015-03-06 Thread Jiong Wang


On 05/03/15 09:53, Thomas Preud'homme wrote:

*** gcc/testsuite/ChangeLog ***

2015-02-16  Thomas Preud'homme  

 * gcc.dg/loop-7.c: Run on all targets and check for loop2_invariant
 being able to move instructions without introducing new temporary

Thomas,

  Can you please confirm this relax on all target will not fail on 
AArch64? It do fails on my quick test.





RE: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant

2015-03-09 Thread Thomas Preud'homme
> From: Jiong Wang [mailto:jiong.w...@arm.com]
> Sent: Friday, March 06, 2015 8:10 PM
> 
> On 05/03/15 09:53, Thomas Preud'homme wrote:
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2015-02-16  Thomas Preud'homme  
> >
> >  * gcc.dg/loop-7.c: Run on all targets and check for loop2_invariant
> >  being able to move instructions without introducing new
> temporary
> Thomas,
> 
>Can you please confirm this relax on all target will not fail on
> AArch64? It do fails on my quick test.

Indeed, I made a very naïve assumption here and should have tested on a wider
range of targets. I'll rework the testcases associated with this patch.

Thanks for catching it Jiong.

Best regards,

Thomas





Re: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant

2015-03-09 Thread Steven Bosscher
On Thu, Mar 5, 2015 at 10:53 AM, Thomas Preud'homme wrote:
> diff --git a/gcc/dominance.c b/gcc/dominance.c
> index 33d4ae4..09c8c90 100644
> --- a/gcc/dominance.c
> +++ b/gcc/dominance.c
> @@ -982,7 +982,7 @@ nearest_common_dominator_for_set (enum cdi_direction dir, 
> bitmap blocks)
>
> A_Dominated_by_B (node A, node B)
> {
> - return DFS_Number_In(A) >= DFS_Number_In(A)
> + return DFS_Number_In(A) >= DFS_Number_In(B)
>  && DFS_Number_Out (A) <= DFS_Number_Out(B);
> }  */

This hunk is obvious enough ;-)


> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
> index f79b497..ab2a45c 100644
> --- a/gcc/loop-invariant.c
> +++ b/gcc/loop-invariant.c
...
> +  /* Check whether the set is always executed.  We could omit this condition 
> if
> + we know that the register is unused outside of the loop, but it does not
> + seem worth finding out.  */
> +  may_exit = BITMAP_ALLOC (NULL);
> +  has_exit = BITMAP_ALLOC (NULL);
> +  always_executed = BITMAP_ALLOC (NULL);
> +  body = get_loop_body_in_dom_order (loop);
> +  find_exits (loop, body, may_exit, has_exit);
> +  compute_always_reached (loop, body, has_exit, always_executed);
> +  /* Find bit position for basic block bb.  */
> +  for (i = 0; i < loop->num_nodes && body[i] != bb; i++);
> +  if (!bitmap_bit_p (always_executed, i))
> +goto cleanup;

It looks like this would run for all candidate loop invariants, right?

If so, you're creating run time of O(n_invariants*n_bbs_in_loop), a
potential compile time hog for large loops.

But why compute this at all? Perhaps I'm missing something, but you
already have inv->always_executed available, no?




> +  /* Check that all uses reached by the def in insn would still be reached
> + it.  */
> +  dest_regno = REGNO (reg);
> +  for (use = DF_REG_USE_CHAIN (dest_regno); use; use = DF_REF_NEXT_REG (use))
> +{
> +  rtx ref;

Would be nice if we can start using rtx_insn for new code. You do so
further up, I suggest you continue that good citizenship here :-)


> +  basic_block use_bb;
> +
> +  ref = DF_REF_INSN (use);
> +  use_bb = BLOCK_FOR_INSN (ref);

You can use DF_REF_BB.



> +  /* Ignore instruction considered for moving.  */
> +  if (ref == insn)
> +   continue;
> +
> +  /* Don't consider uses outside loop.  */
> +  if (!flow_bb_inside_loop_p (loop, use_bb))
> +   continue;
> +
> +  /* Don't move if a use is not dominated by def in insn.  */
> +  if (use_bb == bb && DF_INSN_LUID (insn) > DF_INSN_LUID (ref))
> +   goto cleanup;

You're safer with ">=" even if you've already checked ref==insn.


Ciao!
Steven


RE: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant

2015-03-10 Thread Thomas Preud'homme
> From: Steven Bosscher [mailto:stevenb@gmail.com]
> Sent: Monday, March 09, 2015 7:48 PM
> To: Thomas Preud'homme
> Cc: GCC Patches; Eric Botcazou
> Subject: Re: [PATCH, stage1] Move insns without introducing new
> temporaries in loop2_invariant
> 
> On Thu, Mar 5, 2015 at 10:53 AM, Thomas Preud'homme wrote:
> > diff --git a/gcc/dominance.c b/gcc/dominance.c
> > index 33d4ae4..09c8c90 100644
> > --- a/gcc/dominance.c
> > +++ b/gcc/dominance.c
> > @@ -982,7 +982,7 @@ nearest_common_dominator_for_set (enum
> cdi_direction dir, bitmap blocks)
> >
> > A_Dominated_by_B (node A, node B)
> > {
> > - return DFS_Number_In(A) >= DFS_Number_In(A)
> > + return DFS_Number_In(A) >= DFS_Number_In(B)
> >  && DFS_Number_Out (A) <= DFS_Number_Out(B);
> > }  */
> 
> This hunk is obvious enough ;-)

Thus committed.

Best regards,

Thomas






RE: [PATCH, stage1] Move insns without introducing new temporaries in loop2_invariant

2015-03-16 Thread Thomas Preud'homme
> From: Steven Bosscher [mailto:stevenb@gmail.com]
> Sent: Monday, March 09, 2015 7:48 PM
> To: Thomas Preud'homme
> Cc: GCC Patches; Eric Botcazou
> Subject: Re: [PATCH, stage1] Move insns without introducing new
> temporaries in loop2_invariant

New patch below.

> 
> It looks like this would run for all candidate loop invariants, right?
> 
> If so, you're creating run time of O(n_invariants*n_bbs_in_loop), a
> potential compile time hog for large loops.
> 
> But why compute this at all? Perhaps I'm missing something, but you
> already have inv->always_executed available, no?

Indeed. I didn't realize the information was already there.

> 
> 
> > +  basic_block use_bb;
> > +
> > +  ref = DF_REF_INSN (use);
> > +  use_bb = BLOCK_FOR_INSN (ref);
> 
> You can use DF_REF_BB.

Since I need use_insn here I kept BLOCK_FOR_INSN but I used DF_REF_BB for the 
def below.


So here are the new ChangeLog entries:

*** gcc/ChangeLog ***

2015-03-11  Thomas Preud'homme  

* loop-invariant.c (can_move_invariant_reg): New.
(move_invariant_reg): Call above new function to decide whether
instruction can just be moved, skipping creation of temporary
register.

*** gcc/testsuite/ChangeLog ***

2015-03-12  Thomas Preud'homme  

* gcc.dg/loop-8.c: New test.
* gcc.dg/loop-9.c: New test.


diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index f79b497..8217d62 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -1512,6 +1512,79 @@ replace_uses (struct invariant *inv, rtx reg, bool 
in_group)
   return 1;
 }
 
And the new patch:

+/* Whether invariant INV setting REG can be moved out of LOOP, at the end of
+   the block preceding its header.  */
+
+static bool
+can_move_invariant_reg (struct loop *loop, struct invariant *inv, rtx reg)
+{
+  df_ref def, use;
+  unsigned int dest_regno, defs_in_loop_count = 0;
+  rtx_insn *insn = inv->insn;
+  basic_block bb = BLOCK_FOR_INSN (inv->insn);
+
+  /* We ignore hard register and memory access for cost and complexity reasons.
+ Hard register are few at this stage and expensive to consider as they
+ require building a separate data flow.  Memory access would require using
+ df_simulate_* and can_move_insns_across functions and is more complex.  */
+  if (!REG_P (reg) || HARD_REGISTER_P (reg))
+return false;
+
+  /* Check whether the set is always executed.  We could omit this condition if
+ we know that the register is unused outside of the loop, but it does not
+ seem worth finding out.  */
+  if (!inv->always_executed)
+return false;
+
+  /* Check that all uses reached by the def in insn would still be reached
+ it.  */
+  dest_regno = REGNO (reg);
+  for (use = DF_REG_USE_CHAIN (dest_regno); use; use = DF_REF_NEXT_REG (use))
+{
+  rtx_insn *use_insn;
+  basic_block use_bb;
+
+  use_insn = DF_REF_INSN (use);
+  use_bb = BLOCK_FOR_INSN (use_insn);
+
+  /* Ignore instruction considered for moving.  */
+  if (use_insn == insn)
+   continue;
+
+  /* Don't consider uses outside loop.  */
+  if (!flow_bb_inside_loop_p (loop, use_bb))
+   continue;
+
+  /* Don't move if a use is not dominated by def in insn.  */
+  if (use_bb == bb && DF_INSN_LUID (insn) >= DF_INSN_LUID (use_insn))
+   return false;
+  if (!dominated_by_p (CDI_DOMINATORS, use_bb, bb))
+   return false;
+}
+
+  /* Check for other defs.  Any other def in the loop might reach a use
+ currently reached by the def in insn.  */
+  for (def = DF_REG_DEF_CHAIN (dest_regno); def; def = DF_REF_NEXT_REG (def))
+{
+  basic_block def_bb = DF_REF_BB (def);
+
+  /* Defs in exit block cannot reach a use they weren't already.  */
+  if (single_succ_p (def_bb))
+   {
+ basic_block def_bb_succ;
+
+ def_bb_succ = single_succ (def_bb);
+ if (!flow_bb_inside_loop_p (loop, def_bb_succ))
+   continue;
+   }
+
+  if (++defs_in_loop_count > 1)
+   return false;
+}
+
+  return true;
+}
+
 /* Move invariant INVNO out of the LOOP.  Returns true if this succeeds, false
otherwise.  */
 
@@ -1545,11 +1618,8 @@ move_invariant_reg (struct loop *loop, unsigned invno)
}
}
 
-  /* Move the set out of the loop.  If the set is always executed (we could
-omit this condition if we know that the register is unused outside of
-the loop, but it does not seem worth finding out) and it has no uses
-that would not be dominated by it, we may just move it (TODO).
-Otherwise we need to create a temporary register.  */
+  /* If possible, just move the set out of the loop.  Otherwise, we
+need to create a temporary register.  */
   set = single_set (inv->insn);
   reg = dest = SET_DEST (set);