Re: [PATCH 7/9] ifcvt: Emit two cmov variants and choose the less expensive one.

2019-10-27 Thread Richard Sandiford
Robin Dapp  writes:
> This patch duplicates the previous noce_emit_cmove logic.  First it
> passes the canonical comparison emits the sequence and costs it.
> Then, a second, separate sequence is created by passing the cc compare
> we extracted before.  The costs of both sequences are compared and the
> cheaper one is emitted.
> ---
>  gcc/ifcvt.c | 106 +++-
>  1 file changed, 97 insertions(+), 9 deletions(-)
>
> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
> index 3db707e1fd1..955f9541f60 100644
> --- a/gcc/ifcvt.c
> +++ b/gcc/ifcvt.c
> @@ -3208,8 +3208,11 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
>  
>hash_map need_temps;
>  
> -  if (!cc_cmp)
> -check_need_temps (then_bb, _temps, cond);
> +  /* If we newly set a CC before a cmov, we might need a temporary
> + even though the compare will be removed by a later pass.  Costing
> + of a sequence with these will be inaccurate.  */
> +
> +  check_need_temps (then_bb, _temps, cond);
>  
>hash_map temps_created;
>  
> @@ -3298,18 +3301,103 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
> old_val = lowpart_subreg (dst_mode, old_val, src_mode);
>   }
>  
> -  /* Actually emit the conditional move.  */
> -  rtx temp_dest = noce_emit_cmove (if_info, temp, cond_code,
> -x, y, new_val, old_val, cc_cmp);
> +  /* Try emitting a conditional move passing the backend the
> +  canonicalized comparison.  This can e.g. be used to recognize
> +  expressions like
> +
> +if (a < b)
> +{
> +  ...
> +  b = a;
> +}
> +
> +  A backend can recognize this and emit a min/max insn.
> +  We will still emit a superfluous CC comparison before the
> +  min/max, though, which complicates costing.
> +  */
> +
> +  rtx_insn *seq, *seq1 = 0, *seq2 = 0;
> +  rtx temp_dest, temp_dest1, temp_dest2;
> +  unsigned int cost1 = 0, cost2 = 0;
> +  bool speed_p = if_info->speed_p;
> +  push_topmost_sequence ();

Why do we need the push_topmost_sequence/pop_topmost_sequence?
start_sequence/end_sequence ought to be enough.

> +
> +  {
> + start_sequence ();
> + temp_dest1 = noce_emit_cmove (if_info, temp, cond_code,
> + x, y, new_val, old_val, NULL_RTX);

Nit: arguments should be indented below "if_info, ".

> +
> + /* If we failed to expand the conditional move, drop out and don't
> +try to continue.  */
> + if (temp_dest1 == NULL_RTX)
> +   {
> + seq1 = NULL;
> + cost1 = COSTS_N_INSNS (100);
> +   }
> + else
> +   {
> + seq1 = get_insns ();
> + cost1 = seq_cost (seq1, speed_p);
> + rtx_insn *ii;
> + for (ii = seq1; ii; ii = NEXT_INSN (ii))
> +   if (recog_memozied (ii) == -1)
> + cost1 += 1;
> +   }
> + end_sequence ();

Don't think we should use fake costs for failure.  Just setting seq1 to
null should be enough, and should be OK for the recog_memoized case too.
Can break the loop as soon as we find a failure.

> +  }
> +
> + /* Now try emitting one passing a non-canonicalized cc comparison.
> +This allows the backend to emit a cmov directly without additional
> +compare.  */
> +  {
> + start_sequence ();
> + temp_dest2 = noce_emit_cmove (if_info, target, cond_code,
> + x, y, new_val, old_val, cc_cmp);
> +
> + /* If we failed to expand the conditional move, drop out and don't
> +try to continue.  */
> + if (temp_dest2 == NULL_RTX)
> +   {
> + seq2 = NULL;
> + cost2 = COSTS_N_INSNS (100);
> +   }
> + else
> +   {
> + seq2 = get_insns ();
> + cost2 = seq_cost (seq2, speed_p);
> + rtx_insn *ii;
> + for (ii = seq2; ii; ii = NEXT_INSN (ii))
> +   if (recog_memozied (ii) == -1)
> + cost2 += 1;
> +   }
> + end_sequence ();
> +  }

Same comments here.  Would be nice to split this out into a subroutine
rather than duplicate it.

Thanks,
Richard

>  
> -  /* If we failed to expand the conditional move, drop out and don't
> -  try to continue.  */
> -  if (temp_dest == NULL_RTX)
> +  /* Check which version is less expensive.  */
> +  if (cost1 <= cost2 && seq1 != NULL_RTX)
> + {
> +   temp_dest = temp_dest1;
> +   seq = seq1;
> + }
> +  else if (seq2 != NULL_RTX)
> + {
> +   /* We do not require a temp register, so remove it from the list.  */
> +   seq = seq2;
> +   temp_dest = temp_dest2;
> +   temps_created.remove (temp);
> + }
> +  else
>   {
> +   /* Nothing worked, bail out.  */
> +   pop_topmost_sequence ();
> end_sequence ();
> -   return FALSE;
> +   return false;
>   }
>  
> +  /* End the sub sequence and emit to the main sequence.  */
> +  pop_topmost_sequence ();
> + 

[PATCH 7/9] ifcvt: Emit two cmov variants and choose the less expensive one.

2019-08-02 Thread Robin Dapp
This patch duplicates the previous noce_emit_cmove logic.  First it
passes the canonical comparison emits the sequence and costs it.
Then, a second, separate sequence is created by passing the cc compare
we extracted before.  The costs of both sequences are compared and the
cheaper one is emitted.
---
 gcc/ifcvt.c | 106 +++-
 1 file changed, 97 insertions(+), 9 deletions(-)

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 3db707e1fd1..955f9541f60 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -3208,8 +3208,11 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 
   hash_map need_temps;
 
-  if (!cc_cmp)
-check_need_temps (then_bb, _temps, cond);
+  /* If we newly set a CC before a cmov, we might need a temporary
+ even though the compare will be removed by a later pass.  Costing
+ of a sequence with these will be inaccurate.  */
+
+  check_need_temps (then_bb, _temps, cond);
 
   hash_map temps_created;
 
@@ -3298,18 +3301,103 @@ noce_convert_multiple_sets (struct noce_if_info 
*if_info)
  old_val = lowpart_subreg (dst_mode, old_val, src_mode);
}
 
-  /* Actually emit the conditional move.  */
-  rtx temp_dest = noce_emit_cmove (if_info, temp, cond_code,
-  x, y, new_val, old_val, cc_cmp);
+  /* Try emitting a conditional move passing the backend the
+canonicalized comparison.  This can e.g. be used to recognize
+expressions like
+
+  if (a < b)
+  {
+...
+b = a;
+  }
+
+A backend can recognize this and emit a min/max insn.
+We will still emit a superfluous CC comparison before the
+min/max, though, which complicates costing.
+  */
+
+  rtx_insn *seq, *seq1 = 0, *seq2 = 0;
+  rtx temp_dest, temp_dest1, temp_dest2;
+  unsigned int cost1 = 0, cost2 = 0;
+  bool speed_p = if_info->speed_p;
+  push_topmost_sequence ();
+
+  {
+   start_sequence ();
+   temp_dest1 = noce_emit_cmove (if_info, temp, cond_code,
+   x, y, new_val, old_val, NULL_RTX);
+
+   /* If we failed to expand the conditional move, drop out and don't
+  try to continue.  */
+   if (temp_dest1 == NULL_RTX)
+ {
+   seq1 = NULL;
+   cost1 = COSTS_N_INSNS (100);
+ }
+   else
+ {
+   seq1 = get_insns ();
+   cost1 = seq_cost (seq1, speed_p);
+   rtx_insn *ii;
+   for (ii = seq1; ii; ii = NEXT_INSN (ii))
+ if (recog_memozied (ii) == -1)
+   cost1 += 1;
+ }
+   end_sequence ();
+  }
+
+   /* Now try emitting one passing a non-canonicalized cc comparison.
+  This allows the backend to emit a cmov directly without additional
+  compare.  */
+  {
+   start_sequence ();
+   temp_dest2 = noce_emit_cmove (if_info, target, cond_code,
+   x, y, new_val, old_val, cc_cmp);
+
+   /* If we failed to expand the conditional move, drop out and don't
+  try to continue.  */
+   if (temp_dest2 == NULL_RTX)
+ {
+   seq2 = NULL;
+   cost2 = COSTS_N_INSNS (100);
+ }
+   else
+ {
+   seq2 = get_insns ();
+   cost2 = seq_cost (seq2, speed_p);
+   rtx_insn *ii;
+   for (ii = seq2; ii; ii = NEXT_INSN (ii))
+ if (recog_memozied (ii) == -1)
+   cost2 += 1;
+ }
+   end_sequence ();
+  }
 
-  /* If we failed to expand the conditional move, drop out and don't
-try to continue.  */
-  if (temp_dest == NULL_RTX)
+  /* Check which version is less expensive.  */
+  if (cost1 <= cost2 && seq1 != NULL_RTX)
+   {
+ temp_dest = temp_dest1;
+ seq = seq1;
+   }
+  else if (seq2 != NULL_RTX)
+   {
+ /* We do not require a temp register, so remove it from the list.  */
+ seq = seq2;
+ temp_dest = temp_dest2;
+ temps_created.remove (temp);
+   }
+  else
{
+ /* Nothing worked, bail out.  */
+ pop_topmost_sequence ();
  end_sequence ();
- return FALSE;
+ return false;
}
 
+  /* End the sub sequence and emit to the main sequence.  */
+  pop_topmost_sequence ();
+  emit_insn (seq);
+
   /* Bookkeeping.  */
   count++;
   targets.safe_push (target);
-- 
2.17.0