Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-11-30 Thread Richard Sandiford via Gcc-patches
BTW, in response to your earlier concern about stage 3: you posted the
series well in time for end of stage 1, so I think it can still go in
during stage 3.

Robin Dapp  writes:
> Hi Richard,
>
>> It's hard to judge this in isolation because it's not clear when
>> and how the new arguments are going to be used, but it seems OK
>> in principle.  Do you still want:
>> 
>>   /* If earliest == jump, try to build the cmove insn directly.
>>  This is helpful when combine has created some complex condition
>>  (like for alpha's cmovlbs) that we can't hope to regenerate
>>  through the normal interface.  */
>> 
>>   if (if_info->cond_earliest == if_info->jump)
>> {
>> 
>> to be used when cc_cmp and rev_cc_cmp are nonnull?
>
> My initial hunch was to just leave it in place as I did not manage to
> trigger it.  As it is going to be called and costed both ways (with
> cc_cmp, rev_cc_cmp and without) it is probably better to move it into
> the else branch.
>
> The single usage of this is in patch 5/7.  We are passing the already
> existing condition from the jump and its reverse to see if the backend
> can come up with something better than when creating a new comparison.
>
>>> +static rtx emit_conditional_move (rtx, rtx, rtx, rtx, machine_mode);
>>> +rtx emit_conditional_move (rtx, rtx, rtx, rtx, rtx, machine_mode);
>> 
>> This is redundant with the header file declaration.
>> 
>
> Removed it.
>
>> I think it'd be better to call one of these functions something else,
>> rather than make the interpretation of the third parameter depend on
>> the total number of parameters.  In the second overload, the comparison
>> rtx effectively replaces four parameters of the existing
>> emit_conditional_move, so perhaps that's the one that should remain
>> emit_conditional_move.  Maybe the first one should be called
>> emit_conditional_move_with_rev or something.
>
> Not entirely fond of calling the first one _with_rev because essentially
> both try normal and reversed variants but I agree that the naming is not
> ideal.  I don't have any great ideas how to properly untangle it so I
> would go with your suggestions in order to move forward.  As there is
> only one caller of the second function, we could also let the caller
> handle the reversing.  Then, the third function would need to be
> non-static, though.
>
> The third, static emit_conditional_move I already renamed locally to
> emit_conditional_move_1.

Thanks, renaming the third function helps.

>> Part of me wonders if this would be simpler if we created a structure
>> to describe a comparison and passed that around instead of individual
>> fields, but I guess it could become a rat hole.
>
> I also thought about this as it would allow us to use either
> representation as required by the usage site.  Even tried it in a branch
> locally but indeed it became ugly quickly so I postponed it for now.

Still, perhaps we could at least add (in rtl.h):

struct rtx_comparison {
  rtx_code code;
  machine_mode op_mode;
  rtx op0, op1;
};

and make the existing emit_conditional_moves use it instead of four
separate parameters.  These rtx arguments would then be replacing those
rtx_comparison arguments, which would avoid the ambiguity in the overloads.

With C++ it should be possible to rewrite the calls using { … }, e.g.:

  if (!emit_conditional_move (into_target, { cmp_code, op1_mode, cmp1, cmp2 },
  into_target, into_superword, word_mode, false))

so the new type wouldn't need to spread too far.

Does that sound OK?  If so, could you post the current version of full
patch series and say which bits still need review?

Thanks,
Richard


Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-11-12 Thread Robin Dapp via Gcc-patches
Hi Richard,

> It's hard to judge this in isolation because it's not clear when
> and how the new arguments are going to be used, but it seems OK
> in principle.  Do you still want:
> 
>   /* If earliest == jump, try to build the cmove insn directly.
>  This is helpful when combine has created some complex condition
>  (like for alpha's cmovlbs) that we can't hope to regenerate
>  through the normal interface.  */
> 
>   if (if_info->cond_earliest == if_info->jump)
> {
> 
> to be used when cc_cmp and rev_cc_cmp are nonnull?

My initial hunch was to just leave it in place as I did not manage to
trigger it.  As it is going to be called and costed both ways (with
cc_cmp, rev_cc_cmp and without) it is probably better to move it into
the else branch.

The single usage of this is in patch 5/7.  We are passing the already
existing condition from the jump and its reverse to see if the backend
can come up with something better than when creating a new comparison.

>> +static rtx emit_conditional_move (rtx, rtx, rtx, rtx, machine_mode);
>> +rtx emit_conditional_move (rtx, rtx, rtx, rtx, rtx, machine_mode);
> 
> This is redundant with the header file declaration.
> 

Removed it.

> I think it'd be better to call one of these functions something else,
> rather than make the interpretation of the third parameter depend on
> the total number of parameters.  In the second overload, the comparison
> rtx effectively replaces four parameters of the existing
> emit_conditional_move, so perhaps that's the one that should remain
> emit_conditional_move.  Maybe the first one should be called
> emit_conditional_move_with_rev or something.

Not entirely fond of calling the first one _with_rev because essentially
both try normal and reversed variants but I agree that the naming is not
ideal.  I don't have any great ideas how to properly untangle it so I
would go with your suggestions in order to move forward.  As there is
only one caller of the second function, we could also let the caller
handle the reversing.  Then, the third function would need to be
non-static, though.

The third, static emit_conditional_move I already renamed locally to
emit_conditional_move_1.

> Part of me wonders if this would be simpler if we created a structure
> to describe a comparison and passed that around instead of individual
> fields, but I guess it could become a rat hole.

I also thought about this as it would allow us to use either
representation as required by the usage site.  Even tried it in a branch
locally but indeed it became ugly quickly so I postponed it for now.

Regards
 Robin


Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-11-05 Thread Richard Sandiford via Gcc-patches
Robin Dapp  writes:
> Hi Richard,
>
> after giving it a second thought, and seeing that most of the changes to 
> existing code are not strictly necessary anymore, I figured it could be 
> easier not changing the current control flow too much like in the 
> attached patch.
>
> The changes remaining are to "outsource" the maybe_expand_insn part and 
> making the emit_conditional_move with full comparison and rev_comparsion 
> externally available.
>
> I suppose straightening of the arguably somewhat baroque parts, we can 
> defer to a separate patch.
>
> On s390 this works nicely but I haven't yet done a bootstrap on other archs.
>
> Regards
>   Robin
>
> commit eb50384ee0cdeeefa61ae89bdbb2875500b7ce60
> Author: Robin Dapp 
> Date:   Wed Nov 27 13:53:40 2019 +0100
>
> ifcvt/optabs: Allow using a CC comparison for emit_conditional_move.
> 
> Currently we only ever call emit_conditional_move with the comparison
> (as well as its comparands) we got from the jump.  Thus, backends are
> going to emit a CC comparison for every conditional move that is being
> generated instead of re-using the existing CC.
> This, combined with emitting temporaries for each conditional move,
> causes sky-high costs for conditional moves.
> 
> This patch allows to re-use a CC so the costing situation is improved a
> bit.

Sorry for the slow reply.

> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
> index 6ae883cbdd4..f7765e60548 100644
> --- a/gcc/ifcvt.c
> +++ b/gcc/ifcvt.c
> @@ -772,7 +772,7 @@ static int noce_try_addcc (struct noce_if_info *);
>  static int noce_try_store_flag_constants (struct noce_if_info *);
>  static int noce_try_store_flag_mask (struct noce_if_info *);
>  static rtx noce_emit_cmove (struct noce_if_info *, rtx, enum rtx_code, rtx,
> - rtx, rtx, rtx);
> + rtx, rtx, rtx, rtx = NULL, rtx = NULL);
>  static int noce_try_cmove (struct noce_if_info *);
>  static int noce_try_cmove_arith (struct noce_if_info *);
>  static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn **);
> @@ -1711,7 +1711,8 @@ noce_try_store_flag_mask (struct noce_if_info *if_info)
>  
>  static rtx
>  noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code,
> -  rtx cmp_a, rtx cmp_b, rtx vfalse, rtx vtrue)
> +  rtx cmp_a, rtx cmp_b, rtx vfalse, rtx vtrue, rtx cc_cmp,
> +  rtx rev_cc_cmp)
>  {
>rtx target ATTRIBUTE_UNUSED;
>int unsignedp ATTRIBUTE_UNUSED;
> @@ -1743,23 +1744,30 @@ noce_emit_cmove (struct noce_if_info *if_info, rtx x, 
> enum rtx_code code,
>end_sequence ();
>  }
>  
> -  /* Don't even try if the comparison operands are weird
> - except that the target supports cbranchcc4.  */
> -  if (! general_operand (cmp_a, GET_MODE (cmp_a))
> -  || ! general_operand (cmp_b, GET_MODE (cmp_b)))
> -{
> -  if (!have_cbranchcc4
> -   || GET_MODE_CLASS (GET_MODE (cmp_a)) != MODE_CC
> -   || cmp_b != const0_rtx)
> - return NULL_RTX;
> -}
> -
>unsignedp = (code == LTU || code == GEU
>  || code == LEU || code == GTU);
>  
> -  target = emit_conditional_move (x, code, cmp_a, cmp_b, VOIDmode,
> -   vtrue, vfalse, GET_MODE (x),
> -   unsignedp);
> +  if (cc_cmp != NULL_RTX && rev_cc_cmp != NULL_RTX)
> +target = emit_conditional_move (x, cc_cmp, rev_cc_cmp,
> + vtrue, vfalse, GET_MODE (x));
> +  else
> +{
> +  /* Don't even try if the comparison operands are weird
> +  except that the target supports cbranchcc4.  */
> +  if (! general_operand (cmp_a, GET_MODE (cmp_a))
> +   || ! general_operand (cmp_b, GET_MODE (cmp_b)))
> + {
> +   if (!have_cbranchcc4
> +   || GET_MODE_CLASS (GET_MODE (cmp_a)) != MODE_CC
> +   || cmp_b != const0_rtx)
> + return NULL_RTX;
> + }
> +
> +  target = emit_conditional_move (x, code, cmp_a, cmp_b, VOIDmode,
> +   vtrue, vfalse, GET_MODE (x),
> +   unsignedp);
> +}
> +

It's hard to judge this in isolation because it's not clear when
and how the new arguments are going to be used, but it seems OK
in principle.  Do you still want:

  /* If earliest == jump, try to build the cmove insn directly.
 This is helpful when combine has created some complex condition
 (like for alpha's cmovlbs) that we can't hope to regenerate
 through the normal interface.  */

  if (if_info->cond_earliest == if_info->jump)
{

to be used when cc_cmp and rev_cc_cmp are nonnull?

>if (target)
>  return target;
>  
> diff --git a/gcc/optabs.c b/gcc/optabs.c
> index 019bbb62882..25eecf29ed8 100644
> --- a/gcc/optabs.c
> +++ b/gcc/optabs.c
> @@ -52,6 +52,9 @@ static void prepare_float_lib_cmp (rtx, rtx, enum rtx_code, 
> rtx *,
>  static rtx expand_unop_direct (machine_mode, optab, rtx, rtx, int);
> 

Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-11-03 Thread Robin Dapp via Gcc-patches
Ping :)

Not wanting to pester but I'd have hoped to get this into stage1 still
as it helps performance on s390 in some cases.
Expecting there will be some more reviewing rounds for the remaining
patches, would I need to open a PR in order to be able to "officially"
commit this in a later stage? This could relax the time constraints.

Regards
 Robin


Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-10-18 Thread Robin Dapp via Gcc-patches

Hi Richard,

after giving it a second thought, and seeing that most of the changes to 
existing code are not strictly necessary anymore, I figured it could be 
easier not changing the current control flow too much like in the 
attached patch.


The changes remaining are to "outsource" the maybe_expand_insn part and 
making the emit_conditional_move with full comparison and rev_comparsion 
externally available.


I suppose straightening of the arguably somewhat baroque parts, we can 
defer to a separate patch.


On s390 this works nicely but I haven't yet done a bootstrap on other archs.

Regards
 Robin
commit eb50384ee0cdeeefa61ae89bdbb2875500b7ce60
Author: Robin Dapp 
Date:   Wed Nov 27 13:53:40 2019 +0100

ifcvt/optabs: Allow using a CC comparison for emit_conditional_move.

Currently we only ever call emit_conditional_move with the comparison
(as well as its comparands) we got from the jump.  Thus, backends are
going to emit a CC comparison for every conditional move that is being
generated instead of re-using the existing CC.
This, combined with emitting temporaries for each conditional move,
causes sky-high costs for conditional moves.

This patch allows to re-use a CC so the costing situation is improved a
bit.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 6ae883cbdd4..f7765e60548 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -772,7 +772,7 @@ static int noce_try_addcc (struct noce_if_info *);
 static int noce_try_store_flag_constants (struct noce_if_info *);
 static int noce_try_store_flag_mask (struct noce_if_info *);
 static rtx noce_emit_cmove (struct noce_if_info *, rtx, enum rtx_code, rtx,
-			rtx, rtx, rtx);
+			rtx, rtx, rtx, rtx = NULL, rtx = NULL);
 static int noce_try_cmove (struct noce_if_info *);
 static int noce_try_cmove_arith (struct noce_if_info *);
 static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn **);
@@ -1711,7 +1711,8 @@ noce_try_store_flag_mask (struct noce_if_info *if_info)
 
 static rtx
 noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code,
-		 rtx cmp_a, rtx cmp_b, rtx vfalse, rtx vtrue)
+		 rtx cmp_a, rtx cmp_b, rtx vfalse, rtx vtrue, rtx cc_cmp,
+		 rtx rev_cc_cmp)
 {
   rtx target ATTRIBUTE_UNUSED;
   int unsignedp ATTRIBUTE_UNUSED;
@@ -1743,23 +1744,30 @@ noce_emit_cmove (struct noce_if_info *if_info, rtx x, enum rtx_code code,
   end_sequence ();
 }
 
-  /* Don't even try if the comparison operands are weird
- except that the target supports cbranchcc4.  */
-  if (! general_operand (cmp_a, GET_MODE (cmp_a))
-  || ! general_operand (cmp_b, GET_MODE (cmp_b)))
-{
-  if (!have_cbranchcc4
-	  || GET_MODE_CLASS (GET_MODE (cmp_a)) != MODE_CC
-	  || cmp_b != const0_rtx)
-	return NULL_RTX;
-}
-
   unsignedp = (code == LTU || code == GEU
 	   || code == LEU || code == GTU);
 
-  target = emit_conditional_move (x, code, cmp_a, cmp_b, VOIDmode,
-  vtrue, vfalse, GET_MODE (x),
-  unsignedp);
+  if (cc_cmp != NULL_RTX && rev_cc_cmp != NULL_RTX)
+target = emit_conditional_move (x, cc_cmp, rev_cc_cmp,
+vtrue, vfalse, GET_MODE (x));
+  else
+{
+  /* Don't even try if the comparison operands are weird
+	 except that the target supports cbranchcc4.  */
+  if (! general_operand (cmp_a, GET_MODE (cmp_a))
+	  || ! general_operand (cmp_b, GET_MODE (cmp_b)))
+	{
+	  if (!have_cbranchcc4
+	  || GET_MODE_CLASS (GET_MODE (cmp_a)) != MODE_CC
+	  || cmp_b != const0_rtx)
+	return NULL_RTX;
+	}
+
+  target = emit_conditional_move (x, code, cmp_a, cmp_b, VOIDmode,
+  vtrue, vfalse, GET_MODE (x),
+  unsignedp);
+}
+
   if (target)
 return target;
 
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 019bbb62882..25eecf29ed8 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -52,6 +52,9 @@ static void prepare_float_lib_cmp (rtx, rtx, enum rtx_code, rtx *,
 static rtx expand_unop_direct (machine_mode, optab, rtx, rtx, int);
 static void emit_libcall_block_1 (rtx_insn *, rtx, rtx, rtx, bool);
 
+static rtx emit_conditional_move (rtx, rtx, rtx, rtx, machine_mode);
+rtx emit_conditional_move (rtx, rtx, rtx, rtx, rtx, machine_mode);
+
 /* Debug facility for use in GDB.  */
 void debug_optab_libfuncs (void);
 
@@ -4875,6 +4878,7 @@ emit_conditional_move (rtx target, enum rtx_code code, rtx op0, rtx op1,
   /* get_condition will prefer to generate LT and GT even if the old
  comparison was against zero, so undo that canonicalization here since
  comparisons against zero are cheaper.  */
+
   if (code == LT && op1 == const1_rtx)
 code = LE, op1 = const0_rtx;
   else if (code == GT && op1 == constm1_rtx)
@@ -4925,18 +4929,10 @@ emit_conditional_move (rtx target, enum rtx_code code, rtx op0, rtx op1,
 			OPTAB_WIDEN, , );
 	  if (comparison)
 	{
-	  class expand_operand ops[4];
-
-	  create_output_operand ([0], target, mode);
-	  create_fixed_operand ([1], comparison);
-	  

Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-10-14 Thread Richard Sandiford via Gcc-patches
Robin Dapp  writes:
> Hi Richard,
>
>> (2) Insert:
>> 
>>  if (SUBREG_P (src))
>>src = SUBREG_REG (src);
>> 
>> here.
>> 
>> OK with those changes if that works.  Let me know if they don't —
>> I'll try to be quicker with the next review.
>
> thank you, this looks good in a first testsuite run on s390.  If you 
> have time, would you mind looking at the other outstanding patches of 
> this series as well? In case of further comments, which I am sure there 
> will be, I could test them all at once afterwards.

Which ones still need review?  I think 2/7 and 3/7 are approved,
but 4/7 was still being discussed.  AFAICT the last message on
that was:

   https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576865.html

We probably need to reach a conclusion on that before 5/7.
6/7 and 7/7 looked to be s390-specific.

Thanks,
Richard


Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-10-14 Thread Robin Dapp via Gcc-patches

Hi Richard,


(2) Insert:

 if (SUBREG_P (src))
   src = SUBREG_REG (src);

here.

OK with those changes if that works.  Let me know if they don't —
I'll try to be quicker with the next review.


thank you, this looks good in a first testsuite run on s390.  If you 
have time, would you mind looking at the other outstanding patches of 
this series as well? In case of further comments, which I am sure there 
will be, I could test them all at once afterwards.


Thanks again
 Robin


Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-10-14 Thread Richard Sandiford via Gcc-patches
Hi Robin,

Thanks for the update and sorry for the late response.

Robin Dapp  writes:
> Hi Richard,
>
>> Don't we still need this code (without the REG_DEAD handling) for the
>> case in which…
>> 
>>> +  /* As we are transforming
>>> +if (x > y)
>>> +  {
>>> +a = b;
>>> +c = d;
>>> +  }
>>> +into
>>> +  a = (x > y) ...
>>> +  c = (x > y) ...
>>> +
>>> +we potentially check x > y before every set.
>>> +Even though the check might be removed by subsequent passes, this means
>>> +that we cannot transform
>>> +  if (x > y)
>>> +{
>>> +  x = y;
>>> +  ...
>>> +}
>>> +into
>>> +  x = (x > y) ...
>>> +  ...
>>> +since this would invalidate x and the following to-be-removed checks.
>>> +Therefore we introduce a temporary every time we are about to
>>> +overwrite a variable used in the check.  Costing of a sequence with
>>> +these is going to be inaccurate so only use temporaries when
>>> +needed.  */
>>> +  if (reg_overlap_mentioned_p (target, cond))
>>> +   temp = gen_reg_rtx (GET_MODE (target));
>> 
>> …this code triggers?  I don't see otherwise how later uses of x would
>> pick up “temp” instead of the original target.  E.g. suppose we had:
>> 
>>  if (x > y)
>>{
>>  x = …;
>>  z = x; // x does not die here
>>}
>> 
>> Without the loop, it looks like z would pick up the old value of x
>> (used in the comparison) instead of the new one.
>
> getting back to this now.  I re-added handling of the situation you 
> mentioned (even though I didn't manage to trigger it myself).

Yeah, the only reliable way to test for this might be an rtx test
(see testsuite/gcc.dg/rtl for some examples).  A test isn't necessary
though.

My only remaining concern is that bb_ok_for_noce_convert_multiple_sets
allows subreg sources:

  if (!(REG_P (src)
   || (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src))
   && subreg_lowpart_p (src
return false;

These subregs could also require rewrites.  It looks like the original
code checks for overlaps correctly, but doesn't necessarily deal with
a hit correctly.  So this is at least partly pre-existing.

I think the way of handling that is as follows:

> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
> index 017944f4f79..f1448667732 100644
> --- a/gcc/ifcvt.c
> +++ b/gcc/ifcvt.c
> @@ -98,6 +98,8 @@ static int dead_or_predicable (basic_block, basic_block, 
> basic_block,
>  edge, int);
>  static void noce_emit_move_insn (rtx, rtx);
>  static rtx_insn *block_has_only_trap (basic_block);
> +static void need_cmov_or_rewire (basic_block, hash_set *,
> +  hash_map *);
>
>  /* Count the number of non-jump active insns in BB.  */
>  
> @@ -3203,6 +3205,11 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
>auto_vec unmodified_insns;
>int count = 0;
>  
> +  hash_set need_no_cmov;
> +  hash_map rewired_src;
> +
> +  need_cmov_or_rewire (then_bb, _no_cmov, _src);
> +
>FOR_BB_INSNS (then_bb, insn)
>  {
>/* Skip over non-insns.  */
> @@ -3213,26 +3220,47 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
>gcc_checking_assert (set);
>  
>rtx target = SET_DEST (set);
> -  rtx temp = gen_reg_rtx (GET_MODE (target));
> -  rtx new_val = SET_SRC (set);
> +  rtx temp;
> +
> +  int *ii = rewired_src.get (insn);
> +  rtx new_val = ii == NULL ? SET_SRC (set) : temporaries[*ii];

(1) here use something like:

rtx new_val = SET_SRC (set);
if (int *ii = rewired_src.get (insn))
  new_val = simplify_replace_rtx (new_val, targets[*ii],
  temporaries[*ii]);

>rtx old_val = target;
>  
> -  /* If we were supposed to read from an earlier write in this block,
> -  we've changed the register allocation.  Rewire the read.  While
> -  we are looking, also try to catch a swap idiom.  */
> -  for (int i = count - 1; i >= 0; --i)
> - if (reg_overlap_mentioned_p (new_val, targets[i]))
> -   {
> - /* Catch a "swap" style idiom.  */
> - if (find_reg_note (insn, REG_DEAD, new_val) != NULL_RTX)
> -   /* The write to targets[i] is only live until the read
> -  here.  As the condition codes match, we can propagate
> -  the set to here.  */
> -   new_val = SET_SRC (single_set (unmodified_insns[i]));
> - else
> -   new_val = temporaries[i];
> - break;
> -   }
> +  /* As we are transforming
> +  if (x > y)
> +{
> +  a = b;
> +  c = d;
> +}
> +  into
> +a = (x > y) ...
> +c = (x > y) ...
> +
> +  we potentially check x > y before every set.
> +  Even though the check might be removed by subsequent passes, this means
> +  that we cannot transform
> +if (x > y)
> +  

Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-09-15 Thread Robin Dapp via Gcc-patches

Hi Richard,


Don't we still need this code (without the REG_DEAD handling) for the
case in which…


+  /* As we are transforming
+if (x > y)
+  {
+a = b;
+c = d;
+  }
+into
+  a = (x > y) ...
+  c = (x > y) ...
+
+we potentially check x > y before every set.
+Even though the check might be removed by subsequent passes, this means
+that we cannot transform
+  if (x > y)
+{
+  x = y;
+  ...
+}
+into
+  x = (x > y) ...
+  ...
+since this would invalidate x and the following to-be-removed checks.
+Therefore we introduce a temporary every time we are about to
+overwrite a variable used in the check.  Costing of a sequence with
+these is going to be inaccurate so only use temporaries when
+needed.  */
+  if (reg_overlap_mentioned_p (target, cond))
+   temp = gen_reg_rtx (GET_MODE (target));


…this code triggers?  I don't see otherwise how later uses of x would
pick up “temp” instead of the original target.  E.g. suppose we had:

 if (x > y)
   {
 x = …;
 z = x; // x does not die here
   }

Without the loop, it looks like z would pick up the old value of x
(used in the comparison) instead of the new one.


getting back to this now.  I re-added handling of the situation you 
mentioned (even though I didn't manage to trigger it myself).


Regards
 Robin
commit 2d909ee93ee1eb0f7474ed57581713367c22ba6c
Author: Robin Dapp 
Date:   Thu Jun 24 16:40:04 2021 +0200

ifcvt: Check if cmovs are needed.

When if-converting multiple SETs and we encounter a swap-style idiom

  if (a > b)
{
  tmp = c;   // [1]
  c = d;
  d = tmp;
}

ifcvt should not generate a conditional move for the instruction at
[1].

In order to achieve that, this patch goes through all relevant SETs
and marks the relevant instructions.  This helps to evaluate costs.

On top, only generate temporaries if the current cmov is going to
overwrite one of the comparands of the initial compare.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 017944f4f79..f1448667732 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -98,6 +98,8 @@ static int dead_or_predicable (basic_block, basic_block, basic_block,
 			   edge, int);
 static void noce_emit_move_insn (rtx, rtx);
 static rtx_insn *block_has_only_trap (basic_block);
+static void need_cmov_or_rewire (basic_block, hash_set *,
+ hash_map *);
 
 /* Count the number of non-jump active insns in BB.  */
 
@@ -3203,6 +3205,11 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   auto_vec unmodified_insns;
   int count = 0;
 
+  hash_set need_no_cmov;
+  hash_map rewired_src;
+
+  need_cmov_or_rewire (then_bb, _no_cmov, _src);
+
   FOR_BB_INSNS (then_bb, insn)
 {
   /* Skip over non-insns.  */
@@ -3213,26 +3220,47 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   gcc_checking_assert (set);
 
   rtx target = SET_DEST (set);
-  rtx temp = gen_reg_rtx (GET_MODE (target));
-  rtx new_val = SET_SRC (set);
+  rtx temp;
+
+  int *ii = rewired_src.get (insn);
+  rtx new_val = ii == NULL ? SET_SRC (set) : temporaries[*ii];
   rtx old_val = target;
 
-  /* If we were supposed to read from an earlier write in this block,
-	 we've changed the register allocation.  Rewire the read.  While
-	 we are looking, also try to catch a swap idiom.  */
-  for (int i = count - 1; i >= 0; --i)
-	if (reg_overlap_mentioned_p (new_val, targets[i]))
-	  {
-	/* Catch a "swap" style idiom.  */
-	if (find_reg_note (insn, REG_DEAD, new_val) != NULL_RTX)
-	  /* The write to targets[i] is only live until the read
-		 here.  As the condition codes match, we can propagate
-		 the set to here.  */
-	  new_val = SET_SRC (single_set (unmodified_insns[i]));
-	else
-	  new_val = temporaries[i];
-	break;
-	  }
+  /* As we are transforming
+	 if (x > y)
+	   {
+	 a = b;
+	 c = d;
+	   }
+	 into
+	   a = (x > y) ...
+	   c = (x > y) ...
+
+	 we potentially check x > y before every set.
+	 Even though the check might be removed by subsequent passes, this means
+	 that we cannot transform
+	   if (x > y)
+	 {
+	   x = y;
+	   ...
+	 }
+	 into
+	   x = (x > y) ...
+	   ...
+	 since this would invalidate x and the following to-be-removed checks.
+	 Therefore we introduce a temporary every time we are about to
+	 overwrite a variable used in the check.  Costing of a sequence with
+	 these is going to be inaccurate so only use temporaries when
+	 needed.  */
+  if (reg_overlap_mentioned_p (target, cond))
+	temp = gen_reg_rtx (GET_MODE (target));
+  else
+	temp = target;
+
+  /* We have identified swap-style idioms in check_need_cmovs.  A normal
+	 set will need to be a cmov while the 

Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-07-26 Thread Richard Sandiford via Gcc-patches
Robin Dapp  writes:
> Hi Richard,
>
> thanks for going through the patch set.
>
>> A hash_set might be simpler, given that the code only enters insns
>> for which the bool is false.  “rtx_insn *” would be better than rtx.
>
> Right, changed that.
>
>> Do you mean the sets might be removed or that the checks might be
>> removed?
>
> It's actually the checks that are removed.  I clarified and amended the 
> comments.
>
>> The patch is quite hard to review on its own, since nothing actually
>> uses this variable.  It's also not obvious how the
>> reg_overlap_mentioned_p code works if the old target is referenced
>> later.
>> 
>> Could you refactor the series a bit so that each patch is
>> self-contained? It's OK if that means fewer patches.
> The attached v2 makes use of need_cmov now, I hope this makes it a bit 
> clearer.  Regarding the reg_overlap_mentioned_p: it is used to detect 
> the swap idioms as well as when a cmov destination is used in the 
> condition as well.  If needed this could be two separate patches (most 
> of the second patch would be comments, though).

Thanks for the update.  No need for further splitting, this looks like a
nice self-contained patch as it is.

> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
> index 017944f4f79..a5e55456d88 100644
> --- a/gcc/ifcvt.c
> +++ b/gcc/ifcvt.c
> @@ -98,6 +98,7 @@ static int dead_or_predicable (basic_block, basic_block, 
> basic_block,
>  edge, int);
>  static void noce_emit_move_insn (rtx, rtx);
>  static rtx_insn *block_has_only_trap (basic_block);
> +static void check_need_cmovs (basic_block, hash_set *);
>
>  /* Count the number of non-jump active insns in BB.  */
>  
> @@ -3203,6 +3204,10 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
>auto_vec unmodified_insns;
>int count = 0;
>  
> +  hash_set need_no_cmov;
> +
> +  check_need_cmovs (then_bb, _no_cmov);
> +
>FOR_BB_INSNS (then_bb, insn)
>  {
>/* Skip over non-insns.  */
> @@ -3213,26 +3218,47 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
>gcc_checking_assert (set);
>  
>rtx target = SET_DEST (set);
> -  rtx temp = gen_reg_rtx (GET_MODE (target));
> +  rtx temp;
>rtx new_val = SET_SRC (set);
>rtx old_val = target;
>  
> -  /* If we were supposed to read from an earlier write in this block,
> -  we've changed the register allocation.  Rewire the read.  While
> -  we are looking, also try to catch a swap idiom.  */
> -  for (int i = count - 1; i >= 0; --i)
> - if (reg_overlap_mentioned_p (new_val, targets[i]))
> -   {
> - /* Catch a "swap" style idiom.  */
> - if (find_reg_note (insn, REG_DEAD, new_val) != NULL_RTX)
> -   /* The write to targets[i] is only live until the read
> -  here.  As the condition codes match, we can propagate
> -  the set to here.  */
> -   new_val = SET_SRC (single_set (unmodified_insns[i]));
> - else
> -   new_val = temporaries[i];
> - break;
> -   }

Don't we still need this code (without the REG_DEAD handling) for the
case in which…

> +  /* As we are transforming
> +  if (x > y)
> +{
> +  a = b;
> +  c = d;
> +}
> +  into
> +a = (x > y) ...
> +c = (x > y) ...
> +
> +  we potentially check x > y before every set.
> +  Even though the check might be removed by subsequent passes, this means
> +  that we cannot transform
> +if (x > y)
> +  {
> +x = y;
> +...
> +  }
> +  into
> +x = (x > y) ...
> +...
> +  since this would invalidate x and the following to-be-removed checks.
> +  Therefore we introduce a temporary every time we are about to
> +  overwrite a variable used in the check.  Costing of a sequence with
> +  these is going to be inaccurate so only use temporaries when
> +  needed.  */
> +  if (reg_overlap_mentioned_p (target, cond))
> + temp = gen_reg_rtx (GET_MODE (target));

…this code triggers?  I don't see otherwise how later uses of x would
pick up “temp” instead of the original target.  E.g. suppose we had:

if (x > y)
  {
x = …;
z = x; // x does not die here
  }

Without the loop, it looks like z would pick up the old value of x
(used in the comparison) instead of the new one.

> +  else
> + temp = target;
> +
> +  /* We have identified swap-style idioms in check_need_cmovs.  A normal
> +  set will need to be a cmov while the first instruction of a swap-style
> +  idiom can be a regular move.  This helps with costing.  */
> +  bool need_cmov = true;
> +  if (need_no_cmov.contains (insn))
> + need_cmov = false;

Would be simpler as:

  bool need_cmov = !need_no_cmov.contains (insn);
>  
>/* If we had a non-canonical conditional jump (i.e. one where
>the fallthrough is to the "else" 

Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-07-22 Thread Robin Dapp via Gcc-patches

Hi Richard,

thanks for going through the patch set.


A hash_set might be simpler, given that the code only enters insns
for which the bool is false.  “rtx_insn *” would be better than rtx.


Right, changed that.


Do you mean the sets might be removed or that the checks might be
removed?


It's actually the checks that are removed.  I clarified and amended the 
comments.



The patch is quite hard to review on its own, since nothing actually
uses this variable.  It's also not obvious how the
reg_overlap_mentioned_p code works if the old target is referenced
later.

Could you refactor the series a bit so that each patch is
self-contained? It's OK if that means fewer patches.
The attached v2 makes use of need_cmov now, I hope this makes it a bit 
clearer.  Regarding the reg_overlap_mentioned_p: it is used to detect 
the swap idioms as well as when a cmov destination is used in the 
condition as well.  If needed this could be two separate patches (most 
of the second patch would be comments, though).


Regards
 Robin
>From bf7e372d7f48e614f20676c7cf4b2fbde741d4fc Mon Sep 17 00:00:00 2001
From: Robin Dapp 
Date: Thu, 24 Jun 2021 16:40:04 +0200
Subject: [PATCH v2 1/7] ifcvt: Check if cmovs are needed.

When if-converting multiple SETs and we encounter a swap-style idiom

  if (a > b)
{
  tmp = c;   // [1]
  c = d;
  d = tmp;
}

ifcvt should not generate a conditional move for the instruction at
[1].

In order to achieve that, this patch goes through all relevant SETs
and marks the relevant instructions.  This helps to evaluate costs.

On top, only generate temporaries if the current cmov is going to
overwrite one of the comparands of the initial compare.
---
 gcc/ifcvt.c | 138 
 1 file changed, 118 insertions(+), 20 deletions(-)

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 017944f4f79..a5e55456d88 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -98,6 +98,7 @@ static int dead_or_predicable (basic_block, basic_block, basic_block,
 			   edge, int);
 static void noce_emit_move_insn (rtx, rtx);
 static rtx_insn *block_has_only_trap (basic_block);
+static void check_need_cmovs (basic_block, hash_set *);
 
 /* Count the number of non-jump active insns in BB.  */
 
@@ -3203,6 +3204,10 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   auto_vec unmodified_insns;
   int count = 0;
 
+  hash_set need_no_cmov;
+
+  check_need_cmovs (then_bb, _no_cmov);
+
   FOR_BB_INSNS (then_bb, insn)
 {
   /* Skip over non-insns.  */
@@ -3213,26 +3218,47 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   gcc_checking_assert (set);
 
   rtx target = SET_DEST (set);
-  rtx temp = gen_reg_rtx (GET_MODE (target));
+  rtx temp;
   rtx new_val = SET_SRC (set);
   rtx old_val = target;
 
-  /* If we were supposed to read from an earlier write in this block,
-	 we've changed the register allocation.  Rewire the read.  While
-	 we are looking, also try to catch a swap idiom.  */
-  for (int i = count - 1; i >= 0; --i)
-	if (reg_overlap_mentioned_p (new_val, targets[i]))
-	  {
-	/* Catch a "swap" style idiom.  */
-	if (find_reg_note (insn, REG_DEAD, new_val) != NULL_RTX)
-	  /* The write to targets[i] is only live until the read
-		 here.  As the condition codes match, we can propagate
-		 the set to here.  */
-	  new_val = SET_SRC (single_set (unmodified_insns[i]));
-	else
-	  new_val = temporaries[i];
-	break;
-	  }
+  /* As we are transforming
+	 if (x > y)
+	   {
+	 a = b;
+	 c = d;
+	   }
+	 into
+	   a = (x > y) ...
+	   c = (x > y) ...
+
+	 we potentially check x > y before every set.
+	 Even though the check might be removed by subsequent passes, this means
+	 that we cannot transform
+	   if (x > y)
+	 {
+	   x = y;
+	   ...
+	 }
+	 into
+	   x = (x > y) ...
+	   ...
+	 since this would invalidate x and the following to-be-removed checks.
+	 Therefore we introduce a temporary every time we are about to
+	 overwrite a variable used in the check.  Costing of a sequence with
+	 these is going to be inaccurate so only use temporaries when
+	 needed.  */
+  if (reg_overlap_mentioned_p (target, cond))
+	temp = gen_reg_rtx (GET_MODE (target));
+  else
+	temp = target;
+
+  /* We have identified swap-style idioms in check_need_cmovs.  A normal
+	 set will need to be a cmov while the first instruction of a swap-style
+	 idiom can be a regular move.  This helps with costing.  */
+  bool need_cmov = true;
+  if (need_no_cmov.contains (insn))
+	need_cmov = false;
 
   /* If we had a non-canonical conditional jump (i.e. one where
 	 the fallthrough is to the "else" case) we need to reverse
@@ -3275,9 +3301,22 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 	  old_val = lowpart_subreg (dst_mode, old_val, src_mode);
 	}
 
-  /* Actually emit the conditional move.  */
-  rtx temp_dest = noce_emit_cmove 

Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-07-15 Thread Richard Sandiford via Gcc-patches
Sorry for the slow review.

Robin Dapp  writes:
> When if-converting multiple SETs and we encounter a swap-style idiom
>
>   if (a > b)
> {
>   tmp = c;   // [1]
>   c = d;
>   d = tmp;
> }
>
> ifcvt should not generate a conditional move for the instruction at
> [1].
>
> In order to achieve that, this patch goes through all relevant SETs
> and marks the relevant instructions.  This helps to evaluate costs.
>
> On top, only generate temporaries if the current cmov is going to
> overwrite one of the comparands of the initial compare.
> ---
>  gcc/ifcvt.c | 104 +++-
>  1 file changed, 87 insertions(+), 17 deletions(-)
>
> diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
> index 017944f4f79..eef6490626a 100644
> --- a/gcc/ifcvt.c
> +++ b/gcc/ifcvt.c
> @@ -98,6 +98,7 @@ static int dead_or_predicable (basic_block, basic_block, 
> basic_block,
>  edge, int);
>  static void noce_emit_move_insn (rtx, rtx);
>  static rtx_insn *block_has_only_trap (basic_block);
> +static void check_need_cmovs (basic_block, hash_map *);
>
>  /* Count the number of non-jump active insns in BB.  */
>  
> @@ -3203,6 +3204,10 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
>auto_vec unmodified_insns;
>int count = 0;
>  
> +  hash_map need_cmovs;

A hash_set might be simpler, given that the code only enters insns for
which the bool is false.  “rtx_insn *” would be better than rtx.

> +
> +  check_need_cmovs (then_bb, _cmovs);
> +
>FOR_BB_INSNS (then_bb, insn)
>  {
>/* Skip over non-insns.  */
> @@ -3213,26 +3218,38 @@ noce_convert_multiple_sets (struct noce_if_info 
> *if_info)
>gcc_checking_assert (set);
>  
>rtx target = SET_DEST (set);
> -  rtx temp = gen_reg_rtx (GET_MODE (target));
> +  rtx temp;
>rtx new_val = SET_SRC (set);
>rtx old_val = target;
>  
> -  /* If we were supposed to read from an earlier write in this block,
> -  we've changed the register allocation.  Rewire the read.  While
> -  we are looking, also try to catch a swap idiom.  */
> -  for (int i = count - 1; i >= 0; --i)
> - if (reg_overlap_mentioned_p (new_val, targets[i]))
> -   {
> - /* Catch a "swap" style idiom.  */
> - if (find_reg_note (insn, REG_DEAD, new_val) != NULL_RTX)
> -   /* The write to targets[i] is only live until the read
> -  here.  As the condition codes match, we can propagate
> -  the set to here.  */
> -   new_val = SET_SRC (single_set (unmodified_insns[i]));
> - else
> -   new_val = temporaries[i];
> - break;
> -   }
> +  /* As we are transforming
> +  if (x > y)
> +a = b;
> +c = d;

Looks like some missing braces here.

> +  into
> +a = (x > y) ...
> +c = (x > y) ...
> +
> +  we potentially check x > y before every set here.
> +  (Even though might be removed by subsequent passes.)

Do you mean the sets might be removed or that the checks might be removed?

> +  We cannot transform
> +if (x > y)
> +  x = y;
> +  ...
> +  into
> +x = (x > y) ...
> +...
> +  since this would invalidate x.  Therefore we introduce a temporary
> +  every time we are about to overwrite a variable used in the
> +  check.  Costing of a sequence with these is going to be inaccurate.  */
> +  if (reg_overlap_mentioned_p (target, cond))
> + temp = gen_reg_rtx (GET_MODE (target));
> +  else
> + temp = target;
> +
> +  bool need_cmov = true;
> +  if (need_cmovs.get (insn))
> + need_cmov = false;

The patch is quite hard to review on its own, since nothing actually uses
this variable.  It's also not obvious how the reg_overlap_mentioned_p
code works if the old target is referenced later.

Could you refactor the series a bit so that each patch is self-contained?
It's OK if that means fewer patches.

Thanks,
Richard

>/* If we had a non-canonical conditional jump (i.e. one where
>the fallthrough is to the "else" case) we need to reverse
> @@ -3808,6 +3825,59 @@ check_cond_move_block (basic_block bb,
>return TRUE;
>  }
>  
> +/* Find local swap-style idioms in BB and mark the first insn (1)
> +   that is only a temporary as not needing a conditional move as
> +   it is going to be dead afterwards anyway.
> +
> + (1) int tmp = a;
> +  a = b;
> +  b = tmp;
> +
> +  ifcvt
> +  -->
> +
> +  load tmp,a
> +  cmov a,b
> +  cmov b,tmp   */
> +
> +static void
> +check_need_cmovs (basic_block bb, hash_map *need_cmov)
> +{
> +  rtx_insn *insn;
> +  int count = 0;
> +  auto_vec insns;
> +  auto_vec dests;
> +
> +  FOR_BB_INSNS (bb, insn)
> +{
> +  rtx set, src, dest;
> +
> +  if (!active_insn_p (insn))
> + continue;
> +
> +  set = single_set (insn);
> +  if (set == NULL_RTX)
> + continue;
> +
> +

[PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-06-25 Thread Robin Dapp via Gcc-patches
When if-converting multiple SETs and we encounter a swap-style idiom

  if (a > b)
{
  tmp = c;   // [1]
  c = d;
  d = tmp;
}

ifcvt should not generate a conditional move for the instruction at
[1].

In order to achieve that, this patch goes through all relevant SETs
and marks the relevant instructions.  This helps to evaluate costs.

On top, only generate temporaries if the current cmov is going to
overwrite one of the comparands of the initial compare.
---
 gcc/ifcvt.c | 104 +++-
 1 file changed, 87 insertions(+), 17 deletions(-)

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 017944f4f79..eef6490626a 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -98,6 +98,7 @@ static int dead_or_predicable (basic_block, basic_block, 
basic_block,
   edge, int);
 static void noce_emit_move_insn (rtx, rtx);
 static rtx_insn *block_has_only_trap (basic_block);
+static void check_need_cmovs (basic_block, hash_map *);
 
 /* Count the number of non-jump active insns in BB.  */
 
@@ -3203,6 +3204,10 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   auto_vec unmodified_insns;
   int count = 0;
 
+  hash_map need_cmovs;
+
+  check_need_cmovs (then_bb, _cmovs);
+
   FOR_BB_INSNS (then_bb, insn)
 {
   /* Skip over non-insns.  */
@@ -3213,26 +3218,38 @@ noce_convert_multiple_sets (struct noce_if_info 
*if_info)
   gcc_checking_assert (set);
 
   rtx target = SET_DEST (set);
-  rtx temp = gen_reg_rtx (GET_MODE (target));
+  rtx temp;
   rtx new_val = SET_SRC (set);
   rtx old_val = target;
 
-  /* If we were supposed to read from an earlier write in this block,
-we've changed the register allocation.  Rewire the read.  While
-we are looking, also try to catch a swap idiom.  */
-  for (int i = count - 1; i >= 0; --i)
-   if (reg_overlap_mentioned_p (new_val, targets[i]))
- {
-   /* Catch a "swap" style idiom.  */
-   if (find_reg_note (insn, REG_DEAD, new_val) != NULL_RTX)
- /* The write to targets[i] is only live until the read
-here.  As the condition codes match, we can propagate
-the set to here.  */
- new_val = SET_SRC (single_set (unmodified_insns[i]));
-   else
- new_val = temporaries[i];
-   break;
- }
+  /* As we are transforming
+if (x > y)
+  a = b;
+  c = d;
+into
+  a = (x > y) ...
+  c = (x > y) ...
+
+we potentially check x > y before every set here.
+(Even though might be removed by subsequent passes.)
+We cannot transform
+  if (x > y)
+x = y;
+...
+into
+  x = (x > y) ...
+  ...
+since this would invalidate x.  Therefore we introduce a temporary
+every time we are about to overwrite a variable used in the
+check.  Costing of a sequence with these is going to be inaccurate.  */
+  if (reg_overlap_mentioned_p (target, cond))
+   temp = gen_reg_rtx (GET_MODE (target));
+  else
+   temp = target;
+
+  bool need_cmov = true;
+  if (need_cmovs.get (insn))
+   need_cmov = false;
 
   /* If we had a non-canonical conditional jump (i.e. one where
 the fallthrough is to the "else" case) we need to reverse
@@ -3808,6 +3825,59 @@ check_cond_move_block (basic_block bb,
   return TRUE;
 }
 
+/* Find local swap-style idioms in BB and mark the first insn (1)
+   that is only a temporary as not needing a conditional move as
+   it is going to be dead afterwards anyway.
+
+ (1) int tmp = a;
+a = b;
+b = tmp;
+
+ifcvt
+-->
+
+load tmp,a
+cmov a,b
+cmov b,tmp   */
+
+static void
+check_need_cmovs (basic_block bb, hash_map *need_cmov)
+{
+  rtx_insn *insn;
+  int count = 0;
+  auto_vec insns;
+  auto_vec dests;
+
+  FOR_BB_INSNS (bb, insn)
+{
+  rtx set, src, dest;
+
+  if (!active_insn_p (insn))
+   continue;
+
+  set = single_set (insn);
+  if (set == NULL_RTX)
+   continue;
+
+  src = SET_SRC (set);
+  dest = SET_DEST (set);
+
+  for (int i = count - 1; i >= 0; --i)
+   {
+ if (reg_overlap_mentioned_p (src, dests[i])
+ && find_reg_note (insn, REG_DEAD, src) != NULL_RTX)
+   {
+ need_cmov->put (insns[i], false);
+   }
+   }
+
+  insns.safe_push (insn);
+  dests.safe_push (dest);
+
+  count++;
+}
+}
+
 /* Given a basic block BB suitable for conditional move conversion,
a condition COND, and pointer maps THEN_VALS and ELSE_VALS containing
the register values depending on COND, emit the insns in the block as
-- 
2.31.1