Re: ifcvt: Fix PR104153 and PR104198

2022-02-05 Thread Jeff Law via Gcc-patches




On 2/3/2022 1:00 AM, Robin Dapp wrote:

Do you need to adjust anything now that this is emitting into TEMP
rather than TARGET?

The idea now is to emit to TEMP in the first pass and check if we read
the initial condition.  Overwriting the condition (and of course reading
it in every sequence) is the reason temporaries were needed in the first
place.

In the second pass (always done now) we use TEMP for those
SETs/sequences before the last one that read the condition and TARGET
for those after it.  Ideally it's TARGET for all of them which would
result in the best costing situation.

OK.  Thanks for clarifying.



The whole logic was/is really a bit convoluted for what we want to
achieve but I'd argue it's not much worse than before ;)
Yea.  And it's not that far off from things we do elsewhere -- we 
generate some sequence then start grubbing around inside for things of 
interest.


OK for the trunk.

Any thoughts on the ARC failure which is triggered by your patch and 
still unresolved?


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104154

Jeff



Re: ifcvt: Fix PR104153 and PR104198

2022-02-03 Thread Robin Dapp via Gcc-patches
> Do you need to adjust anything now that this is emitting into TEMP 
> rather than TARGET?

The idea now is to emit to TEMP in the first pass and check if we read
the initial condition.  Overwriting the condition (and of course reading
it in every sequence) is the reason temporaries were needed in the first
place.

In the second pass (always done now) we use TEMP for those
SETs/sequences before the last one that read the condition and TARGET
for those after it.  Ideally it's TARGET for all of them which would
result in the best costing situation.

The whole logic was/is really a bit convoluted for what we want to
achieve but I'd argue it's not much worse than before ;)

Regards
 Robin


Re: ifcvt: Fix PR104153 and PR104198

2022-02-02 Thread Jeff Law via Gcc-patches




On 2/1/2022 7:45 AM, Robin Dapp wrote:

Hi,

this is a bugfix for aa8cfe785953a0e87d2472311e1260cd98c605c0 which
broke an or1k test case (PR104153) as well as SPARC bootstrap (PR104198).

cond_exec_get_condition () returns the jump condition directly and we
now it to the backend.  The or1k backend modified the condition in-place
but this modification is not reverted when the sequence in question is
discarded.  Therefore this patch copies the RTX instead of using it
directly.

The SPARC problem is due to the backend recreating the initial condition
when being passed a CC comparison.  This causes the sequence
to read from an already overwritten condition operand.  Generally, this
could also happen on other targets.  The workaround is to always first
emit to a temporary.  In a second run of noce_convert_multiple_1 we know
which sequences actually require the comparison and use no
temporaries if all sequences after the current one do not require it.


Before, I used reg_overlap_mentioned_p () to check the generated
instructions against the condition.  The problem with this is that
reg_overlap... only handles a set of rtx_codes while a backend can
theoretically emit everything in an expander.  Is reg_mentioned_p () the
"right thing" to do?  Maybe it is overly conservative but as soon as we
have more than let's say three insns, we are unlikely to succeed anyway.

Bootstrapped and reg-tested on s390x, Power 9, x86 and SPARC.

Regards
  Robin

--

 PR 104198
 PR 104153

gcc/ChangeLog:

 * ifcvt.cc (noce_convert_multiple_sets_1): Copy rtx instead of
using it
 directly.  Rework comparison handling and always perform a
second pass.

gcc/testsuite/ChangeLog:

 * gcc.dg/pr104198.c: New test.
Also note this fixed the or1k issues and it's been tested on a variety 
of the embedded targets.





ifcvt-pr.patch

commit 68489d5729b4879bf2df540753fc7ea8ba1565a5
Author: Robin Dapp 
Date:   Mon Jan 24 10:28:05 2022 +0100

 ifcvt: Fix PR104153 and PR104198.
 
 This is a bugfix for aa8cfe785953a0e87d2472311e1260cd98c605c0 which

 broke an or1k test case (PR104153) as well as SPARC bootstrap (PR104198).
 
 cond_exec_get_condition () returns the jump condition directly and we now

 pass it to the backend.  The or1k backend modified the condition in-place
 (other backends do that as well) but this modification is not reverted
 when the sequence in question is discarded.  Therefore we copy the RTX
 instead of using it directly.
 
 The SPARC problem is due to the SPARC backend recreating the initial

 condition when being passed a CC comparison.  This causes the sequence
 to read from an already overwritten condition operand.  Generally, this
 could also happen on other targets.  The workaround is to always first
 emit to a temporary.  In a second run of noce_convert_multiple_1 we know
 which sequences actually require the comparison and will use no
 temporaries if all sequences after the current one do not require it.

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index fe250d508e1..92c2b40a45a 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -3391,7 +3391,11 @@ noce_convert_multiple_sets_1 (struct noce_if_info 
*if_info,
rtx cond = noce_get_condition (jump, _earliest, false);
  
rtx cc_cmp = cond_exec_get_condition (jump);

+  if (cc_cmp)
+cc_cmp = copy_rtx (cc_cmp);
rtx rev_cc_cmp = cond_exec_get_condition (jump, /* get_reversed */ true);
+  if (rev_cc_cmp)
+rev_cc_cmp = copy_rtx (rev_cc_cmp);
  
rtx_insn *insn;

int count = 0;
@@ -3515,6 +3519,7 @@ noce_convert_multiple_sets_1 (struct noce_if_info 
*if_info,
unsigned cost1 = 0, cost2 = 0;
rtx_insn *seq, *seq1, *seq2;
rtx temp_dest = NULL_RTX, temp_dest1 = NULL_RTX, temp_dest2 = NULL_RTX;
+  bool read_comparison = false;
  
seq1 = try_emit_cmove_seq (if_info, temp, cond,

 new_val, old_val, need_cmov,
@@ -3524,10 +3529,38 @@ noce_convert_multiple_sets_1 (struct noce_if_info 
*if_info,
 as well.  This allows the backend to emit a cmov directly without
 creating an additional compare for each.  If successful, costing
 is easier and this sequence is usually preferred.  */
-  seq2 = try_emit_cmove_seq (if_info, target, cond,
+  seq2 = try_emit_cmove_seq (if_info, temp, cond,
Do you need to adjust anything now that this is emitting into TEMP 
rather than TARGET?

jeff



Re: ifcvt: Fix PR104153 and PR104198

2022-02-02 Thread Rainer Orth
Hi Robin,

> Bootstrapped and reg-tested on s390x, Power 9, x86 and SPARC.

I've now also tested the patch on sparcv9-sun-solaris2.11: no regressions.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University