[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-26 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

Uroš Bizjak  changed:

   What|Removed |Added

  Attachment #42479|0   |1
is obsolete||

--- Comment #20 from Uroš Bizjak  ---
Created attachment 42481
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42481=edit
WIP 2 patch

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-26 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

Uroš Bizjak  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ubizjak at gmail dot com

--- Comment #19 from Uroš Bizjak  ---
Created attachment 42479
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42479=edit
WIP patch

This is a WIP patch that tags all unorederd compares with:

(unspec [(const_int 0)] UNSPEC_NOTRAP)

WIP patch, because there is SSE subst pattern that involves CCFPU mode. This
mode will be removed, and I have to figure out what the subst does...

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-26 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

Uroš Bizjak  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com

--- Comment #18 from Uroš Bizjak  ---
*** Bug 82733 has been marked as a duplicate of this bug. ***

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-25 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #17 from Segher Boessenkool  ---
So we have a compare:CCFPU, the resulting flags is used in a GE
only, and ix86_cc_mode thinks the best mode to use for that is CCFP.

Which is fine, except compare:CCFPU is a different instruction, and GE
for the resulting insn means a different thing?!

How is this supposed to work?  How can generic code know this?

Everything worked fine, except the compare insn did not do the side
effect of setting a status flag.  Perhaps an unspec (or even an
unspec_volatile) should have been used for the compare?

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-25 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #16 from Uroš Bizjak  ---
(In reply to Segher Boessenkool from comment #15)
> My point is that doing this only for FLOAT_MODE_P makes no real sense.
> If we can describe ordered comparisons with special CC modes, we should
> do tests with those modes only here.

I really can't see how we can use CC_MODES_COMPATIBLE check without harming
integer compares. Not being an expert in this part of the compiler, I'm out of
ideas what to do here - do you perhaps have a particular solution in mind?

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-25 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #15 from Segher Boessenkool  ---
My point is that doing this only for FLOAT_MODE_P makes no real sense.
If we can describe ordered comparisons with special CC modes, we should
do tests with those modes only here.

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-25 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #14 from Uroš Bizjak  ---
(In reply to Uroš Bizjak from comment #13)
> (In reply to Segher Boessenkool from comment #12)
> > But why only do this for FLOAT_MODE_P?  Either the logic here isn't
> > correct, or cc_modes_compatible isn't the correct hook (we'll need
> > a new hook then?), or determining ordered/unordered by CC mode does
> > not work (does not fit into how RTL works).
> 
> Non-FP compares can select different mode depending on their operands (e.g.
> CCmode to CCZmode when one operand is zero) without secondary effects. But
> when reversing the condition from UNGE -> LT, SELECT_CC_MODE will return
> trapping mode (LT), whereas original, non-reversed mode (UNGE) was
> non-trapping. Please see how targets depend mode of their FP compares on the
> condition code.
> 
> The solution here is to keep the original comparison mode for FP compares
> (as was proposed in the first version of the patch): when qNaN is
> encountered at this point, an exception has to be generated, no matter how
> the condition was changed.

... an exception should not be generated ... in the above case, when UNGE is
reversed to LT.

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-25 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #13 from Uroš Bizjak  ---
(In reply to Segher Boessenkool from comment #12)
> But why only do this for FLOAT_MODE_P?  Either the logic here isn't
> correct, or cc_modes_compatible isn't the correct hook (we'll need
> a new hook then?), or determining ordered/unordered by CC mode does
> not work (does not fit into how RTL works).

Non-FP compares can select different mode depending on their operands (e.g.
CCmode to CCZmode when one operand is zero) without secondary effects. But when
reversing the condition from UNGE -> LT, SELECT_CC_MODE will return trapping
mode (LT), whereas original, non-reversed mode (UNGE) was non-trapping. Please
see how targets depend mode of their FP compares on the condition code.

The solution here is to keep the original comparison mode for FP compares (as
was proposed in the first version of the patch): when qNaN is encountered at
this point, an exception has to be generated, no matter how the condition was
changed.

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-25 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #12 from Segher Boessenkool  ---
But why only do this for FLOAT_MODE_P?  Either the logic here isn't
correct, or cc_modes_compatible isn't the correct hook (we'll need
a new hook then?), or determining ordered/unordered by CC mode does
not work (does not fit into how RTL works).

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-25 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #11 from Uroš Bizjak  ---
(In reply to Segher Boessenkool from comment #10)
> So should combine use targetm.cc_modes_compatible here?

Yes. The trappines of FP compares is distinguished by their mode, so
I guess something along the following patch should work:

--cut here--
diff --git a/gcc/combine.c b/gcc/combine.c
index d71e50fdefb5..0220be2e484e 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -6787,6 +6787,14 @@ simplify_set (rtx x)
   else
compare_mode = SELECT_CC_MODE (new_code, op0, op1);

+  /* Do not change compare mode of a floating point compare to
+an incompatible mode.  Targets distingush trapping and
+non-traping compares by their compare mode, and SELECT_CC_MODE
+could return different mode for a new_code.  */
+  if (FLOAT_MODE_P (GET_MODE (op0))
+ && !targetm.cc_modes_compatible (compare_mode, GET_MODE (dest)))
+   compare_mode = GET_MODE (dest);
+
   /* If the mode changed, we have to change SET_DEST, the mode in the
 compare, and the mode in the place SET_DEST is used.  If SET_DEST is
 a hard register, just build new versions with the proper mode.  If it
--cut here--

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-24 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #10 from Segher Boessenkool  ---
So should combine use targetm.cc_modes_compatible here?

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-24 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #9 from Uroš Bizjak  ---
(In reply to Segher Boessenkool from comment #8)
> Maybe you can handle this in can_change_dest_mode?  That will catch
> the similar cases, too.

No, because we only have to prevent CCmode changes that apply to FP operands.
can_change_dest_mode only looks at mode changes, but CCFPmode and CCFPUmode are
x86 specific.

I have looked at other SELECT_CC_MODE changes, and they deal with propagation
of compares into arithmetic operations. This is the only place that can change
CCmode of FP compares.

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-24 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #8 from Segher Boessenkool  ---
Maybe you can handle this in can_change_dest_mode?  That will catch
the similar cases, too.

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-24 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #7 from Uroš Bizjak  ---
To illustrate the problem, following patch fixes the failure:

--cut here--
Index: combine.c
===
--- combine.c   (revision 254011)
+++ combine.c   (working copy)
@@ -6784,7 +6784,7 @@ simplify_set (rtx x)
   && op0 == XEXP (inner_compare, 0)
   && op1 == XEXP (inner_compare, 1))
compare_mode = GET_MODE (inner_compare);
-  else
+  else if (!FLOAT_MODE_P (GET_MODE (op0)))
compare_mode = SELECT_CC_MODE (new_code, op0, op1);

   /* If the mode changed, we have to change SET_DEST, the mode in the
--cut here--

The patch avoids mode changes for floating-point operands.

(It will work for x86, since it has all comparisons trapping and non-trapping).

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-24 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #6 from Uroš Bizjak  ---
(In reply to Segher Boessenkool from comment #5)
> The combination 8 -> 9 (where the GE is introduced) does not call
> SELECT_CC_MODE at all.

The problematic conversion is 7 -> 9, *after* 8 -> 9 is performed.

Please see this gdb session:

(gdb) b ix86_cc_mode

Breakpoint 1, ix86_cc_mode (code=GE, op0=0x7fffeff09ab0, op1=0x7fffeff09960) at
/home/uros/gcc-svn/trunk/gcc/config/i386/i386.c:21718
21718   {
(gdb) bt
#0  ix86_cc_mode (code=GE, op0=0x7fffeff09ab0, op1=0x7fffeff09960) at
/home/uros/gcc-svn/trunk/gcc/config/i386/i386.c:21718
#1  0x012eb45d in simplify_set (x=x@entry=0x7fffeff09c60) at
/home/uros/gcc-svn/trunk/gcc/combine.c:6788
#2  0x012ecb48 in combine_simplify_rtx(rtx_def*, machine_mode, int,
int) () at /home/uros/gcc-svn/trunk/gcc/combine.c:6293
#3  0x012eee32 in subst(rtx_def*, rtx_def*, rtx_def*, int, int, int) ()
at /home/uros/gcc-svn/trunk/gcc/combine.c:5573
#4  0x012f1e42 in try_combine(rtx_insn*, rtx_insn*, rtx_insn*,
rtx_insn*, int*, rtx_insn*) () at /home/uros/gcc-svn/trunk/gcc/combine.c:3332
#5  0x012f7d91 in combine_instructions (nregs=,
f=) at /home/uros/gcc-svn/trunk/gcc/combine.c:1301

(gdb) f4
#4  0x012f1e42 in try_combine(rtx_insn*, rtx_insn*, rtx_insn*,
rtx_insn*, int*, rtx_insn*) () at /home/uros/gcc-svn/trunk/gcc/combine.c:3332
3332  newpat = subst (PATTERN (i3), i2dest, i2src, 0, 0,

(gdb) p debug_rtx (i3)
(insn 9 8 10 2 (set (reg:CCFPU 17 flags)
(compare:CCFPU (reg:DF 95)
(reg/v:DF 91 [ x ]))) "pr82692.c":9 2147483647 {NOOP_MOVE}
 (nil))

(gdb) b combine.c:
Breakpoint 2 at 0x12f1dfb: file /home/uros/gcc-svn/trunk/gcc/combine.c, line
.

(gdb) c
Continuing.

Breakpoint 2, try_combine(rtx_insn*, rtx_insn*, rtx_insn*, rtx_insn*, int*,
rtx_insn*) () at /home/uros/gcc-svn/trunk/gcc/combine.c:3334
3334  || ((i0_feeds_i2_n || (i0_feeds_i1_n &&
i1_feeds_i2_n))

(gdb) p debug_rtx (newpat)
(set (reg:CCFP 17 flags)
(compare:CCFP (reg:DF 95)
(reg/v:DF 91 [ x ])))

And also changes mode of the conditional jump.

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-24 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #5 from Segher Boessenkool  ---
The combination 8 -> 9 (where the GE is introduced) does not call
SELECT_CC_MODE at all.

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-24 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #4 from Uroš Bizjak  ---
(In reply to Segher Boessenkool from comment #3)
> The combine output you showed is _not_ succeeding though?  "matched"
> just means the rtx was recog()'ed; not that it was actually replaced.

FAOD, this is the sequence before combine:

--cut here--
(insn 7 6 8 2 (set (reg:CCFPU 17 flags)
(compare:CCFPU (reg:DF 95)
(reg/v:DF 91 [ x ]))) "pr82692.c":9 52 {*cmpiudf}
 (expr_list:REG_DEAD (reg:DF 95)
(expr_list:REG_EQUAL (compare:CCFPU (const_double:DF 0.0 [0x0.0p+0])
(reg/v:DF 91 [ x ]))
(nil
(insn 8 7 9 2 (set (reg:QI 94)
(unlt:QI (reg:CCFPU 17 flags)
(const_int 0 [0]))) "pr82692.c":9 664 {*setcc_qi}
 (expr_list:REG_DEAD (reg:CCFPU 17 flags)
(nil)))
(insn 9 8 10 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg:QI 94)
(const_int 0 [0]))) "pr82692.c":9 1 {*cmpqi_ccno_1}
 (expr_list:REG_DEAD (reg:QI 94)
(nil)))
(jump_insn 10 9 32 2 (set (pc)
(if_then_else (eq (reg:CCZ 17 flags)
(const_int 0 [0]))
(label_ref:DI 34)
(pc))) "pr82692.c":9 668 {*jcc}
 (expr_list:REG_DEAD (reg:CCZ 17 flags)
(int_list:REG_BR_PROB 182536112 (nil)))
 -> 34)
--cut here--

which is converted by combine pass to:

--cut here--
(note 7 6 8 2 NOTE_INSN_DELETED)
(note 8 7 9 2 NOTE_INSN_DELETED)
(insn 9 8 10 2 (set (reg:CCFP 17 flags)
(compare:CCFP (reg:DF 95)
(reg/v:DF 91 [ x ]))) "pr82692.c":9 50 {*cmpidf}
 (expr_list:REG_DEAD (reg:DF 95)
(nil)))
(jump_insn 10 9 32 2 (set (pc)
(if_then_else (ge (reg:CCFP 17 flags)
(const_int 0 [0]))
(label_ref:DI 34)
(pc))) "pr82692.c":9 668 {*jcc}
 (expr_list:REG_DEAD (reg:CCZ 17 flags)
(int_list:REG_BR_PROB 182536112 (nil)))
 -> 34)
--cut here--

UNLT compare (CCFPUmode) has been converted to GE compare (CCFPmode). This is
not correct as far as traps are concerned, since UNLT doesn't trap on qNaN,
while GE does.

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-24 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

--- Comment #3 from Segher Boessenkool  ---
The combine output you showed is _not_ succeeding though?  "matched"
just means the rtx was recog()'ed; not that it was actually replaced.

[Bug rtl-optimization/82692] [8 Regression] Ordered comparisons used for unordered built-ins

2017-10-24 Thread ubizjak at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692

Uroš Bizjak  changed:

   What|Removed |Added

 CC||segher at gcc dot gnu.org
  Component|target  |rtl-optimization

--- Comment #2 from Uroš Bizjak  ---
This is the problem in the combine pass. It is substituting non-trapping
compare with trapping via SELECT_CC_MODE. This particular problem happens in
line 6788:

--cut here--
  /* If this machine has CC modes other than CCmode, check to see if we
 need to use a different CC mode here.  */
  if (GET_MODE_CLASS (GET_MODE (op0)) == MODE_CC)
compare_mode = GET_MODE (op0);
  else if (inner_compare
   && GET_MODE_CLASS (GET_MODE (inner_compare)) == MODE_CC
   && new_code == old_code
   && op0 == XEXP (inner_compare, 0)
   && op1 == XEXP (inner_compare, 1))
compare_mode = GET_MODE (inner_compare);
  else
compare_mode = SELECT_CC_MODE (new_code, op0, op1);
--cut here--

New compare mode should NOT change trapping of the compare insn.