On Fri, Jan 14, 2022 at 11:56 PM Jakub Jelinek <ja...@redhat.com> wrote:
>
> Hi!
>
> C++20:
> #include <compare>
> auto cmp4way(double a, double b)
> {
>   return a <=> b;
> }
> expands to:
>         ucomisd %xmm1, %xmm0
>         jp      .L8
>         movl    $0, %eax
>         jne     .L8
> .L2:
>         ret
>         .p2align 4,,10
>         .p2align 3
> .L8:
>         comisd  %xmm0, %xmm1
>         movl    $-1, %eax
>         ja      .L2
>         ucomisd %xmm1, %xmm0
>         setbe   %al
>         addl    $1, %eax
>         ret
> That is 3 comparisons of the same operands.
> The following patch improves it to just one comparison:
>         comisd  %xmm1, %xmm0
>         jp      .L4
>         seta    %al
>         movl    $0, %edx
>         leal    -1(%rax,%rax), %eax
>         cmove   %edx, %eax
>         ret
> .L4:
>         movl    $2, %eax
>         ret
> While a <=> b expands to a == b ? 0 : a < b ? -1 : a > b ? 1 : 2
> where the first comparison is equality and this shouldn't raise
> exceptions on qNaN operands, if the operands aren't equal (which
> includes unordered cases), then it immediately performs < or >
> comparison and that raises exceptions even on qNaNs, so we can just
> perform a single comparison that raises exceptions on qNaN.
> As the 4 different cases are encoded as
> ZF CF PF
> 1  1  1  a unordered b
> 0  0  0  a > b
> 0  1  0  a < b
> 1  0  0  a == b
> we can emit optimal sequence of comparions, first jp
> for the unordered case, then je for the == case and finally jb
> for the < case.
>
> The patch pattern recognizes spaceship-like comparisons during
> widening_mul if the spaceship optab is implemented, and replaces
> thoose comparisons with comparisons of .SPACESHIP ifn which returns
> -1/0/1/2 based on the comparison.  This seems to work well both for the
> case of just returning the -1/0/1/2 (when we have just a common
> successor with a PHI) or when the different cases are handled with
> various other basic blocks.  The testcases cover both of those cases,
> the latter with different function calls in those.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2022-01-14  Jakub Jelinek  <ja...@redhat.com>
>
>         PR target/103973
>         * tree-cfg.h (cond_only_block_p): Declare.
>         * tree-ssa-phiopt.c (cond_only_block_p): Move function to ...
>         * tree-cfg.c (cond_only_block_p): ... here.  No longer static.
>         * optabs.def (spaceship_optab): New optab.
>         * internal-fn.def (SPACESHIP): New internal function.
>         * internal-fn.h (expand_SPACESHIP): Declare.
>         * internal-fn.c (expand_PHI): Formatting fix.
>         (expand_SPACESHIP): New function.
>         * tree-ssa-math-opts.c (optimize_spaceship): New function.
>         (math_opts_dom_walker::after_dom_children): Use it.
>         * config/i386/i386.md (spaceship<mode>3): New define_expand.
>         * config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare.
>         * config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function.
>         * doc/md.texi (spaceship@var{m}3): Document.
>
>         * gcc.target/i386/pr103973-1.c: New test.
>         * gcc.target/i386/pr103973-2.c: New test.
>         * gcc.target/i386/pr103973-3.c: New test.
>         * gcc.target/i386/pr103973-4.c: New test.
>         * g++.target/i386/pr103973-1.C: New test.
>         * g++.target/i386/pr103973-2.C: New test.
>         * g++.target/i386/pr103973-3.C: New test.
>         * g++.target/i386/pr103973-4.C: New test.

>--- gcc/config/i386/i386.md.jj  2022-01-14 11:51:34.432384170 +0100
> +++ gcc/config/i386/i386.md     2022-01-14 18:22:41.140906449 +0100
> @@ -23886,6 +23886,18 @@ (define_insn "hreset"
>    [(set_attr "type" "other")
>     (set_attr "length" "4")])
>
> +;; Spaceship optimization
> +(define_expand "spaceship<mode>3"
> +  [(match_operand:SI 0 "register_operand")
> +   (match_operand:MODEF 1 "cmp_fp_expander_operand")
> +   (match_operand:MODEF 2 "cmp_fp_expander_operand")]
> +  "(TARGET_80387 || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH))
> +   && TARGET_CMOVE && TARGET_IEEE_FP"

Is there a reason that this pattern is limited to TARGET_IEEE_FP?
During the expansion in ix86_expand_fp_spaceship, we can just skip
jump on unordered, while ix86_expand_fp_compare will emit the correct
comparison mode depending on TARGET_IEEE_FP.

> +{
> +  ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]);
> +  DONE;
> +})
> +
>  (include "mmx.md")
>  (include "sse.md")
>  (include "sync.md")
> --- gcc/config/i386/i386-expand.c.jj    2022-01-14 11:51:34.429384213 +0100
> +++ gcc/config/i386/i386-expand.c       2022-01-14 21:02:09.446437810 +0100
> @@ -2879,6 +2879,46 @@ ix86_expand_setcc (rtx dest, enum rtx_co
>    emit_insn (gen_rtx_SET (dest, ret));
>  }
>
> +/* Expand floating point op0 <=> op1 if NaNs are honored.  */
> +
> +void
> +ix86_expand_fp_spaceship (rtx dest, rtx op0, rtx op1)
> +{
> +  gcc_checking_assert (ix86_fp_comparison_strategy (GT) == IX86_FPCMP_COMI);
> +  rtx gt = ix86_expand_fp_compare (GT, op0, op1);
> +  rtx l0 = gen_label_rtx ();
> +  rtx l1 = gen_label_rtx ();
> +  rtx l2 = gen_label_rtx ();
> +  rtx lend = gen_label_rtx ();
> +  rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
> +                          gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> +  rtx tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
> +                                 gen_rtx_LABEL_REF (VOIDmode, l2), pc_rtx);
> +  rtx_insn *jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> +  add_reg_br_prob_note (jmp, profile_probability:: very_unlikely ());

Please also add JUMP_LABEL to the insn.

> +  rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode,
> +                          gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq,
> +                             gen_rtx_LABEL_REF (VOIDmode, l0), pc_rtx);
> +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> +  add_reg_br_prob_note (jmp, profile_probability::unlikely ());
> +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, gt,
> +                             gen_rtx_LABEL_REF (VOIDmode, l1), pc_rtx);
> +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> +  add_reg_br_prob_note (jmp, profile_probability::even ());
> +  emit_move_insn (dest, constm1_rtx);
> +  emit_jump (lend);
> +  emit_label (l0);

and LABEL_NUSES label.

> +  emit_move_insn (dest, const0_rtx);
> +  emit_jump (lend);
> +  emit_label (l1);
> +  emit_move_insn (dest, const1_rtx);
> +  emit_jump (lend);
> +  emit_label (l2);
> +  emit_move_insn (dest, const2_rtx);
> +  emit_label (lend);
> +}
> +
>  /* Expand comparison setting or clearing carry flag.  Return true when
>     successful and set pop for the operation.  */
>  static bool

Uros.

Reply via email to