https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116787
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #6)
> (In reply to Richard Biener from comment #0)
> > typedef float v4sf __attribute__((vector_size (sizeof (4 * sizeof
> > (float)))));
> >
> > v4sf
> > foo (v4sf x, v4sf y)
> > {
> > return x < y ? y : x;
> > }
> >
> > is no longer generating
> >
> > _Z3fooDv2_fS_:
> > .LFB0:
> >         .cfi_startproc
> >         maxps   %xmm0, %xmm1
> >         movaps  %xmm1, %xmm0
> >         ret
> >
> > with -O2, neither with -O2 -msse4.2
>
> It does with -ffast-math.
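I think the issue is that we pad out the compare RTXen with zeros but
not the RHS (the select operands), even when those are equal to the
compare operands. The padded compare operands and the unpadded select
operands then no longer match, so later we are not able to combine to
maxps.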
Something like
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 9a8d6030d8b..8b846306fd6 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1184,6 +1184,14 @@
   emit_insn (gen_movq_v2sf_to_sse (ops[5], operands[5]));
   emit_insn (gen_movq_v2sf_to_sse (ops[4], operands[4]));
+  if (rtx_equal_p (operands[5], operands[2]))
+    ops[2] = ops[5];
+  else if (rtx_equal_p (operands[5], operands[1]))
+    ops[1] = ops[5];
+  if (rtx_equal_p (operands[4], operands[1]))
+    ops[1] = ops[4];
+  else if (rtx_equal_p (operands[4], operands[2]))
+    ops[2] = ops[4];
   bool ok = ix86_expand_fp_vcond (ops);
   gcc_assert (ok);
generates the expected
_Z3fooDv2_fS_:
.LFB0:
        .cfi_startproc
        movq    %xmm0, %xmm0
        movq    %xmm1, %xmm1
        maxps   %xmm0, %xmm1
        movaps  %xmm1, %xmm0
        ret
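
For reference, maxps should be an exact match for this select even
without -ffast-math. The following is only a scalar sketch of one lane,
based on my reading of the documented MAXPS behaviour (the second source
operand is returned when either input is a NaN or when both are zeros).
The helper names maxps_lane, select_lane, same_bits and count_mismatches
are made up for illustration and are not GCC code:

```c
#include <math.h>
#include <stddef.h>
#include <string.h>

/* One lane of MAXPS in Intel operand order: dst = dst > src ? dst : src.
   The second operand (src) wins for NaN inputs and for equal zeros.  */
static float
maxps_lane (float dst, float src)
{
  return dst > src ? dst : src;
}

/* One lane of the testcase's select: x < y ? y : x.  */
static float
select_lane (float x, float y)
{
  return x < y ? y : x;
}

/* Bitwise comparison so that NaNs and signed zeros are distinguished.  */
static int
same_bits (float a, float b)
{
  return memcmp (&a, &b, sizeof a) == 0;
}

/* Count input pairs where the select and "maxps x, y" (AT&T order, so
   dst is y and src is x) disagree.  */
static int
count_mismatches (void)
{
  const float vals[] = { 0.0f, -0.0f, 1.0f, -1.0f,
                         INFINITY, -INFINITY, NAN };
  const size_t n = sizeof vals / sizeof vals[0];
  int bad = 0;

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      if (!same_bits (select_lane (vals[i], vals[j]),
                      maxps_lane (vals[j], vals[i])))
        bad++;
  return bad;
}
```

Calling count_mismatches () from a small main should find no
disagreement for these inputs, including the NaN and signed-zero cases,
which is consistent with plain -O2 having emitted maxps before.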