[Bug tree-optimization/61747] min,max pattern not always properly optimized (for sse4 targets)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747

--- Comment #11 from rguenther at suse dot de ---
On Fri, 21 Jul 2023, pinskia at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747
>
> --- Comment #10 from Andrew Pinski ---
> (In reply to CVS Commits from comment #8)
> > * g++.target/i386/pr61747.C: New testcase.
>
> The testcase fails now, I don't know what caused it to fail though:
> FAIL: g++.target/i386/pr61747.C -std=gnu++14 scan-assembler-times max 4

I failed to update it before pushing, it will be fixed with the next push
I do (currently re-testing)
[Bug tree-optimization/61747] min,max pattern not always properly optimized (for sse4 targets)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747

--- Comment #10 from Andrew Pinski ---
(In reply to CVS Commits from comment #8)
> * g++.target/i386/pr61747.C: New testcase.

The testcase fails now, I don't know what caused it to fail though:
FAIL: g++.target/i386/pr61747.C -std=gnu++14 scan-assembler-times max 4
[Bug tree-optimization/61747] min,max pattern not always properly optimized (for sse4 targets)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |14.0
             Status|ASSIGNED                    |RESOLVED

--- Comment #9 from Richard Biener ---
Fixed for GCC 14.
[Bug tree-optimization/61747] min,max pattern not always properly optimized (for sse4 targets)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747

--- Comment #8 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:ceae1400cf24f329393e96dd9720b0391afe858d

commit r14-2667-gceae1400cf24f329393e96dd9720b0391afe858d
Author: Richard Biener
Date:   Tue Jul 18 13:19:11 2023 +0200

    middle-end/61747 - conditional move expansion and constants

    When expanding a COND_EXPR or a VEC_COND_EXPR the x86 backend for
    example tries to match FP min/max instructions.  But this only
    works when it can see the equality of the comparison and selected
    operands.  This breaks in both prepare_cmp_insn and vector_compare_rtx
    where the former forces expensive constants to a register and the
    latter performs legitimization.  The patch below fixes this in
    the caller preserving former equalities.

            PR middle-end/61747
            * internal-fn.cc (expand_vec_cond_optab_fn): When the
            value operands are equal to the original comparison operands
            preserve that equality by re-using the comparison expansion.
            * optabs.cc (emit_conditional_move): When the value operands
            are equal to the comparison operands and would be forced to
            a register by prepare_cmp_insn do so earlier, preserving the
            equality.

            * g++.target/i386/pr61747.C: New testcase.
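
The reasoning in the commit message can be illustrated with a toy model. This is only a sketch: `Operand`, `force_to_fresh_reg`, `looks_like_minmax`, `expand_late` and `expand_early` are hypothetical names invented for illustration and do not exist in GCC. It assumes a matcher that recognizes min/max only when the selected values are the very operands that were compared, and shows how forcing a constant to a register on the comparison side alone hides that equality, while forcing it once up front keeps it visible.

```cpp
#include <cassert>

// Toy model only: none of these types or functions exist in GCC.
struct Operand {
    bool is_constant;  // true for a constant, false for a pseudo-register
    int id;            // constant index or pseudo-register number
};

inline bool same(const Operand& a, const Operand& b) {
    return a.is_constant == b.is_constant && a.id == b.id;
}

// Stand-in for "force to a register": each call hands out a fresh pseudo.
inline Operand force_to_fresh_reg(const Operand&, int& next_reg) {
    return Operand{false, next_reg++};
}

// Stand-in for a backend matcher: min/max is recognized only when the two
// selected values are exactly the two compared operands (in either order).
inline bool looks_like_minmax(const Operand& cmp0, const Operand& cmp1,
                              const Operand& val0, const Operand& val1) {
    return (same(cmp0, val0) && same(cmp1, val1)) ||
           (same(cmp0, val1) && same(cmp1, val0));
}

// Constant forced to a register on the comparison side only: the compared
// operand and the selected value no longer look equal.
inline bool expand_late(Operand x, Operand c, int& next_reg) {
    Operand cmp1 = force_to_fresh_reg(c, next_reg);
    return looks_like_minmax(x, cmp1, x, c);
}

// Constant forced once, early, and reused for both roles: equality survives.
inline bool expand_early(Operand x, Operand c, int& next_reg) {
    Operand r = force_to_fresh_reg(c, next_reg);
    return looks_like_minmax(x, r, x, r);
}
```

With `Operand x{false, 1}` and `Operand c{true, 0}`, `expand_late` reports no match and `expand_early` reports a match, which is the behavior change the commit message describes in prose.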
[Bug tree-optimization/61747] min,max pattern not always properly optimized (for sse4 targets)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #7 from Richard Biener ---
The cases with constant arguments fail to be recognized by the x86
conditional move expansion because RTL expansion makes it too difficult
to see they are equal where required.  That is emit_conditional_move
forcing the constant to two different regs via prepare_cmp_insn.

I'm testing a patch for this.
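
For readers who want a concrete picture of "cases with constant arguments", a minimal sketch follows. These functions are hypothetical illustrations, not the contents of g++.target/i386/pr61747.C, and the function names are made up. Each one compares against a floating-point constant and then selects that same constant, which is the shape where the compared operand and the selected operand must be seen as equal.

```cpp
// Hypothetical examples of the source pattern; not taken from the GCC testsuite.

// Compare against the constant 0.0f and select either x or that same constant.
inline float max_with_zero(float x) { return x > 0.0f ? x : 0.0f; }

// Same shape for a minimum against the constant 1.0f.
inline float min_with_one(float x) { return x < 1.0f ? x : 1.0f; }

// Both patterns combined into a clamp to the interval [0, 1].
inline float clamp01(float x) { return min_with_one(max_with_zero(x)); }
```

Whether a particular compiler version and flag set turns these into min/max instructions is best checked by inspecting the generated assembly; the sketch only shows the source-level shape under discussion.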
[Bug tree-optimization/61747] min,max pattern not always properly optimized (for sse4 targets)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement
   Last reconfirmed|                            |2021-12-13
             Status|UNCONFIRMED                 |NEW

--- Comment #6 from Andrew Pinski ---
Confirmed.
[Bug tree-optimization/61747] min,max pattern not always properly optimized (for sse4 targets)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747

--- Comment #5 from Richard Biener ---
;; ??? For !flag_finite_math_only, the representation with SMIN/SMAX
;; isn't really correct, as those rtl operators aren't defined when
;; applied to NaNs.  Hopefully the optimizers won't get too smart on us.

(define_expand "3"
  [(set (match_operand:VF 0 "register_operand")
        (smaxmin:VF
          (match_operand:VF 1 "")
          (match_operand:VF 2 "")))]
  "TARGET_SSE && && "
{
  if (!flag_finite_math_only)
    operands[1] = force_reg (mode, operands[1]);
  ix86_fixup_binary_operands_no_copy (, mode, operands);
})

and

;; These versions of the min/max patterns implement exactly the operations
;;   min = (op1 < op2 ? op1 : op2)
;;   max = (!(op1 < op2) ? op1 : op2)
;; Their operands are not commutative, and thus they may be used in the
;; presence of -0.0 and NaN.

(define_insn "*ieee_smin3"
  [(set (match_operand:VF 0 "register_operand" "=v,v")
        (unspec:VF
          [(match_operand:VF 1 "register_operand" "0,v")
           (match_operand:VF 2 "nonimmediate_operand" "vm,vm")]
         UNSPEC_IEEE_MIN))]
  "TARGET_SSE"
...

maybe explain the -O2 code.  Note that the middle-end uses min/max regardless
of flags and makes it the target's responsibility to disable instructions that
don't conform to IEEE.

The above suggests that a>b ? a : b isn't IEEE conform on x86.
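
The "not commutative" remark in the quoted machine-description comment can be checked directly in C++. The sketch below transcribes the two formulas as written there; the names `ieee_min`, `ieee_max` and `check_not_commutative` are made up for illustration. It assumes the code is built without -ffast-math, -ffinite-math-only or -fno-signed-zeros, since those flags allow the compiler to ignore exactly the cases being tested.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// The two operations exactly as spelled out in the quoted comment.
inline float ieee_min(float op1, float op2) { return op1 < op2 ? op1 : op2; }
inline float ieee_max(float op1, float op2) { return !(op1 < op2) ? op1 : op2; }

// Swapping the operands changes the result for NaN and for zeros of
// opposite sign, so neither operation is commutative.
inline void check_not_commutative() {
    const float nan = std::numeric_limits<float>::quiet_NaN();

    // Every ordered comparison with NaN is false, so min picks op2 ...
    assert(ieee_min(nan, 1.0f) == 1.0f);
    assert(std::isnan(ieee_min(1.0f, nan)));
    // ... and max picks op1.
    assert(std::isnan(ieee_max(nan, 1.0f)));
    assert(ieee_max(1.0f, nan) == 1.0f);

    // +0.0 and -0.0 compare equal, so "op1 < op2" is false for them too.
    assert(std::signbit(ieee_min(0.0f, -0.0f)));   // selects op2, i.e. -0.0
    assert(!std::signbit(ieee_min(-0.0f, 0.0f)));  // selects op2, i.e. +0.0
}
```

This is why a transformation that treats such a conditional as a plain commutative min/max needs both the no-NaN and the no-signed-zeros assumptions discussed in the earlier comments of this bug.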
[Bug tree-optimization/61747] min,max pattern not always properly optimized (for sse4 targets)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747

--- Comment #4 from vincenzo Innocente ---
confirm that
-ffinite-math-only -fno-signed-zeros
is equivalent to Ofast in this case

so we conclude that the code generated at O2 is wrong and
-ffinite-math-only -fno-signed-zeros
is required to trigger min/max?
[Bug tree-optimization/61747] min,max pattern not always properly optimized (for sse4 targets)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747

--- Comment #3 from Marc Glisse ---
(In reply to vincenzo Innocente from comment #2)
> > I think you need -fno-signed-zeros for the transformation to be valid.
> possible.
> but then is the O2 code that is wrong?
> in any case adding -fno-signed-zeros makes no difference w/r/t O2 alone

-fno-signed-zeros comes in addition to some flag saying there are no NaNs
(-ffinite-math-only for instance).
[Bug tree-optimization/61747] min,max pattern not always properly optimized (for sse4 targets)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747

--- Comment #2 from vincenzo Innocente ---
> I think you need -fno-signed-zeros for the transformation to be valid.
possible.
but then is the O2 code that is wrong?
in any case adding -fno-signed-zeros makes no difference w/r/t O2 alone
[Bug tree-optimization/61747] min,max pattern not always properly optimized (for sse4 targets)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61747

--- Comment #1 from Marc Glisse ---
I think you need -fno-signed-zeros for the transformation to be valid.