Hello!

The attached patch enables l<rounding_insn><MODEF:mode><SWI48:mode>2 for TARGET_SSE4_1 and, while there, also corrects the operand 1 predicate of the rounds{s,d} instruction pattern.

2018-05-29  Uros Bizjak  <ubiz...@gmail.com>

	PR target/85950
	* config/i386/i386.md (l<rounding_insn><MODEF:mode><SWI48:mode>2):
	Enable for TARGET_SSE4_1 and generate rounds{s,d} and
	cvtts{s,d}2si{,q} sequence.
	(sse4_1_round<mode>2): Use nonimmediate_operand for operand 1
	predicate.

testsuite/ChangeLog:

2018-05-29  Uros Bizjak  <ubiz...@gmail.com>

	PR target/85950
	* gcc.target/i386/pr85950.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
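
For illustration only (not part of the patch): the l<rounding_insn> expanders produce the input rounded toward negative or positive infinity and then converted to an integer. With SSE4.1 this is done by rounds{s,d} followed by cvtts{s,d}2si. The sketch below models the same result in portable C by truncating first and then adjusting, which is assumed here to resemble what the non-SSE4.1 fallback has to do. The names model_lfloor and model_lceil are hypothetical and do not exist in GCC.

```c
/* Hypothetical model of the lfloor result.  The cast truncates toward
   zero; for negative non-integers that lands one above the floor, so
   step down by one in that case.  Only valid when X fits in long long.  */
static long long
model_lfloor (double x)
{
  long long xi = (long long) x;
  return xi - ((double) xi > x);
}

/* Hypothetical model of the lceil result.  For positive non-integers
   truncation lands one below the ceiling, so step up by one.  */
static long long
model_lceil (double x)
{
  long long xi = (long long) x;
  return xi + ((double) xi < x);
}
```

The compare-and-adjust step is the extra work that a single rounding instruction in front of the truncating conversion makes unnecessary.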

Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 260850)
+++ config/i386/i386.md	(working copy)
@@ -16655,7 +16655,7 @@
 
 (define_insn "sse4_1_round<mode>2"
   [(set (match_operand:MODEF 0 "register_operand" "=x,v")
-        (unspec:MODEF [(match_operand:MODEF 1 "register_operand" "x,v")
+        (unspec:MODEF [(match_operand:MODEF 1 "nonimmediate_operand" "xm,vm")
                        (match_operand:SI 2 "const_0_to_15_operand" "n,n")]
                       UNSPEC_ROUND))]
   "TARGET_SSE4_1"
@@ -17251,12 +17251,19 @@
                    FIST_ROUNDING))
               (clobber (reg:CC FLAGS_REG))])]
   "SSE_FLOAT_MODE_P (<MODEF:MODE>mode) && TARGET_SSE_MATH
-   && !flag_trapping_math"
+   && (TARGET_SSE4_1 || !flag_trapping_math)"
 {
-  if (TARGET_64BIT && optimize_insn_for_size_p ())
-    FAIL;
+  if (TARGET_SSE4_1)
+    {
+      rtx tmp = gen_reg_rtx (<MODEF:MODE>mode);
 
-  if (ROUND_<ROUNDING> == ROUND_FLOOR)
+      emit_insn (gen_sse4_1_round<mode>2
+                 (tmp, operands[1], GEN_INT (ROUND_<ROUNDING>
+                                             | ROUND_NO_EXC)));
+      emit_insn (gen_fix_trunc<MODEF:mode><SWI48:mode>2
+                 (operands[0], tmp));
+    }
+  else if (ROUND_<ROUNDING> == ROUND_FLOOR)
     ix86_expand_lfloorceil (operands[0], operands[1], true);
   else if (ROUND_<ROUNDING> == ROUND_CEIL)
     ix86_expand_lfloorceil (operands[0], operands[1], false);
Index: testsuite/gcc.target/i386/pr85950.c
===================================================================
--- testsuite/gcc.target/i386/pr85950.c	(nonexistent)
+++ testsuite/gcc.target/i386/pr85950.c	(working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4.1 -mfpmath=sse" } */
+
+double floor (double);
+double ceil (double);
+
+int ifloor (double x) { return floor (x); }
+int iceil (double x) { return ceil (x); }
+
+#ifdef __x86_64__
+long long llfloor (double x) { return floor (x); }
+long long llceil (double x) { return ceil (x); }
+#endif
+
+/* { dg-final { scan-assembler-times "roundsd" 2 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "roundsd" 4 { target { ! ia32 } } } } */
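
As a side note on the GEN_INT (ROUND_<ROUNDING> | ROUND_NO_EXC) operand in the expander: the sketch below shows how that immediate is composed and that it stays within what const_0_to_15_operand accepts. The numeric values are an assumption, taken to mirror the _MM_FROUND_* encoding of the rounds{s,d} immediate; the authoritative definitions are the constants in i386.md. The helpers round_immediate and fits_const_0_to_15 are hypothetical. ROUND_NO_EXC suppresses the precision exception of the rounding instruction, which is presumably why the SSE4.1 path no longer needs !flag_trapping_math.

```c
/* Assumed values, mirroring the _MM_FROUND_* encoding of the
   rounds{s,d} immediate.  Check i386.md for the real definitions.  */
enum round_imm
{
  ROUND_FLOOR  = 0x1,	/* round toward negative infinity */
  ROUND_CEIL   = 0x2,	/* round toward positive infinity */
  ROUND_TRUNC  = 0x3,	/* round toward zero */
  ROUND_MXCSR  = 0x4,	/* use the rounding mode from MXCSR */
  ROUND_NO_EXC = 0x8	/* do not raise the precision exception */
};

/* Hypothetical helper: the immediate the expander passes for a given
   rounding direction.  */
static int
round_immediate (int rounding)
{
  return rounding | ROUND_NO_EXC;
}

/* Hypothetical helper: the range check implied by the name of the
   const_0_to_15_operand predicate.  */
static int
fits_const_0_to_15 (int imm)
{
  return imm >= 0 && imm <= 15;
}
```

Under these assumed values the floor variant passes 9 and the ceil variant passes 10 as operand 2 of sse4_1_round<mode>2.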