On Sat, Dec 28, 2019 at 12:02 PM Jakub Jelinek <ja...@redhat.com> wrote:
>
> On Sat, Dec 28, 2019 at 11:48:12AM +0100, Uros Bizjak wrote:
> > On Sat, Dec 28, 2019 at 10:33 AM Jakub Jelinek <ja...@redhat.com> wrote:
> > >
> > > Hi!
> > >
> > > In i386.md, we have nearbyint<mode>2 and rint<mode>2 patterns that expand
> > > SF/DF/XF mode patterns to rounding instructions. For pre-sse4.1 that is
> > > done using XFmode and so inappropriate for vectorization, but for sse4.1
> > > and later we can just use the {,v}{round,rndscale}p{s,d} instructions
> > > when we emit {,v}rounds{s,d} for SF/DF mode.
> >
> > In i386-builtins.c, ix86_builtin_vectorized_function, we already have:
> >
> > --cut here--
> >     CASE_CFN_RINT:
> >       /* The round insn does not trap on denormals. */
> >       if (flag_trapping_math || !TARGET_SSE4_1)
> >         break;
> >
> >       if (out_mode == DFmode && in_mode == DFmode)
> >         {
> >           if (out_n == 2 && in_n == 2)
> >             return ix86_get_builtin (IX86_BUILTIN_RINTPD);
> >           else if (out_n == 4 && in_n == 4)
> >             return ix86_get_builtin (IX86_BUILTIN_RINTPD256);
> >         }
> >       if (out_mode == SFmode && in_mode == SFmode)
> >         {
> >           if (out_n == 4 && in_n == 4)
> >             return ix86_get_builtin (IX86_BUILTIN_RINTPS);
> >           else if (out_n == 8 && in_n == 8)
> >             return ix86_get_builtin (IX86_BUILTIN_RINTPS256);
> >         }
> >       break;
> > --cut here--
>
> Ok, will test removing that stuff, seems nothing in the headers uses that.
>
> > which is converting rint functions to corresponding x86 builtin. If we
> > want to go through generic path, then the above code is probably
> > redundant and should be removed together with corresponding builtins.
> > OTOH, the existing code also bails out for flag_trapping_math, so this
> > condition should also be considered in named expanders.
>
> The conditions are:
> (define_expand "nearbyint<mode>2"
>   [(use (match_operand:MODEF 0 "register_operand"))
>    (use (match_operand:MODEF 1 "nonimmediate_operand"))]
>   "(TARGET_USE_FANCY_MATH_387
>     && (!(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)
>         || TARGET_MIX_SSE_I387)
>     && !flag_trapping_math)
>    || (TARGET_SSE4_1 && TARGET_SSE_MATH)"
> and:
> (define_expand "rint<mode>2"
>   [(use (match_operand:MODEF 0 "register_operand"))
>    (use (match_operand:MODEF 1 "nonimmediate_operand"))]
>   "TARGET_USE_FANCY_MATH_387
>    || (SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH)"
> Only nearbyint tests flag_trapping_math, and only for the pre-sse4.1 case,

This is correct, since x87 frndint always generates precision (inexact)
exceptions, but nearbyint should not generate any.

On a related note, trap on denormal is not an IEEE exception, and the
documentation explicitly says that -fno-trapping-math affects only
division by zero, overflow, underflow, inexact result and invalid
operation. So, do we need to check for flag_trapping_math in
ix86_builtin_vectorized_function for other builtins involving the ROUND
insn? Also, perhaps floor/ceil/trunc can be reimplemented using standard
named expanders instead.

> with sse4.1 it is enabled regardless of that (just depends on
> TARGET_SSE_MATH, but I think for vectorization we don't really test that,
> vectorization is always done in sse*).

Your patch with stuff removed from ix86_builtin_vectorized_function is OK.

Thanks,
Uros.
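
For readers following the thread, here is a minimal, self-contained C sketch of the guard in the quoted CASE_CFN_RINT code. It is not GCC source: every sketch_/SKETCH_ name is hypothetical, and it assumes the input and output modes and lane counts match, so each pair is collapsed into a single argument.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's machine modes and builtin codes;
   these are NOT the real GCC enums.  */
enum sketch_mode { SKETCH_SFmode, SKETCH_DFmode };

enum sketch_builtin
{
  SKETCH_NONE,        /* the quoted code falls through to "break" */
  SKETCH_RINTPD,      /* stands for IX86_BUILTIN_RINTPD */
  SKETCH_RINTPD256,   /* stands for IX86_BUILTIN_RINTPD256 */
  SKETCH_RINTPS,      /* stands for IX86_BUILTIN_RINTPS */
  SKETCH_RINTPS256    /* stands for IX86_BUILTIN_RINTPS256 */
};

/* Model of the lookup: TRAPPING_MATH plays the role of
   flag_trapping_math, SSE4_1 the role of TARGET_SSE4_1, and N is the
   number of vector lanes (assumed equal for input and output).  */
static enum sketch_builtin
sketch_vectorized_rint (bool trapping_math, bool sse4_1,
                        enum sketch_mode mode, int n)
{
  /* Same guard as the quoted code: give up when trapping math is
     enabled or SSE4.1 is unavailable.  */
  if (trapping_math || !sse4_1)
    return SKETCH_NONE;

  if (mode == SKETCH_DFmode)
    {
      if (n == 2)
        return SKETCH_RINTPD;
      else if (n == 4)
        return SKETCH_RINTPD256;
    }
  if (mode == SKETCH_SFmode)
    {
      if (n == 4)
        return SKETCH_RINTPS;
      else if (n == 8)
        return SKETCH_RINTPS256;
    }
  return SKETCH_NONE;
}
```

Since -ftrapping-math is GCC's default, the first argument is true unless the user asks otherwise, and this lookup then yields nothing for every mode. That is the guard the question above about flag_trapping_math refers to.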