Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
This patch fixes the oversight. Tested on x86-64 Linux. Pushed. On Wed, Nov 9, 2022 at 12:05 AM Aldy Hernandez wrote: > > Sigh, one more thing. > > There are further possibilities for a NAN result, even if the operands > are !NAN and the result from frange_arithmetic is free of NANs. > Adding different signed infinities. > > For example, [-INF,+INF] + [-INF,+INF] has the possibility of adding > -INF and +INF, which is a NAN. Since we end up calling frange > arithmetic on the lower bounds and then on the upper bounds, we miss > this, and mistakenly think we're free of NANs. > > I have a patch in testing, but FYI, in case anyone notices this before > I get around to it tomorrow. > > Aldy > > On Tue, Nov 8, 2022 at 3:11 PM Jakub Jelinek wrote: > > > > On Tue, Nov 08, 2022 at 03:06:53PM +0100, Aldy Hernandez wrote: > > > +// If either operand is a NAN, set R to the combination of both NANs > > > +// signwise and return TRUE. > > > > This comment doesn't describe what it does now. > > If either operand is a NAN, set R to NAN with unspecified sign bit and > > return > > TRUE. > > ? > > > > Other than this LGTM. > > > > Jakub > > From 68b0615be2aaff3a8ce91ba7cd0f69ebbd93702c Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Tue, 8 Nov 2022 23:42:04 +0100 Subject: [PATCH] [range-op-float] Set NAN possibility for INF + (-INF) and vice versa. Some combinations of operations can yield a NAN even if no operands have the possiblity of a NAN. For example, [-INF] + [+INF] = NAN and vice versa. For [-INF,+INF] + [-INF,+INF], frange_arithmetic will not return a NAN, and since the operands have no possibility of a NAN, we will mistakenly assume the result cannot have a NAN. This fixes the oversight. gcc/ChangeLog: * range-op-float.cc (foperator_plus::fold_range): Set NAN for addition of different signed infinities. (range_op_float_tests): New test. --- gcc/range-op-float.cc | 26 +- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc index 3bc6cc8849d..8282c912fc4 100644 --- a/gcc/range-op-float.cc +++ b/gcc/range-op-float.cc @@ -1863,7 +1863,21 @@ foperator_plus::fold_range (frange , tree type, r.set (type, lb, ub); - if (lb_nan || ub_nan) + // Some combinations can yield a NAN even if no operands have the + // possibility of a NAN. + bool maybe_nan; + // [-INF] + [+INF] = NAN + if (real_isinf (_bound (), true) + && real_isinf (_bound (), false)) +maybe_nan = true; + // [+INF] + [-INF] = NAN + else if (real_isinf (_bound (), false) + && real_isinf (_bound (), true)) +maybe_nan = true; + else +maybe_nan = false; + + if (lb_nan || ub_nan || maybe_nan) // Keep the default NAN (with a varying sign) set by the setter. ; else if (!op1.maybe_isnan () && !op2.maybe_isnan ()) @@ -1960,6 +1974,16 @@ range_op_float_tests () r1 = frange_float ("-1", "-0"); r1.update_nan (false); ASSERT_EQ (r, r1); + + // [-INF,+INF] + [-INF,+INF] could be a NAN. + range_op_handler plus (PLUS_EXPR, float_type_node); + r0.set_varying (float_type_node); + r1.set_varying (float_type_node); + r0.clear_nan (); + r1.clear_nan (); + plus.fold_range (r, float_type_node, r0, r1); + if (HONOR_NANS (float_type_node)) +ASSERT_TRUE (r.maybe_isnan ()); } } // namespace selftest -- 2.38.1
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
Sigh, one more thing. There are further possibilities for a NAN result, even if the operands are !NAN and the result from frange_arithmetic is free of NANs. Adding different signed infinities. For example, [-INF,+INF] + [-INF,+INF] has the possibility of adding -INF and +INF, which is a NAN. Since we end up calling frange arithmetic on the lower bounds and then on the upper bounds, we miss this, and mistakenly think we're free of NANs. I have a patch in testing, but FYI, in case anyone notices this before I get around to it tomorrow. Aldy On Tue, Nov 8, 2022 at 3:11 PM Jakub Jelinek wrote: > > On Tue, Nov 08, 2022 at 03:06:53PM +0100, Aldy Hernandez wrote: > > +// If either operand is a NAN, set R to the combination of both NANs > > +// signwise and return TRUE. > > This comment doesn't describe what it does now. > If either operand is a NAN, set R to NAN with unspecified sign bit and return > TRUE. > ? > > Other than this LGTM. > > Jakub >
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 8, 2022 at 10:11 AM Jakub Jelinek wrote: > > On Tue, Nov 08, 2022 at 09:44:40AM -0800, Andrew Waterman wrote: > > On Tue, Nov 8, 2022 at 3:20 AM Jakub Jelinek via Gcc-patches > > wrote: > > > > > > On Mon, Nov 07, 2022 at 04:41:23PM +0100, Aldy Hernandez wrote: > > > > As suggested upthread, I have also adjusted update_nan_sign() to drop > > > > the NAN sign to VARYING if both operands are NAN. As an optimization > > > > I keep the sign if both operands are NAN and have the same sign. > > > > > > For NaNs this still relies on something IEEE754 doesn't guarantee, > > > as I cited, after a binary operation the sign bit of the NaN is > > > unspecified, whether there is one NaN operand or two. > > > It might be that all CPUs handle it the way you've implemented > > > (that for one NaN operand the sign of NaN result will be the same > > > as that NaN operand and for two it will be the sign of one of the two > > > NaNs operands, never something else), but I think we'd need to check > > > more than one implementation for that (I've only tried x86_64 and thus > > > SSE behavior in it), so one would need to test i387 long double behavior > > > too, ARM/AArch64, PowerPC, s390{,x}, RISCV, ... > > > The guarantee given by IEEE754 is only for those copy, negate, abs, > > > copySign > > > operations, so copying values around, NEG_EXPR, ABS_EXPR, __builtin_fabs*, > > > __builtin_copysign*. > > > > FWIW, RISC-V canonicalizes NaNs by clearing the sign bit; the signs of > > the input NaNs do not factor in. > > Just for binary operations and some unary, or also the ones that > IEEE754 spells out (moves, negations, absolute value and copysign)? I should've been more specific in my earlier email: I was referring to the arithmetic operators. Copysign and friends do not canonicalize NaNs. > > Jakub >
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 08, 2022 at 09:44:40AM -0800, Andrew Waterman wrote: > On Tue, Nov 8, 2022 at 3:20 AM Jakub Jelinek via Gcc-patches > wrote: > > > > On Mon, Nov 07, 2022 at 04:41:23PM +0100, Aldy Hernandez wrote: > > > As suggested upthread, I have also adjusted update_nan_sign() to drop > > > the NAN sign to VARYING if both operands are NAN. As an optimization > > > I keep the sign if both operands are NAN and have the same sign. > > > > For NaNs this still relies on something IEEE754 doesn't guarantee, > > as I cited, after a binary operation the sign bit of the NaN is > > unspecified, whether there is one NaN operand or two. > > It might be that all CPUs handle it the way you've implemented > > (that for one NaN operand the sign of NaN result will be the same > > as that NaN operand and for two it will be the sign of one of the two > > NaNs operands, never something else), but I think we'd need to check > > more than one implementation for that (I've only tried x86_64 and thus > > SSE behavior in it), so one would need to test i387 long double behavior > > too, ARM/AArch64, PowerPC, s390{,x}, RISCV, ... > > The guarantee given by IEEE754 is only for those copy, negate, abs, copySign > > operations, so copying values around, NEG_EXPR, ABS_EXPR, __builtin_fabs*, > > __builtin_copysign*. > > FWIW, RISC-V canonicalizes NaNs by clearing the sign bit; the signs of > the input NaNs do not factor in. Just for binary operations and some unary, or also the ones that IEEE754 spells out (moves, negations, absolute value and copysign)? Jakub
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 8, 2022 at 3:20 AM Jakub Jelinek via Gcc-patches wrote: > > On Mon, Nov 07, 2022 at 04:41:23PM +0100, Aldy Hernandez wrote: > > As suggested upthread, I have also adjusted update_nan_sign() to drop > > the NAN sign to VARYING if both operands are NAN. As an optimization > > I keep the sign if both operands are NAN and have the same sign. > > For NaNs this still relies on something IEEE754 doesn't guarantee, > as I cited, after a binary operation the sign bit of the NaN is > unspecified, whether there is one NaN operand or two. > It might be that all CPUs handle it the way you've implemented > (that for one NaN operand the sign of NaN result will be the same > as that NaN operand and for two it will be the sign of one of the two > NaNs operands, never something else), but I think we'd need to check > more than one implementation for that (I've only tried x86_64 and thus > SSE behavior in it), so one would need to test i387 long double behavior > too, ARM/AArch64, PowerPC, s390{,x}, RISCV, ... > The guarantee given by IEEE754 is only for those copy, negate, abs, copySign > operations, so copying values around, NEG_EXPR, ABS_EXPR, __builtin_fabs*, > __builtin_copysign*. FWIW, RISC-V canonicalizes NaNs by clearing the sign bit; the signs of the input NaNs do not factor in. > > Otherwise LGTM (but would be nice to get into GCC13 not just > +, but also -, *, /, sqrt at least). > > Jakub >
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 8, 2022 at 3:11 PM Jakub Jelinek wrote: > > On Tue, Nov 08, 2022 at 03:06:53PM +0100, Aldy Hernandez wrote: > > +// If either operand is a NAN, set R to the combination of both NANs > > +// signwise and return TRUE. > > This comment doesn't describe what it does now. > If either operand is a NAN, set R to NAN with unspecified sign bit and return > TRUE. > ? OMG, I suck! // If either operand is a NAN, set R to NAN and return TRUE. Tests on-going :). Aldy
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 08, 2022 at 03:06:53PM +0100, Aldy Hernandez wrote: > +// If either operand is a NAN, set R to the combination of both NANs > +// signwise and return TRUE. This comment doesn't describe what it does now. If either operand is a NAN, set R to NAN with unspecified sign bit and return TRUE. ? Other than this LGTM. Jakub
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 8, 2022 at 2:50 PM Jakub Jelinek wrote: > > On Tue, Nov 08, 2022 at 02:47:35PM +0100, Aldy Hernandez wrote: > > Well, perhaps we should just nuke update_nan_sign() altogether, and > > always keep the sign varying? > > > > inline bool > > propagate_nans (frange , const frange , const frange ) > > { > > if (op1.known_isnan () || op2.known_isnan ()) > > { > > r.set_nan (op1.type ()); > > return true; > > } > > return false; > > } > > > > I'm fine either way. The less code the better :). > > Yes, but you had 2 callers, so something needs to be done also if > in foperator_plus::fold_range. We can also remove the update_nan_sign() in the other call because the r.set() before it sets a default NAN (with a varying sign). Attached patch in testing. Aldy From 8da4fc39cc73f0ae785463bd5f371223fa59027e Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Thu, 13 Oct 2022 08:14:16 +0200 Subject: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats. This is the range-op entry for floating point PLUS_EXPR. It's the most intricate range entry we have so far, because we need to keep track of rounding and target FP formats. This will be the last FP entry I commit, mostly to avoid disturbing the tree any further, and also because what we have so far is enough for a solid VRP. So far we track NANs and signs correctly. We also handle relationals (symbolics and numeric), both ordered and unordered, ABS_EXPR and NEGATE_EXPR which are used to fold __builtin_isinf, and __builtin_sign (__builtin_copysign is coming up). All in all, I think this provide more than enough for basic VRP on floats, as well as provide a basis to flesh out the rest if there's interest. My goal with this entry is to provide a template for additional binary operators, as they tend to follow a similar pattern: handle NANs, do the arithmetic while keeping track of rounding, and adjust for NAN. I may abstract the general parts as we do for irange's fold_range and wi_fold. PR tree-optimization/24021 gcc/ChangeLog: * range-op-float.cc (propagate_nans): New. (frange_nextafter): New. (frange_arithmetic): New. (class foperator_plus): New. (floating_op_table::floating_op_table): Add PLUS_EXPR entry. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/vrp-float-plus.c: New test. --- gcc/range-op-float.cc | 128 ++ .../gcc.dg/tree-ssa/vrp-float-plus.c | 21 +++ 2 files changed, 149 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc index a1f372997bf..502b67aaa66 100644 --- a/gcc/range-op-float.cc +++ b/gcc/range-op-float.cc @@ -192,6 +192,81 @@ frelop_early_resolve (irange , tree type, && relop_early_resolve (r, type, op1, op2, rel, my_rel)); } +// If either operand is a NAN, set R to the combination of both NANs +// signwise and return TRUE. + +inline bool +propagate_nans (frange , const frange , const frange ) +{ + if (op1.known_isnan () || op2.known_isnan ()) +{ + r.set_nan (op1.type ()); + return true; +} + return false; +} + +// Set VALUE to its next real value, or INF if the operation overflows. + +inline void +frange_nextafter (enum machine_mode mode, + REAL_VALUE_TYPE , + const REAL_VALUE_TYPE ) +{ + const real_format *fmt = REAL_MODE_FORMAT (mode); + REAL_VALUE_TYPE tmp; + real_nextafter (, fmt, , ); + value = tmp; +} + +// Like real_arithmetic, but round the result to INF if the operation +// produced inexact results. +// +// ?? There is still one problematic case, i387. With +// -fexcess-precision=standard we perform most SF/DFmode arithmetic in +// XFmode (long_double_type_node), so that case is OK. But without +// -mfpmath=sse, all the SF/DFmode computations are in XFmode +// precision (64-bit mantissa) and only occassionally rounded to +// SF/DFmode (when storing into memory from the 387 stack). Maybe +// this is ok as well though it is just occassionally more precise. ?? + +static void +frange_arithmetic (enum tree_code code, tree type, + REAL_VALUE_TYPE , + const REAL_VALUE_TYPE , + const REAL_VALUE_TYPE , + const REAL_VALUE_TYPE ) +{ + REAL_VALUE_TYPE value; + enum machine_mode mode = TYPE_MODE (type); + bool mode_composite = MODE_COMPOSITE_P (mode); + + bool inexact = real_arithmetic (, code, , ); + real_convert (, mode, ); + + // Be extra careful if there may be discrepancies between the + // compile and runtime results. + if ((mode_composite || (real_isneg () ? real_less (, ) + : !real_less (, ))) + && (inexact || !real_identical (, ))) +{ + if (mode_composite) + { + if (real_isdenormal (, mode) + || real_iszero ()) + { + // IBM extended denormals only have DFmode precision. + REAL_VALUE_TYPE tmp; +
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 08, 2022 at 03:02:40PM +0100, Aldy Hernandez wrote: > From d02ce8eaf16d2fc6db6472268fd962e09c2fd81e Mon Sep 17 00:00:00 2001 > From: Aldy Hernandez > Date: Mon, 7 Nov 2022 14:18:57 +0100 > Subject: [PATCH] Provide normalized and denormal format version of > real_isdenormal. > > Implement a variant of real_isdenormal() to be used within real.cc > where the argument is known to be in denormal format. Rewrite > real_isdenormal() for use outside of real.cc where the argument is > known to be normalized. > > gcc/ChangeLog: > > * real.cc (real_isdenormal): New. > (encode_ieee_single): Call real_isdenormal. > (encode_ieee_double): Same. > (encode_ieee_extended): Same. > (encode_ieee_quad): Same. > (encode_ieee_half): Same. > (encode_arm_bfloat_half): Same. > * real.h (real_isdenormal): Add mode argument. Rewrite for > normalized values. > * value-range.cc (frange::flush_denormals_to_zero): Pass mode to > real_isdenormal. LGTM, thanks. Jakub
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 8, 2022 at 2:15 PM Jakub Jelinek wrote: > > On Tue, Nov 08, 2022 at 01:47:58PM +0100, Aldy Hernandez wrote: > > On Tue, Nov 8, 2022 at 12:07 PM Jakub Jelinek wrote: > > > > > > On Mon, Nov 07, 2022 at 04:38:29PM +0100, Aldy Hernandez wrote: > > > > From d214bcdff2cb90ad1eb808d29bda6fb98d510b4c Mon Sep 17 00:00:00 2001 > > > > From: Aldy Hernandez > > > > Date: Mon, 7 Nov 2022 14:18:57 +0100 > > > > Subject: [PATCH] Provide normalized and denormal format version of > > > > real_isdenormal. > > > > > > > > Implement real_isdenormal_target() to be used within real.cc where the > > > > argument is known to be in denormal format. Rewrite real_isdenormal() > > > > for use outside of real.cc where the argument is known to be > > > > normalized. > > > > > > > > gcc/ChangeLog: > > > > > > > > * real.cc (real_isdenormal_target): New. > > > > (encode_ieee_single): Use real_isdenormal_target. > > > > (encode_ieee_double): Same. > > > > (encode_ieee_extended): Same. > > > > (encode_ieee_quad): Same. > > > > (encode_ieee_half): Same. > > > > (encode_arm_bfloat_half): Same. > > > > * value-range.cc (frange::flush_denormals_to_zero): Same. > > > > * real.h (real_isdenormal): Rewrite to look at mode. > > > > > > I'd make real_isdenormal_target static inline bool > > > rather than inline bool, it is only defined in real.cc, so there is > > > no point exporting it. > > > > Huh. I thought inline alone would inhibit the exporting. Thanks. > > That is what happens with C99 inline (unless there is extern for the decl), > but C++ inline is different. It isn't guaranteed to be exported, but > with -fkeep-inline-functions or if you say take address of the inline > in a way that can't be optimized back into a call to the inline (or even > just call it with -O0), it is exported. > > > > > Though, as you've added the mode argument, the real.cc inline > > > could very well also be called real_isdenormal too, it wouldn't be > > > a redeclaration or ODR violation. > > > > Great, even better. > > > > OK pending tests? > > Aldy > > > From c3ca1d606bfb22bf4f8bc7ac0ce561bd6afc3368 Mon Sep 17 00:00:00 2001 > > From: Aldy Hernandez > > Date: Mon, 7 Nov 2022 14:18:57 +0100 > > Subject: [PATCH] Provide normalized and denormal format version of > > real_isdenormal. > > > > Implement a variant of real_isdenormal() to be used within real.cc > > where the argument is known to be in denormal format. Rewrite > > real_isdenormal() for use outside of real.cc where the argument is > > known to be normalized. > > > > gcc/ChangeLog: > > > > * real.cc (real_isdenormal): New. > > * real.h (real_isdenormal): Add mode argument. Rewrite for > > normalized values. > > * value-range.cc (frange::flush_denormals_to_zero): Pass mode to > > real_isdenormal. > > --- > > gcc/real.cc| 10 ++ > > gcc/real.h | 7 --- > > gcc/value-range.cc | 5 +++-- > > 3 files changed, 17 insertions(+), 5 deletions(-) > > > > diff --git a/gcc/real.cc b/gcc/real.cc > > index aae7c335d59..028aad95ec4 100644 > > --- a/gcc/real.cc > > +++ b/gcc/real.cc > > @@ -111,6 +111,16 @@ static const REAL_VALUE_TYPE * real_digit (int); > > static void times_pten (REAL_VALUE_TYPE *, int); > > > > static void round_for_format (const struct real_format *, REAL_VALUE_TYPE > > *); > > + > > +/* Determine whether a floating-point value X is a denormal. R is > > + expected to be in denormal form, so this function is only > > + meaningful after a call to round_for_format. */ > > + > > +static inline bool > > +real_isdenormal (const REAL_VALUE_TYPE *r) > > +{ > > + return (r->sig[SIGSZ-1] & SIG_MSB) == 0; > > I would probably keep the r->cl == rvc_normal in here too. > I know the code in real.cc didn't do it before, but what > r->sig is for the rvc_zero/rvc_inf is unclear. It is true > that get_zero/get_canonical_?nan/get_inf clear the whole sig, > but not really sure if we guarantee that everywhere. > The real.cc uses were like: > bool denormal = ...; > at the start of the function and then > switch (...) > { > ... > case rvc_normal: > if (denormal) > ... > } > so another even better possibility would be to use your simple > real.cc (real_isdenormal) and drop all the denormal variables, so: > - bool denormal = ...; > switch (...) > { > ... > case rvc_normal: > - if (denormal) > + if (real_isdenormal (r)) > ... > } Sure. Attached patch in testing. Aldy From d02ce8eaf16d2fc6db6472268fd962e09c2fd81e Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Mon, 7 Nov 2022 14:18:57 +0100 Subject: [PATCH] Provide normalized and denormal format version of real_isdenormal. Implement a variant of real_isdenormal() to be used within real.cc where the argument is known to be in denormal format. Rewrite real_isdenormal() for use outside of real.cc where the argument is known to be normalized. gcc/ChangeLog: *
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 08, 2022 at 02:47:35PM +0100, Aldy Hernandez wrote: > Well, perhaps we should just nuke update_nan_sign() altogether, and > always keep the sign varying? > > inline bool > propagate_nans (frange , const frange , const frange ) > { > if (op1.known_isnan () || op2.known_isnan ()) > { > r.set_nan (op1.type ()); > return true; > } > return false; > } > > I'm fine either way. The less code the better :). Yes, but you had 2 callers, so something needs to be done also if in foperator_plus::fold_range. Jakub
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 8, 2022 at 2:25 PM Jakub Jelinek wrote: > > On Tue, Nov 08, 2022 at 02:06:58PM +0100, Aldy Hernandez wrote: > > + gcc_checking_assert (!r.nan_signbit_p (sign1)); > > + if (op1_nan && op2_nan) > > +{ > > + // If boths signs agree, we could use that sign, but IEEE754 > > + // does not guarantee this for a binary operator. The x86_64 > > + // architure does keep the common known sign, but further tests > > + // are needed to see if other architectures do the same (i387 > > + // long double, ARM/aarch64, PowerPC, s390,{,x}, RSICV, etc). > > s/RSICV/RISCV/ > > > + // In the meantime, keep sign VARYING. > > + ; > > +} > > + else if (op1_nan) > > +{ > > + if (op1.nan_signbit_p (sign1)) > > + r.update_nan (sign1); > > +} > > + else if (op2_nan) > > +{ > > + if (op2.nan_signbit_p (sign2)) > > + r.update_nan (sign2); > > +} > > Well, these cases also aren't guaranteed for binary operator. > I think a conforming implementation can say copy the NaN operand > to result and toggle the sign. Or, if the operand would be a sNaN, > it must turn it into a qNaN (don't remember right now if there are > requirements on what it can do with the mantissa which needs to change > for the sNaN -> qNaN difference at least, but whether it can just > generate a canonical qNaN or needs to preserve at least some bits), > but could e.g. clear or toggle the sign of the NaN as well. > Whether there are any such implementations or not is a question. > For the single qNaN operand case, it would surprise me if anybody > bothered to tweak the sign bit in any way, just copying the input > seems simpler to me, but for the sNaN -> qNaN conversion it wouldn't > surprise me that much. Well, perhaps we should just nuke update_nan_sign() altogether, and always keep the sign varying? inline bool propagate_nans (frange , const frange , const frange ) { if (op1.known_isnan () || op2.known_isnan ()) { r.set_nan (op1.type ()); return true; } return false; } I'm fine either way. The less code the better :). Aldy
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 08, 2022 at 02:06:58PM +0100, Aldy Hernandez wrote: > + gcc_checking_assert (!r.nan_signbit_p (sign1)); > + if (op1_nan && op2_nan) > +{ > + // If boths signs agree, we could use that sign, but IEEE754 > + // does not guarantee this for a binary operator. The x86_64 > + // architure does keep the common known sign, but further tests > + // are needed to see if other architectures do the same (i387 > + // long double, ARM/aarch64, PowerPC, s390,{,x}, RSICV, etc). s/RSICV/RISCV/ > + // In the meantime, keep sign VARYING. > + ; > +} > + else if (op1_nan) > +{ > + if (op1.nan_signbit_p (sign1)) > + r.update_nan (sign1); > +} > + else if (op2_nan) > +{ > + if (op2.nan_signbit_p (sign2)) > + r.update_nan (sign2); > +} Well, these cases also aren't guaranteed for binary operator. I think a conforming implementation can say copy the NaN operand to result and toggle the sign. Or, if the operand would be a sNaN, it must turn it into a qNaN (don't remember right now if there are requirements on what it can do with the mantissa which needs to change for the sNaN -> qNaN difference at least, but whether it can just generate a canonical qNaN or needs to preserve at least some bits), but could e.g. clear or toggle the sign of the NaN as well. Whether there are any such implementations or not is a question. For the single qNaN operand case, it would surprise me if anybody bothered to tweak the sign bit in any way, just copying the input seems simpler to me, but for the sNaN -> qNaN conversion it wouldn't surprise me that much. Jakub
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 08, 2022 at 01:47:58PM +0100, Aldy Hernandez wrote: > On Tue, Nov 8, 2022 at 12:07 PM Jakub Jelinek wrote: > > > > On Mon, Nov 07, 2022 at 04:38:29PM +0100, Aldy Hernandez wrote: > > > From d214bcdff2cb90ad1eb808d29bda6fb98d510b4c Mon Sep 17 00:00:00 2001 > > > From: Aldy Hernandez > > > Date: Mon, 7 Nov 2022 14:18:57 +0100 > > > Subject: [PATCH] Provide normalized and denormal format version of > > > real_isdenormal. > > > > > > Implement real_isdenormal_target() to be used within real.cc where the > > > argument is known to be in denormal format. Rewrite real_isdenormal() > > > for use outside of real.cc where the argument is known to be > > > normalized. > > > > > > gcc/ChangeLog: > > > > > > * real.cc (real_isdenormal_target): New. > > > (encode_ieee_single): Use real_isdenormal_target. > > > (encode_ieee_double): Same. > > > (encode_ieee_extended): Same. > > > (encode_ieee_quad): Same. > > > (encode_ieee_half): Same. > > > (encode_arm_bfloat_half): Same. > > > * value-range.cc (frange::flush_denormals_to_zero): Same. > > > * real.h (real_isdenormal): Rewrite to look at mode. > > > > I'd make real_isdenormal_target static inline bool > > rather than inline bool, it is only defined in real.cc, so there is > > no point exporting it. > > Huh. I thought inline alone would inhibit the exporting. Thanks. That is what happens with C99 inline (unless there is extern for the decl), but C++ inline is different. It isn't guaranteed to be exported, but with -fkeep-inline-functions or if you say take address of the inline in a way that can't be optimized back into a call to the inline (or even just call it with -O0), it is exported. > > > Though, as you've added the mode argument, the real.cc inline > > could very well also be called real_isdenormal too, it wouldn't be > > a redeclaration or ODR violation. > > Great, even better. > > OK pending tests? > Aldy > From c3ca1d606bfb22bf4f8bc7ac0ce561bd6afc3368 Mon Sep 17 00:00:00 2001 > From: Aldy Hernandez > Date: Mon, 7 Nov 2022 14:18:57 +0100 > Subject: [PATCH] Provide normalized and denormal format version of > real_isdenormal. > > Implement a variant of real_isdenormal() to be used within real.cc > where the argument is known to be in denormal format. Rewrite > real_isdenormal() for use outside of real.cc where the argument is > known to be normalized. > > gcc/ChangeLog: > > * real.cc (real_isdenormal): New. > * real.h (real_isdenormal): Add mode argument. Rewrite for > normalized values. > * value-range.cc (frange::flush_denormals_to_zero): Pass mode to > real_isdenormal. > --- > gcc/real.cc| 10 ++ > gcc/real.h | 7 --- > gcc/value-range.cc | 5 +++-- > 3 files changed, 17 insertions(+), 5 deletions(-) > > diff --git a/gcc/real.cc b/gcc/real.cc > index aae7c335d59..028aad95ec4 100644 > --- a/gcc/real.cc > +++ b/gcc/real.cc > @@ -111,6 +111,16 @@ static const REAL_VALUE_TYPE * real_digit (int); > static void times_pten (REAL_VALUE_TYPE *, int); > > static void round_for_format (const struct real_format *, REAL_VALUE_TYPE *); > + > +/* Determine whether a floating-point value X is a denormal. R is > + expected to be in denormal form, so this function is only > + meaningful after a call to round_for_format. */ > + > +static inline bool > +real_isdenormal (const REAL_VALUE_TYPE *r) > +{ > + return (r->sig[SIGSZ-1] & SIG_MSB) == 0; I would probably keep the r->cl == rvc_normal in here too. I know the code in real.cc didn't do it before, but what r->sig is for the rvc_zero/rvc_inf is unclear. It is true that get_zero/get_canonical_?nan/get_inf clear the whole sig, but not really sure if we guarantee that everywhere. The real.cc uses were like: bool denormal = ...; at the start of the function and then switch (...) { ... case rvc_normal: if (denormal) ... } so another even better possibility would be to use your simple real.cc (real_isdenormal) and drop all the denormal variables, so: - bool denormal = ...; switch (...) { ... case rvc_normal: - if (denormal) + if (real_isdenormal (r)) ... } Otherwise LGTM. Jakub
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 8, 2022 at 12:20 PM Jakub Jelinek wrote: > > On Mon, Nov 07, 2022 at 04:41:23PM +0100, Aldy Hernandez wrote: > > As suggested upthread, I have also adjusted update_nan_sign() to drop > > the NAN sign to VARYING if both operands are NAN. As an optimization > > I keep the sign if both operands are NAN and have the same sign. > > For NaNs this still relies on something IEEE754 doesn't guarantee, > as I cited, after a binary operation the sign bit of the NaN is > unspecified, whether there is one NaN operand or two. > It might be that all CPUs handle it the way you've implemented > (that for one NaN operand the sign of NaN result will be the same > as that NaN operand and for two it will be the sign of one of the two > NaNs operands, never something else), but I think we'd need to check > more than one implementation for that (I've only tried x86_64 and thus > SSE behavior in it), so one would need to test i387 long double behavior > too, ARM/AArch64, PowerPC, s390{,x}, RISCV, ... > The guarantee given by IEEE754 is only for those copy, negate, abs, copySign > operations, so copying values around, NEG_EXPR, ABS_EXPR, __builtin_fabs*, > __builtin_copysign*. Ughh, that's unfortunate. OK, I've added a big note. > > Otherwise LGTM (but would be nice to get into GCC13 not just > +, but also -, *, /, sqrt at least). Minus is trivial as we can implement it with a negate and plus. I have a patch queued up for that. The rest require a bit more thought, though perhaps with what we have so far can serve as a base. I'll look into it. Attached is the patch I'm retesting. Thanks for your patience, and copious help here. Aldy From 32e9063bbd5a48bf7f7b16077ebc0c1e7bf3c33d Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Thu, 13 Oct 2022 08:14:16 +0200 Subject: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats. This is the range-op entry for floating point PLUS_EXPR. It's the most intricate range entry we have so far, because we need to keep track of rounding and target FP formats. This will be the last FP entry I commit, mostly to avoid disturbing the tree any further, and also because what we have so far is enough for a solid VRP. So far we track NANs and signs correctly. We also handle relationals (symbolics and numeric), both ordered and unordered, ABS_EXPR and NEGATE_EXPR which are used to fold __builtin_isinf, and __builtin_sign (__builtin_copysign is coming up). All in all, I think this provide more than enough for basic VRP on floats, as well as provide a basis to flesh out the rest if there's interest. My goal with this entry is to provide a template for additional binary operators, as they tend to follow a similar pattern: handle NANs, do the arithmetic while keeping track of rounding, and adjust for NAN. I may abstract the general parts as we do for irange's fold_range and wi_fold. PR tree-optimization/24021 gcc/ChangeLog: * range-op-float.cc (update_nan_sign): New. (propagate_nans): New. (frange_nextafter): New. (frange_arithmetic): New. (class foperator_plus): New. (floating_op_table::floating_op_table): Add PLUS_EXPR entry. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/vrp-float-plus.c: New test. --- gcc/range-op-float.cc | 165 ++ .../gcc.dg/tree-ssa/vrp-float-plus.c | 21 +++ 2 files changed, 186 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc index a1f372997bf..1a6913b8b98 100644 --- a/gcc/range-op-float.cc +++ b/gcc/range-op-float.cc @@ -192,6 +192,118 @@ frelop_early_resolve (irange , tree type, && relop_early_resolve (r, type, op1, op2, rel, my_rel)); } +// If R contains a NAN of unknown sign, update the NAN's signbit +// depending on two operands. + +inline void +update_nan_sign (frange , const frange , const frange ) +{ + if (!r.maybe_isnan ()) +return; + + bool op1_nan = op1.maybe_isnan (); + bool op2_nan = op2.maybe_isnan (); + bool sign1, sign2; + + gcc_checking_assert (!r.nan_signbit_p (sign1)); + if (op1_nan && op2_nan) +{ + // If boths signs agree, we could use that sign, but IEEE754 + // does not guarantee this for a binary operator. The x86_64 + // architure does keep the common known sign, but further tests + // are needed to see if other architectures do the same (i387 + // long double, ARM/aarch64, PowerPC, s390,{,x}, RSICV, etc). + // In the meantime, keep sign VARYING. + ; +} + else if (op1_nan) +{ + if (op1.nan_signbit_p (sign1)) + r.update_nan (sign1); +} + else if (op2_nan) +{ + if (op2.nan_signbit_p (sign2)) + r.update_nan (sign2); +} +} + +// If either operand is a NAN, set R to the combination of both NANs +// signwise and return TRUE. + +inline bool +propagate_nans (frange , const frange , const fra
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Tue, Nov 8, 2022 at 12:07 PM Jakub Jelinek wrote: > > On Mon, Nov 07, 2022 at 04:38:29PM +0100, Aldy Hernandez wrote: > > From d214bcdff2cb90ad1eb808d29bda6fb98d510b4c Mon Sep 17 00:00:00 2001 > > From: Aldy Hernandez > > Date: Mon, 7 Nov 2022 14:18:57 +0100 > > Subject: [PATCH] Provide normalized and denormal format version of > > real_isdenormal. > > > > Implement real_isdenormal_target() to be used within real.cc where the > > argument is known to be in denormal format. Rewrite real_isdenormal() > > for use outside of real.cc where the argument is known to be > > normalized. > > > > gcc/ChangeLog: > > > > * real.cc (real_isdenormal_target): New. > > (encode_ieee_single): Use real_isdenormal_target. > > (encode_ieee_double): Same. > > (encode_ieee_extended): Same. > > (encode_ieee_quad): Same. > > (encode_ieee_half): Same. > > (encode_arm_bfloat_half): Same. > > * value-range.cc (frange::flush_denormals_to_zero): Same. > > * real.h (real_isdenormal): Rewrite to look at mode. > > I'd make real_isdenormal_target static inline bool > rather than inline bool, it is only defined in real.cc, so there is > no point exporting it. Huh. I thought inline alone would inhibit the exporting. Thanks. > Though, as you've added the mode argument, the real.cc inline > could very well also be called real_isdenormal too, it wouldn't be > a redeclaration or ODR violation. Great, even better. OK pending tests? Aldy From c3ca1d606bfb22bf4f8bc7ac0ce561bd6afc3368 Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Mon, 7 Nov 2022 14:18:57 +0100 Subject: [PATCH] Provide normalized and denormal format version of real_isdenormal. Implement a variant of real_isdenormal() to be used within real.cc where the argument is known to be in denormal format. Rewrite real_isdenormal() for use outside of real.cc where the argument is known to be normalized. gcc/ChangeLog: * real.cc (real_isdenormal): New. * real.h (real_isdenormal): Add mode argument. Rewrite for normalized values. * value-range.cc (frange::flush_denormals_to_zero): Pass mode to real_isdenormal. --- gcc/real.cc| 10 ++ gcc/real.h | 7 --- gcc/value-range.cc | 5 +++-- 3 files changed, 17 insertions(+), 5 deletions(-) diff --git a/gcc/real.cc b/gcc/real.cc index aae7c335d59..028aad95ec4 100644 --- a/gcc/real.cc +++ b/gcc/real.cc @@ -111,6 +111,16 @@ static const REAL_VALUE_TYPE * real_digit (int); static void times_pten (REAL_VALUE_TYPE *, int); static void round_for_format (const struct real_format *, REAL_VALUE_TYPE *); + +/* Determine whether a floating-point value X is a denormal. R is + expected to be in denormal form, so this function is only + meaningful after a call to round_for_format. */ + +static inline bool +real_isdenormal (const REAL_VALUE_TYPE *r) +{ + return (r->sig[SIGSZ-1] & SIG_MSB) == 0; +} /* Initialize R with a positive zero. */ diff --git a/gcc/real.h b/gcc/real.h index 306e9593866..b14bcdd3fde 100644 --- a/gcc/real.h +++ b/gcc/real.h @@ -286,11 +286,12 @@ extern bool real_isnan (const REAL_VALUE_TYPE *); /* Determine whether a floating-point value X is a signaling NaN. */ extern bool real_issignaling_nan (const REAL_VALUE_TYPE *); -/* Determine whether a floating-point value X is a denormal. */ +/* Determine whether floating-point value R is a denormal. This + function is only valid for normalized values. */ inline bool -real_isdenormal (const REAL_VALUE_TYPE *r) +real_isdenormal (const REAL_VALUE_TYPE *r, machine_mode mode) { - return r->cl == rvc_normal && (r->sig[SIGSZ-1] & SIG_MSB) == 0; + return r->cl == rvc_normal && REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin; } /* Determine whether a floating-point value X is finite. */ diff --git a/gcc/value-range.cc b/gcc/value-range.cc index a855aaf626c..859c7fb4af9 100644 --- a/gcc/value-range.cc +++ b/gcc/value-range.cc @@ -266,15 +266,16 @@ frange::flush_denormals_to_zero () if (undefined_p () || known_isnan ()) return; + machine_mode mode = TYPE_MODE (type ()); // Flush [x, -DENORMAL] to [x, -0.0]. - if (real_isdenormal (_max) && real_isneg (_max)) + if (real_isdenormal (_max, mode) && real_isneg (_max)) { m_max = dconst0; if (HONOR_SIGNED_ZEROS (m_type)) m_max.sign = 1; } // Flush [+DENORMAL, x] to [+0.0, x]. - if (real_isdenormal (_min) && !real_isneg (_min)) + if (real_isdenormal (_min, mode) && !real_isneg (_min)) m_min = dconst0; } -- 2.38.1
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Mon, Nov 07, 2022 at 04:41:23PM +0100, Aldy Hernandez wrote: > As suggested upthread, I have also adjusted update_nan_sign() to drop > the NAN sign to VARYING if both operands are NAN. As an optimization > I keep the sign if both operands are NAN and have the same sign. For NaNs this still relies on something IEEE754 doesn't guarantee, as I cited, after a binary operation the sign bit of the NaN is unspecified, whether there is one NaN operand or two. It might be that all CPUs handle it the way you've implemented (that for one NaN operand the sign of NaN result will be the same as that NaN operand and for two it will be the sign of one of the two NaNs operands, never something else), but I think we'd need to check more than one implementation for that (I've only tried x86_64 and thus SSE behavior in it), so one would need to test i387 long double behavior too, ARM/AArch64, PowerPC, s390{,x}, RISCV, ... The guarantee given by IEEE754 is only for those copy, negate, abs, copySign operations, so copying values around, NEG_EXPR, ABS_EXPR, __builtin_fabs*, __builtin_copysign*. Otherwise LGTM (but would be nice to get into GCC13 not just +, but also -, *, /, sqrt at least). Jakub
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Mon, Nov 07, 2022 at 04:38:29PM +0100, Aldy Hernandez wrote: > From d214bcdff2cb90ad1eb808d29bda6fb98d510b4c Mon Sep 17 00:00:00 2001 > From: Aldy Hernandez > Date: Mon, 7 Nov 2022 14:18:57 +0100 > Subject: [PATCH] Provide normalized and denormal format version of > real_isdenormal. > > Implement real_isdenormal_target() to be used within real.cc where the > argument is known to be in denormal format. Rewrite real_isdenormal() > for use outside of real.cc where the argument is known to be > normalized. > > gcc/ChangeLog: > > * real.cc (real_isdenormal_target): New. > (encode_ieee_single): Use real_isdenormal_target. > (encode_ieee_double): Same. > (encode_ieee_extended): Same. > (encode_ieee_quad): Same. > (encode_ieee_half): Same. > (encode_arm_bfloat_half): Same. > * value-range.cc (frange::flush_denormals_to_zero): Same. > * real.h (real_isdenormal): Rewrite to look at mode. I'd make real_isdenormal_target static inline bool rather than inline bool, it is only defined in real.cc, so there is no point exporting it. Though, as you've added the mode argument, the real.cc inline could very well also be called real_isdenormal too, it wouldn't be a redeclaration or ODR violation. Jakub
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Fri, Nov 4, 2022 at 8:14 PM Jakub Jelinek wrote: > > On Mon, Oct 17, 2022 at 08:21:01AM +0200, Aldy Hernandez wrote: > > +// Set VALUE to its next real value, or INF if the operation overflows. > > + > > +inline void > > +frange_nextafter (enum machine_mode mode, > > + REAL_VALUE_TYPE , > > + const REAL_VALUE_TYPE ) > > +{ > > + const real_format *fmt = REAL_MODE_FORMAT (mode); > > + REAL_VALUE_TYPE tmp; > > + bool overflow = real_nextafter (, fmt, , ); > > + if (overflow) > > +value = inf; > > + else > > +value = tmp; > > Please change the above 5 lines to just > real_nextafter (, fmt, , ); > value = tmp; Done. As suggested upthread, I have also adjusted update_nan_sign() to drop the NAN sign to VARYING if both operands are NAN. As an optimization I keep the sign if both operands are NAN and have the same sign. How's this? Tested on x86-64 Linux. LAPACK testing as well. Aldy From 7717ad5c495934ff28ea2abd639ab4d91b7ffcab Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Thu, 13 Oct 2022 08:14:16 +0200 Subject: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats. This is the range-op entry for floating point PLUS_EXPR. It's the most intricate range entry we have so far, because we need to keep track of rounding and target FP formats. This will be the last FP entry I commit, mostly to avoid disturbing the tree any further, and also because what we have so far is enough for a solid VRP. So far we track NANs and signs correctly. We also handle relationals (symbolics and numeric), both ordered and unordered, ABS_EXPR and NEGATE_EXPR which are used to fold __builtin_isinf, and __builtin_sign (__builtin_copysign is coming up). All in all, I think this provide more than enough for basic VRP on floats, as well as provide a basis to flesh out the rest if there's interest. My goal with this entry is to provide a template for additional binary operators, as they tend to follow a similar pattern: handle NANs, do the arithmetic while keeping track of rounding, and adjust for NAN. I may abstract the general parts as we do for irange's fold_range and wi_fold. PR tree-optimization/24021 gcc/ChangeLog: * range-op-float.cc (update_nan_sign): New. (propagate_nans): New. (frange_nextafter): New. (frange_arithmetic): New. (class foperator_plus): New. (floating_op_table::floating_op_table): Add PLUS_EXPR entry. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/vrp-float-plus.c: New test. --- gcc/range-op-float.cc | 162 ++ .../gcc.dg/tree-ssa/vrp-float-plus.c | 21 +++ 2 files changed, 183 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc index a1f372997bf..9c09bb35dbe 100644 --- a/gcc/range-op-float.cc +++ b/gcc/range-op-float.cc @@ -192,6 +192,115 @@ frelop_early_resolve (irange , tree type, && relop_early_resolve (r, type, op1, op2, rel, my_rel)); } +// If R contains a NAN of unknown sign, update the NAN's signbit +// depending on two operands. + +inline void +update_nan_sign (frange , const frange , const frange ) +{ + if (!r.maybe_isnan ()) +return; + + bool op1_nan = op1.maybe_isnan (); + bool op2_nan = op2.maybe_isnan (); + bool sign1, sign2; + + gcc_checking_assert (!r.nan_signbit_p (sign1)); + if (op1_nan && op2_nan) +{ + // If both signs agree, use that. Otherwise keep sign VARYING. + if (op1.nan_signbit_p (sign1) && op2.nan_signbit_p (sign2) + && sign1 == sign2) + r.update_nan (sign1); +} + else if (op1_nan) +{ + if (op1.nan_signbit_p (sign1)) + r.update_nan (sign1); +} + else if (op2_nan) +{ + if (op2.nan_signbit_p (sign2)) + r.update_nan (sign2); +} +} + +// If either operand is a NAN, set R to the combination of both NANs +// signwise and return TRUE. + +inline bool +propagate_nans (frange , const frange , const frange ) +{ + if (op1.known_isnan () || op2.known_isnan ()) +{ + r.set_nan (op1.type ()); + update_nan_sign (r, op1, op2); + return true; +} + return false; +} + +// Set VALUE to its next real value, or INF if the operation overflows. + +inline void +frange_nextafter (enum machine_mode mode, + REAL_VALUE_TYPE , + const REAL_VALUE_TYPE ) +{ + const real_format *fmt = REAL_MODE_FORMAT (mode); + REAL_VALUE_TYPE tmp; + real_nextafter (, fmt, , ); + value = tmp; +} + +// Like real_arithmetic, but round the result to INF if the operation +// produced inexact results. +// +// ?? There is still one problematic case, i387. With +// -fexcess-precision=standard we perform most SF/DFmode arithmetic in +// XFmode (long_double_type_node), so that case is OK. But without +// -mfpmath=sse, all the SF/DFmode computations are in XFmode +// precision (64-bit mantiss
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
Fair enough. How's this? Tested on x86-64 Linux. LAPACK regression testing as well. On Mon, Nov 7, 2022 at 1:56 PM Jakub Jelinek wrote: > > On Mon, Nov 07, 2022 at 01:48:28PM +0100, Aldy Hernandez wrote: > > On Mon, Nov 7, 2022 at 1:43 PM Jakub Jelinek wrote: > > > > > > On Mon, Nov 07, 2022 at 01:35:35PM +0100, Aldy Hernandez wrote: > > > > Let me see if I understand you correctly... > > > > > > > > real_isdenormal is always returning false for our uses in frange? So > > > > instead of using real_isdenormal in flush_denormals_to_zero, perhaps > > > > we should be using: > > > > > > > > REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin > > > > > > > > ?? > > > > > > > > Could we perhaps make real_isdenormal always work for all values? Is > > > > that possible? > > > > > > Yes. Or have real_isdenormal_target for the uses in real.cc > > > which would be private to real.cc and would do what real_isdenormal > > > currently does, and then real_isdenormal with the above definition > > > (well, r->cl == rvc_normal && ...). > > > > If the REAL_EXP(r)... definition works for everyone, can't we just use > > that? Or do you think there'd be a measurable performance impact? > > It doesn't work alone. > > Either denormals are in normalized form, then > r->cl == rvc_normal && REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin > is true, or they are in denormal form, then > r->cl == rvc_normal && (r->sig[SIGSZ-1] & SIG_MSB) == 0 > is true. > > You could use > r->cl == rvc_normal > && (REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin > || (r->sig[SIGSZ-1] & SIG_MSB) == 0) > but that would be waste of compile time both in real.cc and outside of > real.cc. > > Jakub > From d214bcdff2cb90ad1eb808d29bda6fb98d510b4c Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Mon, 7 Nov 2022 14:18:57 +0100 Subject: [PATCH] Provide normalized and denormal format version of real_isdenormal. Implement real_isdenormal_target() to be used within real.cc where the argument is known to be in denormal format. Rewrite real_isdenormal() for use outside of real.cc where the argument is known to be normalized. gcc/ChangeLog: * real.cc (real_isdenormal_target): New. (encode_ieee_single): Use real_isdenormal_target. (encode_ieee_double): Same. (encode_ieee_extended): Same. (encode_ieee_quad): Same. (encode_ieee_half): Same. (encode_arm_bfloat_half): Same. * value-range.cc (frange::flush_denormals_to_zero): Same. * real.h (real_isdenormal): Rewrite to look at mode. --- gcc/real.cc| 22 -- gcc/real.h | 7 --- gcc/value-range.cc | 5 +++-- 3 files changed, 23 insertions(+), 11 deletions(-) diff --git a/gcc/real.cc b/gcc/real.cc index aae7c335d59..058b0b8dcb4 100644 --- a/gcc/real.cc +++ b/gcc/real.cc @@ -111,6 +111,16 @@ static const REAL_VALUE_TYPE * real_digit (int); static void times_pten (REAL_VALUE_TYPE *, int); static void round_for_format (const struct real_format *, REAL_VALUE_TYPE *); + +/* Determine whether a floating-point value X is a denormal. R is + expected to be in denormal form, so this function is only + meaningful after a call to round_for_format. */ + +inline bool +real_isdenormal_target (const REAL_VALUE_TYPE *r) +{ + return (r->sig[SIGSZ-1] & SIG_MSB) == 0; +} /* Initialize R with a positive zero. */ @@ -2962,7 +2972,7 @@ encode_ieee_single (const struct real_format *fmt, long *buf, { unsigned long image, sig, exp; unsigned long sign = r->sign; - bool denormal = real_isdenormal (r); + bool denormal = real_isdenormal_target (r); image = sign << 31; sig = (r->sig[SIGSZ-1] >> (HOST_BITS_PER_LONG - 24)) & 0x7f; @@ -3183,7 +3193,7 @@ encode_ieee_double (const struct real_format *fmt, long *buf, { unsigned long image_lo, image_hi, sig_lo, sig_hi, exp; unsigned long sign = r->sign; - bool denormal = real_isdenormal (r); + bool denormal = real_isdenormal_target (r); image_hi = sign << 31; image_lo = 0; @@ -3441,7 +3451,7 @@ encode_ieee_extended (const struct real_format *fmt, long *buf, const REAL_VALUE_TYPE *r) { unsigned long image_hi, sig_hi, sig_lo; - bool denormal = real_isdenormal (r); + bool denormal = real_isdenormal_target (r); image_hi = r->sign << 15; sig_hi = sig_lo = 0; @@ -3972,7 +3982,7 @@ encode_ieee_quad (const struct real_format *fmt, long *buf, { unsigned long image3, image2, image1, image0, exp; unsigned long sign = r->sign; - bool denormal = real_isdenormal (r); + bool denormal = real_isdenormal_target (r); REAL_VALUE_TYPE u; image3 = sign << 31; @@ -4729,7 +4739,7 @@ encode_ieee_half (const struct real_format *fmt, long *buf, { unsigned long image, sig, exp; unsigned long sign = r->sign; - bool denormal = real_isdenormal (r); + bool denormal = real_isdenormal_target (r); image = sign << 15; sig = (r->sig[SIGSZ-1] >> (HOST_BITS_PER_LONG - 11)) & 0x3ff; @@ -4843,7 +4853,7 @@ encode_arm_bfloat_half (const struct real_format *fmt, long *buf, {
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Mon, Nov 07, 2022 at 01:48:28PM +0100, Aldy Hernandez wrote: > On Mon, Nov 7, 2022 at 1:43 PM Jakub Jelinek wrote: > > > > On Mon, Nov 07, 2022 at 01:35:35PM +0100, Aldy Hernandez wrote: > > > Let me see if I understand you correctly... > > > > > > real_isdenormal is always returning false for our uses in frange? So > > > instead of using real_isdenormal in flush_denormals_to_zero, perhaps > > > we should be using: > > > > > > REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin > > > > > > ?? > > > > > > Could we perhaps make real_isdenormal always work for all values? Is > > > that possible? > > > > Yes. Or have real_isdenormal_target for the uses in real.cc > > which would be private to real.cc and would do what real_isdenormal > > currently does, and then real_isdenormal with the above definition > > (well, r->cl == rvc_normal && ...). > > If the REAL_EXP(r)... definition works for everyone, can't we just use > that? Or do you think there'd be a measurable performance impact? It doesn't work alone. Either denormals are in normalized form, then r->cl == rvc_normal && REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin is true, or they are in denormal form, then r->cl == rvc_normal && (r->sig[SIGSZ-1] & SIG_MSB) == 0 is true. You could use r->cl == rvc_normal && (REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin || (r->sig[SIGSZ-1] & SIG_MSB) == 0) but that would be waste of compile time both in real.cc and outside of real.cc. Jakub
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Mon, Nov 7, 2022 at 1:43 PM Jakub Jelinek wrote: > > On Mon, Nov 07, 2022 at 01:35:35PM +0100, Aldy Hernandez wrote: > > Let me see if I understand you correctly... > > > > real_isdenormal is always returning false for our uses in frange? So > > instead of using real_isdenormal in flush_denormals_to_zero, perhaps > > we should be using: > > > > REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin > > > > ?? > > > > Could we perhaps make real_isdenormal always work for all values? Is > > that possible? > > Yes. Or have real_isdenormal_target for the uses in real.cc > which would be private to real.cc and would do what real_isdenormal > currently does, and then real_isdenormal with the above definition > (well, r->cl == rvc_normal && ...). If the REAL_EXP(r)... definition works for everyone, can't we just use that? Or do you think there'd be a measurable performance impact? Aldy
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Mon, Nov 07, 2022 at 01:35:35PM +0100, Aldy Hernandez wrote: > Let me see if I understand you correctly... > > real_isdenormal is always returning false for our uses in frange? So > instead of using real_isdenormal in flush_denormals_to_zero, perhaps > we should be using: > > REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin > > ?? > > Could we perhaps make real_isdenormal always work for all values? Is > that possible? Yes. Or have real_isdenormal_target for the uses in real.cc which would be private to real.cc and would do what real_isdenormal currently does, and then real_isdenormal with the above definition (well, r->cl == rvc_normal && ...). Jakub
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Fri, Nov 4, 2022 at 8:53 PM Jakub Jelinek wrote: > > On Fri, Nov 04, 2022 at 08:14:23PM +0100, Jakub Jelinek via Gcc-patches wrote: > > One thing that is clear is that real_isdenormal is never true for these > > (but then a question is if flush_denormals_to_zero actually works). > > So, after some more investigation, I think that is actually the case, > real_isdenormal is only meaningful (will ever return true) right after > round_for_format. > The uses inside of real.cc are fine, real_to_target first calls > round_for_format and then calls fmt->encode in which the real_isdenormal > calls are done. And, round_for_format is what transforms the > normalized way of expressing the number into the denormal form: > /* Check the range of the exponent. If we're out of range, > either underflow or overflow. */ > if (REAL_EXP (r) > emax2) > goto overflow; > else if (REAL_EXP (r) <= emin2m1) > { > int diff; > > if (!fmt->has_denorm) > { > /* Don't underflow completely until we've had a chance to round. */ > if (REAL_EXP (r) < emin2m1) > goto underflow; > } > else > { > diff = emin2m1 - REAL_EXP (r) + 1; > if (diff > p2) > goto underflow; > > /* De-normalize the significand. */ > r->sig[0] |= sticky_rshift_significand (r, r, diff); > SET_REAL_EXP (r, REAL_EXP (r) + diff); > } > } > But, real_to_target is one of the 4 callers of static round_for_format, > another one is real_convert, but that one undoes that immediately: > round_for_format (fmt, r); > > /* Make resulting NaN value to be qNaN. The caller has the > responsibility to avoid the operation if flag_signaling_nans > is on. */ > if (r->cl == rvc_nan) > r->signalling = 0; > > /* round_for_format de-normalizes denormals. Undo just that part. */ > if (r->cl == rvc_normal) > normalize (r); > and the last two are inside of encoding the IBM double double composite. > So, I think everywhere in the frange you'll see normalized REAL_VALUE_TYPE > and so real_isdenormal will always be false there. > You need to check for REAL_EXP (r) < fmt->emin or so (and ideally only on > REAL_VALUE_TYPE already real_converted to the right mode (your > frange_arithmetics does that already, which is good, but say conversions > and other unary ops might need it too). Let me see if I understand you correctly... real_isdenormal is always returning false for our uses in frange? So instead of using real_isdenormal in flush_denormals_to_zero, perhaps we should be using: REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin ?? Could we perhaps make real_isdenormal always work for all values? Is that possible? Aldy
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Fri, Nov 04, 2022 at 08:14:23PM +0100, Jakub Jelinek via Gcc-patches wrote: > One thing that is clear is that real_isdenormal is never true for these > (but then a question is if flush_denormals_to_zero actually works). So, after some more investigation, I think that is actually the case, real_isdenormal is only meaningful (will ever return true) right after round_for_format. The uses inside of real.cc are fine, real_to_target first calls round_for_format and then calls fmt->encode in which the real_isdenormal calls are done. And, round_for_format is what transforms the normalized way of expressing the number into the denormal form: /* Check the range of the exponent. If we're out of range, either underflow or overflow. */ if (REAL_EXP (r) > emax2) goto overflow; else if (REAL_EXP (r) <= emin2m1) { int diff; if (!fmt->has_denorm) { /* Don't underflow completely until we've had a chance to round. */ if (REAL_EXP (r) < emin2m1) goto underflow; } else { diff = emin2m1 - REAL_EXP (r) + 1; if (diff > p2) goto underflow; /* De-normalize the significand. */ r->sig[0] |= sticky_rshift_significand (r, r, diff); SET_REAL_EXP (r, REAL_EXP (r) + diff); } } But, real_to_target is one of the 4 callers of static round_for_format, another one is real_convert, but that one undoes that immediately: round_for_format (fmt, r); /* Make resulting NaN value to be qNaN. The caller has the responsibility to avoid the operation if flag_signaling_nans is on. */ if (r->cl == rvc_nan) r->signalling = 0; /* round_for_format de-normalizes denormals. Undo just that part. */ if (r->cl == rvc_normal) normalize (r); and the last two are inside of encoding the IBM double double composite. So, I think everywhere in the frange you'll see normalized REAL_VALUE_TYPE and so real_isdenormal will always be false there. You need to check for REAL_EXP (r) < fmt->emin or so (and ideally only on REAL_VALUE_TYPE already real_converted to the right mode (your frange_arithmetics does that already, which is good, but say conversions and other unary ops might need it too). If I in the hack from last mail replace - fprintf (stderr, "%d %d %s %s\n", i, real_isdenormal (), buf, buf2); + fprintf (stderr, "%d %d %s %s\n", i, REAL_EXP () < REAL_MODE_FORMAT (mode)->emin, buf, buf2); then it is 1 until: 102 1 0x0.8p-971 0x0.88p-971 103 1 0x0.8p-970 0x0.88p-970 104 1 0x0.8p-969 0x0.88p-969 105 0 0x0.8p-968 0x0.88p-968 106 0 0x0.8p-967 0x0.88p-967 107 0 0x0.8p-966 0x0.88p-966 Now, the __LDBL_MIN__ powerpc64 gcc announces is 2.00416836000897277799610805135016205e-292L which is equivalent to: 0x1p-969 and that is equivalent to 0x0.8p-968, so I think that is exactly what we want. Jakub
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Mon, Oct 17, 2022 at 08:21:01AM +0200, Aldy Hernandez wrote: > +// Set VALUE to its next real value, or INF if the operation overflows. > + > +inline void > +frange_nextafter (enum machine_mode mode, > + REAL_VALUE_TYPE , > + const REAL_VALUE_TYPE ) > +{ > + const real_format *fmt = REAL_MODE_FORMAT (mode); > + REAL_VALUE_TYPE tmp; > + bool overflow = real_nextafter (, fmt, , ); > + if (overflow) > +value = inf; > + else > +value = tmp; Please change the above 5 lines to just real_nextafter (, fmt, , ); value = tmp; What real_nextafter returns isn't bool whether it overflowed, but a bool whether some exception should be raised for the nextafter, which can be either an overflow (in that case tmp will be already set to infinity of the right sign: if (REAL_EXP (r) > fmt->emax) { get_inf (r, x->sign); return true; } or underflow to zero: if (REAL_EXP (r) <= fmt->emin - fmt->p) { get_zero (r, x->sign); return true; } or when the result is subnormal: return r->cl == rvc_zero || REAL_EXP (r) < fmt->emin; But we shouldn't care about exceptions, we aren't emitting code, just estimating floating point range, and +-inf, or +-0, or subnormal are the right answers. Just for playing with this, I've tried: --- range-op-float.cc.jj2022-11-02 10:06:02.620960861 +0100 +++ range-op-float.cc 2022-11-04 19:49:14.188075070 +0100 @@ -1815,6 +1815,20 @@ frange_float (const char *lb, const char return frange (type, min, max); } +inline void +frange_nextafter (enum machine_mode mode, + REAL_VALUE_TYPE , + const REAL_VALUE_TYPE ) +{ + const real_format *fmt = REAL_MODE_FORMAT (mode); + REAL_VALUE_TYPE tmp; + bool overflow = real_nextafter (, fmt, , ); + if (overflow) +value = inf; + else + value = tmp; +} + void range_op_float_tests () { @@ -1833,6 +1847,30 @@ range_op_float_tests () r1 = frange_float ("-1", "-0"); r1.update_nan (false); ASSERT_EQ (r, r1); + + if (MODE_COMPOSITE_P (TYPE_MODE (long_double_type_node))) +{ + enum machine_mode mode = TYPE_MODE (long_double_type_node); + REAL_VALUE_TYPE result = dconst0; + { + REAL_VALUE_TYPE tmp; + real_convert (, DFmode, ); + frange_nextafter (DFmode, tmp, dconstinf); + real_convert (, mode, ); + } + for (int i = 0; i < 108; ++i) + { + char buf[1024], buf2[1024]; + REAL_VALUE_TYPE tmp, tmp2; + real_convert (, DFmode, ); + frange_nextafter (DFmode, tmp, dconstinf); + real_convert (, mode, ); + real_to_hexadecimal (buf, , 1024, 0, 1); + real_to_hexadecimal (buf2, , 1024, 0, 1); + fprintf (stderr, "%d %d %s %s\n", i, real_isdenormal (), buf, buf2); + real_arithmetic (, MULT_EXPR, , ); + } +} } } // namespace selftest in cross to
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Mon, Oct 17, 2022 at 08:21:01AM +0200, Aldy Hernandez wrote: > --- a/gcc/range-op-float.cc > +++ b/gcc/range-op-float.cc > @@ -200,6 +200,116 @@ frelop_early_resolve (irange , tree type, > && relop_early_resolve (r, type, op1, op2, rel, my_rel)); > } > > +// If R contains a NAN of unknown sign, update the NAN's signbit > +// depending on two operands. > + > +inline void > +update_nan_sign (frange , const frange , const frange ) > +{ > + if (!r.maybe_isnan ()) > +return; > + > + bool op1_nan = op1.maybe_isnan (); > + bool op2_nan = op2.maybe_isnan (); > + bool sign1, sign2; > + > + gcc_checking_assert (!r.nan_signbit_p (sign1)); > + if (op1_nan && op2_nan) > +{ > + if (op1.nan_signbit_p (sign1) && op2.nan_signbit_p (sign2)) > + r.update_nan (sign1 | sign2); > +} > + else if (op1_nan) > +{ > + if (op1.nan_signbit_p (sign1)) > + r.update_nan (sign1); > +} > + else if (op2_nan) > +{ > + if (op2.nan_signbit_p (sign2)) > + r.update_nan (sign2); > +} > +} IEEE 754-2008 says: "When either an input or result is NaN, this standard does not interpret the sign of a NaN. Note, however, that operations on bit strings — copy, negate, abs, copySign — specify the sign bit of a NaN result, sometimes based upon the sign bit of a NaN operand. The logical predicate totalOrder is also affected by the sign bit of a NaN operand. For all other operations, this standard does not specify the sign bit of a NaN result, even when there is only one input NaN, or when the NaN is produced from an invalid operation." so, while one can e.g. see on x86_64 that in simple -O0 int main () { volatile float f1 = __builtin_nansf (""); volatile float f2 = __builtin_copysignf (__builtin_nansf (""), -1.0f); volatile float f3 = __builtin_nanf (""); volatile float f4 = __builtin_copysignf (__builtin_nanf (""), -1.0f); volatile float fzero = 0.0f; __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *)); __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *)); f1 = -f1; f2 = -f2; f3 = -f3; f4 = -f4; __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *)); __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *)); volatile float f5 = f1 + fzero; volatile float f6 = fzero + f1; __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *)); f5 = f2 + fzero; f6 = fzero + f2; __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *)); f5 = f3 + fzero; f6 = fzero + f3; __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *)); f5 = f4 + fzero; f6 = fzero + f4; __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *)); f5 = f1 + f2; f6 = f2 + f1; __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *)); f5 = f2 + f3; f6 = f3 + f2; __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *)); f5 = f3 + f4; f6 = f4 + f3; __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *)); f5 = f4 + f1; f6 = f1 + f4; __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *)); return 0; } result of: 7fa0 ffa0 7fc0 ffc0 ffa0 7fa0 ffc0 7fc0 ffe0 ffe0 7fe0 7fe0 ffc0 ffc0 7fc0 7fc0 7fe0 ffe0 ffc0 7fe0 7fc0 ffc0 ffe0 7fc0 which basically shows that copysign copies sign bit, negation toggles it, binary operation with a single NaN (quiet or signaling) get that NaN with its sign and for binary operation on 2 NaNs (again quiet or signaling) one gets sign from the second? operand, I think the above IEEE text doesn't guarantee it except for a simple assignment (but with no mode conversion; that one copies the bit), NEGATE_EXPR (toggles it), ABS_EXPR (clears it), __builtin_copysign*/IFN_COPYSIGN (copies it from the second operand). Everything else, including invalid operation cases, should set the possible sign bit values of NANs to 0/1 rather than just one of them. Perhaps COND_EXPR 2nd/3rd operand is a move too. As you are adding a binary operation, that should be one of those cases where we drop the NAN sign to VARYING. > + > +// If either operand is a NAN, set R to the combination of both NANs > +// signwise and return TRUE. > + > +inline bool > +propagate_nans (frange , const frange , const frange ) > +{ > + if (op1.known_isnan () || op2.known_isnan ()) > +{ > + r.set_nan (op1.type ()); > + update_nan_sign (r, op1, op2); > + return true; > +} > + return false; > +} > + > +// Set VALUE to its next real value, or INF if the operation overflows. > + > +inline void > +frange_nextafter (enum machine_mode mode, > + REAL_VALUE_TYPE , > + const REAL_VALUE_TYPE ) > +{ > + const real_format *fmt = REAL_MODE_FORMAT (mode); > + REAL_VALUE_TYPE tmp; > + bool overflow = real_nextafter (, fmt, , ); > + if (overflow) > +value = inf; > + else > +value = tmp; > +} > + > +// Like real_arithmetic, but round the result to INF
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
Ping ping On Mon, Oct 24, 2022, 08:04 Aldy Hernandez wrote: > PING > > On Mon, Oct 17, 2022 at 8:21 AM Aldy Hernandez wrote: > > > > On Thu, Oct 13, 2022 at 7:57 PM Jakub Jelinek wrote: > > > > > > On Thu, Oct 13, 2022 at 02:36:49PM +0200, Aldy Hernandez wrote: > > > > +// Like real_arithmetic, but round the result to INF if the > operation > > > > +// produced inexact results. > > > > +// > > > > +// ?? There is still one problematic case, i387. With > > > > +// -fexcess-precision=standard we perform most SF/DFmode arithmetic > in > > > > +// XFmode (long_double_type_node), so that case is OK. But without > > > > +// -mfpmath=sse, all the SF/DFmode computations are in XFmode > > > > +// precision (64-bit mantissa) and only occassionally rounded to > > > > +// SF/DFmode (when storing into memory from the 387 stack). Maybe > > > > +// this is ok as well though it is just occassionally more precise. > ?? > > > > + > > > > +static void > > > > +frange_arithmetic (enum tree_code code, tree type, > > > > +REAL_VALUE_TYPE , > > > > +const REAL_VALUE_TYPE , > > > > +const REAL_VALUE_TYPE , > > > > +const REAL_VALUE_TYPE ) > > > > +{ > > > > + REAL_VALUE_TYPE value; > > > > + enum machine_mode mode = TYPE_MODE (type); > > > > + bool mode_composite = MODE_COMPOSITE_P (mode); > > > > + > > > > + bool inexact = real_arithmetic (, code, , ); > > > > + real_convert (, mode, ); > > > > + > > > > + // If real_convert above has rounded an inexact value to towards > > > > + // inf, we can keep the result as is, otherwise we'll adjust by 1 > ulp > > > > + // later (real_nextafter). > > > > + bool rounding = (flag_rounding_math > > > > +&& (real_isneg () > > > > +? real_less (, ) > > > > +: !real_less (, ))); > > > > > > I thought the agreement during Cauldron was that we'd do this always, > > > regardless of flag_rounding_math. > > > Because excess precision (the fast one like on ia32 or -mfpmath=387 on > > > x86_64), or -frounding-math, or FMA contraction can all increase > precision > > > and worst case it all behaves like -frounding-math for the ranges. > > > > > > So, perhaps use: > > > if ((mode_composite || (real_isneg () ? real_less (, > ) > > > : !real_less (, > )) > > > && (inexact || !real_identical (, > > > > Done. > > > > > ? > > > No need to do the real_isneg/real_less stuff for mode_composite, then > > > we do it always for inexacts, but otherwise we check if the rounding > > > performed by real.cc has been in the conservative direction (for upper > > > bound to +inf, for lower bound to -inf), if yes, we don't need to do > > > anything, if yes, we frange_nextafter. > > > > > > As discussed, for mode_composite, I think we want to do the extra > > > stuff for inexact denormals and otherwise do the nextafter > unconditionally, > > > because our internal mode_composite representation isn't precise > enough. > > > > > > > + // Be extra careful if there may be discrepancies between the > > > > + // compile and runtime results. > > > > + if ((rounding || mode_composite) > > > > + && (inexact || !real_identical (, ))) > > > > +{ > > > > + if (mode_composite) > > > > + { > > > > + bool denormal = (result.sig[SIGSZ-1] & SIG_MSB) == 0; > > > > > > Use real_isdenormal here? > > > > Done. > > > > > Though, real_iszero needs the same thing. > > > > So... real_isdenormal() || real_iszero() as in the attached patch? > > > > > > > > > + if (denormal) > > > > + { > > > > + REAL_VALUE_TYPE tmp; > > > > > > And explain here why is this, that IBM extended denormals have just > > > DFmode precision. > > > > Done. > > > > > Though, now that I think about it, while this is correct for denormals, > > > > > > > + real_convert (, DFmode, ); > > > > + frange_nextafter (DFmode, tmp, inf); > > > > + real_convert (, mode, ); > > > > + } > > > > > > there are also the cases where the higher double exponent is in the > > > [__DBL_MIN_EXP__, __LDBL_MIN_EXP__] aka [-1021, -968] or so. > > > https://en.wikipedia.org/wiki/Double-precision_floating-point_format > > > If the upper double is denormal in the DFmode sense, so smaller > absolute > > > value than __DBL_MIN__, then doing nextafter in DFmode is the right > thing to > > > do, the lower double must be always +/- zero. > > > Now, if the result is __DBL_MIN__, the upper double is already > normalized > > > but we can add __DBL_DENORM_MIN__ to it, which will make the number > have > > > 54-bit precision. > > > If the result is __DBL_MIN__ * 2, we can again add __DBL_DENORM_MIN__ > > > and make it 55-bit precision. Etc. until we reach __DBL_MIN__ * 2e53 > > > where it acts like fully normalized 106-bit precision number. > > > I must say I'm not really sure what real_nextafter is doing in those > cases, > > > I'm
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On 10/24/22 00:04, Aldy Hernandez via Gcc-patches wrote: PING I'd be a lot more comfortable if Jakub would chime in here. Jeff
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
PING On Mon, Oct 17, 2022 at 8:21 AM Aldy Hernandez wrote: > > On Thu, Oct 13, 2022 at 7:57 PM Jakub Jelinek wrote: > > > > On Thu, Oct 13, 2022 at 02:36:49PM +0200, Aldy Hernandez wrote: > > > +// Like real_arithmetic, but round the result to INF if the operation > > > +// produced inexact results. > > > +// > > > +// ?? There is still one problematic case, i387. With > > > +// -fexcess-precision=standard we perform most SF/DFmode arithmetic in > > > +// XFmode (long_double_type_node), so that case is OK. But without > > > +// -mfpmath=sse, all the SF/DFmode computations are in XFmode > > > +// precision (64-bit mantissa) and only occassionally rounded to > > > +// SF/DFmode (when storing into memory from the 387 stack). Maybe > > > +// this is ok as well though it is just occassionally more precise. ?? > > > + > > > +static void > > > +frange_arithmetic (enum tree_code code, tree type, > > > +REAL_VALUE_TYPE , > > > +const REAL_VALUE_TYPE , > > > +const REAL_VALUE_TYPE , > > > +const REAL_VALUE_TYPE ) > > > +{ > > > + REAL_VALUE_TYPE value; > > > + enum machine_mode mode = TYPE_MODE (type); > > > + bool mode_composite = MODE_COMPOSITE_P (mode); > > > + > > > + bool inexact = real_arithmetic (, code, , ); > > > + real_convert (, mode, ); > > > + > > > + // If real_convert above has rounded an inexact value to towards > > > + // inf, we can keep the result as is, otherwise we'll adjust by 1 ulp > > > + // later (real_nextafter). > > > + bool rounding = (flag_rounding_math > > > +&& (real_isneg () > > > +? real_less (, ) > > > +: !real_less (, ))); > > > > I thought the agreement during Cauldron was that we'd do this always, > > regardless of flag_rounding_math. > > Because excess precision (the fast one like on ia32 or -mfpmath=387 on > > x86_64), or -frounding-math, or FMA contraction can all increase precision > > and worst case it all behaves like -frounding-math for the ranges. > > > > So, perhaps use: > > if ((mode_composite || (real_isneg () ? real_less (, ) > > : !real_less (, )) > > && (inexact || !real_identical (, > > Done. > > > ? > > No need to do the real_isneg/real_less stuff for mode_composite, then > > we do it always for inexacts, but otherwise we check if the rounding > > performed by real.cc has been in the conservative direction (for upper > > bound to +inf, for lower bound to -inf), if yes, we don't need to do > > anything, if yes, we frange_nextafter. > > > > As discussed, for mode_composite, I think we want to do the extra > > stuff for inexact denormals and otherwise do the nextafter unconditionally, > > because our internal mode_composite representation isn't precise enough. > > > > > + // Be extra careful if there may be discrepancies between the > > > + // compile and runtime results. > > > + if ((rounding || mode_composite) > > > + && (inexact || !real_identical (, ))) > > > +{ > > > + if (mode_composite) > > > + { > > > + bool denormal = (result.sig[SIGSZ-1] & SIG_MSB) == 0; > > > > Use real_isdenormal here? > > Done. > > > Though, real_iszero needs the same thing. > > So... real_isdenormal() || real_iszero() as in the attached patch? > > > > > > + if (denormal) > > > + { > > > + REAL_VALUE_TYPE tmp; > > > > And explain here why is this, that IBM extended denormals have just > > DFmode precision. > > Done. > > > Though, now that I think about it, while this is correct for denormals, > > > > > + real_convert (, DFmode, ); > > > + frange_nextafter (DFmode, tmp, inf); > > > + real_convert (, mode, ); > > > + } > > > > there are also the cases where the higher double exponent is in the > > [__DBL_MIN_EXP__, __LDBL_MIN_EXP__] aka [-1021, -968] or so. > > https://en.wikipedia.org/wiki/Double-precision_floating-point_format > > If the upper double is denormal in the DFmode sense, so smaller absolute > > value than __DBL_MIN__, then doing nextafter in DFmode is the right thing to > > do, the lower double must be always +/- zero. > > Now, if the result is __DBL_MIN__, the upper double is already normalized > > but we can add __DBL_DENORM_MIN__ to it, which will make the number have > > 54-bit precision. > > If the result is __DBL_MIN__ * 2, we can again add __DBL_DENORM_MIN__ > > and make it 55-bit precision. Etc. until we reach __DBL_MIN__ * 2e53 > > where it acts like fully normalized 106-bit precision number. > > I must say I'm not really sure what real_nextafter is doing in those cases, > > I'm afraid it doesn't handle it correctly but the only other use > > of real_nextafter is guarded with: > > /* Don't handle composite modes, nor decimal, nor modes without > > inf or denorm at least for now. */ > > if (format->pnan < format->p > > || format->b == 10 > > || !format->has_inf > >
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
ite modes, nor decimal, nor modes without > inf or denorm at least for now. */ > if (format->pnan < format->p > || format->b == 10 > || !format->has_inf > || !format->has_denorm) > return false; Dunno. Is there a conservative thing we can do for mode_composites that aren't denormal or zero? How does this look? Aldy From d7be6caf60133bc39f8224e2f0e00dabcdbbe55d Mon Sep 17 00:00:00 2001 From: Aldy Hernandez Date: Thu, 13 Oct 2022 08:14:16 +0200 Subject: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats. This is the range-op entry for floating point PLUS_EXPR. It's the most intricate range entry we have so far, because we need to keep track of rounding and target FP formats. This will be the last FP entry I commit, mostly to avoid disturbing the tree any further, and also because what we have so far is enough for a solid VRP. So far we track NANs and signs correctly. We also handle relationals (symbolics and numeric), both ordered and unordered, ABS_EXPR and NEGATE_EXPR which are used to fold __builtin_isinf, and __builtin_sign (__builtin_copysign is coming up). All in all, I think this provide more than enough for basic VRP on floats, as well as provide a basis to flesh out the rest if there's interest. My goal with this entry is to provide a template for additional binary operators, as they tend to follow a similar pattern: handle NANs, do the arithmetic while keeping track of rounding, and adjust for NAN. I may abstract the general parts as we do for irange's fold_range and wi_fold. PR tree-optimization/24021 gcc/ChangeLog: * range-op-float.cc (update_nan_sign): New. (propagate_nans): New. (frange_nextafter): New. (frange_arithmetic): New. (class foperator_plus): New. (floating_op_table::floating_op_table): Add PLUS_EXPR entry. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/vrp-float-plus.c: New test. --- gcc/range-op-float.cc | 163 ++ .../gcc.dg/tree-ssa/vrp-float-plus.c | 21 +++ 2 files changed, 184 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc index 23e0f5ef4e2..7ffd5795e17 100644 --- a/gcc/range-op-float.cc +++ b/gcc/range-op-float.cc @@ -200,6 +200,116 @@ frelop_early_resolve (irange , tree type, && relop_early_resolve (r, type, op1, op2, rel, my_rel)); } +// If R contains a NAN of unknown sign, update the NAN's signbit +// depending on two operands. + +inline void +update_nan_sign (frange , const frange , const frange ) +{ + if (!r.maybe_isnan ()) +return; + + bool op1_nan = op1.maybe_isnan (); + bool op2_nan = op2.maybe_isnan (); + bool sign1, sign2; + + gcc_checking_assert (!r.nan_signbit_p (sign1)); + if (op1_nan && op2_nan) +{ + if (op1.nan_signbit_p (sign1) && op2.nan_signbit_p (sign2)) + r.update_nan (sign1 | sign2); +} + else if (op1_nan) +{ + if (op1.nan_signbit_p (sign1)) + r.update_nan (sign1); +} + else if (op2_nan) +{ + if (op2.nan_signbit_p (sign2)) + r.update_nan (sign2); +} +} + +// If either operand is a NAN, set R to the combination of both NANs +// signwise and return TRUE. + +inline bool +propagate_nans (frange , const frange , const frange ) +{ + if (op1.known_isnan () || op2.known_isnan ()) +{ + r.set_nan (op1.type ()); + update_nan_sign (r, op1, op2); + return true; +} + return false; +} + +// Set VALUE to its next real value, or INF if the operation overflows. + +inline void +frange_nextafter (enum machine_mode mode, + REAL_VALUE_TYPE , + const REAL_VALUE_TYPE ) +{ + const real_format *fmt = REAL_MODE_FORMAT (mode); + REAL_VALUE_TYPE tmp; + bool overflow = real_nextafter (, fmt, , ); + if (overflow) +value = inf; + else +value = tmp; +} + +// Like real_arithmetic, but round the result to INF if the operation +// produced inexact results. +// +// ?? There is still one problematic case, i387. With +// -fexcess-precision=standard we perform most SF/DFmode arithmetic in +// XFmode (long_double_type_node), so that case is OK. But without +// -mfpmath=sse, all the SF/DFmode computations are in XFmode +// precision (64-bit mantissa) and only occassionally rounded to +// SF/DFmode (when storing into memory from the 387 stack). Maybe +// this is ok as well though it is just occassionally more precise. ?? + +static void +frange_arithmetic (enum tree_code code, tree type, + REAL_VALUE_TYPE , + const REAL_VALUE_TYPE , + const REAL_VALUE_TYPE , + const REAL_VALUE_TYPE ) +{ + REAL_VALUE_TYPE value; + enum machine_mode mode = TYPE_MODE (type); + bool mode_composite = MODE_COMPOSITE_P (mode); + + bool inexact = real_arithmetic (, code, , ); + real_convert (, mode, ); + + // Be extra careful if there may be discrepancies between the + // compile and runtime results. + if ((mode_composite || (real_i
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Thu, Oct 13, 2022 at 3:52 PM Toon Moene wrote: > > It was just a comment on the code of the PR ... Ahh... the patch adds support for ranger handling of floating point PLUS_EXPRs in the IL. The PR's testcase is just one of the many things it will be able to solve. Aldy > > Toon. > > On 10/13/22 15:44, Aldy Hernandez wrote: > > > I'm not following. My patch doesn't affect this behavior. > > > > What am I missing? > > > > Aldy > > > > On Thu, Oct 13, 2022 at 3:04 PM Toon Moene wrote: > >> > >> On 10/13/22 14:36, Aldy Hernandez via Gcc-patches wrote: > >> > >>>PR tree-optimization/24021 > >> > >> Ah - Verboten in Fortran: > >> > >> $ cat d.f > >> DOUBLE PRECISION A, X > >> A = 0.0 > >> DO X = 0.1, 1.0 > >>A = A + X > >> ENDDO > >> END > >> $ gfortran d.f > >> d.f:3:9: > >> > >> 3 | DO X = 0.1, 1.0 > >> | 1 > >> Warning: Deleted feature: Loop variable at (1) must be integer > >> d.f:3:12: > >> > >> 3 | DO X = 0.1, 1.0 > >> |1 > >> Warning: Deleted feature: Start expression in DO loop at (1) must be > >> integer > >> d.f:3:17: > >> > >> 3 | DO X = 0.1, 1.0 > >> | 1 > >> Warning: Deleted feature: End expression in DO loop at (1) must be integer > >> > >> -- > >> Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 > >> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands > >> > > > > -- > Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 > Saturnushof 14, 3738 XG Maartensdijk, The Netherlands >
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On Thu, Oct 13, 2022 at 02:36:49PM +0200, Aldy Hernandez wrote: > +// Like real_arithmetic, but round the result to INF if the operation > +// produced inexact results. > +// > +// ?? There is still one problematic case, i387. With > +// -fexcess-precision=standard we perform most SF/DFmode arithmetic in > +// XFmode (long_double_type_node), so that case is OK. But without > +// -mfpmath=sse, all the SF/DFmode computations are in XFmode > +// precision (64-bit mantissa) and only occassionally rounded to > +// SF/DFmode (when storing into memory from the 387 stack). Maybe > +// this is ok as well though it is just occassionally more precise. ?? > + > +static void > +frange_arithmetic (enum tree_code code, tree type, > +REAL_VALUE_TYPE , > +const REAL_VALUE_TYPE , > +const REAL_VALUE_TYPE , > +const REAL_VALUE_TYPE ) > +{ > + REAL_VALUE_TYPE value; > + enum machine_mode mode = TYPE_MODE (type); > + bool mode_composite = MODE_COMPOSITE_P (mode); > + > + bool inexact = real_arithmetic (, code, , ); > + real_convert (, mode, ); > + > + // If real_convert above has rounded an inexact value to towards > + // inf, we can keep the result as is, otherwise we'll adjust by 1 ulp > + // later (real_nextafter). > + bool rounding = (flag_rounding_math > +&& (real_isneg () > +? real_less (, ) > +: !real_less (, ))); I thought the agreement during Cauldron was that we'd do this always, regardless of flag_rounding_math. Because excess precision (the fast one like on ia32 or -mfpmath=387 on x86_64), or -frounding-math, or FMA contraction can all increase precision and worst case it all behaves like -frounding-math for the ranges. So, perhaps use: if ((mode_composite || (real_isneg () ? real_less (, ) : !real_less (, )) && (inexact || !real_identical (, ? No need to do the real_isneg/real_less stuff for mode_composite, then we do it always for inexacts, but otherwise we check if the rounding performed by real.cc has been in the conservative direction (for upper bound to +inf, for lower bound to -inf), if yes, we don't need to do anything, if yes, we frange_nextafter. As discussed, for mode_composite, I think we want to do the extra stuff for inexact denormals and otherwise do the nextafter unconditionally, because our internal mode_composite representation isn't precise enough. > + // Be extra careful if there may be discrepancies between the > + // compile and runtime results. > + if ((rounding || mode_composite) > + && (inexact || !real_identical (, ))) > +{ > + if (mode_composite) > + { > + bool denormal = (result.sig[SIGSZ-1] & SIG_MSB) == 0; Use real_isdenormal here? Though, real_iszero needs the same thing. > + if (denormal) > + { > + REAL_VALUE_TYPE tmp; And explain here why is this, that IBM extended denormals have just DFmode precision. Though, now that I think about it, while this is correct for denormals, > + real_convert (, DFmode, ); > + frange_nextafter (DFmode, tmp, inf); > + real_convert (, mode, ); > + } there are also the cases where the higher double exponent is in the [__DBL_MIN_EXP__, __LDBL_MIN_EXP__] aka [-1021, -968] or so. https://en.wikipedia.org/wiki/Double-precision_floating-point_format If the upper double is denormal in the DFmode sense, so smaller absolute value than __DBL_MIN__, then doing nextafter in DFmode is the right thing to do, the lower double must be always +/- zero. Now, if the result is __DBL_MIN__, the upper double is already normalized but we can add __DBL_DENORM_MIN__ to it, which will make the number have 54-bit precision. If the result is __DBL_MIN__ * 2, we can again add __DBL_DENORM_MIN__ and make it 55-bit precision. Etc. until we reach __DBL_MIN__ * 2e53 where it acts like fully normalized 106-bit precision number. I must say I'm not really sure what real_nextafter is doing in those cases, I'm afraid it doesn't handle it correctly but the only other use of real_nextafter is guarded with: /* Don't handle composite modes, nor decimal, nor modes without inf or denorm at least for now. */ if (format->pnan < format->p || format->b == 10 || !format->has_inf || !format->has_denorm) return false; so it isn't that big deal except for ranges. Jakub
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
It was just a comment on the code of the PR ... Toon. On 10/13/22 15:44, Aldy Hernandez wrote: I'm not following. My patch doesn't affect this behavior. What am I missing? Aldy On Thu, Oct 13, 2022 at 3:04 PM Toon Moene wrote: On 10/13/22 14:36, Aldy Hernandez via Gcc-patches wrote: PR tree-optimization/24021 Ah - Verboten in Fortran: $ cat d.f DOUBLE PRECISION A, X A = 0.0 DO X = 0.1, 1.0 A = A + X ENDDO END $ gfortran d.f d.f:3:9: 3 | DO X = 0.1, 1.0 | 1 Warning: Deleted feature: Loop variable at (1) must be integer d.f:3:12: 3 | DO X = 0.1, 1.0 |1 Warning: Deleted feature: Start expression in DO loop at (1) must be integer d.f:3:17: 3 | DO X = 0.1, 1.0 | 1 Warning: Deleted feature: End expression in DO loop at (1) must be integer -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
I'm not following. My patch doesn't affect this behavior. What am I missing? Aldy On Thu, Oct 13, 2022 at 3:04 PM Toon Moene wrote: > > On 10/13/22 14:36, Aldy Hernandez via Gcc-patches wrote: > > > PR tree-optimization/24021 > > Ah - Verboten in Fortran: > > $ cat d.f >DOUBLE PRECISION A, X >A = 0.0 >DO X = 0.1, 1.0 > A = A + X >ENDDO >END > $ gfortran d.f > d.f:3:9: > > 3 | DO X = 0.1, 1.0 >| 1 > Warning: Deleted feature: Loop variable at (1) must be integer > d.f:3:12: > > 3 | DO X = 0.1, 1.0 >|1 > Warning: Deleted feature: Start expression in DO loop at (1) must be integer > d.f:3:17: > > 3 | DO X = 0.1, 1.0 >| 1 > Warning: Deleted feature: End expression in DO loop at (1) must be integer > > -- > Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 > Saturnushof 14, 3738 XG Maartensdijk, The Netherlands >
Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
On 10/13/22 14:36, Aldy Hernandez via Gcc-patches wrote: PR tree-optimization/24021 Ah - Verboten in Fortran: $ cat d.f DOUBLE PRECISION A, X A = 0.0 DO X = 0.1, 1.0 A = A + X ENDDO END $ gfortran d.f d.f:3:9: 3 | DO X = 0.1, 1.0 | 1 Warning: Deleted feature: Loop variable at (1) must be integer d.f:3:12: 3 | DO X = 0.1, 1.0 |1 Warning: Deleted feature: Start expression in DO loop at (1) must be integer d.f:3:17: 3 | DO X = 0.1, 1.0 | 1 Warning: Deleted feature: End expression in DO loop at (1) must be integer -- Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290 Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
[PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.
[Jakub, this is a cleaned up version of what we iterated on earlier this summer. It contains additional smarts to propagate NAN signs on entry. I'd like a nod before committing.] This is the range-op entry for floating point PLUS_EXPR. It's the most intricate range entry we have so far, because we need to keep track of rounding and target FP formats. This will be the last FP entry I commit, mostly to avoid disturbing the tree any further, and also because what we have so far is enough for a solid VRP. So far we track NANs and signs correctly. We also handle relationals (symbolics and numeric), both ordered and unordered, ABS_EXPR and NEGATE_EXPR which are used to fold __builtin_isinf, and __builtin_sign (__builtin_copysign is coming up). All in all, I think this provide more than enough for basic VRP on floats, as well as provide a basis to flesh out the rest if there's interest. My goal with this entry is to provide a template for additional binary operators, as they tend to follow a similar pattern: handle NANs, do the arithmetic while keeping track of rounding, and adjust for NAN. I may abstract the general parts as we do for irange's fold_range and wi_fold. Oh yeah... and I'd like to finally close this PR ;-). How does this look? PR tree-optimization/24021 gcc/ChangeLog: * range-op-float.cc (update_nan_sign): New. (propagate_nans): New. (frange_nextafter): New. (frange_arithmetic): New. (class foperator_plus): New. (floating_op_table::floating_op_table): Add PLUS_EXPR entry. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/vrp-float-plus.c: New test. --- gcc/range-op-float.cc | 171 ++ .../gcc.dg/tree-ssa/vrp-float-plus.c | 21 +++ 2 files changed, 192 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc index 23e0f5ef4e2..a967c4da393 100644 --- a/gcc/range-op-float.cc +++ b/gcc/range-op-float.cc @@ -200,6 +200,124 @@ frelop_early_resolve (irange , tree type, && relop_early_resolve (r, type, op1, op2, rel, my_rel)); } +// If R contains a NAN of unknown sign, update the NAN's signbit +// depending on two operands. + +inline void +update_nan_sign (frange , const frange , const frange ) +{ + if (!r.maybe_isnan ()) +return; + + bool op1_nan = op1.maybe_isnan (); + bool op2_nan = op2.maybe_isnan (); + bool sign1, sign2; + + gcc_checking_assert (!r.nan_signbit_p (sign1)); + if (op1_nan && op2_nan) +{ + if (op1.nan_signbit_p (sign1) && op2.nan_signbit_p (sign2)) + r.update_nan (sign1 | sign2); +} + else if (op1_nan) +{ + if (op1.nan_signbit_p (sign1)) + r.update_nan (sign1); +} + else if (op2_nan) +{ + if (op2.nan_signbit_p (sign2)) + r.update_nan (sign2); +} +} + +// If either operand is a NAN, set R to the combination of both NANs +// signwise and return TRUE. + +inline bool +propagate_nans (frange , const frange , const frange ) +{ + if (op1.known_isnan () || op2.known_isnan ()) +{ + r.set_nan (op1.type ()); + update_nan_sign (r, op1, op2); + return true; +} + return false; +} + +// Set VALUE to its next real value, or INF if the operation overflows. + +inline void +frange_nextafter (enum machine_mode mode, + REAL_VALUE_TYPE , + const REAL_VALUE_TYPE ) +{ + const real_format *fmt = REAL_MODE_FORMAT (mode); + REAL_VALUE_TYPE tmp; + bool overflow = real_nextafter (, fmt, , ); + if (overflow) +value = inf; + else +value = tmp; +} + +// Like real_arithmetic, but round the result to INF if the operation +// produced inexact results. +// +// ?? There is still one problematic case, i387. With +// -fexcess-precision=standard we perform most SF/DFmode arithmetic in +// XFmode (long_double_type_node), so that case is OK. But without +// -mfpmath=sse, all the SF/DFmode computations are in XFmode +// precision (64-bit mantissa) and only occassionally rounded to +// SF/DFmode (when storing into memory from the 387 stack). Maybe +// this is ok as well though it is just occassionally more precise. ?? + +static void +frange_arithmetic (enum tree_code code, tree type, + REAL_VALUE_TYPE , + const REAL_VALUE_TYPE , + const REAL_VALUE_TYPE , + const REAL_VALUE_TYPE ) +{ + REAL_VALUE_TYPE value; + enum machine_mode mode = TYPE_MODE (type); + bool mode_composite = MODE_COMPOSITE_P (mode); + + bool inexact = real_arithmetic (, code, , ); + real_convert (, mode, ); + + // If real_convert above has rounded an inexact value to towards + // inf, we can keep the result as is, otherwise we'll adjust by 1 ulp + // later (real_nextafter). + bool rounding = (flag_rounding_math + && (real_isneg () + ? real_less (, ) + : !real_less (, ))); + +