subject:"\[PATCH\] \[PR24021\] Implement PLUS_EXPR range\-op entry for floats."

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Aldy Hernandez via Gcc-patches

This patch fixes the oversight.

Tested on x86-64 Linux.

Pushed.

On Wed, Nov 9, 2022 at 12:05 AM Aldy Hernandez  wrote:
>
> Sigh, one more thing.
>
> There are further possibilities for a NAN result, even if the operands
> are !NAN and the result from frange_arithmetic is free of NANs.
> Adding different signed infinities.
>
> For example, [-INF,+INF] + [-INF,+INF] has the possibility of adding
> -INF and +INF, which is a NAN.  Since we end up calling frange
> arithmetic on the lower bounds and then on the upper bounds, we miss
> this, and mistakenly think we're free of NANs.
>
> I have a patch in testing, but FYI, in case anyone notices this before
> I get around to it tomorrow.
>
> Aldy
>
> On Tue, Nov 8, 2022 at 3:11 PM Jakub Jelinek  wrote:
> >
> > On Tue, Nov 08, 2022 at 03:06:53PM +0100, Aldy Hernandez wrote:
> > > +// If either operand is a NAN, set R to the combination of both NANs
> > > +// signwise and return TRUE.
> >
> > This comment doesn't describe what it does now.
> > If either operand is a NAN, set R to NAN with unspecified sign bit and 
> > return
> > TRUE.
> > ?
> >
> > Other than this LGTM.
> >
> > Jakub
> >
From 68b0615be2aaff3a8ce91ba7cd0f69ebbd93702c Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Tue, 8 Nov 2022 23:42:04 +0100
Subject: [PATCH] [range-op-float] Set NAN possibility for INF + (-INF) and
 vice versa.

Some combinations of operations can yield a NAN even if no operands
have the possiblity of a NAN.  For example, [-INF] + [+INF] = NAN and
vice versa.

For [-INF,+INF] + [-INF,+INF], frange_arithmetic will not return a
NAN, and since the operands have no possibility of a NAN, we will
mistakenly assume the result cannot have a NAN.  This fixes the
oversight.

gcc/ChangeLog:

	* range-op-float.cc (foperator_plus::fold_range): Set NAN for
	addition of different signed infinities.
	(range_op_float_tests): New test.
---
 gcc/range-op-float.cc | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index 3bc6cc8849d..8282c912fc4 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -1863,7 +1863,21 @@ foperator_plus::fold_range (frange , tree type,
 
   r.set (type, lb, ub);
 
-  if (lb_nan || ub_nan)
+  // Some combinations can yield a NAN even if no operands have the
+  // possibility of a NAN.
+  bool maybe_nan;
+  // [-INF] + [+INF] = NAN
+  if (real_isinf (_bound (), true)
+  && real_isinf (_bound (), false))
+maybe_nan = true;
+  // [+INF] + [-INF] = NAN
+  else if (real_isinf (_bound (), false)
+	   && real_isinf (_bound (), true))
+maybe_nan = true;
+  else
+maybe_nan = false;
+
+  if (lb_nan || ub_nan || maybe_nan)
 // Keep the default NAN (with a varying sign) set by the setter.
 ;
   else if (!op1.maybe_isnan () && !op2.maybe_isnan ())
@@ -1960,6 +1974,16 @@ range_op_float_tests ()
   r1 = frange_float ("-1", "-0");
   r1.update_nan (false);
   ASSERT_EQ (r, r1);
+
+  // [-INF,+INF] + [-INF,+INF] could be a NAN.
+  range_op_handler plus (PLUS_EXPR, float_type_node);
+  r0.set_varying (float_type_node);
+  r1.set_varying (float_type_node);
+  r0.clear_nan ();
+  r1.clear_nan ();
+  plus.fold_range (r, float_type_node, r0, r1);
+  if (HONOR_NANS (float_type_node))
+ASSERT_TRUE (r.maybe_isnan ());
 }
 
 } // namespace selftest
-- 
2.38.1

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Aldy Hernandez via Gcc-patches

Sigh, one more thing.

There are further possibilities for a NAN result, even if the operands
are !NAN and the result from frange_arithmetic is free of NANs.
Adding different signed infinities.

For example, [-INF,+INF] + [-INF,+INF] has the possibility of adding
-INF and +INF, which is a NAN.  Since we end up calling frange
arithmetic on the lower bounds and then on the upper bounds, we miss
this, and mistakenly think we're free of NANs.

I have a patch in testing, but FYI, in case anyone notices this before
I get around to it tomorrow.

Aldy

On Tue, Nov 8, 2022 at 3:11 PM Jakub Jelinek  wrote:
>
> On Tue, Nov 08, 2022 at 03:06:53PM +0100, Aldy Hernandez wrote:
> > +// If either operand is a NAN, set R to the combination of both NANs
> > +// signwise and return TRUE.
>
> This comment doesn't describe what it does now.
> If either operand is a NAN, set R to NAN with unspecified sign bit and return
> TRUE.
> ?
>
> Other than this LGTM.
>
> Jakub
>

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Andrew Waterman

On Tue, Nov 8, 2022 at 10:11 AM Jakub Jelinek  wrote:
>
> On Tue, Nov 08, 2022 at 09:44:40AM -0800, Andrew Waterman wrote:
> > On Tue, Nov 8, 2022 at 3:20 AM Jakub Jelinek via Gcc-patches
> >  wrote:
> > >
> > > On Mon, Nov 07, 2022 at 04:41:23PM +0100, Aldy Hernandez wrote:
> > > > As suggested upthread, I have also adjusted update_nan_sign() to drop
> > > > the NAN sign to VARYING if both operands are NAN.  As an optimization
> > > > I keep the sign if both operands are NAN and have the same sign.
> > >
> > > For NaNs this still relies on something IEEE754 doesn't guarantee,
> > > as I cited, after a binary operation the sign bit of the NaN is
> > > unspecified, whether there is one NaN operand or two.
> > > It might be that all CPUs handle it the way you've implemented
> > > (that for one NaN operand the sign of NaN result will be the same
> > > as that NaN operand and for two it will be the sign of one of the two
> > > NaNs operands, never something else), but I think we'd need to check
> > > more than one implementation for that (I've only tried x86_64 and thus
> > > SSE behavior in it), so one would need to test i387 long double behavior
> > > too, ARM/AArch64, PowerPC, s390{,x}, RISCV, ...
> > > The guarantee given by IEEE754 is only for those copy, negate, abs, 
> > > copySign
> > > operations, so copying values around, NEG_EXPR, ABS_EXPR, __builtin_fabs*,
> > > __builtin_copysign*.
> >
> > FWIW, RISC-V canonicalizes NaNs by clearing the sign bit; the signs of
> > the input NaNs do not factor in.
>
> Just for binary operations and some unary, or also the ones that
> IEEE754 spells out (moves, negations, absolute value and copysign)?

I should've been more specific in my earlier email: I was referring to
the arithmetic operators.  Copysign and friends do not canonicalize
NaNs.

>
> Jakub
>

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Jakub Jelinek via Gcc-patches

On Tue, Nov 08, 2022 at 09:44:40AM -0800, Andrew Waterman wrote:
> On Tue, Nov 8, 2022 at 3:20 AM Jakub Jelinek via Gcc-patches
>  wrote:
> >
> > On Mon, Nov 07, 2022 at 04:41:23PM +0100, Aldy Hernandez wrote:
> > > As suggested upthread, I have also adjusted update_nan_sign() to drop
> > > the NAN sign to VARYING if both operands are NAN.  As an optimization
> > > I keep the sign if both operands are NAN and have the same sign.
> >
> > For NaNs this still relies on something IEEE754 doesn't guarantee,
> > as I cited, after a binary operation the sign bit of the NaN is
> > unspecified, whether there is one NaN operand or two.
> > It might be that all CPUs handle it the way you've implemented
> > (that for one NaN operand the sign of NaN result will be the same
> > as that NaN operand and for two it will be the sign of one of the two
> > NaNs operands, never something else), but I think we'd need to check
> > more than one implementation for that (I've only tried x86_64 and thus
> > SSE behavior in it), so one would need to test i387 long double behavior
> > too, ARM/AArch64, PowerPC, s390{,x}, RISCV, ...
> > The guarantee given by IEEE754 is only for those copy, negate, abs, copySign
> > operations, so copying values around, NEG_EXPR, ABS_EXPR, __builtin_fabs*,
> > __builtin_copysign*.
> 
> FWIW, RISC-V canonicalizes NaNs by clearing the sign bit; the signs of
> the input NaNs do not factor in.

Just for binary operations and some unary, or also the ones that
IEEE754 spells out (moves, negations, absolute value and copysign)?

Jakub

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Andrew Waterman

On Tue, Nov 8, 2022 at 3:20 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Mon, Nov 07, 2022 at 04:41:23PM +0100, Aldy Hernandez wrote:
> > As suggested upthread, I have also adjusted update_nan_sign() to drop
> > the NAN sign to VARYING if both operands are NAN.  As an optimization
> > I keep the sign if both operands are NAN and have the same sign.
>
> For NaNs this still relies on something IEEE754 doesn't guarantee,
> as I cited, after a binary operation the sign bit of the NaN is
> unspecified, whether there is one NaN operand or two.
> It might be that all CPUs handle it the way you've implemented
> (that for one NaN operand the sign of NaN result will be the same
> as that NaN operand and for two it will be the sign of one of the two
> NaNs operands, never something else), but I think we'd need to check
> more than one implementation for that (I've only tried x86_64 and thus
> SSE behavior in it), so one would need to test i387 long double behavior
> too, ARM/AArch64, PowerPC, s390{,x}, RISCV, ...
> The guarantee given by IEEE754 is only for those copy, negate, abs, copySign
> operations, so copying values around, NEG_EXPR, ABS_EXPR, __builtin_fabs*,
> __builtin_copysign*.

FWIW, RISC-V canonicalizes NaNs by clearing the sign bit; the signs of
the input NaNs do not factor in.

>
> Otherwise LGTM (but would be nice to get into GCC13 not just
> +, but also -, *, /, sqrt at least).
>
> Jakub
>

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Aldy Hernandez via Gcc-patches

On Tue, Nov 8, 2022 at 3:11 PM Jakub Jelinek  wrote:
>
> On Tue, Nov 08, 2022 at 03:06:53PM +0100, Aldy Hernandez wrote:
> > +// If either operand is a NAN, set R to the combination of both NANs
> > +// signwise and return TRUE.
>
> This comment doesn't describe what it does now.
> If either operand is a NAN, set R to NAN with unspecified sign bit and return
> TRUE.
> ?

OMG, I suck!

// If either operand is a NAN, set R to NAN and return TRUE.

Tests on-going :).

Aldy

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Jakub Jelinek via Gcc-patches

On Tue, Nov 08, 2022 at 03:06:53PM +0100, Aldy Hernandez wrote:
> +// If either operand is a NAN, set R to the combination of both NANs
> +// signwise and return TRUE.

This comment doesn't describe what it does now.
If either operand is a NAN, set R to NAN with unspecified sign bit and return
TRUE.
?

Other than this LGTM.

Jakub

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Aldy Hernandez via Gcc-patches

On Tue, Nov 8, 2022 at 2:50 PM Jakub Jelinek  wrote:
>
> On Tue, Nov 08, 2022 at 02:47:35PM +0100, Aldy Hernandez wrote:
> > Well, perhaps we should just nuke update_nan_sign() altogether, and
> > always keep the sign varying?
> >
> > inline bool
> > propagate_nans (frange , const frange , const frange )
> > {
> >   if (op1.known_isnan () || op2.known_isnan ())
> > {
> >   r.set_nan (op1.type ());
> >   return true;
> > }
> >   return false;
> > }
> >
> > I'm fine either way.  The less code the better :).
>
> Yes, but you had 2 callers, so something needs to be done also if
> in foperator_plus::fold_range.

We can also remove the update_nan_sign() in the other call because the
r.set() before it sets a default NAN (with a varying sign).

Attached patch in testing.

Aldy
From 8da4fc39cc73f0ae785463bd5f371223fa59027e Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Thu, 13 Oct 2022 08:14:16 +0200
Subject: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

This is the range-op entry for floating point PLUS_EXPR.  It's the
most intricate range entry we have so far, because we need to keep
track of rounding and target FP formats.  This will be the last FP
entry I commit, mostly to avoid disturbing the tree any further, and
also because what we have so far is enough for a solid VRP.

So far we track NANs and signs correctly.  We also handle relationals
(symbolics and numeric), both ordered and unordered, ABS_EXPR and
NEGATE_EXPR which are used to fold __builtin_isinf, and __builtin_sign
(__builtin_copysign is coming up).  All in all, I think this provide
more than enough for basic VRP on floats, as well as provide a basis
to flesh out the rest if there's interest.

My goal with this entry is to provide a template for additional binary
operators, as they tend to follow a similar pattern: handle NANs, do
the arithmetic while keeping track of rounding, and adjust for NAN.  I
may abstract the general parts as we do for irange's fold_range and
wi_fold.

	PR tree-optimization/24021

gcc/ChangeLog:

	* range-op-float.cc (propagate_nans): New.
	(frange_nextafter): New.
	(frange_arithmetic): New.
	(class foperator_plus): New.
	(floating_op_table::floating_op_table): Add PLUS_EXPR entry.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/vrp-float-plus.c: New test.
---
 gcc/range-op-float.cc | 128 ++
 .../gcc.dg/tree-ssa/vrp-float-plus.c  |  21 +++
 2 files changed, 149 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index a1f372997bf..502b67aaa66 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -192,6 +192,81 @@ frelop_early_resolve (irange , tree type,
 	  && relop_early_resolve (r, type, op1, op2, rel, my_rel));
 }
 
+// If either operand is a NAN, set R to the combination of both NANs
+// signwise and return TRUE.
+
+inline bool
+propagate_nans (frange , const frange , const frange )
+{
+  if (op1.known_isnan () || op2.known_isnan ())
+{
+  r.set_nan (op1.type ());
+  return true;
+}
+  return false;
+}
+
+// Set VALUE to its next real value, or INF if the operation overflows.
+
+inline void
+frange_nextafter (enum machine_mode mode,
+		  REAL_VALUE_TYPE ,
+		  const REAL_VALUE_TYPE )
+{
+  const real_format *fmt = REAL_MODE_FORMAT (mode);
+  REAL_VALUE_TYPE tmp;
+  real_nextafter (, fmt, , );
+  value = tmp;
+}
+
+// Like real_arithmetic, but round the result to INF if the operation
+// produced inexact results.
+//
+// ?? There is still one problematic case, i387.  With
+// -fexcess-precision=standard we perform most SF/DFmode arithmetic in
+// XFmode (long_double_type_node), so that case is OK.  But without
+// -mfpmath=sse, all the SF/DFmode computations are in XFmode
+// precision (64-bit mantissa) and only occassionally rounded to
+// SF/DFmode (when storing into memory from the 387 stack).  Maybe
+// this is ok as well though it is just occassionally more precise. ??
+
+static void
+frange_arithmetic (enum tree_code code, tree type,
+		   REAL_VALUE_TYPE ,
+		   const REAL_VALUE_TYPE ,
+		   const REAL_VALUE_TYPE ,
+		   const REAL_VALUE_TYPE )
+{
+  REAL_VALUE_TYPE value;
+  enum machine_mode mode = TYPE_MODE (type);
+  bool mode_composite = MODE_COMPOSITE_P (mode);
+
+  bool inexact = real_arithmetic (, code, , );
+  real_convert (, mode, );
+
+  // Be extra careful if there may be discrepancies between the
+  // compile and runtime results.
+  if ((mode_composite || (real_isneg () ? real_less (, )
+			  : !real_less (, )))
+  && (inexact || !real_identical (, )))
+{
+  if (mode_composite)
+	{
+	  if (real_isdenormal (, mode)
+	  || real_iszero ())
+	{
+	  // IBM extended denormals only have DFmode precision.
+	  REAL_VALUE_TYPE tmp;
+

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Jakub Jelinek via Gcc-patches

On Tue, Nov 08, 2022 at 03:02:40PM +0100, Aldy Hernandez wrote:
> From d02ce8eaf16d2fc6db6472268fd962e09c2fd81e Mon Sep 17 00:00:00 2001
> From: Aldy Hernandez 
> Date: Mon, 7 Nov 2022 14:18:57 +0100
> Subject: [PATCH] Provide normalized and denormal format version of
>  real_isdenormal.
> 
> Implement a variant of real_isdenormal() to be used within real.cc
> where the argument is known to be in denormal format.  Rewrite
> real_isdenormal() for use outside of real.cc where the argument is
> known to be normalized.
> 
> gcc/ChangeLog:
> 
>   * real.cc (real_isdenormal): New.
>   (encode_ieee_single): Call real_isdenormal.
>   (encode_ieee_double): Same.
>   (encode_ieee_extended): Same.
>   (encode_ieee_quad): Same.
>   (encode_ieee_half): Same.
>   (encode_arm_bfloat_half): Same.
>   * real.h (real_isdenormal): Add mode argument.  Rewrite for
>   normalized values.
>   * value-range.cc (frange::flush_denormals_to_zero): Pass mode to
>   real_isdenormal.

LGTM, thanks.

Jakub

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Aldy Hernandez via Gcc-patches

On Tue, Nov 8, 2022 at 2:15 PM Jakub Jelinek  wrote:
>
> On Tue, Nov 08, 2022 at 01:47:58PM +0100, Aldy Hernandez wrote:
> > On Tue, Nov 8, 2022 at 12:07 PM Jakub Jelinek  wrote:
> > >
> > > On Mon, Nov 07, 2022 at 04:38:29PM +0100, Aldy Hernandez wrote:
> > > > From d214bcdff2cb90ad1eb808d29bda6fb98d510b4c Mon Sep 17 00:00:00 2001
> > > > From: Aldy Hernandez 
> > > > Date: Mon, 7 Nov 2022 14:18:57 +0100
> > > > Subject: [PATCH] Provide normalized and denormal format version of
> > > >  real_isdenormal.
> > > >
> > > > Implement real_isdenormal_target() to be used within real.cc where the
> > > > argument is known to be in denormal format.  Rewrite real_isdenormal()
> > > > for use outside of real.cc where the argument is known to be
> > > > normalized.
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >   * real.cc (real_isdenormal_target): New.
> > > >   (encode_ieee_single): Use real_isdenormal_target.
> > > >   (encode_ieee_double): Same.
> > > >   (encode_ieee_extended): Same.
> > > >   (encode_ieee_quad): Same.
> > > >   (encode_ieee_half): Same.
> > > >   (encode_arm_bfloat_half): Same.
> > > >   * value-range.cc (frange::flush_denormals_to_zero): Same.
> > > >   * real.h (real_isdenormal): Rewrite to look at mode.
> > >
> > > I'd make real_isdenormal_target static inline bool
> > > rather than inline bool, it is only defined in real.cc, so there is
> > > no point exporting it.
> >
> > Huh.  I thought inline alone would inhibit the exporting.  Thanks.
>
> That is what happens with C99 inline (unless there is extern for the decl),
> but C++ inline is different.  It isn't guaranteed to be exported, but
> with -fkeep-inline-functions or if you say take address of the inline
> in a way that can't be optimized back into a call to the inline (or even
> just call it with -O0), it is exported.
> >
> > > Though, as you've added the mode argument, the real.cc inline
> > > could very well also be called real_isdenormal too, it wouldn't be
> > > a redeclaration or ODR violation.
> >
> > Great, even better.
> >
> > OK pending tests?
> > Aldy
>
> > From c3ca1d606bfb22bf4f8bc7ac0ce561bd6afc3368 Mon Sep 17 00:00:00 2001
> > From: Aldy Hernandez 
> > Date: Mon, 7 Nov 2022 14:18:57 +0100
> > Subject: [PATCH] Provide normalized and denormal format version of
> >  real_isdenormal.
> >
> > Implement a variant of real_isdenormal() to be used within real.cc
> > where the argument is known to be in denormal format.  Rewrite
> > real_isdenormal() for use outside of real.cc where the argument is
> > known to be normalized.
> >
> > gcc/ChangeLog:
> >
> >   * real.cc (real_isdenormal): New.
> >   * real.h (real_isdenormal): Add mode argument.  Rewrite for
> >   normalized values.
> >   * value-range.cc (frange::flush_denormals_to_zero): Pass mode to
> >   real_isdenormal.
> > ---
> >  gcc/real.cc| 10 ++
> >  gcc/real.h |  7 ---
> >  gcc/value-range.cc |  5 +++--
> >  3 files changed, 17 insertions(+), 5 deletions(-)
> >
> > diff --git a/gcc/real.cc b/gcc/real.cc
> > index aae7c335d59..028aad95ec4 100644
> > --- a/gcc/real.cc
> > +++ b/gcc/real.cc
> > @@ -111,6 +111,16 @@ static const REAL_VALUE_TYPE * real_digit (int);
> >  static void times_pten (REAL_VALUE_TYPE *, int);
> >
> >  static void round_for_format (const struct real_format *, REAL_VALUE_TYPE 
> > *);
> > +
> > +/* Determine whether a floating-point value X is a denormal.  R is
> > +   expected to be in denormal form, so this function is only
> > +   meaningful after a call to round_for_format.  */
> > +
> > +static inline bool
> > +real_isdenormal (const REAL_VALUE_TYPE *r)
> > +{
> > +  return (r->sig[SIGSZ-1] & SIG_MSB) == 0;
>
> I would probably keep the r->cl == rvc_normal in here too.
> I know the code in real.cc didn't do it before, but what
> r->sig is for the rvc_zero/rvc_inf is unclear.  It is true
> that get_zero/get_canonical_?nan/get_inf clear the whole sig,
> but not really sure if we guarantee that everywhere.
> The real.cc uses were like:
>   bool denormal = ...;
> at the start of the function and then
>   switch (...)
> {
> ...
> case rvc_normal:
>   if (denormal)
> ...
> }
> so another even better possibility would be to use your simple
> real.cc (real_isdenormal) and drop all the denormal variables, so:
> - bool denormal = ...;
>   switch (...)
> {
> ...
> case rvc_normal:
> - if (denormal)
> + if (real_isdenormal (r))
> ...
> }

Sure.

Attached patch in testing.

Aldy
From d02ce8eaf16d2fc6db6472268fd962e09c2fd81e Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Mon, 7 Nov 2022 14:18:57 +0100
Subject: [PATCH] Provide normalized and denormal format version of
 real_isdenormal.

Implement a variant of real_isdenormal() to be used within real.cc
where the argument is known to be in denormal format.  Rewrite
real_isdenormal() for use outside of real.cc where the argument is
known to be normalized.

gcc/ChangeLog:

	*

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Jakub Jelinek via Gcc-patches

On Tue, Nov 08, 2022 at 02:47:35PM +0100, Aldy Hernandez wrote:
> Well, perhaps we should just nuke update_nan_sign() altogether, and
> always keep the sign varying?
> 
> inline bool
> propagate_nans (frange , const frange , const frange )
> {
>   if (op1.known_isnan () || op2.known_isnan ())
> {
>   r.set_nan (op1.type ());
>   return true;
> }
>   return false;
> }
> 
> I'm fine either way.  The less code the better :).

Yes, but you had 2 callers, so something needs to be done also if
in foperator_plus::fold_range.

Jakub

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Aldy Hernandez via Gcc-patches

On Tue, Nov 8, 2022 at 2:25 PM Jakub Jelinek  wrote:
>
> On Tue, Nov 08, 2022 at 02:06:58PM +0100, Aldy Hernandez wrote:
> > +  gcc_checking_assert (!r.nan_signbit_p (sign1));
> > +  if (op1_nan && op2_nan)
> > +{
> > +  // If boths signs agree, we could use that sign, but IEEE754
> > +  // does not guarantee this for a binary operator.  The x86_64
> > +  // architure does keep the common known sign, but further tests
> > +  // are needed to see if other architectures do the same (i387
> > +  // long double, ARM/aarch64, PowerPC, s390,{,x}, RSICV, etc).
>
> s/RSICV/RISCV/
>
> > +  // In the meantime, keep sign VARYING.
> > +  ;
> > +}
> > +  else if (op1_nan)
> > +{
> > +  if (op1.nan_signbit_p (sign1))
> > + r.update_nan (sign1);
> > +}
> > +  else if (op2_nan)
> > +{
> > +  if (op2.nan_signbit_p (sign2))
> > + r.update_nan (sign2);
> > +}
>
> Well, these cases also aren't guaranteed for binary operator.
> I think a conforming implementation can say copy the NaN operand
> to result and toggle the sign.  Or, if the operand would be a sNaN,
> it must turn it into a qNaN (don't remember right now if there are
> requirements on what it can do with the mantissa which needs to change
> for the sNaN -> qNaN difference at least, but whether it can just
> generate a canonical qNaN or needs to preserve at least some bits),
> but could e.g. clear or toggle the sign of the NaN as well.
> Whether there are any such implementations or not is a question.
> For the single qNaN operand case, it would surprise me if anybody
> bothered to tweak the sign bit in any way, just copying the input
> seems simpler to me, but for the sNaN -> qNaN conversion it wouldn't
> surprise me that much.

Well, perhaps we should just nuke update_nan_sign() altogether, and
always keep the sign varying?

inline bool
propagate_nans (frange , const frange , const frange )
{
  if (op1.known_isnan () || op2.known_isnan ())
{
  r.set_nan (op1.type ());
  return true;
}
  return false;
}

I'm fine either way.  The less code the better :).

Aldy

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Jakub Jelinek via Gcc-patches

On Tue, Nov 08, 2022 at 02:06:58PM +0100, Aldy Hernandez wrote:
> +  gcc_checking_assert (!r.nan_signbit_p (sign1));
> +  if (op1_nan && op2_nan)
> +{
> +  // If boths signs agree, we could use that sign, but IEEE754
> +  // does not guarantee this for a binary operator.  The x86_64
> +  // architure does keep the common known sign, but further tests
> +  // are needed to see if other architectures do the same (i387
> +  // long double, ARM/aarch64, PowerPC, s390,{,x}, RSICV, etc).

s/RSICV/RISCV/

> +  // In the meantime, keep sign VARYING.
> +  ;
> +}
> +  else if (op1_nan)
> +{
> +  if (op1.nan_signbit_p (sign1))
> + r.update_nan (sign1);
> +}
> +  else if (op2_nan)
> +{
> +  if (op2.nan_signbit_p (sign2))
> + r.update_nan (sign2);
> +}

Well, these cases also aren't guaranteed for binary operator.
I think a conforming implementation can say copy the NaN operand
to result and toggle the sign.  Or, if the operand would be a sNaN,
it must turn it into a qNaN (don't remember right now if there are
requirements on what it can do with the mantissa which needs to change
for the sNaN -> qNaN difference at least, but whether it can just
generate a canonical qNaN or needs to preserve at least some bits),
but could e.g. clear or toggle the sign of the NaN as well.
Whether there are any such implementations or not is a question.
For the single qNaN operand case, it would surprise me if anybody
bothered to tweak the sign bit in any way, just copying the input
seems simpler to me, but for the sNaN -> qNaN conversion it wouldn't
surprise me that much.

Jakub

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Jakub Jelinek via Gcc-patches

On Tue, Nov 08, 2022 at 01:47:58PM +0100, Aldy Hernandez wrote:
> On Tue, Nov 8, 2022 at 12:07 PM Jakub Jelinek  wrote:
> >
> > On Mon, Nov 07, 2022 at 04:38:29PM +0100, Aldy Hernandez wrote:
> > > From d214bcdff2cb90ad1eb808d29bda6fb98d510b4c Mon Sep 17 00:00:00 2001
> > > From: Aldy Hernandez 
> > > Date: Mon, 7 Nov 2022 14:18:57 +0100
> > > Subject: [PATCH] Provide normalized and denormal format version of
> > >  real_isdenormal.
> > >
> > > Implement real_isdenormal_target() to be used within real.cc where the
> > > argument is known to be in denormal format.  Rewrite real_isdenormal()
> > > for use outside of real.cc where the argument is known to be
> > > normalized.
> > >
> > > gcc/ChangeLog:
> > >
> > >   * real.cc (real_isdenormal_target): New.
> > >   (encode_ieee_single): Use real_isdenormal_target.
> > >   (encode_ieee_double): Same.
> > >   (encode_ieee_extended): Same.
> > >   (encode_ieee_quad): Same.
> > >   (encode_ieee_half): Same.
> > >   (encode_arm_bfloat_half): Same.
> > >   * value-range.cc (frange::flush_denormals_to_zero): Same.
> > >   * real.h (real_isdenormal): Rewrite to look at mode.
> >
> > I'd make real_isdenormal_target static inline bool
> > rather than inline bool, it is only defined in real.cc, so there is
> > no point exporting it.
> 
> Huh.  I thought inline alone would inhibit the exporting.  Thanks.

That is what happens with C99 inline (unless there is extern for the decl),
but C++ inline is different.  It isn't guaranteed to be exported, but
with -fkeep-inline-functions or if you say take address of the inline
in a way that can't be optimized back into a call to the inline (or even
just call it with -O0), it is exported.
> 
> > Though, as you've added the mode argument, the real.cc inline
> > could very well also be called real_isdenormal too, it wouldn't be
> > a redeclaration or ODR violation.
> 
> Great, even better.
> 
> OK pending tests?
> Aldy

> From c3ca1d606bfb22bf4f8bc7ac0ce561bd6afc3368 Mon Sep 17 00:00:00 2001
> From: Aldy Hernandez 
> Date: Mon, 7 Nov 2022 14:18:57 +0100
> Subject: [PATCH] Provide normalized and denormal format version of
>  real_isdenormal.
> 
> Implement a variant of real_isdenormal() to be used within real.cc
> where the argument is known to be in denormal format.  Rewrite
> real_isdenormal() for use outside of real.cc where the argument is
> known to be normalized.
> 
> gcc/ChangeLog:
> 
>   * real.cc (real_isdenormal): New.
>   * real.h (real_isdenormal): Add mode argument.  Rewrite for
>   normalized values.
>   * value-range.cc (frange::flush_denormals_to_zero): Pass mode to
>   real_isdenormal.
> ---
>  gcc/real.cc| 10 ++
>  gcc/real.h |  7 ---
>  gcc/value-range.cc |  5 +++--
>  3 files changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/real.cc b/gcc/real.cc
> index aae7c335d59..028aad95ec4 100644
> --- a/gcc/real.cc
> +++ b/gcc/real.cc
> @@ -111,6 +111,16 @@ static const REAL_VALUE_TYPE * real_digit (int);
>  static void times_pten (REAL_VALUE_TYPE *, int);
>  
>  static void round_for_format (const struct real_format *, REAL_VALUE_TYPE *);
> +
> +/* Determine whether a floating-point value X is a denormal.  R is
> +   expected to be in denormal form, so this function is only
> +   meaningful after a call to round_for_format.  */
> +
> +static inline bool
> +real_isdenormal (const REAL_VALUE_TYPE *r)
> +{
> +  return (r->sig[SIGSZ-1] & SIG_MSB) == 0;

I would probably keep the r->cl == rvc_normal in here too.
I know the code in real.cc didn't do it before, but what
r->sig is for the rvc_zero/rvc_inf is unclear.  It is true
that get_zero/get_canonical_?nan/get_inf clear the whole sig,
but not really sure if we guarantee that everywhere.
The real.cc uses were like:
  bool denormal = ...;
at the start of the function and then
  switch (...)
{
...
case rvc_normal:
  if (denormal)
...
}
so another even better possibility would be to use your simple
real.cc (real_isdenormal) and drop all the denormal variables, so:
- bool denormal = ...;
  switch (...)
{
...
case rvc_normal:
- if (denormal)
+ if (real_isdenormal (r))
...
}

Otherwise LGTM.

Jakub

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Aldy Hernandez via Gcc-patches

On Tue, Nov 8, 2022 at 12:20 PM Jakub Jelinek  wrote:
>
> On Mon, Nov 07, 2022 at 04:41:23PM +0100, Aldy Hernandez wrote:
> > As suggested upthread, I have also adjusted update_nan_sign() to drop
> > the NAN sign to VARYING if both operands are NAN.  As an optimization
> > I keep the sign if both operands are NAN and have the same sign.
>
> For NaNs this still relies on something IEEE754 doesn't guarantee,
> as I cited, after a binary operation the sign bit of the NaN is
> unspecified, whether there is one NaN operand or two.
> It might be that all CPUs handle it the way you've implemented
> (that for one NaN operand the sign of NaN result will be the same
> as that NaN operand and for two it will be the sign of one of the two
> NaNs operands, never something else), but I think we'd need to check
> more than one implementation for that (I've only tried x86_64 and thus
> SSE behavior in it), so one would need to test i387 long double behavior
> too, ARM/AArch64, PowerPC, s390{,x}, RISCV, ...
> The guarantee given by IEEE754 is only for those copy, negate, abs, copySign
> operations, so copying values around, NEG_EXPR, ABS_EXPR, __builtin_fabs*,
> __builtin_copysign*.

Ughh, that's unfortunate.  OK, I've added a big note.

>
> Otherwise LGTM (but would be nice to get into GCC13 not just
> +, but also -, *, /, sqrt at least).

Minus is trivial as we can implement it with a negate and plus.  I
have a patch queued up for that.  The rest require a bit more thought,
though perhaps with what we have so far can serve as a base.  I'll
look into it.

Attached is the patch I'm retesting.

Thanks for your patience, and copious help here.
Aldy
From 32e9063bbd5a48bf7f7b16077ebc0c1e7bf3c33d Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Thu, 13 Oct 2022 08:14:16 +0200
Subject: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

This is the range-op entry for floating point PLUS_EXPR.  It's the
most intricate range entry we have so far, because we need to keep
track of rounding and target FP formats.  This will be the last FP
entry I commit, mostly to avoid disturbing the tree any further, and
also because what we have so far is enough for a solid VRP.

So far we track NANs and signs correctly.  We also handle relationals
(symbolics and numeric), both ordered and unordered, ABS_EXPR and
NEGATE_EXPR which are used to fold __builtin_isinf, and __builtin_sign
(__builtin_copysign is coming up).  All in all, I think this provide
more than enough for basic VRP on floats, as well as provide a basis
to flesh out the rest if there's interest.

My goal with this entry is to provide a template for additional binary
operators, as they tend to follow a similar pattern: handle NANs, do
the arithmetic while keeping track of rounding, and adjust for NAN.  I
may abstract the general parts as we do for irange's fold_range and
wi_fold.

	PR tree-optimization/24021

gcc/ChangeLog:

	* range-op-float.cc (update_nan_sign): New.
	(propagate_nans): New.
	(frange_nextafter): New.
	(frange_arithmetic): New.
	(class foperator_plus): New.
	(floating_op_table::floating_op_table): Add PLUS_EXPR entry.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/vrp-float-plus.c: New test.
---
 gcc/range-op-float.cc | 165 ++
 .../gcc.dg/tree-ssa/vrp-float-plus.c  |  21 +++
 2 files changed, 186 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index a1f372997bf..1a6913b8b98 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -192,6 +192,118 @@ frelop_early_resolve (irange , tree type,
 	  && relop_early_resolve (r, type, op1, op2, rel, my_rel));
 }

+// If R contains a NAN of unknown sign, update the NAN's signbit
+// depending on two operands.
+
+inline void
+update_nan_sign (frange , const frange , const frange )
+{
+  if (!r.maybe_isnan ())
+return;
+
+  bool op1_nan = op1.maybe_isnan ();
+  bool op2_nan = op2.maybe_isnan ();
+  bool sign1, sign2;
+
+  gcc_checking_assert (!r.nan_signbit_p (sign1));
+  if (op1_nan && op2_nan)
+{
+  // If boths signs agree, we could use that sign, but IEEE754
+  // does not guarantee this for a binary operator.  The x86_64
+  // architure does keep the common known sign, but further tests
+  // are needed to see if other architectures do the same (i387
+  // long double, ARM/aarch64, PowerPC, s390,{,x}, RSICV, etc).
+  // In the meantime, keep sign VARYING.
+  ;
+}
+  else if (op1_nan)
+{
+  if (op1.nan_signbit_p (sign1))
+	r.update_nan (sign1);
+}
+  else if (op2_nan)
+{
+  if (op2.nan_signbit_p (sign2))
+	r.update_nan (sign2);
+}
+}
+
+// If either operand is a NAN, set R to the combination of both NANs
+// signwise and return TRUE.
+
+inline bool
+propagate_nans (frange , const frange , const fra

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Aldy Hernandez via Gcc-patches

On Tue, Nov 8, 2022 at 12:07 PM Jakub Jelinek  wrote:
>
> On Mon, Nov 07, 2022 at 04:38:29PM +0100, Aldy Hernandez wrote:
> > From d214bcdff2cb90ad1eb808d29bda6fb98d510b4c Mon Sep 17 00:00:00 2001
> > From: Aldy Hernandez 
> > Date: Mon, 7 Nov 2022 14:18:57 +0100
> > Subject: [PATCH] Provide normalized and denormal format version of
> >  real_isdenormal.
> >
> > Implement real_isdenormal_target() to be used within real.cc where the
> > argument is known to be in denormal format.  Rewrite real_isdenormal()
> > for use outside of real.cc where the argument is known to be
> > normalized.
> >
> > gcc/ChangeLog:
> >
> >   * real.cc (real_isdenormal_target): New.
> >   (encode_ieee_single): Use real_isdenormal_target.
> >   (encode_ieee_double): Same.
> >   (encode_ieee_extended): Same.
> >   (encode_ieee_quad): Same.
> >   (encode_ieee_half): Same.
> >   (encode_arm_bfloat_half): Same.
> >   * value-range.cc (frange::flush_denormals_to_zero): Same.
> >   * real.h (real_isdenormal): Rewrite to look at mode.
>
> I'd make real_isdenormal_target static inline bool
> rather than inline bool, it is only defined in real.cc, so there is
> no point exporting it.

Huh.  I thought inline alone would inhibit the exporting.  Thanks.

> Though, as you've added the mode argument, the real.cc inline
> could very well also be called real_isdenormal too, it wouldn't be
> a redeclaration or ODR violation.

Great, even better.

OK pending tests?
Aldy
From c3ca1d606bfb22bf4f8bc7ac0ce561bd6afc3368 Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Mon, 7 Nov 2022 14:18:57 +0100
Subject: [PATCH] Provide normalized and denormal format version of
 real_isdenormal.

Implement a variant of real_isdenormal() to be used within real.cc
where the argument is known to be in denormal format.  Rewrite
real_isdenormal() for use outside of real.cc where the argument is
known to be normalized.

gcc/ChangeLog:

	* real.cc (real_isdenormal): New.
	* real.h (real_isdenormal): Add mode argument.  Rewrite for
	normalized values.
	* value-range.cc (frange::flush_denormals_to_zero): Pass mode to
	real_isdenormal.
---
 gcc/real.cc| 10 ++
 gcc/real.h |  7 ---
 gcc/value-range.cc |  5 +++--
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/gcc/real.cc b/gcc/real.cc
index aae7c335d59..028aad95ec4 100644
--- a/gcc/real.cc
+++ b/gcc/real.cc
@@ -111,6 +111,16 @@ static const REAL_VALUE_TYPE * real_digit (int);
 static void times_pten (REAL_VALUE_TYPE *, int);
 
 static void round_for_format (const struct real_format *, REAL_VALUE_TYPE *);
+
+/* Determine whether a floating-point value X is a denormal.  R is
+   expected to be in denormal form, so this function is only
+   meaningful after a call to round_for_format.  */
+
+static inline bool
+real_isdenormal (const REAL_VALUE_TYPE *r)
+{
+  return (r->sig[SIGSZ-1] & SIG_MSB) == 0;
+}
 
 /* Initialize R with a positive zero.  */
 
diff --git a/gcc/real.h b/gcc/real.h
index 306e9593866..b14bcdd3fde 100644
--- a/gcc/real.h
+++ b/gcc/real.h
@@ -286,11 +286,12 @@ extern bool real_isnan (const REAL_VALUE_TYPE *);
 /* Determine whether a floating-point value X is a signaling NaN.  */
 extern bool real_issignaling_nan (const REAL_VALUE_TYPE *);
 
-/* Determine whether a floating-point value X is a denormal.  */
+/* Determine whether floating-point value R is a denormal.  This
+   function is only valid for normalized values.  */
 inline bool
-real_isdenormal (const REAL_VALUE_TYPE *r)
+real_isdenormal (const REAL_VALUE_TYPE *r, machine_mode mode)
 {
-  return r->cl == rvc_normal && (r->sig[SIGSZ-1] & SIG_MSB) == 0;
+  return r->cl == rvc_normal && REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin;
 }
 
 /* Determine whether a floating-point value X is finite.  */
diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index a855aaf626c..859c7fb4af9 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -266,15 +266,16 @@ frange::flush_denormals_to_zero ()
   if (undefined_p () || known_isnan ())
 return;
 
+  machine_mode mode = TYPE_MODE (type ());
   // Flush [x, -DENORMAL] to [x, -0.0].
-  if (real_isdenormal (_max) && real_isneg (_max))
+  if (real_isdenormal (_max, mode) && real_isneg (_max))
 {
   m_max = dconst0;
   if (HONOR_SIGNED_ZEROS (m_type))
 	m_max.sign = 1;
 }
   // Flush [+DENORMAL, x] to [+0.0, x].
-  if (real_isdenormal (_min) && !real_isneg (_min))
+  if (real_isdenormal (_min, mode) && !real_isneg (_min))
 m_min = dconst0;
 }
 
-- 
2.38.1

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Jakub Jelinek via Gcc-patches

On Mon, Nov 07, 2022 at 04:41:23PM +0100, Aldy Hernandez wrote:
> As suggested upthread, I have also adjusted update_nan_sign() to drop
> the NAN sign to VARYING if both operands are NAN.  As an optimization
> I keep the sign if both operands are NAN and have the same sign.

For NaNs this still relies on something IEEE754 doesn't guarantee,
as I cited, after a binary operation the sign bit of the NaN is
unspecified, whether there is one NaN operand or two.
It might be that all CPUs handle it the way you've implemented
(that for one NaN operand the sign of NaN result will be the same
as that NaN operand and for two it will be the sign of one of the two
NaNs operands, never something else), but I think we'd need to check
more than one implementation for that (I've only tried x86_64 and thus
SSE behavior in it), so one would need to test i387 long double behavior
too, ARM/AArch64, PowerPC, s390{,x}, RISCV, ...
The guarantee given by IEEE754 is only for those copy, negate, abs, copySign
operations, so copying values around, NEG_EXPR, ABS_EXPR, __builtin_fabs*,
__builtin_copysign*.

Otherwise LGTM (but would be nice to get into GCC13 not just
+, but also -, *, /, sqrt at least).

Jakub

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-08 Thread Jakub Jelinek via Gcc-patches

On Mon, Nov 07, 2022 at 04:38:29PM +0100, Aldy Hernandez wrote:
> From d214bcdff2cb90ad1eb808d29bda6fb98d510b4c Mon Sep 17 00:00:00 2001
> From: Aldy Hernandez 
> Date: Mon, 7 Nov 2022 14:18:57 +0100
> Subject: [PATCH] Provide normalized and denormal format version of
>  real_isdenormal.
> 
> Implement real_isdenormal_target() to be used within real.cc where the
> argument is known to be in denormal format.  Rewrite real_isdenormal()
> for use outside of real.cc where the argument is known to be
> normalized.
> 
> gcc/ChangeLog:
> 
>   * real.cc (real_isdenormal_target): New.
>   (encode_ieee_single): Use real_isdenormal_target.
>   (encode_ieee_double): Same.
>   (encode_ieee_extended): Same.
>   (encode_ieee_quad): Same.
>   (encode_ieee_half): Same.
>   (encode_arm_bfloat_half): Same.
>   * value-range.cc (frange::flush_denormals_to_zero): Same.
>   * real.h (real_isdenormal): Rewrite to look at mode.

I'd make real_isdenormal_target static inline bool
rather than inline bool, it is only defined in real.cc, so there is
no point exporting it.
Though, as you've added the mode argument, the real.cc inline
could very well also be called real_isdenormal too, it wouldn't be
a redeclaration or ODR violation. 

Jakub

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-07 Thread Aldy Hernandez via Gcc-patches

On Fri, Nov 4, 2022 at 8:14 PM Jakub Jelinek  wrote:
>
> On Mon, Oct 17, 2022 at 08:21:01AM +0200, Aldy Hernandez wrote:
> > +// Set VALUE to its next real value, or INF if the operation overflows.
> > +
> > +inline void
> > +frange_nextafter (enum machine_mode mode,
> > +   REAL_VALUE_TYPE ,
> > +   const REAL_VALUE_TYPE )
> > +{
> > +  const real_format *fmt = REAL_MODE_FORMAT (mode);
> > +  REAL_VALUE_TYPE tmp;
> > +  bool overflow = real_nextafter (, fmt, , );
> > +  if (overflow)
> > +value = inf;
> > +  else
> > +value = tmp;
>
> Please change the above 5 lines to just
>   real_nextafter (, fmt, , );
>   value = tmp;

Done.

As suggested upthread, I have also adjusted update_nan_sign() to drop
the NAN sign to VARYING if both operands are NAN.  As an optimization
I keep the sign if both operands are NAN and have the same sign.

How's this?

Tested on x86-64 Linux.  LAPACK testing as well.

Aldy
From 7717ad5c495934ff28ea2abd639ab4d91b7ffcab Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Thu, 13 Oct 2022 08:14:16 +0200
Subject: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

This is the range-op entry for floating point PLUS_EXPR.  It's the
most intricate range entry we have so far, because we need to keep
track of rounding and target FP formats.  This will be the last FP
entry I commit, mostly to avoid disturbing the tree any further, and
also because what we have so far is enough for a solid VRP.

So far we track NANs and signs correctly.  We also handle relationals
(symbolics and numeric), both ordered and unordered, ABS_EXPR and
NEGATE_EXPR which are used to fold __builtin_isinf, and __builtin_sign
(__builtin_copysign is coming up).  All in all, I think this provide
more than enough for basic VRP on floats, as well as provide a basis
to flesh out the rest if there's interest.

My goal with this entry is to provide a template for additional binary
operators, as they tend to follow a similar pattern: handle NANs, do
the arithmetic while keeping track of rounding, and adjust for NAN.  I
may abstract the general parts as we do for irange's fold_range and
wi_fold.

	PR tree-optimization/24021

gcc/ChangeLog:

	* range-op-float.cc (update_nan_sign): New.
	(propagate_nans): New.
	(frange_nextafter): New.
	(frange_arithmetic): New.
	(class foperator_plus): New.
	(floating_op_table::floating_op_table): Add PLUS_EXPR entry.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/vrp-float-plus.c: New test.
---
 gcc/range-op-float.cc | 162 ++
 .../gcc.dg/tree-ssa/vrp-float-plus.c  |  21 +++
 2 files changed, 183 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index a1f372997bf..9c09bb35dbe 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -192,6 +192,115 @@ frelop_early_resolve (irange , tree type,
 	  && relop_early_resolve (r, type, op1, op2, rel, my_rel));
 }
 
+// If R contains a NAN of unknown sign, update the NAN's signbit
+// depending on two operands.
+
+inline void
+update_nan_sign (frange , const frange , const frange )
+{
+  if (!r.maybe_isnan ())
+return;
+
+  bool op1_nan = op1.maybe_isnan ();
+  bool op2_nan = op2.maybe_isnan ();
+  bool sign1, sign2;
+
+  gcc_checking_assert (!r.nan_signbit_p (sign1));
+  if (op1_nan && op2_nan)
+{
+  // If both signs agree, use that.  Otherwise keep sign VARYING.
+  if (op1.nan_signbit_p (sign1) && op2.nan_signbit_p (sign2)
+	  && sign1 == sign2)
+	r.update_nan (sign1);
+}
+  else if (op1_nan)
+{
+  if (op1.nan_signbit_p (sign1))
+	r.update_nan (sign1);
+}
+  else if (op2_nan)
+{
+  if (op2.nan_signbit_p (sign2))
+	r.update_nan (sign2);
+}
+}
+
+// If either operand is a NAN, set R to the combination of both NANs
+// signwise and return TRUE.
+
+inline bool
+propagate_nans (frange , const frange , const frange )
+{
+  if (op1.known_isnan () || op2.known_isnan ())
+{
+  r.set_nan (op1.type ());
+  update_nan_sign (r, op1, op2);
+  return true;
+}
+  return false;
+}
+
+// Set VALUE to its next real value, or INF if the operation overflows.
+
+inline void
+frange_nextafter (enum machine_mode mode,
+		  REAL_VALUE_TYPE ,
+		  const REAL_VALUE_TYPE )
+{
+  const real_format *fmt = REAL_MODE_FORMAT (mode);
+  REAL_VALUE_TYPE tmp;
+  real_nextafter (, fmt, , );
+  value = tmp;
+}
+
+// Like real_arithmetic, but round the result to INF if the operation
+// produced inexact results.
+//
+// ?? There is still one problematic case, i387.  With
+// -fexcess-precision=standard we perform most SF/DFmode arithmetic in
+// XFmode (long_double_type_node), so that case is OK.  But without
+// -mfpmath=sse, all the SF/DFmode computations are in XFmode
+// precision (64-bit mantiss

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-07 Thread Aldy Hernandez via Gcc-patches

Fair enough.

How's this?

Tested on x86-64 Linux.  LAPACK regression testing as well.

On Mon, Nov 7, 2022 at 1:56 PM Jakub Jelinek  wrote:
>
> On Mon, Nov 07, 2022 at 01:48:28PM +0100, Aldy Hernandez wrote:
> > On Mon, Nov 7, 2022 at 1:43 PM Jakub Jelinek  wrote:
> > >
> > > On Mon, Nov 07, 2022 at 01:35:35PM +0100, Aldy Hernandez wrote:
> > > > Let me see if I understand you correctly...
> > > >
> > > > real_isdenormal is always returning false for our uses in frange?  So
> > > > instead of using real_isdenormal in flush_denormals_to_zero, perhaps
> > > > we should be using:
> > > >
> > > > REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin
> > > >
> > > > ??
> > > >
> > > > Could we perhaps make real_isdenormal always work for all values?  Is
> > > > that possible?
> > >
> > > Yes.  Or have real_isdenormal_target for the uses in real.cc
> > > which would be private to real.cc and would do what real_isdenormal
> > > currently does, and then real_isdenormal with the above definition
> > > (well, r->cl == rvc_normal && ...).
> >
> > If the REAL_EXP(r)... definition works for everyone, can't we just use
> > that?  Or do you think there'd be a measurable performance impact?
>
> It doesn't work alone.
>
> Either denormals are in normalized form, then
> r->cl == rvc_normal && REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin
> is true, or they are in denormal form, then
> r->cl == rvc_normal && (r->sig[SIGSZ-1] & SIG_MSB) == 0
> is true.
>
> You could use
> r->cl == rvc_normal
> && (REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin
> || (r->sig[SIGSZ-1] & SIG_MSB) == 0)
> but that would be waste of compile time both in real.cc and outside of
> real.cc.
>
> Jakub
>
From d214bcdff2cb90ad1eb808d29bda6fb98d510b4c Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Mon, 7 Nov 2022 14:18:57 +0100
Subject: [PATCH] Provide normalized and denormal format version of
 real_isdenormal.

Implement real_isdenormal_target() to be used within real.cc where the
argument is known to be in denormal format.  Rewrite real_isdenormal()
for use outside of real.cc where the argument is known to be
normalized.

gcc/ChangeLog:

	* real.cc (real_isdenormal_target): New.
	(encode_ieee_single): Use real_isdenormal_target.
	(encode_ieee_double): Same.
	(encode_ieee_extended): Same.
	(encode_ieee_quad): Same.
	(encode_ieee_half): Same.
	(encode_arm_bfloat_half): Same.
	* value-range.cc (frange::flush_denormals_to_zero): Same.
	* real.h (real_isdenormal): Rewrite to look at mode.
---
 gcc/real.cc| 22 --
 gcc/real.h |  7 ---
 gcc/value-range.cc |  5 +++--
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/gcc/real.cc b/gcc/real.cc
index aae7c335d59..058b0b8dcb4 100644
--- a/gcc/real.cc
+++ b/gcc/real.cc
@@ -111,6 +111,16 @@ static const REAL_VALUE_TYPE * real_digit (int);
 static void times_pten (REAL_VALUE_TYPE *, int);
 
 static void round_for_format (const struct real_format *, REAL_VALUE_TYPE *);
+
+/* Determine whether a floating-point value X is a denormal.  R is
+   expected to be in denormal form, so this function is only
+   meaningful after a call to round_for_format.  */
+
+inline bool
+real_isdenormal_target (const REAL_VALUE_TYPE *r)
+{
+  return (r->sig[SIGSZ-1] & SIG_MSB) == 0;
+}
 
 /* Initialize R with a positive zero.  */
 
@@ -2962,7 +2972,7 @@ encode_ieee_single (const struct real_format *fmt, long *buf,
 {
   unsigned long image, sig, exp;
   unsigned long sign = r->sign;
-  bool denormal = real_isdenormal (r);
+  bool denormal = real_isdenormal_target (r);
 
   image = sign << 31;
   sig = (r->sig[SIGSZ-1] >> (HOST_BITS_PER_LONG - 24)) & 0x7f;
@@ -3183,7 +3193,7 @@ encode_ieee_double (const struct real_format *fmt, long *buf,
 {
   unsigned long image_lo, image_hi, sig_lo, sig_hi, exp;
   unsigned long sign = r->sign;
-  bool denormal = real_isdenormal (r);
+  bool denormal = real_isdenormal_target (r);
 
   image_hi = sign << 31;
   image_lo = 0;
@@ -3441,7 +3451,7 @@ encode_ieee_extended (const struct real_format *fmt, long *buf,
 		  const REAL_VALUE_TYPE *r)
 {
   unsigned long image_hi, sig_hi, sig_lo;
-  bool denormal = real_isdenormal (r);
+  bool denormal = real_isdenormal_target (r);
 
   image_hi = r->sign << 15;
   sig_hi = sig_lo = 0;
@@ -3972,7 +3982,7 @@ encode_ieee_quad (const struct real_format *fmt, long *buf,
 {
   unsigned long image3, image2, image1, image0, exp;
   unsigned long sign = r->sign;
-  bool denormal = real_isdenormal (r);
+  bool denormal = real_isdenormal_target (r);
   REAL_VALUE_TYPE u;
 
   image3 = sign << 31;
@@ -4729,7 +4739,7 @@ encode_ieee_half (const struct real_format *fmt, long *buf,
 {
   unsigned long image, sig, exp;
   unsigned long sign = r->sign;
-  bool denormal = real_isdenormal (r);
+  bool denormal = real_isdenormal_target (r);
 
   image = sign << 15;
   sig = (r->sig[SIGSZ-1] >> (HOST_BITS_PER_LONG - 11)) & 0x3ff;
@@ -4843,7 +4853,7 @@ encode_arm_bfloat_half (const struct real_format *fmt, long *buf,
 {

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-07 Thread Jakub Jelinek via Gcc-patches

On Mon, Nov 07, 2022 at 01:48:28PM +0100, Aldy Hernandez wrote:
> On Mon, Nov 7, 2022 at 1:43 PM Jakub Jelinek  wrote:
> >
> > On Mon, Nov 07, 2022 at 01:35:35PM +0100, Aldy Hernandez wrote:
> > > Let me see if I understand you correctly...
> > >
> > > real_isdenormal is always returning false for our uses in frange?  So
> > > instead of using real_isdenormal in flush_denormals_to_zero, perhaps
> > > we should be using:
> > >
> > > REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin
> > >
> > > ??
> > >
> > > Could we perhaps make real_isdenormal always work for all values?  Is
> > > that possible?
> >
> > Yes.  Or have real_isdenormal_target for the uses in real.cc
> > which would be private to real.cc and would do what real_isdenormal
> > currently does, and then real_isdenormal with the above definition
> > (well, r->cl == rvc_normal && ...).
> 
> If the REAL_EXP(r)... definition works for everyone, can't we just use
> that?  Or do you think there'd be a measurable performance impact?

It doesn't work alone.

Either denormals are in normalized form, then
r->cl == rvc_normal && REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin
is true, or they are in denormal form, then
r->cl == rvc_normal && (r->sig[SIGSZ-1] & SIG_MSB) == 0
is true.

You could use
r->cl == rvc_normal
&& (REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin
|| (r->sig[SIGSZ-1] & SIG_MSB) == 0)
but that would be waste of compile time both in real.cc and outside of
real.cc.

Jakub

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-07 Thread Aldy Hernandez via Gcc-patches

On Mon, Nov 7, 2022 at 1:43 PM Jakub Jelinek  wrote:
>
> On Mon, Nov 07, 2022 at 01:35:35PM +0100, Aldy Hernandez wrote:
> > Let me see if I understand you correctly...
> >
> > real_isdenormal is always returning false for our uses in frange?  So
> > instead of using real_isdenormal in flush_denormals_to_zero, perhaps
> > we should be using:
> >
> > REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin
> >
> > ??
> >
> > Could we perhaps make real_isdenormal always work for all values?  Is
> > that possible?
>
> Yes.  Or have real_isdenormal_target for the uses in real.cc
> which would be private to real.cc and would do what real_isdenormal
> currently does, and then real_isdenormal with the above definition
> (well, r->cl == rvc_normal && ...).

If the REAL_EXP(r)... definition works for everyone, can't we just use
that?  Or do you think there'd be a measurable performance impact?

Aldy

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-07 Thread Jakub Jelinek via Gcc-patches

On Mon, Nov 07, 2022 at 01:35:35PM +0100, Aldy Hernandez wrote:
> Let me see if I understand you correctly...
> 
> real_isdenormal is always returning false for our uses in frange?  So
> instead of using real_isdenormal in flush_denormals_to_zero, perhaps
> we should be using:
> 
> REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin
> 
> ??
> 
> Could we perhaps make real_isdenormal always work for all values?  Is
> that possible?

Yes.  Or have real_isdenormal_target for the uses in real.cc
which would be private to real.cc and would do what real_isdenormal
currently does, and then real_isdenormal with the above definition
(well, r->cl == rvc_normal && ...).

Jakub

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-07 Thread Aldy Hernandez via Gcc-patches

On Fri, Nov 4, 2022 at 8:53 PM Jakub Jelinek  wrote:
>
> On Fri, Nov 04, 2022 at 08:14:23PM +0100, Jakub Jelinek via Gcc-patches wrote:
> > One thing that is clear is that real_isdenormal is never true for these
> > (but then a question is if flush_denormals_to_zero actually works).
>
> So, after some more investigation, I think that is actually the case,
> real_isdenormal is only meaningful (will ever return true) right after
> round_for_format.
> The uses inside of real.cc are fine, real_to_target first calls
> round_for_format and then calls fmt->encode in which the real_isdenormal
> calls are done.  And, round_for_format is what transforms the
> normalized way of expressing the number into the denormal form:
>   /* Check the range of the exponent.  If we're out of range,
>  either underflow or overflow.  */
>   if (REAL_EXP (r) > emax2)
> goto overflow;
>   else if (REAL_EXP (r) <= emin2m1)
> {
>   int diff;
>
>   if (!fmt->has_denorm)
> {
>   /* Don't underflow completely until we've had a chance to round.  */
>   if (REAL_EXP (r) < emin2m1)
> goto underflow;
> }
>   else
> {
>   diff = emin2m1 - REAL_EXP (r) + 1;
>   if (diff > p2)
> goto underflow;
>
>   /* De-normalize the significand.  */
>   r->sig[0] |= sticky_rshift_significand (r, r, diff);
>   SET_REAL_EXP (r, REAL_EXP (r) + diff);
> }
> }
> But, real_to_target is one of the 4 callers of static round_for_format,
> another one is real_convert, but that one undoes that immediately:
>   round_for_format (fmt, r);
>
>   /* Make resulting NaN value to be qNaN. The caller has the
>  responsibility to avoid the operation if flag_signaling_nans
>  is on.  */
>   if (r->cl == rvc_nan)
> r->signalling = 0;
>
>   /* round_for_format de-normalizes denormals.  Undo just that part.  */
>   if (r->cl == rvc_normal)
> normalize (r);
> and the last two are inside of encoding the IBM double double composite.
> So, I think everywhere in the frange you'll see normalized REAL_VALUE_TYPE
> and so real_isdenormal will always be false there.
> You need to check for REAL_EXP (r) < fmt->emin or so (and ideally only on
> REAL_VALUE_TYPE already real_converted to the right mode (your
> frange_arithmetics does that already, which is good, but say conversions
> and other unary ops might need it too).

Let me see if I understand you correctly...

real_isdenormal is always returning false for our uses in frange?  So
instead of using real_isdenormal in flush_denormals_to_zero, perhaps
we should be using:

REAL_EXP (r) < REAL_MODE_FORMAT (mode)->emin

??

Could we perhaps make real_isdenormal always work for all values?  Is
that possible?

Aldy

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-04 Thread Jakub Jelinek via Gcc-patches

On Fri, Nov 04, 2022 at 08:14:23PM +0100, Jakub Jelinek via Gcc-patches wrote:
> One thing that is clear is that real_isdenormal is never true for these
> (but then a question is if flush_denormals_to_zero actually works).

So, after some more investigation, I think that is actually the case,
real_isdenormal is only meaningful (will ever return true) right after
round_for_format.
The uses inside of real.cc are fine, real_to_target first calls
round_for_format and then calls fmt->encode in which the real_isdenormal
calls are done.  And, round_for_format is what transforms the
normalized way of expressing the number into the denormal form:
  /* Check the range of the exponent.  If we're out of range,
 either underflow or overflow.  */
  if (REAL_EXP (r) > emax2)
goto overflow;
  else if (REAL_EXP (r) <= emin2m1)
{
  int diff;

  if (!fmt->has_denorm)
{
  /* Don't underflow completely until we've had a chance to round.  */
  if (REAL_EXP (r) < emin2m1)
goto underflow;
}
  else
{
  diff = emin2m1 - REAL_EXP (r) + 1;
  if (diff > p2)
goto underflow;

  /* De-normalize the significand.  */
  r->sig[0] |= sticky_rshift_significand (r, r, diff);
  SET_REAL_EXP (r, REAL_EXP (r) + diff);
}
}
But, real_to_target is one of the 4 callers of static round_for_format,
another one is real_convert, but that one undoes that immediately:
  round_for_format (fmt, r);

  /* Make resulting NaN value to be qNaN. The caller has the
 responsibility to avoid the operation if flag_signaling_nans
 is on.  */
  if (r->cl == rvc_nan)
r->signalling = 0;

  /* round_for_format de-normalizes denormals.  Undo just that part.  */
  if (r->cl == rvc_normal)
normalize (r);
and the last two are inside of encoding the IBM double double composite.
So, I think everywhere in the frange you'll see normalized REAL_VALUE_TYPE
and so real_isdenormal will always be false there.
You need to check for REAL_EXP (r) < fmt->emin or so (and ideally only on
REAL_VALUE_TYPE already real_converted to the right mode (your
frange_arithmetics does that already, which is good, but say conversions
and other unary ops might need it too).
If I in the hack from last mail replace
- fprintf (stderr, "%d %d %s %s\n", i, real_isdenormal (), buf, 
buf2);
+ fprintf (stderr, "%d %d %s %s\n", i, REAL_EXP () < 
REAL_MODE_FORMAT (mode)->emin, buf, buf2);
then it is 1 until:
102 1 0x0.8p-971 0x0.88p-971
103 1 0x0.8p-970 0x0.88p-970
104 1 0x0.8p-969 0x0.88p-969
105 0 0x0.8p-968 0x0.88p-968
106 0 0x0.8p-967 0x0.88p-967
107 0 0x0.8p-966 0x0.88p-966

Now, the __LDBL_MIN__ powerpc64 gcc announces is
2.00416836000897277799610805135016205e-292L
which is equivalent to:
0x1p-969
and that is equivalent to
0x0.8p-968, so I think that is exactly what we want.

Jakub

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-04 Thread Jakub Jelinek via Gcc-patches

On Mon, Oct 17, 2022 at 08:21:01AM +0200, Aldy Hernandez wrote:
> +// Set VALUE to its next real value, or INF if the operation overflows.
> +
> +inline void
> +frange_nextafter (enum machine_mode mode,
> +   REAL_VALUE_TYPE ,
> +   const REAL_VALUE_TYPE )
> +{
> +  const real_format *fmt = REAL_MODE_FORMAT (mode);
> +  REAL_VALUE_TYPE tmp;
> +  bool overflow = real_nextafter (, fmt, , );
> +  if (overflow)
> +value = inf;
> +  else
> +value = tmp;

Please change the above 5 lines to just
  real_nextafter (, fmt, , );
  value = tmp;
What real_nextafter returns isn't bool whether it overflowed, but
a bool whether some exception should be raised for the nextafter,
which can be either an overflow (in that case tmp will be already set
to infinity of the right sign:
  if (REAL_EXP (r) > fmt->emax)
{
  get_inf (r, x->sign);
  return true;
}
or underflow to zero:
  if (REAL_EXP (r) <= fmt->emin - fmt->p)
{
  get_zero (r, x->sign);
  return true;
}
or when the result is subnormal:
  return r->cl == rvc_zero || REAL_EXP (r) < fmt->emin;
But we shouldn't care about exceptions, we aren't emitting code, just
estimating floating point range, and +-inf, or +-0, or subnormal
are the right answers.

Just for playing with this, I've tried:
--- range-op-float.cc.jj2022-11-02 10:06:02.620960861 +0100
+++ range-op-float.cc   2022-11-04 19:49:14.188075070 +0100
@@ -1815,6 +1815,20 @@ frange_float (const char *lb, const char
   return frange (type, min, max);
 }
 
+inline void


+frange_nextafter (enum machine_mode mode,  


+ REAL_VALUE_TYPE ,   

  
+ const REAL_VALUE_TYPE )   

  
+{  


+  const real_format *fmt = REAL_MODE_FORMAT (mode);


+  REAL_VALUE_TYPE tmp; 


+  bool overflow = real_nextafter (, fmt, , );


+  if (overflow)


+value = inf;   


+  else 


+   value = tmp;

   
+}  


+
 void
 range_op_float_tests ()
 {
@@ -1833,6 +1847,30 @@ range_op_float_tests ()
   r1 = frange_float ("-1", "-0");
   r1.update_nan (false);
   ASSERT_EQ (r, r1);
+
+  if (MODE_COMPOSITE_P (TYPE_MODE (long_double_type_node)))
+{
+  enum machine_mode mode = TYPE_MODE (long_double_type_node);
+  REAL_VALUE_TYPE result = dconst0;
+  {
+   REAL_VALUE_TYPE tmp;
+   real_convert (, DFmode, );
+   frange_nextafter (DFmode, tmp, dconstinf);
+   real_convert (, mode, );
+  }
+  for (int i = 0; i < 108; ++i)
+   {
+ char buf[1024], buf2[1024];
+ REAL_VALUE_TYPE tmp, tmp2;
+ real_convert (, DFmode, );
+ frange_nextafter (DFmode, tmp, dconstinf);
+ real_convert (, mode, );
+ real_to_hexadecimal (buf, , 1024, 0, 1);
+ real_to_hexadecimal (buf2, , 1024, 0, 1);
+ fprintf (stderr, "%d %d %s %s\n", i, real_isdenormal (), buf, 
buf2);
+ real_arithmetic (, MULT_EXPR, , );
+   }
+}
 }
 
 } // namespace selftest

  
in cross to

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-11-04 Thread Jakub Jelinek via Gcc-patches

On Mon, Oct 17, 2022 at 08:21:01AM +0200, Aldy Hernandez wrote:
> --- a/gcc/range-op-float.cc
> +++ b/gcc/range-op-float.cc
> @@ -200,6 +200,116 @@ frelop_early_resolve (irange , tree type,
> && relop_early_resolve (r, type, op1, op2, rel, my_rel));
>  }
>  
> +// If R contains a NAN of unknown sign, update the NAN's signbit
> +// depending on two operands.
> +
> +inline void
> +update_nan_sign (frange , const frange , const frange )
> +{
> +  if (!r.maybe_isnan ())
> +return;
> +
> +  bool op1_nan = op1.maybe_isnan ();
> +  bool op2_nan = op2.maybe_isnan ();
> +  bool sign1, sign2;
> +
> +  gcc_checking_assert (!r.nan_signbit_p (sign1));
> +  if (op1_nan && op2_nan)
> +{
> +  if (op1.nan_signbit_p (sign1) && op2.nan_signbit_p (sign2))
> + r.update_nan (sign1 | sign2);
> +}
> +  else if (op1_nan)
> +{
> +  if (op1.nan_signbit_p (sign1))
> + r.update_nan (sign1);
> +}
> +  else if (op2_nan)
> +{
> +  if (op2.nan_signbit_p (sign2))
> + r.update_nan (sign2);
> +}
> +}

IEEE 754-2008 says:
"When either an input or result is NaN, this standard does not interpret the 
sign of a NaN. Note, however,
that operations on bit strings — copy, negate, abs, copySign — specify the sign 
bit of a NaN result,
sometimes based upon the sign bit of a NaN operand. The logical predicate 
totalOrder is also affected by
the sign bit of a NaN operand. For all other operations, this standard does not 
specify the sign bit of a NaN
result, even when there is only one input NaN, or when the NaN is produced from 
an invalid
operation."
so, while one can e.g. see on x86_64 that in simple -O0
int
main ()
{
  volatile float f1 = __builtin_nansf ("");
  volatile float f2 = __builtin_copysignf (__builtin_nansf (""), -1.0f);
  volatile float f3 = __builtin_nanf ("");
  volatile float f4 = __builtin_copysignf (__builtin_nanf (""), -1.0f);
  volatile float fzero = 0.0f;
  __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *));
  __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *));
  f1 = -f1;
  f2 = -f2;
  f3 = -f3;
  f4 = -f4;
  __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *));
  __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *));
  volatile float f5 = f1 + fzero;
  volatile float f6 = fzero + f1;
  __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *));
  f5 = f2 + fzero;
  f6 = fzero + f2;
  __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *));
  f5 = f3 + fzero;
  f6 = fzero + f3;
  __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *));
  f5 = f4 + fzero;
  f6 = fzero + f4;
  __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *));
  f5 = f1 + f2;
  f6 = f2 + f1;
  __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *));
  f5 = f2 + f3;
  f6 = f3 + f2;
  __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *));
  f5 = f3 + f4;
  f6 = f4 + f3;
  __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *));
  f5 = f4 + f1;
  f6 = f1 + f4;
  __builtin_printf ("%08x %08x\n", *(unsigned *), *(unsigned *));
  return 0;
}
result of:
7fa0 ffa0
7fc0 ffc0
ffa0 7fa0
ffc0 7fc0
ffe0 ffe0
7fe0 7fe0
ffc0 ffc0
7fc0 7fc0
7fe0 ffe0
ffc0 7fe0
7fc0 ffc0
ffe0 7fc0
which basically shows that copysign copies sign bit, negation toggles it,
binary operation with a single NaN (quiet or signaling) get that NaN
with its sign and for binary operation on 2 NaNs (again quiet or signaling)
one gets sign from the second? operand, I think the above IEEE text
doesn't guarantee it except for a simple assignment (but with no mode
conversion; that one copies the bit), NEGATE_EXPR (toggles it),
ABS_EXPR (clears it), __builtin_copysign*/IFN_COPYSIGN (copies it from
the second operand).  Everything else, including invalid operation cases,
should set the possible sign bit values of NANs to 0/1 rather than just
one of them.  Perhaps COND_EXPR 2nd/3rd operand is a move too.

As you are adding a binary operation, that should be one of those cases
where we drop the NAN sign to VARYING.

> +
> +// If either operand is a NAN, set R to the combination of both NANs
> +// signwise and return TRUE.
> +
> +inline bool
> +propagate_nans (frange , const frange , const frange )
> +{
> +  if (op1.known_isnan () || op2.known_isnan ())
> +{
> +  r.set_nan (op1.type ());
> +  update_nan_sign (r, op1, op2);
> +  return true;
> +}
> +  return false;
> +}
> +
> +// Set VALUE to its next real value, or INF if the operation overflows.
> +
> +inline void
> +frange_nextafter (enum machine_mode mode,
> +   REAL_VALUE_TYPE ,
> +   const REAL_VALUE_TYPE )
> +{
> +  const real_format *fmt = REAL_MODE_FORMAT (mode);
> +  REAL_VALUE_TYPE tmp;
> +  bool overflow = real_nextafter (, fmt, , );
> +  if (overflow)
> +value = inf;
> +  else
> +value = tmp;
> +}
> +
> +// Like real_arithmetic, but round the result to INF

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-10-31 Thread Aldy Hernandez via Gcc-patches

Ping ping

On Mon, Oct 24, 2022, 08:04 Aldy Hernandez  wrote:

> PING
>
> On Mon, Oct 17, 2022 at 8:21 AM Aldy Hernandez  wrote:
> >
> > On Thu, Oct 13, 2022 at 7:57 PM Jakub Jelinek  wrote:
> > >
> > > On Thu, Oct 13, 2022 at 02:36:49PM +0200, Aldy Hernandez wrote:
> > > > +// Like real_arithmetic, but round the result to INF if the
> operation
> > > > +// produced inexact results.
> > > > +//
> > > > +// ?? There is still one problematic case, i387.  With
> > > > +// -fexcess-precision=standard we perform most SF/DFmode arithmetic
> in
> > > > +// XFmode (long_double_type_node), so that case is OK.  But without
> > > > +// -mfpmath=sse, all the SF/DFmode computations are in XFmode
> > > > +// precision (64-bit mantissa) and only occassionally rounded to
> > > > +// SF/DFmode (when storing into memory from the 387 stack).  Maybe
> > > > +// this is ok as well though it is just occassionally more precise.
> ??
> > > > +
> > > > +static void
> > > > +frange_arithmetic (enum tree_code code, tree type,
> > > > +REAL_VALUE_TYPE ,
> > > > +const REAL_VALUE_TYPE ,
> > > > +const REAL_VALUE_TYPE ,
> > > > +const REAL_VALUE_TYPE )
> > > > +{
> > > > +  REAL_VALUE_TYPE value;
> > > > +  enum machine_mode mode = TYPE_MODE (type);
> > > > +  bool mode_composite = MODE_COMPOSITE_P (mode);
> > > > +
> > > > +  bool inexact = real_arithmetic (, code, , );
> > > > +  real_convert (, mode, );
> > > > +
> > > > +  // If real_convert above has rounded an inexact value to towards
> > > > +  // inf, we can keep the result as is, otherwise we'll adjust by 1
> ulp
> > > > +  // later (real_nextafter).
> > > > +  bool rounding = (flag_rounding_math
> > > > +&& (real_isneg ()
> > > > +? real_less (, )
> > > > +: !real_less (, )));
> > >
> > > I thought the agreement during Cauldron was that we'd do this always,
> > > regardless of flag_rounding_math.
> > > Because excess precision (the fast one like on ia32 or -mfpmath=387 on
> > > x86_64), or -frounding-math, or FMA contraction can all increase
> precision
> > > and worst case it all behaves like -frounding-math for the ranges.
> > >
> > > So, perhaps use:
> > >   if ((mode_composite || (real_isneg () ? real_less (,
> )
> > > : !real_less (,
> ))
> > >   && (inexact || !real_identical (, 
> >
> > Done.
> >
> > > ?
> > > No need to do the real_isneg/real_less stuff for mode_composite, then
> > > we do it always for inexacts, but otherwise we check if the rounding
> > > performed by real.cc has been in the conservative direction (for upper
> > > bound to +inf, for lower bound to -inf), if yes, we don't need to do
> > > anything, if yes, we frange_nextafter.
> > >
> > > As discussed, for mode_composite, I think we want to do the extra
> > > stuff for inexact denormals and otherwise do the nextafter
> unconditionally,
> > > because our internal mode_composite representation isn't precise
> enough.
> > >
> > > > +  // Be extra careful if there may be discrepancies between the
> > > > +  // compile and runtime results.
> > > > +  if ((rounding || mode_composite)
> > > > +  && (inexact || !real_identical (, )))
> > > > +{
> > > > +  if (mode_composite)
> > > > + {
> > > > +   bool denormal = (result.sig[SIGSZ-1] & SIG_MSB) == 0;
> > >
> > > Use real_isdenormal here?
> >
> > Done.
> >
> > > Though, real_iszero needs the same thing.
> >
> > So... real_isdenormal() || real_iszero() as in the attached patch?
> >
> > >
> > > > +   if (denormal)
> > > > + {
> > > > +   REAL_VALUE_TYPE tmp;
> > >
> > > And explain here why is this, that IBM extended denormals have just
> > > DFmode precision.
> >
> > Done.
> >
> > > Though, now that I think about it, while this is correct for denormals,
> > >
> > > > +   real_convert (, DFmode, );
> > > > +   frange_nextafter (DFmode, tmp, inf);
> > > > +   real_convert (, mode, );
> > > > + }
> > >
> > > there are also the cases where the higher double exponent is in the
> > > [__DBL_MIN_EXP__, __LDBL_MIN_EXP__] aka [-1021, -968] or so.
> > > https://en.wikipedia.org/wiki/Double-precision_floating-point_format
> > > If the upper double is denormal in the DFmode sense, so smaller
> absolute
> > > value than __DBL_MIN__, then doing nextafter in DFmode is the right
> thing to
> > > do, the lower double must be always +/- zero.
> > > Now, if the result is __DBL_MIN__, the upper double is already
> normalized
> > > but we can add __DBL_DENORM_MIN__ to it, which will make the number
> have
> > > 54-bit precision.
> > > If the result is __DBL_MIN__ * 2, we can again add __DBL_DENORM_MIN__
> > > and make it 55-bit precision.  Etc. until we reach __DBL_MIN__ * 2e53
> > > where it acts like fully normalized 106-bit precision number.
> > > I must say I'm not really sure what real_nextafter is doing in those
> cases,
> > > I'm

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-10-28 Thread Jeff Law via Gcc-patches




On 10/24/22 00:04, Aldy Hernandez via Gcc-patches wrote:

PING


I'd be a lot more comfortable if Jakub would chime in here.


Jeff

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-10-24 Thread Aldy Hernandez via Gcc-patches

PING

On Mon, Oct 17, 2022 at 8:21 AM Aldy Hernandez  wrote:
>
> On Thu, Oct 13, 2022 at 7:57 PM Jakub Jelinek  wrote:
> >
> > On Thu, Oct 13, 2022 at 02:36:49PM +0200, Aldy Hernandez wrote:
> > > +// Like real_arithmetic, but round the result to INF if the operation
> > > +// produced inexact results.
> > > +//
> > > +// ?? There is still one problematic case, i387.  With
> > > +// -fexcess-precision=standard we perform most SF/DFmode arithmetic in
> > > +// XFmode (long_double_type_node), so that case is OK.  But without
> > > +// -mfpmath=sse, all the SF/DFmode computations are in XFmode
> > > +// precision (64-bit mantissa) and only occassionally rounded to
> > > +// SF/DFmode (when storing into memory from the 387 stack).  Maybe
> > > +// this is ok as well though it is just occassionally more precise. ??
> > > +
> > > +static void
> > > +frange_arithmetic (enum tree_code code, tree type,
> > > +REAL_VALUE_TYPE ,
> > > +const REAL_VALUE_TYPE ,
> > > +const REAL_VALUE_TYPE ,
> > > +const REAL_VALUE_TYPE )
> > > +{
> > > +  REAL_VALUE_TYPE value;
> > > +  enum machine_mode mode = TYPE_MODE (type);
> > > +  bool mode_composite = MODE_COMPOSITE_P (mode);
> > > +
> > > +  bool inexact = real_arithmetic (, code, , );
> > > +  real_convert (, mode, );
> > > +
> > > +  // If real_convert above has rounded an inexact value to towards
> > > +  // inf, we can keep the result as is, otherwise we'll adjust by 1 ulp
> > > +  // later (real_nextafter).
> > > +  bool rounding = (flag_rounding_math
> > > +&& (real_isneg ()
> > > +? real_less (, )
> > > +: !real_less (, )));
> >
> > I thought the agreement during Cauldron was that we'd do this always,
> > regardless of flag_rounding_math.
> > Because excess precision (the fast one like on ia32 or -mfpmath=387 on
> > x86_64), or -frounding-math, or FMA contraction can all increase precision
> > and worst case it all behaves like -frounding-math for the ranges.
> >
> > So, perhaps use:
> >   if ((mode_composite || (real_isneg () ? real_less (, )
> > : !real_less (, ))
> >   && (inexact || !real_identical (, 
>
> Done.
>
> > ?
> > No need to do the real_isneg/real_less stuff for mode_composite, then
> > we do it always for inexacts, but otherwise we check if the rounding
> > performed by real.cc has been in the conservative direction (for upper
> > bound to +inf, for lower bound to -inf), if yes, we don't need to do
> > anything, if yes, we frange_nextafter.
> >
> > As discussed, for mode_composite, I think we want to do the extra
> > stuff for inexact denormals and otherwise do the nextafter unconditionally,
> > because our internal mode_composite representation isn't precise enough.
> >
> > > +  // Be extra careful if there may be discrepancies between the
> > > +  // compile and runtime results.
> > > +  if ((rounding || mode_composite)
> > > +  && (inexact || !real_identical (, )))
> > > +{
> > > +  if (mode_composite)
> > > + {
> > > +   bool denormal = (result.sig[SIGSZ-1] & SIG_MSB) == 0;
> >
> > Use real_isdenormal here?
>
> Done.
>
> > Though, real_iszero needs the same thing.
>
> So... real_isdenormal() || real_iszero() as in the attached patch?
>
> >
> > > +   if (denormal)
> > > + {
> > > +   REAL_VALUE_TYPE tmp;
> >
> > And explain here why is this, that IBM extended denormals have just
> > DFmode precision.
>
> Done.
>
> > Though, now that I think about it, while this is correct for denormals,
> >
> > > +   real_convert (, DFmode, );
> > > +   frange_nextafter (DFmode, tmp, inf);
> > > +   real_convert (, mode, );
> > > + }
> >
> > there are also the cases where the higher double exponent is in the
> > [__DBL_MIN_EXP__, __LDBL_MIN_EXP__] aka [-1021, -968] or so.
> > https://en.wikipedia.org/wiki/Double-precision_floating-point_format
> > If the upper double is denormal in the DFmode sense, so smaller absolute
> > value than __DBL_MIN__, then doing nextafter in DFmode is the right thing to
> > do, the lower double must be always +/- zero.
> > Now, if the result is __DBL_MIN__, the upper double is already normalized
> > but we can add __DBL_DENORM_MIN__ to it, which will make the number have
> > 54-bit precision.
> > If the result is __DBL_MIN__ * 2, we can again add __DBL_DENORM_MIN__
> > and make it 55-bit precision.  Etc. until we reach __DBL_MIN__ * 2e53
> > where it acts like fully normalized 106-bit precision number.
> > I must say I'm not really sure what real_nextafter is doing in those cases,
> > I'm afraid it doesn't handle it correctly but the only other use
> > of real_nextafter is guarded with:
> >   /* Don't handle composite modes, nor decimal, nor modes without
> >  inf or denorm at least for now.  */
> >   if (format->pnan < format->p
> >   || format->b == 10
> >   || !format->has_inf
> >

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-10-17 Thread Aldy Hernandez via Gcc-patches

ite modes, nor decimal, nor modes without
>  inf or denorm at least for now.  */
>   if (format->pnan < format->p
>   || format->b == 10
>   || !format->has_inf
>   || !format->has_denorm)
> return false;

Dunno.  Is there a conservative thing we can do for mode_composites
that aren't denormal or zero?

How does this look?
Aldy
From d7be6caf60133bc39f8224e2f0e00dabcdbbe55d Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Thu, 13 Oct 2022 08:14:16 +0200
Subject: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

This is the range-op entry for floating point PLUS_EXPR.  It's the
most intricate range entry we have so far, because we need to keep
track of rounding and target FP formats.  This will be the last FP
entry I commit, mostly to avoid disturbing the tree any further, and
also because what we have so far is enough for a solid VRP.

So far we track NANs and signs correctly.  We also handle relationals
(symbolics and numeric), both ordered and unordered, ABS_EXPR and
NEGATE_EXPR which are used to fold __builtin_isinf, and __builtin_sign
(__builtin_copysign is coming up).  All in all, I think this provide
more than enough for basic VRP on floats, as well as provide a basis
to flesh out the rest if there's interest.

My goal with this entry is to provide a template for additional binary
operators, as they tend to follow a similar pattern: handle NANs, do
the arithmetic while keeping track of rounding, and adjust for NAN.  I
may abstract the general parts as we do for irange's fold_range and
wi_fold.

	PR tree-optimization/24021

gcc/ChangeLog:

	* range-op-float.cc (update_nan_sign): New.
	(propagate_nans): New.
	(frange_nextafter): New.
	(frange_arithmetic): New.
	(class foperator_plus): New.
	(floating_op_table::floating_op_table): Add PLUS_EXPR entry.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/vrp-float-plus.c: New test.
---
 gcc/range-op-float.cc | 163 ++
 .../gcc.dg/tree-ssa/vrp-float-plus.c  |  21 +++
 2 files changed, 184 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index 23e0f5ef4e2..7ffd5795e17 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -200,6 +200,116 @@ frelop_early_resolve (irange , tree type,
 	  && relop_early_resolve (r, type, op1, op2, rel, my_rel));
 }
 
+// If R contains a NAN of unknown sign, update the NAN's signbit
+// depending on two operands.
+
+inline void
+update_nan_sign (frange , const frange , const frange )
+{
+  if (!r.maybe_isnan ())
+return;
+
+  bool op1_nan = op1.maybe_isnan ();
+  bool op2_nan = op2.maybe_isnan ();
+  bool sign1, sign2;
+
+  gcc_checking_assert (!r.nan_signbit_p (sign1));
+  if (op1_nan && op2_nan)
+{
+  if (op1.nan_signbit_p (sign1) && op2.nan_signbit_p (sign2))
+	r.update_nan (sign1 | sign2);
+}
+  else if (op1_nan)
+{
+  if (op1.nan_signbit_p (sign1))
+	r.update_nan (sign1);
+}
+  else if (op2_nan)
+{
+  if (op2.nan_signbit_p (sign2))
+	r.update_nan (sign2);
+}
+}
+
+// If either operand is a NAN, set R to the combination of both NANs
+// signwise and return TRUE.
+
+inline bool
+propagate_nans (frange , const frange , const frange )
+{
+  if (op1.known_isnan () || op2.known_isnan ())
+{
+  r.set_nan (op1.type ());
+  update_nan_sign (r, op1, op2);
+  return true;
+}
+  return false;
+}
+
+// Set VALUE to its next real value, or INF if the operation overflows.
+
+inline void
+frange_nextafter (enum machine_mode mode,
+		  REAL_VALUE_TYPE ,
+		  const REAL_VALUE_TYPE )
+{
+  const real_format *fmt = REAL_MODE_FORMAT (mode);
+  REAL_VALUE_TYPE tmp;
+  bool overflow = real_nextafter (, fmt, , );
+  if (overflow)
+value = inf;
+  else
+value = tmp;
+}
+
+// Like real_arithmetic, but round the result to INF if the operation
+// produced inexact results.
+//
+// ?? There is still one problematic case, i387.  With
+// -fexcess-precision=standard we perform most SF/DFmode arithmetic in
+// XFmode (long_double_type_node), so that case is OK.  But without
+// -mfpmath=sse, all the SF/DFmode computations are in XFmode
+// precision (64-bit mantissa) and only occassionally rounded to
+// SF/DFmode (when storing into memory from the 387 stack).  Maybe
+// this is ok as well though it is just occassionally more precise. ??
+
+static void
+frange_arithmetic (enum tree_code code, tree type,
+		   REAL_VALUE_TYPE ,
+		   const REAL_VALUE_TYPE ,
+		   const REAL_VALUE_TYPE ,
+		   const REAL_VALUE_TYPE )
+{
+  REAL_VALUE_TYPE value;
+  enum machine_mode mode = TYPE_MODE (type);
+  bool mode_composite = MODE_COMPOSITE_P (mode);
+
+  bool inexact = real_arithmetic (, code, , );
+  real_convert (, mode, );
+
+  // Be extra careful if there may be discrepancies between the
+  // compile and runtime results.
+  if ((mode_composite || (real_i

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-10-14 Thread Aldy Hernandez via Gcc-patches

On Thu, Oct 13, 2022 at 3:52 PM Toon Moene  wrote:
>
> It was just a comment on the code of the PR ...

Ahh... the patch adds support for ranger handling of floating point
PLUS_EXPRs in the IL.  The PR's testcase is just one of the many
things it will be able to solve.

Aldy

>
> Toon.
>
> On 10/13/22 15:44, Aldy Hernandez wrote:
>
> > I'm not following.  My patch doesn't affect this behavior.
> >
> > What am I missing?
> >
> > Aldy
> >
> > On Thu, Oct 13, 2022 at 3:04 PM Toon Moene  wrote:
> >>
> >> On 10/13/22 14:36, Aldy Hernandez via Gcc-patches wrote:
> >>
> >>>PR tree-optimization/24021
> >>
> >> Ah - Verboten in Fortran:
> >>
> >> $ cat d.f
> >> DOUBLE PRECISION A, X
> >> A = 0.0
> >> DO X = 0.1, 1.0
> >>A = A + X
> >> ENDDO
> >> END
> >> $ gfortran d.f
> >> d.f:3:9:
> >>
> >>   3 |   DO X = 0.1, 1.0
> >> | 1
> >> Warning: Deleted feature: Loop variable at (1) must be integer
> >> d.f:3:12:
> >>
> >>   3 |   DO X = 0.1, 1.0
> >> |1
> >> Warning: Deleted feature: Start expression in DO loop at (1) must be 
> >> integer
> >> d.f:3:17:
> >>
> >>   3 |   DO X = 0.1, 1.0
> >> | 1
> >> Warning: Deleted feature: End expression in DO loop at (1) must be integer
> >>
> >> --
> >> Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
> >> Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
> >>
> >
>
> --
> Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
>

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-10-13 Thread Jakub Jelinek via Gcc-patches

On Thu, Oct 13, 2022 at 02:36:49PM +0200, Aldy Hernandez wrote:
> +// Like real_arithmetic, but round the result to INF if the operation
> +// produced inexact results.
> +//
> +// ?? There is still one problematic case, i387.  With
> +// -fexcess-precision=standard we perform most SF/DFmode arithmetic in
> +// XFmode (long_double_type_node), so that case is OK.  But without
> +// -mfpmath=sse, all the SF/DFmode computations are in XFmode
> +// precision (64-bit mantissa) and only occassionally rounded to
> +// SF/DFmode (when storing into memory from the 387 stack).  Maybe
> +// this is ok as well though it is just occassionally more precise. ??
> +
> +static void
> +frange_arithmetic (enum tree_code code, tree type,
> +REAL_VALUE_TYPE ,
> +const REAL_VALUE_TYPE ,
> +const REAL_VALUE_TYPE ,
> +const REAL_VALUE_TYPE )
> +{
> +  REAL_VALUE_TYPE value;
> +  enum machine_mode mode = TYPE_MODE (type);
> +  bool mode_composite = MODE_COMPOSITE_P (mode);
> +
> +  bool inexact = real_arithmetic (, code, , );
> +  real_convert (, mode, );
> +
> +  // If real_convert above has rounded an inexact value to towards
> +  // inf, we can keep the result as is, otherwise we'll adjust by 1 ulp
> +  // later (real_nextafter).
> +  bool rounding = (flag_rounding_math
> +&& (real_isneg ()
> +? real_less (, )
> +: !real_less (, )));

I thought the agreement during Cauldron was that we'd do this always,
regardless of flag_rounding_math.
Because excess precision (the fast one like on ia32 or -mfpmath=387 on
x86_64), or -frounding-math, or FMA contraction can all increase precision
and worst case it all behaves like -frounding-math for the ranges.

So, perhaps use:
  if ((mode_composite || (real_isneg () ? real_less (, )
: !real_less (, ))
  && (inexact || !real_identical (, 
?
No need to do the real_isneg/real_less stuff for mode_composite, then
we do it always for inexacts, but otherwise we check if the rounding
performed by real.cc has been in the conservative direction (for upper
bound to +inf, for lower bound to -inf), if yes, we don't need to do
anything, if yes, we frange_nextafter.

As discussed, for mode_composite, I think we want to do the extra
stuff for inexact denormals and otherwise do the nextafter unconditionally,
because our internal mode_composite representation isn't precise enough.

> +  // Be extra careful if there may be discrepancies between the
> +  // compile and runtime results.
> +  if ((rounding || mode_composite)
> +  && (inexact || !real_identical (, )))
> +{
> +  if (mode_composite)
> + {
> +   bool denormal = (result.sig[SIGSZ-1] & SIG_MSB) == 0;

Use real_isdenormal here?
Though, real_iszero needs the same thing.

> +   if (denormal)
> + {
> +   REAL_VALUE_TYPE tmp;

And explain here why is this, that IBM extended denormals have just
DFmode precision.
Though, now that I think about it, while this is correct for denormals,

> +   real_convert (, DFmode, );
> +   frange_nextafter (DFmode, tmp, inf);
> +   real_convert (, mode, );
> + }

there are also the cases where the higher double exponent is in the
[__DBL_MIN_EXP__, __LDBL_MIN_EXP__] aka [-1021, -968] or so.
https://en.wikipedia.org/wiki/Double-precision_floating-point_format
If the upper double is denormal in the DFmode sense, so smaller absolute
value than __DBL_MIN__, then doing nextafter in DFmode is the right thing to
do, the lower double must be always +/- zero.
Now, if the result is __DBL_MIN__, the upper double is already normalized
but we can add __DBL_DENORM_MIN__ to it, which will make the number have
54-bit precision.
If the result is __DBL_MIN__ * 2, we can again add __DBL_DENORM_MIN__
and make it 55-bit precision.  Etc. until we reach __DBL_MIN__ * 2e53
where it acts like fully normalized 106-bit precision number.
I must say I'm not really sure what real_nextafter is doing in those cases,
I'm afraid it doesn't handle it correctly but the only other use
of real_nextafter is guarded with:
  /* Don't handle composite modes, nor decimal, nor modes without
 inf or denorm at least for now.  */
  if (format->pnan < format->p
  || format->b == 10
  || !format->has_inf
  || !format->has_denorm)
return false;
so it isn't that big deal except for ranges.

Jakub

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-10-13 Thread Toon Moene


It was just a comment on the code of the PR ...

Toon.

On 10/13/22 15:44, Aldy Hernandez wrote:


I'm not following.  My patch doesn't affect this behavior.

What am I missing?

Aldy

On Thu, Oct 13, 2022 at 3:04 PM Toon Moene  wrote:


On 10/13/22 14:36, Aldy Hernandez via Gcc-patches wrote:


   PR tree-optimization/24021


Ah - Verboten in Fortran:

$ cat d.f
DOUBLE PRECISION A, X
A = 0.0
DO X = 0.1, 1.0
   A = A + X
ENDDO
END
$ gfortran d.f
d.f:3:9:

  3 |   DO X = 0.1, 1.0
| 1
Warning: Deleted feature: Loop variable at (1) must be integer
d.f:3:12:

  3 |   DO X = 0.1, 1.0
|1
Warning: Deleted feature: Start expression in DO loop at (1) must be integer
d.f:3:17:

  3 |   DO X = 0.1, 1.0
| 1
Warning: Deleted feature: End expression in DO loop at (1) must be integer

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands





--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-10-13 Thread Aldy Hernandez via Gcc-patches

I'm not following.  My patch doesn't affect this behavior.

What am I missing?

Aldy

On Thu, Oct 13, 2022 at 3:04 PM Toon Moene  wrote:
>
> On 10/13/22 14:36, Aldy Hernandez via Gcc-patches wrote:
>
> >   PR tree-optimization/24021
>
> Ah - Verboten in Fortran:
>
> $ cat d.f
>DOUBLE PRECISION A, X
>A = 0.0
>DO X = 0.1, 1.0
>   A = A + X
>ENDDO
>END
> $ gfortran d.f
> d.f:3:9:
>
>  3 |   DO X = 0.1, 1.0
>| 1
> Warning: Deleted feature: Loop variable at (1) must be integer
> d.f:3:12:
>
>  3 |   DO X = 0.1, 1.0
>|1
> Warning: Deleted feature: Start expression in DO loop at (1) must be integer
> d.f:3:17:
>
>  3 |   DO X = 0.1, 1.0
>| 1
> Warning: Deleted feature: End expression in DO loop at (1) must be integer
>
> --
> Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
>

Re: [PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-10-13 Thread Toon Moene


On 10/13/22 14:36, Aldy Hernandez via Gcc-patches wrote:


PR tree-optimization/24021


Ah - Verboten in Fortran:

$ cat d.f
  DOUBLE PRECISION A, X
  A = 0.0
  DO X = 0.1, 1.0
 A = A + X
  ENDDO
  END
$ gfortran d.f
d.f:3:9:

3 |   DO X = 0.1, 1.0
  | 1
Warning: Deleted feature: Loop variable at (1) must be integer
d.f:3:12:

3 |   DO X = 0.1, 1.0
  |1
Warning: Deleted feature: Start expression in DO loop at (1) must be integer
d.f:3:17:

3 |   DO X = 0.1, 1.0
  | 1
Warning: Deleted feature: End expression in DO loop at (1) must be integer

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

[PATCH] [PR24021] Implement PLUS_EXPR range-op entry for floats.

2022-10-13 Thread Aldy Hernandez via Gcc-patches

[Jakub, this is a cleaned up version of what we iterated on earlier
this summer.  It contains additional smarts to propagate NAN signs on
entry.  I'd like a nod before committing.]

This is the range-op entry for floating point PLUS_EXPR.  It's the
most intricate range entry we have so far, because we need to keep
track of rounding and target FP formats.  This will be the last FP
entry I commit, mostly to avoid disturbing the tree any further, and
also because what we have so far is enough for a solid VRP.

So far we track NANs and signs correctly.  We also handle relationals
(symbolics and numeric), both ordered and unordered, ABS_EXPR and
NEGATE_EXPR which are used to fold __builtin_isinf, and __builtin_sign
(__builtin_copysign is coming up).  All in all, I think this provide
more than enough for basic VRP on floats, as well as provide a basis
to flesh out the rest if there's interest.

My goal with this entry is to provide a template for additional binary
operators, as they tend to follow a similar pattern: handle NANs, do
the arithmetic while keeping track of rounding, and adjust for NAN.  I
may abstract the general parts as we do for irange's fold_range and
wi_fold.

Oh yeah... and I'd like to finally close this PR ;-).

How does this look?

PR tree-optimization/24021

gcc/ChangeLog:

* range-op-float.cc (update_nan_sign): New.
(propagate_nans): New.
(frange_nextafter): New.
(frange_arithmetic): New.
(class foperator_plus): New.
(floating_op_table::floating_op_table): Add PLUS_EXPR entry.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/vrp-float-plus.c: New test.
---
 gcc/range-op-float.cc | 171 ++
 .../gcc.dg/tree-ssa/vrp-float-plus.c  |  21 +++
 2 files changed, 192 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-plus.c

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index 23e0f5ef4e2..a967c4da393 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -200,6 +200,124 @@ frelop_early_resolve (irange , tree type,
  && relop_early_resolve (r, type, op1, op2, rel, my_rel));
 }
 
+// If R contains a NAN of unknown sign, update the NAN's signbit
+// depending on two operands.
+
+inline void
+update_nan_sign (frange , const frange , const frange )
+{
+  if (!r.maybe_isnan ())
+return;
+
+  bool op1_nan = op1.maybe_isnan ();
+  bool op2_nan = op2.maybe_isnan ();
+  bool sign1, sign2;
+
+  gcc_checking_assert (!r.nan_signbit_p (sign1));
+  if (op1_nan && op2_nan)
+{
+  if (op1.nan_signbit_p (sign1) && op2.nan_signbit_p (sign2))
+   r.update_nan (sign1 | sign2);
+}
+  else if (op1_nan)
+{
+  if (op1.nan_signbit_p (sign1))
+   r.update_nan (sign1);
+}
+  else if (op2_nan)
+{
+  if (op2.nan_signbit_p (sign2))
+   r.update_nan (sign2);
+}
+}
+
+// If either operand is a NAN, set R to the combination of both NANs
+// signwise and return TRUE.
+
+inline bool
+propagate_nans (frange , const frange , const frange )
+{
+  if (op1.known_isnan () || op2.known_isnan ())
+{
+  r.set_nan (op1.type ());
+  update_nan_sign (r, op1, op2);
+  return true;
+}
+  return false;
+}
+
+// Set VALUE to its next real value, or INF if the operation overflows.
+
+inline void
+frange_nextafter (enum machine_mode mode,
+ REAL_VALUE_TYPE ,
+ const REAL_VALUE_TYPE )
+{
+  const real_format *fmt = REAL_MODE_FORMAT (mode);
+  REAL_VALUE_TYPE tmp;
+  bool overflow = real_nextafter (, fmt, , );
+  if (overflow)
+value = inf;
+  else
+value = tmp;
+}
+
+// Like real_arithmetic, but round the result to INF if the operation
+// produced inexact results.
+//
+// ?? There is still one problematic case, i387.  With
+// -fexcess-precision=standard we perform most SF/DFmode arithmetic in
+// XFmode (long_double_type_node), so that case is OK.  But without
+// -mfpmath=sse, all the SF/DFmode computations are in XFmode
+// precision (64-bit mantissa) and only occassionally rounded to
+// SF/DFmode (when storing into memory from the 387 stack).  Maybe
+// this is ok as well though it is just occassionally more precise. ??
+
+static void
+frange_arithmetic (enum tree_code code, tree type,
+  REAL_VALUE_TYPE ,
+  const REAL_VALUE_TYPE ,
+  const REAL_VALUE_TYPE ,
+  const REAL_VALUE_TYPE )
+{
+  REAL_VALUE_TYPE value;
+  enum machine_mode mode = TYPE_MODE (type);
+  bool mode_composite = MODE_COMPOSITE_P (mode);
+
+  bool inexact = real_arithmetic (, code, , );
+  real_convert (, mode, );
+
+  // If real_convert above has rounded an inexact value to towards
+  // inf, we can keep the result as is, otherwise we'll adjust by 1 ulp
+  // later (real_nextafter).
+  bool rounding = (flag_rounding_math
+  && (real_isneg ()
+  ? real_less (, )
+  : !real_less (, )));
+
+

37 matches

Mail list logo