Re: proposal: minscale, rtrim, btrim functions for numeric

Pavel Stehule Sat, 07 Dec 2019 23:40:07 -0800

Hi

ne 8. 12. 2019 v 2:23 odesílatel Karl O. Pinc <k...@meme.com> napsal:


> Hello Pavel,
>
> On Mon, 11 Nov 2019 15:47:37 +0100
> Pavel Stehule <pavel.steh...@gmail.com> wrote:
>
> > Here is a patch. It's based on Dean's suggestions.
> >
> > I implemented two functions - first minscale, second trim_scale. The
> > overhead of second is minimal - so I think it can be good to have it.
> > I started design with the name "trim_scale", but the name can be any
> > other.
>
> Here are my thoughts on your patch.
>
> My one substantial criticism is that I believe that
> trim_scale('NaN'::numeric) should return NaN.
> So the test output should look like:
>
> template1=# select trim_scale('nan'::numeric) = 'nan'::numeric;
>  ?column?
> ----------
>  t
> (1 row)
>

fixed


>
> FWIW, I bumped around the Internet and looked at Oracle docs to see if
> there's any reason why minscale() might not be a good function name.
> I couldn't find any problems.
>
> I also couldn't think of a better name than trim_scale() and don't
> have any problems with the name.
>
> My other suggestions mostly have to do with documentation.  Your code
> looks pretty good to me, looks like the existing code, you name
> variables and function names as in existing code, etc.
>
> I comment on various hunks in line below:
>
> diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
> index 28eb322f3f..6f142cd679 100644
> --- a/doc/src/sgml/func.sgml
> +++ b/doc/src/sgml/func.sgml
> @@ -918,6 +918,19 @@
>         <entry><literal>6.0000000000</literal></entry>
>        </row>
>
> +      <row>
> +       <entry>
> +        <indexterm>
> +         <primary>minscale</primary>
> +        </indexterm>
> +
> <literal><function>minscale(<type>numeric</type>)</function></literal>
> +       </entry>
> +       <entry><type>integer</type></entry>
> +       <entry>returns minimal scale of the argument (the number of
> decimal digits in the fractional part)</entry>
> +       <entry><literal>scale(8.4100)</literal></entry>
> +       <entry><literal>2</literal></entry>
> +      </row>
> +
>        <row>
>         <entry>
>          <indexterm>
>
> *****
> Your description does not say what the minimal scale is.  How about:
>
> minimal scale (number of decimal digits in the fractional part) needed
> to store the supplied value without data loss
> *****
>

sounds better, updated


> @@ -1041,6 +1054,19 @@
>         <entry><literal>1.4142135623731</literal></entry>
>        </row>
>
> +      <row>
> +       <entry>
> +        <indexterm>
> +         <primary>trim_scale</primary>
> +        </indexterm>
> +
> <literal><function>trim_scale(<type>numeric</type>)</function></literal>
> +       </entry>
> +       <entry><type>numeric</type></entry>
> +       <entry>reduce scale of the argument (the number of decimal
> digits in the fractional part)</entry>
> +       <entry><literal>scale(8.4100)</literal></entry>
> +       <entry><literal>8.41</literal></entry>
> +      </row>
> +
>        <row>
>         <entry>
>          <indexterm>
>
> ****
> How about:
>
> reduce the scale (the number of decimal digits in the fractional part)
> to the minimum needed to store the supplied value without data loss
> *****
>

ok, changed


> diff --git a/src/backend/utils/adt/numeric.c
> b/src/backend/utils/adt/numeric.c index a00db3ce7a..35234aee4c 100644
> --- a/src/backend/utils/adt/numeric.c
> +++ b/src/backend/utils/adt/numeric.c
>
> ****
> I believe the hunks in this file should start at about line# 3181.
> This is right after numeric_scale().  Seems like all the scale
> related functions should be together.
>
> There's no hard standard but I don't see why lines (comment lines in
> your case) should be longer than 78 characters without good reason.
> Please reformat.
> ****
>
> @@ -5620,6 +5620,88 @@ int2int4_sum(PG_FUNCTION_ARGS)
>         PG_RETURN_DATUM(Int64GetDatumFast(transdata->sum));
>  }
>
> +/*
> + * Calculate minimal display scale. The var should be stripped already.
>
> ****
> I think you can get rid of the word "display" in the comment.
> ****
>

done


> + */
> +static int
> +get_min_scale(NumericVar *var)
> +{
> +       int             minscale = 0;
> +
> +       if (var->ndigits > 0)
> +       {
> +               NumericDigit last_digit;
> +
> +               /* maximal size of minscale, can be lower */
> +               minscale = (var->ndigits - var->weight - 1) *
>   DEC_DIGITS; +
> +               /*
> +                * When there are not digits after decimal point, the
>   previous expression
>
> ****
> s/not/no/
> ****
>
> +                * can be negative. In this case, the minscale must be
>   zero.
> +                */
>
> ****
> s/can be/is/
> ****
>
> +               if (minscale > 0)
> +               {
> +                       /* reduce minscale if trailing digits in last
>   numeric digits are zero */
> +                       last_digit = var->digits[var->ndigits - 1];
> +
> +                       while (last_digit % 10 ==  0)
> +                       {
> +                               minscale--;
> +                               last_digit /= 10;
> +                       }
> +               }
> +               else
> +                       minscale = 0;
> +       }
> +
> +       return minscale;
> +}
> +
> +/*
> + * Returns minimal scale of numeric value when value is not changed
>
> ****
> Improve comment, something like:
>   minimal scale required to represent supplied value without loss
>

ok

****
>
> + */
> +Datum
> +numeric_minscale(PG_FUNCTION_ARGS)
> +{
> +       Numeric         num = PG_GETARG_NUMERIC(0);
> +       NumericVar      arg;
> +       int                     minscale;
> +
> +       if (NUMERIC_IS_NAN(num))
> +               PG_RETURN_NULL();
> +
> +       init_var_from_num(num, &arg);
> +       strip_var(&arg);
> +
> +       minscale = get_min_scale(&arg);
> +       free_var(&arg);
> +
> +       PG_RETURN_INT32(minscale);
> +}
> +
> +/*
> + * Reduce scale of numeric value so value is not changed
>
> ****
> Likewise, comment text could be improved
> ****
>
> + */
> +Datum
> +numeric_trim_scale(PG_FUNCTION_ARGS)
> +{
> +       Numeric         num = PG_GETARG_NUMERIC(0);
> +       Numeric         res;
> +       NumericVar      result;
> +
> +       if (NUMERIC_IS_NAN(num))
> +               PG_RETURN_NULL();
> +
> +       init_var_from_num(num, &result);
> +       strip_var(&result);
> +
> +       result.dscale = get_min_scale(&result);
> +
> +       res = make_result(&result);
> +       free_var(&result);
> +
> +       PG_RETURN_NUMERIC(res);
> +}
>
>  /*
>   ----------------------------------------------------------------------
>   * diff --git a/src/include/catalog/pg_proc.dat
>   b/src/include/catalog/pg_proc.dat index 58ea5b982b..e603a5d8dd 100644
>
> ****
> How about moving these new lines to right after line# 4255, the
> scale() function?
> ****
>

has sense, moved


> --- a/src/include/catalog/pg_proc.dat
> +++ b/src/include/catalog/pg_proc.dat
> @@ -4288,6 +4288,12 @@
>    proname => 'width_bucket', prorettype => 'int4',
>    proargtypes => 'numeric numeric numeric int4',
>    prosrc => 'width_bucket_numeric' },
> +{ oid => '3434', descr => 'returns minimal scale of numeric value',
>
> ****
> How about a descr of?:
>
>   minimal scale needed to store the supplied value without data loss
> ****
>

done

>
> +  proname => 'minscale', prorettype => 'int4', proargtypes =>
>   'numeric',
> +  prosrc => 'numeric_minscale' },
> +{ oid => '3435', descr => 'returns numeric value with minimal scale',
>
> ****
> And likewise a descr of?:
>
>   numeric with minimal scale needed to represent the given value
> ****
>
> +  proname => 'trim_scale', prorettype => 'numeric', proargtypes =>
>   'numeric',
> +  prosrc => 'numeric_trim_scale' },
>

done


>  { oid => '1747',
>    proname => 'time_pl_interval', prorettype => 'time',
> diff --git a/src/test/regress/expected/numeric.out
>   b/src/test/regress/expected/numeric.out index 1cb3c3bfab..778c204b13
>   100644
>
> ****
> I have suggestions:
>
> Give the 2 functions separate comments (-- Tests for minscale() and
> -- Tests for trim_scale())
>
> Put () after the function names in the comments
> because that's what scale() does.
>
> Move the lines so the tests are right after the tests of scale().
>
> Be explicit when testing for NULL or NaN.  I don't know that this is
> consistent with the rest of the regression tests but I don't see how
> being explicit could be wrong.  Otherwise NULL and NaN are output the
> same ("") and you're not really testing.
>
> So test with expressions like "foo(NULL) IS NULL" or
> "foo('NaN'::NUMERIC) = 'NaN::NUMERIC" and look for t (or f) results.
>

ok fixed

Thank you for review - I am sending updated rebased patch. Please, update
comments freely - my language skills (about English lang) are basic.

Regards

Pavel



> ****
>
> Regards,
>
> Karl <k...@meme.com>
> Free Software:  "You don't pay back, you pay forward."
>                  -- Robert A. Heinlein
>

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 57a1539506..e7de1837e2 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -918,6 +918,20 @@
        <entry><literal>6.0000000000</literal></entry>
       </row>
 
+      <row>
+       <entry>
+        <indexterm>
+         <primary>minscale</primary>
+        </indexterm>
+        <literal><function>minscale(<type>numeric</type>)</function></literal>
+       </entry>
+       <entry><type>integer</type></entry>
+       <entry>minimal scale (number of decimal digits in the fractional part) needed
+       to store the supplied value without data loss</entry>
+       <entry><literal>scale(8.4100)</literal></entry>
+       <entry><literal>2</literal></entry>
+      </row>
+
       <row>
        <entry>
         <indexterm>
@@ -1041,6 +1055,20 @@
        <entry><literal>1.4142135623731</literal></entry>
       </row>
 
+      <row>
+       <entry>
+        <indexterm>
+         <primary>trim_scale</primary>
+        </indexterm>
+        <literal><function>trim_scale(<type>numeric</type>)</function></literal>
+       </entry>
+       <entry><type>numeric</type></entry>
+       <entry>reduce the scale (the number of decimal digits in the fractional part)
+       to the minimum needed to store the supplied value without data loss</entry>
+       <entry><literal>scale(8.4100)</literal></entry>
+       <entry><literal>8.41</literal></entry>
+      </row>
+
       <row>
        <entry>
         <indexterm>
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index a00db3ce7a..085651d51d 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -5620,6 +5620,88 @@ int2int4_sum(PG_FUNCTION_ARGS)
 	PG_RETURN_DATUM(Int64GetDatumFast(transdata->sum));
 }
 
+/*
+ * Calculate minimal scale. The var should be stripped already.
+ */
+static int
+get_min_scale(NumericVar *var)
+{
+	int		minscale = 0;
+
+	if (var->ndigits > 0)
+	{
+		NumericDigit last_digit;
+
+		/* maximal size of minscale, can be lower */
+		minscale = (var->ndigits - var->weight - 1) * DEC_DIGITS;
+
+		/*
+		 * When there are not digits after decimal point, the previous expression
+		 * can be negative. In this case, the minscale must be zero.
+		 */
+		if (minscale > 0)
+		{
+			/* reduce minscale if trailing digits in last numeric digits are zero */
+			last_digit = var->digits[var->ndigits - 1];
+
+			while (last_digit % 10 ==  0)
+			{
+				minscale--;
+				last_digit /= 10;
+			}
+		}
+		else
+			minscale = 0;
+	}
+
+	return minscale;
+}
+
+/*
+ * Returns minimal scale required to represent supplied value without loss.
+ */
+Datum
+numeric_minscale(PG_FUNCTION_ARGS)
+{
+	Numeric		num = PG_GETARG_NUMERIC(0);
+	NumericVar	arg;
+	int			minscale;
+
+	if (NUMERIC_IS_NAN(num))
+		PG_RETURN_NULL();
+
+	init_var_from_num(num, &arg);
+	strip_var(&arg);
+
+	minscale = get_min_scale(&arg);
+	free_var(&arg);
+
+	PG_RETURN_INT32(minscale);
+}
+
+/*
+ * Reduce scale of numeric value to represent supplied value without loss.
+ */
+Datum
+numeric_trim_scale(PG_FUNCTION_ARGS)
+{
+	Numeric		num = PG_GETARG_NUMERIC(0);
+	Numeric		res;
+	NumericVar	result;
+
+	if (NUMERIC_IS_NAN(num))
+		PG_RETURN_NUMERIC(make_result(&const_nan));
+
+	init_var_from_num(num, &result);
+	strip_var(&result);
+
+	result.dscale = get_min_scale(&result);
+
+	res = make_result(&result);
+	free_var(&result);
+
+	PG_RETURN_NUMERIC(res);
+}
 
 /* ----------------------------------------------------------------------
  *
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index ac8f64b219..1c66b8dd44 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -4254,6 +4254,12 @@
 { oid => '3281', descr => 'number of decimal digits in the fractional part',
   proname => 'scale', prorettype => 'int4', proargtypes => 'numeric',
   prosrc => 'numeric_scale' },
+{ oid => '3434', descr => 'minimal scale needed to store the supplied value without data loss',
+  proname => 'minscale', prorettype => 'int4', proargtypes => 'numeric',
+  prosrc => 'numeric_minscale' },
+{ oid => '3435', descr => 'numeric with minimal scale needed to represent the given value',
+  proname => 'trim_scale', prorettype => 'numeric', proargtypes => 'numeric',
+  prosrc => 'numeric_trim_scale' },
 { oid => '1740', descr => 'convert int4 to numeric',
   proname => 'numeric', prorettype => 'numeric', proargtypes => 'int4',
   prosrc => 'int4_numeric' },
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
index 1cb3c3bfab..19a2cccde3 100644
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -2094,3 +2094,129 @@ SELECT SUM((-9999)::numeric) FROM generate_series(1, 100000);
  -999900000
 (1 row)
 
+--
+-- Tests for minscale()
+--
+select minscale(numeric 'NaN') IS NULL; -- should be true
+ ?column? 
+----------
+ t
+(1 row)
+
+select minscale(NULL::numeric) IS NULL; -- should be true
+ ?column? 
+----------
+ t
+(1 row)
+
+select minscale(1.120);
+ minscale 
+----------
+        2
+(1 row)
+
+select minscale(0);
+ minscale 
+----------
+        0
+(1 row)
+
+select minscale(0.00);
+ minscale 
+----------
+        0
+(1 row)
+
+select minscale(1.1234500);
+ minscale 
+----------
+        5
+(1 row)
+
+select minscale(110123.12475871856128000);
+ minscale 
+----------
+       14
+(1 row)
+
+select minscale(-1123.124718561280000000);
+ minscale 
+----------
+       11
+(1 row)
+
+select minscale(-13.00000000000000000000);
+ minscale 
+----------
+        0
+(1 row)
+
+select minscale(1e100);
+ minscale 
+----------
+        0
+(1 row)
+
+--
+-- Tests for trim_scale()
+--
+select trim_scale(numeric 'NaN') = 'NaN':: numeric; -- should be true
+ ?column? 
+----------
+ t
+(1 row)
+
+select trim_scale(NULL::numeric) IS NULL; -- should be true
+ ?column? 
+----------
+ t
+(1 row)
+
+select trim_scale(1.120);
+ trim_scale 
+------------
+       1.12
+(1 row)
+
+select trim_scale(0);
+ trim_scale 
+------------
+          0
+(1 row)
+
+select trim_scale(0.00);
+ trim_scale 
+------------
+          0
+(1 row)
+
+select trim_scale(1.1234500);
+ trim_scale 
+------------
+    1.12345
+(1 row)
+
+select trim_scale(110123.12475871856128000);
+      trim_scale       
+-----------------------
+ 110123.12475871856128
+(1 row)
+
+select trim_scale(-1123.124718561280000000);
+    trim_scale     
+-------------------
+ -1123.12471856128
+(1 row)
+
+select trim_scale(-13.00000000000000000000);
+ trim_scale 
+------------
+        -13
+(1 row)
+
+select trim_scale(1e100);
+                                              trim_scale                                               
+-------------------------------------------------------------------------------------------------------
+ 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
+(1 row)
+
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
index a939412359..5fbcf20a5b 100644
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -1043,3 +1043,33 @@ select scale(-13.000000000000000);
 -- cases that need carry propagation
 SELECT SUM(9999::numeric) FROM generate_series(1, 100000);
 SELECT SUM((-9999)::numeric) FROM generate_series(1, 100000);
+
+--
+-- Tests for minscale()
+--
+
+select minscale(numeric 'NaN') IS NULL; -- should be true
+select minscale(NULL::numeric) IS NULL; -- should be true
+select minscale(1.120);
+select minscale(0);
+select minscale(0.00);
+select minscale(1.1234500);
+select minscale(110123.12475871856128000);
+select minscale(-1123.124718561280000000);
+select minscale(-13.00000000000000000000);
+select minscale(1e100);
+
+--
+-- Tests for trim_scale()
+--
+
+select trim_scale(numeric 'NaN') = 'NaN':: numeric; -- should be true
+select trim_scale(NULL::numeric) IS NULL; -- should be true
+select trim_scale(1.120);
+select trim_scale(0);
+select trim_scale(0.00);
+select trim_scale(1.1234500);
+select trim_scale(110123.12475871856128000);
+select trim_scale(-1123.124718561280000000);
+select trim_scale(-13.00000000000000000000);
+select trim_scale(1e100);

Re: proposal: minscale, rtrim, btrim functions for numeric

Reply via email to