Re: Fix overflow in DecodeInterval

2022-04-03 Thread Joseph Koshakow
On Sun, Apr 3, 2022 at 3:06 PM Tom Lane  wrote:
> That buildfarm machine is pretty slow, so I'm not in a hurry to test
> it manually either.  However, now that we realize the issue is about
> whether strtod(".") produces EINVAL or not, I think we need to fix
> all the places in datetime.c that are risking that.  After a bit of
> hacking I have the attached.  (I think that the call sites for
> strtoint and its variants are not at risk of passing empty strings,
> so there's not need for concern there.)
>
> BTW, the way you had it coded would allow 'P.Y0M3DT4H5M6S', which
> I don't think we want to allow --- at least, that's rejected by v14
> on my machine.


Oh yeah, good catch. Your patch seems like it should
fix all the issues. Thanks again for the help!

- Joe Koshakow




Re: Fix overflow in DecodeInterval

2022-04-03 Thread Tom Lane
Joseph Koshakow  writes:
> How does this patch look? I don't really have any way to test it on
> AIX.

That buildfarm machine is pretty slow, so I'm not in a hurry to test
it manually either.  However, now that we realize the issue is about
whether strtod(".") produces EINVAL or not, I think we need to fix
all the places in datetime.c that are risking that.  After a bit of
hacking I have the attached.  (I think that the call sites for
strtoint and its variants are not at risk of passing empty strings,
so there's not need for concern there.)

BTW, the way you had it coded would allow 'P.Y0M3DT4H5M6S', which
I don't think we want to allow --- at least, that's rejected by v14
on my machine.

regards, tom lane

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 462f2ed7a8..4c12c4d663 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -668,19 +668,50 @@ AdjustYears(int64 val, int scale,
 }
 
 
-/* Fetch a fractional-second value with suitable error checking */
+/*
+ * Parse the fractional part of a number (decimal point and optional digits,
+ * followed by end of string).  Returns the fractional value into *frac.
+ *
+ * Returns 0 if successful, DTERR code if bogus input detected.
+ */
+static int
+ParseFraction(char *cp, double *frac)
+{
+	/* Caller should always pass the start of the fraction part */
+	Assert(*cp == '.');
+
+	/*
+	 * We want to allow just "." with no digits, but some versions of strtod
+	 * will report EINVAL for that, so special-case it.
+	 */
+	if (cp[1] == '\0')
+	{
+		*frac = 0;
+	}
+	else
+	{
+		errno = 0;
+		*frac = strtod(cp, );
+		/* check for parse failure */
+		if (*cp != '\0' || errno != 0)
+			return DTERR_BAD_FORMAT;
+	}
+	return 0;
+}
+
+/*
+ * Fetch a fractional-second value with suitable error checking.
+ * Same as ParseFraction except we convert the result to integer microseconds.
+ */
 static int
 ParseFractionalSecond(char *cp, fsec_t *fsec)
 {
 	double		frac;
+	int			dterr;
 
-	/* Caller should always pass the start of the fraction part */
-	Assert(*cp == '.');
-	errno = 0;
-	frac = strtod(cp, );
-	/* check for parse failure */
-	if (*cp != '\0' || errno != 0)
-		return DTERR_BAD_FORMAT;
+	dterr = ParseFraction(cp, );
+	if (dterr)
+		return dterr;
 	*fsec = rint(frac * 100);
 	return 0;
 }
@@ -1248,10 +1279,9 @@ DecodeDateTime(char **field, int *ftype, int nf,
 			{
 double		time;
 
-errno = 0;
-time = strtod(cp, );
-if (*cp != '\0' || errno != 0)
-	return DTERR_BAD_FORMAT;
+dterr = ParseFraction(cp, );
+if (dterr)
+	return dterr;
 time *= USECS_PER_DAY;
 dt2time(time,
 		>tm_hour, >tm_min,
@@ -2146,10 +2176,9 @@ DecodeTimeOnly(char **field, int *ftype, int nf,
 			{
 double		time;
 
-errno = 0;
-time = strtod(cp, );
-if (*cp != '\0' || errno != 0)
-	return DTERR_BAD_FORMAT;
+dterr = ParseFraction(cp, );
+if (dterr)
+	return dterr;
 time *= USECS_PER_DAY;
 dt2time(time,
 		>tm_hour, >tm_min,
@@ -3035,13 +3064,21 @@ DecodeNumberField(int len, char *str, int fmask,
 		 * Can we use ParseFractionalSecond here?  Not clear whether trailing
 		 * junk should be rejected ...
 		 */
-		double		frac;
+		if (cp[1] == '\0')
+		{
+			/* avoid assuming that strtod will accept "." */
+			*fsec = 0;
+		}
+		else
+		{
+			double		frac;
 
-		errno = 0;
-		frac = strtod(cp, NULL);
-		if (errno != 0)
-			return DTERR_BAD_FORMAT;
-		*fsec = rint(frac * 100);
+			errno = 0;
+			frac = strtod(cp, NULL);
+			if (errno != 0)
+return DTERR_BAD_FORMAT;
+			*fsec = rint(frac * 100);
+		}
 		/* Now truncate off the fraction for further processing */
 		*cp = '\0';
 		len = strlen(str);
@@ -3467,11 +3504,9 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 }
 else if (*cp == '.')
 {
-	errno = 0;
-	fval = strtod(cp, );
-	if (*cp != '\0' || errno != 0)
-		return DTERR_BAD_FORMAT;
-
+	dterr = ParseFraction(cp, );
+	if (dterr)
+		return dterr;
 	if (*field[i] == '-')
 		fval = -fval;
 }
@@ -3650,6 +3685,7 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
  * Helper functions to avoid duplicated code in DecodeISO8601Interval.
  *
  * Parse a decimal value and break it into integer and fractional parts.
+ * Set *endptr to end+1 of the parsed substring.
  * Returns 0 or DTERR code.
  */
 static int
@@ -3676,7 +3712,20 @@ ParseISO8601Number(char *str, char **endptr, int64 *ipart, double *fpart)
 
 	/* Parse fractional part if there is any */
 	if (**endptr == '.')
-		*fpart = strtod(*endptr, endptr) * sign;
+	{
+		/*
+		 * As in ParseFraction, some versions of strtod insist on seeing some
+		 * digits after '.', but some don't.  We want to allow zero digits
+		 * after '.' as long as there were some before it.
+		 */
+		if 

Re: Fix overflow in DecodeInterval

2022-04-03 Thread Joseph Koshakow
On Sun, Apr 3, 2022 at 12:44 PM Joseph Koshakow  wrote:
>
> On Sun, Apr 3, 2022 at 12:30 PM Tom Lane  wrote:
> >
> > Joseph Koshakow  writes:
> > > So I think we need to check that endptr has moved both after
> > > the call to strtoi64() and strtod().
> >
> > I'm not sure we need to do that explicitly, given that there's
> > a check later as to whether endptr is pointing at \0; that will
> > fail if endptr wasn't advanced.
> >
> > The fix I was loosely envisioning was to check for cp[1] == '\0'
> > and not bother calling strtod() in that case.
>
> Ah, ok I see what you mean. I agree an approach like that should
> work, but I don't actually think cp is null terminated in this case. The
> entire Interval is passed to DecodeISO8601Interval() as one big
> string, so the specific number we're parsing may be somewhere
> in the middle.
>
> If we just do the opposite and check isdigit(cp[1]) and only call
> strtod() in that case I think it should work.
>
> - Joe Koshakow

How does this patch look? I don't really have any way to test it on
AIX.

- Joe Koshakow
From 46b1ce5a78e21b65536c62ca6270c26c992a1ef7 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sun, 3 Apr 2022 12:58:36 -0400
Subject: [PATCH] Fix parsing trailing decimal point in ISO8601

---
 src/backend/utils/adt/datetime.c   |  9 +++--
 src/test/regress/expected/interval.out | 49 --
 src/test/regress/sql/interval.sql  | 11 +-
 3 files changed, 63 insertions(+), 6 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 462f2ed7a8..178313e0d1 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3676,8 +3676,13 @@ ParseISO8601Number(char *str, char **endptr, int64 *ipart, double *fpart)
 
 	/* Parse fractional part if there is any */
 	if (**endptr == '.')
-		*fpart = strtod(*endptr, endptr) * sign;
-
+	{
+		/* A decimal point with no trailing numbers should be parsed as 0 */
+		if (isdigit((unsigned char) *(*endptr + 1)))
+			*fpart = strtod(*endptr, endptr) * sign;
+		else
+			(*endptr)++;
+	}
 	/* did we not see anything that looks like a number? */
 	if (*endptr == str || errno != 0)
 		return DTERR_BAD_FORMAT;
diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out
index 86c8d4bc99..ed051a55c4 100644
--- a/src/test/regress/expected/interval.out
+++ b/src/test/regress/expected/interval.out
@@ -1464,9 +1464,9 @@ select interval 'PT2562047788.1:00:54.775807';
 ERROR:  interval field value out of range: "PT2562047788.1:00:54.775807"
 LINE 1: select interval 'PT2562047788.1:00:54.775807';
 ^
-select interval 'PT2562047788:01.:54.775807';
-ERROR:  interval field value out of range: "PT2562047788:01.:54.775807"
-LINE 1: select interval 'PT2562047788:01.:54.775807';
+select interval 'PT2562047788:01:54.775807';
+ERROR:  interval field value out of range: "PT2562047788:01:54.775807"
+LINE 1: select interval 'PT2562047788:01:54.775807';
 ^
 -- overflowing with fractional fields - SQL standard format
 select interval '0.1 2562047788:0:54.775807';
@@ -1539,6 +1539,49 @@ select interval '-2147483648 months -2147483648 days -9223372036854775808 us';
  @ 178956970 years 8 mons 2147483648 days 2562047788 hours 54.775808 secs ago
 (1 row)
 
+-- check that ISO8601 format accepts trailing '.'
+select interval 'P1.Y2M3DT4H5M6S';
+   interval   
+--
+ @ 1 year 2 mons 3 days 4 hours 5 mins 6 secs
+(1 row)
+
+select interval 'P1Y2.M3DT4H5M6S';
+   interval   
+--
+ @ 1 year 2 mons 3 days 4 hours 5 mins 6 secs
+(1 row)
+
+select interval 'P1Y2M3.DT4H5M6S';
+   interval   
+--
+ @ 1 year 2 mons 3 days 4 hours 5 mins 6 secs
+(1 row)
+
+select interval 'P1Y2M3DT4.H5M6S';
+   interval   
+--
+ @ 1 year 2 mons 3 days 4 hours 5 mins 6 secs
+(1 row)
+
+select interval 'P1Y2M3DT4H5.M6S';
+   interval   
+--
+ @ 1 year 2 mons 3 days 4 hours 5 mins 6 secs
+(1 row)
+
+select interval 'P1Y2M3DT4H5M6.S';
+   interval   
+--
+ @ 1 year 2 mons 3 days 4 hours 5 mins 6 secs
+(1 row)
+
+select interval 'P1.Y2.M3.DT4.H5.M6.S';
+   interval   
+--
+ @ 1 year 2 mons 3 days 4 hours 5 mins 6 secs
+(1 row)
+
 -- check that '30 days' equals '1 month' according to the hash function
 select '30 days'::interval = '1 month'::interval as t;
  t 
diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql
index f05055e03a..62d97f232c 100644
--- 

Re: Fix overflow in DecodeInterval

2022-04-03 Thread Joseph Koshakow
On Sun, Apr 3, 2022 at 12:30 PM Tom Lane  wrote:
>
> Joseph Koshakow  writes:
> > So I think we need to check that endptr has moved both after
> > the call to strtoi64() and strtod().
>
> I'm not sure we need to do that explicitly, given that there's
> a check later as to whether endptr is pointing at \0; that will
> fail if endptr wasn't advanced.
>
> The fix I was loosely envisioning was to check for cp[1] == '\0'
> and not bother calling strtod() in that case.

Ah, ok I see what you mean. I agree an approach like that should
work, but I don't actually think cp is null terminated in this case. The
entire Interval is passed to DecodeISO8601Interval() as one big
string, so the specific number we're parsing may be somewhere
in the middle.

If we just do the opposite and check isdigit(cp[1]) and only call
strtod() in that case I think it should work.

- Joe Koshakow




Re: Fix overflow in DecodeInterval

2022-04-03 Thread Tom Lane
Joseph Koshakow  writes:
> On Sun, Apr 3, 2022 at 12:03 PM Tom Lane  wrote:
>> Oh ... a bit of testing says that strtod() on an empty string
>> succeeds (returning zero) on Linux, but fails with EINVAL on
>> AIX.  The latter is a lot less surprising than the former,
>> so we'd better cope.

> I'm not sure I follow exactly. Where would we pass an empty
> string to strtod()? Wouldn't we be passing a string with a
> single character of '.'?

Oh, I was thinking that we passed "cp + 1" to strtod, but that
was just caffeine deprivation.  You're right, what we are asking
it to parse is "." not "".  The result is the same though:
per testing, AIX sets EINVAL and Linux doesn't.

> So I think we need to check that endptr has moved both after
> the call to strtoi64() and strtod().

I'm not sure we need to do that explicitly, given that there's
a check later as to whether endptr is pointing at \0; that will
fail if endptr wasn't advanced.

The fix I was loosely envisioning was to check for cp[1] == '\0'
and not bother calling strtod() in that case.

regards, tom lane




Re: Fix overflow in DecodeInterval

2022-04-03 Thread Joseph Koshakow
On Sun, Apr 3, 2022 at 12:03 PM Tom Lane  wrote:
>
> I wrote:
> > Joseph Koshakow  writes:
> >> I think I know that the issue is. It's with `ParseISO8601Number` and
> >> the minutes field "1.".
> >> Previously that function parsed the entire field into a single double,
> >> so "1." would
> >> be parsed into 1.0. Now we try to parse the integer and decimal parts
> >> separately. So
> >> we first parse "1" into 1 and then fail to "." into anything because
> >> it's not a valid decimal.
>
> > Interesting point, but then why doesn't it fail everywhere?
>
> Oh ... a bit of testing says that strtod() on an empty string
> succeeds (returning zero) on Linux, but fails with EINVAL on
> AIX.  The latter is a lot less surprising than the former,
> so we'd better cope.
>
> (Reading POSIX with an eagle eye, it looks like both behaviors
> are allowed per spec: this is why you have to check that endptr
> was advanced to be sure everything is kosher.)
>
> regards, tom lane

I'm not sure I follow exactly. Where would we pass an empty
string to strtod()? Wouldn't we be passing a string with a
single character of '.'?

Either way, from reading the man pages though it seems
that strtod() has the same behavior on any invalid input in
Linux, return 0 and don't advance endptr.

So I think we need to check that endptr has moved both after
the call to strtoi64() and strtod().

- Joe Koshakow




Re: Fix overflow in DecodeInterval

2022-04-03 Thread Tom Lane
I wrote:
> Joseph Koshakow  writes:
>> I think I know that the issue is. It's with `ParseISO8601Number` and
>> the minutes field "1.".
>> Previously that function parsed the entire field into a single double,
>> so "1." would
>> be parsed into 1.0. Now we try to parse the integer and decimal parts
>> separately. So
>> we first parse "1" into 1 and then fail to "." into anything because
>> it's not a valid decimal.

> Interesting point, but then why doesn't it fail everywhere?

Oh ... a bit of testing says that strtod() on an empty string
succeeds (returning zero) on Linux, but fails with EINVAL on
AIX.  The latter is a lot less surprising than the former,
so we'd better cope.

(Reading POSIX with an eagle eye, it looks like both behaviors
are allowed per spec: this is why you have to check that endptr
was advanced to be sure everything is kosher.)

regards, tom lane




Re: Fix overflow in DecodeInterval

2022-04-03 Thread Tom Lane
Joseph Koshakow  writes:
> On Sun, Apr 3, 2022 at 3:09 AM Tom Lane  wrote:
>> Hmm ... buildfarm's not entirely happy [1][2][3]:

> I think I know that the issue is. It's with `ParseISO8601Number` and
> the minutes field "1.".
> Previously that function parsed the entire field into a single double,
> so "1." would
> be parsed into 1.0. Now we try to parse the integer and decimal parts
> separately. So
> we first parse "1" into 1 and then fail to "." into anything because
> it's not a valid decimal.

Interesting point, but then why doesn't it fail everywhere?

regards, tom lane




Re: Fix overflow in DecodeInterval

2022-04-03 Thread Joseph Koshakow
On Sun, Apr 3, 2022 at 3:09 AM Tom Lane  wrote:
>
> I wrote:
> > Cool.  I've pushed the patch.
>
> Hmm ... buildfarm's not entirely happy [1][2][3]:
>
> diff -U3 
> /home/nm/farm/gcc64/HEAD/pgsql.build/src/test/regress/expected/interval.out 
> /home/nm/farm/gcc64/HEAD/pgsql.build/src/test/regress/results/interval.out
> --- 
> /home/nm/farm/gcc64/HEAD/pgsql.build/src/test/regress/expected/interval.out 
> 2022-04-03 04:56:32.0 +
> +++ 
> /home/nm/farm/gcc64/HEAD/pgsql.build/src/test/regress/results/interval.out  
> 2022-04-03 05:23:00.0 +
> @@ -1465,7 +1465,7 @@
>   LINE 1: select interval 'PT2562047788.1:00:54.775807';
>   ^
>   select interval 'PT2562047788:01.:54.775807';
> - ERROR:  interval field value out of range: "PT2562047788:01.:54.775807"
> + ERROR:  invalid input syntax for type interval: "PT2562047788:01.:54.775807"
>   LINE 1: select interval 'PT2562047788:01.:54.775807';
>   ^
>   -- overflowing with fractional fields - SQL standard format
>
> What do you make of that?  I'm betting that strtod() works a
> bit differently on those old platforms, but too tired to
> look closer tonight.
>
> regards, tom lane
>
> [1] 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer=2022-04-03%2004%3A56%3A34
> [2] 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hoverfly=2022-04-03%2000%3A51%3A50
> [3] 
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole=2022-04-03%2000%3A32%3A10

I think I know that the issue is. It's with `ParseISO8601Number` and
the minutes field "1.".
Previously that function parsed the entire field into a single double,
so "1." would
be parsed into 1.0. Now we try to parse the integer and decimal parts
separately. So
we first parse "1" into 1 and then fail to "." into anything because
it's not a valid decimal.

What's interesting is that I believe this syntax, "1.", always would
have failed for
non-ISO8601 Interval. It was only previously valid with ISO8601 intervals.

- Joe Koshakow




Re: Fix overflow in DecodeInterval

2022-04-03 Thread Tom Lane
I wrote:
> Cool.  I've pushed the patch.

Hmm ... buildfarm's not entirely happy [1][2][3]:

diff -U3 
/home/nm/farm/gcc64/HEAD/pgsql.build/src/test/regress/expected/interval.out 
/home/nm/farm/gcc64/HEAD/pgsql.build/src/test/regress/results/interval.out
--- /home/nm/farm/gcc64/HEAD/pgsql.build/src/test/regress/expected/interval.out 
2022-04-03 04:56:32.0 +
+++ /home/nm/farm/gcc64/HEAD/pgsql.build/src/test/regress/results/interval.out  
2022-04-03 05:23:00.0 +
@@ -1465,7 +1465,7 @@
  LINE 1: select interval 'PT2562047788.1:00:54.775807';
  ^
  select interval 'PT2562047788:01.:54.775807';
- ERROR:  interval field value out of range: "PT2562047788:01.:54.775807"
+ ERROR:  invalid input syntax for type interval: "PT2562047788:01.:54.775807"
  LINE 1: select interval 'PT2562047788:01.:54.775807';
  ^
  -- overflowing with fractional fields - SQL standard format

What do you make of that?  I'm betting that strtod() works a
bit differently on those old platforms, but too tired to
look closer tonight.

regards, tom lane

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer=2022-04-03%2004%3A56%3A34
[2] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hoverfly=2022-04-03%2000%3A51%3A50
[3] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole=2022-04-03%2000%3A32%3A10




Re: Fix overflow in DecodeInterval

2022-04-02 Thread Tom Lane
Joseph Koshakow  writes:
> On Sat, Apr 2, 2022 at 3:08 PM Tom Lane  wrote:
>> I think it's not, at least not for the interesting range of possible
>> values in this code.  Given that abs(frac) < 1 to start with, the
>> abs value of usec can't exceed the value of scale, which is at most
>> USECS_PER_DAY so it's at most 37 or so bits, which is well within
>> the exact range for any sane implementation of double.  It would
>> take a very poor floating-point implementation to not get the right
>> answer here.  (And we're largely assuming IEEE-compliant floats these
>> days.)

> Ah, I see. That makes sense to me.

Cool.  I've pushed the patch.

regards, tom lane




Re: Fix overflow in DecodeInterval

2022-04-02 Thread Joseph Koshakow
On Sat, Apr 2, 2022 at 3:08 PM Tom Lane  wrote:
>
> Joseph Koshakow  writes:
> > Ok I actually remember now, the issue is with the rounding
> > code in AdjustFractMicroseconds.
> > ...
> > I believe it's possible for `frac -= usec;` to result in a value greater
> > than 1 or less than -1 due to the lossiness of int64 to double
> > conversions.
>
> I think it's not, at least not for the interesting range of possible
> values in this code.  Given that abs(frac) < 1 to start with, the
> abs value of usec can't exceed the value of scale, which is at most
> USECS_PER_DAY so it's at most 37 or so bits, which is well within
> the exact range for any sane implementation of double.  It would
> take a very poor floating-point implementation to not get the right
> answer here.  (And we're largely assuming IEEE-compliant floats these
> days.)

Ah, I see. That makes sense to me.

On Sat, Apr 2, 2022 at 3:10 PM Tom Lane  wrote:
>
> Joseph Koshakow  writes:
> > I took a stab at this issue and the attached patch (which would be
> > applied on top of your v10 patch) seems to fix the issue. Feel
> > free to ignore it if you're already working on a fix.
>
> You really only need to flip val/fval in one place.  More to the
> point, there's also the hh:mm:ss paths to deal with; see my v11.

Good point. Thanks again for all the help!

- Joe Koshakow




Re: Fix overflow in DecodeInterval

2022-04-02 Thread Tom Lane
Joseph Koshakow  writes:
> I took a stab at this issue and the attached patch (which would be
> applied on top of your v10 patch) seems to fix the issue. Feel
> free to ignore it if you're already working on a fix.

You really only need to flip val/fval in one place.  More to the
point, there's also the hh:mm:ss paths to deal with; see my v11.

regards, tom lane




Re: Fix overflow in DecodeInterval

2022-04-02 Thread Tom Lane
Joseph Koshakow  writes:
> Ok I actually remember now, the issue is with the rounding
> code in AdjustFractMicroseconds.
> ...
> I believe it's possible for `frac -= usec;` to result in a value greater
> than 1 or less than -1 due to the lossiness of int64 to double
> conversions.

I think it's not, at least not for the interesting range of possible
values in this code.  Given that abs(frac) < 1 to start with, the
abs value of usec can't exceed the value of scale, which is at most 
USECS_PER_DAY so it's at most 37 or so bits, which is well within
the exact range for any sane implementation of double.  It would
take a very poor floating-point implementation to not get the right
answer here.  (And we're largely assuming IEEE-compliant floats these
days.)

Anyway, the other issue indeed turns out to be easy to fix.
Attached is a v11 that deals with that.  If the cfbot doesn't
complain about it, I'll push this later today.

regards, tom lane

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index ba0ec35ac5..462f2ed7a8 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -21,6 +21,7 @@
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/string.h"
 #include "funcapi.h"
 #include "miscadmin.h"
@@ -37,17 +38,31 @@ static int	DecodeNumber(int flen, char *field, bool haveTextMonth,
 static int	DecodeNumberField(int len, char *str,
 			  int fmask, int *tmask,
 			  struct pg_tm *tm, fsec_t *fsec, bool *is2digits);
+static int	DecodeTimeCommon(char *str, int fmask, int range,
+			 int *tmask, struct pg_itm *itm);
 static int	DecodeTime(char *str, int fmask, int range,
 	   int *tmask, struct pg_tm *tm, fsec_t *fsec);
+static int	DecodeTimeForInterval(char *str, int fmask, int range,
+  int *tmask, struct pg_itm_in *itm_in);
 static const datetkn *datebsearch(const char *key, const datetkn *base, int nel);
 static int	DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 	   struct pg_tm *tm);
 static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 		   int precision, bool fillzeros);
-static void AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
-			   int scale);
-static void AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
-			int scale);
+static bool int64_multiply_add(int64 val, int64 multiplier, int64 *sum);
+static bool AdjustFractMicroseconds(double frac, int64 scale,
+	struct pg_itm_in *itm_in);
+static bool AdjustFractDays(double frac, int scale,
+			struct pg_itm_in *itm_in);
+static bool AdjustFractYears(double frac, int scale,
+			 struct pg_itm_in *itm_in);
+static bool AdjustMicroseconds(int64 val, double fval, int64 scale,
+			   struct pg_itm_in *itm_in);
+static bool AdjustDays(int64 val, int scale,
+	   struct pg_itm_in *itm_in);
+static bool AdjustMonths(int64 val, struct pg_itm_in *itm_in);
+static bool AdjustYears(int64 val, int scale,
+		struct pg_itm_in *itm_in);
 static int	DetermineTimeZoneOffsetInternal(struct pg_tm *tm, pg_tz *tzp,
 			pg_time_t *tp);
 static bool DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t,
@@ -425,7 +440,7 @@ GetCurrentTimeUsec(struct pg_tm *tm, fsec_t *fsec, int *tzp)
  * Returns a pointer to the new end of string.  No NUL terminator is put
  * there; callers are responsible for NUL terminating str themselves.
  *
- * Note that any sign is stripped from the input seconds values.
+ * Note that any sign is stripped from the input sec and fsec values.
  */
 static char *
 AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
@@ -471,7 +486,7 @@ AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 
 		/*
 		 * If we still have a non-zero value then precision must have not been
-		 * enough to print the number.  We punt the problem to pg_ltostr(),
+		 * enough to print the number.  We punt the problem to pg_ultostr(),
 		 * which will generate a correct answer in the minimum valid width.
 		 */
 		if (value)
@@ -496,39 +511,163 @@ AppendTimestampSeconds(char *cp, struct pg_tm *tm, fsec_t fsec)
 	return AppendSeconds(cp, tm->tm_sec, fsec, MAX_TIMESTAMP_PRECISION, true);
 }
 
+
+/*
+ * Add val * multiplier to *sum.
+ * Returns true if successful, false on overflow.
+ */
+static bool
+int64_multiply_add(int64 val, int64 multiplier, int64 *sum)
+{
+	int64		product;
+
+	if (pg_mul_s64_overflow(val, multiplier, ) ||
+		pg_add_s64_overflow(*sum, product, sum))
+		return false;
+	return true;
+}
+
 /*
- * Multiply frac by scale (to produce seconds) and add to *tm & *fsec.
- * We assume the input frac is less than 1 so overflow is not an issue.
+ * Multiply frac by scale (to produce microseconds) and add to itm_in->tm_usec.
+ * Returns true if successful, false if itm_in overflows.
  */
-static void
-AdjustFractSeconds(double frac, struct pg_tm 

Re: Fix overflow in DecodeInterval

2022-04-02 Thread Joseph Koshakow
On Fri, Apr 1, 2022 at 8:06 PM Tom Lane  wrote:
> I think the patch can be salvaged, though.  I like the concept
> of converting all the sub-day fields to microseconds immediately,
> because it avoids a host of issues, so I don't want to give that up.
> What I'm going to look into is detecting the sign-adjustment-needed
> case up front (which is easy enough, since it's looking at the
> input data not the conversion results) and then forcing the
> individual field values negative before we accumulate them into
> the pg_itm_in struct.

I took a stab at this issue and the attached patch (which would be
applied on top of your v10 patch) seems to fix the issue. Feel
free to ignore it if you're already working on a fix.

- Joe
From f43d27142a76fcbabf49e45b9457f8376744e759 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Sat, 2 Apr 2022 14:42:18 -0400
Subject: [PATCH 2/2] Fix sql standard style negative semantics

---
 src/backend/utils/adt/datetime.c   | 107 ++---
 src/test/regress/expected/interval.out |  14 
 src/test/regress/sql/interval.sql  |   5 ++
 3 files changed, 79 insertions(+), 47 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index dae90e4a9e..5842d249ab 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -50,6 +50,8 @@ static int	DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 		   int precision, bool fillzeros);
 static bool int64_multiply_add(int64 val, int64 multiplier, int64 *sum);
+static void AdjustForSqlStandardGlobalNegative(int64 *val, double *fval, 
+			   bool global_negative);
 static bool AdjustFractMicroseconds(double frac, int64 scale,
 	struct pg_itm_in *itm_in);
 static bool AdjustFractDays(double frac, int scale,
@@ -527,6 +529,19 @@ int64_multiply_add(int64 val, int64 multiplier, int64 *sum)
 	return true;
 }
 
+/*
+ * Adjust values sign if SQL Standard style is being used and there's a 
+ * single leading negative sign.
+ */
+static void AdjustForSqlStandardGlobalNegative(int64 *val, double *fval,
+			   bool global_negative)
+{
+	if (*val > 0 && global_negative) {
+		*val = -*val;
+		*fval = -*fval;
+	}
+}
+
 /*
  * Multiply frac by scale (to produce microseconds) and add to itm_in->tm_usec.
  * Returns true if successful, false if itm_in overflows.
@@ -3307,10 +3322,43 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 	int64		val;
 	double		fval;
 
+	bool		global_negative = false;
+
 	*dtype = DTK_DELTA;
 	type = IGNORE_DTF;
 	ClearPgItmIn(itm_in);
 
+	/*--
+	 * The SQL standard defines the interval literal
+	 *	 '-1 1:00:00'
+	 * to mean "negative 1 days and negative 1 hours", while Postgres
+	 * traditionally treats this as meaning "negative 1 days and positive
+	 * 1 hours".  In SQL_STANDARD intervalstyle, we apply the leading sign
+	 * to all fields if there are no other explicit signs.
+	 *
+	 * We leave the signs alone if there are additional explicit signs.
+	 * This protects us against misinterpreting postgres-style dump output,
+	 * since the postgres-style output code has always put an explicit sign on
+	 * all fields following a negative field.  But note that SQL-spec output
+	 * is ambiguous and can be misinterpreted on load!	(So it's best practice
+	 * to dump in postgres style, not SQL style.)
+	 *--
+	 */
+	if (IntervalStyle == INTSTYLE_SQL_STANDARD && *field[0] == '-')
+	{
+		/* Check for additional explicit signs */
+		boolmore_signs = false;
+		for (i = 1; i < nf; i++)
+		{
+			if (*field[i] == '-' || *field[i] == '+')
+			{
+more_signs = true;
+break;
+			}
+		}
+		global_negative = !more_signs;
+	}
+
 	/* read through list backwards to pick up units before values */
 	for (i = nf - 1; i >= 0; i--)
 	{
@@ -3447,18 +3495,21 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 switch (type)
 {
 	case DTK_MICROSEC:
+		AdjustForSqlStandardGlobalNegative(, , global_negative);
 		if (!AdjustMicroseconds(val, fval, 1, itm_in))
 			return DTERR_FIELD_OVERFLOW;
 		tmask = DTK_M(MICROSECOND);
 		break;
 
 	case DTK_MILLISEC:
+		AdjustForSqlStandardGlobalNegative(, , global_negative);
 		if (!AdjustMicroseconds(val, fval, 1000, itm_in))
 			return DTERR_FIELD_OVERFLOW;
 		tmask = DTK_M(MILLISECOND);
 		break;
 
 	case DTK_SECOND:
+		AdjustForSqlStandardGlobalNegative(, , global_negative);
 		if (!AdjustMicroseconds(val, fval, USECS_PER_SEC, itm_in))
 			return DTERR_FIELD_OVERFLOW;
 
@@ -3473,12 +3524,14 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 		break;
 
 	case DTK_MINUTE:
+		AdjustForSqlStandardGlobalNegative(, , global_negative);
 		if (!AdjustMicroseconds(val, fval, USECS_PER_MINUTE, itm_in))
 			return DTERR_FIELD_OVERFLOW;
 		tmask = DTK_M(MINUTE);
 		break;
 
 	

Re: Fix overflow in DecodeInterval

2022-04-02 Thread Joseph Koshakow
On Sat, Apr 2, 2022 at 2:22 PM Joseph Koshakow  wrote:
>
> On Sat, Apr 2, 2022 at 1:29 PM Joseph Koshakow  wrote:
> >
> > On Fri, Apr 1, 2022 at 8:06 PM Tom Lane  wrote:
> > >
> > > Joseph Koshakow  writes:
> > > > * The existing code for rounding had a lot of int to double
> > > > casting and vice versa. I *think* that doubles are able to completely
> > > > represent the range of ints. However doubles are not able to represent
> > > > the full range of int64. After making the change I started noticing
> > > > a lot of lossy behavior. One thought I had was to change the doubles
> > > > to long doubles, but I wasn't able to figure out if long doubles could
> > > > completely represent the range of int64. Especially since their size
> > > > varies depending on the architecture. Does anyone know the answer to
> > > > this?
> > >
> > > I agree that relying on long double is not a great plan.  However,
> > > I'm not seeing where there's a problem.  AFAICS the revised code
> > > only uses doubles to represent fractions from the input, ie if you
> > > write "123.456 hours" then the ".456" is carried around for awhile
> > > as a float.  This does not seem likely to pose any real-world
> > > problem; do you have a counterexample?
> >
> > Yeah, you're correct, I don't think there is any problem with just
> > using double. I don't exactly remember why I thought long double
> > was necessary in the revised code. I probably just confused
> > myself because it would have been necessary with the old
> > rounding code, but not the revised code.
>
> Ok I actually remember now, the issue is with the rounding
> code in AdjustFractMicroseconds.
>
> >frac *= scale;
> >usec = (int64) frac;
> >
> >/* Round off any fractional microsecond */
> >frac -= usec;
> >if (frac > 0.5)
> >   usec++;
> >else if (frac < -0.5)
> >   usec--;
>
> I believe it's possible for `frac -= usec;` to result in a value greater
> than 1 or less than -1 due to the lossiness of int64 to double
> conversions. Then we'd incorrectly round in one direction. I don't
> have a concrete counter example, but at worst we'd end up with a
> result that's a couple of microseconds off, so it's probably not a huge
> deal.
>
> If I'm right about the above, and we care enough to fix it, then I think
> it can be fixed with the following:
>
> >frac *= scale;
> >usec = (int64) frac;
> >
> >/* Remove non fractional part from frac */
> >frac -= (double) usec;
> >/* Adjust for lossy conversion from int64 to double */
> >while (frac < 0 && frac < -1)
> >   frac++;
> >while (frac > 0 && frac > 1)
> >   frac--;
> >
> >/* Round off any fractional microsecond */
> >if (frac > 0.5)
> >   usec++;
> >else if (frac < -0.5)
> >   usec--;


Sorry, those should be inclusive comparisons
>frac *= scale;
>usec = (int64) frac;
>
>/* Remove non fractional part from frac */
>frac -= (double) usec;
>/* Adjust for lossy conversion from int64 to double */
>while (frac < 0 && frac <= -1)
>   frac++;
>while (frac > 0 && frac >= 1)
>   frac--;
>
>/* Round off any fractional microsecond */
>if (frac > 0.5)
>   usec++;
>else if (frac < -0.5)
>   usec--;




Re: Fix overflow in DecodeInterval

2022-04-02 Thread Joseph Koshakow
On Sat, Apr 2, 2022 at 1:29 PM Joseph Koshakow  wrote:
>
> On Fri, Apr 1, 2022 at 8:06 PM Tom Lane  wrote:
> >
> > Joseph Koshakow  writes:
> > > * The existing code for rounding had a lot of int to double
> > > casting and vice versa. I *think* that doubles are able to completely
> > > represent the range of ints. However doubles are not able to represent
> > > the full range of int64. After making the change I started noticing
> > > a lot of lossy behavior. One thought I had was to change the doubles
> > > to long doubles, but I wasn't able to figure out if long doubles could
> > > completely represent the range of int64. Especially since their size
> > > varies depending on the architecture. Does anyone know the answer to
> > > this?
> >
> > I agree that relying on long double is not a great plan.  However,
> > I'm not seeing where there's a problem.  AFAICS the revised code
> > only uses doubles to represent fractions from the input, ie if you
> > write "123.456 hours" then the ".456" is carried around for awhile
> > as a float.  This does not seem likely to pose any real-world
> > problem; do you have a counterexample?
>
> Yeah, you're correct, I don't think there is any problem with just
> using double. I don't exactly remember why I thought long double
> was necessary in the revised code. I probably just confused
> myself because it would have been necessary with the old
> rounding code, but not the revised code.

Ok I actually remember now, the issue is with the rounding
code in AdjustFractMicroseconds.

>frac *= scale;
>usec = (int64) frac;
>
>/* Round off any fractional microsecond */
>frac -= usec;
>if (frac > 0.5)
>   usec++;
>else if (frac < -0.5)
>   usec--;

I believe it's possible for `frac -= usec;` to result in a value greater
than 1 or less than -1 due to the lossiness of int64 to double
conversions. Then we'd incorrectly round in one direction. I don't
have a concrete counter example, but at worst we'd end up with a
result that's a couple of microseconds off, so it's probably not a huge
deal.

If I'm right about the above, and we care enough to fix it, then I think
it can be fixed with the following:

>frac *= scale;
>usec = (int64) frac;
>
>/* Remove non fractional part from frac */
>frac -= (double) usec;
>/* Adjust for lossy conversion from int64 to double */
>while (frac < 0 && frac < -1)
>   frac++;
>while (frac > 0 && frac > 1)
>   frac--;
>
>/* Round off any fractional microsecond */
>if (frac > 0.5)
>   usec++;
>else if (frac < -0.5)
>   usec--;

- Joe Koshakow




Re: Fix overflow in DecodeInterval

2022-04-02 Thread Joseph Koshakow
On Fri, Apr 1, 2022 at 8:06 PM Tom Lane  wrote:
>
> Joseph Koshakow  writes:
> > * The existing code for rounding had a lot of int to double
> > casting and vice versa. I *think* that doubles are able to completely
> > represent the range of ints. However doubles are not able to represent
> > the full range of int64. After making the change I started noticing
> > a lot of lossy behavior. One thought I had was to change the doubles
> > to long doubles, but I wasn't able to figure out if long doubles could
> > completely represent the range of int64. Especially since their size
> > varies depending on the architecture. Does anyone know the answer to
> > this?
>
> I agree that relying on long double is not a great plan.  However,
> I'm not seeing where there's a problem.  AFAICS the revised code
> only uses doubles to represent fractions from the input, ie if you
> write "123.456 hours" then the ".456" is carried around for awhile
> as a float.  This does not seem likely to pose any real-world
> problem; do you have a counterexample?

Yeah, you're correct, I don't think there is any problem with just
using double. I don't exactly remember why I thought long double
was necessary in the revised code. I probably just confused
myself because it would have been necessary with the old
rounding code, but not the revised code.

> Anyway, I've spent today reviewing the code and cleaning up things
> I didn't like, and attached is a v10.

Thanks so much for the review and updates!

> I think the patch can be salvaged, though.  I like the concept
> of converting all the sub-day fields to microseconds immediately,
> because it avoids a host of issues, so I don't want to give that up.
> What I'm going to look into is detecting the sign-adjustment-needed
> case up front (which is easy enough, since it's looking at the
> input data not the conversion results) and then forcing the
> individual field values negative before we accumulate them into
> the pg_itm_in struct.

This sounds like a very reasonable and achievable approach
to me.

- Joe Koshakow




Re: Fix overflow in DecodeInterval

2022-04-02 Thread Tom Lane
I wrote:
> ... I almost feel that this is
> committable, but there is one thing that is bothering me.  The
> part of DecodeInterval that does strange things with signs in the
> INTSTYLE_SQL_STANDARD case (starting about line 3400 in datetime.c
> before this patch, or line 3600 after) used to separately force the
> hour, minute, second, and microsecond fields to negative.
> Now it forces the merged tm_usec field to negative.  It seems to
> me that this could give a different answer than before, if the
> h/m/s/us values had been of different signs before they got merged.
> However, I don't think that that situation is possible in SQL-spec-
> compliant input, so it may not be a problem.  Again, a counterexample
> would be interesting.

As best I can tell, the case isn't reachable with spec-compliant input,
but it's easy to demonstrate an issue if you set intervalstyle to
sql_standard and then put in Postgres-format input.  Historically,
you got

regression=# show intervalstyle;
 IntervalStyle 
---
 postgres
(1 row)

regression=# select '-23 hours 45 min 12.34 sec'::interval;
   interval   
--
 -22:14:47.66
(1 row)

(because by default the field signs are taken as independent)

regression=# set intervalstyle = sql_standard ;
SET
regression=# select '-23 hours 45 min 12.34 sec'::interval;
   interval   
--
 -23:45:12.34
(1 row)

However, with this patch both cases produce "-22:14:47.66",
because we already merged the differently-signed fields and
DecodeInterval can't tease them apart again.  Perhaps we could
get away with changing this nonstandard corner case, but I'm
pretty uncomfortable with that thought --- I don't think
random semantics changes are within the charter of this patch.

I think the patch can be salvaged, though.  I like the concept
of converting all the sub-day fields to microseconds immediately,
because it avoids a host of issues, so I don't want to give that up.
What I'm going to look into is detecting the sign-adjustment-needed
case up front (which is easy enough, since it's looking at the
input data not the conversion results) and then forcing the
individual field values negative before we accumulate them into
the pg_itm_in struct.

Meanwhile, the fact that we didn't detect this issue immediately
shows that there's a gap in our regression tests.  So the *first*
thing I'm gonna do is push a patch to add test cases like what
I showed above.

regards, tom lane




Re: Fix overflow in DecodeInterval

2022-04-01 Thread Tom Lane
Joseph Koshakow  writes:
> * The existing code for rounding had a lot of int to double
> casting and vice versa. I *think* that doubles are able to completely
> represent the range of ints. However doubles are not able to represent
> the full range of int64. After making the change I started noticing
> a lot of lossy behavior. One thought I had was to change the doubles
> to long doubles, but I wasn't able to figure out if long doubles could
> completely represent the range of int64. Especially since their size
> varies depending on the architecture. Does anyone know the answer to
> this?

I agree that relying on long double is not a great plan.  However,
I'm not seeing where there's a problem.  AFAICS the revised code
only uses doubles to represent fractions from the input, ie if you
write "123.456 hours" then the ".456" is carried around for awhile
as a float.  This does not seem likely to pose any real-world
problem; do you have a counterexample?

Anyway, I've spent today reviewing the code and cleaning up things
I didn't like, and attached is a v10.  I almost feel that this is
committable, but there is one thing that is bothering me.  The
part of DecodeInterval that does strange things with signs in the
INTSTYLE_SQL_STANDARD case (starting about line 3400 in datetime.c
before this patch, or line 3600 after) used to separately force the
hour, minute, second, and microsecond fields to negative.
Now it forces the merged tm_usec field to negative.  It seems to
me that this could give a different answer than before, if the
h/m/s/us values had been of different signs before they got merged.
However, I don't think that that situation is possible in SQL-spec-
compliant input, so it may not be a problem.  Again, a counterexample
would be interesting.

regards, tom lane

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index ba0ec35ac5..dae90e4a9e 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -21,6 +21,7 @@
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/string.h"
 #include "funcapi.h"
 #include "miscadmin.h"
@@ -37,17 +38,31 @@ static int	DecodeNumber(int flen, char *field, bool haveTextMonth,
 static int	DecodeNumberField(int len, char *str,
 			  int fmask, int *tmask,
 			  struct pg_tm *tm, fsec_t *fsec, bool *is2digits);
+static int	DecodeTimeCommon(char *str, int fmask, int range,
+			 int *tmask, struct pg_itm *itm);
 static int	DecodeTime(char *str, int fmask, int range,
 	   int *tmask, struct pg_tm *tm, fsec_t *fsec);
+static int	DecodeTimeForInterval(char *str, int fmask, int range,
+  int *tmask, struct pg_itm_in *itm_in);
 static const datetkn *datebsearch(const char *key, const datetkn *base, int nel);
 static int	DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 	   struct pg_tm *tm);
 static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 		   int precision, bool fillzeros);
-static void AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
-			   int scale);
-static void AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
-			int scale);
+static bool int64_multiply_add(int64 val, int64 multiplier, int64 *sum);
+static bool AdjustFractMicroseconds(double frac, int64 scale,
+	struct pg_itm_in *itm_in);
+static bool AdjustFractDays(double frac, int scale,
+			struct pg_itm_in *itm_in);
+static bool AdjustFractYears(double frac, int scale,
+			 struct pg_itm_in *itm_in);
+static bool AdjustMicroseconds(int64 val, double fval, int64 scale,
+			   struct pg_itm_in *itm_in);
+static bool AdjustDays(int64 val, int scale,
+	   struct pg_itm_in *itm_in);
+static bool AdjustMonths(int64 val, struct pg_itm_in *itm_in);
+static bool AdjustYears(int64 val, int scale,
+		struct pg_itm_in *itm_in);
 static int	DetermineTimeZoneOffsetInternal(struct pg_tm *tm, pg_tz *tzp,
 			pg_time_t *tp);
 static bool DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t,
@@ -425,7 +440,7 @@ GetCurrentTimeUsec(struct pg_tm *tm, fsec_t *fsec, int *tzp)
  * Returns a pointer to the new end of string.  No NUL terminator is put
  * there; callers are responsible for NUL terminating str themselves.
  *
- * Note that any sign is stripped from the input seconds values.
+ * Note that any sign is stripped from the input sec and fsec values.
  */
 static char *
 AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
@@ -471,7 +486,7 @@ AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 
 		/*
 		 * If we still have a non-zero value then precision must have not been
-		 * enough to print the number.  We punt the problem to pg_ltostr(),
+		 * enough to print the number.  We punt the problem to pg_ultostr(),
 		 * which will generate a correct answer in the minimum valid width.
 		 */
 		if (value)
@@ 

Re: Fix overflow in DecodeInterval

2022-03-23 Thread Tom Lane
Joseph Koshakow  writes:
> Sorry about that, I didn't have my IDE set up quite right and
> noticed a little too late that I had some auto-formatting turned
> on. Thanks for doing the rebase, did it end up fixing
> the whitespace issues? If not I'll go through the patch and try
> and fix them all.

No, I just fixed the merge failure.

Our standard way to clean up whitespace issues and make sure code
meets our layout conventions is to run pgindent over it [1].
For this particular patch, that might be too much, because it will
reindent the sections that you added braces around, making the patch
harder to review.  So maybe the best bet is to leave well enough
alone and expect the committer to re-pgindent before pushing it.
However, if you spot any diff hunks where there's just a whitespace
change, getting rid of those would be appreciated.

regards, tom lane

[1] 
https://wiki.postgresql.org/wiki/Running_pgindent_on_non-core_code_or_development_code




Re: Fix overflow in DecodeInterval

2022-03-23 Thread Joseph Koshakow
On Mon, Mar 21, 2022 at 8:31 PM Tom Lane  wrote:
> This isn't applying per the cfbot; looks like it got sideswiped
> by 9e9858389.  Here's a quick rebase.  I've not reviewed it, but
> I did notice (because git was in my face about this) that it's
> got whitespace issues.  Please try to avoid unnecessary whitespace
> changes ... pgindent will clean those up, but it makes reviewing
> harder.

Sorry about that, I didn't have my IDE set up quite right and
noticed a little too late that I had some auto-formatting turned
on. Thanks for doing the rebase, did it end up fixing
the whitespace issues? If not I'll go through the patch and try
and fix them all.

- Joe Koshakow




Re: Fix overflow in DecodeInterval

2022-03-21 Thread Tom Lane
Joseph Koshakow  writes:
> [ v8-0001-Check-for-overflow-when-decoding-an-interval.patch ]

This isn't applying per the cfbot; looks like it got sideswiped
by 9e9858389.  Here's a quick rebase.  I've not reviewed it, but
I did notice (because git was in my face about this) that it's
got whitespace issues.  Please try to avoid unnecessary whitespace
changes ... pgindent will clean those up, but it makes reviewing
harder.

regards, tom lane

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index ba0ec35ac5..014ec88e0d 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -21,6 +21,7 @@
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/string.h"
 #include "funcapi.h"
 #include "miscadmin.h"
@@ -38,16 +39,20 @@ static int	DecodeNumberField(int len, char *str,
 			  int fmask, int *tmask,
 			  struct pg_tm *tm, fsec_t *fsec, bool *is2digits);
 static int	DecodeTime(char *str, int fmask, int range,
-	   int *tmask, struct pg_tm *tm, fsec_t *fsec);
+	   int *tmask, struct pg_tm *tm, int64 *hour, fsec_t *fsec);
+static int DecodeTimeForInterval(char *str, int fmask, int range,
+ int *tmask, struct pg_itm_in *itm_in);
 static const datetkn *datebsearch(const char *key, const datetkn *base, int nel);
 static int	DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 	   struct pg_tm *tm);
-static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
+static char *AppendSeconds(char *cp, int sec, int64 fsec,
 		   int precision, bool fillzeros);
-static void AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
-			   int scale);
-static void AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
-			int scale);
+static bool AdjustFractMicroseconds(long double frac, struct pg_itm_in *itm_in, int64 scale);
+static bool AdjustFractDays(double frac, struct pg_itm_in *pg_itm_in, int scale);
+static bool AdjustFractMonths(double frac, struct pg_itm_in *itm_in, int scale);
+static bool AdjustMicroseconds(int64 val, struct pg_itm_in *itm_in, int64 multiplier, double fval);
+static bool AdjustDays(int val, struct pg_itm_in *itm_in, int multiplier);
+static bool AdjustYears(int val, struct pg_itm_in *itm_in, int multiplier);
 static int	DetermineTimeZoneOffsetInternal(struct pg_tm *tm, pg_tz *tzp,
 			pg_time_t *tp);
 static bool DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t,
@@ -428,7 +433,7 @@ GetCurrentTimeUsec(struct pg_tm *tm, fsec_t *fsec, int *tzp)
  * Note that any sign is stripped from the input seconds values.
  */
 static char *
-AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
+AppendSeconds(char *cp, int sec, int64 fsec, int precision, bool fillzeros)
 {
 	Assert(precision >= 0);
 
@@ -437,10 +442,9 @@ AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 	else
 		cp = pg_ultostr(cp, Abs(sec));
 
-	/* fsec_t is just an int32 */
 	if (fsec != 0)
 	{
-		int32		value = Abs(fsec);
+		int64		value = Abs(fsec);
 		char	   *end = [precision + 1];
 		bool		gotnonzero = false;
 
@@ -453,8 +457,8 @@ AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 		 */
 		while (precision--)
 		{
-			int32		oldval = value;
-			int32		remainder;
+			int64		oldval = value;
+			int64		remainder;
 
 			value /= 10;
 			remainder = oldval - value * 10;
@@ -475,7 +479,7 @@ AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 		 * which will generate a correct answer in the minimum valid width.
 		 */
 		if (value)
-			return pg_ultostr(cp, Abs(fsec));
+			return pg_ulltostr(cp, Abs(fsec));
 
 		return end;
 	}
@@ -497,36 +501,96 @@ AppendTimestampSeconds(char *cp, struct pg_tm *tm, fsec_t fsec)
 }
 
 /*
- * Multiply frac by scale (to produce seconds) and add to *tm & *fsec.
- * We assume the input frac is less than 1 so overflow is not an issue.
+ * Multiply frac by scale (to produce microseconds) and add to *itm.
+ * We assume the input frac is less than 1 so overflow of frac is not an issue.
  */
-static void
-AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
+static bool
+AdjustFractMicroseconds(long double frac, struct pg_itm_in *itm_in, int64 scale)
 {
-	int			sec;
+	int64		usec;
+	int64		round = 0;
 
 	if (frac == 0)
-		return;
+		return true;
 	frac *= scale;
-	sec = (int) frac;
-	tm->tm_sec += sec;
-	frac -= sec;
-	*fsec += rint(frac * 100);
+	usec = (int64) frac;
+	if (pg_add_s64_overflow(itm_in->tm_usec, usec, _in->tm_usec))
+		return false;
+	
+	frac = frac - usec;
+	if (frac > 0.5)
+		round = 1;
+	else if (frac < -0.5)
+		round = -1;
+
+	return !pg_add_s64_overflow(itm_in->tm_usec, round, _in->tm_usec);
 }
 
 /* As above, but initial scale produces days */
-static void
-AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
+static bool

Re: Fix overflow in DecodeInterval

2022-03-07 Thread Joseph Koshakow
I just realized another issue today. It may have been obvious from one
of Tom's earlier messages, but I'm just now putting the pieces
together.
On Fri, Feb 18, 2022 at 11:44 PM Tom Lane  wrote:
> Also, I notice that there's an overflow hazard upstream of here,
> in interval2tm:
>
> regression=# select interval '214748364 hours' * 11;
> ERROR: interval out of range
> regression=# \errverbose
> ERROR: 22008: interval out of range
> LOCATION: interval2tm, timestamp.c:1982
>
> There's no good excuse for not being able to print a value that
> we computed successfully.

Scenarios like this can properly decode the interval, but actually
error out when encoding the interval. As a consequence you can insert
the value successfully into a table, but any attempt to query the table
that includes the "bad interval" value in the result will cause an
error. Below I've demonstrated an example:

postgres=# CREATE TABLE tbl (i INTERVAL);
CREATE TABLE
postgres=# INSERT INTO tbl VALUES ('1 day'), ('3 months'), ('2 years');
INSERT 0 3
postgres=# SELECT * FROM tbl;
i
-
1 day
3 mons
2 years
(3 rows)

postgres=# INSERT INTO tbl VALUES ('2147483647 hours 60 minutes');
INSERT 0 1
postgres=# SELECT * FROM tbl;
ERROR: interval out of range

This would seriously reduce the usable of any table that contains one
of these "bad interval" values.

My patch actually fixes this issue, but I just wanted to call it out
because it might be relevant when reviewing.




Re: Fix overflow in DecodeInterval

2022-03-06 Thread Joseph Koshakow
Hi All,

Sorry for the delay in the new patch, I've attached my most recent
patch to this email. I ended up reworking a good portion of my previous
patch so below I've included some reasons why, notes on my current
approach, and some pro/cons to the approach.

* The main reason for the rework had to do with double conversions and
shared code.

* The existing code for rounding had a lot of int to double
casting and vice versa. I *think* that doubles are able to completely
represent the range of ints. However doubles are not able to represent
the full range of int64. After making the change I started noticing
a lot of lossy behavior. One thought I had was to change the doubles
to long doubles, but I wasn't able to figure out if long doubles could
completely represent the range of int64. Especially since their size
varies depending on the architecture. Does anyone know the answer to
this?

* I ended up creating two intermediate data structures for Intervals.
One for decoding and one for everything else. I'll go into more detail
below.
* One common benefit was that they both contain a usec field which
means that the Interval methods no longer need to carry around a
separate fsec argument.
* The obvious con here is that Intervals require two unique
intermediate data structures, while all other date/time types
can share a single intermediate data structure. I find this to
be a bit clunky.

* pg_itm_in is the struct used for Interval decoding. It's very similar
to pg_tm, except all of the time related fields are collapsed into a
single `int64 usec` field.
* The biggest benefit of this was that all int64-double conversions
are limited to a single function, AdjustFractMicroseconds. Instead
of fractional units flowing down over every single time field, they
only need to flow down into the single `int64 usec` field.
* Overflows are caught much earlier in the decoding process which
helps avoid wasted work.
* I found that the decoding code became simpler for time fields,
though this is a bit subjective.

* pg_itm is the struct used for all other Interval functionality. It's
very similar to pg_tm, except the tm_hour field is converted from int
to int64 and an `int tm_usec` field was added.
* When encoding and working with Intervals, we almost always want
to break the time field out into hours, min, sec, usec. So it's
helpful to have a common place to do this, instead of every
function duplicating this code.
* When breaking the time fields out, a single field will never
contain a value greater than could have fit in the next unit
higher. Meaning that minutes will never be greater than 60, seconds
will be never greater than 60, and usec will never be greater than
1,000. So hours is actually the only field that needs to be int64
and the rest can be an int.
* This also helps limit the impact to shared code (see below).

* There's some shared code between Intervals and other date/time types.
Specifically the DecodeTime function and the datetime_to_char_body
function. These functions take in a `struct pg_tm` and a `fsec_t fsec`
(fsec_t is just an alias for int32) which allows them to be re-used by
all date/time types. The only difference now between pg_tm and pg_itm
is the tm_hour field size (the tm_usec field in pg_itm can be used as
the fsec). So to get around this I changed the function signatures to
take a `struct pg_tm`, `fsec_t fsec`, and an `int64 hour` argument.
It's up to the caller to provide to correct hour field. Intervals can
easily convert pg_itm to a pg_tm, fsec, and hour. It's honestly a bit
error-prone since those functions have to explicitly ignore the
pg_tm->tm_hour field and use the provided hour argument instead, but I
couldn't think of a better less intrusive solution. If anyone has a
better idea, please don't hesitate to bring it up.

* This partly existed in the previous patch, but I just wanted to
restate it. All modifications to pg_itm_in during decoding is done via
helper functions that check for overflow. All invocations of these
functions either return an error on overflow or explicitly state why an
overflow is impossible.

* I completely rewrote the ParseISO8601Number function to try and avoid
double to int64 conversions. I tried to model it after the parsing done
in DecodeInterval, though I would appreciate extra scrutiny here.

- Joe Koshakow
From a2afce720fb65b87638a634078067a796a639ddc Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Mon, 28 Feb 2022 22:52:55 -0500
Subject: [PATCH] Rework Interval encoding and decoding

The current Interval encoding and decoding has the following issues:
  * The decoding functions have many uncaught errors that allow the
Interval value to overflow/underflow.
  * Both the decoding and encoding functions do not protect against
taking the absolute value or negating INT_MIN which leads to
undefined behavior (usually it just leaves the value unchanged at
INT_MIN, which is not the desired result).
  * The encoding and decoding process 

Re: Fix overflow in DecodeInterval

2022-02-20 Thread Joseph Koshakow
On Sun, Feb 20, 2022 at 6:37 PM Tom Lane  wrote:
> Couple of quick comments:

Thanks for the comments Tom, I'll work on fixing these and submit a
new patch.

> * The uses of tm2itm make me a bit itchy.  Is that sweeping
> upstream-of-there overflow problems under the rug?

I agree, I'm not super happy with that approach. In fact
I'm pretty sure it will cause queries like
SELECT INTERVAL '2147483648:00:00';
to overflow upstream, even though queries like
SELECT INTERVAL '2147483648 hours';
would not. The places tm2itm is being used are
 * After DecodeTime
 * In interval_to_char.
The more general issue is how to share code with
functions that are doing almost identical things but use
pg_tm instead of the new pg_itm? I'm not really sure what
the best solution is right now but I will think about it. If
anyone has suggestions though, feel free to chime in.

- Joe Koshakow




Re: Fix overflow in DecodeInterval

2022-02-20 Thread Tom Lane
Joseph Koshakow  writes:
> Attached is a patch of my first pass. The to_char method isn't finished
> and I need to add a bunch of tests, but everything else is in place. It
> ended up being a fairly large change in case anyone wants to take a look
> the changes so far.

Couple of quick comments:

* You've assumed in a number of places that long == int64.  This is wrong
on many platforms.  Current project style for use of int64 in printf-type
calls is to cast the argument to (long long) explicitly and use "%lld" (or
"%llu", etc).  As for strtoint_64, surely that functionality exists
somewhere already (hopefully with fewer bugs than this has).

* I think that tools like Coverity will complain about how you've
got both calls that check the result of AdjustFractSeconds (etc)
and calls that don't.  You might consider coding the latter like

+/* this can't overflow: */
+(void) AdjustFractSeconds(fval, itm, fsec, SECS_PER_DAY);

in hopes of (a) silencing those warnings and (b) making the implicit
assumption clear to readers.

* I'm not entirely buying use of pg_time_t, rather than just int64,
in struct itm.  I think the point of this struct is to get away
from datetime-specific datatypes.  Also, if pg_time_t ever changed
width, it'd create issues for users of this struct.  I also note
a number of places in the patch that are assuming that these fields
are int64 not something else.

* I'm a little inclined to rename interval2tm/tm2interval to
interval2itm/itm2interval, as I think it'd be confusing for those
function names to refer to a typedef they no longer use.

* The uses of tm2itm make me a bit itchy.  Is that sweeping
upstream-of-there overflow problems under the rug?

* // comments are not project style, either.  While pgindent will
convert them to /* ... */ style, the results might not be pleasing.

> One thing I noticed is that interval.c has a ton of code copied and pasted
> from other files. The code seemed out of date from those other files, so
> I tried to bring it up to date and add my changes.

We haven't actually maintained ecpg's copies of backend datatype-specific
code in many years.  While bringing that stuff up to speed might be
worthwhile (or perhaps not, given the lack of complaints), I'd see it
as a separate project.

regards, tom lane




Re: Fix overflow in DecodeInterval

2022-02-19 Thread Joseph Koshakow
Attached is a patch of my first pass. The to_char method isn't finished
and I need to add a bunch of tests, but everything else is in place. It
ended up being a fairly large change in case anyone wants to take a look
the changes so far.

One thing I noticed is that interval.c has a ton of code copied and pasted
from other files. The code seemed out of date from those other files, so
I tried to bring it up to date and add my changes.

- Joe Koshakow
From f678ee1bdcd6e075f1b1c771e6d5f6b3f9089086 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Fri, 11 Feb 2022 14:18:32 -0500
Subject: [PATCH] Check for overflow when decoding an interval

When decoding an interval there are various date units which are
aliases for some multiple of another unit. For example a week is 7 days
and a decade is 10 years. When decoding these specific units, there is
no check for overflow, allowing the interval to overflow. This commit
adds an overflow check for all of these units.

Additionally fractional date/time units are rolled down into the next
smallest unit. For example 0.1 months is 3 days. When adding these
fraction units, there is no check for overflow, allowing the interval
to overflow. This commit adds an overflow check for all of the
fractional units.

Additionally adding the word "ago" to the interval negates every
field. However the negative of INT_MIN is still INT_MIN. This
commit adds a check to make sure that we don't try and negate
a field if it's INT_MIN.

Additionally when encoding an interval there are numerous points
where fields are negating or passed to abs(). When any of the
fields are INT_MIN this results in undefined behavior, but
usually just leaves the value unchanged as INT_MIN. This
commit fixes this issue.

Additionally this commit changes the Interval code to use it's
own intermediate pg_tm data structure that uses int64s instead
of ints. This helps avoid numerous overflow/underflow issues.
Additionally it allows us to use the full range of int64
microseconds instead of being artificially constrained by the
int pg_tm fields.

Signed-off-by: Joseph Koshakow 
---
 src/backend/utils/adt/datetime.c  | 413 +--
 src/backend/utils/adt/formatting.c|  24 +-
 src/backend/utils/adt/timestamp.c |  54 +-
 src/common/string.c   |  15 +
 src/include/common/string.h   |   2 +
 src/include/pgtime.h  |  12 +
 src/include/utils/datetime.h  |   6 +-
 src/include/utils/timestamp.h |   5 +-
 src/interfaces/ecpg/pgtypeslib/dt.h   |  20 +-
 src/interfaces/ecpg/pgtypeslib/interval.c | 584 ++
 src/test/regress/expected/interval.out| 147 ++
 src/test/regress/sql/interval.sql |  46 ++
 12 files changed, 946 insertions(+), 382 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 7926258c06..908a67648d 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -21,6 +21,7 @@
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/string.h"
 #include "funcapi.h"
 #include "miscadmin.h"
@@ -42,12 +43,15 @@ static int	DecodeTime(char *str, int fmask, int range,
 static const datetkn *datebsearch(const char *key, const datetkn *base, int nel);
 static int	DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 	   struct pg_tm *tm);
-static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
+static char *AppendSeconds(char *cp, int64 sec, fsec_t fsec,
 		   int precision, bool fillzeros);
-static void AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
+static bool AdjustFractSeconds(double frac, struct pg_itm *itm, fsec_t *fsec,
 			   int scale);
-static void AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
+static bool AdjustFractDays(double frac, struct pg_itm *itm, fsec_t *fsec,
 			int scale);
+static bool AdjustFractMonths(double frac, struct pg_itm *itm, int scale);
+static bool AdjustDays(int val, struct pg_itm *itm, int multiplier);
+static bool AdjustYears(int val, struct pg_itm *itm, int multiplier);
 static int	DetermineTimeZoneOffsetInternal(struct pg_tm *tm, pg_tz *tzp,
 			pg_time_t *tp);
 static bool DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t,
@@ -428,7 +432,7 @@ GetCurrentTimeUsec(struct pg_tm *tm, fsec_t *fsec, int *tzp)
  * Note that any sign is stripped from the input seconds values.
  */
 static char *
-AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
+AppendSeconds(char *cp, int64 sec, fsec_t fsec, int precision, bool fillzeros)
 {
 	Assert(precision >= 0);
 
@@ -500,33 +504,82 @@ AppendTimestampSeconds(char *cp, struct pg_tm *tm, fsec_t fsec)
  * Multiply frac by scale (to produce seconds) and add to *tm & *fsec.
  * We assume the input frac is less than 1 so overflow is not an issue.
  */
-static void

Re: Fix overflow in DecodeInterval

2022-02-19 Thread Joseph Koshakow
Copying over a related thread here to have info all in one place.
It's a little fragmented so sorry about that.


On Fri, Feb 18, 2022 at 11:44 PM Tom Lane  wrote:
>
> Joseph Koshakow  writes:
> > When formatting the output of an Interval, we call abs() on the hours
> > field of the Interval. Calling abs(INT_MIN) returns back INT_MIN
> > causing the output to contain two '-' characters. The attached patch
> > fixes that issue by special casing INT_MIN hours.
>
> Good catch, but it looks to me like three out of the four formats in
> EncodeInterval have variants of this problem --- there are assumptions
> throughout that code that we can compute "-x" or "abs(x)" without
> fear.  Not much point in fixing only one symptom.
>
> Also, I notice that there's an overflow hazard upstream of here,
> in interval2tm:
>
> regression=# select interval '214748364 hours' * 11;
> ERROR:  interval out of range
> regression=# \errverbose
> ERROR:  22008: interval out of range
> LOCATION:  interval2tm, timestamp.c:1982
>
> There's no good excuse for not being able to print a value that
> we computed successfully.
>
> I wonder if the most reasonable fix would be to start using int64
> instead of int arithmetic for the values that are potentially large.
> I doubt that we'd be taking much of a performance hit on modern
> hardware.
>
> regards, tom lane

Joseph Koshakow  writes:
> On Fri, Feb 18, 2022 at 11:44 PM Tom Lane  wrote:
>> I wonder if the most reasonable fix would be to start using int64
>> instead of int arithmetic for the values that are potentially large.
>> I doubt that we'd be taking much of a performance hit on modern
>> hardware.

> That's an interesting idea. I've always assumed that the range of the
> time fields of Intervals was 2147483647 hours 59 minutes
> 59.99 seconds to -2147483648 hours -59 minutes
> -59.99 seconds. However the only reason that we can't support
> the full range of int64 microseconds is because the struct pg_tm fields
> are only ints. If we increase those fields to int64 then we'd be able to
> support the full int64 range for microseconds as well as implicitly fix
> some of the overflow issues in DecodeInterval and EncodeInterval.

On Sat, Feb 19, 2022 at 1:52 PM Tom Lane  wrote:
>
> Joseph Koshakow  writes:
> > On Sat, Feb 19, 2022 at 12:00 PM Tom Lane  wrote:
> >> I think that messing with struct pg_tm might have too many side-effects.
> >> However, pg_tm isn't all that well adapted to intervals in the first
> >> place, so it'd make sense to make a new struct specifically for interval
> >> decoding.
>
> > Yeah that's a good idea, pg_tm is used all over the place. Is there a
> > reason we need an intermediate structure to convert to and from?
> > We could just convert directly to an Interval in DecodeInterval and
> > print directly from an Interval in EncodeInterval.
>
> Yeah, I thought about that too, but if you look at the other callers of
> interval2tm, you'll see this same set of issues.  I'm inclined to keep
> the current code structure and just fix the data structure.  Although
> interval2tm isn't *that* big, I don't think four copies of its logic
> would be an improvement.
>
> > Also I originally created a separate thread and patch because I
> > thought this wouldn't be directly related to my other patch,
> > https://commitfest.postgresql.org/37/3541/. However I think with these
> > discussed changes it's directly related. So I think it's best to close
> > this thread and patch and copy this conversion to the other thread.
>
> Agreed.
>
> regards, tom lane




Re: Fix overflow in DecodeInterval

2022-02-17 Thread Joseph Koshakow
Ok, so I've attached a patch with my final unprompted changes. It
contains the following two changes:

1. I found some more overflows with the ISO8601 formats and have
included some fixes.
2. I reverted the overflow checks for the seconds field. It's actually a
bit more complicated than I thought. For example consider the following
query:
postgres=# SELECT INTERVAL '0. min 2147483647 sec';
interval
--
-596523:13:09.01
(1 row)
This query will overflow the tm_sec field of the struct pg_tm, however
it's not actually out of range for the Interval. I'm not sure the best
way to handle this right now, but I think it would be best to leave it
for a future patch. Perhaps the time related fields in struct pg_tm
need to be changed to 64 bit ints.

- Joe Koshakow
From 5750dcfbc00cb1259263e9986898f1960edf9c7f Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Fri, 11 Feb 2022 14:18:32 -0500
Subject: [PATCH] Check for overflow when decoding an interval

When decoding an interval there are various date units which are
aliases for some multiple of another unit. For example a week is 7 days
and a decade is 10 years. When decoding these specific units, there is
no check for overflow, allowing the interval to overflow. This commit
adds an overflow check for all of these units.

Additionally fractional date/time units are rolled down into the next
smallest unit. For example 0.1 months is 3 days. When adding these
fraction units, there is no check for overflow, allowing the interval
to overflow. This commit adds an overflow check for all of the
fractional units.

Additionally adding the word "ago" to the interval negates every
field. However the negative of INT_MIN is still INT_MIN. This
commit adds a check to make sure that we don't try and negate
a field if it's INT_MIN.

Signed-off-by: Joseph Koshakow 
---
 src/backend/utils/adt/datetime.c   | 115 +++--
 src/test/regress/expected/interval.out | 112 
 src/test/regress/sql/interval.sql  |  29 +++
 3 files changed, 232 insertions(+), 24 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 7926258c06..1c2ab58f82 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -21,6 +21,7 @@
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/string.h"
 #include "funcapi.h"
 #include "miscadmin.h"
@@ -46,8 +47,11 @@ static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 		   int precision, bool fillzeros);
 static void AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
 			   int scale);
-static void AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
+static bool AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
 			int scale);
+static bool AdjustFractMonths(double frac, struct pg_tm *tm, int scale);
+static bool AdjustDays(int val, struct pg_tm *tm, int multiplier);
+static bool AdjustYears(int val, struct pg_tm *tm, int multiplier);
 static int	DetermineTimeZoneOffsetInternal(struct pg_tm *tm, pg_tz *tzp,
 			pg_time_t *tp);
 static bool DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t,
@@ -515,18 +519,56 @@ AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
 }
 
 /* As above, but initial scale produces days */
-static void
+static bool
 AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
 {
 	int			extra_days;
 
 	if (frac == 0)
-		return;
+		return true;
 	frac *= scale;
 	extra_days = (int) frac;
-	tm->tm_mday += extra_days;
+	if (pg_add_s32_overflow(tm->tm_mday, extra_days, >tm_mday))
+		return false;
 	frac -= extra_days;
 	AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
+	return true;
+}
+
+/* As above, but initial scale produces months */
+static bool
+AdjustFractMonths(double frac, struct pg_tm *tm, int scale)
+{
+	int months = rint(frac * MONTHS_PER_YEAR * scale);
+	return !pg_add_s32_overflow(tm->tm_mon, months, >tm_mon);
+}
+
+/*
+ * Multiply val by multiplier (to produce days) and add to *tm.
+ * Returns true if successful, false if tm overflows.
+ */
+static bool
+AdjustDays(int val, struct pg_tm *tm, int multiplier)
+{
+	int			extra_days;
+	return !pg_mul_s32_overflow(val, multiplier, _days) &&
+		!pg_add_s32_overflow(tm->tm_mday, extra_days, >tm_mday);
+}
+
+/* As above, but initial val produces months */
+static bool
+AdjustMonths(int val, struct pg_tm *tm)
+{
+	return !pg_add_s32_overflow(tm->tm_mon, val, >tm_mon);
+}
+
+/* As above, but initial val produces years */
+static bool
+AdjustYears(int val, struct pg_tm *tm, int multiplier)
+{
+	int			years;
+	return !pg_mul_s32_overflow(val, multiplier, ) &&
+		!pg_add_s32_overflow(tm->tm_year, years, >tm_year);
 }
 
 /* Fetch a fractional-second value with suitable error checking */
@@ -3287,44 +3329,56 @@ DecodeInterval(char **field, int 

Re: Fix overflow in DecodeInterval

2022-02-15 Thread Andres Freund
Hi,

On 2022-02-15 06:44:40 -0500, Joseph Koshakow wrote:
> Thanks for your feedback! I'm not super familiar with the process,
> should I submit this thread to the commitfest or just leave it as is?

Submit it to the CF - then we get an automatic test on a few platforms. I
think we can apply it quickly after that...

Greetings,

Andres Freund




Re: Fix overflow in DecodeInterval

2022-02-15 Thread Joseph Koshakow
On Sun, Feb 13, 2022 at 5:12 PM Joseph Koshakow  wrote:
>
> Attached is a new version switching from ints to bools, as requested.
>
> - Joe Koshakow

Tom, Andres,

Thanks for your feedback! I'm not super familiar with the process,
should I submit this thread to the commitfest or just leave it as is?

Thanks,
Joe Koshakow




Re: Fix overflow in DecodeInterval

2022-02-13 Thread Joseph Koshakow
Attached is a new version switching from ints to bools, as requested.

- Joe Koshakow
From af8f030ad146602b4386f77b5664c6013743569b Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Fri, 11 Feb 2022 14:18:32 -0500
Subject: [PATCH] Check for overflow when decoding an interval

When decoding an interval there are various date units which are
aliases for some multiple of another unit. For example a week is 7 days
and a decade is 10 years. When decoding these specific units, there is
no check for overflow, allowing the interval to overflow. This commit
adds an overflow check for all of these units.

Additionally fractional date/time units are rolled down into the next
smallest unit. For example 0.1 months is 3 days. When adding these
fraction units, there is no check for overflow, allowing the interval
to overflow. This commit adds an overflow check for all of the
fractional units.

Additionally adding the word "ago" to the interval negates every
field. However the negative of INT_MIN is still INT_MIN. This
commit adds a check to make sure that we don't try and negate
a field if it's INT_MIN.

Signed-off-by: Joseph Koshakow 
---
 src/backend/utils/adt/datetime.c   | 100 +++--
 src/test/regress/expected/interval.out |  68 +
 src/test/regress/sql/interval.sql  |  18 +
 3 files changed, 164 insertions(+), 22 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 7926258c06..15a1ebbe67 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -21,6 +21,7 @@
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/string.h"
 #include "funcapi.h"
 #include "miscadmin.h"
@@ -44,10 +45,13 @@ static int	DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 	   struct pg_tm *tm);
 static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 		   int precision, bool fillzeros);
-static void AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
+static bool AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
 			   int scale);
-static void AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
+static bool AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
 			int scale);
+static bool AdjustFractMonths(double frac, struct pg_tm *tm, int scale);
+static bool AdjustDays(int val, struct pg_tm *tm, int multiplier);
+static bool AdjustYears(int val, struct pg_tm *tm, int multiplier);
 static int	DetermineTimeZoneOffsetInternal(struct pg_tm *tm, pg_tz *tzp,
 			pg_time_t *tp);
 static bool DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t,
@@ -499,34 +503,68 @@ AppendTimestampSeconds(char *cp, struct pg_tm *tm, fsec_t fsec)
 /*
  * Multiply frac by scale (to produce seconds) and add to *tm & *fsec.
  * We assume the input frac is less than 1 so overflow is not an issue.
+ * Returns true if successful, false if tm overflows.
  */
-static void
+static bool
 AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
 {
 	int			sec;
 
 	if (frac == 0)
-		return;
+		return true;
 	frac *= scale;
 	sec = (int) frac;
-	tm->tm_sec += sec;
+	if (pg_add_s32_overflow(tm->tm_sec, sec, >tm_sec))
+		return false;
 	frac -= sec;
 	*fsec += rint(frac * 100);
+	return true;
 }
 
 /* As above, but initial scale produces days */
-static void
+static bool
 AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
 {
 	int			extra_days;
 
 	if (frac == 0)
-		return;
+		return true;
 	frac *= scale;
 	extra_days = (int) frac;
-	tm->tm_mday += extra_days;
+	if (pg_add_s32_overflow(tm->tm_mday, extra_days, >tm_mday))
+		return false;
 	frac -= extra_days;
-	AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
+	return AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
+}
+
+/* As above, but initial scale produces months */
+static bool
+AdjustFractMonths(double frac, struct pg_tm *tm, int scale)
+{
+	int months = rint(frac * MONTHS_PER_YEAR * scale);
+
+	return !pg_add_s32_overflow(tm->tm_mon, months, >tm_mon);
+}
+
+/*
+ * Multiply val by multiplier (to produce days) and add to *tm.
+ * Returns true if successful, false if tm overflows.
+ */
+static bool
+AdjustDays(int val, struct pg_tm *tm, int multiplier)
+{
+	int			extra_days;
+	return !pg_mul_s32_overflow(val, multiplier, _days) &&
+		!pg_add_s32_overflow(tm->tm_mday, extra_days, >tm_mday);
+}
+
+/* As above, but initial val produces years */
+static bool
+AdjustYears(int val, struct pg_tm *tm, int multiplier)
+{
+	int			years;
+	return !pg_mul_s32_overflow(val, multiplier, ) &&
+		!pg_add_s32_overflow(tm->tm_year, years, >tm_year);
 }
 
 /* Fetch a fractional-second value with suitable error checking */
@@ -3275,56 +3313,69 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 
 	case DTK_MINUTE:
 		tm->tm_min += val;
-		AdjustFractSeconds(fval, tm, fsec, SECS_PER_MINUTE);

Re: Fix overflow in DecodeInterval

2022-02-13 Thread Tom Lane
Andres Freund  writes:
> Do we want to consider backpatching these fixes?

I wasn't thinking that we should.  It's a behavioral change and
people might not be pleased to have it appear in minor releases.

regards, tom lane




Re: Fix overflow in DecodeInterval

2022-02-13 Thread Andres Freund
Hi,

On 2022-02-13 09:35:47 -0500, Joseph Koshakow wrote:
> I chose int return types to keep all these methods
> consistent with DecodeInterval, which returns a
> non-zero int to indicate an error.

That's different, because it actually returns different types of errors. IMO
that difference is actually reason to use a bool for the new cases, because
then it's a tad clearer that they don't return DTERR_*.

> Though I wasn't sure
> if an int or bool would be best, so I'm happy to change
> to bool if people think that's better.

+1 or bool.


> Also I'm realizing now that I've incorrectly been using the
> number of the patch to indicate the version, instead of just
> sticking a v3 to the front. So sorry about that, all the patches
> I sent in this thread are the same patch, just different versions.

No worries ;)


Do we want to consider backpatching these fixes? If so, I'd argue for skipping
10, because it doesn't have int.h stuff yet. There's also the issue with
potentially breaking indexes / constraints? Not that goes entirely away across
major versions...

Greetings,

Andres Freund




Re: Fix overflow in DecodeInterval

2022-02-13 Thread Joseph Koshakow
Actually found an additional overflow issue in
DecodeInterval. I've updated the patch with the
fix. Specifically it prevents trying to negate a field
if it equals INT_MIN. Let me know if this is best
put into another patch.

- Joe Koshakow
From 09eafa9b496c8461f2dc52ea62c9e833ab10a17f Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Fri, 11 Feb 2022 14:18:32 -0500
Subject: [PATCH] Check for overflow when decoding an interval

When decoding an interval there are various date units which are
aliases for some multiple of another unit. For example a week is 7 days
and a decade is 10 years. When decoding these specific units, there is
no check for overflow, allowing the interval to overflow. This commit
adds an overflow check for all of these units.

Additionally fractional date/time units are rolled down into the next
smallest unit. For example 0.1 months is 3 days. When adding these
fraction units, there is no check for overflow, allowing the interval
to overflow. This commit adds an overflow check for all of the
fractional units.

Additionally adding the word "ago" to the interval negates every
field. However the negative of INT_MIN is still INT_MIN. This
commit adds a check to make sure that we don't try and negate
a field if it's INT_MIN.

Signed-off-by: Joseph Koshakow 
---
 src/backend/utils/adt/datetime.c   | 108 -
 src/test/regress/expected/interval.out |  68 
 src/test/regress/sql/interval.sql  |  18 +
 3 files changed, 172 insertions(+), 22 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 7926258c06..49bd12075e 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -21,6 +21,7 @@
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/string.h"
 #include "funcapi.h"
 #include "miscadmin.h"
@@ -44,10 +45,13 @@ static int	DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 	   struct pg_tm *tm);
 static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 		   int precision, bool fillzeros);
-static void AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
+static int AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
 			   int scale);
-static void AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
+static int AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
 			int scale);
+static int AdjustFractMonths(double frac, struct pg_tm *tm, int scale);
+static int AdjustDays(int val, struct pg_tm *tm, int multiplier);
+static int AdjustYears(int val, struct pg_tm *tm, int multiplier);
 static int	DetermineTimeZoneOffsetInternal(struct pg_tm *tm, pg_tz *tzp,
 			pg_time_t *tp);
 static bool DetermineTimeZoneAbbrevOffsetInternal(pg_time_t t,
@@ -499,34 +503,76 @@ AppendTimestampSeconds(char *cp, struct pg_tm *tm, fsec_t fsec)
 /*
  * Multiply frac by scale (to produce seconds) and add to *tm & *fsec.
  * We assume the input frac is less than 1 so overflow is not an issue.
+ * Returns 0 if successful, 1 if tm overflows.
  */
-static void
+static int
 AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
 {
 	int			sec;
 
 	if (frac == 0)
-		return;
+		return 0;
 	frac *= scale;
 	sec = (int) frac;
-	tm->tm_sec += sec;
+	if (pg_add_s32_overflow(tm->tm_sec, sec, >tm_sec))
+		return 1;
 	frac -= sec;
 	*fsec += rint(frac * 100);
+	return 0;
 }
 
 /* As above, but initial scale produces days */
-static void
+static int
 AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec, int scale)
 {
 	int			extra_days;
 
 	if (frac == 0)
-		return;
+		return 0;
 	frac *= scale;
 	extra_days = (int) frac;
-	tm->tm_mday += extra_days;
+	if (pg_add_s32_overflow(tm->tm_mday, extra_days, >tm_mday))
+		return 1;
 	frac -= extra_days;
-	AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
+	return AdjustFractSeconds(frac, tm, fsec, SECS_PER_DAY);
+}
+
+/* As above, but initial scale produces months */
+static int
+AdjustFractMonths(double frac, struct pg_tm *tm, int scale)
+{
+	int months = rint(frac * MONTHS_PER_YEAR * scale);
+
+	if (pg_add_s32_overflow(tm->tm_mon, months, >tm_mon))
+		return 1;
+	return 0;
+}
+
+/*
+ * Multiply val by multiplier (to produce days) and add to *tm.
+ * Returns 0 if successful, 1 if tm overflows.
+ */
+static int
+AdjustDays(int val, struct pg_tm *tm, int multiplier)
+{
+	int			extra_days;
+	if (pg_mul_s32_overflow(val, multiplier, _days))
+		return 1;
+	if (pg_add_s32_overflow(tm->tm_mday, extra_days, >tm_mday))
+		return 1;
+	return 0;
+}
+
+/* As above, but initial val produces years */
+static int
+AdjustYears(int val, struct pg_tm *tm, int multiplier)
+{
+	int			years;
+	if (pg_mul_s32_overflow(val, multiplier, ))
+		return 1;
+	if (pg_add_s32_overflow(tm->tm_year, years, >tm_year))
+		return 1;
+	return 0;
 }
 
 /* Fetch a fractional-second value with suitable error 

Re: Fix overflow in DecodeInterval

2022-02-13 Thread Joseph Koshakow
On Sat, Feb 12, 2022 at 10:51 PM Andres Freund  wrote:
> Any reason for using int return types?
>
> particularly since the pg_*_overflow stuff uses bool?

I chose int return types to keep all these methods
consistent with DecodeInterval, which returns a
non-zero int to indicate an error. Though I wasn't sure
if an int or bool would be best, so I'm happy to change
to bool if people think that's better.

Also I'm realizing now that I've incorrectly been using the
number of the patch to indicate the version, instead of just
sticking a v3 to the front. So sorry about that, all the patches
I sent in this thread are the same patch, just different versions.

- Joe Koshakow




Re: Fix overflow in DecodeInterval

2022-02-12 Thread Andres Freund
Hi,

On 2022-02-12 21:16:05 -0500, Joseph Koshakow wrote:
> I've attached the patch below.

Any reason for using int return types?


> +/* As above, but initial val produces years */
> +static int
> +AdjustYears(int val, struct pg_tm *tm, int multiplier)
> +{
> + int years;
> + if (pg_mul_s32_overflow(val, multiplier, ))
> + return 1;
> + if (pg_add_s32_overflow(tm->tm_year, years, >tm_year))
> + return 1;
> + return 0;
>  }

particularly since the pg_*_overflow stuff uses bool?


Greetings,

Andres Freund




Re: Fix overflow in DecodeInterval

2022-02-12 Thread Joseph Koshakow
On Fri, Feb 11, 2022 at 4:58 PM Joseph Koshakow  wrote:
>
> Tom Lane  writes:
> > Joseph Koshakow  writes:
> > > The attached patch fixes an overflow bug in DecodeInterval when applying
> > > the units week, decade, century, and millennium. The overflow check logic
> > > was modelled after the overflow check at the beginning of `int
> > > tm2interval(struct pg_tm *tm, fsec_t fsec, Interval *span);` in 
> > > timestamp.c.
> >
> >
> > Good catch, but I don't think that tm2interval code is best practice
> > anymore.  Rather than bringing "double" arithmetic into the mix,
> > you should use the overflow-detecting arithmetic functions in
> > src/include/common/int.h.  The existing code here is also pretty
> > faulty in that it doesn't notice addition overflow when combining
> > multiple units.  So for example, instead of
> >
> >
> > tm->tm_mday += val * 7;
> >
> >
> > I think we should write something like
> >
> >
> > if (pg_mul_s32_overflow(val, 7, ))
> > return DTERR_FIELD_OVERFLOW;
> > if (pg_add_s32_overflow(tm->tm_mday, tmp, >tm_mday))
> > return DTERR_FIELD_OVERFLOW;
> >
> >
> > Perhaps some macros could be used to make this more legible?
> >
> >
> > regards, tom lane
> >
> >
> > @postgresql
>
> Thanks for the feedback Tom, I've attached an updated patch with
> your suggestions. Feel free to rename the horribly named macro.
>
> Also while fixing this I noticed that fractional intervals can also
> cause an overflow issue.
> postgres=# SELECT INTERVAL '0.1 months 2147483647 days';
>  interval
> --
>  -2147483646 days
> (1 row)
> I haven't looked into it, but it's probably a similar cause.

Hey Tom,

I was originally going to fix the fractional issue in a follow-up
patch, but I took a look at it and decided to make a new patch
with both fixes. I tried to make the handling of fractional and
non-fractional units consistent. I've attached the patch below.

While investigating this, I've noticed a couple more overflow
issues with Intervals, but I think they're best handled in separate
patches. I've included the ones I've found in this email.

  postgres=# CREATE TEMP TABLE INTERVAL_TBL_OF (f1 interval);
  CREATE TABLE
  postgres=# INSERT INTO INTERVAL_TBL_OF (f1) VALUES ('0.1 weeks
2147483647 hrs');
  INSERT 0 1
  postgres=# SELECT * FROM INTERVAL_TBL_OF;
  ERROR:  interval out of range

  postgres=# SELECT justify_days(INTERVAL '2147483647 months 2147483647 days');
 justify_days
  ---
   -172991738 years -4 mons -23 days
  (1 row)

Cheers,
Joe Koshakow
From de041874b73600312e7ce71931c297f0aad3aa06 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Fri, 11 Feb 2022 14:18:32 -0500
Subject: [PATCH] Check for overflow when decoding an interval

When decoding an interval there are various date units which are
aliases for some multiple of another unit. For example a week is 7 days
and a decade is 10 years. When decoding these specific units, there is
no check for overflow, allowing the interval to overflow. This commit
adds an overflow check for all of these units.

Additionally fractional date/time units are rolled down into the next
smallest unit. For example 0.1 months is 3 days. When adding these
fraction units, there is no check for overflow, allowing the interval
to overflow. This commit adds an overflow check for all of the
fractional units.

Signed-off-by: Joseph Koshakow 
---
 src/backend/utils/adt/datetime.c   | 103 +++--
 src/test/regress/expected/interval.out |  56 ++
 src/test/regress/sql/interval.sql  |  15 
 3 files changed, 152 insertions(+), 22 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 7926258c06..f89cc7c29b 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -21,6 +21,7 @@
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/string.h"
 #include "funcapi.h"
 #include "miscadmin.h"
@@ -44,10 +45,13 @@ static int	DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 	   struct pg_tm *tm);
 static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 		   int precision, bool fillzeros);
-static void AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
+static int AdjustFractSeconds(double frac, struct pg_tm *tm, fsec_t *fsec,
 			   int scale);
-static void AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
+static int AdjustFractDays(double frac, struct pg_tm *tm, fsec_t *fsec,
 			int scale);
+static int AdjustFractMonths(double frac, struct pg_tm *tm, int scale);
+static int AdjustDays(int val, struct pg_tm *tm, int multiplier);
+static int AdjustYears(int val, struct pg_tm *tm, int multiplier);
 static int	DetermineTimeZoneOffsetInternal(struct pg_tm *tm, pg_tz *tzp,
 			pg_time_t *tp);
 static bool 

Re: Fix overflow in DecodeInterval

2022-02-11 Thread Joseph Koshakow
Tom Lane  writes:
> Joseph Koshakow  writes:
> > The attached patch fixes an overflow bug in DecodeInterval when applying
> > the units week, decade, century, and millennium. The overflow check logic
> > was modelled after the overflow check at the beginning of `int
> > tm2interval(struct pg_tm *tm, fsec_t fsec, Interval *span);` in timestamp.c.
>
>
> Good catch, but I don't think that tm2interval code is best practice
> anymore.  Rather than bringing "double" arithmetic into the mix,
> you should use the overflow-detecting arithmetic functions in
> src/include/common/int.h.  The existing code here is also pretty
> faulty in that it doesn't notice addition overflow when combining
> multiple units.  So for example, instead of
>
>
> tm->tm_mday += val * 7;
>
>
> I think we should write something like
>
>
> if (pg_mul_s32_overflow(val, 7, ))
> return DTERR_FIELD_OVERFLOW;
> if (pg_add_s32_overflow(tm->tm_mday, tmp, >tm_mday))
> return DTERR_FIELD_OVERFLOW;
>
>
> Perhaps some macros could be used to make this more legible?
>
>
> regards, tom lane
>
>
> @postgresql

Thanks for the feedback Tom, I've attached an updated patch with
your suggestions. Feel free to rename the horribly named macro.

Also while fixing this I noticed that fractional intervals can also
cause an overflow issue.
postgres=# SELECT INTERVAL '0.1 months 2147483647 days';
 interval
--
 -2147483646 days
(1 row)
I haven't looked into it, but it's probably a similar cause.
From 49b5beb4132a0c9e793e1fd7b91a4da0c96a6196 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow 
Date: Fri, 11 Feb 2022 14:18:32 -0500
Subject: [PATCH] Check for overflow when decoding an interval

When decoding an interval there are various date units which are
aliases for some multiple of another unit. For example a week is 7 days
and a decade is 10 years. When decoding these specific units, there is
no check for overflow, allowing the interval to overflow. This commit
adds an overflow check for all of these units.
---
 src/backend/utils/adt/datetime.c   | 22 ++
 src/test/regress/expected/interval.out | 32 ++
 src/test/regress/sql/interval.sql  |  8 +++
 3 files changed, 58 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 7926258c06..59d391adda 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -21,6 +21,7 @@
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/pg_type.h"
+#include "common/int.h"
 #include "common/string.h"
 #include "funcapi.h"
 #include "miscadmin.h"
@@ -3106,6 +3107,19 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 	int			val;
 	double		fval;
 
+	/*
+	 * Multiply src by mult and add it to dest while checking for overflow.
+	 */
+#define SAFE_MUL_ADD(src, mult, dst)	\
+	do\
+	{\
+		int tmp;		\
+		if (pg_mul_s32_overflow(src, mult, ))		\
+			return DTERR_FIELD_OVERFLOW;			\
+		if (pg_add_s32_overflow(dst, tmp, &(dst)))		\
+			return DTERR_FIELD_OVERFLOW;			\
+	} while (0)
+
 	*dtype = DTK_DELTA;
 	type = IGNORE_DTF;
 	ClearPgTm(tm, fsec);
@@ -3293,7 +3307,7 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 		break;
 
 	case DTK_WEEK:
-		tm->tm_mday += val * 7;
+		SAFE_MUL_ADD(val, 7, tm->tm_mday);
 		AdjustFractDays(fval, tm, fsec, 7);
 		tmask = DTK_M(WEEK);
 		break;
@@ -3311,19 +3325,19 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 		break;
 
 	case DTK_DECADE:
-		tm->tm_year += val * 10;
+		SAFE_MUL_ADD(val, 10, tm->tm_year);
 		tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 10);
 		tmask = DTK_M(DECADE);
 		break;
 
 	case DTK_CENTURY:
-		tm->tm_year += val * 100;
+		SAFE_MUL_ADD(val, 100, tm->tm_year);
 		tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 100);
 		tmask = DTK_M(CENTURY);
 		break;
 
 	case DTK_MILLENNIUM:
-		tm->tm_year += val * 1000;
+		SAFE_MUL_ADD(val, 1000, tm->tm_year);
 		tm->tm_mon += rint(fval * MONTHS_PER_YEAR * 1000);
 		tmask = DTK_M(MILLENNIUM);
 		break;
diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out
index accd4a7d90..88865483fa 100644
--- a/src/test/regress/expected/interval.out
+++ b/src/test/regress/expected/interval.out
@@ -232,6 +232,38 @@ INSERT INTO INTERVAL_TBL_OF (f1) VALUES ('-2147483648 years');
 ERROR:  interval out of range
 LINE 1: INSERT INTO INTERVAL_TBL_OF (f1) VALUES ('-2147483648 years'...
  ^
+INSERT INTO INTERVAL_TBL_OF (f1) VALUES ('2147483647 weeks');
+ERROR:  interval field value out of range: "2147483647 weeks"
+LINE 1: INSERT INTO INTERVAL_TBL_OF (f1) VALUES ('2147483647 weeks')...
+ ^
+INSERT INTO INTERVAL_TBL_OF (f1) VALUES ('-2147483648 weeks');

Re: Fix overflow in DecodeInterval

2022-02-11 Thread Tom Lane
Joseph Koshakow  writes:
> The attached patch fixes an overflow bug in DecodeInterval when applying
> the units week, decade, century, and millennium. The overflow check logic
> was modelled after the overflow check at the beginning of `int
> tm2interval(struct pg_tm *tm, fsec_t fsec, Interval *span);` in timestamp.c.

Good catch, but I don't think that tm2interval code is best practice
anymore.  Rather than bringing "double" arithmetic into the mix,
you should use the overflow-detecting arithmetic functions in
src/include/common/int.h.  The existing code here is also pretty
faulty in that it doesn't notice addition overflow when combining
multiple units.  So for example, instead of

tm->tm_mday += val * 7;

I think we should write something like

if (pg_mul_s32_overflow(val, 7, ))
return DTERR_FIELD_OVERFLOW;
if (pg_add_s32_overflow(tm->tm_mday, tmp, >tm_mday))
return DTERR_FIELD_OVERFLOW;

Perhaps some macros could be used to make this more legible?

regards, tom lane