Hi all,

This patch does three things in the DecodeInterval function:

1) Removes dead code for handling unit type RESERVE. There used to be
a unit called "invalid" that was of type RESERVE. At some point that
unit was removed and there were no more units of type RESERVE.
Therefore, the code for RESERVE unit handling is unreachable.

2) Restrict the unit "ago" to only appear at the end of the
interval. According to the docs [0], this is the only valid place to
put it, but we allowed it multiple times at any point in the input.

3) Error when the user has multiple consecutive units or a unit without
an accompanying value. I spent a lot of time trying to come up with
robust ways to detect this and ultimately settled on using the "type"
field. I'm not entirely happy with this approach, because it involves
having to look ahead to the next field in a couple of places. The other
approach I was considering was to introduce a boolean flag called
"unhandled_unit". After parsing a unit it would be set to true, after
applying the unit to a number it would be set to false. If it was true
right before parsing a unit, then we would error. Please let me know
if you have any suggestions here.

There is one more problem I noticed, but didn't fix. We allow multiple
"@" to be sprinkled anywhere in the input, even though the docs [0]
only allow it to appear at the beginning of the input. For example,
the following query works fine:

    # SELECT INTERVAL '1 @ year @ @ @ 6 days @ 10 @ minutes';
            interval
    ------------------------
     1 year 6 days 00:10:00
    (1 row)

Unfortunately, this can't be fixed in DecodeInterval, because all of
the "@" fields are filtered out before this method. Additionally, I
believe this means that the lines

     if (type == IGNORE_DTF)
         continue;

in DecodeInterval, that appears right after decoding the units, are
unreachable since
"@" is the only unit of type IGNORE_DTF. Since "@" is filtered out,
we'll never decode a unit of type IGNORE_DTF.

For reference, I previously merged a couple similar patches to this
one, but for other date-time types [1], [2].

Thanks,
Joe Koshakow

[0]
https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
[1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=5b3c5953553bb9fb0b171abc6041e7c7e9ca5b4d
[2]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=bcc704b52490492e6bd73c4444056b3e9644504d
From 4c5641f15e5409ef5973a5f305352018f06da538 Mon Sep 17 00:00:00 2001
From: Joseph Koshakow <kosh...@gmail.com>
Date: Sun, 9 Apr 2023 20:37:27 -0400
Subject: [PATCH] Fix interval decode handling of invalid intervals

This patch does three things in the DecodeInterval function:

1) Removes dead code for handling unit type RESERVE. There used to be
a unit called "invalid" that was of type RESERVE. At some point that
unit was removed and there were no more units of type RESERVE.
Therefore, the code for RESERVE unit handling is unreachable.

2) Restrict the unit "ago" to only appear at the end of the
interval. According to the docs [0], this is the only valid place to
put it, but we allowed it multiple times at any point in the input.

3) Error when the user has multiple consecutive units or a unit without
an accompanying value.

[0] https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
---
 src/backend/utils/adt/datetime.c       | 55 +++++++++++++++++++-------
 src/test/regress/expected/interval.out | 18 +++++++++
 src/test/regress/sql/interval.sql      |  7 ++++
 3 files changed, 65 insertions(+), 15 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index be2e55bb29..17fc0d45ea 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3335,7 +3335,8 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 				if (force_negative &&
 					itm_in->tm_usec > 0)
 					itm_in->tm_usec = -itm_in->tm_usec;
-				type = DTK_DAY;
+				if (i != 0 && ftype[i - 1] != DTK_STRING && ftype[i - 1] != DTK_SPECIAL)
+					type = DTK_DAY;
 				break;
 
 			case DTK_TZ:
@@ -3372,7 +3373,8 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 					 * specified. This handles the case of '1 +02:03' since we
 					 * are reading right to left.
 					 */
-					type = DTK_DAY;
+					if (i != 0 && ftype[i - 1] != DTK_STRING && ftype[i - 1] != DTK_SPECIAL)
+						type = DTK_DAY;
 					break;
 				}
 
@@ -3475,12 +3477,14 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 						if (!AdjustMicroseconds(val, fval, 1, itm_in))
 							return DTERR_FIELD_OVERFLOW;
 						tmask = DTK_M(MICROSECOND);
+						type = IGNORE_DTF;
 						break;
 
 					case DTK_MILLISEC:
 						if (!AdjustMicroseconds(val, fval, 1000, itm_in))
 							return DTERR_FIELD_OVERFLOW;
 						tmask = DTK_M(MILLISECOND);
+						type = IGNORE_DTF;
 						break;
 
 					case DTK_SECOND:
@@ -3495,19 +3499,24 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 							tmask = DTK_M(SECOND);
 						else
 							tmask = DTK_ALL_SECS_M;
+						type = IGNORE_DTF;
 						break;
 
 					case DTK_MINUTE:
 						if (!AdjustMicroseconds(val, fval, USECS_PER_MINUTE, itm_in))
 							return DTERR_FIELD_OVERFLOW;
 						tmask = DTK_M(MINUTE);
+						type = IGNORE_DTF;
 						break;
 
 					case DTK_HOUR:
 						if (!AdjustMicroseconds(val, fval, USECS_PER_HOUR, itm_in))
 							return DTERR_FIELD_OVERFLOW;
 						tmask = DTK_M(HOUR);
-						type = DTK_DAY; /* set for next field */
+						if (i != 0 && ftype[i - 1] != DTK_STRING && ftype[i - 1] != DTK_SPECIAL)
+							type = DTK_DAY; /* set for next field */
+						else
+							type = IGNORE_DTF;
 						break;
 
 					case DTK_DAY:
@@ -3515,6 +3524,7 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 							!AdjustFractMicroseconds(fval, USECS_PER_DAY, itm_in))
 							return DTERR_FIELD_OVERFLOW;
 						tmask = DTK_M(DAY);
+						type = IGNORE_DTF;
 						break;
 
 					case DTK_WEEK:
@@ -3522,6 +3532,7 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 							!AdjustFractDays(fval, 7, itm_in))
 							return DTERR_FIELD_OVERFLOW;
 						tmask = DTK_M(WEEK);
+						type = IGNORE_DTF;
 						break;
 
 					case DTK_MONTH:
@@ -3529,6 +3540,7 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 							!AdjustFractDays(fval, DAYS_PER_MONTH, itm_in))
 							return DTERR_FIELD_OVERFLOW;
 						tmask = DTK_M(MONTH);
+						type = IGNORE_DTF;
 						break;
 
 					case DTK_YEAR:
@@ -3536,6 +3548,7 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 							!AdjustFractYears(fval, 1, itm_in))
 							return DTERR_FIELD_OVERFLOW;
 						tmask = DTK_M(YEAR);
+						type = IGNORE_DTF;
 						break;
 
 					case DTK_DECADE:
@@ -3543,6 +3556,7 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 							!AdjustFractYears(fval, 10, itm_in))
 							return DTERR_FIELD_OVERFLOW;
 						tmask = DTK_M(DECADE);
+						type = IGNORE_DTF;
 						break;
 
 					case DTK_CENTURY:
@@ -3550,6 +3564,7 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 							!AdjustFractYears(fval, 100, itm_in))
 							return DTERR_FIELD_OVERFLOW;
 						tmask = DTK_M(CENTURY);
+						type = IGNORE_DTF;
 						break;
 
 					case DTK_MILLENNIUM:
@@ -3557,6 +3572,7 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 							!AdjustFractYears(fval, 1000, itm_in))
 							return DTERR_FIELD_OVERFLOW;
 						tmask = DTK_M(MILLENNIUM);
+						type = IGNORE_DTF;
 						break;
 
 					default:
@@ -3566,6 +3582,9 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 
 			case DTK_STRING:
 			case DTK_SPECIAL:
+				/* reject consecutive unhandled units */
+				if (type != IGNORE_DTF)
+					return DTERR_BAD_FORMAT;
 				type = DecodeUnits(i, field[i], &uval);
 				if (type == IGNORE_DTF)
 					continue;
@@ -3578,13 +3597,15 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 						break;
 
 					case AGO:
-						is_before = true;
-						type = uval;
-						break;
 
-					case RESERV:
-						tmask = (DTK_DATE_M | DTK_TIME_M);
-						*dtype = uval;
+						/*
+						 * 'ago' is only allowed to appear at the end of the
+						 * interval.
+						 */
+						if (i != nf - 1)
+							return DTERR_BAD_FORMAT;
+						is_before = true;
+						type = IGNORE_DTF;
 						break;
 
 					default:
@@ -3605,6 +3626,10 @@ DecodeInterval(char **field, int *ftype, int nf, int range,
 	if (fmask == 0)
 		return DTERR_BAD_FORMAT;
 
+	/* reject if unit appeared and was never handled */
+	if (type != IGNORE_DTF)
+		return DTERR_BAD_FORMAT;
+
 	/* finally, AGO negates everything */
 	if (is_before)
 	{
@@ -4482,17 +4507,17 @@ EncodeInterval(struct pg_itm *itm, int style, char *str)
 		case INTSTYLE_SQL_STANDARD:
 			{
 				bool		has_negative = year < 0 || mon < 0 ||
-				mday < 0 || hour < 0 ||
-				min < 0 || sec < 0 || fsec < 0;
+					mday < 0 || hour < 0 ||
+					min < 0 || sec < 0 || fsec < 0;
 				bool		has_positive = year > 0 || mon > 0 ||
-				mday > 0 || hour > 0 ||
-				min > 0 || sec > 0 || fsec > 0;
+					mday > 0 || hour > 0 ||
+					min > 0 || sec > 0 || fsec > 0;
 				bool		has_year_month = year != 0 || mon != 0;
 				bool		has_day_time = mday != 0 || hour != 0 ||
-				min != 0 || sec != 0 || fsec != 0;
+					min != 0 || sec != 0 || fsec != 0;
 				bool		has_day = mday != 0;
 				bool		sql_standard_value = !(has_negative && has_positive) &&
-				!(has_year_month && has_day_time);
+					!(has_year_month && has_day_time);
 
 				/*
 				 * SQL Standard wants only 1 "<sign>" preceding the whole
diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out
index 28b71d9681..7aba799351 100644
--- a/src/test/regress/expected/interval.out
+++ b/src/test/regress/expected/interval.out
@@ -1787,3 +1787,21 @@ SELECT extract(epoch from interval '1000000000 days');
  86400000000000.000000
 (1 row)
 
+-- test that ago can only appear once at the end of the interval.
+SELECT INTERVAL '42 days 2 seconds ago ago';
+ERROR:  invalid input syntax for type interval: "42 days 2 seconds ago ago"
+LINE 1: SELECT INTERVAL '42 days 2 seconds ago ago';
+                        ^
+SELECT INTERVAL '2 minutes ago 5 days';
+ERROR:  invalid input syntax for type interval: "2 minutes ago 5 days"
+LINE 1: SELECT INTERVAL '2 minutes ago 5 days';
+                        ^
+-- test that consecutive and dangling units are not allowed.
+SELECT INTERVAL 'hour 5 months';
+ERROR:  invalid input syntax for type interval: "hour 5 months"
+LINE 1: SELECT INTERVAL 'hour 5 months';
+                        ^
+SELECT INTERVAL '1 year months days 5 hours';
+ERROR:  invalid input syntax for type interval: "1 year months days 5 hours"
+LINE 1: SELECT INTERVAL '1 year months days 5 hours';
+                        ^
diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql
index 56feda1a3d..b228be14fd 100644
--- a/src/test/regress/sql/interval.sql
+++ b/src/test/regress/sql/interval.sql
@@ -582,3 +582,10 @@ SELECT f1,
 
 -- internal overflow test case
 SELECT extract(epoch from interval '1000000000 days');
+
+-- test that ago can only appear once at the end of the interval.
+SELECT INTERVAL '42 days 2 seconds ago ago';
+SELECT INTERVAL '2 minutes ago 5 days';
+-- test that consecutive and dangling units are not allowed.
+SELECT INTERVAL 'hour 5 months';
+SELECT INTERVAL '1 year months days 5 hours';
-- 
2.34.1

Reply via email to