Jacob Champion <jchamp...@timescale.com> writes: > Hi Joe, here's a partial review:
Thanks so much for the review! > I'm new to this code, but I agree that the use of `type` and the > lookahead are not particularly obvious/intuitive. At the very least, > they'd need some more explanation in the code. Your boolean flag idea > sounds reasonable, though. I've updated the patch with the boolean flag idea. I think it's a bit cleaner and more readable. >> There is one more problem I noticed, but didn't fix. We allow multiple >> "@" to be sprinkled anywhere in the input, even though the docs [0] >> only allow it to appear at the beginning of the input. > > (No particular opinion on this.) I looked into this a bit. The reason this works is because the date time lexer filters out all punctuation. That's what allows us to parse things like `SELECT date 'January 8, 1999';`. It's probably not worth trying to be smarter about what punctuation we allow where, at least for now. Maybe in the future we can exclude "@" from the punctuation that get's filtered out. > It looks like this patch needs a rebase for the CI, too, but there are > no conflicts. The attached patch is rebased against master. Thanks, Joe Koshakow
From eee98dd65c3556528803b6ee2cab10e9ece8d871 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow <kosh...@gmail.com> Date: Sun, 9 Apr 2023 20:37:27 -0400 Subject: [PATCH] Fix interval decode handling of invalid intervals This patch does three things in the DecodeInterval function: 1) Removes dead code for handling unit type RESERVE. There used to be a unit called "invalid" that was of type RESERVE. At some point that unit was removed and there were no more units of type RESERVE. Therefore, the code for RESERVE unit handling is unreachable. 2) Restrict the unit "ago" to only appear at the end of the interval. According to the docs [0], this is the only valid place to put it, but we allowed it multiple times at any point in the input. 3) Error when the user has multiple consecutive units or a unit without an accompanying value. [0] https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT --- src/backend/utils/adt/datetime.c | 25 +++++++++++++++++++------ src/test/regress/expected/interval.out | 18 ++++++++++++++++++ src/test/regress/sql/interval.sql | 7 +++++++ 3 files changed, 44 insertions(+), 6 deletions(-) diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c index 5d8d583ddc..b930a67007 100644 --- a/src/backend/utils/adt/datetime.c +++ b/src/backend/utils/adt/datetime.c @@ -3278,6 +3278,7 @@ DecodeInterval(char **field, int *ftype, int nf, int range, { bool force_negative = false; bool is_before = false; + bool parsing_unit_val = false; char *cp; int fmask = 0, tmask, @@ -3336,6 +3337,7 @@ DecodeInterval(char **field, int *ftype, int nf, int range, itm_in->tm_usec > 0) itm_in->tm_usec = -itm_in->tm_usec; type = DTK_DAY; + parsing_unit_val = false; break; case DTK_TZ: @@ -3373,6 +3375,7 @@ DecodeInterval(char **field, int *ftype, int nf, int range, * are reading right to left. */ type = DTK_DAY; + parsing_unit_val = false; break; } @@ -3562,10 +3565,14 @@ DecodeInterval(char **field, int *ftype, int nf, int range, default: return DTERR_BAD_FORMAT; } + parsing_unit_val = false; break; case DTK_STRING: case DTK_SPECIAL: + /* reject consecutive unhandled units */ + if (parsing_unit_val) + return DTERR_BAD_FORMAT; type = DecodeUnits(i, field[i], &uval); if (type == IGNORE_DTF) continue; @@ -3575,16 +3582,18 @@ DecodeInterval(char **field, int *ftype, int nf, int range, { case UNITS: type = uval; + parsing_unit_val = true; break; case AGO: - is_before = true; - type = uval; - break; - case RESERV: - tmask = (DTK_DATE_M | DTK_TIME_M); - *dtype = uval; + /* + * 'ago' is only allowed to appear at the end of the + * interval. + */ + if (i != nf - 1) + return DTERR_BAD_FORMAT; + is_before = true; break; default: @@ -3605,6 +3614,10 @@ DecodeInterval(char **field, int *ftype, int nf, int range, if (fmask == 0) return DTERR_BAD_FORMAT; + /* reject if unit appeared and was never handled */ + if (parsing_unit_val) + return DTERR_BAD_FORMAT; + /* finally, AGO negates everything */ if (is_before) { diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out index 28b71d9681..7aba799351 100644 --- a/src/test/regress/expected/interval.out +++ b/src/test/regress/expected/interval.out @@ -1787,3 +1787,21 @@ SELECT extract(epoch from interval '1000000000 days'); 86400000000000.000000 (1 row) +-- test that ago can only appear once at the end of the interval. +SELECT INTERVAL '42 days 2 seconds ago ago'; +ERROR: invalid input syntax for type interval: "42 days 2 seconds ago ago" +LINE 1: SELECT INTERVAL '42 days 2 seconds ago ago'; + ^ +SELECT INTERVAL '2 minutes ago 5 days'; +ERROR: invalid input syntax for type interval: "2 minutes ago 5 days" +LINE 1: SELECT INTERVAL '2 minutes ago 5 days'; + ^ +-- test that consecutive and dangling units are not allowed. +SELECT INTERVAL 'hour 5 months'; +ERROR: invalid input syntax for type interval: "hour 5 months" +LINE 1: SELECT INTERVAL 'hour 5 months'; + ^ +SELECT INTERVAL '1 year months days 5 hours'; +ERROR: invalid input syntax for type interval: "1 year months days 5 hours" +LINE 1: SELECT INTERVAL '1 year months days 5 hours'; + ^ diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql index 56feda1a3d..b228be14fd 100644 --- a/src/test/regress/sql/interval.sql +++ b/src/test/regress/sql/interval.sql @@ -582,3 +582,10 @@ SELECT f1, -- internal overflow test case SELECT extract(epoch from interval '1000000000 days'); + +-- test that ago can only appear once at the end of the interval. +SELECT INTERVAL '42 days 2 seconds ago ago'; +SELECT INTERVAL '2 minutes ago 5 days'; +-- test that consecutive and dangling units are not allowed. +SELECT INTERVAL 'hour 5 months'; +SELECT INTERVAL '1 year months days 5 hours'; -- 2.34.1