On Wed, 28 Dec 2022 at 14:28, Andrew Dunstan <and...@dunslane.net> wrote: > > On 2022-12-27 Tu 09:55, Tom Lane wrote: > > We already accept that numeric input is different from numeric > > literals: you can't write Infinity or NaN in SQL without quotes. > > So I don't see an argument that we have to allow this in numeric > > input for consistency. > > That's almost the same, but not quite, ISTM. Those are things you can't > say without quotes, but here unless I'm mistaken you'd be disallowing > this style if you use quotes. I get the difficulties with input > functions, but it seems like we'll be building lots of grounds for > confusion. >
Yeah, it's easy to see why something like 'NaN' needs quotes, but it would be harder to explain why something like 1000_000 mustn't have quotes, and couldn't be used as input to COPY. My feeling is that we should try to make the datatype input functions accept anything that is legal syntax as a numeric literal, even if the reverse isn't always possible. That said, I think it's very important to minimise any performance hit, especially in the existing case of inputs with no underscores. Looking at the patch's changes to pg_strtointNN(), I think there's more that can be done to reduce that performance hit. As it stands, every input character is checked to see if it's an underscore, and then there's a new check at the end to ensure that the input string doesn't have a trailing underscore. Both of those can be avoided by rearranging things a little, as in the attached v2 patch. In the v2 patch, each input character is only compared with underscore if it's not a digit, so in the case of an input with no underscores or trailing spaces, the new checks for underscores are never executed. In addition, if an underscore is seen, it now checks that the next character is a digit. This eliminates the possibility of two underscores in a row, and also of a trailing underscore, and so there is no need for the final check for trailing underscores. Thus, if the input consists only of digits, it never has to test for underscores at all, and the performance hit for this case is minimised. My other concern with this patch is that the responsibility for handling underscores is distributed over a couple of different places. I had the same concern about the non-decimal integer patch, but at the time I couldn't see any way round it. Now that we have soft error handling though, I think that there is a way to improve this, centralising the logic for both underscore and non-decimal handling to one place for each datatype, reducing code duplication and the chances of bugs. For example, make_const() in the T_Float case has gained new code to parse both the sign and base-prefix of the input, duplicating the logic in pg_strtointNN(). That can now be avoided by having it call pg_strtoint64_safe() with an ErrorSaveContext, instead of strtoi64(). In the process, it would then gain the ability to handle underscores, so they wouldn't need to be stripped off elsewhere. Similarly, process_integer_literal() could be made to call pg_strtoint32_safe() with an ErrorSaveContext instead of strtoint(), and it then wouldn't need to strip off underscores, or be passed the number's base, since pg_strtoint32_safe() would handle all of that. In addition, I think that strip_underscores() could then go away if numeric_in() were made to handle underscores. Essentially then, that would move all responsibility for parsing underscores and non-decimal integers to the datatype input functions, or their support routines, rather than having it distributed. Regards, Dean
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml new file mode 100644 index 956182e..27e53b4 --- a/doc/src/sgml/syntax.sgml +++ b/doc/src/sgml/syntax.sgml @@ -729,6 +729,20 @@ $function$ </note> <para> + For visual grouping, underscores can be inserted between digits. These + have no further effect on the value of the literal. For example: +<literallayout>1_500_000_000 +0b10001000_00000000 +0o_1_755 +0xFFFF_FFFF +1.618_034 +</literallayout> + Underscores are not allowed at the start or end of a numeric constant or + a group of digits (that is, immediately before or after a period or the + <quote>e</quote>), and more than one underscore in a row is not allowed. + </para> + + <para> <indexterm><primary>integer</primary></indexterm> <indexterm><primary>bigint</primary></indexterm> <indexterm><primary>numeric</primary></indexterm> diff --git a/src/backend/catalog/sql_features.txt b/src/backend/catalog/sql_features.txt new file mode 100644 index abad216..3766762 --- a/src/backend/catalog/sql_features.txt +++ b/src/backend/catalog/sql_features.txt @@ -528,6 +528,7 @@ T653 SQL-schema statements in external r T654 SQL-dynamic statements in external routines NO T655 Cyclically dependent routines YES T661 Non-decimal integer literals YES SQL:202x draft +T662 Underscores in integer literals YES SQL:202x draft T811 Basic SQL/JSON constructor functions NO T812 SQL/JSON: JSON_OBJECTAGG NO T813 SQL/JSON: JSON_ARRAYAGG with ORDER BY NO diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l new file mode 100644 index 9ad9e0c..a1ea94e --- a/src/backend/parser/scan.l +++ b/src/backend/parser/scan.l @@ -124,6 +124,7 @@ static void addlit(char *ytext, int ylen static void addlitchar(unsigned char ychar, core_yyscan_t yyscanner); static char *litbufdup(core_yyscan_t yyscanner); static unsigned char unescape_single_char(unsigned char c, core_yyscan_t yyscanner); +static char *strip_underscores(const char *in); static int process_integer_literal(const char *token, YYSTYPE *lval, int base); static void addunicode(pg_wchar c, yyscan_t yyscanner); @@ -395,19 +396,19 @@ hexdigit [0-9A-Fa-f] octdigit [0-7] bindigit [0-1] -decinteger {decdigit}+ -hexinteger 0[xX]{hexdigit}+ -octinteger 0[oO]{octdigit}+ -bininteger 0[bB]{bindigit}+ +decinteger {decdigit}(_?{decdigit})* +hexinteger 0[xX](_?{hexdigit})+ +octinteger 0[oO](_?{octdigit})+ +bininteger 0[bB](_?{bindigit})+ -hexfail 0[xX] -octfail 0[oO] -binfail 0[bB] +hexfail 0[xX]_? +octfail 0[oO]_? +binfail 0[bB]_? numeric (({decinteger}\.{decinteger}?)|(\.{decinteger})) numericfail {decdigit}+\.\. -real ({decinteger}|{numeric})[Ee][-+]?{decdigit}+ +real ({decinteger}|{numeric})[Ee][-+]?{decinteger}+ realfail ({decinteger}|{numeric})[Ee][-+] decinteger_junk {decinteger}{ident_start} @@ -1028,7 +1029,7 @@ other . } {numeric} { SET_YYLLOC(); - yylval->str = pstrdup(yytext); + yylval->str = strip_underscores(yytext); return FCONST; } {numericfail} { @@ -1039,7 +1040,7 @@ other . } {real} { SET_YYLLOC(); - yylval->str = pstrdup(yytext); + yylval->str = strip_underscores(yytext); return FCONST; } {realfail} { @@ -1357,6 +1358,30 @@ litbufdup(core_yyscan_t yyscanner) return new; } +static char * +strip_underscores(const char *in) +{ + if (strchr(in, '_')) + { + char *out = palloc(strlen(in)); + const char *p1; + char *p2; + + p1 = in; + p2 = out; + while (*p1) + { + if (*p1 != '_') + *p2++ = *p1; + p1++; + } + *p2 = '\0'; + return out; + } + else + return pstrdup(in); +} + /* * Process {decinteger}, {hexinteger}, etc. Note this will also do the right * thing with {numeric}, ie digits and a decimal point. @@ -1367,6 +1392,24 @@ process_integer_literal(const char *toke int val; char *endptr; + if (strchr(token, '_')) + { + char *newtoken = palloc(strlen(token)); + const char *p1; + char *p2; + + p1 = token; + p2 = newtoken; + while (*p1) + { + if (*p1 != '_') + *p2++ = *p1; + p1++; + } + *p2 = '\0'; + token = newtoken; + } + errno = 0; val = strtoint(base == 10 ? token : token + 2, &endptr, base); if (*endptr != '\0' || errno == ERANGE) diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c new file mode 100644 index 1e1a982..17d573a --- a/src/backend/utils/adt/numutils.c +++ b/src/backend/utils/adt/numutils.c @@ -141,48 +141,99 @@ pg_strtoint16_safe(const char *s, Node * { firstdigit = ptr += 2; - while (*ptr && isxdigit((unsigned char) *ptr)) + while (*ptr) { - if (unlikely(tmp > -(PG_INT16_MIN / 16))) - goto out_of_range; + if (isxdigit((unsigned char) *ptr)) + { + if (unlikely(tmp > -(PG_INT16_MIN / 16))) + goto out_of_range; - tmp = tmp * 16 + hexlookup[(unsigned char) *ptr++]; + tmp = tmp * 16 + hexlookup[(unsigned char) *ptr++]; + } + else if (*ptr == '_') + { + /* underscore must be followed by more digits */ + ptr++; + if (*ptr == '\0' || !isxdigit((unsigned char) *ptr)) + goto invalid_syntax; + } + else + break; } } else if (ptr[0] == '0' && (ptr[1] == 'o' || ptr[1] == 'O')) { firstdigit = ptr += 2; - while (*ptr && (*ptr >= '0' && *ptr <= '7')) + while (*ptr) { - if (unlikely(tmp > -(PG_INT16_MIN / 8))) - goto out_of_range; + if (*ptr >= '0' && *ptr <= '7') + { + if (unlikely(tmp > -(PG_INT16_MIN / 8))) + goto out_of_range; - tmp = tmp * 8 + (*ptr++ - '0'); + tmp = tmp * 8 + (*ptr++ - '0'); + } + else if (*ptr == '_') + { + /* underscore must be followed by more digits */ + ptr++; + if (*ptr == '\0' || *ptr < '0' || *ptr > '7') + goto invalid_syntax; + } + else + break; } } else if (ptr[0] == '0' && (ptr[1] == 'b' || ptr[1] == 'B')) { firstdigit = ptr += 2; - while (*ptr && (*ptr >= '0' && *ptr <= '1')) + while (*ptr) { - if (unlikely(tmp > -(PG_INT16_MIN / 2))) - goto out_of_range; + if (*ptr >= '0' && *ptr <= '1') + { + if (unlikely(tmp > -(PG_INT16_MIN / 2))) + goto out_of_range; - tmp = tmp * 2 + (*ptr++ - '0'); + tmp = tmp * 2 + (*ptr++ - '0'); + } + else if (*ptr == '_') + { + /* underscore must be followed by more digits */ + ptr++; + if (*ptr == '\0' || *ptr < '0' || *ptr > '1') + goto invalid_syntax; + } + else + break; } } else { firstdigit = ptr; - while (*ptr && isdigit((unsigned char) *ptr)) + while (*ptr) { - if (unlikely(tmp > -(PG_INT16_MIN / 10))) - goto out_of_range; + if (isdigit((unsigned char) *ptr)) + { + if (unlikely(tmp > -(PG_INT16_MIN / 10))) + goto out_of_range; - tmp = tmp * 10 + (*ptr++ - '0'); + tmp = tmp * 10 + (*ptr++ - '0'); + } + else if (*ptr == '_') + { + /* underscore may not be first */ + if (unlikely(ptr == firstdigit)) + goto invalid_syntax; + /* and it must be followed by more digits */ + ptr++; + if (*ptr == '\0' || !isdigit((unsigned char) *ptr)) + goto invalid_syntax; + } + else + break; } } @@ -268,48 +319,99 @@ pg_strtoint32_safe(const char *s, Node * { firstdigit = ptr += 2; - while (*ptr && isxdigit((unsigned char) *ptr)) + while (*ptr) { - if (unlikely(tmp > -(PG_INT32_MIN / 16))) - goto out_of_range; + if (isxdigit((unsigned char) *ptr)) + { + if (unlikely(tmp > -(PG_INT32_MIN / 16))) + goto out_of_range; - tmp = tmp * 16 + hexlookup[(unsigned char) *ptr++]; + tmp = tmp * 16 + hexlookup[(unsigned char) *ptr++]; + } + else if (*ptr == '_') + { + /* underscore must be followed by more digits */ + ptr++; + if (*ptr == '\0' || !isxdigit((unsigned char) *ptr)) + goto invalid_syntax; + } + else + break; } } else if (ptr[0] == '0' && (ptr[1] == 'o' || ptr[1] == 'O')) { firstdigit = ptr += 2; - while (*ptr && (*ptr >= '0' && *ptr <= '7')) + while (*ptr) { - if (unlikely(tmp > -(PG_INT32_MIN / 8))) - goto out_of_range; + if (*ptr >= '0' && *ptr <= '7') + { + if (unlikely(tmp > -(PG_INT32_MIN / 8))) + goto out_of_range; - tmp = tmp * 8 + (*ptr++ - '0'); + tmp = tmp * 8 + (*ptr++ - '0'); + } + else if (*ptr == '_') + { + /* underscore must be followed by more digits */ + ptr++; + if (*ptr == '\0' || *ptr < '0' || *ptr > '7') + goto invalid_syntax; + } + else + break; } } else if (ptr[0] == '0' && (ptr[1] == 'b' || ptr[1] == 'B')) { firstdigit = ptr += 2; - while (*ptr && (*ptr >= '0' && *ptr <= '1')) + while (*ptr) { - if (unlikely(tmp > -(PG_INT32_MIN / 2))) - goto out_of_range; + if (*ptr >= '0' && *ptr <= '1') + { + if (unlikely(tmp > -(PG_INT32_MIN / 2))) + goto out_of_range; - tmp = tmp * 2 + (*ptr++ - '0'); + tmp = tmp * 2 + (*ptr++ - '0'); + } + else if (*ptr == '_') + { + /* underscore must be followed by more digits */ + ptr++; + if (*ptr == '\0' || *ptr < '0' || *ptr > '1') + goto invalid_syntax; + } + else + break; } } else { firstdigit = ptr; - while (*ptr && isdigit((unsigned char) *ptr)) + while (*ptr) { - if (unlikely(tmp > -(PG_INT32_MIN / 10))) - goto out_of_range; + if (isdigit((unsigned char) *ptr)) + { + if (unlikely(tmp > -(PG_INT32_MIN / 10))) + goto out_of_range; - tmp = tmp * 10 + (*ptr++ - '0'); + tmp = tmp * 10 + (*ptr++ - '0'); + } + else if (*ptr == '_') + { + /* underscore may not be first */ + if (unlikely(ptr == firstdigit)) + goto invalid_syntax; + /* and it must be followed by more digits */ + ptr++; + if (*ptr == '\0' || !isdigit((unsigned char) *ptr)) + goto invalid_syntax; + } + else + break; } } @@ -395,48 +497,99 @@ pg_strtoint64_safe(const char *s, Node * { firstdigit = ptr += 2; - while (*ptr && isxdigit((unsigned char) *ptr)) + while (*ptr) { - if (unlikely(tmp > -(PG_INT64_MIN / 16))) - goto out_of_range; + if (isxdigit((unsigned char) *ptr)) + { + if (unlikely(tmp > -(PG_INT64_MIN / 16))) + goto out_of_range; - tmp = tmp * 16 + hexlookup[(unsigned char) *ptr++]; + tmp = tmp * 16 + hexlookup[(unsigned char) *ptr++]; + } + else if (*ptr == '_') + { + /* underscore must be followed by more digits */ + ptr++; + if (*ptr == '\0' || !isxdigit((unsigned char) *ptr)) + goto invalid_syntax; + } + else + break; } } else if (ptr[0] == '0' && (ptr[1] == 'o' || ptr[1] == 'O')) { firstdigit = ptr += 2; - while (*ptr && (*ptr >= '0' && *ptr <= '7')) + while (*ptr) { - if (unlikely(tmp > -(PG_INT64_MIN / 8))) - goto out_of_range; + if (*ptr >= '0' && *ptr <= '7') + { + if (unlikely(tmp > -(PG_INT64_MIN / 8))) + goto out_of_range; - tmp = tmp * 8 + (*ptr++ - '0'); + tmp = tmp * 8 + (*ptr++ - '0'); + } + else if (*ptr == '_') + { + /* underscore must be followed by more digits */ + ptr++; + if (*ptr == '\0' || *ptr < '0' || *ptr > '7') + goto invalid_syntax; + } + else + break; } } else if (ptr[0] == '0' && (ptr[1] == 'b' || ptr[1] == 'B')) { firstdigit = ptr += 2; - while (*ptr && (*ptr >= '0' && *ptr <= '1')) + while (*ptr) { - if (unlikely(tmp > -(PG_INT64_MIN / 2))) - goto out_of_range; + if (*ptr >= '0' && *ptr <= '1') + { + if (unlikely(tmp > -(PG_INT64_MIN / 2))) + goto out_of_range; - tmp = tmp * 2 + (*ptr++ - '0'); + tmp = tmp * 2 + (*ptr++ - '0'); + } + else if (*ptr == '_') + { + /* underscore must be followed by more digits */ + ptr++; + if (*ptr == '\0' || *ptr < '0' || *ptr > '1') + goto invalid_syntax; + } + else + break; } } else { firstdigit = ptr; - while (*ptr && isdigit((unsigned char) *ptr)) + while (*ptr) { - if (unlikely(tmp > -(PG_INT64_MIN / 10))) - goto out_of_range; + if (isdigit((unsigned char) *ptr)) + { + if (unlikely(tmp > -(PG_INT64_MIN / 10))) + goto out_of_range; - tmp = tmp * 10 + (*ptr++ - '0'); + tmp = tmp * 10 + (*ptr++ - '0'); + } + else if (*ptr == '_') + { + /* underscore may not be first */ + if (unlikely(ptr == firstdigit)) + goto invalid_syntax; + /* and it must be followed by more digits */ + ptr++; + if (*ptr == '\0' || !isdigit((unsigned char) *ptr)) + goto invalid_syntax; + } + else + break; } } diff --git a/src/fe_utils/psqlscan.l b/src/fe_utils/psqlscan.l new file mode 100644 index cb1fc52..44f80e4 --- a/src/fe_utils/psqlscan.l +++ b/src/fe_utils/psqlscan.l @@ -333,19 +333,19 @@ hexdigit [0-9A-Fa-f] octdigit [0-7] bindigit [0-1] -decinteger {decdigit}+ -hexinteger 0[xX]{hexdigit}+ -octinteger 0[oO]{octdigit}+ -bininteger 0[bB]{bindigit}+ +decinteger {decdigit}(_?{decdigit})* +hexinteger 0[xX](_?{hexdigit})+ +octinteger 0[oO](_?{octdigit})+ +bininteger 0[bB](_?{bindigit})+ -hexfail 0[xX] -octfail 0[oO] -binfail 0[bB] +hexfail 0[xX]_? +octfail 0[oO]_? +binfail 0[bB]_? numeric (({decinteger}\.{decinteger}?)|(\.{decinteger})) numericfail {decdigit}+\.\. -real ({decinteger}|{numeric})[Ee][-+]?{decdigit}+ +real ({decinteger}|{numeric})[Ee][-+]?{decinteger} realfail ({decinteger}|{numeric})[Ee][-+] decinteger_junk {decinteger}{ident_start} diff --git a/src/interfaces/ecpg/preproc/pgc.l b/src/interfaces/ecpg/preproc/pgc.l new file mode 100644 index 2c09c6c..2761ae3 --- a/src/interfaces/ecpg/preproc/pgc.l +++ b/src/interfaces/ecpg/preproc/pgc.l @@ -361,19 +361,19 @@ hexdigit [0-9A-Fa-f] octdigit [0-7] bindigit [0-1] -decinteger {decdigit}+ -hexinteger 0[xX]{hexdigit}+ -octinteger 0[oO]{octdigit}+ -bininteger 0[bB]{bindigit}+ +decinteger {decdigit}(_?{decdigit})* +hexinteger 0[xX](_?{hexdigit})+ +octinteger 0[oO](_?{octdigit})+ +bininteger 0[bB](_?{bindigit})+ -hexfail 0[xX] -octfail 0[oO] -binfail 0[bB] +hexfail 0[xX]_? +octfail 0[oO]_? +binfail 0[bB]_? numeric (({decinteger}\.{decinteger}?)|(\.{decinteger})) numericfail {decdigit}+\.\. -real ({decinteger}|{numeric})[Ee][-+]?{decdigit}+ +real ({decinteger}|{numeric})[Ee][-+]?{decinteger} realfail ({decinteger}|{numeric})[Ee][-+] decinteger_junk {decinteger}{ident_start} diff --git a/src/pl/plpgsql/src/expected/plpgsql_trap.out b/src/pl/plpgsql/src/expected/plpgsql_trap.out new file mode 100644 index 90cf6c2..62d1679 --- a/src/pl/plpgsql/src/expected/plpgsql_trap.out +++ b/src/pl/plpgsql/src/expected/plpgsql_trap.out @@ -141,7 +141,7 @@ begin declare x int; begin -- we assume this will take longer than 1 second: - select count(*) into x from generate_series(1, 1000000000000); + select count(*) into x from generate_series(1, 1_000_000_000_000); exception when others then raise notice 'caught others?'; diff --git a/src/pl/plpgsql/src/sql/plpgsql_trap.sql b/src/pl/plpgsql/src/sql/plpgsql_trap.sql new file mode 100644 index c6c1ad8..5459b34 --- a/src/pl/plpgsql/src/sql/plpgsql_trap.sql +++ b/src/pl/plpgsql/src/sql/plpgsql_trap.sql @@ -88,7 +88,7 @@ begin declare x int; begin -- we assume this will take longer than 1 second: - select count(*) into x from generate_series(1, 1000000000000); + select count(*) into x from generate_series(1, 1_000_000_000_000); exception when others then raise notice 'caught others?'; diff --git a/src/test/regress/expected/int2.out b/src/test/regress/expected/int2.out new file mode 100644 index 08c333b..73b4ee0 --- a/src/test/regress/expected/int2.out +++ b/src/test/regress/expected/int2.out @@ -440,3 +440,47 @@ SELECT int2 '-0x8001'; ERROR: value "-0x8001" is out of range for type smallint LINE 1: SELECT int2 '-0x8001'; ^ +-- underscores +SELECT int2 '1_000'; + int2 +------ + 1000 +(1 row) + +SELECT int2 '1_2_3'; + int2 +------ + 123 +(1 row) + +SELECT int2 '0xE_FF'; + int2 +------ + 3839 +(1 row) + +SELECT int2 '0o2_73'; + int2 +------ + 187 +(1 row) + +SELECT int2 '0b_10_0101'; + int2 +------ + 37 +(1 row) + +-- error cases +SELECT int2 '_100'; +ERROR: invalid input syntax for type smallint: "_100" +LINE 1: SELECT int2 '_100'; + ^ +SELECT int2 '100_'; +ERROR: invalid input syntax for type smallint: "100_" +LINE 1: SELECT int2 '100_'; + ^ +SELECT int2 '10__000'; +ERROR: invalid input syntax for type smallint: "10__000" +LINE 1: SELECT int2 '10__000'; + ^ diff --git a/src/test/regress/expected/int4.out b/src/test/regress/expected/int4.out new file mode 100644 index 8386c7c..9c20574 --- a/src/test/regress/expected/int4.out +++ b/src/test/regress/expected/int4.out @@ -548,3 +548,47 @@ SELECT int4 '-0x80000001'; ERROR: value "-0x80000001" is out of range for type integer LINE 1: SELECT int4 '-0x80000001'; ^ +-- underscores +SELECT int4 '1_000_000'; + int4 +--------- + 1000000 +(1 row) + +SELECT int4 '1_2_3'; + int4 +------ + 123 +(1 row) + +SELECT int4 '0x1EEE_FFFF'; + int4 +----------- + 518979583 +(1 row) + +SELECT int4 '0o2_73'; + int4 +------ + 187 +(1 row) + +SELECT int4 '0b_10_0101'; + int4 +------ + 37 +(1 row) + +-- error cases +SELECT int4 '_100'; +ERROR: invalid input syntax for type integer: "_100" +LINE 1: SELECT int4 '_100'; + ^ +SELECT int4 '100_'; +ERROR: invalid input syntax for type integer: "100_" +LINE 1: SELECT int4 '100_'; + ^ +SELECT int4 '100__000'; +ERROR: invalid input syntax for type integer: "100__000" +LINE 1: SELECT int4 '100__000'; + ^ diff --git a/src/test/regress/expected/int8.out b/src/test/regress/expected/int8.out new file mode 100644 index 5b62b51..d9dca64 --- a/src/test/regress/expected/int8.out +++ b/src/test/regress/expected/int8.out @@ -1044,3 +1044,47 @@ SELECT int8 '-0x8000000000000001'; ERROR: value "-0x8000000000000001" is out of range for type bigint LINE 1: SELECT int8 '-0x8000000000000001'; ^ +-- underscores +SELECT int8 '1_000_000'; + int8 +--------- + 1000000 +(1 row) + +SELECT int8 '1_2_3'; + int8 +------ + 123 +(1 row) + +SELECT int8 '0x1EEE_FFFF'; + int8 +----------- + 518979583 +(1 row) + +SELECT int8 '0o2_73'; + int8 +------ + 187 +(1 row) + +SELECT int8 '0b_10_0101'; + int8 +------ + 37 +(1 row) + +-- error cases +SELECT int8 '_100'; +ERROR: invalid input syntax for type bigint: "_100" +LINE 1: SELECT int8 '_100'; + ^ +SELECT int8 '100_'; +ERROR: invalid input syntax for type bigint: "100_" +LINE 1: SELECT int8 '100_'; + ^ +SELECT int8 '100__000'; +ERROR: invalid input syntax for type bigint: "100__000" +LINE 1: SELECT int8 '100__000'; + ^ diff --git a/src/test/regress/expected/numerology.out b/src/test/regress/expected/numerology.out new file mode 100644 index 15cd6b1..6b9b089 --- a/src/test/regress/expected/numerology.out +++ b/src/test/regress/expected/numerology.out @@ -166,10 +166,6 @@ SELECT 0x0o; ERROR: trailing junk after numeric literal at or near "0x0o" LINE 1: SELECT 0x0o; ^ -SELECT 1_2_3; -ERROR: trailing junk after numeric literal at or near "1_" -LINE 1: SELECT 1_2_3; - ^ SELECT 0.a; ERROR: trailing junk after numeric literal at or near "0.a" LINE 1: SELECT 0.a; @@ -234,6 +230,94 @@ SELECT 0x0y; ERROR: trailing junk after numeric literal at or near "0x0y" LINE 1: SELECT 0x0y; ^ +-- underscores +SELECT 1_000_000; + ?column? +---------- + 1000000 +(1 row) + +SELECT 1_2_3; + ?column? +---------- + 123 +(1 row) + +SELECT 0x1EEE_FFFF; + ?column? +----------- + 518979583 +(1 row) + +SELECT 0o2_73; + ?column? +---------- + 187 +(1 row) + +SELECT 0b_10_0101; + ?column? +---------- + 37 +(1 row) + +SELECT 1_000.000_005; + ?column? +------------- + 1000.000005 +(1 row) + +SELECT 1_000.; + ?column? +---------- + 1000 +(1 row) + +SELECT .000_005; + ?column? +---------- + 0.000005 +(1 row) + +SELECT 1_000.5e0_1; + ?column? +---------- + 10005 +(1 row) + +-- error cases +SELECT _100; +ERROR: column "_100" does not exist +LINE 1: SELECT _100; + ^ +SELECT 100_; +ERROR: trailing junk after numeric literal at or near "100_" +LINE 1: SELECT 100_; + ^ +SELECT 100__000; +ERROR: trailing junk after numeric literal at or near "100_" +LINE 1: SELECT 100__000; + ^ +SELECT _1_000.5; +ERROR: syntax error at or near ".5" +LINE 1: SELECT _1_000.5; + ^ +SELECT 1_000_.5; +ERROR: trailing junk after numeric literal at or near "1_000_" +LINE 1: SELECT 1_000_.5; + ^ +SELECT 1_000._5; +ERROR: trailing junk after numeric literal at or near "1_000._" +LINE 1: SELECT 1_000._5; + ^ +SELECT 1_000.5_; +ERROR: trailing junk after numeric literal at or near "1_000.5_" +LINE 1: SELECT 1_000.5_; + ^ +SELECT 1_000.5e_1; +ERROR: trailing junk after numeric literal at or near "1_000.5e" +LINE 1: SELECT 1_000.5e_1; + ^ -- -- Test implicit type conversions -- This fails for Postgres v6.1 (and earlier?) diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out new file mode 100644 index 7555764..d700c00 --- a/src/test/regress/expected/partition_prune.out +++ b/src/test/regress/expected/partition_prune.out @@ -1503,7 +1503,7 @@ explain (costs off) select * from like_o create table lparted_by_int2 (a smallint) partition by list (a); create table lparted_by_int2_1 partition of lparted_by_int2 for values in (1); create table lparted_by_int2_16384 partition of lparted_by_int2 for values in (16384); -explain (costs off) select * from lparted_by_int2 where a = 100000000000000; +explain (costs off) select * from lparted_by_int2 where a = 100_000_000_000_000; QUERY PLAN -------------------------- Result @@ -1514,7 +1514,7 @@ create table rparted_by_int2 (a smallint create table rparted_by_int2_1 partition of rparted_by_int2 for values from (1) to (10); create table rparted_by_int2_16384 partition of rparted_by_int2 for values from (10) to (16384); -- all partitions pruned -explain (costs off) select * from rparted_by_int2 where a > 100000000000000; +explain (costs off) select * from rparted_by_int2 where a > 100_000_000_000_000; QUERY PLAN -------------------------- Result @@ -1523,7 +1523,7 @@ explain (costs off) select * from rparte create table rparted_by_int2_maxvalue partition of rparted_by_int2 for values from (16384) to (maxvalue); -- all partitions but rparted_by_int2_maxvalue pruned -explain (costs off) select * from rparted_by_int2 where a > 100000000000000; +explain (costs off) select * from rparted_by_int2 where a > 100_000_000_000_000; QUERY PLAN ------------------------------------------------------ Seq Scan on rparted_by_int2_maxvalue rparted_by_int2 diff --git a/src/test/regress/sql/int2.sql b/src/test/regress/sql/int2.sql new file mode 100644 index a812235..ce8ac97 --- a/src/test/regress/sql/int2.sql +++ b/src/test/regress/sql/int2.sql @@ -141,3 +141,17 @@ SELECT int2 '-0o100000'; SELECT int2 '-0o100001'; SELECT int2 '-0x8000'; SELECT int2 '-0x8001'; + + +-- underscores + +SELECT int2 '1_000'; +SELECT int2 '1_2_3'; +SELECT int2 '0xE_FF'; +SELECT int2 '0o2_73'; +SELECT int2 '0b_10_0101'; + +-- error cases +SELECT int2 '_100'; +SELECT int2 '100_'; +SELECT int2 '10__000'; diff --git a/src/test/regress/sql/int4.sql b/src/test/regress/sql/int4.sql new file mode 100644 index 9e6a404..146963e --- a/src/test/regress/sql/int4.sql +++ b/src/test/regress/sql/int4.sql @@ -196,3 +196,17 @@ SELECT int4 '-0o20000000000'; SELECT int4 '-0o20000000001'; SELECT int4 '-0x80000000'; SELECT int4 '-0x80000001'; + + +-- underscores + +SELECT int4 '1_000_000'; +SELECT int4 '1_2_3'; +SELECT int4 '0x1EEE_FFFF'; +SELECT int4 '0o2_73'; +SELECT int4 '0b_10_0101'; + +-- error cases +SELECT int4 '_100'; +SELECT int4 '100_'; +SELECT int4 '100__000'; diff --git a/src/test/regress/sql/int8.sql b/src/test/regress/sql/int8.sql new file mode 100644 index 06f273e..c85717c --- a/src/test/regress/sql/int8.sql +++ b/src/test/regress/sql/int8.sql @@ -277,3 +277,17 @@ SELECT int8 '-0o1000000000000000000000'; SELECT int8 '-0o1000000000000000000001'; SELECT int8 '-0x8000000000000000'; SELECT int8 '-0x8000000000000001'; + + +-- underscores + +SELECT int8 '1_000_000'; +SELECT int8 '1_2_3'; +SELECT int8 '0x1EEE_FFFF'; +SELECT int8 '0o2_73'; +SELECT int8 '0b_10_0101'; + +-- error cases +SELECT int8 '_100'; +SELECT int8 '100_'; +SELECT int8 '100__000'; diff --git a/src/test/regress/sql/numerology.sql b/src/test/regress/sql/numerology.sql new file mode 100644 index 310d9e5..1941c58 --- a/src/test/regress/sql/numerology.sql +++ b/src/test/regress/sql/numerology.sql @@ -45,7 +45,6 @@ SELECT -0x8000000000000001; -- error cases SELECT 123abc; SELECT 0x0o; -SELECT 1_2_3; SELECT 0.a; SELECT 0.0a; SELECT .0a; @@ -66,6 +65,29 @@ SELECT 0x; SELECT 1x; SELECT 0x0y; +-- underscores +SELECT 1_000_000; +SELECT 1_2_3; +SELECT 0x1EEE_FFFF; +SELECT 0o2_73; +SELECT 0b_10_0101; + +SELECT 1_000.000_005; +SELECT 1_000.; +SELECT .000_005; +SELECT 1_000.5e0_1; + +-- error cases +SELECT _100; +SELECT 100_; +SELECT 100__000; + +SELECT _1_000.5; +SELECT 1_000_.5; +SELECT 1_000._5; +SELECT 1_000.5_; +SELECT 1_000.5e_1; + -- -- Test implicit type conversions diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql new file mode 100644 index d70bd86..fb0583f --- a/src/test/regress/sql/partition_prune.sql +++ b/src/test/regress/sql/partition_prune.sql @@ -283,16 +283,16 @@ explain (costs off) select * from like_o create table lparted_by_int2 (a smallint) partition by list (a); create table lparted_by_int2_1 partition of lparted_by_int2 for values in (1); create table lparted_by_int2_16384 partition of lparted_by_int2 for values in (16384); -explain (costs off) select * from lparted_by_int2 where a = 100000000000000; +explain (costs off) select * from lparted_by_int2 where a = 100_000_000_000_000; create table rparted_by_int2 (a smallint) partition by range (a); create table rparted_by_int2_1 partition of rparted_by_int2 for values from (1) to (10); create table rparted_by_int2_16384 partition of rparted_by_int2 for values from (10) to (16384); -- all partitions pruned -explain (costs off) select * from rparted_by_int2 where a > 100000000000000; +explain (costs off) select * from rparted_by_int2 where a > 100_000_000_000_000; create table rparted_by_int2_maxvalue partition of rparted_by_int2 for values from (16384) to (maxvalue); -- all partitions but rparted_by_int2_maxvalue pruned -explain (costs off) select * from rparted_by_int2 where a > 100000000000000; +explain (costs off) select * from rparted_by_int2 where a > 100_000_000_000_000; drop table lp, coll_pruning, rlp, mc3p, mc2p, boolpart, boolrangep, rp, coll_pruning_multi, like_op_noprune, lparted_by_int2, rparted_by_int2;