Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Mon, Jul 26, 2021 at 05:46:22PM +0900, Kyotaro Horiguchi wrote: > Thanks for revising and committing! I'm fine with all of the recent > discussions on the committed part. Though I don't think it works for > "live" command line options, making the omitting policy symmetric > looks good. I see the same done in several similar use of strto[il]. OK, applied this one. So for now we are done here. > The change in 020_pg_receivewal.pl results in a chain of four > successive failures, which is fine. But the last failure (#23) happens > for a bit dubious reason. Yes, I saw that as well. I'll address that separately. -- Michael signature.asc Description: PGP signature
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
At Mon, 26 Jul 2021 15:01:35 +0900, Michael Paquier wrote in > On Sat, Jul 24, 2021 at 07:41:12PM +0900, Michael Paquier wrote: > > I have looked at that over the last couple of days, and applied it > > after some small fixes, including an indentation. > > One thing that we forgot here is the handling of trailing > whitespaces. Attached is small patch to adjust that, with one > positive and one negative tests. > > > The int64 and float > > parts are extra types we could handle, but I have not looked yet at > > how much benefits we'd get in those cases. > > I have looked at these two but there is really less benefits, so for > now I am not planning to do more in this area. For float options, > pg_basebackup --max-rate could be one target on top of the three set > of options in pgbench, but it needs to handle units :( Thanks for revising and committing! I'm fine with all of the recent discussions on the committed part. Though I don't think it works for "live" command line options, making the omitting policy symmetric looks good. I see the same done in several similar use of strto[il]. The change in 020_pg_receivewal.pl results in a chain of four successive failures, which is fine. But the last failure (#23) happens for a bit dubious reason. > Use of uninitialized value in pattern match (m//) at t/020_pg_receivewal.pl > line 114. > not ok 23 - one partial WAL segment is now completed It might not be worth amending, but we don't need to use m/// there and eq works fine. 020_pg_receivewal.pl: 114 - is($zlib_wals[0] =~ m/$partial_wals[0]/, + is($zlib_wals[0] eq $partial_wals[0], regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Sat, Jul 24, 2021 at 07:41:12PM +0900, Michael Paquier wrote: > I have looked at that over the last couple of days, and applied it > after some small fixes, including an indentation. One thing that we forgot here is the handling of trailing whitespaces. Attached is small patch to adjust that, with one positive and one negative tests. > The int64 and float > parts are extra types we could handle, but I have not looked yet at > how much benefits we'd get in those cases. I have looked at these two but there is really less benefits, so for now I am not planning to do more in this area. For float options, pg_basebackup --max-rate could be one target on top of the three set of options in pgbench, but it needs to handle units :( -- Michael From 2c38b36841965fc458dab69d846bcc0dde07aca2 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Mon, 26 Jul 2021 14:41:15 +0900 Subject: [PATCH] Skip trailing whitespaces when parsing integer options strtoint(), via strtol(), would skip any leading whitespace but the same rule was not applied for trailing whitespaces, which was a bit inconsistent. Some tests are added while on it. --- src/fe_utils/option_utils.c | 9 - src/bin/pg_basebackup/t/020_pg_receivewal.pl | 4 +++- src/bin/pg_dump/t/001_basic.pl | 3 ++- 3 files changed, 13 insertions(+), 3 deletions(-) diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c index 3e7e512ad9..bcfe7365fd 100644 --- a/src/fe_utils/option_utils.c +++ b/src/fe_utils/option_utils.c @@ -57,7 +57,14 @@ option_parse_int(const char *optarg, const char *optname, errno = 0; val = strtoint(optarg, , 10); - if (*endptr) + /* + * Skip any trailing whitespace; if anything but whitespace remains before + * the terminating character, fail. + */ + while (*endptr != '\0' && isspace((unsigned char) *endptr)) + endptr++; + + if (*endptr != '\0') { pg_log_error("invalid value \"%s\" for option %s", optarg, optname); diff --git a/src/bin/pg_basebackup/t/020_pg_receivewal.pl b/src/bin/pg_basebackup/t/020_pg_receivewal.pl index 158f7d176f..47c4ecb073 100644 --- a/src/bin/pg_basebackup/t/020_pg_receivewal.pl +++ b/src/bin/pg_basebackup/t/020_pg_receivewal.pl @@ -88,10 +88,12 @@ SKIP: $primary->psql('postgres', 'INSERT INTO test_table VALUES (generate_series(100,200));'); + # Note the trailing whitespace after the value of --compress, that is + # a valid value. $primary->command_ok( [ 'pg_receivewal', '-D', $stream_dir, '--verbose', - '--endpos', $nextlsn, '--compress', '1' + '--endpos', $nextlsn, '--compress', '1 ' ], "streaming some WAL using ZLIB compression"); diff --git a/src/bin/pg_dump/t/001_basic.pl b/src/bin/pg_dump/t/001_basic.pl index 59de6df7ba..d1a7e1db40 100644 --- a/src/bin/pg_dump/t/001_basic.pl +++ b/src/bin/pg_dump/t/001_basic.pl @@ -101,8 +101,9 @@ command_fails_like( qr/\Qpg_dump: error: parallel backup only supported by the directory format\E/, 'pg_dump: parallel backup only supported by the directory format'); +# Note the trailing whitespace for the value of --jobs, that is valid. command_fails_like( - [ 'pg_dump', '-j', '-1' ], + [ 'pg_dump', '-j', '-1 ' ], qr/\Qpg_dump: error: -j\/--jobs must be in range\E/, 'pg_dump: -j/--jobs must be in range'); -- 2.32.0 signature.asc Description: PGP signature
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Thu, Jul 22, 2021 at 02:32:35PM +0900, Michael Paquier wrote: > Okay, done those parts as per the attached. While on it, I noticed an > extra one for pg_dump --rows-per-insert. I am counting 25 translated > strings cut in total. > > Any objections to this first step? I have looked at that over the last couple of days, and applied it after some small fixes, including an indentation. The int64 and float parts are extra types we could handle, but I have not looked yet at how much benefits we'd get in those cases. -- Michael signature.asc Description: PGP signature
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Thu, Jul 22, 2021 at 09:42:00AM -0400, Alvaro Herrera wrote: > May I suggest for the second sentence something like "If the parsing is > successful, returns true and stores the result in *result if that's > given; if parsing fails, returns false" Sounds fine to me. Thanks. -- Michael signature.asc Description: PGP signature
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On 2021-Jul-21, Michael Paquier wrote: > +/* > + * option_parse_int > + * > + * Parse an integer for a given option. Returns true if the parsing > + * could be done with optionally *result holding the parsed value, and > + * false on failure. > + */ May I suggest for the second sentence something like "If the parsing is successful, returns true and stores the result in *result if that's given; if parsing fails, returns false" -- Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/ "Aprender sin pensar es inútil; pensar sin aprender, peligroso" (Confucio)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Thu, Jul 22, 2021 at 01:19:41AM +1200, David Rowley wrote: > On Thu, 22 Jul 2021 at 00:44, Michael Paquier wrote: >> >> On Thu, Jul 22, 2021 at 12:32:39AM +1200, David Rowley wrote: >> > I see both of these are limited to 64 on windows. Won't those fail on >> > Windows? >> >> Yes, thanks, they would. I would just cut the range numbers from the >> expected output here. This does not matter in terms of coverage >> either. > > Sounds good. > >> x> I also wondered if it would be worth doing #define MAX_JOBS somewhere >> > away from the option parsing code. This part is pretty ugly: >> >> Agreed as well. pg_dump and pg_restore have their own idea of >> parallelism in parallel.{c.h}. What about putting MAX_JOBS in >> parallel.h then? > > parallel.h looks ok to me. Okay, done those parts as per the attached. While on it, I noticed an extra one for pg_dump --rows-per-insert. I am counting 25 translated strings cut in total. Any objections to this first step? -- Michael From 36be37c0dd0810b65d75e9807b61c244d4f56333 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Thu, 22 Jul 2021 14:31:31 +0900 Subject: [PATCH v5] Introduce and use routine for parsing of int32 options --- src/include/fe_utils/option_utils.h| 3 ++ src/fe_utils/option_utils.c| 39 ++ src/bin/pg_amcheck/pg_amcheck.c| 8 ++- src/bin/pg_basebackup/pg_basebackup.c | 17 +++--- src/bin/pg_basebackup/pg_receivewal.c | 23 +++-- src/bin/pg_basebackup/pg_recvlogical.c | 25 - src/bin/pg_checksums/Makefile | 3 ++ src/bin/pg_checksums/pg_checksums.c| 9 ++-- src/bin/pg_dump/parallel.h | 13 + src/bin/pg_dump/pg_dump.c | 46 - src/bin/pg_dump/pg_restore.c | 22 ++-- src/bin/pg_dump/t/001_basic.pl | 20 src/bin/pgbench/pgbench.c | 60 +++--- src/bin/pgbench/t/002_pgbench_no_server.pl | 36 - src/bin/scripts/reindexdb.c| 9 ++-- src/bin/scripts/vacuumdb.c | 31 --- 16 files changed, 174 insertions(+), 190 deletions(-) diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h index d653cb94e3..e999d56ec0 100644 --- a/src/include/fe_utils/option_utils.h +++ b/src/include/fe_utils/option_utils.h @@ -19,5 +19,8 @@ typedef void (*help_handler) (const char *progname); extern void handle_help_version_opts(int argc, char *argv[], const char *fixed_progname, help_handler hlp); +extern bool option_parse_int(const char *optarg, const char *optname, + int min_range, int max_range, + int *result); #endif /* OPTION_UTILS_H */ diff --git a/src/fe_utils/option_utils.c b/src/fe_utils/option_utils.c index e19a495dba..a09d57db71 100644 --- a/src/fe_utils/option_utils.c +++ b/src/fe_utils/option_utils.c @@ -12,6 +12,8 @@ #include "postgres_fe.h" +#include "common/logging.h" +#include "common/string.h" #include "fe_utils/option_utils.h" /* @@ -36,3 +38,40 @@ handle_help_version_opts(int argc, char *argv[], } } } + +/* + * option_parse_int + * + * Parse integer value for an option. Returns true if the parsing could + * be done with optionally *result holding the parsed value, and false on + * failure. + */ +bool +option_parse_int(const char *optarg, const char *optname, + int min_range, int max_range, + int *result) +{ + char *endptr; + int val; + + errno = 0; + val = strtoint(optarg, , 10); + + if (*endptr) + { + pg_log_error("invalid value \"%s\" for option \"%s\"", + optarg, optname); + return false; + } + + if (errno == ERANGE || val < min_range || val > max_range) + { + pg_log_error("\"%s\" must be in range %d..%d", + optname, min_range, max_range); + return false; + } + + if (result) + *result = val; + return true; +} diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c index 4bde16fb4b..ee8aa71bdf 100644 --- a/src/bin/pg_amcheck/pg_amcheck.c +++ b/src/bin/pg_amcheck/pg_amcheck.c @@ -12,6 +12,7 @@ */ #include "postgres_fe.h" +#include #include #include "catalog/pg_am_d.h" @@ -326,12 +327,9 @@ main(int argc, char *argv[]) append_btree_pattern(, optarg, encoding); break; case 'j': -opts.jobs = atoi(optarg); -if (opts.jobs < 1) -{ - pg_log_error("number of parallel jobs must be at least 1"); +if (!option_parse_int(optarg, "-j/--jobs", 1, INT_MAX, + )) exit(1); -} break; case 'p': port = pg_strdup(optarg); diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index 8bb0acf498..9ea98481d8 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -31,6 +31,7 @@ #include "common/file_utils.h" #include "common/logging.h" #include "common/string.h" +#include "fe_utils/option_utils.h"
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Thu, 22 Jul 2021 at 00:44, Michael Paquier wrote: > > On Thu, Jul 22, 2021 at 12:32:39AM +1200, David Rowley wrote: > > I see both of these are limited to 64 on windows. Won't those fail on > > Windows? > > Yes, thanks, they would. I would just cut the range numbers from the > expected output here. This does not matter in terms of coverage > either. Sounds good. > x> I also wondered if it would be worth doing #define MAX_JOBS somewhere > > away from the option parsing code. This part is pretty ugly: > > Agreed as well. pg_dump and pg_restore have their own idea of > parallelism in parallel.{c.h}. What about putting MAX_JOBS in > parallel.h then? parallel.h looks ok to me. David
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Thu, Jul 22, 2021 at 12:32:39AM +1200, David Rowley wrote: > I see both of these are limited to 64 on windows. Won't those fail on Windows? Yes, thanks, they would. I would just cut the range numbers from the expected output here. This does not matter in terms of coverage either. x> I also wondered if it would be worth doing #define MAX_JOBS somewhere > away from the option parsing code. This part is pretty ugly: Agreed as well. pg_dump and pg_restore have their own idea of parallelism in parallel.{c.h}. What about putting MAX_JOBS in parallel.h then? -- Michael signature.asc Description: PGP signature
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Wed, 21 Jul 2021 at 23:50, Michael Paquier wrote: > Hacking on that, I am finishing with the attached. It is less > ambitious, still very useful as it removes a dozen of custom error > messages in favor of the two ones introduced in option_utils.c. On > top of that this reduces a bit the code: > 15 files changed, 156 insertions(+), 169 deletions(-) > > What do you think? This is just a driveby review, but I think that it's good to see no increase in the number of lines of code to make these improvements. It's also good to see the added consistency introduced by this patch. I didn't test the patch, so this is just from reading through. I wondered about the TAP tests here: command_fails_like( [ 'pg_dump', '-j', '-1' ], qr/\Qpg_dump: error: -j\/--jobs must be in range 0..2147483647\E/, 'pg_dump: invalid number of parallel jobs'); command_fails_like( [ 'pg_restore', '-j', '-1', '-f -' ], qr/\Qpg_restore: error: -j\/--jobs must be in range 0..2147483647\E/, 'pg_restore: invalid number of parallel jobs'); I see both of these are limited to 64 on windows. Won't those fail on Windows? I also wondered if it would be worth doing #define MAX_JOBS somewhere away from the option parsing code. This part is pretty ugly: /* * On Windows we can only have at most MAXIMUM_WAIT_OBJECTS * (= 64 usually) parallel jobs because that's the maximum * limit for the WaitForMultipleObjects() call. */ if (!option_parse_int(optarg, "-j/--jobs", 0, #ifdef WIN32 MAXIMUM_WAIT_OBJECTS, #else INT_MAX, #endif )) exit(1); break; David
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Wed, Jul 21, 2021 at 05:02:29PM +0900, Kyotaro Horiguchi wrote: > The difference is your suggestion is making the function output the > message within. I guess that the reason for the original proposal is > different style of message is required in several places. That's one step toward having a maximum number of frontend tools to use the central logging APIs of src/common/. > 1. Some "bare" options (that is not preceded by a hyphen option) like > PID of pg_ctl kill doesn't fit the format. \pset parameters of > pg_ctl is the same. Yep. I was reviewing this one, but I have finished by removing it. The argument 2 just below also came into my mind. > 2. pg_ctl, pg_upgrade use their own error reporting mechanisms. Yeah, for this reason I don't think that it is a good idea to switch those areas to use the parsing of option_utils.c. Perhaps we should consider switching pg_upgrade to have a better logging infra, but there are also reasons behind what we have now. pg_ctl is out of scope as it needs to cover WIN32 event logging. > 3. parameters that take real numbers doesn't fit the scheme specifying > range borders. For example boundary values may or may not be included > in the range. This concerns only pgbench, which I'd be fine to let as-is. > 4. Most of the errors are PG_LOG_ERROR, but a few ones are > PG_LOG_FATAL. I would take it that pgbench is inconsistent with the rest. Note that pg_dump uses fatal(), but that's just a wrapper to pg_log_error(). > loglevel specifies the loglevel to use to emit error messages. If it > is the newly added item PG_LOG_OMIT, the function never emits an error > message. Addition to that, the return type is changed to an enum which > indicates what trouble the given string has. The caller can print > arbitrary error messages consulting the value. (killproc in pg_ctl.c) I am not much a fan of that. If we do so, what's the point in having a dependency to logging.c anyway in option_utils.c? This OMIT option only exists to bypass the specific logging needs where this gets added. That does not seem a design adapted to me in the long term, neither am I a fan of specific error codes for a code path that's just going to be used to parse command options. > I added two more similar functions option_parse_long/double. The > former is a clone of _int. The latter doesn't perform explicit range > checks for the reason described above. These have a limited impact, so I would limit things to int32 for now. > Maybe we need to make pg_upgrade use the common-logging features > instead of its own, but it is not included in this patch. Maybe. That would be good in the long term, though its case is very particular. > The attached patch needs more polish but should be enough to tell what > I have in my mind. This breaks some of the TAP tests of pgbench and pg_dump, at short sight. The checks for the port value in pg_receivewal and pg_recvlogical is strange to have. We don't care about that in any other tools. The number of checks for --jobs and workers could be made more consistent across the board, but I have let that out for now. Hacking on that, I am finishing with the attached. It is less ambitious, still very useful as it removes a dozen of custom error messages in favor of the two ones introduced in option_utils.c. On top of that this reduces a bit the code: 15 files changed, 156 insertions(+), 169 deletions(-) What do you think? -- Michael From 3aaf98a9e53d90ca2ac8c2734ae77487aa03843f Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Wed, 21 Jul 2021 20:42:13 +0900 Subject: [PATCH v4] Introduce and use routine for parsing of int32 options --- src/include/fe_utils/option_utils.h| 3 ++ src/fe_utils/option_utils.c| 39 ++ src/bin/pg_amcheck/pg_amcheck.c| 8 ++- src/bin/pg_basebackup/pg_basebackup.c | 17 +++--- src/bin/pg_basebackup/pg_receivewal.c | 23 +++-- src/bin/pg_basebackup/pg_recvlogical.c | 25 - src/bin/pg_checksums/Makefile | 3 ++ src/bin/pg_checksums/pg_checksums.c| 9 ++-- src/bin/pg_dump/pg_dump.c | 41 +++ src/bin/pg_dump/pg_restore.c | 31 +-- src/bin/pg_dump/t/001_basic.pl | 8 +-- src/bin/pgbench/pgbench.c | 60 +++--- src/bin/pgbench/t/002_pgbench_no_server.pl | 18 +++ src/bin/scripts/reindexdb.c| 9 ++-- src/bin/scripts/vacuumdb.c | 31 --- 15 files changed, 156 insertions(+), 169 deletions(-) diff --git a/src/include/fe_utils/option_utils.h b/src/include/fe_utils/option_utils.h index d653cb94e3..e999d56ec0 100644 --- a/src/include/fe_utils/option_utils.h +++ b/src/include/fe_utils/option_utils.h @@ -19,5 +19,8 @@ typedef void (*help_handler) (const char *progname); extern void handle_help_version_opts(int argc, char *argv[], const char *fixed_progname,
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
At Thu, 15 Jul 2021 16:19:07 +0900, Michael Paquier wrote in > On Wed, Jul 14, 2021 at 11:02:47AM -0400, Alvaro Herrera wrote: > > No, this doesn't work. When the first word is something that is > > not to be translated (a literal parameter name), let's use a %s (but of > > course the values must be taken out of the phrase too). But when it is > > a translatable phrase, it must be present a full phrase, no > > substitution: > > > > pg_log_error("%s must be in range %s..%s", "extra_float_digits", "-15", > > "3"); > > pg_log..("compression level must be in range %s..%s", "0", "9")) > > > > I think the purity test is whether you want to use a _() around the > > argument, then you have to include it into the original message. > > After thinking about that, it seems to me that we don't have that much > context to lose once we build those error messages to show the option > name of the command. And the patch, as proposed, finishes with the Agreed. > same error message patterns all over the place, which would be a > recipe for more inconsistencies in the future. I think that we should > centralize all that, say with a small-ish routine in a new file called > src/fe_utils/option_parsing.c that uses strtoint() as follows: > bool option_parse_int(const char *optarg, > const char *optname, > int min_range, > int max_range, > int *result); > > Then this routine may print two types of errors through > pg_log_error(): > - Incorrect range, using min_range/max_range. > - Junk input. > The boolean status is here to let the caller do any custom exit() > actions he wishes if there one of those two failures. pg_dump has > some of that with exit_nicely(), for one. It is substantially the same suggestion with [1] in the original thread. The original proposal in the old thread was > bool pg_strtoint64_range(arg, , min, max, _message) The difference is your suggestion is making the function output the message within. I guess that the reason for the original proposal is different style of message is required in several places. Actually there are several irregulars. 1. Some "bare" options (that is not preceded by a hyphen option) like PID of pg_ctl kill doesn't fit the format. \pset parameters of pg_ctl is the same. 2. pg_ctl, pg_upgrade use their own error reporting mechanisms. 3. parameters that take real numbers doesn't fit the scheme specifying range borders. For example boundary values may or may not be included in the range. 4. Most of the errors are PG_LOG_ERROR, but a few ones are PG_LOG_FATAL. That being said, most of the caller sites are satisfied by fixed-formed messages. So I changed the interface as the following to deal with the above issues: +extern optparse_result option_parse_int(int loglevel, + const char *optarg, const char *optname, + int min_range, int max_range, + int *result); loglevel specifies the loglevel to use to emit error messages. If it is the newly added item PG_LOG_OMIT, the function never emits an error message. Addition to that, the return type is changed to an enum which indicates what trouble the given string has. The caller can print arbitrary error messages consulting the value. (killproc in pg_ctl.c) Other points: I added two more similar functions option_parse_long/double. The former is a clone of _int. The latter doesn't perform explicit range checks for the reason described above. Maybe we need to make pg_upgraded use the common-logging features instead of its own, but it is not included in this patch. pgbench's -L option accepts out-of-range values for internal variable. As the added comment says, we can limit the value with the large exact number but I limited it to 3600s since I can't imagine people needs larger latency limit than that. Similarly to the above, -R option can take for example 1E-300, which leads to int64 overflow later. Similar to -L, I don't think people don't need less than 1E-5 or so as this parameter. The attached patch needs more polish but should be enough to tell what I have in my mind. regards. 1: https://www.postgresql.org/message-id/CAKJS1f94kkuB_53LcEf0HF%2BuxMiTCk5FtLx9gSsXcCByUKMz1g%40mail.gmail.com -- Kyotaro Horiguchi NTT Open Source Software Center >From aaf81ac0340e9df3b74e18c1492e5984c0412fb5 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Thu, 8 Jul 2021 15:08:01 +0900 Subject: [PATCH v3 1/3] Be strict in numeric parameters on command line Some numeric command line parameters are tolerant of valid values followed by garbage like "123xyz". Be strict to reject such invalid values. Do the same for psql meta command parameters. --- src/bin/pg_amcheck/pg_amcheck.c| 9 +- src/bin/pg_basebackup/pg_basebackup.c | 17 ++-
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Wed, Jul 14, 2021 at 11:02:47AM -0400, Alvaro Herrera wrote: > On 2021-Jul-14, Kyotaro Horiguchi wrote: >>> Should we take this occasion to reduce the burden of translators and >>> reword that as "%s must be in range %d..%d"? That could be a separate >>> patch. > > Yes, please, let's do it here. Okay. > No, this doesn't work. When the first word is something that is > not to be translated (a literal parameter name), let's use a %s (but of > course the values must be taken out of the phrase too). But when it is > a translatable phrase, it must be present a full phrase, no > substitution: > > pg_log_error("%s must be in range %s..%s", "extra_float_digits", "-15", > "3"); > pg_log..("compression level must be in range %s..%s", "0", "9")) > > I think the purity test is whether you want to use a _() around the > argument, then you have to include it into the original message. After thinking about that, it seems to me that we don't have that much context to lose once we build those error messages to show the option name of the command. And the patch, as proposed, finishes with the same error message patterns all over the place, which would be a recipe for more inconsistencies in the future. I think that we should centralize all that, say with a small-ish routine in a new file called src/fe_utils/option_parsing.c that uses strtoint() as follows: bool option_parse_int(const char *optarg, const char *optname, int min_range, int max_range, int *result); Then this routine may print two types of errors through pg_log_error(): - Incorrect range, using min_range/max_range. - Junk input. The boolean status is here to let the caller do any custom exit() actions he wishes if there one of those two failures. pg_dump has some of that with exit_nicely(), for one. -- Michael signature.asc Description: PGP signature
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On 2021-Jul-14, Kyotaro Horiguchi wrote: > > > pg_log_error("extra_float_digits must be in range > > > -15..3"); > > > exit_nicely(1); > > > > Should we take this occasion to reduce the burden of translators and > > reword that as "%s must be in range %d..%d"? That could be a separate > > patch. Yes, please, let's do it here. > The first %s is not always an invariable symbol name so it could > result in concatenating several phrases into one, like this. > > pg_log..("%s must be in range %s..%s", _("compression level"), "0", "9")) > > It is translatable at least into Japanese but I'm not sure about other > languages. No, this doesn't work. When the first word is something that is not to be translated (a literal parameter name), let's use a %s (but of course the values must be taken out of the phrase too). But when it is a translatable phrase, it must be present a full phrase, no substitution: pg_log_error("%s must be in range %s..%s", "extra_float_digits", "-15", "3"); pg_log..("compression level must be in range %s..%s", "0", "9")) I think the purity test is whether you want to use a _() around the argument, then you have to include it into the original message. Thanks -- Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/ "After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL was amazing when I first started using it at 7.2, and I'm continually astounded by learning new features and techniques made available by the continuing work of the development team." Berend Tober, http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
Thanks for the discussion. At Tue, 13 Jul 2021 09:28:30 +0900, Michael Paquier wrote in > On Fri, Jul 09, 2021 at 04:50:28PM +0900, Kyotaro Horiguchi wrote: > > At Fri, 9 Jul 2021 10:29:07 +0900, Michael Paquier > > wrote in > >> Er, wait. We've actually allowed negative values for pg_ctl > >> --timeout or the subcommand kill!? > > > > --timeout accepts values less than 1, which values cause the command > > ends with the timeout error before checking for the ending state. Not > > destructive but useless as a behavior. We have two choices here. One > > is simply inhibiting zero or negative timeouts, and another is > > allowing zero as timeout and giving it the same meaning to > > --no-wait. The former is strictly right behavior but the latter is > > casual and convenient. I took the former way in this version. > > Yeah, that's not useful. ^^; Ok, I'm fine with taking the second way. Changed it. "-t 0" means "-W" in the attached and that behavior is described in the doc part. The environment variable PGCTLTIMEOUT accepts the same range of values. I added a warning that notifies that -t 0 overrides explicit -w . > $ pg_ctl -w -t 0 start > pg_ctl: WARNING: -w is ignored because timeout is set to 0 > server starting > >> This one in pg_upgrade is incomplete. Perhaps the missing comment > >> should tell that negative job values are checked later on? > > > > Zero or negative job numbers mean non-parallel mode and don't lead to > > an error. If we don't need to preserve that behavior (I personally > > don't think it is ether useful and/or breaks someone's existing > > usage.), it is sensible to do range-check here. > > Hmm. It would be good to preserve some compatibility here, but I can > see more benefits in being consistent between all the tools, and make > people aware that such commands are not generated more carefully. Ageed. > > case 'j': > > -opts.jobs = atoi(optarg); > > -if (opts.jobs < 1) > > +errno = 0; > > +opts.jobs = strtoint(optarg, , 10); > > +if (*endptr || errno == ERANGE || opts.jobs < 1) > > { > > pg_log_error("number of parallel jobs must be at least > > 1"); > > exit(1); > > specifying a value that triggers ERANGE could be confusing for values > higher than INT_MAX with pg_amcheck --jobs: > $ pg_amcheck --jobs=40 > pg_amcheck: error: number of parallel jobs must be at least 1 > I think that this should be reworded as "invalid number of parallel > jobs \"$s\"", or "number of parallel jobs must be in range %d..%d". > Perhaps we could have a combination of both? Using the first message > is useful with junk, non-numeric values or trailing characters. The > second is useful to make users aware that the value is numeric, but > not quite right. Yeah, I noticed that but ignored as a kind of impossible, or something-needless-to-say:p > "invalid number of parallel jobs \"$s\"" > "number of parallel jobs must be in range %d..%d" The resulting combined message looks like this: > "number of parallel jobs must be an integer in range 1..2147483647" I don't think it's not great that the message explicitly suggests a limit like INT_MAX, which is far above the practical limit. So, (even though I avoided to do that but) in the attached, I changed my mind to split most of the errors into two messages to avoid suggesting such impractical limits. $ pg_amcheck -j -1 $ pg_amcheck -j 1x $ pg_amcheck -j 10x pg_amcheck: error: number of parallel jobs must be an integer greater than zero: "" $ pg_amcheck -j 10 pg_amcheck: error: number of parallel jobs out of range: "10" If you feel it's too-much or pointless to split that way, I'll happy to change it the "%d..%d" form. Still I used the "%d..%d" notation for some parameters that has a finite range by design. They look like the following: (%d..%d doesn't work well for real numbers.) "sampling rate must be an real number between 0.0 and 1.0: \"%s\"" I'm not sure what to do for numWorkers of pg_dump/restore. The limit for numWorkers are lowered on Windows to quite low value, which would be worth showing, but otherwise the limit is INT_MAX. The attached makes pg_dump respond to -j 100 with the following error message which doesn't suggest the finite limit 64 on Windows. $ pg_dump -j 100 pg_dump: error: number of parallel jobs out of range: "100" > > @@ -587,8 +602,10 @@ main(int argc, char **argv) > > > > case 8: > > have_extra_float_digits = true; > > -extra_float_digits = atoi(optarg); > > -if (extra_float_digits < -15 || extra_float_digits > 3) > > +errno = 0; > > +extra_float_digits = strtoint(optarg, , 10); > > +if (*endptr || errno == ERANGE || > > +extra_float_digits < -15 || extra_float_digits >
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Fri, Jul 09, 2021 at 04:50:28PM +0900, Kyotaro Horiguchi wrote: > At Fri, 9 Jul 2021 10:29:07 +0900, Michael Paquier > wrote in >> Er, wait. We've actually allowed negative values for pg_ctl >> --timeout or the subcommand kill!? > > --timeout accepts values less than 1, which values cause the command > ends with the timeout error before checking for the ending state. Not > destructive but useless as a behavior. We have two choices here. One > is simply inhibiting zero or negative timeouts, and another is > allowing zero as timeout and giving it the same meaning to > --no-wait. The former is strictly right behavior but the latter is > casual and convenient. I took the former way in this version. Yeah, that's not useful. >> This one in pg_upgrade is incomplete. Perhaps the missing comment >> should tell that negative job values are checked later on? > > Zero or negative job numbers mean non-parallel mode and don't lead to > an error. If we don't need to preserve that behavior (I personally > don't think it is ether useful and/or breaks someone's existing > usage.), it is sensible to do range-check here. Hmm. It would be good to preserve some compatibility here, but I can see more benefits in being consistent between all the tools, and make people aware that such commands are not generated more carefully. > case 'j': > -opts.jobs = atoi(optarg); > -if (opts.jobs < 1) > +errno = 0; > +opts.jobs = strtoint(optarg, , 10); > +if (*endptr || errno == ERANGE || opts.jobs < 1) > { > pg_log_error("number of parallel jobs must be at least > 1"); > exit(1); specifying a value that triggers ERANGE could be confusing for values higher than INT_MAX with pg_amcheck --jobs: $ pg_amcheck --jobs=40 pg_amcheck: error: number of parallel jobs must be at least 1 I think that this should be reworded as "invalid number of parallel jobs \"$s\"", or "number of parallel jobs must be in range %d..%d". Perhaps we could have a combination of both? Using the first message is useful with junk, non-numeric values or trailing characters. The second is useful to make users aware that the value is numeric, but not quite right. > --- a/src/bin/pg_checksums/pg_checksums.c > case 'f': > -if (atoi(optarg) == 0) > +errno = 0; > +if (strtoint(optarg, , 10) == 0 > +|| *endptr || errno == ERANGE) > { > pg_log_error("invalid filenode specification, must be > numeric: %s", optarg); > exit(1); The confusion is equal here with pg_checksums -f: $ ./pg_checksums --f 40 pg_checksums: error: invalid filenode specification, must be numeric: 4 We could say "invalid file specification: \"%s\"". Another idea to be crystal-clear about the range requirements is to use that: "file specification must be in range %d..%d" > @@ -587,8 +602,10 @@ main(int argc, char **argv) > > case 8: > have_extra_float_digits = true; > -extra_float_digits = atoi(optarg); > -if (extra_float_digits < -15 || extra_float_digits > 3) > +errno = 0; > +extra_float_digits = strtoint(optarg, , 10); > +if (*endptr || errno == ERANGE || > +extra_float_digits < -15 || extra_float_digits > 3) > { > pg_log_error("extra_float_digits must be in range > -15..3"); > exit_nicely(1); Should we take this occasion to reduce the burden of translators and reword that as "%s must be in range %d..%d"? That could be a separate patch. > case 'p': > -if ((old_cluster.port = atoi(optarg)) <= 0) > -pg_fatal("invalid old port number\n"); > +errno = 0; > +if ((old_cluster.port = strtoint(optarg, , 10)) <= 0 > || > +*endptr || errno == ERANGE) > +pg_fatal("invalid old port number %s\n", optarg); > break; You may want to use columns here, or specify the port range: "invalid old port number: %s" or "old port number must be in range %d..%d". > case 'P': > -if ((new_cluster.port = atoi(optarg)) <= 0) > -pg_fatal("invalid new port number\n"); > +errno = 0; > +if ((new_cluster.port = strtoint(optarg, , 10)) <= 0 > || > +*endptr || errno == ERANGE) > +pg_fatal("invalid new port number %s\n", optarg); > break; Ditto. > +if (*endptr || errno == ERANGE || concurrentCons <= 0) > { > -pg_log_error("number of parallel jobs must be at least > 1"); > +
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
Thank you for the comments. At Fri, 9 Jul 2021 10:29:07 +0900, Michael Paquier wrote in > On Thu, Jul 08, 2021 at 05:30:23PM +0900, Kyotaro Horiguchi wrote: > > [1] is trying to expose pg_strtoint16/32 to frontend, but I don't see > > much point in doing that in conjunction with [2] or this thread. Since > > the integral parameter values of pg-commands are in int, which the > > exising function strtoint() is sufficient to read. So even [2] itself > > doesn't need to utilize [1]. > > It sounds sensible from here to just use strtoint(), some strtol(), > son strtod() and call it a day as these are already available. Thanks. > > -wait_seconds = atoi(optarg); > > +errno = 0; > > +wait_seconds = strtoint(optarg, , 10); > > +if (*endptr || errno == ERANGE || wait_seconds < 0) > > +{ > > +pg_log_error("invalid timeout \"%s\"", optarg); > > +exit(1); > > +} > > [ ... ] > > -killproc = atol(argv[++optind]); > > +errno = 0; > > +killproc = strtol(argv[++optind], , 10); > > +if (*endptr || errno == ERANGE || killproc < 0) > > +{ > > +pg_log_error("invalid process ID \"%s\"", > > argv[optind]); > > +exit(1); > > +} > > Er, wait. We've actually allowed negative values for pg_ctl > --timeout or the subcommand kill!? For killproc, leading minus sign suggests that it is an command line option. Since pg_ctl doesn't have such an option, that values is recognized as invalid *options*, even with the additional check. The additional check is useless in that sense. My intension is just to make the restriction explicit so I won't feel sad even if it should be removed. $ pg_ctl kill HUP -12345 cg_ctl: infalid option -- '1' --timeout accepts values less than 1, which values cause the command ends with the timeout error before checking for the ending state. Not destructive but useless as a behavior. We have two choices here. One is simply inhibiting zero or negative timeouts, and another is allowing zero as timeout and giving it the same meaning to --no-wait. The former is strictly right behavior but the latter is casual and convenient. I took the former way in this version. $ pg_ctl -w -t 0 start pg_ctl: error: invalid timeout value "0", use --no-wait to finish without waiting The same message is shown for non-decimal values but that would not matters. > > case 'j': > > -user_opts.jobs = atoi(optarg); > > +errno = 0; > > +user_opts.jobs = strtoint(optarg, , 10); > > +/**/ > > +if (*endptr || errno == ERANGE) > > +pg_fatal("invalid number of jobs %s\n", optarg); > > + > > break; > > This one in pg_upgrade is incomplete. Perhaps the missing comment > should tell that negative job values are checked later on? Zero or negative job numbers mean non-parallel mode and don't lead to an error. If we don't need to preserve that behavior (I personally don't think it is ether useful and/or breaks someone's existing usage.), it is sensible to do range-check here. I noticed that I overlooked PGCTLTIMEOUT. The attached is: - disallowed less-than-one numbers as jobs in pg_upgrade. - disallowed less-than-one timeout in pg_ctl - Use strtoint for PGCTLTIMEOUT in pg_ctl is regards. -- Kyotaro Horiguchi NTT Open Source Software Center >From 0818de16f9febea3d90ae0404b4f5b3643f6cbe0 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Thu, 8 Jul 2021 15:08:01 +0900 Subject: [PATCH v2 1/2] Be strict in numeric parameters on command line Some numeric command line parameters are tolerant of valid values followed by garbage like "123xyz". Be strict to reject such invalid values. Do the same for psql meta command parameters. --- src/bin/pg_amcheck/pg_amcheck.c| 6 ++- src/bin/pg_basebackup/pg_basebackup.c | 13 +++-- src/bin/pg_basebackup/pg_receivewal.c | 18 +-- src/bin/pg_basebackup/pg_recvlogical.c | 17 +-- src/bin/pg_checksums/pg_checksums.c| 7 ++- src/bin/pg_ctl/pg_ctl.c| 18 ++- src/bin/pg_dump/pg_dump.c | 39 --- src/bin/pg_dump/pg_restore.c | 17 --- src/bin/pg_upgrade/option.c| 21 ++-- src/bin/pgbench/pgbench.c | 66 -- src/bin/psql/command.c | 52 ++-- src/bin/scripts/reindexdb.c| 10 ++-- src/bin/scripts/vacuumdb.c | 23 + 13 files changed, 219 insertions(+), 88 deletions(-) diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c index 4bde16fb4b..71a82f9b75 100644 --- a/src/bin/pg_amcheck/pg_amcheck.c +++ b/src/bin/pg_amcheck/pg_amcheck.c
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Thu, Jul 08, 2021 at 05:30:23PM +0900, Kyotaro Horiguchi wrote: > Looked through the three threads. Thanks! > [1] is trying to expose pg_strtoint16/32 to frontend, but I don't see > much point in doing that in conjunction with [2] or this thread. Since > the integral parameter values of pg-commands are in int, which the > exising function strtoint() is sufficient to read. So even [2] itself > doesn't need to utilize [1]. It sounds sensible from here to just use strtoint(), some strtol(), son strtod() and call it a day as these are already available. > -wait_seconds = atoi(optarg); > +errno = 0; > +wait_seconds = strtoint(optarg, , 10); > +if (*endptr || errno == ERANGE || wait_seconds < 0) > +{ > +pg_log_error("invalid timeout \"%s\"", optarg); > +exit(1); > +} > [ ... ] > -killproc = atol(argv[++optind]); > +errno = 0; > +killproc = strtol(argv[++optind], , 10); > +if (*endptr || errno == ERANGE || killproc < 0) > +{ > +pg_log_error("invalid process ID \"%s\"", argv[optind]); > +exit(1); > +} Er, wait. We've actually allowed negative values for pg_ctl --timeout or the subcommand kill!? > case 'j': > -user_opts.jobs = atoi(optarg); > +errno = 0; > +user_opts.jobs = strtoint(optarg, , 10); > +/**/ > +if (*endptr || errno == ERANGE) > +pg_fatal("invalid number of jobs %s\n", optarg); > + > break; This one in pg_upgrade is incomplete. Perhaps the missing comment should tell that negative job values are checked later on? -- Michael signature.asc Description: PGP signature
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
At Wed, 7 Jul 2021 17:40:13 +0530, Bharath Rupireddy wrote in > On Fri, Jun 4, 2021 at 10:23 PM Alvaro Herrera > wrote: > > > > On 2021-Jun-04, Bharath Rupireddy wrote: > > > > > On Fri, Jun 4, 2021 at 8:58 PM Alvaro Herrera > > > wrote: > > > > > > I would suggest that the best way forward in this area is to rebase both > > > > there patches on current master. > > > > > > Thanks. I will read both the threads [1], [2] and try to rebase the > > > patches. If at all I get to rebase them, do you prefer the patches to > > > be in this thread or in a new thread? > > > > Thanks, that would be helpful. This thread is a good place. > > I'm unable to spend time on this work as promised. I'd be happy if > someone could take it forward, although it's not critical work(IMO) > that needs immediate focus. I will try to spend time maybe later this > year. Looked through the three threads. [1] is trying to expose pg_strtoint16/32 to frontend, but I don't see much point in doing that in conjunction with [2] or this thread. Since the integral parameter values of pg-commands are in int, which the exising function strtoint() is sufficient to read. So even [2] itself doesn't need to utilize [1]. So I agree to the Bharath's point. So the attached is a standalone patch that: - doesn't contain [1], since that functions are not needed for this purpose. - basically does the same thing with [2], but uses strtoint/strtol/strtod instead of pg_strtoint16/32. - doesn't try to make error messages verbose. That results in a somewhat strange message like this but I'm not sure we should be so strict at that point. > reindexdb: error: number of parallel jobs must be at least 1: hoge - is extended to cover all usages of atoi/l/f in command line processing, which are not fully covered by [2]. (Maybe) - is extended to cover psql's meta command parameters. - is extended to cover integral environment variables. (PGPORTOLD/NEW of pg_upgrade and COLUMNS of psql). The commands emit a warning for invalid values, but I'm not sure it's worthwhile. (The second attached.) > psql: warning: ignored invalid setting of environemt variable COLUMNS: 3x - doesn't cover pgbench's meta command parameters (for speed). [1] - https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1904201223040.29102@lancre [2] - https://www.postgresql.org/message-id/20191028012000.ga59...@begriffs.com regards. -- Kyotaro Horiguchi NTT Open Source Software Center >From 54ed83143012473727faff9f9e64dee613cedcfe Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi Date: Thu, 8 Jul 2021 15:08:01 +0900 Subject: [PATCH 1/2] Be strict in numeric parameters on command line Some numeric command line parameters are tolerant of valid values followed by garbage like "123xyz". Be strict to reject such invalid values. Do the same for psql meta command parameters. --- src/bin/pg_amcheck/pg_amcheck.c| 6 ++- src/bin/pg_basebackup/pg_basebackup.c | 13 +++-- src/bin/pg_basebackup/pg_receivewal.c | 18 +-- src/bin/pg_basebackup/pg_recvlogical.c | 17 +-- src/bin/pg_checksums/pg_checksums.c| 7 ++- src/bin/pg_ctl/pg_ctl.c| 18 ++- src/bin/pg_dump/pg_dump.c | 39 --- src/bin/pg_dump/pg_restore.c | 17 --- src/bin/pg_upgrade/option.c| 21 ++-- src/bin/pgbench/pgbench.c | 66 -- src/bin/psql/command.c | 52 ++-- src/bin/scripts/reindexdb.c| 10 ++-- src/bin/scripts/vacuumdb.c | 23 + 13 files changed, 219 insertions(+), 88 deletions(-) diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c index 4bde16fb4b..71a82f9b75 100644 --- a/src/bin/pg_amcheck/pg_amcheck.c +++ b/src/bin/pg_amcheck/pg_amcheck.c @@ -17,6 +17,7 @@ #include "catalog/pg_am_d.h" #include "catalog/pg_namespace_d.h" #include "common/logging.h" +#include "common/string.h" #include "common/username.h" #include "fe_utils/cancel.h" #include "fe_utils/option_utils.h" @@ -326,8 +327,9 @@ main(int argc, char *argv[]) append_btree_pattern(, optarg, encoding); break; case 'j': -opts.jobs = atoi(optarg); -if (opts.jobs < 1) +errno = 0; +opts.jobs = strtoint(optarg, , 10); +if (*endptr || errno == ERANGE || opts.jobs < 1) { pg_log_error("number of parallel jobs must be at least 1"); exit(1); diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c index 8bb0acf498..29be95b96a 100644 --- a/src/bin/pg_basebackup/pg_basebackup.c +++ b/src/bin/pg_basebackup/pg_basebackup.c @@ -2287,6 +2287,8 @@ main(int argc, char **argv) while ((c = getopt_long(argc, argv, "CD:F:r:RS:T:X:l:nNzZ:d:c:h:p:U:s:wWkvP", long_options, _index)) != -1) { + char *endptr; + switch (c) { case 'C': @@ -2371,8 +2373,10 @@ main(int argc, char **argv) #endif break; case 'Z': -
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Fri, Jun 4, 2021 at 10:23 PM Alvaro Herrera wrote: > > On 2021-Jun-04, Bharath Rupireddy wrote: > > > On Fri, Jun 4, 2021 at 8:58 PM Alvaro Herrera > > wrote: > > > > I would suggest that the best way forward in this area is to rebase both > > > there patches on current master. > > > > Thanks. I will read both the threads [1], [2] and try to rebase the > > patches. If at all I get to rebase them, do you prefer the patches to > > be in this thread or in a new thread? > > Thanks, that would be helpful. This thread is a good place. I'm unable to spend time on this work as promised. I'd be happy if someone could take it forward, although it's not critical work(IMO) that needs immediate focus. I will try to spend time maybe later this year. Regards, Bharath Rupireddy.
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On 2021-Jun-04, Bharath Rupireddy wrote: > On Fri, Jun 4, 2021 at 8:58 PM Alvaro Herrera wrote: > > I would suggest that the best way forward in this area is to rebase both > > there patches on current master. > > Thanks. I will read both the threads [1], [2] and try to rebase the > patches. If at all I get to rebase them, do you prefer the patches to > be in this thread or in a new thread? Thanks, that would be helpful. This thread is a good place. -- Álvaro Herrera Valdivia, Chile
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Fri, Jun 4, 2021 at 8:58 PM Alvaro Herrera wrote: > > On 2021-Jun-04, Bharath Rupireddy wrote: > > > On Thu, May 27, 2021 at 3:05 AM Alvaro Herrera > > wrote: > > > Hi, how is this related to > > > https://postgr.es/m/20191028012000.ga59...@begriffs.com ? > > > > Thanks. The proposed approach there was to implement postgres's own > > strtol i.e. string parsing, conversion to integers and use it in the > > places where atoi is being used. I'm not sure how far that can go. > > What I'm proposing here is to use strtol inplace of atoi to properly > > detect errors in case of inputs like '1211efe', '-14adc' and so on as > > atoi can't detect such errors. Thoughts? > > Well, if you scroll back to Surafel's initial submission in that thread, > it looks very similar in spirit to what you have here. > > Another thing I just noticed which I hadn't realized is that Joe > Nelson's patch depends on Fabien Coelho's patch in this other thread, > https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1904201223040.29102@lancre > which was closed as returned-with-feedback, I suppose mostly due to > exhaustion/frustration at the lack of progress/interest. > > I would suggest that the best way forward in this area is to rebase both > there patches on current master. Thanks. I will read both the threads [1], [2] and try to rebase the patches. If at all I get to rebase them, do you prefer the patches to be in this thread or in a new thread? [1] - https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1904201223040.29102@lancre [2] - https://www.postgresql.org/message-id/20191028012000.ga59...@begriffs.com With Regards, Bharath Rupireddy.
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On 2021-Jun-04, Bharath Rupireddy wrote: > On Thu, May 27, 2021 at 3:05 AM Alvaro Herrera > wrote: > > Hi, how is this related to > > https://postgr.es/m/20191028012000.ga59...@begriffs.com ? > > Thanks. The proposed approach there was to implement postgres's own > strtol i.e. string parsing, conversion to integers and use it in the > places where atoi is being used. I'm not sure how far that can go. > What I'm proposing here is to use strtol inplace of atoi to properly > detect errors in case of inputs like '1211efe', '-14adc' and so on as > atoi can't detect such errors. Thoughts? Well, if you scroll back to Surafel's initial submission in that thread, it looks very similar in spirit to what you have here. Another thing I just noticed which I hadn't realized is that Joe Nelson's patch depends on Fabien Coelho's patch in this other thread, https://www.postgresql.org/message-id/flat/alpine.DEB.2.21.1904201223040.29102@lancre which was closed as returned-with-feedback, I suppose mostly due to exhaustion/frustration at the lack of progress/interest. I would suggest that the best way forward in this area is to rebase both there patches on current master. -- Álvaro Herrera Valdivia, Chile "La virtud es el justo medio entre dos defectos" (Aristóteles)
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On Thu, May 27, 2021 at 3:05 AM Alvaro Herrera wrote: > > On 2021-May-19, Bharath Rupireddy wrote: > > > While working on [1], I found that some parts of the code is using > > strtol and atoi without checking for non-numeric junk input strings. I > > found this strange. Most of the time users provide proper numeric > > strings but there can be some scenarios where these strings are not > > user-supplied but generated by some other code which may contain > > non-numeric strings. Shouldn't the code use strtol or atoi > > appropriately and error out in such cases? One way to fix this once > > and for all is to have a common API something like int > > pg_strtol/pg_str_convert_to_int(char *opt_name, char *opt_value) which > > returns a generic message upon invalid strings ("invalid value \"%s\" > > is provided for option \"%s\", opt_name, opt_value) and returns > > integers on successful parsing. > > Hi, how is this related to > https://postgr.es/m/20191028012000.ga59...@begriffs.com ? Thanks. The proposed approach there was to implement postgres's own strtol i.e. string parsing, conversion to integers and use it in the places where atoi is being used. I'm not sure how far that can go. What I'm proposing here is to use strtol inplace of atoi to properly detect errors in case of inputs like '1211efe', '-14adc' and so on as atoi can't detect such errors. Thoughts? With Regards, Bharath Rupireddy.
Re: Incorrect usage of strtol, atoi for non-numeric junk inputs
On 2021-May-19, Bharath Rupireddy wrote: > While working on [1], I found that some parts of the code is using > strtol and atoi without checking for non-numeric junk input strings. I > found this strange. Most of the time users provide proper numeric > strings but there can be some scenarios where these strings are not > user-supplied but generated by some other code which may contain > non-numeric strings. Shouldn't the code use strtol or atoi > appropriately and error out in such cases? One way to fix this once > and for all is to have a common API something like int > pg_strtol/pg_str_convert_to_int(char *opt_name, char *opt_value) which > returns a generic message upon invalid strings ("invalid value \"%s\" > is provided for option \"%s\", opt_name, opt_value) and returns > integers on successful parsing. Hi, how is this related to https://postgr.es/m/20191028012000.ga59...@begriffs.com ? -- Álvaro Herrera Valdivia, Chile
Incorrect usage of strtol, atoi for non-numeric junk inputs
Hi, While working on [1], I found that some parts of the code is using strtol and atoi without checking for non-numeric junk input strings. I found this strange. Most of the time users provide proper numeric strings but there can be some scenarios where these strings are not user-supplied but generated by some other code which may contain non-numeric strings. Shouldn't the code use strtol or atoi appropriately and error out in such cases? One way to fix this once and for all is to have a common API something like int pg_strtol/pg_str_convert_to_int(char *opt_name, char *opt_value) which returns a generic message upon invalid strings ("invalid value \"%s\" is provided for option \"%s\", opt_name, opt_value) and returns integers on successful parsing. Although this is not a critical issue, I would like to seek thoughts. [1] - https://www.postgresql.org/message-id/CALj2ACVMO6wY5Pc4oe1OCgUOAtdjHuFsBDw8R5uoYR86eWFQDA%40mail.gmail.com [2] strtol: vacuumlo.c --> ./vacuumlo -l '2323adfd' postgres -p '5432ERE' libpq_pipeline.c --> ./libpq_pipeline -r '2232adf' tests atoi: pg_amcheck.c --> ./pg_amcheck -j '1211efe' pg_basebackup --> ./pg_basebackup -Z 'EFEF' -s 'a$##' pg_receivewal.c --> ./pg_receivewal -p '54343GDFD' -s '54343GDFD' pg_recvlogical.c --> ./pg_recvlogical -F '-$#$#' -p '5432$$$' -s '100$$$' pg_checksums.c. --> ./pg_checksums -f '1212abc' pg_ctl.c --> ./pg_ctl -t 'efc' pg_dump.c --> ./pg_dump -j '454adc' -Z '4adc' --extra-float-digits '-14adc' pg_upgrade/option.c pgbench.c reindexdb.c vacuumdb.c pg_regress.c With Regards, Bharath Rupireddy. EnterpriseDB: http://www.enterprisedb.com