PR lib/57828 (https://gnats.NetBSD.org/57828) proposes changing an edge case of the strtoi(3) function so that if ERANGE and ENOTSUP are both applicable then it should fail with ERANGE rather than ENOTSUP.
This is the case when the number is out of range, and there's trailing garbage -- e.g., s="42z", min=3, max=7. The rationale is that the caller can trivially detect the ENOTSUP condition (test end[0] != '\0'), but not the ERANGE condition (short of parsing the number again, which would defeat the purpose). And applications may find the out-of-range case more significant than the trailing garbage case. For example, libutil's pidfile_read(3) tries to parse a pid on a single line, so any file whose content matches /[0-9]+\n/ with the resulting number in the interval [1, INT_MAX] is accepted, and one might hope that anything else should be rejected: https://nxr.netbsd.org/xref/src/lib/libutil/pidfile.c?r=1.16#132 But if the pidfile contains, e.g., `999999999999999999999999999\n', pidfile_read(3) as implemented will accept this -- and interpret it as pid 2147483647. If it's `-999999999999999999999999999\n', pidfile_read(3) will accept it as pid 1. Counterarguments: - We've had this semantics since NetBSD 7, so changing it is risky -- we'd have to audit all the existing code that uses it. - There are copies of the code in various places and they might diverge. - I searched github for /[^a-zA-Z0-9_]strtoi[^a-zA-Z0-9_]/ and came up with almost 16k results, so auditing that myself isn't going to happen unless we can narrow it down a lot. - If you need to act on the trailing stuff, maybe it's better to just use a real lexer instead of relying on edge cases of error cases. - If you rely on this change, your code won't work in any of the deployed implementations of strtoi anyway. Thoughts? The attached patch implements the proposed change in libc and the automatic tests.
>From a64349a5ac209d9967ba3c5d6a62aa243f44f603 Mon Sep 17 00:00:00 2001 From: Taylor R Campbell <riastr...@netbsd.org> Date: Sat, 20 Jan 2024 18:17:27 +0000 Subject: [PATCH 1/2] strtoi(3): Test for ERANGE before ENOTSUP. Expect failure because that's not what we implemented. PR lib/57828 --- tests/lib/libc/stdlib/t_strtoi.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/tests/lib/libc/stdlib/t_strtoi.c b/tests/lib/libc/stdlib/t_strtoi.c index 3301058a5be1..093b71abe04a 100644 --- a/tests/lib/libc/stdlib/t_strtoi.c +++ b/tests/lib/libc/stdlib/t_strtoi.c @@ -313,14 +313,9 @@ ATF_TC_BODY(strtoi_erroredges, tc) { "+5z", 5, 0, "z", 3, 7, ENOTSUP }, { "+6z", 6, 0, "z", 3, 7, ENOTSUP }, { "+7z", 7, 0, "z", 3, 7, ENOTSUP }, - /* - * PR lib/57828 suggests these should yield ERANGE, not - * ENOTSUP, because ENOTSUP can be discovered - * separately by the caller anyway. - */ - { "-42z", 3, 0, "z", 3, 7, ENOTSUP }, /* XXX ERANGE */ - { "2z", 3, 0, "z", 3, 7, ENOTSUP }, /* XXX ERANGE */ - { "42z", 7, 0, "z", 3, 7, ENOTSUP }, /* XXX ERANGE */ + { "-42z", 3, 0, "z", 3, 7, ERANGE }, + { "2z", 3, 0, "z", 3, 7, ERANGE }, + { "42z", 7, 0, "z", 3, 7, ERANGE }, }; intmax_t rv; @@ -330,6 +325,9 @@ ATF_TC_BODY(strtoi_erroredges, tc) for (i = 0; i < __arraycount(t); i++) { + if (t[i].rstatus == ERANGE) + atf_tc_expect_fail("PR lib/57828"); + errno = 0; rv = strtoi(t[i].str, &end, t[i].base, t[i].lo, t[i].hi, &e); >From ee1b6083ea09f569e95ae5137b5fe870beb9d017 Mon Sep 17 00:00:00 2001 From: Taylor R Campbell <riastr...@netbsd.org> Date: Sat, 20 Jan 2024 18:19:34 +0000 Subject: [PATCH 2/2] strtoi(3): Return ERANGE before ENOTSUP if both are applicable. Caller of strtoi can trivially detect the ENOTSUP case themselves by checking whether end[0] == '\0', but they can't trivially detect the ERANGE case (short of doing their own parsing which would defeat the purpose of using strtoi(3)). PR lib/57828 --- common/lib/libc/stdlib/_strtoi.h | 9 ++++++--- tests/lib/libc/stdlib/t_strtoi.c | 3 --- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/common/lib/libc/stdlib/_strtoi.h b/common/lib/libc/stdlib/_strtoi.h index e6b0ce778b60..d48d58350050 100644 --- a/common/lib/libc/stdlib/_strtoi.h +++ b/common/lib/libc/stdlib/_strtoi.h @@ -104,9 +104,6 @@ INT_FUNCNAME(_int_, _FUNCNAME, _l)(const char * __restrict nptr, /* No digits were found */ if (nptr == *endptr) *rstatus = ECANCELED; - /* There are further characters after number */ - else if (**endptr != '\0') - *rstatus = ENOTSUP; } if (im < lo) { @@ -120,6 +117,12 @@ INT_FUNCNAME(_int_, _FUNCNAME, _l)(const char * __restrict nptr, return hi; } + if (*rstatus == 0) { + /* There are further characters after number */ + if (**endptr != '\0') + *rstatus = ENOTSUP; + } + return im; } diff --git a/tests/lib/libc/stdlib/t_strtoi.c b/tests/lib/libc/stdlib/t_strtoi.c index 093b71abe04a..ee9b5a3e899f 100644 --- a/tests/lib/libc/stdlib/t_strtoi.c +++ b/tests/lib/libc/stdlib/t_strtoi.c @@ -325,9 +325,6 @@ ATF_TC_BODY(strtoi_erroredges, tc) for (i = 0; i < __arraycount(t); i++) { - if (t[i].rstatus == ERANGE) - atf_tc_expect_fail("PR lib/57828"); - errno = 0; rv = strtoi(t[i].str, &end, t[i].base, t[i].lo, t[i].hi, &e);