Re: new module 'strlcpy'
Paul Eggert wrote: > > Here's doc I propose to add to the gnulib documentation (and similar one > > to wcscpy and wcsncpy): > > Thanks, that looks good. I too share an aversion to strncpy ok, I'm committing this doc change: 2017-10-03 Bruno Haible doc: warn about misuse of strncpy and wcsncpy. * doc/posix-functions/strcpy.texi: Describe requirements on prior memory allocation. * doc/posix-functions/wcscpy.texi: Likewise. * doc/posix-functions/strncpy.texi: Describe what this function is not useful for. * doc/posix-functions/wcsncpy.texi: Likewise. diff --git a/doc/posix-functions/strcpy.texi b/doc/posix-functions/strcpy.texi index 3289362..89c6cd3 100644 --- a/doc/posix-functions/strcpy.texi +++ b/doc/posix-functions/strcpy.texi @@ -17,3 +17,6 @@ OS X 10.8. Portability problems not fixed by Gnulib: @itemize @end itemize + +Note: @code{strcpy (dst, src)} is only safe to use when you can guarantee that +there are at least @code{strlen (src) + 1} bytes allocated at @code{dst}. diff --git a/doc/posix-functions/strncpy.texi b/doc/posix-functions/strncpy.texi index 3cc6b45..5e22233 100644 --- a/doc/posix-functions/strncpy.texi +++ b/doc/posix-functions/strncpy.texi @@ -17,3 +17,12 @@ OS X 10.8. Portability problems not fixed by Gnulib: @itemize @end itemize + +Note: This function was designed for the use-case of filling a fixed-size +record with a string, before writing it to a file. This function is +@strong{not} appropriate for copying a string into a bounded memory area, +because you have no guarantee that the result will be NUL-terminated. +Even if you add the NUL byte at the end yourself, this function is +inefficient (as it spends time clearing unused memory) and will allow +silent truncation to occur, which is not a good behavior for GNU programs. +For more details, see @see{https://meyering.net/crusade-to-eliminate-strncpy/}. diff --git a/doc/posix-functions/wcscpy.texi b/doc/posix-functions/wcscpy.texi index fbac143..5e4cb01 100644 --- a/doc/posix-functions/wcscpy.texi +++ b/doc/posix-functions/wcscpy.texi @@ -19,3 +19,7 @@ Portability problems not fixed by Gnulib: On AIX and Windows platforms, @code{wchar_t} is a 16-bit type and therefore cannot accommodate all Unicode characters. @end itemize + +Note: @code{wcscpy (dst, src)} is only safe to use when you can guarantee that +there are at least @code{wcslen (src) + 1} wide characters allocated at +@code{dst}. diff --git a/doc/posix-functions/wcsncpy.texi b/doc/posix-functions/wcsncpy.texi index 4a6794a..a9a555e 100644 --- a/doc/posix-functions/wcsncpy.texi +++ b/doc/posix-functions/wcsncpy.texi @@ -16,6 +16,15 @@ IRIX 5.3, Solaris 2.5.1. Portability problems not fixed by Gnulib: @itemize @item -On AIX and Windows platforms, @code{wchar_t} is a 16-bit type and therefore cannot -accommodate all Unicode characters. +On AIX and Windows platforms, @code{wchar_t} is a 16-bit type and therefore +cannot accommodate all Unicode characters. @end itemize + +Note: This function has no real use: It cannot be used for filling a fixed-size +record with a wide string, before writing it to a file, because the wide string +encoding is platform dependent and, on some platforms, also locale dependent. +And this function is @strong{not} appropriate for copying a wide string into a +bounded memory area, because you have no guarantee that the result will be +null-terminated. Even if you add the null character at the end yourself, this +function is inefficient (as it spends time clearing unused memory) and will +allow silent truncation to occur, which is not a good behavior for GNU programs.
Re: new module 'strlcpy'
On 09/28/2017 10:54 AM, Bruno Haible wrote: Here's doc I propose to add to the gnulib documentation (and similar one to wcscpy and wcsncpy): Thanks, that looks good. I too share an aversion to strncpy, and wish that I had not stirred up this hornet's nest by using it in a moment of weakness.
Re: new module 'strlcpy'
On 09/27/2017 11:29 PM, Dmitry Selyutin wrote: How about strscpy from the Linux kernel? Yes, that's a better API than strlcpy. Though I would change its ssize_t to ptrdiff_t, so that it depends only on stddef.h. Plus, I would define its behavior even if the buffers overlap - that's safer, and the function is for when safety is more important than performance or correctness.
Re: new module 'strlcpy'
Hi Jim, > I developed a strong aversion to strncpy, and wrote this about it: > https://meyering.net/crusade-to-eliminate-strncpy/ Thanks for your voice and past effort here. Here's doc I propose to add to the gnulib documentation (and similar one to wcscpy and wcsncpy): diff --git a/doc/posix-functions/strcpy.texi b/doc/posix-functions/strcpy.texi index 3289362..89c6cd3 100644 --- a/doc/posix-functions/strcpy.texi +++ b/doc/posix-functions/strcpy.texi @@ -17,3 +17,6 @@ OS X 10.8. Portability problems not fixed by Gnulib: @itemize @end itemize + +Note: @code{strcpy (dst, src)} is only safe to use when you can guarantee that +there are at least @code{strlen (src) + 1} bytes allocated at @code{dst}. diff --git a/doc/posix-functions/strncpy.texi b/doc/posix-functions/strncpy.texi index 3cc6b45..087acaf 100644 --- a/doc/posix-functions/strncpy.texi +++ b/doc/posix-functions/strncpy.texi @@ -17,3 +17,12 @@ OS X 10.8. Portability problems not fixed by Gnulib: @itemize @end itemize + +Note: This function was designed for the use-case of filling a fixed-size +record with a string, before writing it to a file. This function is +@strong{not} appropriate for copying a string into a bounded memory area, +because you have no guarantee that the result will be NUL-terminated. +Even if you add the NUL byte at the end yourself, this function is +inefficient (as it spends time clearing unused memory) and will allow +silent truncation occur, which is not a good behavior for GNU programs. +For more details, see @see{https://meyering.net/crusade-to-eliminate-strncpy/}.
Re: new module 'strlcpy'
On 09/28/2017 08:29 AM, Dmitry Selyutin wrote: > How about strscpy from the Linux kernel? > > https://www.kernel.org/doc/htmldocs/kernel-api/API-strscpy.html As an application library & programmer, I need this (thanks, Dmitry) *and* strlcpy. A gnulib module would reduce configure.ac code and several checks - and thus would be very welcome (thanks, Bruno). In production code strlcpy replaces snprintf with format string "%s" - the return value (length of source string) often comes in handy for malloc fallbacks. But I already stumbled into the side-effects of strlcpy in the past. Determining the length of the source is sometimes dangerous and may lead to unexpected CPU cycle waste. So this strscpy function seems to be a good choice if you are not interested in the source length. With Best Regards, Tim signature.asc Description: OpenPGP digital signature
Re: new module 'strlcpy'
How about strscpy from the Linux kernel? https://www.kernel.org/doc/htmldocs/kernel-api/API-strscpy.html 28 сент. 2017 г. 4:23 ДП пользователь "Paul Eggert" написал: > On 09/27/2017 05:35 PM, Bruno Haible wrote: > >> strlcpy with __warn_unused_result__ is the best solution for this task. >> > > No it's not, because strlcpy always computes the string length of the > source, and that is neither necessary nor desirable. Furthermore, in the > bad style of programming that uses strlcpy, it's often acceptable to ignore > the result since the programmer *wants* silent truncation. That is what > strlcpy means in OpenBSD etc., and we shouldn't be trying to reuse their > name for a function with different semantics. > > A better way to fix the test case is to remove its arbitrary limit on > month name lengths. That way, it won't need snprintf or strlcpy or strncpy > or anything like that. And it'll be following the GNU coding standards > better. I can propose a patch along those lines, if you like. I do not want > Gnulib to promote the use of strlcpy; that's a step in the wrong direction. > > If the same standards were set for test code >> than for application code, it would become even more tedious and >> boring to write unit tests than it already is. >> > In that case, snprintf suffices: maybe snprintf is not good enough for > production code, but it's plenty good for this test case. > > If you really want a function whose semantics are like strlcpy's but has > __warn_unused_result__ semantics (and while we're at it, also does not > count overlong sources, because that's silly), then I suppose we could > invent one and use it for Gnulib. But we should not call it strlcpy, and we > shouldn't use it in production code. > > >
Re: new module 'strlcpy'
On 09/27/2017 05:35 PM, Bruno Haible wrote: strlcpy with __warn_unused_result__ is the best solution for this task. No it's not, because strlcpy always computes the string length of the source, and that is neither necessary nor desirable. Furthermore, in the bad style of programming that uses strlcpy, it's often acceptable to ignore the result since the programmer *wants* silent truncation. That is what strlcpy means in OpenBSD etc., and we shouldn't be trying to reuse their name for a function with different semantics. A better way to fix the test case is to remove its arbitrary limit on month name lengths. That way, it won't need snprintf or strlcpy or strncpy or anything like that. And it'll be following the GNU coding standards better. I can propose a patch along those lines, if you like. I do not want Gnulib to promote the use of strlcpy; that's a step in the wrong direction. If the same standards were set for test code than for application code, it would become even more tedious and boring to write unit tests than it already is. In that case, snprintf suffices: maybe snprintf is not good enough for production code, but it's plenty good for this test case. If you really want a function whose semantics are like strlcpy's but has __warn_unused_result__ semantics (and while we're at it, also does not count overlong sources, because that's silly), then I suppose we could invent one and use it for Gnulib. But we should not call it strlcpy, and we shouldn't use it in production code.
Re: new module 'strlcpy'
On Wed, Sep 27, 2017 at 5:44 PM, Bruno Haible wrote: > I wrote: >>In some places the users >>will notice that strlcpy does not buy them much, compared to the >>"avoid arbitrary limits"[1] approach, and will switch over to what >>you call "GNU style". In other places, they will insert an abort() >>or assert() to handle the truncation case - which is *better* than >>the strncpy approach. > > For example, in gnulib's setlocale.c override. This file has fixed-size > buffers and silent truncation - because it uses the "strncpy and set NUL byte" > approach. As soon as someone (me) will use strlcpy with __warn_unused_result__ > there, he will change the code to do > - either dynamic allocation and no arbitrary limits, > - or provide a good alternative to the silent truncation. > Either result will be better than the current one. I have to agree that we need something, and it may even be spelled strlcpy. It seems reasonable to allow new uses of strlcpy, given a __warn_unused_result__ attribute -- that should prevent those new users from ignoring that return value. I developed a strong aversion to strncpy, and wrote this about it: https://meyering.net/crusade-to-eliminate-strncpy/
Re: new module 'strlcpy'
I wrote: >In some places the users >will notice that strlcpy does not buy them much, compared to the >"avoid arbitrary limits"[1] approach, and will switch over to what >you call "GNU style". In other places, they will insert an abort() >or assert() to handle the truncation case - which is *better* than >the strncpy approach. For example, in gnulib's setlocale.c override. This file has fixed-size buffers and silent truncation - because it uses the "strncpy and set NUL byte" approach. As soon as someone (me) will use strlcpy with __warn_unused_result__ there, he will change the code to do - either dynamic allocation and no arbitrary limits, - or provide a good alternative to the silent truncation. Either result will be better than the current one. Bruno
Re: new module 'strlcpy'
Hi Paul, > Anyway, strlcpy is overkill here, as snprintf does the job just as well > here Not at all. snprintf is not at the right level of abstraction. There are three levels of abstractions to consider: (1) C code which composes a memory region by writing to bytes individually. (2) C code which works with strings. (3) C code which does memory allocations, formatted output etc. Here the task is to copy a string to a bounded memory area. The level (1) - which is embodied by the "strncpy and set NUL byte" approach - is too low-level, because - it is easy to make mistakes at this level (off-by-1), - it does not encourage the user to think about truncation and how to handle it. The level (3) - which is embodied by the "snprintf %s" approach = is too high-level, because - it is overkill to parse a format string when all you want to do is copy a single string, - snprintf can call malloc() and can fail with ENOMEM; these are undesired outcomes here. The level (2) is right for the task, but the standardized functions (memcpy, strcpy, strncpy) are not up to the job: - memcpy because it requires too much preparation code, and is still vulnerable to off-by-1 mistakes, - strcpy because it may produce buffer overruns, - strncpy because it may produce silent truncation. strlcpy with __warn_unused_result__ is the best solution for this task. > I'd really rather not promote the use of strlcpy in GNU code, as it is > contrary to GNU style. 1) I do not promote "strlcpy" with the flaw of ignoring the return value. I promote "strlcpy with __warn_unused_result__", which is an entirely different beast, because it will make the users aware of the return value and of the silent truncation issue. In some places the users will notice that strlcpy does not buy them much, compared to the "avoid arbitrary limits"[1] approach, and will switch over to what you call "GNU style". In other places, they will insert an abort() or assert() to handle the truncation case - which is *better* than the strncpy approach. 2) You need to distinguish application code and test code. The GNU standards [1] request to avoid arbitrary limits for programs. It is completely OK to assume that month names are less than 100 bytes long, *in unit tests*. If the same standards were set for test code than for application code, it would become even more tedious and boring to write unit tests than it already is. Probably I should configure the Coverity scan of Gnulib such that it puts the test code in a different area, so that we can apply different filters and prioritizations to the tests, and so that we won't be troubled by "sloppy" style in the tests. > Plus, I'm not a fan of __warn_unused_result__; in > my experience it's too often more trouble than it's worth We have a module 'ignore-value' that deals with it; so the trouble you are mentioning must be a trouble in the past, not in the future. Bruno [1] https://www.gnu.org/prep/standards/html_node/Semantics.html
Re: new module 'strlcpy'
I'd really rather not promote the use of strlcpy in GNU code, as it is contrary to GNU style. Plus, I'm not a fan of __warn_unused_result__; in my experience it's too often more trouble than it's worth, and I expect this trouble would occur with strlcpy if we started using it. That being said, I take your point that I should not have used strncpy to pacify GCC in that test case: I plead guilty to being lazy because the test-case code already has arbitrary limits. Anyway, strlcpy is overkill here, as snprintf does the job just as well here, and is already standard and supported by Gnulib. So I propose the attached simpler patch instead. >From 2513ff5a2d0cb11edbca47d5ad6b1b5ae0c690fa Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Wed, 27 Sep 2017 16:16:43 -0700 Subject: [PATCH] duplocale-tests: use snprintf, not strncpy * modules/duplocale-tests (Depends-on): Add snprintf. * tests/test-duplocale.c (get_locale_dependent_values): Use snprintf instead of strncpy to avoid overrunning the fixed-size output buffer. --- ChangeLog | 8 modules/duplocale-tests | 1 + tests/test-duplocale.c | 4 ++-- 3 files changed, 11 insertions(+), 2 deletions(-) diff --git a/ChangeLog b/ChangeLog index 8a9bbe625..7aa6c03d7 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,11 @@ +2017-09-27 Paul Eggert + + duplocale-tests: use snprintf, not strncpy + * modules/duplocale-tests (Depends-on): Add snprintf. + * tests/test-duplocale.c (get_locale_dependent_values): + Use snprintf instead of strncpy to avoid overrunning + the fixed-size output buffer. + 2017-09-26 Bruno Haible uniname/uniname-tests: Tighten code. diff --git a/modules/duplocale-tests b/modules/duplocale-tests index baa9a6310..98f85b448 100644 --- a/modules/duplocale-tests +++ b/modules/duplocale-tests @@ -5,6 +5,7 @@ tests/macros.h Depends-on: langinfo +snprintf configure.ac: AC_CHECK_FUNCS_ONCE([duplocale uselocale strfmon_l snprintf_l nl_langinfo_l]) diff --git a/tests/test-duplocale.c b/tests/test-duplocale.c index f48fedf6a..c812f0b82 100644 --- a/tests/test-duplocale.c +++ b/tests/test-duplocale.c @@ -42,14 +42,14 @@ struct locale_dependent_values static void get_locale_dependent_values (struct locale_dependent_values *result) { + size_t ntime = sizeof result->time; strfmon (result->monetary, sizeof (result->monetary), "%n", 123.75); /* result->monetary is usually "$123.75" */ snprintf (result->numeric, sizeof (result->numeric), "%g", 3.5); /* result->numeric is usually "3,5" */ - strncpy (result->time, nl_langinfo (MON_1), sizeof result->time - 1); - result->time[sizeof result->time - 1] = '\0'; + ASSERT (snprintf (result->time, ntime, "%s", nl_langinfo (MON_1)) < ntime); /* result->time is usually "janvier" */ } -- 2.13.5