Module Name:	src
Committed By:	kre
Date:		Wed Aug  7 15:40:03 UTC 2024

Modified Files:
	src/usr.bin/printf: printf.c

Log Message:
Correctly handle extracting wide chars from empty strings.

Fix a bug (one that would probably rarely have been seen) which I
installed yesterday.

It turns out that mbtowc() needs the terminating \0 to be included in
the length arg passed to it, or it errors (EILSEQ) on a zero length,
instead of doing the sane thing and treating that the same as "\0"
(which is treated as being length 1).

So, increase the length passed to mbtowc() by 1. That makes no
difference in the typical case: the length is only an upper limit on
the number of bytes to examine, and mbtowc() stops after it has
converted 1 character, so in the non "" input cases, nothing that
matters changes.

The rest of this you can skip if you like, it is not directly related
to this change...

Note: it is not clear to me what is correct here. POSIX looks to be
ambiguous, or strange anyway; in the RETURN VALUE section it says:

	If s is not a null pointer, mbtowc() shall either return 0
	(if s points to the null byte), or return the number of
	bytes [...]

Further, for the error possibilities it says:

	[EILSEQ] An invalid character sequence is detected. In the
	POSIX locale an [EILSEQ] error cannot occur since all byte
	values are valid characters.

On the other hand our mbtowc(3) says:

	There are special cases:

	n == 0
		In this case, the first n bytes of the array pointed
		to by s never form a complete character. Thus, the
		mbtowc() always fails.

Since EILSEQ is the only defined error for mbtowc() in POSIX, and
cannot happen (according to it) in the POSIX locale, that "always
fails" in our manual page looks dubious.

What actually happens in our mbtowc() in the POSIX locale is that, if
passed n == 0 (and *s == '\0'), mbtowc() returns 0 (that's good) but
also sets errno to EILSEQ (not so good, though this is not one of the
functions guaranteed to not alter errno if it doesn't fail).

In other locales it returns -1 (with errno == EILSEQ) when n == 0.
(Well, in some other locales anyway, I didn't go and test all of them.)

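The length adjustment and the n == 0 observation above can be sketched roughly as follows. This is only an illustration, not the code in printf.c: first_wchar() and probe_zero_length() are hypothetical names invented here, and what the probe reports depends on the implementation and locale, so no particular result is assumed for it.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not taken from printf.c): decode the first
 * character of the NUL-terminated string s into *out.
 * Returns whatever mbtowc() returns: 0 when s is "", -1 on error,
 * otherwise the number of bytes consumed.
 */
static int
first_wchar(const char *s, wchar_t *out)
{
	assert(s != NULL && out != NULL);

	/* Reset any shift state left over from an earlier conversion. */
	(void)mbtowc(NULL, NULL, 0);

	/*
	 * The "+ 1" lets mbtowc() see the terminating \0, so "" is
	 * examined with n == 1 rather than n == 0.  For longer strings
	 * it only raises the upper limit; mbtowc() still stops after
	 * converting one character.
	 */
	return mbtowc(out, s, strlen(s) + 1);
}

/*
 * Hypothetical probe: call mbtowc() with n == 0 and report both the
 * return value and the errno it leaves behind.  The outcome varies
 * between implementations and locales.
 */
static int
probe_zero_length(const char *s, int *err)
{
	wchar_t wc = L'\0';
	int r;

	(void)mbtowc(NULL, NULL, 0);
	errno = 0;
	r = mbtowc(&wc, s, 0);
	*err = errno;
	return r;
}
```

Calling probe_zero_length("", &err) after a setlocale() call for the locale of interest is one way to repeat the observation described above; first_wchar("", &wc) shows the behaviour with the terminating \0 counted.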
Where POSIX gets weird is that earlier it says:

	At most n bytes of the array pointed to by s shall be examined.

If n == 0, then no bytes can be examined. In that case mbtowc() cannot
test whether s points to the null byte, even in the POSIX locale. So it
is unclear (to me) what should be returned in that case.


To generate a diff of this commit:
cvs rdiff -u -r1.57 -r1.58 src/usr.bin/printf/printf.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.