On 03/29/2018 10:56 AM, Ingo Schwarze wrote:
Hi Karl,

Karl Williamson wrote on Mon, Mar 19, 2018 at 11:51:31AM -0600:

I still believe that in my program the setlocale() returning C for
LC_ALL is a bug.  LC_CTYPE should have successfully been set to Romanian
UTF-8, and so LC_ALL isn't C.  Instead, it is a combination of C for all
the other categories, and UTF-8 for LC_CTYPE.  A return of just "C"
doesn't reflect that complexity.   There is a footnote in the ANSI/ISO
9899-1990 C standard that the returned string must support that
heterogeneity, and that the return value be able to be used in a future
setlocale to get back to the original state.  Your setlocale violates
the standard therefore, and harms your portability goal.

Thanks again for reporting this bug, i fixed it with the commit below.

(And to be
consistent, LC_ALL should have been the heterogenous composition of all
the locale strings that the program sets, even if they just boil down to
C or UTF-8.  Programs that run on other OS's are expecting this, and
again your portability goal is compromised)

I fear i still don't understand what you mean here.
Is that part also fixed with my commit?
If not, or if you are not sure (because you lack a test system),
can you state as precisely as possibly which call, that currently
has which effect and returns which value should have which other
effect and return which other value instead?

Your setlocale is a stub, but it should act consistently.
If I set LC_CTYPE to foo, and LC_COLLATE to bar, and then request setlocale(LC_CTYPE, NULL), it returns 'foo', and setlocale(LC_COLLATE, NULL), it returns 'bar', even though that's not really true. I understand why you do that, but I claim that you should maintain consistency when asking for
setlocale(LC_ALL, NULL)
Currently it just says 'C', but code from other systems would be expecting it to act as if the previous calls actually took effect.
In Linux it would return
LC_CTYPE=foo;LC_COLLATE=bar;LC_NUMERIC=C...
And I think you should do something similar, so that what gets returned by LC_ALL reflects what previous individual setlocale's have done.

Right now, if you set LC_CTYPE to foo, then call
const char * save=setlocale(LC_ALL, NULL);
 to save things, then change LC_CTYPE to baz, then call
setlocale(LC_ALL, save)
printf("%s\n", setlocale(LC_CTYPE,NULL)
I don't believe it will be foo because your LC_ALL returned a value that didn't contain 'foo'.

A stub needs to sufficiently emulate the real thing so that you don't get inconsistent results.

If this isn't clear, let me know.

I urge you to update your man page.  If it had set out what you've
stated in your replies, it would have saved our project a bunch of hours
of work.

Sorry for that.  We do regard bugs in manuals as about as important
as bugs in the implementation, but just as with code, it can't be
helped that sometimes, bugs and omissions happen anyway.

Yours,
   Ingo


CVSROOT:        /cvs
Module name:    src
Changes by:     schwa...@cvs.openbsd.org        2018/03/29 10:34:25

Modified files:
        lib/libc/locale: setlocale.c
        regress/lib/libc/locale/setlocale: setlocale.c

Log message:
Fix three bugs in setlocale(3):
1. setlocale(LC_ALL, "A"); setlocale(LC_CTYPE, "T"); setlocale(LC_ALL, NULL);
must return "A/T/A/A/A/A", not "A".  Fix this by always initializing the
LC_ALL entry of newgl to "" in dupgl().  Reported by Karl Williamson
<public at khwilliamson dot com> on bugs@, thanks!
2. Do not leak newgl when strdup(3) fails in setlocale(3).
3. For setlocale(LC_ALL, "C/C/fr_FR.UTF-8/C/C/C"); correctly set
_GlobalRuneLocale; i found 2. and 3. while looking at the code.
Feedback on a buggy earlier version and OK martijn@.


Index: lib/libc/locale/setlocale.c
===================================================================
RCS file: /cvs/src/lib/libc/locale/setlocale.c,v
retrieving revision 1.27
diff -u -p -r1.27 setlocale.c
--- lib/libc/locale/setlocale.c 5 Sep 2017 03:16:13 -0000       1.27
+++ lib/libc/locale/setlocale.c 29 Mar 2018 16:32:53 -0000
@@ -42,8 +42,8 @@ dupgl(char **oldgl)
        if ((newgl = calloc(_LC_LAST, sizeof(*newgl))) == NULL) 
                return NULL;
        for (ic = LC_ALL; ic < _LC_LAST; ic++) {
-               if ((newgl[ic] = strdup(oldgl != NULL ?
-                   oldgl[ic] : ic == LC_ALL ? "" : "C")) == NULL) {
+               if ((newgl[ic] = strdup(ic == LC_ALL ? "" :
+                   oldgl == NULL ? "C" : oldgl[ic])) == NULL) {
                        freegl(newgl);
                        return NULL;
                }
@@ -92,8 +92,10 @@ setlocale(int category, const char *locn
                if (category == LC_ALL && strchr(locname, '/') != NULL) {
/* One value for each category. */
-                       if ((firstname = strdup(locname)) == NULL)
+                       if ((firstname = strdup(locname)) == NULL) {
+                               freegl(newgl);
                                return NULL;
+                       }
                        nextname = firstname;
                        for (ic = 1; ic < _LC_LAST; ic++)
                                if (nextname == NULL || changegl(ic,
@@ -184,7 +186,7 @@ done:
                        global_locale = newgl;
                        if (category == LC_ALL || category == LC_CTYPE)
                                _GlobalRuneLocale =
-                                   strchr(global_locname, '.') == NULL ?
+                                   strchr(newgl[LC_CTYPE], '.') == NULL ?
                                    &_DefaultRuneLocale : _Utf8RuneLocale;
                } else {
                        freegl(newgl);
Index: regress/lib/libc/locale/setlocale/setlocale.c
===================================================================
RCS file: /cvs/src/regress/lib/libc/locale/setlocale/setlocale.c,v
retrieving revision 1.3
diff -u -p -r1.3 setlocale.c
--- regress/lib/libc/locale/setlocale/setlocale.c       25 Feb 2017 07:28:32 
-0000      1.3
+++ regress/lib/libc/locale/setlocale/setlocale.c       29 Mar 2018 16:32:53 
-0000
@@ -75,7 +75,6 @@ main(int argc, char *argv[])
/* load from env */
        /* NOTE: we don't support non-C locales for some categories */
-       /*test_setlocale("fr_FR.UTF-8", LC_ALL, "");*/ /* set */
        test_setlocale("fr_FR.UTF-8", LC_CTYPE, ""); /* set */
        test_setlocale("fr_FR.UTF-8", LC_MESSAGES, ""); /* set */
        test_MB_CUR_MAX(4);
@@ -113,6 +112,7 @@ main(int argc, char *argv[])
        test_setlocale("C", LC_ALL, "C"); /* reset */
        test_setlocale("invalid.UTF-8", LC_CTYPE, "invalid.UTF-8"); /* set */
        test_setlocale("invalid.UTF-8", LC_CTYPE, NULL);
+       test_setlocale("C/invalid.UTF-8/C/C/C/C", LC_ALL, NULL);
        test_MB_CUR_MAX(4);
/* with invalid codeset (is an error) */


Reply via email to