Hi Stefan,

Stefan Sperling wrote on Tue, Mar 20, 2018 at 10:09:39AM +0100:
> On Tue, Mar 20, 2018 at 02:42:31AM +0100, Ingo Schwarze wrote:
>> So here is a rewrite of our setlocale(3) manual page. No idea why
>> i didn't do that earlier, there are lots of obvious issues with that
>> manual page.

> Thanks, I've read through it and it looks good.

Thanks for checking.

> I think we explicitly allow LC_MESSAGES to be set to any
> value which is accepted by LC_CTYPE, for compatibility with
> natural language support in ports. Maybe mention that as well?

My text was intended to imply that, but i admit it wasn't all that
clear. Patch updated to make the point explicit in the following
way, with no other changes:

  On OpenBSD, the only useful value for the category is LC_CTYPE.
  It sets the locale used for character encoding, character
  classification, and case conversion. For compatibility with
  natural language support in packages(7), all other categories -
  LC_COLLATE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and LC_TIME -
  can be set and retrieved, too, but their values are ignored by
  the OpenBSD C library. A category of LC_ALL sets the entire
  locale generically.

Yours,
  Ingo


Index: setlocale.3
===================================================================
RCS file: /cvs/src/lib/libc/locale/setlocale.3,v
retrieving revision 1.19
diff -u -r1.19 setlocale.3
--- setlocale.3 21 Sep 2015 14:46:59 -0000 1.19
+++ setlocale.3 20 Mar 2018 14:50:09 -0000
@@ -39,7 +39,7 @@
 .Sh NAME
 .Nm setlocale ,
 .Nm localeconv
-.Nd natural language formatting for C
+.Nd select character encoding
 .Sh SYNOPSIS
 .In locale.h
 .Ft char *
@@ -49,78 +49,74 @@
 .Sh DESCRIPTION
 The
 .Fn setlocale
-function sets the C library's notion
-of natural language formatting style
-for particular sets of routines.
-Each such style is called a
-.Dq locale
-and is invoked using an appropriate name passed as a C string.
-The
-.Fn localeconv
-routine returns the current locale's parameters
-for formatting numbers.
+function selects the given
+.Fa locale
+for the current process.
+The locale modifies the behaviour of some functions in the C library
+with respect to the character encoding, and on other operating systems
+also with respect to some language and cultural conventions.
+For more information about locales in general, see the
+.Xr locale 1
+manual page.
 .Pp
-The
-.Fn setlocale
-function recognizes several categories of routines.
-These are the categories and the sets of routines they select:
-.Bl -tag -width LC_MONETARY
-.It Dv LC_ALL
-Set the entire locale generically.
-.It Dv LC_COLLATE
-Set a locale for string collation routines.
-This controls alphabetic ordering in
-.Xr strcoll 3
-and
-.Xr strxfrm 3 .
-.It Dv LC_CTYPE
-Set a locale for the functions declared in
-.In ctype.h
-and
-.In wctype.h .
-This controls recognition of upper and lower case,
-alphabetic or non-alphabetic characters, and so on.
-.It Dv LC_MESSAGES
-Set a locale for message strings.
-Controls the behaviour of
-.Xr catopen 3
-and internationalization tools.
-.It Dv LC_MONETARY
-Set a locale for formatting monetary values;
-this affects the
-.Fn localeconv
-function.
-.It Dv LC_NUMERIC
-Set a locale for formatting numbers.
-This controls the formatting of decimal points
-in input and output of floating point numbers
-in functions such as
-.Xr printf 3
+On
+.Ox ,
+the only useful value for the
+.Fa category
+is
+.Dv LC_CTYPE .
+It sets the locale used for character encoding, character classification,
+and case conversion.
+For compatibility with natural language support in
+.Xr packages 7 ,
+all other categories \(em
+.Dv LC_COLLATE ,
+.Dv LC_MESSAGES ,
+.Dv LC_MONETARY ,
+.Dv LC_NUMERIC ,
 and
-.Xr scanf 3 ,
-as well as values returned by
-.Fn localeconv .
-.It Dv LC_TIME
-Set a locale for formatting dates and times using the
-.Xr strftime 3
-function.
-.El
+.Dv LC_TIME
+\(em can be set and retrieved, too, but their values are ignored by the
+.Ox
+C library.
+A category of
+.Dv LC_ALL
+sets the entire locale generically.
 .Pp
-Only three locales are defined by default,
-the empty string
-.Qq
-which denotes the native environment, and the
-.Qq C
-and
-.Qq POSIX
-locales, which denote the C language environment.
-A
+The syntax and semantics of the
+.Fa locale
+argument are not standardized and vary among operating systems.
+On
+.Ox ,
+if the
 .Fa locale
-argument of
-.Dv NULL
-causes
+string ends with
+.Qq ".UTF-8" ,
+the UTF-8 locale is selected; otherwise, the C/POSIX/ASCII locale
+is selected.
+If the
+.Fa locale
+contains a dot but does not end with
+.Qq ".UTF-8" ,
 .Fn setlocale
-to return the current locale.
+fails.
+.Pp
+If
+.Fa locale
+is an empty string
+.Pq Qq ,
+the value of the environment variables corresponding to
+.Fa category
+is used instead, as documented in the
+.Xr locale 1
+manual page.
+.Pp
+If
+.Fa locale
+is
+.Dv NULL ,
+the locale remains unchanged.
+.Pp
 By default, C programs start in the
 .Qq C
 locale.
@@ -130,39 +126,14 @@
 .Pp
 The
 .Fn localeconv
-function returns a pointer to a structure
-which provides parameters for formatting numbers,
-especially currency values:
-.Bd -literal -offset indent
-struct lconv {
-	char	*decimal_point;
-	char	*thousands_sep;
-	char	*grouping;
-	char	*int_curr_symbol;
-	char	*currency_symbol;
-	char	*mon_decimal_point;
-	char	*mon_thousands_sep;
-	char	*mon_grouping;
-	char	*positive_sign;
-	char	*negative_sign;
-	char	int_frac_digits;
-	char	frac_digits;
-	char	p_cs_precedes;
-	char	p_sep_by_space;
-	char	n_cs_precedes;
-	char	n_sep_by_space;
-	char	p_sign_posn;
-	char	n_sign_posn;
-	char	int_p_cs_precedes;
-	char	int_p_sep_by_space;
-	char	int_n_cs_precedes;
-	char	int_n_sep_by_space;
-	char	int_p_sign_posn;
-	char	int_n_sign_posn;
-};
-.Ed
+function returns a pointer to a static structure
+which provides parameters for formatting numbers.
+On
+.Ox ,
+nothing in the returned structure ever changes.
 .Pp
-The individual fields have the following meanings:
+It provides the following fields of type
+.Vt char * :
 .Bl -tag -width mon_decimal_point
 .It Fa decimal_point
 The decimal point character, except for currency values.
@@ -200,6 +171,11 @@
 .It Fa negative_sign
 The character used to denote negative currency values,
 usually a minus sign.
+.El
+.Pp
+It also provides the following fields of type
+.Vt char :
+.Bl -tag -width mon_decimal_point
 .It Fa int_frac_digits
 The number of digits after the decimal point
 in an international-style currency value.
@@ -279,38 +255,69 @@
 .Dv CHAR_MAX
 result similarly denotes an unavailable value.
 .Sh RETURN VALUES
-The
+In case of success,
+.Fn setlocale
+returns a pointer to a static string describing the locale
+that is in force after the call.
+Subsequent calls to
 .Fn setlocale
-function returns
-.Dv NULL
-and fails to change the locale
-if the given combination of
+may change the content of the string.
+The format of the string is not standardized and varies among
+operating systems.
+.Pp
+On
+.Ox ,
+if
+.Fn setlocale
+was never called with a
+.Pf non- Dv NULL
+.Fa locale
+argument, the string
+.Qq C
+is returned.
+Otherwise, if the
 .Fa category
-and
+was not
+.Dv LC_ALL
+or if the locale is the same for all categories, a copy of the
 .Fa locale
-makes no sense.
-The
-.Fn localeconv
-function returns a pointer to a static object
-which may be altered by later calls to
+argument is returned.
+Otherwise, the locales for the six categories
+.Dv LC_COLLATE ,
+.Dv LC_CTYPE ,
+.Dv LC_MESSAGES ,
+.Dv LC_MONETARY ,
+.Dv LC_NUMERIC ,
+.Dv LC_TIME
+are concatenated in that order, with slash
+.Pq Ql /
+characters in between.
+.Pp
+In case of failure,
 .Fn setlocale
-or
-.Fn localeconv .
-.\" .Sh FILES XXX
-.\" .Bl -tag -width /usr/share/locale/locale/category -compact XXX
-.\" .It Pa $PATH_LOCALE/\fIlocale\fP/\fIcategory\fP XXX
-.\" .It Pa /usr/share/locale/\fIlocale\fP/\fIcategory\fP XXX
-.\" locale file for the locale \fIlocale\fP XXX
-.\" and the category \fIcategory\fP. XXX
-.\" .El
-.Sh SEE ALSO
-.Xr mklocale 1 ,
-.Xr catopen 3 ,
-.Xr printf 3 ,
-.Xr scanf 3 ,
-.Xr strcoll 3 ,
-.Xr strftime 3 ,
-.Xr strxfrm 3
+returns
+.Dv NULL .
+On
+.Ox ,
+that can only happen if the
+.Fa category
+is invalid, if a character encoding other than UTF-8 is requested,
+if the requested
+.Fa locale
+name is of excessive length, or if memory allocation fails.
+.Sh EXAMPLES
+Calling
+.Pp
+.Dl setlocale(LC_CTYPE, \(dqen_US.UTF-8\(dq);
+.Pp
+at the beginning of a program selects the UTF-8 locale and returns
+.Qq en_US.UTF-8 .
+Calling
+.Pp
+.Dl setlocale(LC_ALL, NULL);
+.Pp
+right afterwards leaves the locale unchanged and returns
+.Qq C/en_US.UTF-8/C/C/C/C .
 .Sh STANDARDS
 The
 .Fn setlocale
@@ -325,27 +332,3 @@
 .Fn localeconv
 functions first appeared in
 .Bx 4.4 .
-.Sh BUGS
-The current implementation supports only the
-.Qq C
-and
-.Qq POSIX
-locales for all but the
-.Dv LC_CTYPE
-locale.
-.Pp
-In spite of the gnarly currency support in
-.Fn localeconv ,
-the standards don't include any functions
-for generalized currency formatting.
-.Pp
-.Dv LC_COLLATE
-does not make sense for many languages.
-Use of
-.Dv LC_MONETARY
-could lead to misleading results until we have a real time currency
-conversion function.
-.Dv LC_NUMERIC
-and
-.Dv LC_TIME
-are personal choices and should not be wrapped up with the other categories.
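

For illustration only, and not as part of the patch: the locale-name
rule that the new DESCRIPTION text states for OpenBSD can be modelled
as a small C function. This is a sketch of the documented rule, not
the libc code. The names classify_locale_name() and enum locale_kind
are made up for this example, and the empty string and NULL cases are
deliberately left out because the manual text handles them separately.
The second helper only wraps the standard setlocale(LC_ALL, NULL)
query.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; not part of any real API. */
enum locale_kind { LOCALE_C, LOCALE_UTF8, LOCALE_INVALID };

/*
 * Model of the rule in the patched DESCRIPTION (assumption: this mirrors
 * the manual text only, not the actual OpenBSD implementation):
 *  - name ends with ".UTF-8"          -> UTF-8 locale
 *  - name contains a dot otherwise    -> setlocale() would fail
 *  - anything else                    -> C/POSIX/ASCII locale
 * The empty string (environment lookup) and NULL (query) are separate
 * cases in the manual and are not modelled here.
 */
enum locale_kind
classify_locale_name(const char *name)
{
	static const char suffix[] = ".UTF-8";
	const size_t slen = sizeof(suffix) - 1;
	size_t len;

	assert(name != NULL);
	len = strlen(name);

	if (len >= slen && strcmp(name + len - slen, suffix) == 0)
		return LOCALE_UTF8;
	if (strchr(name, '.') != NULL)
		return LOCALE_INVALID;
	return LOCALE_C;
}

/*
 * Query the locale currently in force without changing it.
 * The returned string is static and may be overwritten by later
 * setlocale() calls; its format varies among operating systems.
 */
const char *
current_locale(void)
{
	return setlocale(LC_ALL, NULL);
}
```

With this model, "en_US.UTF-8" classifies as LOCALE_UTF8, "C" and
"POSIX" as LOCALE_C, and a name such as "de_DE.ISO8859-1" as
LOCALE_INVALID, which corresponds to the failure case listed under
RETURN VALUES for a character encoding other than UTF-8.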