On 2015-10-22 20:11:15 +0000, Joseph Myers wrote:
> LC_CTYPE should affect the interpretation of multibyte character sequences
> as characters, including on output. That's the standard semantics.

That's only the recommended default behavior. There are many contexts
where different charset information is provided. Other than that, LC_CTYPE
is assumed to correspond to the charset of the terminal.

> That's what all C library functions involving interpretation of multibyte
> character sequences do. Straightforward use of POSIX library interfaces
> does not support producing output in a character set other than that
> specified with LC_CTYPE; e.g. printf expects a format string (possibly
> resulting from a message catalog) in the LC_CTYPE character set, and does
> not convert the bytes to another character set. [...]

Only once setlocale() has been called with appropriate arguments. A C
program is free to use locales other than those declared by the LC_*
environment variables when this makes sense.

> Again, LC_CTYPE does *not* affect source file interpretation. [...]
> You could write your "c99" program wrapper to add a -finput-charset=
> option based on the locale's character set if you so wish (it also needs
> to do things such as option reordering and handling -O with separate
> argument - the "gcc" driver deliberately processes -D and -U options in
> the order they appear on the command line, not following the POSIX rule
> that -U options take precedence over -D - so you should not expect the
> "gcc" driver to be usable as "c99" without such adaptation for deliberate
> differences).
>
> I think we should clearly update the documentation to reflect reality
> regarding source file encoding, and leave it strictly for wrappers such as
> "c99" to specify -finput-charset= options rather than leaving open the
> possibility that GCC's own default might change in future.

The documentation should also say whether LC_CTYPE affects the
command-line arguments (e.g. macro values via -D) and in what way it
affects the output (e.g. messages and the output of "gcc -E").
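To make the setlocale() point and the wrapper suggestion concrete, here is a minimal sketch of how a "c99"-style wrapper might derive a -finput-charset= option from a locale. The helper name input_charset_option is hypothetical and this is not how GCC or any existing wrapper does it. It also assumes that the codeset name reported by nl_langinfo(CODESET) is one that GCC's iconv-based conversion accepts, which may not hold on every system.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper (not part of GCC or POSIX): writes
   "-finput-charset=<codeset>" into buf for the given locale name.
   An empty locale_name ("") means "use the LC_* environment variables".
   Until setlocale() is called, a C program runs in the "C" locale,
   whatever LC_CTYPE says in the environment.
   Returns 0 on success, -1 if the locale is unavailable or buf is too small. */
int input_charset_option(const char *locale_name, char *buf, size_t size)
{
    /* Select the locale explicitly; the program chooses it, not the environment. */
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;

    /* POSIX: name of the character encoding of the current LC_CTYPE locale. */
    const char *codeset = nl_langinfo(CODESET);

    int n = snprintf(buf, size, "-finput-charset=%s", codeset);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

A wrapper would call input_charset_option("", buf, sizeof buf) to follow the user's environment, then pass buf to the compiler along with the reordered options. Passing a locale name such as "C" instead shows that the program, not the environment, decides which locale is in effect.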
-- 
Vincent Lefèvre <vinc...@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)