Hi, Egmont,

The example from Markus' page that you quote is source code written in ASCII that contains a C-style static string in UTF-8. There is no problem with this code!
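To make that concrete, here is a minimal sketch of what I mean (my own illustration, not code from Markus' page). Every identifier is plain ASCII, and the only non-ASCII bytes sit inside a string literal. The names greeting and utf8_length are hypothetical, and the sketch assumes the file is saved as UTF-8 and that the compiler copies the literal's bytes through unchanged:

```c
#include <stddef.h>

/* Hypothetical message string: the identifier is ASCII, only the
 * contents of the literal are UTF-8 (assumes a UTF-8 source file). */
static const char greeting[] = "Schöne Grüße";

/* Hypothetical helper: count characters (code points) in a UTF-8
 * string by skipping continuation bytes, which look like 10xxxxxx. */
size_t utf8_length(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

Under those assumptions the literal occupies 15 bytes plus the terminating NUL, while utf8_length(greeting) counts 12 characters, because ö, ü and ß each take two bytes. Passing greeting to puts() on a UTF-8 terminal should display the text as written.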
However, if you try to write some code like this:

    void Écrire(const char *myCString);  // Function name has Latin-1 chars *in UTF-8 encoding*
    void åå(const char *myCString);      // Function name has Chinese chars *in UTF-8 encoding*

... instead of:

    void myWriteFunction(const char *myCString);  // Function name *limited to basic ASCII Latin*

... THEN you will get into trouble, not only with GCC but probably with other compilers as well.

So:

1. Keep your code --all parts of it that are actually parsed by the compiler-- limited to ASCII. (Most people suggest that the code be in English, with English comments, for world-wide comprehension.)

2. Although the strings in your program can be in any encoding you want, UTF-8 certainly makes the most sense.

I have real-life production code containing message strings encoded in UTF-8 that compiles and executes just fine on numerous platforms. I have never had a problem with this code with either GCC or Intel's ICC on Linux, GCC on other Free *Nix platforms like FreeBSD and OpenBSD, or Sun's Forte compiler on Solaris 8.

I *never* use special compiler #pragmas, nor do I resort to wide-character (wchar_t) strings. I always just use UTF-8 encoding in simple C-style "char *" strings or, for C++ code, in the standard C++ std::string class.

- Ed Trager

On Friday 2004.11.12 18:45:08 +0100, Egmont Koblinger wrote:
> Hi,
>
> I was reading Markus's page and found the example:
>   printf("%ls\n", L"Schöne Grüße");
> and noticed that gcc always interprets the source code according to Latin-1.
>
> Then I googled a bit and found this reported to the gcc folks by Markus:
> http://sources.redhat.com/ml/libc-alpha/2000-09/msg00337.html
>
> However, this happened four years ago, and I haven't found more recent
> pieces of information on this topic.
>
> So my questions:
>
> - Is there a proper solution where I can write my source code in UTF-8?
>   I have linux with gcc 3.3.4 and it's not necessary for the code to be
>   portable to older or different systems.
>
> - Some people were discussing a cpp #pragma charset. Is it already
>   implemented? If yes, where can I find docs about it?
>
> - Does recompiling gcc with --enable-c-mbchar solve this issue? Will gcc
>   then honour my locale settings? Is it a stable, ready-for-production-use
>   option of gcc?
>
> - Are there any applications which are known to miscompile with a c-mbchar
>   gcc if I have a non-Latin1 (e.g. Latin-2 or UTF-8) locale settings?
>
> thanks,
> Egmont
>
> --
> Linux-UTF8: i18n of Linux on all levels
> Archive: http://mail.nl.linux.org/linux-utf8/