"Martin v. Löwis" wrote: > Serge Orlov wrote: > > To summarize the discussion: either it's a bug in glibc or there > is an >> option to specify modern POSIX locale. POSIX locale consist of >> characters from the portable character set, unicode is certainly >> portable. > > Yes, but U+00E4 is not in the portable character set. The portable > character set is defined here: > > http://www.opengroup.org/onlinepubs/007908799/xbd/charset.html
Thanks for the link. They write (in 1997 or earlier?):

    The wide-character value for each member of the Portable Character Set
    will equal its value when used as the lone character in an integer
    character constant. Wide-character codes for other characters are
    locale- and *implementation-dependent*.

Emphasis is mine. So how many libc implementations with non-Unicode
wide-character codes do we have in 2005? I'm really interested to know.

  Serge.
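
P.S. For anyone who wants to check the membership claim quoted above, here is a minimal sketch. It assumes the portable character set consists of the control characters NUL, BEL, BS, TAB, LF, VT, FF and CR, plus space and the printable ASCII characters, which is my reading of the table at the linked page. The names PORTABLE_CONTROLS and in_portable_charset are made up for illustration, and the test works on Python's Unicode code points, not on any libc's wchar_t values.

```python
# Assumption: portable character set = NUL, BEL, BS, TAB, LF, VT, FF, CR,
# plus space and the printable ASCII range 0x21..0x7E.
PORTABLE_CONTROLS = {0x00, 0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D}


def in_portable_charset(ch):
    """Return True if the single character ch is in the assumed set."""
    code = ord(ch)  # Unicode code point, not a libc wide-character code
    return code in PORTABLE_CONTROLS or 0x20 <= code <= 0x7E


# Report membership for a few sample characters, including U+00E4.
for ch in ("a", "~", "\n", "\u00e4"):
    print("U+%04X" % ord(ch), in_portable_charset(ch))
```

Under that assumption U+00E4 falls outside the set, so its wide-character code is one of the values the standard leaves to the locale and the implementation.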