> -----Original Message-----
> From: Keld Jørn Simonsen [mailto:[EMAIL PROTECTED]]
...
> > > If wchar_t is Unicode, then the C compiler will have the macro
> > > 
> > >          __STDC_ISO_10646__ An integer constant of the form  yyyymmL
> > 
> > On glibc-2.1.3 wchar_t is Unicode but the macro is not defined.
> > Moreover, if it's not Unicode then I see no good way to sensibly
> > convert between its encoding and Unicode anyway.
> 
> Well, my understanding is that the new glibc does not run Unicode
> but UCS-4. Unicode is inherently 16 bit - I hope they someday
> would step into the 32 bit world, but have seen no signs of it.

Maybe that's because you are not looking... ;-)

Unicode does have a preference for UTF-16.  However,
UTF-8 has long been an encoding form for Unicode as
well, and UTF-32 (UCS-4 bounded to the first 17
planes) is very much in the making, though not
formally part of the Unicode standard yet.  See
http://www.unicode.org/unicode/reports/tr19/.
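
To make "bounded to the first 17 planes" concrete, here is a
minimal sketch in C.  The helper name is_valid_utf32 and the
macro UTF32_MAX_CODE_POINT are made up for illustration; they
come from no library.  The surrogate check is my addition: it
reflects the rule that surrogate code points are not valid
UTF-32 code units.

```c
#include <stdbool.h>
#include <stdint.h>

/* Last code point of plane 16, i.e. the end of the 17 planes (0..16). */
#define UTF32_MAX_CODE_POINT 0x10FFFFu

/* Hypothetical helper: returns true if c is acceptable as a UTF-32
 * code unit.  Values above U+10FFFF can be written in UCS-4 but
 * fall outside the 17 planes, so they are rejected here. */
static bool is_valid_utf32(uint32_t c)
{
    if (c > UTF32_MAX_CODE_POINT)
        return false;   /* beyond plane 16 */
    if (c >= 0xD800u && c <= 0xDFFFu)
        return false;   /* surrogate range, reserved for UTF-16 */
    return true;
}
```

With this, is_valid_utf32(0x10FFFF) is true, while
is_valid_utf32(0x110000) is false.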

UCS-2 used to be an encoding form for Unicode (and at
the time the only one).  But it was long ago superseded
by UTF-16 as the preferred (but not only) encoding form.
UCS-4 was never an encoding form for Unicode, but UTF-32
is.  UTF-8 limited to the first 17 planes has been a
Unicode encoding form ever since Unicode moved from
UCS-2 to UTF-16, i.e. since Unicode 2.0.
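
The practical difference between UCS-2 and UTF-16 is the
surrogate pair, which is what lets 16-bit code units reach
all 17 planes.  A rough sketch in C, assuming a made-up
helper name utf16_encode (not taken from any library):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes code point cp as UTF-16 into out[].
 * Returns the number of 16-bit code units written (1 or 2),
 * or 0 if cp is a surrogate or lies beyond U+10FFFF. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFFu || (cp >= 0xD800u && cp <= 0xDFFFu))
        return 0;                       /* not encodable */

    if (cp < 0x10000u) {
        out[0] = (uint16_t)cp;          /* plane 0: same as UCS-2 */
        return 1;
    }

    cp -= 0x10000u;                     /* 20 bits remain */
    out[0] = (uint16_t)(0xD800u + (cp >> 10));     /* high surrogate */
    out[1] = (uint16_t)(0xDC00u + (cp & 0x3FFu));  /* low surrogate */
    return 2;
}
```

For example, U+1D11E comes out as the pair 0xD834 0xDD1E,
while anything in plane 0 stays a single unit, exactly as
it was in UCS-2.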

                /kent k
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/
