Tomohiro KUBOTA wrote on 2001-12-27 07:05 UTC:
> >   http://www.cl.cam.ac.uk/~mgk25/ucs/langinfo.c
> >   http://www.cl.cam.ac.uk/~mgk25/ucs/langinfo.h
>
> A good work.  Bruno's libcharset is also available for this purpose.

There were GNU IPR concerns with Bruno's code, whereas my little hacks
are all "public domain -- share and enjoy".

> Debian GNU/Linux "locales" package includes various pairs of locale
> and encoding.  You may want to include them.  Especially, "TIS-620"
> for "th" would be needed.  (If you want, I can send you the file.)

I added all those you mentioned, but also note that we are not
interested in Debian GNU/Linux here at all, because it already has
a fully operational nl_langinfo(CODESET). We only worry about
legacy systems like *BSD, which don't have it yet.
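
For illustration only, here is a minimal sketch of the kind of fallback
discussed above for systems without nl_langinfo(CODESET): guess the
encoding from the locale name. It is not the contents of langinfo.c.
The function name guess_codeset(), the table entries, and the
"US-ASCII" default are all assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table: language prefix -> assumed default encoding.
 * A real table would be much longer and system-specific. */
static const struct {
    const char *prefix;
    const char *codeset;
} guess_table[] = {
    { "th", "TIS-620"    },
    { "ja", "EUC-JP"     },
    { "de", "ISO-8859-1" },
    { "fr", "ISO-8859-1" },
};

/* Guess the codeset for a locale name such as "th_TH" or
 * "ja_JP.eucJP@cjk".  buf receives an explicit ".codeset" part if
 * one is present.  Returns NULL only on invalid arguments. */
const char *guess_codeset(const char *locale, char *buf, size_t buflen)
{
    const char *dot;
    size_t i, n;

    if (locale == NULL || buf == NULL || buflen == 0)
        return NULL;

    /* 1. An explicit ".codeset" part wins; stop at any "@modifier". */
    dot = strchr(locale, '.');
    if (dot != NULL && dot[1] != '\0' && dot[1] != '@') {
        n = strcspn(dot + 1, "@");
        if (n >= buflen)
            n = buflen - 1;
        memcpy(buf, dot + 1, n);
        buf[n] = '\0';
        return buf;
    }

    /* 2. Otherwise look up the language prefix in the table. */
    for (i = 0; i < sizeof guess_table / sizeof guess_table[0]; i++) {
        n = strlen(guess_table[i].prefix);
        if (strncmp(locale, guess_table[i].prefix, n) == 0 &&
            (locale[n] == '\0' || locale[n] == '_' || locale[n] == '@'))
            return guess_table[i].codeset;
    }

    /* 3. Unknown locale (including "C" and "POSIX"): assumed default. */
    return "US-ASCII";
}
```

Alias names such as "german" are not covered by this sketch and fall
through to the default.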

> Some people may use alias names of locale, such as "german" for de_DE
> and "french" for fr_FR.  Are there any way to manage these cases?

Are you really talking about existing practice on systems without
nl_langinfo()?

> I have ever heard that the default encoding for Japanese locale on
> some proprietary Unix is Shift_JIS, not EUC-JP.

I know about HP-UX, but I think that has nl_langinfo() anyway, so again
no problem here. Please correct me if I'm wrong.

NEWS:

I have also just written a normalization function for the output of
nl_langinfo(CODESET) on different platforms. I have tested it only on
Linux and Solaris so far, so please try it on any other platform you
have access to. Test instructions are in the source code.

  http://www.cl.cam.ac.uk/~mgk25/ucs/norm_charmap.c
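
To show what such a normalization involves, here is a rough sketch. It
is not the code in norm_charmap.c. The names same_charset_name() and
normalize_codeset(), and the short list of canonical names, are
assumptions chosen for this example. The idea is to compare names while
ignoring case and punctuation, so that spellings like "utf8",
"ISO8859-1" and "eucJP" map onto one canonical form.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if a and b are equal when case and all non-alphanumeric
 * characters are ignored, 0 otherwise. */
static int same_charset_name(const char *a, const char *b)
{
    for (;;) {
        while (*a != '\0' && !isalnum((unsigned char)*a))
            a++;
        while (*b != '\0' && !isalnum((unsigned char)*b))
            b++;
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        if (*a == '\0')
            return 1;   /* both strings ended together */
        a++;
        b++;
    }
}

/* Hypothetical list of canonical spellings; a real one would be longer
 * and would also need explicit aliases for unrelated-looking names. */
static const char *const canonical_names[] = {
    "UTF-8", "ISO-8859-1", "ISO-8859-15", "EUC-JP", "TIS-620", "US-ASCII"
};

/* Map a platform-specific codeset name to a canonical spelling.
 * Unknown names are returned unchanged. */
const char *normalize_codeset(const char *name)
{
    size_t i;

    if (name == NULL)
        return NULL;
    for (i = 0; i < sizeof canonical_names / sizeof canonical_names[0]; i++)
        if (same_charset_name(name, canonical_names[i]))
            return canonical_names[i];
    return name;
}
```

On a system that has it, the input would typically be the result of
nl_langinfo(CODESET) after a call to setlocale(LC_CTYPE, "").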

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
