Tomohiro KUBOTA wrote on 2001-12-27 07:05 UTC:
> > http://www.cl.cam.ac.uk/~mgk25/ucs/langinfo.c
> > http://www.cl.cam.ac.uk/~mgk25/ucs/langinfo.h
>
> A good work.  Bruno's libcharset is also available for this purpose.

There were GNU IPR concerns with Bruno's code, whereas my little hacks
are all "public domain -- share and enjoy".

> Debian GNU/Linux "locales" package includes various pairs of locale
> and encoding.  You may want to include them.  Especially, "TIS-620"
> for "th" would be needed.  (If you want, I can send you the file.)

I added all those you mentioned, but also note that we are not
interested in Debian/Linux here at all, because it already has a fully
operational nl_langinfo(CODESET). We only worry about legacy systems
like *BSD, which don't have it yet.

> Some people may use alias names of locale, such as "german" for de_DE
> and "french" for fr_FR.  Are there any way to manage these cases?

Are you really talking about existing practice on systems without
nl_langinfo()?

> I have ever heard that the default encoding for Japanese locale on
> some proprietary Unix is Shift_JIS, not EUC-JP.

I know about HP-UX, but I think that has nl_langinfo() anyway, so again
no problem here. Please correct me if I'm wrong.

NEWS: I have also just written a normalization function for the output
of nl_langinfo(CODESET) on different platforms. I have tested it only on
Linux and Solaris so far, so please try it on any other platform you
have access to. Test instructions are in the source code.

  http://www.cl.cam.ac.uk/~mgk25/ucs/norm_charmap.c

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
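
For readers who want to see the general idea behind such a fallback on
systems without nl_langinfo(CODESET), here is a minimal sketch. It is
not the code from langinfo.c; the function names codeset_from_locale()
and guess_codeset() are hypothetical, and the small default table is an
assumption for illustration only ("TIS-620" for "th" comes from the
discussion above, the other entries are guesses that a real system may
contradict, e.g. Shift_JIS instead of EUC-JP for "ja"). Alias names such
as "german" are deliberately not handled and fall through to the
default.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative defaults only (assumption): language prefix -> encoding
 * used when the locale name carries no explicit ".codeset" part. */
static const struct { const char *lang; const char *codeset; } defaults[] = {
    { "th", "TIS-620" },     /* mentioned in the thread */
    { "ja", "EUC-JP" },      /* some systems use Shift_JIS instead */
    { "de", "ISO-8859-1" },  /* assumed */
    { "fr", "ISO-8859-1" },  /* assumed */
};

/* Hypothetical helper: derive a codeset name from a locale string such
 * as "de_DE.ISO-8859-15@euro" or "th_TH". Writes into buf, returns buf. */
const char *codeset_from_locale(const char *locale, char *buf, size_t buflen)
{
    const char *dot;
    size_t i, n;

    if (buflen == 0)
        return buf;
    buf[0] = '\0';

    /* Unset, empty, "C" and "POSIX" locales: assume plain ASCII. */
    if (locale == NULL || *locale == '\0'
        || strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0) {
        snprintf(buf, buflen, "%s", "US-ASCII");
        return buf;
    }

    /* Explicit codeset: the text between '.' and an optional '@'. */
    dot = strchr(locale, '.');
    if (dot != NULL && dot[1] != '\0' && dot[1] != '@') {
        n = strcspn(dot + 1, "@");
        if (n >= buflen)
            n = buflen - 1;
        memcpy(buf, dot + 1, n);
        buf[n] = '\0';
        return buf;
    }

    /* No codeset given: match the language prefix against the table. */
    for (i = 0; i < sizeof defaults / sizeof defaults[0]; i++) {
        n = strlen(defaults[i].lang);
        if (strncmp(locale, defaults[i].lang, n) == 0
            && (locale[n] == '\0' || locale[n] == '_'
                || locale[n] == '.' || locale[n] == '@')) {
            snprintf(buf, buflen, "%s", defaults[i].codeset);
            return buf;
        }
    }

    /* Unknown locale name (including aliases like "german"). */
    snprintf(buf, buflen, "%s", "US-ASCII");
    return buf;
}

/* Hypothetical wrapper: consult LC_ALL, LC_CTYPE, LANG in that order. */
const char *guess_codeset(char *buf, size_t buflen)
{
    const char *l = getenv("LC_ALL");
    if (l == NULL || *l == '\0')
        l = getenv("LC_CTYPE");
    if (l == NULL || *l == '\0')
        l = getenv("LANG");
    return codeset_from_locale(l, buf, buflen);
}
```

A caller would pass a buffer, e.g. guess_codeset(buf, sizeof buf), and
would still want to run the result through a name normalization step,
since an explicit codeset such as "eucJP" is returned exactly as it is
spelled in the locale name.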