On Wed, 27 Nov 2002, Maiorana, Jason wrote:

> >is variant of what. If you really want a particular variant, go look
> >in extension B, or in the upcoming extension C...
>
> These may be difficult to use as font file formats, such as true type,
> have a fixed ucs-2 internal encoding in the cmap, so they have
> difficulty representing beyond-bmp characters.

No, TrueType doesn't. TTFs/OTFs can have various cmap formats
(including a 32-bit cmap), depending on the coverage of characters.
See, for instance,
<http://developer.apple.com/fonts/TTRefMan/RM06/Chap6cmap.html>

> Does anybody here have a system where beyond-BMP glyphs work well
> with? (Input Servers, font display, titlebars, etc)

glibc uses UTF-32 as wchar_t, and we don't use UCS-2 locales but UTF-8
locales. FreeType 2, Xft, fontconfig, and Pango were written from the
beginning with this in mind. I think most Linux programs don't suffer
from what used to be a common misunderstanding (I hope it no longer is)
that Unicode is a 16-bit character set, unless they rely on the old X11
core fonts (BDF). Fortunately, XFree86 has been moving toward giving up
X11 core fonts entirely and using client-side fonts (Xft, fontconfig),
which makes font handling much easier.

> >Also lurking in the
> >wings are "variant selectors", anticipating more variants, but
> >that they should not be given separate characters, but use
> >"variant selectors" instead. Finally, the Unicode consortium
> >has started pondering on "normalisation tailoring", since some
> >find the canonical mappings of some Han characters "unhelpful".
>
> There are no Han variations yet, afaik. I think that for unified
> characters which have significantly different orthography, there
> could easily by a pair of non-unified codepoints which were more
> specific. Thats certainly better that "variant selectors" which
> are destined to be poorly supported if ever.

You would be surprised to know how many Chinese characters have been
dumped into the IRG's queue. A lot of them are clearly variants of
characters already encoded. IMHO, the IRG should have taken advantage
of variation selectors earlier to avoid encoding too many variants in
Ext B.

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
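
P.S. To make the "Unicode is not a 16-bit character set" point above concrete, here is a minimal Python sketch. It is only an illustration, not how any of the libraries mentioned above are implemented. It assumes U+20000 (the first code point of CJK Extension B) as the sample character, and the helper name needs_32bit_cmap is hypothetical. The cmap format numbers in the comments (4 for 16-bit, BMP-only coverage; 12 for 32-bit coverage) come from the TrueType/OpenType specifications rather than from this thread.

```python
# Sample beyond-BMP character (assumption: U+20000, start of CJK Ext B).
ch = "\U00020000"

# The same code point in the three Unicode encoding forms.
utf8 = ch.encode("utf-8")       # four bytes in UTF-8
utf16 = ch.encode("utf-16-be")  # a surrogate pair, i.e. two 16-bit units
utf32 = ch.encode("utf-32-be")  # one 32-bit unit, as in glibc's wchar_t


def needs_32bit_cmap(code_points):
    """Hypothetical helper: True if any code point lies above U+FFFF.

    A font covering such code points cannot describe them with a
    16-bit cmap subtable (e.g. format 4) alone; it needs a 32-bit
    one (e.g. format 12).
    """
    return any(cp > 0xFFFF for cp in code_points)


print(utf8.hex())    # f0a08080
print(utf16.hex())   # d840dc00
print(utf32.hex())   # 00020000
print(needs_32bit_cmap([0x4E00, 0x20000]))  # True
```

A UCS-2-only program would see the two 16-bit units D840 DC00 as two unrelated (and invalid) characters, which is the kind of breakage that UTF-8 locales and a 32-bit wchar_t avoid.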