On Wed, 27 Nov 2002, Maiorana, Jason wrote:

> >is variant of what.  If you really want a particular variant, go look
> >in extension B, or in the upcoming extension C...  
> 
> These may be difficult to use as font file formats, such as true type,
> have a fixed ucs-2 internal encoding in the cmap, so they have
> difficulty representing beyond-bmp characters.

  No, TrueType has no such limitation. TTFs/OTFs can carry various cmap
formats (including 32-bit cmaps), depending on which characters they
cover.  See, for instance,
<http://developer.apple.com/fonts/TTRefMan/RM06/Chap6cmap.html>
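
  To make the point concrete, here is a minimal sketch in Python. It
assumes only the two Unicode cmap subtable formats most commonly seen:
format 4 (16-bit segment mapping) and format 12 (32-bit segmented
coverage). The helper name cmap_format_for is hypothetical and not part of
any font library; real fonts may carry other formats as well.

```python
# Hypothetical helper, not part of any font library: picks the smaller of
# two common Unicode cmap subtable formats that can address every code point.
BMP_MAX = 0xFFFF  # last code point reachable with 16-bit character codes


def cmap_format_for(codepoints):
    """Return 4 (16-bit segment mapping) or 12 (32-bit segmented coverage)."""
    return 12 if any(cp > BMP_MAX for cp in codepoints) else 4


print(cmap_format_for([0x4E00, 0x9FA5]))   # BMP-only CJK ideographs
print(cmap_format_for([0x4E00, 0x20000]))  # includes U+20000 from Extension B
```

  A font that covers Extension B therefore just needs a 32-bit subtable;
the file format itself is not tied to UCS-2.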


> Does anybody here have a system where beyond-BMP glyphs work
> well? (Input servers, font display, titlebars, etc.)

  glibc uses UTF-32 for wchar_t, and we don't use UCS-2 locales but
UTF-8 locales. Freetype2, Xft, fontconfig, and Pango were written
from the beginning with this in mind.  I think most Linux programs don't
suffer from what used to be a common misunderstanding (I hope it no
longer is) that Unicode is a 16-bit character set, unless they rely on
the old X11 core fonts (bdf). Fortunately, XFree86 has been moving toward
giving up X11 core fonts entirely and using client-side fonts (Xft,
fontconfig), which makes font handling much easier.
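
  As a rough illustration of why the 16-bit assumption breaks down, the
sketch below (Python, standard codecs only; the function name
encoded_sizes is made up for this example) counts the code units that
U+20000, the first Extension B ideograph, needs in each encoding form. It
doesn't touch glibc or any locale; it only shows the encoding arithmetic.

```python
def encoded_sizes(ch):
    """Return how many code units one character needs in each encoding form."""
    return {
        "utf-8": len(ch.encode("utf-8")),            # 8-bit units
        "utf-16": len(ch.encode("utf-16-be")) // 2,  # 16-bit units
        "utf-32": len(ch.encode("utf-32-be")) // 4,  # 32-bit units
    }


ext_b = "\U00020000"  # first ideograph of CJK Extension B, outside the BMP

print(ext_b.encode("utf-8").hex())      # four bytes in a UTF-8 locale
print(ext_b.encode("utf-16-be").hex())  # a surrogate pair, not one 16-bit unit
print(encoded_sizes(ext_b))
```

  A single 32-bit unit holds the character whole, which is why a UTF-32
wchar_t has no trouble with it, while code that assumes one 16-bit unit per
character sees two surrogates instead.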

> >Also lurking in the
> >wings are "variant selectors", anticipating more variants, but
> >that they should not be given separate characters, but use 
> >"variant selectors" instead. Finally, the Unicode consortium
> >has started pondering on "normalisation tailoring", since some
> >find the canonical mappings of some Han characters "unhelpful".
> 
> There are no Han variations yet, afaik. I think that for unified
> characters which have significantly different orthography, there
> could easily be a pair of non-unified codepoints which were more
> specific. That's certainly better than "variant selectors" which
> are destined to be poorly supported if ever.

   You would be surprised to know how many Chinese characters have been
dumped into the IRG's queue. A lot of them are clearly variants of
characters already encoded. IMHO, the IRG should have taken advantage of
variation selectors earlier to avoid encoding too many variants in
Ext B.

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
