Jarkko Hietaniemi wrote on 2001-11-09 22:02 UTC:
> I think that "displaying UTF-8 text" is quite a difficult task. Not
> only would you need a really large font -- both a number of glyphs (or
> an ingenious font switching scheme), and to support the most intricate
> CJK glyphs I hear that at least a 20pt font is required.

Hardly anyone needs full Unicode. If all you are interested in are
European scripts and symbols, for instance, then the roughly three
thousand characters of the Unicode subset MES-3 are more than good
enough for your needs, and the XFree86 standard xterm fonts 6x13, 8x13,
9x15, 9x18 and 10x20 have covered MES-3 for over a year now and are
widely used. People who can read CJK glyphs have used larger font sizes
so far and will continue to do so in the future. Font size has nothing
whatsoever to do with the encoding.

It would be silly to decide (as Netscape 4 did :-( ) that every Unicode
font has to be large enough to be able to represent every script covered
by Unicode. On the contrary: XFree86 ships with 4x6 and 5x8 pixel
Unicode fonts, and people do use these with xterm. For the 6x13 standard
xterm "fixed" font, we even have a 12x13 doublewidth Japanese supplement
that quite a number of Japanese users of 800x600 laptops have found very
useful, even though its resolution is too low for CJK typographic needs
(and Chinese and Korean users regularly ask me when 12x13 will be
extended to cover their glyph repertoires as well). XFree86 also has a
9x18/18x18 terminal font with good CJK coverage, and I have no doubt
that others will eventually provide even larger and nicer terminal
fonts.

> Moreover, the old way of thinking "one codepoint, one box" isn't going
> to work with combining characters (and keeping on piling the combining
> characters pushes the capabilities of the font rendering). Don't forget
> ligatures, and I do not mean only the Latin ones: think Arabic, or Indic.

The xterm shipping with XFree86 has supported a simple form of combining
characters (motivated in particular by Thai/Maths/IPA requirements) for
over a year. This stuff is admittedly a bit more experimental, as not
all UTF-8 aware command-line tools handle combining characters perfectly
yet, but there is at least a growing Thai community enjoying the xterm
support for simple overstriking combining characters. There are also at
least two terminal editors in wide use now that support combining
characters under xterm: vim 6.0 (the commonly used vi clone) and mined.

For the foreseeable future, VT100-style UTF-8 terminal emulators will
not have full and well-established support for Hebrew, Arabic, Syriac,
and Indic, because the bidi and ligature substitution requirements clash
significantly with the simple typewriter rendering model of a VT100.
Hebrew and Arabic are just about doable, and there are experimental
implementations by e.g. Robert Brady and others, but Indic and Syriac
have not even been seriously discussed.

> Mind, I would be (pleasantly) surprised if there really is a 'terminal'
> that can do justice to the intricacies of Unicode. At the time the Plan
> 9's 9term probably was close, but Unicode has moved on since. On an
> xterm, sure, you can have the fonts, but probably not the combining
> characters. Yudit, ditto.

You obviously haven't used xterm recently in a UTF-8 locale. Look at the
attached UTF-8 file with "vim 6.0" or "cat 0.94c" or newer in a UTF-8
locale!

For an update:

  http://www.cl.cam.ac.uk/~mgk25/unicode.html#xterm

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

STARGΛ̊TE SG-1
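
The sample line above contains a combining character (a ring overstruck on the Λ), so it makes a handy test of the "simple overstriking" model discussed in this message: a combining mark takes no cell of its own, and a doublewidth CJK glyph takes two. Below is a minimal sketch of that cell-counting rule in Python. It is not xterm's actual code; the function name `column_width` is made up for illustration, and the rule (zero cells for general categories Mn/Me, two cells for East Asian Wide/Fullwidth, one cell otherwise) is a simplifying assumption that ignores control characters, bidi, and ligatures.

```python
import unicodedata


def column_width(text: str) -> int:
    """Approximate the number of terminal cells needed to display text.

    Hypothetical helper, assuming a simple typewriter model:
    combining marks overstrike the previous cell, wide CJK glyphs
    occupy two cells, everything else occupies one.
    """
    width = 0
    for ch in text:
        if unicodedata.category(ch) in ("Mn", "Me"):
            # Nonspacing or enclosing mark: drawn on top of the
            # preceding character, so it adds no cell.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide or fullwidth character: two cells, like a 12x13
            # glyph sitting next to 6x13 ones.
            width += 2
        else:
            width += 1
    return width


# The sample line: 14 code points, but only 13 cells, because
# U+030A COMBINING RING ABOVE is overstruck on the Lambda.
sample = "STARG\u039b\u030aTE SG-1"
print(len(sample), column_width(sample))  # prints: 14 13
```

The mark test uses the general category rather than the canonical combining class because some Thai vowel signs, such as U+0E31, have combining class 0 yet still need to be overstruck.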