>> time I ran into them. UTF8 is probably the right thing for storing the text in
>> a file IMHO but there are other choices.
JL> I think we want the core to use UTF8 too. But we have to deal with
JL> variable-length character encodings of course.

I'm not sure. The variable length leads to some bad performance hits on searching and string indexing, especially backwards. Also, with UTF-8 it is possible to accidentally generate malformed byte sequences that don't correspond to any character, and this can be really hard to debug. There is really little reason to use UTF-8 except for staying as ASCII-transparent and as compatible with 8-bit channels as possible.

Actually, UCS4 seems like a good choice, especially because of the better compatibility with Omega (which uses non-Unicode 32-bit codepoints for some purposes, among others, paradoxically, to stay as compatible with Unicode as possible on the input side).

>> My Xfree86 4.2.1 tree does not contain any fonts that do more than one alphabet

Oh, quite a lot of fonts do nowadays, such as MS's core fonts, which are in use on a substantial number of Linux systems. Actually, I'm quite sure that there are some iso10646 fonts in his tree, too. In any case, XFree86 is shifting towards bitmapped multilanguage TrueType fonts with one of the next versions.

>> different fonts or a font_set. Methinks that UCS-4 internal format will require
>> the painter to have a UCS-4 to font and glyph mapping function.

That's something either offered by most frontends anyway or trivially simple to implement.

Cheers - Philipp Reichmuth mailto:[EMAIL PROTECTED]

-- 
Stay the patient course / Of little worth is your ire / The network is down
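
A rough sketch of the indexing point above, in Python only for brevity. The function names are made up for illustration and are not taken from any existing code. It assumes big-endian UCS-4 and well-formed UTF-8 input:

```python
import struct


def utf8_char_at(data: bytes, n: int) -> str:
    """Return the n-th character of UTF-8 data; has to scan from the start."""
    count = -1
    start = None
    for i, b in enumerate(data):
        if b & 0xC0 != 0x80:  # not a continuation byte: a new character starts
            count += 1
            if count == n:
                start = i
            elif count == n + 1:
                return data[start:i].decode("utf-8")
    if start is None:
        raise IndexError(n)
    return data[start:].decode("utf-8")


def utf8_prev_start(data: bytes, pos: int) -> int:
    """Step back from byte offset pos to the start of the previous character."""
    pos -= 1
    while pos > 0 and data[pos] & 0xC0 == 0x80:  # skip continuation bytes
        pos -= 1
    return pos


def ucs4_char_at(data: bytes, n: int) -> str:
    """Return the n-th character of big-endian UCS-4 data: one fixed offset."""
    (codepoint,) = struct.unpack_from(">I", data, 4 * n)
    return chr(codepoint)


text = "aäЖ€"
utf8 = text.encode("utf-8")      # 1 + 2 + 2 + 3 = 8 bytes
ucs4 = text.encode("utf-32-be")  # 4 * 4 = 16 bytes
```

With UCS-4 the n-th character is always at byte offset 4*n, whereas the UTF-8 lookup walks the buffer and the backward step has to skip continuation bytes. Cutting the UTF-8 buffer at an arbitrary byte offset, for example `utf8[:6]`, leaves a truncated sequence that a strict decoder rejects, which is the kind of malformed data mentioned above.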
