Re: Non-ascii string processing? - count display units

Markus Scherer Tue, 07 Oct 2003 17:17:04 -0700

You might want to look at East Asian Width http://unicode.org/reports/tr11/ for an approximation of the green-screen width of a string.

To be absolutely precise, you need feedback from your green-screen layout engine and its font, of course, like you do for a graphical display.

markus

Edward H. Trager wrote:

What you really need for such a thing is a function which computes the
"width" of a string in terms of display units, rather than its length in
terms of characters.

Yes, I agree. I also need such a function. Do you, Marco, or anyone else, know which function(s) provide this service? (In my case, something Open Source or GPLed would be ideal, but ICU would be too heavy). My application started out life in a sheltered ASCII-only childhood, and now needs to move to the bigger UTF-8 world out there. Fortunately, it is quite capable of succeeding in that world, but I haven't even started working on the on-screen table formatting issue yet for exactly this reason.

Actually I believe that if I have to write something myself, making it work for the Latin-with-combining-diacritics and CJK cases would not be too hard. After that however, it seems that one would have to work on a script-by-script basis to get it to really work properly. If it was only a case of Arabic, that would be one thing, but when one looks at the Indic and Indic-derived scripts ... well, there are a lot of Indic and Indic-derived scripts! Not that it is hard, but it would certainly take time, and I haven't done an ounce of research yet to find out whether somebody has done it already or not ...
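
For the two easier cases mentioned here (Latin with combining diacritics, and CJK), a rough sketch in Python might look like the following. It leans on the East Asian Width property from TR11 through the standard unicodedata module. The function name display_width and the choice to count Ambiguous-width characters as one column are assumptions made for illustration, not anything proposed in this thread. It makes no attempt at Arabic or Indic shaping, and a real terminal and its font may still disagree with it.

```python
import unicodedata

# Categories treated as taking no column of their own:
# Mn = nonspacing mark, Me = enclosing mark, Cf = format character.
ZERO_WIDTH_CATEGORIES = {"Mn", "Me", "Cf"}


def display_width(text: str) -> int:
    """Approximate the number of fixed-width columns needed to show text.

    Assumption: Wide (W) and Fullwidth (F) characters take two columns,
    everything else that is visible takes one. Ambiguous (A) characters
    are counted as one column, which is wrong on some East Asian terminals.
    """
    width = 0
    for ch in text:
        if unicodedata.category(ch) in ZERO_WIDTH_CATEGORIES:
            # Combining diacritics stack on the preceding base character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


def pad_to_width(text: str, columns: int) -> str:
    """Right-pad text with spaces so it occupies at least `columns` columns."""
    return text + " " * max(0, columns - display_width(text))
```

Padding a table cell then uses display_width(s) where the ASCII-only code would have used len(s). Scripts that need reordering or conjunct formation would still need script-specific handling or feedback from the layout engine.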

