> From: Tim Ruehsen <[email protected]>
> Date: Fri, 17 Feb 2017 09:48:23 +0100
> Cc: "Andries E. Brouwer" <[email protected]>, YX Hao 
> <[email protected]>
> Calculating the number of displayed columns from the number of bytes of a
> string is non-trivial. It is trivial only for charsets/locales where each
> byte (or codepoint) will take exactly one column on the display.
> 
> With unicode you have to *at least* compose the string first (NFC I guess),
> and then count the codepoints. But I am not sure about exceptions.
> 
> @Andries Do you know an algorithm how to calculate the columns from a given 
> string + encoding ?

I'm not Andries, but AFAIK there's a file in the Unicode Character
Database (UCD) called EastAsianWidth.txt which provides the width
information.

There's also this (which is a derivative of the UCD data):

  https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
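
To make the idea concrete, here is a rough sketch in Python of how the
East Asian Width property can be turned into a column count. It is not
the code from the link above, only an illustration: the function name
display_width is made up, the input is assumed to be already decoded to
a Unicode string, and the rules are simplified (wide and fullwidth
characters count as 2 columns, combining marks, format characters and
control characters as 0, everything else as 1). Real terminals and
wcwidth implementations differ in the corner cases.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of terminal columns needed to show text.

    Simplified rules (an assumption of this sketch, not a standard):
    East Asian Wide/Fullwidth characters take 2 columns; combining
    marks, format characters and control characters take 0; everything
    else takes 1.
    """
    width = 0
    # Compose first, so that base + combining mark pairs become one
    # codepoint where a precomposed form exists.
    for ch in unicodedata.normalize("NFC", text):
        category = unicodedata.category(ch)
        if category in ("Mn", "Me", "Cf", "Cc"):
            continue  # zero-width in this sketch
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


if __name__ == "__main__":
    for sample in ("abc", "日本語", "e\u0301"):
        print(repr(sample), display_width(sample))
```

For byte input, decode with the encoding in question first (for example
data.decode("utf-8")) and pass the resulting string to the function.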
