On Friday, February 17, 2017 12:05:21 PM CET Eli Zaretskii wrote:
> > From: Tim Ruehsen <[email protected]>
> > Date: Fri, 17 Feb 2017 09:48:23 +0100
> > Cc: "Andries E. Brouwer" <[email protected]>, YX Hao
> > <[email protected]> Calculating the number of displayed columns from
> > the number of bytes of a string is non-trivial. It is trivial only for
> > charsets/locales where each byte (or codepoint) will take exactly one
> > column on the display.
> > 
> > With unicode you have to *at least* compose the string first (NFC I
> > guess), and then count the codepoints. But I am not sure about
> > exceptions.
> > 
> > @Andries Do you know an algorithm how to calculate the columns from a
> > given
> > string + encoding ?
> 
> I'm not Andries, but AFAIK there's a file in the Unicode Character
> Database (UCD) called EastAsianWidth.txt which provides the width
> information.
> 
> There's also this (which is a derivative of the UCD data):
> 
>   https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

Hi Eli,

thanks for pointing that out.

I read a bit of the gnulib source code... and wcwidth() should do it correctly
(either the gnulib or the libc version).
This page made it clearer for me as well:
http://stackoverflow.com/questions/3634627/how-to-know-the-preferred-display-width-in-columns-of-unicode-characters

Regards, Tim
