On Friday, February 17, 2017 12:05:21 PM CET Eli Zaretskii wrote:
> > From: Tim Ruehsen <[email protected]>
> > Date: Fri, 17 Feb 2017 09:48:23 +0100
> > Cc: "Andries E. Brouwer" <[email protected]>, YX Hao
> > <[email protected]>
> >
> > Calculating the number of displayed columns from
> > the number of bytes of a string is non-trivial. It is trivial only for
> > charsets/locales where each byte (or codepoint) will take exactly one
> > column on the display.
> >
> > With unicode you have to *at least* compose the string first (NFC I
> > guess), and then count the codepoints. But I am not sure about
> > exceptions.
> >
> > @Andries Do you know an algorithm how to calculate the columns from a
> > given string + encoding ?
>
> I'm not Andries, but AFAIK there's a file in the Unicode Character
> Database (UCD) called EastAsianWidth.txt which provides the width
> information.
>
> There's also this (which is a derivative of the UCD data):
>
> https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

Hi Eli,

thanks for pointing that out. I read a bit of the gnulib source code, and
wcwidth() should do it correctly (either the gnulib or the libc version).

This page made it clearer for me as well:
http://stackoverflow.com/questions/3634627/how-to-know-the-preferred-display-width-in-columns-of-unicode-characters

Regards, Tim
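
For illustration, the two ideas in this thread (compose to NFC first, then
look up per-codepoint widths from the East Asian Width data) can be combined
in a rough sketch. This is not the gnulib or libc wcwidth() implementation.
The function name display_width is hypothetical, and treating control and
format characters as zero width is an assumption made here for simplicity.
Ambiguous-width characters are counted as one column, which may not match
every terminal.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of terminal columns needed to show text."""
    width = 0
    # Compose first, so that "e" + U+0301 becomes the single codepoint U+00E9.
    for ch in unicodedata.normalize("NFC", text):
        category = unicodedata.category(ch)
        if category in ("Mn", "Me"):
            # Nonspacing and enclosing marks that did not compose
            # are drawn over the previous character: no extra column.
            continue
        if category in ("Cc", "Cf"):
            # Assumption: control and format characters occupy no column.
            continue
        # East Asian Wide (W) and Fullwidth (F) characters take two columns;
        # everything else, including Ambiguous (A), is counted as one.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


if __name__ == "__main__":
    for sample in ("abc", "e\u0301", "\u65e5\u672c\u8a9e"):
        print(repr(sample), len(sample.encode("utf-8")), display_width(sample))
```

The last sample shows why byte counts are not usable for alignment: three
CJK characters are nine bytes in UTF-8 but six columns under this
approximation.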
