> From: Tim Ruehsen <[email protected]>
> Date: Fri, 17 Feb 2017 09:48:23 +0100
> Cc: "Andries E. Brouwer" <[email protected]>, YX Hao <[email protected]>
>
> Calculating the number of displayed columns from the number of bytes of a
> string is non-trivial. It is trivial only for charsets/locales where each byte
> (or codepoint) will take exactly one column on the display.
>
> With unicode you have to *at least* compose the string first (NFC I guess), and
> then count the codepoints. But I am not sure about exceptions.
>
> @Andries Do you know an algorithm how to calculate the columns from a given
> string + encoding ?

I'm not Andries, but AFAIK there's a file in the Unicode Character Database (UCD) called EastAsianWidth.txt which provides the width information for each character. There's also this implementation, which is derived from the UCD data:

  https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c
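
To illustrate the idea, here is a minimal Python sketch that relies on the East Asian Width data exposed by Python's standard unicodedata module. The names display_width and display_width_bytes are hypothetical. The width rules are a simplifying assumption (Wide and Fullwidth characters count as two columns, combining marks and control/format characters as zero, everything else as one); this is similar in spirit to wcwidth.c but not equivalent to it, and terminals may disagree, in particular about Ambiguous-width characters and emoji sequences.

```python
import unicodedata


def display_width(text: str) -> int:
    """Estimate how many terminal columns `text` occupies (simplified rules)."""
    width = 0
    # Compose first, so that e.g. "e" + U+0301 becomes a single code point.
    for ch in unicodedata.normalize("NFC", text):
        # Assumption: non-spacing/enclosing marks and control/format
        # characters take no column. (wcwidth() would report -1 for
        # control characters instead of 0.)
        if unicodedata.category(ch) in ("Mn", "Me", "Cc", "Cf"):
            continue
        # "W" (Wide) and "F" (Fullwidth) come from EastAsianWidth.txt.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


def display_width_bytes(data: bytes, encoding: str) -> int:
    """Decode `data` using `encoding`, then estimate its column count."""
    return display_width(data.decode(encoding))
```

For example, display_width_bytes("日本語".encode("utf-8"), "utf-8") sees 9 bytes but counts 6 columns, which is exactly the bytes-versus-columns mismatch discussed above.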
