> On Feb 19, 2018, at 2:54 AM, Ralf Junker <ralfjun...@gmx.de> wrote:
>
> 'です' are 2 codepoints according to
>
> http://www.fontspace.com/unicode/analyzer/?q=%E3%81%A7%E3%81%99
>
> The requested overall width is 4, so I would expect two added spaces
> and a total length of 4.

If this is being done for the purpose of visual alignment in a monospaced font, it's not going to work. Both of those characters (they are hiragana, not kanji) are displayed as double-width, in macOS's Terminal at least, so their visual width is already 4 columns, meaning there should be zero spaces of padding.

You really _cannot_ equate Unicode code-points with the visual width of displayed text, even in a monospaced layout. Not only do terminals render some characters as double-width, but there are all kinds of other exceptions like zero-width joiners, diacritical marks, ligatures, and joined forms. As a very common example of the latter, many emojis (e.g. all the faces with multiple skin tones) are actually composed of multiple (up to five or six) Unicode code-points.

TL;DR: If you use character (code-point) counts to visually lay out text, you're likely to get bad results with anything other than plain ASCII, so it's only marginally better than just counting bytes.

--Jens
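
P.S. To make the difference concrete, here is a minimal Python sketch that estimates display columns rather than counting code-points. The helper names `display_width` and `pad_to_width` are made up for illustration, and the rule it applies (East Asian Wide/Fullwidth characters take two columns, combining marks and format characters take none) is only an approximation. It does not handle emoji sequences or ligatures, and real terminals can disagree with it.

```python
import unicodedata


def display_width(text: str) -> int:
    """Rough estimate of the terminal columns `text` occupies.

    Hypothetical helper: an approximation, not what any terminal
    or SQLite actually does.
    """
    width = 0
    for ch in text:
        # Combining marks and format characters (e.g. zero-width joiner)
        # are assumed to take no column of their own.
        if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        # East Asian Wide (W) and Fullwidth (F) characters are assumed
        # to take two columns; everything else one.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def pad_to_width(text: str, width: int) -> str:
    """Right-pad `text` with spaces up to `width` estimated columns."""
    return text + " " * max(0, width - display_width(text))


if __name__ == "__main__":
    s = "です"
    print(len(s), display_width(s))   # code-point count vs. estimated columns
    print(repr(pad_to_width(s, 4)))   # no padding is added
    print(repr(pad_to_width("ab", 4)))
```

With this estimate, 'です' already fills the requested 4 columns, so no spaces are added, while a plain ASCII string such as 'ab' gets two.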