On Tue, May 31, 2016 at 07:40:13PM +0000, Wyatt via Digitalmars-d wrote:
> On Tuesday, 31 May 2016 at 19:20:19 UTC, Timon Gehr wrote:
> >
> > The 'length' of a character is not one in all contexts.
> > The following text takes six columns in my terminal:
> >
> > 日本語
> > 123456
>
> That's a property of your font and font rendering engine, not Unicode.
> (Also, it's probably not quite six columns; most fonts I've tested,
> 漢字 are rendered as something like 1.5 characters wide, assuming your
> terminal doesn't overlap them.)
[...]

I believe he was talking about a console terminal that uses 2 columns to
render the so-called "double-width" characters. Unicode does include
"fullwidth" versions of selected characters (e.g., the ASCII range), to be
used alongside said characters.

Of course, using string length to measure string width is a risky venture
fraught with pitfalls, because your terminal may not actually render the
characters the way you think it should.

Nevertheless, it does serve to highlight why a construct like s.walkLength
is essentially buggy: there is not enough information to determine which
length it should return -- the length of the buffer in bytes, the number of
code points, the number of graphemes, or the display width of the string.
No matter which choice you make, it only works for a subset of cases and is
wrong for the others. This is a prime illustration of why forcing
autodecoding on every string in D is the wrong design.


T

-- 
It is not the gift that is dear, but the love.
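
To make the four candidate "lengths" concrete, here is a rough sketch in Python (not D, and not taken from the thread). The function names are hypothetical, invented for illustration. The grapheme count is only an approximation that skips combining marks rather than implementing full Unicode text segmentation, and the column width assumes a terminal that gives wide and fullwidth characters exactly two columns, which, as noted above, a real terminal may not do.

```python
import unicodedata


def byte_length(s: str) -> int:
    """Size of the string when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def code_point_count(s: str) -> int:
    """Number of Unicode code points (what Python's len() reports)."""
    return len(s)


def approx_grapheme_count(s: str) -> int:
    """Crude grapheme estimate: code points that are not combining marks.

    Assumption: this ignores the full segmentation rules, so it is wrong
    for emoji sequences, Hangul jamo, and similar cases.
    """
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


def approx_column_width(s: str) -> int:
    """Estimated terminal columns, assuming wide/fullwidth take two."""
    width = 0
    for ch in s:
        if unicodedata.combining(ch):
            continue  # combining marks are drawn on the previous character
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


if __name__ == "__main__":
    for text in ("日本語", "e\u0301"):  # second sample: 'e' + combining acute
        print(
            repr(text),
            byte_length(text),
            code_point_count(text),
            approx_grapheme_count(text),
            approx_column_width(text),
        )
```

Under these assumptions the two sample strings give four different answers between them: each measure is right for some use and wrong for the others, which is the point being made about a single "length" operation.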