This can lead to subtle bugs, cf. the lengths of random and e_one. You have to convert everything to dstring to get the "expected" result, which is not always desirable.

There are three things that you need to be aware of when handling Unicode: code units, code points and graphemes.

In general, the length of one guarantees nothing about the length of the others. The exception is UTF-32, which has a 1:1 mapping between code units and code points.
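A quick sketch of that relationship. The thread is about D, but the counts are language-independent; this hypothetical Python example encodes the same string three ways and compares code unit counts to the code point count.

```python
# Hypothetical sketch: code unit counts depend on the encoding,
# the code point count does not.
s = "caf\u00e9"  # "café", with é as a single precomposed code point

utf8_units  = len(s.encode("utf-8"))           # 1-byte code units
utf16_units = len(s.encode("utf-16-le")) // 2  # 2-byte code units
utf32_units = len(s.encode("utf-32-le")) // 4  # 4-byte code units
code_points = len(s)                           # Python strings index code points

print(utf8_units, utf16_units, utf32_units, code_points)  # → 5 4 4 4

# Only in UTF-32 do code units and code points always agree.
assert utf32_units == code_points
```

The é costs two code units in UTF-8 and one in UTF-16, but it is one code point either way; only the UTF-32 count matches the code point count unconditionally.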

In this thread, we were discussing the relationship between code points and graphemes. Your examples, however, apply to the relationship between code units and code points.

To measure the number of columns needed to print a string, you need the number of graphemes. (d|)?string.length gives you the number of code units.
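To show the gap between code points and graphemes, here is a hedged Python sketch with a hypothetical helper, naive_grapheme_count, that treats every run of base character plus combining marks as one grapheme. This is only an approximation; full grapheme segmentation (UAX #29) also has to handle ZWJ emoji sequences, Hangul jamo, and more.

```python
import unicodedata

def naive_grapheme_count(s: str) -> int:
    # Hypothetical helper: count code points that are not combining
    # marks, i.e. treat base + combining marks as one grapheme.
    # A rough approximation of UAX #29 grapheme cluster segmentation.
    return sum(1 for cp in s if unicodedata.combining(cp) == 0)

s = "e\u0301"  # 'e' + COMBINING ACUTE ACCENT: two code points, one grapheme
print(len(s), naive_grapheme_count(s))  # → 2 1
```

len(s) here plays the role of counting code points (what a .length-style property typically reports at some level); the grapheme count is what actually corresponds to printed columns for simple scripts.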

If you normalize a string (in the sequence-of-characters/code-points sense, not object.string) to NFC, normalization first decomposes every precomposed character in the string (like é, a single code point), then establishes a defined order among the combining characters, and finally recomposes a selected few graphemes (like é). This way é always ends up as a single code point in NFC. There are dozens of other combinations where you will still have an n:1 mapping between code points and graphemes left after normalization.
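Both sides of that behavior can be demonstrated with Python's unicodedata module (a sketch; D's std.uni offers equivalent normalization, but the results are defined by Unicode, not by the language):

```python
import unicodedata

# é written as 'e' + combining acute: NFC recomposes it to one code point.
decomposed = "e\u0301"
nfc = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), len(nfc), nfc == "\u00e9")  # → 2 1 True

# No precomposed 'q with acute' exists in Unicode, so NFC leaves it as
# two code points: an n:1 code-point-to-grapheme mapping survives.
stubborn = unicodedata.normalize("NFC", "q\u0301")
print(len(stubborn))  # → 2
```

So even after NFC you cannot assume one code point per grapheme; only the combinations that Unicode happens to provide precomposed characters for collapse to a single code point.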

An example was already given in this thread: putting an arrow over a Latin letter is typical in math notation and is always more than one code point.
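The arrow case, concretely, using U+20D7 COMBINING RIGHT ARROW ABOVE (a Python sketch; the code point sequence is the same in any language):

```python
import unicodedata

vec = "a\u20d7"  # 'a' + COMBINING RIGHT ARROW ABOVE, the math vector notation
print(len(vec))  # → 2

# There is no precomposed "a with arrow" character, so NFC cannot
# collapse it: the grapheme stays two code points after normalization.
print(len(unicodedata.normalize("NFC", vec)))  # → 2
```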
