On Fri, 12 Nov 2010 01:00:18 +0100 Daniel Gibson <metalcae...@gmail.com> wrote:
> > http://www.digitalmars.com/d/2.0/phobos/std_utf.html
>
> If I'm not mistaken, those functions don't handle these "graphemes", i.e.
> something that appears like one character on the screen, but consists of
> multiple code *points*. Like spir's "â" that, in UTF-8, is encoded with the
> following bytes: 0x61 (=='a'), 0xCC, 0x82. (Or \u0061\u0302 in UTF-32).

You are right, Daniel. As far as I understand it (only superficially, I haven't
used it yet), the current utf library deals with the lower-level issue of
encoding code points into code units and bytes.

> Also, a function returning the physical position (i.e. pos in array of chars
> or wchars) of logical char #logPos may be useful, e.g. for fixed width
> printing stuff:
> size_t getPhysPos(char[] str, size_t logPos)

See my reply to Walter's next post.

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com
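
For reference, the getPhysPos suggestion quoted above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: "logical char" is taken to mean a code point (not a grapheme), the parameter type is loosened to const(char)[], and the only library call used is std.utf.stride. It therefore does not solve the grapheme problem discussed in this thread.

```d
import std.utf : stride;

// Sketch of the helper proposed in the quoted text (name and signature come
// from that suggestion; the body is an assumption, not existing Phobos code).
// Returns the index of the code unit at which code point number logPos starts.
// NOTE: "logical char" here means code point, NOT grapheme.
size_t getPhysPos(const(char)[] str, size_t logPos)
{
    size_t physPos = 0;
    foreach (i; 0 .. logPos)
    {
        if (physPos >= str.length)
            throw new Exception("logPos is past the end of the string");
        // stride() returns the number of code units in the UTF-8 sequence
        // that starts at index physPos.
        physPos += stride(str, physPos);
    }
    return physPos;
}
```

With the string "a\u0302b" (the "â" from above followed by 'b', 4 bytes in UTF-8), getPhysPos(s, 2) yields 3, the byte index of 'b'. A grapheme-aware version would have to yield that index for logical position 1, since 'a' plus the combining circumflex form a single visible character.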