On Tue, Nov 10, 2009 at 11:34 AM, Phil Deets <pjdee...@gmail.com> wrote:
> On Tue, 10 Nov 2009 05:18:59 -0500, Lutger <lutger.blijdest...@gmail.com>
> wrote:
>
>> - why is a UTF-string iterator bidirectional and why is that unexpected?
>
> I think it wouldn't support random access since accessing the nth code
> point (code points are similar to characters) is not a constant time
> operation since different code points can be made up of different numbers of
> bytes. That isn't necessarily intuitive since UTF-strings are stored
> contiguously in memory; so you might expect them to be random-accessible.

I thought the comment was about this: you might expect it to be just a
forward iterator, but (surprise!) you can also find the previous code point
in O(1) time, because lead units occupy value ranges distinct from those of
the units that follow them. Stepping backward over at most a few units always
lands you on the start of the previous code point.

But I still don't find it particularly unexpected. It's probably only
unexpected if you don't know anything about UTF other than the fact that
each character is a variable number of bytes.

--bb
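
To make the backward step concrete, here is a rough sketch in Python (not D, and not how any particular library does it). It assumes the buffer holds valid UTF-8 and that the index you start from is already on a code point boundary; the names is_continuation, prev_start and iter_backward are made up for illustration.

```python
def is_continuation(byte: int) -> bool:
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx;
    # lead bytes and ASCII bytes never match it.
    return (byte & 0xC0) == 0x80


def prev_start(buf: bytes, i: int) -> int:
    """Return the start index of the code point that ends just before i.

    Assumes buf is valid UTF-8 and 0 < i <= len(buf) is a code point
    boundary. In valid UTF-8 the loop skips at most three continuation
    bytes, so the cost is bounded by a constant.
    """
    i -= 1
    while i > 0 and is_continuation(buf[i]):
        i -= 1
    return i


def iter_backward(buf: bytes):
    # Walk the buffer from the end, yielding one decoded code point at a time.
    end = len(buf)
    while end > 0:
        start = prev_start(buf, end)
        yield buf[start:end].decode("utf-8")
        end = start
```

For example, list(iter_backward("aé€".encode("utf-8"))) walks the 1-, 2- and 3-byte sequences from the back without ever decoding from the front. Getting the nth code point still means counting from one end, which is why this gives you a bidirectional iterator and not a random-access one.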