On 13.10.2013 15:50, Dmitry Olshansky wrote:
13-Oct-2013 17:25, nickles wrote:
Ok, if my understanding is wrong, how do YOU measure the length of a
string?
Do you always use count(), or is there an alternative?


It's all there:
http://www.unicode.org/glossary/
http://www.unicode.org/versions/Unicode6.3.0/

I measure string length in code units (as defined in the above
standard). This bears no easy relation to the number of visible
characters but I don't mind it.
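
To illustrate what counting code units means, here is a minimal sketch in Python (chosen only for illustration; the sample string and variable names are made up for this example). It counts the code units of one piece of text under the three Unicode encoding forms:

```python
# Sample text (hypothetical): "naïve" plus a space and one emoji, 7 code points.
text = "na\u00efve \U0001F600"

# A code unit is 1 byte in UTF-8, 2 bytes in UTF-16 and 4 bytes in UTF-32.
# The "-le" codec variants are used so that no byte order mark is prepended.
utf8_units = len(text.encode("utf-8"))
utf16_units = len(text.encode("utf-16-le")) // 2
utf32_units = len(text.encode("utf-32-le")) // 4

print(utf8_units, utf16_units, utf32_units)  # prints: 11 8 7
```

The same text has three different lengths depending on the encoding form, and only the UTF-32 figure happens to match the number of code points.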

Measuring the number of visible characters isn't trivial, but it can be done
by counting graphemes. For simple alphabets, counting code points
will do the trick as well (which is what count does).
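
A rough sketch of the difference between the two counts, again in Python for illustration only. The helper names are hypothetical, and the grapheme count is a deliberate simplification: it only skips code points with a non-zero canonical combining class and does not implement the full segmentation rules of the Unicode standard (it will miscount emoji sequences and Hangul jamo, for example).

```python
import unicodedata


def count_code_points(s: str) -> int:
    # A Python str is a sequence of code points, so len() counts them directly.
    return len(s)


def approx_grapheme_count(s: str) -> int:
    # Assumption: every code point with a non-zero canonical combining class
    # attaches to the preceding base character. This is NOT full grapheme
    # cluster segmentation, only an approximation for combining diacritics.
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


# "étude" spelled with a plain "e" followed by a combining acute accent.
decomposed = "e\u0301tude"

print(count_code_points(decomposed))      # prints: 6
print(approx_grapheme_count(decomposed))  # prints: 5
```

For text without combining marks the two functions agree, which is the "simple alphabets" case mentioned above.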


But you have to take care to normalize the string with respect to diacritics if the code point estimate is supposed to work. OS X, for example (if I remember right), always uses explicit combining characters, while Windows uses precomposed characters where possible.
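
A small sketch of why normalization matters for the code point count, in Python for illustration only (the function name is hypothetical). The same visible character can arrive precomposed or decomposed, and the counts only agree after both are brought to one normalization form:

```python
import unicodedata

precomposed = "\u00e9"  # "é" as a single code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent


def code_points_after_nfc(s: str) -> int:
    # Compose base characters and combining marks where a precomposed
    # form exists (NFC), then count code points.
    return len(unicodedata.normalize("NFC", s))


print(len(precomposed), len(decomposed))  # prints: 1 2
print(code_points_after_nfc(precomposed), code_points_after_nfc(decomposed))  # prints: 1 1
```

NFC only helps where a precomposed code point exists; for other combinations of base character and mark the count still exceeds the number of visible characters, so this remains an estimate.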
