On 16/10/14 20:46, Uranuz via Digitalmars-d-learn wrote:
I have some string *str* of unicode characters. The question is how to check if
I have valid unicode code point starting at code unit *index*?
[...]

You cannot do that without decoding. Checking whether UTF-x is valid and decoding it are the very same process. IIRC, D has a validation func which is more or less just a wrapper around the decoding func ;-). Moreover, you also need to distinguish "word-character" code points from others (punctuation, spacing, etc.), and that classification is defined on Unicode code points, not code units (the Unicode Consortium provides tables for such tasks).
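To make the "validation is decoding" point concrete, here is a minimal sketch using `std.utf.decode`. The helper name `isValidCodePointAt` is made up for illustration; it simply attempts to decode at the given code unit and reports whether that succeeded.

```d
import std.utf : decode, UTFException;

/// Hypothetical helper (not part of Phobos): returns true if a well-formed
/// UTF-8 sequence starts at code unit `index` of `str`.
bool isValidCodePointAt(string str, size_t index)
{
    if (index >= str.length)
        return false;              // nothing to decode past the end

    try
    {
        size_t i = index;          // decode advances the index it is given
        decode(str, i);            // throws UTFException on malformed input
        return true;
    }
    catch (UTFException e)
    {
        return false;
    }
}
```

With the string "aé" (three code units in UTF-8), indices 0 and 1 start valid sequences, while index 2 lands on a continuation byte and is rejected.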

Thus, I would recommend that you just abandon the illusion of working at the level of code units for such tasks, and simply operate on strings of code points. (Why do you think D has them built in?)
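As a rough sketch of what working on code points looks like, the snippet below converts a `string` to a `dstring` and classifies each element with `std.uni.isAlpha`. The function name `countLetters` is hypothetical, and "alphabetic" is only one possible reading of "word character"; pick whichever `std.uni` predicate fits your definition.

```d
import std.conv : to;
import std.uni : isAlpha;

/// Hypothetical helper: counts the alphabetic code points in `str`.
size_t countLetters(string str)
{
    // Each element of a dstring is one full code point (dchar).
    // The conversion decodes the UTF-8 input and throws if it is malformed.
    dstring codePoints = to!dstring(str);

    size_t n = 0;
    foreach (dchar c; codePoints)
    {
        if (isAlpha(c))            // classification uses Unicode tables
            ++n;
    }
    return n;
}
```

Once the text is a `dstring`, indexing and slicing are by code point, so there is no way to land in the middle of a multi-unit sequence.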

denis
