On 22 June, 16:07, Saul Spatz <saul.sp...@gmail.com> wrote:
> Thanks very much. This is the elegant kind of solution I was looking for. I
> had hoped there was a way to do it without even addressing the matter of
> surrogates, but apparently not. The reason I don't like this is that it
> depends on knowing that python internally stores strings in UTF-16. I
> expected that there would be some built-in iterator that would return the
> code points. (Actually, this all started when I realized that s[k] wouldn't
> necessarily give me the kth character of the string s.)

A character is not a code point. Besides, very few people know that a single character may consist of more than one code point.

jmf
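
To illustrate the distinction between code points and characters, here is a minimal sketch. It assumes Python 3.3 or later, where iterating over a str yields whole code points regardless of internal storage. The helper names code_points and simple_clusters are made up for this example, and the grouping rule (attach combining marks to the preceding base code point) is only a rough approximation of what a user perceives as one character, not full Unicode grapheme segmentation.

```python
import unicodedata


def code_points(s):
    """Return the code points of s as a list of integers."""
    return [ord(ch) for ch in s]


def simple_clusters(s):
    """Group each base code point with the combining marks that follow it.

    Hypothetical helper: a rough approximation only, not the full
    grapheme cluster rules of Unicode Standard Annex #29.
    """
    clusters = []
    for ch in s:
        # unicodedata.combining() returns a nonzero class for combining marks.
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# 'e' followed by U+0301 COMBINING ACUTE ACCENT, then 'a'
s = "e\u0301a"
print(code_points(s))      # three code points
print(simple_clusters(s))  # two perceived characters
# NFC normalization composes 'e' + U+0301 into the single code point U+00E9.
print(len(unicodedata.normalize("NFC", s)))
```

So len(s) counts code points, not characters, and s[k] can land on a combining mark even when no surrogates are involved. Normalizing to NFC helps only for sequences that have a precomposed form.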