John Machin <sjmachin <at> lexicon.net> writes:

> Andrew Fong <FongAndrew <at> gmail.com> writes:
> > Are there any built-in ways to do something like this already? Or do I
> > just have to iterate over the unicode string?
>
> Converting each character to utf8 and checking the
> total number of bytes so far?
> Ooooh, sloooowwwwww!
>

Somewhat faster:

    u8len = 0
    for u in unicode_string:
        if u <= u'\u007f':      # U+0000..U+007F: 1 byte
            u8len += 1
        elif u <= u'\u07ff':    # U+0080..U+07FF: 2 bytes
            u8len += 2
        elif u <= u'\uffff':    # U+0800..U+FFFF: 3 bytes
            u8len += 3
        else:                   # above U+FFFF: 4 bytes
            u8len += 4

Cheers,
John
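For anyone who wants to check the loop above against the real encoder, here is a minimal sketch. It assumes Python 3, where iterating over a str yields whole code points, and the name utf8_len is just something I made up for the example. It also assumes the input has no lone surrogates, which the encoder would reject anyway.

```python
def utf8_len(text):
    """Count the bytes `text` would occupy in UTF-8 without encoding it."""
    total = 0
    for ch in text:
        cp = ord(ch)
        if cp <= 0x7F:        # ASCII range: 1 byte
            total += 1
        elif cp <= 0x7FF:     # 2-byte range
            total += 2
        elif cp <= 0xFFFF:    # rest of the BMP: 3 bytes
            total += 3
        else:                 # supplementary planes: 4 bytes
            total += 4
    return total


# Compare against the encoder on a few sample strings.
samples = ["abc", "caf\u00e9", "\u20ac", "\U0001F600", ""]
for s in samples:
    assert utf8_len(s) == len(s.encode("utf-8"))
```

Whether this beats simply calling len(s.encode("utf-8")) depends on the interpreter and the data, so it is worth timing both before picking one.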