> I don't believe this specific variant has been discussed.

Now that you clarify it: no, it hasn't been discussed. I find that
not surprising - this proposal is so strange and unnatural that
probably nobody dared to suggest it.

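To make sure we are talking about the same thing, here is a rough sketch of the semantics as I understand them. The class name HoleyStr and its text attribute are made up for illustration, the code assumes well-formed input (no lone surrogates), and it uses Python 3 spellings such as int.from_bytes; it is not a proposed implementation.

```python
class HoleyStr:
    """Hypothetical string type indexed by UTF-16 code unit, with 'holes'."""

    def __init__(self, text):
        self.text = text
        # Split the UTF-16 encoding (big-endian, no BOM) into 16-bit code units.
        data = text.encode("utf-16-be")
        self._units = [int.from_bytes(data[i:i + 2], "big")
                       for i in range(0, len(data), 2)]

    def __len__(self):
        # Length is counted in code units, not characters.
        return len(self._units)

    def __getitem__(self, k):
        if not 0 <= k < len(self._units):
            raise IndexError("index out of range")
        unit = self._units[k]
        if 0xDC00 <= unit <= 0xDFFF:
            # Low surrogate: this index is a 'hole'.
            raise IndexError(
                "index %d refers to the second half of a surrogate" % k)
        # A high surrogate drags its partner along: one character, two units.
        n = 2 if 0xD800 <= unit <= 0xDBFF else 1
        raw = b"".join(u.to_bytes(2, "big") for u in self._units[k:k + n])
        return HoleyStr(raw.decode("utf-16-be"))


s = HoleyStr("abcd\U00010000e")   # U+10000 needs a surrogate pair
print(len(s))                     # 7 code units for 6 characters
print(len(s[4]))                  # 2: one character, two code units
try:
    s[5]
except IndexError as exc:
    print("IndexError:", exc)     # a valid-looking index that is a hole
```

With this string, index 5 lies between 0 and len(s) and still raises. Note also that a plain for loop over such an object, which falls back on __getitem__, stops silently at the first hole, because IndexError is what signals the end of a sequence.
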
> s[5] does not exist. You would get an IndexError indicating that it
> refers to the second half of a surrogate.
> [...]
>
> len(s[k]) would be 2 if it involved a surrogate, yes. One character,
> two code units.

Please consider trade-offs. Study advantages and disadvantages.
Compare them. Can you then seriously suggest that indexing should
have 'holes'? That it will be an IndexError if you access with an
index between 0 and len(s)???????

If you absolutely think support for non-BMP characters is necessary
in every program, suggesting that Python use UCS-4 by default on all
systems has a higher chance of finding acceptance (in comparison).

Regards,
Martin