On Apr 20, 7:34 pm, Rhamphoryncus <[EMAIL PROTECTED]> wrote: > On Apr 20, 6:21 pm, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > > > > I don't believe this specific variant has been discussed. > > Now that you clarify it: no, it hasn't been discussed. I find that > > not surprising - this proposal is so strange and unnatural that > > probably nobody dared to suggest it. > > Difficult problems sometimes need unexpected solutions. > > Although Guido seems to be relenting slightly on the O(1) indexing > requirement, so maybe we'll end up with an O(log n) solution (where n > is the number of surrogates, not the length of the string).
The last thing I heard with regards to string indexing was that Guido was very adamant about O(1) indexing. On the other hand, if one is willing to have O(log n) access time (where n is the number of surrogate pairs), it can be done in O(n/logn) space (again where n is the number of surrogate pairs). An early version of the structure can be found here: http://mail.python.org/pipermail/python-3000/2006-September/003937.html I can't seem to find my later post with an updated version (I do have the source somewhere). > If you pick an index at random you will get IndexError. If you > calculate the index using some combination of len, find, index, rfind, > rindex you will be unaffected by my change. You can even assume the > length of a character so long as you know it fits in 16 bits (ie any > '\uxxxx' escape). > > I'm unaware of any practical use cases that would be harmed by my > change, so that leaves only philosophical issues. Considering the > difficulty of the problem it seems like an okay trade-off to me. It is not ok for s[i] to fail for any index within the range of the length of the string. - Josiah -- http://mail.python.org/mailman/listinfo/python-list