On Sun, Aug 19, 2012 at 6:11 PM, Paul Rubin <no.email@nospam.invalid> wrote: > Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> writes: >> result = text[end:] > > if end not near the end of the original string, then this is O(N) > even with fixed-width representation, because of the char copying. > > if it is near the end, by knowing where the string data area > ends, I think it should be possible to scan backwards from > the end, recognizing what bytes can be the beginning of code points and > counting off the appropriate number. This is O(1) if "near the end" > means "within a constant".
Only if you know exactly where the end is (which requires storing and maintaining a character length - this may already be happening, I don't know). But that approach means you need to have code for both ways (forward search or reverse), and of course it relies on your encoding being reverse-scannable in this way (as UTF-8 is, but not all). And of course, taking the *entire* rest of the string isn't the only thing you do. What if you want to take the next six characters after that index? That would be constant time with a fixed-width storage format. ChrisA -- http://mail.python.org/mailman/listinfo/python-list