Paul Rubin wrote:
> "Tuvas" <[EMAIL PROTECTED]> writes:
>
>>I've actually done the tests on this one, it's actually faster to use
>>the += than a list, odd as it may sound.
>
>
> Frederik explained the reason; there's an optimization in Python 2.4
> that I'd forgotten about, for that specific case.  It's not in earlier
> versions.  It's a bit fragile in 2.4:
>
>   a = ''
>   for x in something:
>     a += g(x)
>
> is fast, but if a is aliased, Python can't do the optimization, so
>
>   a = ''
>   b = a
>   for x in something:
>     a += g(x)
>
> is slow.

Is this really true?  After the first time through the loop, 'a' won't
be aliased any more: strings are immutable, so the first += rebinds 'a'
to a new string object that 'b' does not refer to.  After that the loops
should be equivalent.  I tried this out to see if I could see a timing
difference, in case I was missing something.  With Python 2.4.1, the
following two snippets timed essentially the same for N up to 2**20 (I
didn't try any higher):

def concat1():
    a = ''
    for x in ' '*N:
        a += x
    return a

def concat2():
    a = ''
    b = a
    for x in ' '*N:
        a += x
    return a

Regards,
-tim

> Figuring out which case to use relies on CPython's reference
> counting storage allocator (the interpreter keeps track of how many
> pointers there are to any given object) and so the optimization may
> not be feasible at all in other implementations which use different
> storage allocation strategies (e.g. Lisp-style garbage collection).
>
> All in all I think it's best to use a completely different approach
> (something like StringBuffer) but my effort to start a movement here
> on clpy a couple months ago didn't get anywhere.

-- 
http://mail.python.org/mailman/listinfo/python-list
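
For anyone who wants to repeat the comparison, here is a minimal timing
sketch.  It is not the harness used for the numbers above.  It assumes a
current CPython and the standard timeit module, and it uses a smaller N
than 2**20 so it finishes quickly.  The names concat_join and time_all
are made up for this sketch; concat_join is the "use a list" variant
mentioned at the top of the thread.

```python
import timeit

N = 2**16  # assumption: smaller than the 2**20 tried above, to keep runs short


def concat1():
    # Plain += in a loop; 'a' is the only name bound to the growing string.
    a = ''
    for x in ' ' * N:
        a += x
    return a


def concat2():
    # Same loop, but 'a' starts out aliased by 'b'.
    a = ''
    b = a  # alias only matters before the first +=, which rebinds 'a'
    for x in ' ' * N:
        a += x
    return a


def concat_join():
    # Hypothetical comparison case: collect the pieces, join once at the end.
    parts = []
    for x in ' ' * N:
        parts.append(x)
    return ''.join(parts)


def time_all(repeat=3, number=5):
    # Best-of-`repeat` wall time in seconds for `number` calls of each function.
    results = {}
    for fn in (concat1, concat2, concat_join):
        results[fn.__name__] = min(timeit.repeat(fn, repeat=repeat, number=number))
    return results


if __name__ == "__main__":
    for name, secs in time_all().items():
        print("%s: %.4f s" % (name, secs))
```

The absolute numbers depend on the interpreter version and the machine,
so only the relative ordering on a given setup is worth reading.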