I added a _PyUnicodeWriter internal API to optimize str % args and str.format(args). It uses an overallocated buffer, so it's basically like the CPython str += str optimization. I still don't know how efficient it is on Windows, since realloc() is slow there (at least on old Windows versions).
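To make the overallocation idea concrete, here is a toy model in pure Python. It is only a sketch: the class name OverallocatedBuffer, the 25% growth factor and the reallocation counter are assumptions made for illustration, and none of this is the actual C code of _PyUnicodeWriter.

```python
class OverallocatedBuffer:
    """Toy model of an overallocated character buffer (hypothetical,
    not CPython's implementation)."""

    def __init__(self):
        self._buf = []           # backing storage; len(self._buf) is the capacity
        self._size = 0           # number of characters actually written
        self.reallocations = 0   # how many times the storage had to grow

    def write(self, text):
        needed = self._size + len(text)
        if needed > len(self._buf):
            # Grow past what is strictly needed so that later writes often
            # fit without another resize. The 25% factor is an assumption.
            new_capacity = needed + needed // 4
            self._buf.extend([""] * (new_capacity - len(self._buf)))
            self.reallocations += 1
        # Copy the characters of `text` into the reserved slots.
        self._buf[self._size:needed] = text
        self._size = needed

    def getvalue(self):
        # Only the first `_size` slots hold real data.
        return "".join(self._buf[:self._size])


buf = OverallocatedBuffer()
for _ in range(1000):
    buf.write("ab")
result = buf.getvalue()
```

Because the capacity grows geometrically, the number of resizes stays far below the number of writes, which is what makes repeated appends cheap on average.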
We should add an official, public API to concatenate strings. I know that PyPy already has its own API. Example:

    writer = UnicodeWriter()
    for item in data:
        writer += item  # I guess that it's faster than writer.append(item)
    return str(writer)  # or writer.getvalue()?

I don't care about the exact implementation of UnicodeWriter; it just has to be as fast as or faster than ''.join(data).

I don't remember whether _PyUnicodeWriter is faster or slower than StringIO. I created an issue for that:
http://bugs.python.org/issue15612

Victor

2013/2/12 Maciej Fijalkowski <fij...@gmail.com>:
> Hi
>
> We recently encountered a performance issue in stdlib for pypy. It
> turned out that someone committed a performance "fix" that uses += for
> strings instead of "".join() that was there before.
>
> Now this hurts pypy (we can mitigate it to some degree though) and
> possibly Jython and IronPython too.
>
> How do people feel about generally not having += on long strings in
> stdlib (since the refcount = 1 thing is a hack)?
>
> What about other performance improvements in stdlib that are
> problematic for pypy or others?
>
> Personally I would like cleaner code in stdlib vs speeding up CPython.
> Typically that also helps pypy so I'm not unbiased.
>
> Cheers,
> fijal
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com
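For illustration, the UnicodeWriter interface proposed in the reply could be prototyped in pure Python roughly as follows. This is a sketch under stated assumptions: UnicodeWriter does not exist in the stdlib, backing it with io.StringIO is an arbitrary choice rather than anything the thread settles, and the helper name concat is hypothetical. It makes no claim about speed relative to ''.join(data).

```python
import io


class UnicodeWriter:
    """Hypothetical string builder matching the usage sketched in the reply."""

    def __init__(self):
        # Assumption: delegate buffering to io.StringIO.
        self._buffer = io.StringIO()

    def __iadd__(self, text):
        # Supports `writer += item`; must return self so the name stays bound
        # to the writer rather than to None.
        self._buffer.write(text)
        return self

    def append(self, text):
        # Method-call spelling of the same operation.
        self._buffer.write(text)

    def getvalue(self):
        return self._buffer.getvalue()

    def __str__(self):
        return self.getvalue()


def concat(data):
    """Concatenate an iterable of strings without relying on str += str."""
    writer = UnicodeWriter()
    for item in data:
        writer += item
    return str(writer)
```

A wrapper like this behaves the same on CPython, PyPy, Jython and IronPython, since it does not depend on the refcount-based in-place resize that CPython applies to str += str.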