Eli Bendersky <eli...@gmail.com> added the comment:

I wonder if this is a fair comparison, Serhiy. Strings are Unicode underneath, so they have a large overhead per string (more data to copy around). Increasing the length of the strings changes the game because, thanks to PEP 393, the overhead for ASCII-only Unicode strings is constant:
>>> import sys
>>> sys.getsizeof('a')
50
>>> sys.getsizeof(b'a')
34
>>> sys.getsizeof('a' * 1000)
1049
>>> sys.getsizeof(b'a' * 1000)
1033

When re-running your tests with larger chunks, the results are quite interesting:

$ ./python -m timeit -s "import io; d=[b'a'*100,b'bb'*50,b'ccc'*50]*1000" "b=io.BytesIO(); w=b.write" "for x in d: w(x)" "b.getvalue()"
1000 loops, best of 3: 509 usec per loop
$ ./python -m timeit -s "import io; d=['a'*100,'bb'*50,'ccc'*50]*1000" "s=io.StringIO(); w=s.write" "for x in d: w(x)" "s.getvalue()"
1000 loops, best of 3: 282 usec per loop

So, it seems to me that BytesIO could use some optimization!

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15381>
_______________________________________
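For anyone who wants to repeat the comparison from a script rather than the shell, here is a minimal sketch. It assumes CPython 3.3 or later (so that PEP 393 applies), and the helper names fixed_overhead and write_all are made up for illustration; they are not part of the io module or of the original test. Absolute timings will differ from the figures quoted in the comment.

```python
import io
import sys
import timeit


def fixed_overhead(sample, length):
    """Per-object overhead in bytes for `sample` repeated `length` times.

    Assumes a one-byte-per-item payload, i.e. bytes or an ASCII-only str
    on a PEP 393 build of CPython.
    """
    return sys.getsizeof(sample * length) - length


def write_all(stream_factory, chunks):
    """Write every chunk to a fresh in-memory stream and return its contents."""
    stream = stream_factory()
    write = stream.write  # bind once, as in the timeit statements
    for chunk in chunks:
        write(chunk)
    return stream.getvalue()


# Same data as the timeit setup strings in the comment.
byte_chunks = [b'a' * 100, b'bb' * 50, b'ccc' * 50] * 1000
text_chunks = ['a' * 100, 'bb' * 50, 'ccc' * 50] * 1000

if __name__ == "__main__":
    # The overhead should be the same for a 1-item and a 1000-item object.
    for sample in ('a', b'a'):
        sizes = [fixed_overhead(sample, n) for n in (1, 1000)]
        print(type(sample).__name__, "overhead:", sizes)

    # Time both stream types on the larger chunks.
    cases = (("BytesIO", io.BytesIO, byte_chunks),
             ("StringIO", io.StringIO, text_chunks))
    for name, factory, chunks in cases:
        runs = timeit.repeat(lambda: write_all(factory, chunks),
                             number=100, repeat=3)
        print(name, "best of 3:", min(runs) / 100 * 1e6, "usec per loop")
```

Unlike the shell commands, this adds a function call per loop iteration, so it is better suited to comparing BytesIO against StringIO on the same machine than to matching the numbers above.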