[issue15381] Optimize BytesIO to do less reallocations when written, similarly to StringIO

Eli Bendersky Wed, 18 Jul 2012 02:32:50 -0700

Eli Bendersky <[email protected]> added the comment:

I wonder if this is a fair comparison, Serhiy. Strings are Unicode underneath,
so they carry a larger fixed overhead per string (more data to copy around).
Increasing the length of the strings changes the game because, thanks to
PEP 393, the overhead for ASCII-only Unicode strings is constant:


>>> import sys
>>> sys.getsizeof('a')
50
>>> sys.getsizeof(b'a')
34
>>> sys.getsizeof('a' * 1000)
1049
>>> sys.getsizeof(b'a' * 1000)
1033
>>> 
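
To check the "constant overhead" claim over more lengths, here is a minimal sketch. The helper name overhead() is mine, not part of any API, and it assumes a CPython build with PEP 393 (3.3 or later); the absolute numbers vary between versions and platforms, only their constancy matters.

```python
import sys

def overhead(obj):
    # Hypothetical helper: bytes used by the object beyond one byte per item.
    return sys.getsizeof(obj) - len(obj)

lengths = (1, 10, 100, 1000)

# For ASCII-only str and for bytes, each set should collapse to one value
# if the per-object overhead does not depend on the length.
str_overheads = {overhead('a' * n) for n in lengths}
bytes_overheads = {overhead(b'a' * n) for n in lengths}

print("str overhead:  ", str_overheads)
print("bytes overhead:", bytes_overheads)
```

With the sizes shown above, the difference between the two overheads is a fixed 16 bytes, which matters less and less as the chunks grow.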

When re-running your tests with larger chunks, the results are quite 
interesting:

$ ./python -m timeit -s "import io; d=[b'a'*100,b'bb'*50,b'ccc'*50]*1000" \
"b=io.BytesIO(); w=b.write"  "for x in d: w(x)"  "b.getvalue()"
1000 loops, best of 3: 509 usec per loop
$ ./python -m timeit -s "import io; d=['a'*100,'bb'*50,'ccc'*50]*1000" \
"s=io.StringIO(); w=s.write"  "for x in d: w(x)"  "s.getvalue()"
1000 loops, best of 3: 282 usec per loop

So, it seems to me that BytesIO could use some optimization!
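
For anyone who wants to repeat the comparison from inside Python rather than from the shell, a rough sketch follows. The fill() helper and the repeat counts are my own choices, not what the timeit command line uses by default, so the absolute timings will not match the figures above and will differ between builds.

```python
import io
import timeit

def fill(buffer_factory, chunks):
    # Hypothetical helper: write every chunk into a fresh buffer and
    # return the accumulated value, mirroring the timeit statements above.
    buf = buffer_factory()
    write = buf.write
    for chunk in chunks:
        write(chunk)
    return buf.getvalue()

# Same data shapes as in the timeit runs: 3000 chunks, 350000 items total.
bytes_chunks = [b'a' * 100, b'bb' * 50, b'ccc' * 50] * 1000
str_chunks = ['a' * 100, 'bb' * 50, 'ccc' * 50] * 1000

RUNS = 10  # assumption: small count to keep the sketch quick

def best_usec(func):
    # Best of 3 repeats, reported as microseconds per single run.
    return min(timeit.repeat(func, number=RUNS, repeat=3)) / RUNS * 1e6

bytes_usec = best_usec(lambda: fill(io.BytesIO, bytes_chunks))
str_usec = best_usec(lambda: fill(io.StringIO, str_chunks))

print("BytesIO:  %.0f usec per loop" % bytes_usec)
print("StringIO: %.0f usec per loop" % str_usec)
```

Both buffers end up holding the same 350000 ASCII characters, so any timing difference comes from how the two classes grow and copy their internal storage.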

----------

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue15381>
_______________________________________