[issue2857] add codec for java modified utf-8

STINNER Victor Wed, 11 May 2011 12:56:36 -0700

STINNER Victor <victor.stin...@haypocalc.com> added the comment:

Benchmark:
a) ./python -m timeit "(b'\xc3\xa9' * 10000).decode('utf-8')"
b) ./python -m timeit "(''.join( map(chr, range(0, 128)) )*1000).encode('utf-8')"
c) ./python -m timeit "f=open('Misc/ACKS', encoding='utf-8'); acks=f.read(); f.close()" "acks.encode('utf-8')"
d) ./python -m timeit "f=open('Misc/ACKS', 'rb'); acks=f.read(); f.close()" "acks.decode('utf-8')"
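
For anyone who wants to repeat cases a) and b) from a script rather than the command line, here is a rough sketch using the standard timeit module. The helper name best_usec and the loop count of 100 are my own choices for illustration, not part of the original benchmark; cases c) and d) are left out because they need Misc/ACKS from a CPython checkout.

```python
import timeit

def best_usec(stmt, number=100, repeat=3):
    """Smallest per-loop time in microseconds over `repeat` runs.

    Hypothetical helper: `number` is fixed here, whereas the timeit
    command line picks a loop count automatically.
    """
    runs = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(runs) / number * 1e6

# a) decode 10000 two-byte sequences (raw string so that the \x escapes
#    are interpreted inside the timed bytes literal, not here)
decode_2byte = best_usec(r"(b'\xc3\xa9' * 10000).decode('utf-8')")

# b) encode 128000 pure ASCII characters
encode_ascii = best_usec(
    "(''.join(map(chr, range(0, 128))) * 1000).encode('utf-8')"
)

print("a) %.1f usec" % decode_2byte)
print("b) %.1f usec" % encode_ascii)
```

Taking the minimum of the repeated runs matches the "smallest value of 3 runs" convention; the absolute numbers will of course depend on the machine and the build.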


Original -> patched (smallest value of 3 runs):
a) 85.8 usec -> 83.4 usec (-2.8%)
b) 548 usec -> 688 usec (+25.5%)
c) 132 usec -> 144 usec (+9%)
d) 65.9 usec -> 67.3 usec (+2.1%)
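
The percentages are relative to the original timing, i.e. (patched - original) / original. A small sketch to recompute them from the figures above (the function name percent_change is only illustrative):

```python
def percent_change(original, patched):
    """Relative change of the patched timing versus the original, in percent."""
    return (patched - original) / original * 100.0

# (original usec, patched usec) as reported above
results = {
    "a": (85.8, 83.4),
    "b": (548, 688),
    "c": (132, 144),
    "d": (65.9, 67.3),
}

for name, (original, patched) in results.items():
    print("%s) %+.1f%%" % (name, percent_change(original, patched)))
```

A positive value means the patched build is slower; case c) comes out as +9.1% at one decimal place.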

Oh, decoding 2-byte sequences is faster with my patch. Strange :-)

But being 25% slower at encoding pure ASCII text is not good news.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue2857>
_______________________________________