New submission from Serhiy Storchaka <storch...@gmail.com>:

Charmap decoders are less important than the UTF decoders, but they are still
widely used. In Python 3.3, with PEP 393, they became about 4x slower. The
proposed patch restores the performance.

The patch optimizes only the most common case, in which the decoder is
specified by a UCS2 table of length >= 256. Map-based decoders are translated
to table-based ones. UCS1 tables are widened to UCS2 by adding a fake 257th
character.
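
To illustrate the difference between the two decoder forms, here is a minimal
Python-level sketch of turning a map-based decoder into a table-based one. It
is not the patch itself (which works in C); the helper name map_to_table is
hypothetical, and the sketch assumes every map entry is a code point or a
single-character string. Bytes missing from the map get the '\ufffe'
placeholder that the generated codec tables use for undefined positions.

```python
import codecs

def map_to_table(decoding_map, size=256):
    """Build a decoding table (a str indexed by byte value) from a dict.

    Hypothetical helper for illustration only. Assumes each value in
    decoding_map is an int code point or a one-character str.
    """
    chars = []
    for byte in range(size):
        target = decoding_map.get(byte)
        if target is None:
            chars.append('\ufffe')      # placeholder for an unmapped byte
        elif isinstance(target, int):
            chars.append(chr(target))   # code point -> character
        else:
            chars.append(target)        # already a character
    return ''.join(chars)

# A tiny map-based decoder: byte 0x41 -> U+0410, byte 0x42 -> U+0411.
decoding_map = {0x41: 0x0410, 0x42: '\u0411'}
decoding_table = map_to_table(decoding_map)

# Both forms are accepted by codecs.charmap_decode and give the same text.
from_map = codecs.charmap_decode(b'AB', 'strict', decoding_map)[0]
from_table = codecs.charmap_decode(b'AB', 'strict', decoding_table)[0]
```

With a table, the decoder can look up each byte by index instead of doing a
dictionary lookup per byte, which is why the table form is the natural target
for a fast path.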

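Benchmark results (higher is better; the percentage in parentheses is how much
faster the patched build is than that column):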

                             3.2           3.3(vanilla)  3.3(patched)

cp1251    'A'*10000          111 (+10%)    31 (+294%)    122
cp1251    '\xa0'*10000       111 (+8%)     29 (+314%)    120
cp1251    '\u0402'*10000     111 (+6%)     25 (+372%)    118
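
The harness and units behind the table above are not stated in this message.
As a rough sketch of how a comparable measurement could be taken, the
following uses the standard timeit module to time decoding of the same three
inputs. The function name decode_throughput, the repeat counts, and the MB/s
unit are assumptions made for illustration, so its numbers should not be
compared directly with the table.

```python
import timeit

def decode_throughput(encoding, text, repeat=5, number=200):
    """Return a best-of-`repeat` decoding throughput estimate in MB/s.

    Hypothetical helper: encodes `text` once, then times `number` decodes
    per run and keeps the fastest run.
    """
    data = text.encode(encoding)
    timings = timeit.repeat(lambda: data.decode(encoding),
                            repeat=repeat, number=number)
    best = min(timings)
    return len(data) * number / best / 1e6

if __name__ == "__main__":
    # Same inputs as the table: ASCII, a Latin-1 range character, and a
    # character outside Latin-1 (all encodable in cp1251).
    for sample in ('A', '\xa0', '\u0402'):
        rate = decode_throughput('cp1251', sample * 10000)
        print('cp1251 %r*10000: %.0f MB/s' % (sample, rate))
```

Running the same script under each interpreter build gives one column per
build.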

----------
components: Interpreter Core, Unicode
files: decode_charmap.patch
keywords: patch
messages: 161301
nosy: ezio.melotti, haypo, lemburg, pitrou, storchaka
priority: normal
severity: normal
status: open
title: Faster charmap decoding
type: performance
versions: Python 3.3
Added file: http://bugs.python.org/file25664/decode_charmap.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14874>
_______________________________________