Ezio Melotti <ezio.melo...@gmail.com> added the comment:

The attached patch is a proof of concept to see if Steffen's proposal might be viable.
I wrote another normalize_encoding function that implements the algorithm described in msg129259, adjusted the shortcuts, and did some timings. (Note: the function is not tested extensively and might break. It might also be optimized further.)

These are the results:

# $ command
# result with my patch
# result without

wolf@hp:~/dev/py/py3k$ ./python -m timeit "b'x'.decode('latin1')"
1000000 loops, best of 3: 0.626 usec per loop
100000 loops, best of 3: 2.03 usec per loop

wolf@hp:~/dev/py/py3k$ ./python -m timeit "b'x'.decode('latin-1')"
1000000 loops, best of 3: 0.614 usec per loop
1000000 loops, best of 3: 0.616 usec per loop

wolf@hp:~/dev/py/py3k$ ./python -m timeit "b'x'.decode('iso-8859-1')"
1000000 loops, best of 3: 0.993 usec per loop
1000000 loops, best of 3: 0.649 usec per loop

wolf@hp:~/dev/py/py3k$ ./python -m timeit "b'x'.decode('iso8859_1')"
1000000 loops, best of 3: 1.01 usec per loop
100000 loops, best of 3: 2.08 usec per loop

wolf@hp:~/dev/py/py3k$ ./python -m timeit "b'x'.decode('iso_8859_1')"
1000000 loops, best of 3: 0.734 usec per loop
1000000 loops, best of 3: 0.694 usec per loop

wolf@hp:~/dev/py/py3k$ ./python -m timeit "b'x'.decode('utf8')"
1000000 loops, best of 3: 0.728 usec per loop
100000 loops, best of 3: 6.37 usec per loop

----------
Added file: http://bugs.python.org/file20878/issue11303.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11303>
_______________________________________
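For anyone who wants to repeat the comparison on their own build, the shell commands in the timings above can be approximated from within Python using the standard timeit module. This is only a rough sketch and not part of the attached patch: the names SPELLINGS, time_decode and time_all are made up for illustration, and the absolute numbers will depend on the machine and on which shortcuts the interpreter being measured implements.

```python
import timeit

# Encoding spellings taken from the timings in this message.
SPELLINGS = ["latin1", "latin-1", "iso-8859-1", "iso8859_1", "iso_8859_1", "utf8"]


def time_decode(encoding, number=100000, repeat=3):
    """Return the best per-call time, in microseconds, of b'x'.decode(encoding).

    Hypothetical helper: it mirrors `python -m timeit`, which also reports
    the best of several repeats.
    """
    stmt = "b'x'.decode({!r})".format(encoding)
    best_total = min(timeit.repeat(stmt, number=number, repeat=repeat))
    return best_total / number * 1e6


def time_all(spellings=SPELLINGS, number=100000):
    """Time every spelling and return a dict mapping spelling to usec per call."""
    return {name: time_decode(name, number=number) for name in spellings}


if __name__ == "__main__":
    for name, usec in time_all().items():
        print("{:<12} {:.3f} usec per loop".format(name, usec))
```

Running the script once with a patched build and once with an unpatched one should give two columns comparable to those above, although small differences between spellings can easily be lost in measurement noise.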