On 28/08/2011 23:06, "Martin v. Löwis" wrote:
> On 28.08.2011 22:01, Antoine Pitrou wrote:
>>> - the iobench results range from a 2% acceleration (seek operations)
>>> to a 16% slowdown for small-sized reads (4.31 MB/s vs. 5.22 MB/s)
>>> and a 37% slowdown for large-sized reads (154 MB/s vs. 235 MB/s).
>>> The speed difference is probably in the UTF-8 decoder; I have
>>> already restored the "runs of ASCII" optimization and am out of
>>> ideas for further speedups. Again, having to scan the UTF-8 string
>>> twice is probably one cause of the slowdown.
>> I don't think it's the UTF-8 decoder, because I see an even larger
>> slowdown with simpler encodings (e.g. "-E latin1" or "-E utf-16le").
> Those haven't been ported to the new API yet. Consider, for example,
> d9821affc9ee. Before that change, I got 253 MB/s on the 4096 units
> read test; with it, I get 610 MB/s. The trunk gives me 488 MB/s, so
> this is a 25% speedup for PEP 393.
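The "runs of ASCII" optimization mentioned above can be sketched in Python. CPython's real decoder lives in C and handles error callbacks, invalid input, and the PEP 393 representations; this is only an illustration of the fast-path idea, and `decode_utf8_with_ascii_runs` is a hypothetical name:

```python
def decode_utf8_with_ascii_runs(data: bytes) -> str:
    # Sketch of the "runs of ASCII" idea: scan ahead for a maximal run
    # of bytes < 0x80, which map 1:1 to code points and can be copied
    # without going through the full UTF-8 state machine.
    out = []
    i, n = 0, len(data)
    while i < n:
        if data[i] < 0x80:
            j = i
            while j < n and data[j] < 0x80:
                j += 1
            out.append(data[i:j].decode("ascii"))  # fast path: pure ASCII run
            i = j
        else:
            # Slow path: decode one multi-byte sequence (valid UTF-8
            # assumed). The sequence length follows from the lead byte.
            lead = data[i]
            length = 2 if lead < 0xE0 else 3 if lead < 0xF0 else 4
            out.append(data[i:i + length].decode("utf-8"))
            i += length
    return "".join(out)
```

The point of the optimization is that mostly-ASCII input spends nearly all its time in the cheap run-copy branch, which is also why benchmark results depend so much on the character ranges in the test data.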
If I understand correctly, performance now depends heavily on the
characters used? Is a pure ASCII string faster than a string with
characters in the ISO-8859-1 charset? And is the same true for BMP
versus non-BMP characters?
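For context, PEP 393 picks a storage width per string from its widest code point: 1 byte per character for Latin-1, 2 for BMP, 4 for non-BMP. On a PEP 393 build (Python 3.3+, assumed here) the difference is visible with `sys.getsizeof`:

```python
import sys

# Each string is stored with the narrowest unit that fits its widest
# code point, so per-character cost grows with the character range.
ascii_s  = sys.getsizeof("a" * 100)           # ASCII, 1 byte/char
latin1_s = sys.getsizeof("\u00e9" * 100)      # Latin-1, 1 byte/char (larger header)
bmp_s    = sys.getsizeof("\u20ac" * 100)      # BMP, 2 bytes/char
astral_s = sys.getsizeof("\U0001f600" * 100)  # non-BMP, 4 bytes/char

assert ascii_s < latin1_s < bmp_s < astral_s
```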
Do these benchmark tools use only ASCII characters, or also some
ISO-8859-1 characters? Or, better, different Unicode ranges in different
tests?
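A minimal way to get per-range numbers is a sketch along these lines (this is illustrative only, not iobench itself; the buffer size and repeat count are arbitrary choices):

```python
import timeit

# Time UTF-8 decoding of buffers drawn from different Unicode ranges,
# to expose how codec throughput varies with the character range.
samples = {
    "ascii":   "a" * 4096,
    "latin-1": "\u00e9" * 4096,
    "bmp":     "\u20ac" * 4096,
    "non-bmp": "\U0001f600" * 4096,
}
for name, text in samples.items():
    data = text.encode("utf-8")
    seconds = timeit.timeit(lambda: data.decode("utf-8"), number=1000)
    print(f"{name:8s} {len(data) * 1000 / seconds / 1e6:.1f} MB/s")
```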
Victor
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com