On 28/08/2011 23:06, "Martin v. Löwis" wrote:
On 28.08.2011 22:01, Antoine Pitrou wrote:

- the iobench results are between 2% acceleration (seek operations),
   16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
   37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
   difference is probably in the UTF-8 decoder; I have already
   restored the "runs of ASCII" optimization and am out of ideas for
   further speedups. Again, having to scan the UTF-8 string twice
   is probably one cause of slowdown.

I don't think it's the UTF-8 decoder because I see an even larger
slowdown with simpler encodings (e.g. "-E latin1" or "-E utf-16le").

Those haven't been ported to the new API, yet. Consider, for example,
d9821affc9ee. Before that, I got 253 MB/s on the 4096 units read test;
with that change, I get 610 MB/s. The trunk gives me 488 MB/s, so this
is a 25% speedup for PEP 393.

If I understand correctly, performance now depends heavily on which characters are used? Is a pure ASCII string faster than a string containing characters from the ISO-8859-1 charset? Is the same true for BMP characters vs non-BMP characters?

Do these benchmark tools use only ASCII characters, or also some ISO-8859-1 characters? Or, better, different Unicode ranges in different tests?
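
To make the question concrete, here is a rough sketch of the kind of per-range comparison I have in mind. It is not taken from iobench: the SAMPLES table and the helper names are made up for illustration, and it assumes a CPython build where PEP 393 is in place (3.3 or later), so that strings are stored with 1, 2 or 4 bytes per character depending on the largest code point. The timings it prints vary by machine and say nothing definitive on their own.

```python
import sys
import timeit

# Hypothetical sample texts, one per Unicode range of interest.
SAMPLES = {
    "ascii": "a" * 4096,             # code points below 128
    "latin1": "\xe9" * 4096,         # code points below 256
    "bmp": "\u20ac" * 4096,          # code points below 65536
    "non-bmp": "\U0001F600" * 4096,  # code points from 65536 up
}


def bytes_per_char(ch):
    """Estimate storage per character for strings made only of `ch`.

    Comparing two lengths cancels out the fixed object header.
    Assumes CPython with the PEP 393 string representation.
    """
    n = 1000
    return (sys.getsizeof(ch * 2 * n) - sys.getsizeof(ch * n)) // n


def decode_time(text, encoding="utf-8", number=1000):
    """Time `number` decodes of `text` encoded with `encoding`."""
    data = text.encode(encoding)
    return timeit.timeit(lambda: data.decode(encoding), number=number)


if __name__ == "__main__":
    for name, text in SAMPLES.items():
        width = bytes_per_char(text[0])
        seconds = decode_time(text)
        print("%-8s %d byte(s)/char, %.4f s for 1000 decodes"
              % (name, width, seconds))
```

Running the same loop with encoding="latin1" or "utf-16-le" for the ranges those codecs can represent would show whether the slowdown follows the codec or the character range.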

Python-Dev mailing list
