On Tuesday, 23 August 2011 at 13:51 +0200, "Martin v. Löwis" wrote:
> > This optimization was done when trying to improve the speed of text I/O.
> 
> So what speedup did it achieve, for the kind of data you talked about?

Since I don't have the numbers anymore, I saved the contents of
https://linuxfr.org/news/le-noyau-linux-est-disponible-en-version%C2%A030
as a "linuxfr.html" file and ran:

$ ./python -m timeit "with open('linuxfr.html', encoding='utf8') as f: f.read()"
1000 loops, best of 3: 859 usec per loop

After disabling the fast path, I ran the micro-benchmark again:

$ ./python -m timeit "with open('linuxfr.html', encoding='utf8') as f: f.read()"
1000 loops, best of 3: 1.09 msec per loop

so the fast path buys roughly a 20% speedup (859 usec vs. 1.09 msec
per loop) on this mostly-ASCII input.
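
For context, the fast path in question is essentially an ASCII scan
done one machine word at a time. Here is a minimal sketch of the idea
(names and details are illustrative only; the real code in
unicodeobject.c differs in error handling and output representation):

/* Rough sketch of a word-at-a-time ASCII fast path for a UTF-8
 * decoder -- illustrative, not the actual CPython code. */
#include <stdint.h>
#include <string.h>

static size_t
ascii_fast_path(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t i = 0;
    /* Copy 8 bytes per iteration as long as they are all ASCII. */
    while (i + sizeof(uint64_t) <= len) {
        uint64_t chunk;
        memcpy(&chunk, in + i, sizeof chunk);   /* alignment-safe load */
        if (chunk & UINT64_C(0x8080808080808080))
            break;                              /* high bit set somewhere */
        memcpy(out + i, in + i, sizeof chunk);
        i += sizeof chunk;
    }
    /* Finish the tail byte by byte. */
    while (i < len && in[i] < 0x80) {
        out[i] = in[i];
        i++;
    }
    return i;  /* bytes consumed; the generic decoder takes over here */
}

As long as the input stays pure ASCII, each iteration moves 8 bytes
with a single mask test, which is where the gain on this kind of HTML
comes from.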

> > Do you have three copies of the UTF-8 decoder already, or do you a use a
> > stringlib-like approach?
> 
> It's a single implementation - see for yourself.

So why would you need three separate implementations of the unrolled
loop? You already have a macro named WRITE_FLEXIBLE_OR_WSTR.
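
For illustration, I imagine a macro of that sort looks something like
the following hypothetical reconstruction (not the actual PEP 393
code; the real macro presumably also covers the legacy wchar_t buffer,
hence the _OR_WSTR part of the name):

#include <stdint.h>

/* Hypothetical reconstruction of a WRITE_FLEXIBLE_OR_WSTR-style
 * macro: a single per-character store dispatching on the target
 * representation.  Not the actual PEP 393 code. */
#define WRITE_FLEXIBLE_OR_WSTR(kind, buf, pos, ch)                      \
    do {                                                                \
        switch (kind) {                                                 \
        case 1: ((uint8_t  *)(buf))[(pos)] = (uint8_t)(ch);  break;     \
        case 2: ((uint16_t *)(buf))[(pos)] = (uint16_t)(ch); break;     \
        case 4: ((uint32_t *)(buf))[(pos)] = (uint32_t)(ch); break;     \
        }                                                               \
        (pos)++;                                                        \
    } while (0)

With the store hidden behind such a macro, the unrolled loop itself
only needs to exist once.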

By the way, even leaving the unrolled loop aside, I wonder how much
slower UTF-8 decoding becomes with that approach. Instead of testing
the "kind" variable at each loop iteration, a stringlib-like approach
may be a better deal IMO.
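
Concretely, a stringlib-like layout would stamp the loop body out once
per representation and dispatch on "kind" a single time per call
rather than once per character. A rough sketch, with made-up names:

#include <stddef.h>
#include <stdint.h>

/* Sketch only: the loop body is expanded once per representation, so
 * the branch on "kind" happens per call instead of per character. */
#define DEFINE_FILL_LOOP(NAME, CHAR_TYPE)                               \
    static void NAME(const uint32_t *src, size_t n, void *dest)         \
    {                                                                   \
        CHAR_TYPE *out = dest;                                          \
        for (size_t i = 0; i < n; i++)                                  \
            out[i] = (CHAR_TYPE)src[i];  /* no kind test in the loop */ \
    }

DEFINE_FILL_LOOP(fill_ucs1, uint8_t)
DEFINE_FILL_LOOP(fill_ucs2, uint16_t)
DEFINE_FILL_LOOP(fill_ucs4, uint32_t)

static void
fill(const uint32_t *src, size_t n, void *dest, int kind)
{
    switch (kind) {            /* one dispatch per call */
    case 1: fill_ucs1(src, n, dest); break;
    case 2: fill_ucs2(src, n, dest); break;
    case 4: fill_ucs4(src, n, dest); break;
    }
}

The actual stringlib gets the same effect by re-#including a template
header with different typedefs, but the upshot is the same: the hot
loop contains no per-character branch on the representation.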

Of course we would first need to have various benchmark numbers once the
current PEP 393 implementation is complete.

Regards

Antoine.

