Re: [Python-Dev] Python 3.0.1 (io-in-c)

Antoine Pitrou Wed, 28 Jan 2009 08:23:41 -0800

Paul Moore <p.f.moore <at> gmail.com> writes:
> >
> > As I pointed out, utf-8, utf-16 and latin1 decoders have already been
> > optimized in py3k. For *pure ASCII* input, utf-8 decoding is blazingly
> > fast (1GB/s here). The dataset for iobench isn't pure ASCII though, and
> > that's why it's not as fast.
> 
> Ah, thanks. Although you said your data was 95% ASCII, and you're
> getting decode speeds of 250MB/s. That's a 75% slowdown for 5% of the
> data! Surely that's not right???


If you look at how utf-8 decoding is implemented (in unicodeobject.c), it's
quite obvious why it is so :-) There is a (very) fast path for chunks of pure
ASCII data, and a (fast but not blazingly fast) fallback for non-ASCII data.
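
As a rough illustration of that two-path structure, here is a minimal sketch
in Python. It is not the C code from unicodeobject.c; the function name
decode_utf8_sketch is made up for this example, and the real decoder is
organised differently in detail. It only shows the idea: runs of ASCII bytes
are converted in one step, while each multi-byte sequence is handled
individually.

```python
def decode_utf8_sketch(data: bytes) -> str:
    """Illustrative only: bulk-convert ASCII runs, decode other sequences one by one."""
    out = []
    i, n = 0, len(data)
    while i < n:
        if data[i] < 0x80:
            # Fast path: find the end of the ASCII run and convert it in one step.
            j = i + 1
            while j < n and data[j] < 0x80:
                j += 1
            out.append(data[i:j].decode("ascii"))
            i = j
        else:
            # Fallback: derive the sequence length from the lead byte,
            # then decode that single character.
            lead = data[i]
            if lead >> 5 == 0b110:
                length = 2
            elif lead >> 4 == 0b1110:
                length = 3
            elif lead >> 3 == 0b11110:
                length = 4
            else:
                raise ValueError(f"invalid lead byte at offset {i}")
            out.append(data[i:i + length].decode("utf-8"))
            i += length
    return "".join(out)
```

In this sketch every non-ASCII character ends the current ASCII run, so input
with non-ASCII characters sprinkled throughout spends less of its time in the
bulk path than the raw percentage of ASCII bytes might suggest.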

Please don't think of it as a slowdown... It's still much faster than 2.x, which
manages 130MB/s on the same data.
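
For anyone who wants to compare the two cases on their own machine, a small
timing sketch follows. The helper names make_sample and throughput_mb_s are
invented for this example, and the synthetic data (one "é" every 20
characters) is only an assumed stand-in for a "95% ASCII" input, not the
iobench dataset. The figures it reports will depend on the Python version and
hardware.

```python
import time

def make_sample(size: int, non_ascii_every: int = 0) -> bytes:
    """Build at least `size` bytes of UTF-8; non_ascii_every=0 means pure ASCII."""
    if non_ascii_every <= 0:
        unit = "a" * 20
    else:
        # One two-byte character after every (non_ascii_every - 1) ASCII ones.
        unit = "a" * (non_ascii_every - 1) + "\u00e9"
    text = unit * (size // len(unit.encode("utf-8")) + 1)
    return text.encode("utf-8")

def throughput_mb_s(data: bytes, repeats: int = 5) -> float:
    """Return the best observed utf-8 decode throughput in MB/s."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        data.decode("utf-8")
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a coarse timer.
    return len(data) / max(best, 1e-9) / 1e6

if __name__ == "__main__":
    for label, every in (("pure ASCII", 0), ("1 in 20 non-ASCII", 20)):
        sample = make_sample(50_000_000, every)
        print(f"{label}: {throughput_mb_s(sample):.0f} MB/s")
```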

Regards

Antoine.

