2009/1/28 Antoine Pitrou <solip...@pitrou.net>:
> Paul Moore <p.f.moore <at> gmail.com> writes:
>>
>> It would be helpful to limit this cost as much as possible - maybe
>> that's simply ensuring that the default encoding for open is (in the
>> majority of cases) a highly-optimised one whose costs *don't* dominate
>> in the way you describe
>
> As I pointed out, utf-8, utf-16 and latin1 decoders have already been
> optimized in py3k. For *pure ASCII* input, utf-8 decoding is blazingly
> fast (1GB/s here). The dataset for iobench isn't pure ASCII though, and
> that's why it's not as fast.

Ah, thanks. Although you said your data was 95% ASCII, and you're
getting decode speeds of 250MB/s. That's a 75% slowdown for 5% of the
data! Surely that's not right???
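
For anyone who wants to check figures like these on their own machine, here
is a rough sketch of a micro-benchmark. It is not iobench and not the io-c
branch: the helper names (decode_throughput, make_sample, slowdown) and the
synthetic "one non-ASCII character in twenty" sample are assumptions made for
illustration, and the numbers will vary with interpreter version and build
options.

```python
import time


def slowdown(fast_mb_s: float, slow_mb_s: float) -> float:
    """Fractional slowdown going from fast_mb_s to slow_mb_s."""
    return 1.0 - slow_mb_s / fast_mb_s


def make_sample(n_chars: int, period: int = 20) -> bytes:
    """UTF-8 bytes in which one character in every `period` is non-ASCII."""
    chunk = "a" * (period - 1) + "\u00e9"  # 19 ASCII chars + one 'e acute'
    return (chunk * (n_chars // period)).encode("utf-8")


def decode_throughput(data: bytes, repeats: int = 5) -> float:
    """Best-of-`repeats` UTF-8 decode throughput, in MB of input per second."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        data.decode("utf-8")
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading on very coarse timers.
    return len(data) / max(best, 1e-9) / 1e6


if __name__ == "__main__":
    # The arithmetic from the message: 1GB/s down to 250MB/s.
    print("quoted slowdown: %.0f%%" % (100 * slowdown(1000, 250)))

    pure_ascii = b"a" * 1_000_000
    mostly_ascii = make_sample(1_000_000)  # about 95% ASCII characters
    print("pure ASCII: %.0f MB/s" % decode_throughput(pure_ascii))
    print("95%% ASCII:  %.0f MB/s" % decode_throughput(mostly_ascii))
```

A sample with evenly spaced non-ASCII characters is probably close to a worst
case for a decoder with an ASCII fast path, since no long pure-ASCII run is
ever available; real text may behave differently.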

> People are invited to test their own workloads with the io-c branch and report
> performance figures (and possible bugs). There are so many possibilities that
> the benchmark figures given by a generic tool can only be indicative.

At the moment, I don't have the time to download and build the branch,
and in any case, as I only have Visual Studio Express, I don't get
PGO (profile-guided optimisation), making any tests I do highly suspect.

Paul.

PS Can anyone comment on why Python defaults to utf-8 on Windows? That
seems like a highly suspect default...
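
One way to see which default is actually in effect is to ask a text-mode file
object directly. The sketch below uses only standard-library calls; the helper
names (default_text_encoding, roundtrip) are made up for illustration, and the
value reported depends on platform, locale settings and Python version.

```python
import os
import tempfile


def default_text_encoding() -> str:
    """Encoding that open() picks in text mode when none is passed."""
    with open(os.devnull) as f:
        return f.encoding


def roundtrip(path: str, text: str, encoding: str = "utf-8") -> str:
    """Write and re-read text with an explicit encoding, ignoring the default."""
    with open(path, "w", encoding=encoding) as f:
        f.write(text)
    with open(path, "r", encoding=encoding) as f:
        return f.read()


if __name__ == "__main__":
    print("default text encoding:", default_text_encoding())
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")
        print(repr(roundtrip(path, "caf\u00e9\n")))
```

Passing encoding explicitly, as in roundtrip, sidesteps the question of what
the default is on any given platform.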
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev