Collin Winter <coll...@gmail.com> added the comment:

On Wed, Jun 3, 2009 at 2:36 AM, Marc-Andre Lemburg <rep...@bugs.python.org> wrote:
> Marc-Andre Lemburg <m...@egenix.com> added the comment:
>
>> All this is assuming the speed-up is important enough to bother. Has
>> anyone run a comparison benchmark using the Unladen Swallow benchmarks?
>>
>> I trust those much more than micro-benchmarks (including, I assume,
>> stringbench.py). I do expect that reducing the number of allocations
>> for short-to-medium-size strings from 2 to 1 would be a significant
>> speed-up, but I can't guess how much.
>
> While the Unladen Swallow project aims at providing high-level
> benchmarks, its current state doesn't really deliver on that promise
> (yet).
>
> If you look at the list of benchmarks they use, most appear to be
> dealing with pickling. That doesn't strike me as particularly useful
> for testing real-life Python usage.
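For concreteness, the kind of micro-benchmark at issue here — timing the
creation of short-to-medium-size unicode strings — could be sketched with
timeit as below. This is a minimal sketch, not stringbench.py itself; the
sizes and iteration counts are illustrative only, and the syntax is
Python 2.x to match the interpreter under discussion:

    import timeit

    # Time creation of unicode strings of various lengths. The change
    # under discussion would cut the number of heap allocations per new
    # string object from 2 to 1, which should show up most clearly at
    # small sizes, where allocation dominates the memcpy.
    for size in (1, 10, 100, 1000):
        timer = timeit.Timer("s * %d" % size, setup="s = u'a'")
        # best-of-3 runs of 1,000,000 string creations each, in seconds
        print size, min(timer.repeat(repeat=3, number=1000000))

(The multiplicand is kept in a variable so the peephole optimizer cannot
constant-fold the expression away.)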
I would take issue with your characterization of those benchmarks.
There are several benchmarks for cPickle, true, both macro and micro
benchmarks, but I would not describe their number as "most of [our]
benchmarks". For example, "slowpickle" and "slowunpickle" both use the
pure-Python pickle.py, and test how close we can get that implementation
to the tuned cPickle version (a sketch of this kind of comparison
follows at the end of this message).

Regardless, I don't know that any of our benchmarks really stress
unicode performance. So far we haven't focused on improving the
performance of unicode objects. 2to3 uses unicode internally, so that
might be a good benchmark to run.

> If a high-level benchmark is indeed what's wanted, then they should
> set up pre-configured Django and Zope instances and run those through
> a series of real-life usage scenarios to cover the web application
> use space.

We have benchmarks for Django and Spitfire templates, both of which are
heavily used in the web application space. We focused on template
languages because, in talking to Google's web app teams, we found that
their primary CPU bottlenecks were templating systems, not ORMs or
other components.

> For scientific use cases, it would be good to have similar
> setups using BioPython, NumPy and matplotlib. And so on. Much like
> the high-level benchmarks you have in the Windows world.

We have NumPy in our correctness test suite, but no benchmarks based on
it. All of the packages you just named make heavy use of C/C++
extensions (with BioPython and matplotlib both depending on NumPy) or
large C libraries (matplotlib can depend on Cairo, I see). We've been
focusing on pure-Python performance, so I'm skeptical that benchmarks
with such large C components would be a useful guide for our work. I'm
happy to talk about this further outside of this thread, though.

Collin

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1943>
_______________________________________
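The slowpickle-style comparison mentioned above could be sketched as
follows. This is a minimal sketch assuming Python 2.x: the payload,
protocol, and iteration counts are made up for illustration, and the
actual Unladen Swallow benchmarks use their own harness and data sets:

    import timeit
    import pickle   # pure-Python implementation, exercised by "slowpickle"
    import cPickle  # tuned C implementation (Python 2.x only)

    # Hypothetical payload; the real benchmarks use their own data.
    payload = dict(('key%d' % i, [i, float(i), 'value%d' % i])
                   for i in range(1000))

    for module in (pickle, cPickle):
        timer = timeit.Timer(lambda: module.dumps(payload, 2))
        # best-of-3 runs of 100 dumps each, in seconds
        print module.__name__, min(timer.repeat(repeat=3, number=100))

On stock CPython the pure-Python pickle.py is typically slower than
cPickle by an order of magnitude or more; the gap between the two is
what a slowpickle-style benchmark tracks over time.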