Hello Ian,
On 25/07/11 11:00, Ian Ozsvald wrote:
> Dear all, I've published v0.2 of my High Performance Python tutorial
> write-up from the session I ran at EuroPython:
> http://ianozsvald.com/2011/07/25/high-performance-python-tutorial-v0-2-from-europython-2011/
Today Armin and I investigated the performance of the Mandelbrot algorithm
that you wrote for your tutorial a bit further. What we found is very
interesting :-).
We compared three versions of the code:
- a (slightly modified) pure python one on PyPy
- the Cython one using calculate_z.pyx_2_bettermath
- the shedskin one, using shedskin2.py
The PyPy version looks like this:
def calculate_z_serial_purepython(q, maxiter, z):
    """Pure python with complex datatype, iterating over list of q and z"""
    output = [0] * len(q)
    for i in range(len(q)):
        zi = z[i]
        qi = q[i]
        for iteration in range(maxiter):
            zi = zi * zi + qi
            if (zi.real*zi.real + zi.imag*zi.imag) > 4.0:
                output[i] = iteration
                break
    return output
i.e., it is exactly the same as pure_python_2.py, except that we avoid using
abs(zi), so that it is comparable with the Cython and Shedskin versions.
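The abs(zi) replacement works because |z| > 2 is the same test as
real^2 + imag^2 > 4, just without the square root. A minimal sketch of the two
checks side by side (the helper names are made up for illustration, they are
not from the tutorial):

```python
def escaped_abs(z):
    # Escape test via abs(), which computes a square root internally.
    return abs(z) > 2.0

def escaped_squared(z):
    # Same test on the squared magnitude, as in the loop above.
    return (z.real * z.real + z.imag * z.imag) > 4.0

# A few sample points inside, on, and outside the radius-2 circle.
samples = [0j, 1 + 1j, 2 + 0j, 1.5 + 1.5j, -3 + 0.1j]
agree = all(escaped_abs(z) == escaped_squared(z) for z in samples)
print(agree)
```

Points that sit within rounding error of the boundary could in principle be
classified differently by the two tests, which should not matter for timing.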
First, we ran the programs passing "1000 1000" as arguments, and these are
the results:
PyPy: 1.95 secs
Cython: 0.58 secs
Shedskin: 0.42 secs
so, PyPy is ~4.5x slower than Shedskin.
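The ratio is simply the quotient of the two wall-clock times. As a small
sketch using the numbers above (the slowdown helper is hypothetical, only
there to make the arithmetic explicit):

```python
# Timings in seconds, copied from the "1000 1000" run above.
timings = {"PyPy": 1.95, "Cython": 0.58, "Shedskin": 0.42}

def slowdown(timings, name, baseline):
    """How many times longer `name` takes than `baseline`."""
    return timings[name] / timings[baseline]

for name in sorted(timings):
    ratio = slowdown(timings, name, "Shedskin")
    print("%s: %.2fx the Shedskin time" % (name, ratio))
```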
However, we realized that with the default values for x1,x2,y1,y2, the
innermost loop runs only a few iterations most of the time. This is one of the
cases in which PyPy suffers most, because it needs to go through a bridge to
continue the execution, and at the moment bridges are slower than loops.
So, we changed the values of x1,x2,y1,y2 to compute a different region, in
which the innermost loop runs more frequently. We used these values:
x1, x2 = 0.37865401-0.02, 0.37865401+0.02
y1, y2 = 0.669227668-0.02, 0.669227668+0.02
and since all programs compute this image faster, we used "3000 3000" as
arguments on the command line. These are the results:
PyPy: 0.89 secs
Cython: 1.76 secs
Shedskin: 0.26 secs
So, in this case, PyPy is ~2x faster than Cython and ~3.5x slower than Shedskin.
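A rough way to see why the region matters is to count how many inner-loop
iterations each point needs. The sketch below is not the tutorial's code:
escape_counts and short_fraction are hypothetical helpers, z is assumed to
start at 0, the grid is kept tiny so it runs quickly, and the "wide" view is
only an assumed stand-in for the defaults, which are not listed in this mail.

```python
def escape_counts(x1, x2, y1, y2, width, height, maxiter):
    """Return, for each grid point, how many inner-loop iterations ran."""
    counts = []
    for row in range(height):
        y = y1 + (y2 - y1) * row / float(height - 1)
        for col in range(width):
            x = x1 + (x2 - x1) * col / float(width - 1)
            q = complex(x, y)
            z = 0j  # ASSUMPTION: every point starts from z = 0
            ran = maxiter
            for iteration in range(maxiter):
                z = z * z + q
                if (z.real * z.real + z.imag * z.imag) > 4.0:
                    ran = iteration + 1
                    break
            counts.append(ran)
    return counts

def short_fraction(counts, threshold):
    """Fraction of points whose inner loop ran at most `threshold` times."""
    return sum(1 for c in counts if c <= threshold) / float(len(counts))

# Region given in this mail.
zoomed = escape_counts(0.37865401 - 0.02, 0.37865401 + 0.02,
                       0.669227668 - 0.02, 0.669227668 + 0.02,
                       40, 40, 100)
# ASSUMPTION: a wide view of the whole set, standing in for the defaults.
wide = escape_counts(-2.0, 1.0, -1.5, 1.5, 40, 40, 100)

print("wide:   %.2f of points leave within 5 iterations" % short_fraction(wide, 5))
print("zoomed: %.2f of points leave within 5 iterations" % short_fraction(zoomed, 5))
```

In the wide view a large share of the points leave the inner loop almost
immediately, while in the zoomed region the loop tends to run much longer.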
In the meantime, Armin wrote a C version of it:
http://paste.pocoo.org/raw/504216/
which took 0.946 seconds to complete. This is in line with PyPy's result,
but we are still investigating why the Shedskin version is so much faster.
ciao,
Anto
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
http://mail.python.org/mailman/listinfo/pypy-dev