Hello Keith,
While I also echo Johann's points about the arbitrariness and non-utility of
benchmarking, I'll briefly comment on just a few tests to help out with
getting things into idiomatic python/numpy:
Tests 1 and 2 (empty for loop and empty procedure) are fairly pointless: they
won't actually influence the running time of well-written non-pathological code.
Test 3:
#Test 3 - Add 200000 scalar ints
nrep = 2000000 * scale_factor
for i in range(nrep):
    a = i + 1
Well, python looping is slow... one doesn't write such loops in idiomatic code
if the underlying intent can be recast as array operations in numpy. But here
the test is on such a simple operation that it's not clear how to recast it in
a way that would remain a reasonable comparison. Ideally you'd test something like:
i = numpy.arange(200000)
for j in range(scale_factor):
    a = i + 1
but that sort of changes what the test is testing.
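To get a rough feel for the gap on your own machine, something like the sketch below could be used. The function names (scalar_loop, array_add) and the use of the standard-library timeit module are my own choices for illustration, not anything from time_test3.py, and the numbers will vary a lot between machines.

```python
import timeit

import numpy

n = 200000  # number of additions, matching the array size used above


def scalar_loop():
    # One Python-level addition per iteration, in the style of test 3.
    for i in range(n):
        a = i + 1
    return a


def array_add():
    # A single vectorized addition over all n elements.
    i = numpy.arange(n)
    return i + 1


# Take the best of three runs for each variant to reduce timing noise.
loop_time = min(timeit.repeat(scalar_loop, number=1, repeat=3))
array_time = min(timeit.repeat(array_add, number=1, repeat=3))
print("python loop: %.4f s, numpy: %.4f s" % (loop_time, array_time))
```

Both variants perform the same n additions; only the place where the loop runs (the interpreter versus numpy's compiled code) differs.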
Finally, test 21:
#Test 21 - Smooth 512 by 512 byte array, 5x5 boxcar
for i in range(nrep):
    b = scipy.ndimage.filters.median_filter(a, size=(5, 5))
timer.log('Smooth 512 by 512 byte array, 5x5 boxcar, %d times' % nrep)
A median filter is definitely NOT a boxcar filter! You want "uniform_filter":
In [4]: a = numpy.empty((1000,1000))
In [5]: timeit scipy.ndimage.filters.median_filter(a, size=(5, 5))
10 loops, best of 3: 93.2 ms per loop
In [6]: timeit scipy.ndimage.filters.uniform_filter(a, size=(5, 5))
10 loops, best of 3: 27.7 ms per loop
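If it helps to see the difference between the two filters rather than just the timings, here is a small sketch on a made-up test image (a single spike on a zero background; the array and variable names are only for illustration). It calls the functions through the top-level scipy.ndimage namespace, which exposes the same uniform_filter and median_filter.

```python
import numpy
from scipy import ndimage

# Hypothetical test image: zeros with a single spike of height 25 in the center.
a = numpy.zeros((9, 9))
a[4, 4] = 25.0

# Boxcar smoothing: each output pixel is the mean of its 5x5 neighborhood,
# so the spike is spread evenly over a 5x5 block.
boxcar = ndimage.uniform_filter(a, size=(5, 5))

# Median filtering: each output pixel is the median of its 5x5 neighborhood,
# so an isolated spike is discarded rather than spread out.
median = ndimage.median_filter(a, size=(5, 5))

print(boxcar[2:7, 2:7])
print(median.max())
```

The boxcar result keeps the total intensity and smears it over the window, while the median result removes the spike entirely, so the two are not interchangeable in a benchmark labelled "boxcar".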
Zach
On Sep 26, 2011, at 10:19 AM, Keith Hughitt wrote:
> Hi all,
>
> Myself and several colleagues have recently started work on a Python library
> for solar physics, in order to provide an alternative to the current mainstay
> for solar physics, which is written in IDL.
>
> One of the first steps we have taken is to create a Python port of a popular
> benchmark for IDL (time_test3) which measures performance for a variety of
> (primarily matrix) operations. In our initial attempt, however, Python
> performs significantly poorer than IDL for several of the tests. I have
> attached a graph which shows the results for one machine: the x-axis is the
> test # being compared, and the y-axis is the time it took to complete the
> test, in milliseconds. While it is possible that this is simply due to
> limitations in Python/Numpy, I suspect that this is due at least in part to
> our lack in familiarity with NumPy and SciPy.
>
> So my question is, does anyone see any places where we are doing things very
> inefficiently in Python?
>
> In order to try and ensure a fair comparison between IDL and Python there are
> some things (e.g. the style of timing and output) which we have deliberately
> chosen to do a certain way. In other cases, however, it is likely that we
> just didn't know a better method.
>
> Any feedback or suggestions people have would be greatly appreciated.
> Unfortunately, due to the proprietary nature of IDL, we cannot share the
> original version of time_test3, but hopefully the comments in time_test3.py
> will be clear enough.
>
> Thanks!
> Keith
> <sunpy_time_test3_idl_python_2011-09-26.png>
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion