Hello Keith,

While I echo Johann's points about the arbitrariness and limited utility of 
benchmarking, I'll briefly comment on just a few tests to help out with 
getting things into idiomatic Python/NumPy:

Tests 1 and 2 (an empty for loop and an empty procedure call) are fairly 
pointless: they won't actually influence the running time of well-written, 
non-pathological code.

Test 3: 
    #Test 3 - Add 200000 scalar ints
    nrep = 2000000 * scale_factor
    for i in range(nrep):
        a = i + 1

Well, Python looping is slow... one doesn't write such loops in idiomatic 
code if the underlying intent can be re-cast as array operations in NumPy. 
But here the test is of such a simple operation that it's not clear how to 
recast it in a way that would remain reasonable. Ideally you'd test 
something like:
    import numpy
    i = numpy.arange(200000)
    for j in range(scale_factor):
        a = i + 1

but that sort of changes what the test is testing.
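To put a rough number on the loop-vs-array difference, here's a quick 
sketch with timeit (the element count matches the test's comment; the 
repeat count is just for illustration):

```python
import timeit

# Pure-Python loop: one interpreted add (plus name binding) per element.
loop_time = timeit.timeit(
    "for i in range(200000):\n    a = i + 1",
    number=10)

# NumPy: a single vectorized add over the whole 200000-element array.
vec_time = timeit.timeit(
    "a = i + 1",
    setup="import numpy; i = numpy.arange(200000)",
    number=10)

print("loop: %.4f s, vectorized: %.4f s" % (loop_time, vec_time))
```

On any recent machine the vectorized version should come out well ahead, 
which is exactly why the recast version no longer tests what the original 
loop was testing.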


Finally, test 21:
    #Test 21 - Smooth 512 by 512 byte array, 5x5 boxcar
    for i in range(nrep):
        b = scipy.ndimage.filters.median_filter(a, size=(5, 5))
    timer.log('Smooth 512 by 512 byte array, 5x5 boxcar, %d times' % nrep)

A median filter is definitely NOT a boxcar filter! You want "uniform_filter":

In [4]: a = numpy.empty((1000,1000))

In [5]: timeit scipy.ndimage.filters.median_filter(a, size=(5, 5))
10 loops, best of 3: 93.2 ms per loop

In [6]: timeit scipy.ndimage.filters.uniform_filter(a, size=(5, 5))
10 loops, best of 3: 27.7 ms per loop
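And to convince yourself that uniform_filter really is the boxcar (i.e. a 
local mean, equivalent to convolving with a constant 1/25 kernel) while the 
median filter is something else entirely, a small check (array contents are 
arbitrary, just for illustration):

```python
import numpy
from scipy import ndimage

a = numpy.random.rand(64, 64)

# A 5x5 boxcar is just convolution with a constant 1/25 kernel.
boxcar = ndimage.convolve(a, numpy.ones((5, 5)) / 25.0)

# uniform_filter computes the same local mean, only faster.
uniform = ndimage.uniform_filter(a, size=5)

# median_filter takes the local median instead -- a different operation.
median = ndimage.median_filter(a, size=5)

print(numpy.allclose(boxcar, uniform))   # same result as the convolution
print(numpy.allclose(median, uniform))   # median is not a boxcar
```

(Both convolve and uniform_filter use the same 'reflect' boundary handling 
by default, so the agreement holds right up to the edges.)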

Zach


On Sep 26, 2011, at 10:19 AM, Keith Hughitt wrote:

> Hi all,
> 
> Myself and several colleagues have recently started work on a Python library 
> for solar physics, in order to provide an alternative to the current mainstay 
> for solar physics, which is written in IDL.
> 
> One of the first steps we have taken is to create a Python port of a popular 
> benchmark for IDL (time_test3) which measures performance for a variety of 
> (primarily matrix) operations. In our initial attempt, however, Python 
> performs significantly poorer than IDL for several of the tests. I have 
> attached a graph which shows the results for one machine: the x-axis is the 
> test # being compared, and the y-axis is the time it took to complete the 
> test, in milliseconds. While it is possible that this is simply due to 
> limitations in Python/Numpy, I suspect that this is due at least in part to 
> our lack in familiarity with NumPy and SciPy.
> 
> So my question is, does anyone see any places where we are doing things very 
> inefficiently in Python?
> 
> In order to try and ensure a fair comparison between IDL and Python there are 
> some things (e.g. the style of timing and output) which we have deliberately 
> chosen to do a certain way. In other cases, however, it is likely that we 
> just didn't know a better method.
> 
> Any feedback or suggestions people have would be greatly appreciated. 
> Unfortunately, due to the proprietary nature of IDL, we cannot share the 
> original version of time_test3, but hopefully the comments in time_test3.py 
> will be clear enough.
> 
> Thanks!
> Keith
> <sunpy_time_test3_idl_python_2011-09-26.png>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
