Re: [Numpy-discussion] Optimizing reduction loops (sum(), prod(), et al.)

David Warde-Farley Thu, 09 Jul 2009 01:49:18 -0700

On 8-Jul-09, at 6:16 PM, Pauli Virtanen wrote:

> Just to tickle some interest, a "pathological" case before  
> optimization:
>
>    In [1]: import numpy as np
>    In [2]: x = np.zeros((80000, 256))
>    In [3]: %timeit x.sum(axis=0)
>    10 loops, best of 3: 850 ms per loop
>
> After optimization:
>
>    In [1]: import numpy as np
>    In [2]: x = np.zeros((80000, 256))
>    In [3]: %timeit x.sum(axis=0)
>    10 loops, best of 3: 78.5 ms per loop


Not knowing a terrible lot about cache optimization, I have nothing to  
contribute but encouragement. :) Pauli, this is fantastic work!

Just curious about regressions: have you tested on any non-x86
hardware? As a frequent user of an older ppc machine, I worry about
such things (and plan to give your benchmark a try tomorrow on both
ppc and ppc64 OS X).
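
For anyone who wants to repeat the quoted timing outside IPython, for example on a non-x86 box, a minimal sketch using the standard timeit module might look like the following. The function name time_axis0_sum and its default arguments are illustrative assumptions, not part of Pauli's patch; the array shape and the "10 loops, best of 3" scheme are taken from the session quoted above.

```python
import platform
import timeit

import numpy as np


def time_axis0_sum(shape=(80000, 256), repeat=3, number=10):
    """Return the best per-call time, in seconds, of x.sum(axis=0).

    Hypothetical helper: mirrors IPython's "10 loops, best of 3"
    by running `number` calls per trial and keeping the fastest trial.
    """
    x = np.zeros(shape)
    trials = timeit.repeat(lambda: x.sum(axis=0), repeat=repeat, number=number)
    return min(trials) / number


if __name__ == "__main__":
    best = time_axis0_sum()
    # Report the architecture alongside the timing so results from
    # different machines can be compared.
    print("machine: %s, numpy %s" % (platform.machine(), np.__version__))
    print("x.sum(axis=0): best of 3: %.1f ms per loop" % (best * 1e3))
```

Running the script before and after applying the patch on the same machine gives two numbers that can be compared directly; absolute values will of course differ between architectures.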

Cheers,
David
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
