Re: [Numpy-discussion] Optimizing reduction loops (sum(), prod(), et al.)

Pauli Virtanen Mon, 13 Jul 2009 01:01:18 -0700

Wed, 08 Jul 2009 22:16:22 +0000, Pauli Virtanen wrote:
[clip]
> On an older CPU (slower, smaller cache), the situation is slightly
> different:
> 
>     http://www.iki.fi/pav/tmp/athlon.png
>     http://www.iki.fi/pav/tmp/athlon.txt
> 
> On average, it's still an improvement in many cases.  However, now there
> are more regressions. The significant ones (factor of 1/2) are N-D
> arrays where the reduction runs over an axis with a small number of
> elements.


Part of this turned out (thanks, Valgrind!) to be due to L2 cache misses,
which came from my forgetting to evaluate the first reduction iteration
in blocks as well. Fixed -- the regressions are now less severe (most are
~0.8), although some still remain on this machine...
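
To make the "first iteration in blocks" point concrete, here is a rough
pure-Python sketch of a blocked sum over axis 0 of a row-major 2-D array.
It is only an illustration under assumptions, not the actual C patch: the
function name blocked_sum_axis0, the flat-list layout, and the block
parameter are all hypothetical. The point is that the first reduction
iteration (initialising the output from row 0) happens inside the block
loop, so each block of the output is touched while it is still in cache.

```python
def blocked_sum_axis0(data, n_reduce, n_keep, block=4):
    """Sum a row-major (n_reduce, n_keep) array over axis 0, block by block.

    `data` is a flat list of length n_reduce * n_keep (hypothetical layout,
    chosen to mimic contiguous C memory).
    """
    if n_reduce < 1:
        raise ValueError("need at least one element along the reduced axis")
    out = [0] * n_keep
    for start in range(0, n_keep, block):
        stop = min(start + block, n_keep)
        # First reduction iteration: initialise this block of the output
        # from row 0. Doing it here, rather than in a separate pass over
        # the whole output, is the step the message says was forgotten.
        for j in range(start, stop):
            out[j] = data[j]
        # Remaining iterations accumulate into the same block of `out`.
        for i in range(1, n_reduce):
            base = i * n_keep
            for j in range(start, stop):
                out[j] += data[base + j]
    return out


# 3 x 4 array with rows [0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]
result = blocked_sum_axis0(list(range(12)), 3, 4, block=3)
print(result)  # [12, 15, 18, 21]
```

In pure Python the block size has no measurable cache effect; the sketch
only shows the loop structure. Any block size gives the same result.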

-- 
Pauli Virtanen

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
