Re: [Numpy-discussion] Optimizing reduction loops (sum(), prod(), et al.)

Pauli Virtanen Thu, 09 Jul 2009 02:08:22 -0700

Thu, 09 Jul 2009 09:54:26 +0200, Matthieu Brucher wrote:
> 2009/7/9 Pauli Virtanen <pav...@iki.fi>:
[clip]
>> I'm still kind of hoping that it's possible to make some minimal
>> assumptions about CPU caches in general, and have a rule that decides a
>> code path that is good enough, if not optimal.
> 
> Unfortunately, this is not possible. We've been playing with blocking
> loops for a long time in finite difference schemes, and it is always
> compiler dependent (that is, the optimal size of the block is bandwidth
> dependent and even operation dependent).


I'm not completely sure about this: the data access pattern in a reduce 
operation is in principle relatively simple, and the main focus would be 
on improving the worst cases rather than on being completely optimal. This 
could perhaps be achieved with a generic rule that tries to maximize data 
locality.
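
To make the idea concrete, here is a rough sketch of one such rule for the
2-D case: always run the innermost loop over the axis with the smallest
absolute stride, whichever axis is being reduced. This is only an
illustration in pure Python and not how NumPy implements its reductions.
The names pick_inner_axis and reduce_sum_2d are made up for this example,
and strides are counted in elements rather than bytes.

```python
def pick_inner_axis(strides):
    """Hypothetical heuristic: the axis with the smallest |stride| goes innermost."""
    return min(range(len(strides)), key=lambda ax: abs(strides[ax]))


def reduce_sum_2d(buf, shape, strides, axis):
    """Sum a flat buffer, viewed as a 2-D array, along `axis`.

    `strides` are in elements (an assumption made for simplicity).
    The loop order depends only on the strides, not on `axis`, so the
    inner loop always walks memory in the smallest available steps.
    """
    keep = 1 - axis                  # the axis that survives the reduction
    out = [0] * shape[keep]
    inner = pick_inner_axis(strides)
    outer = 1 - inner
    for i in range(shape[outer]):
        base = i * strides[outer]
        for j in range(shape[inner]):
            # Map the (outer, inner) loop counters back to (axis 0, axis 1).
            idx = (i, j) if outer == 0 else (j, i)
            out[idx[keep]] += buf[base + j * strides[inner]]
    return out
```

For the matrix [[0, 1, 2], [3, 4, 5]] stored in C order, that is
buf = [0, 1, 2, 3, 4, 5] with strides (3, 1), a call to
reduce_sum_2d(buf, (2, 3), (3, 1), 0) gives [3, 5, 7]. The same matrix
stored in Fortran order, buf = [0, 3, 1, 4, 2, 5] with strides (1, 2),
gives the same result with the two loops swapped.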

Of course, I may be wrong here...

-- 
Pauli Virtanen

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
