Hi Pauli,

On Thu, Feb 10, 2011 at 8:31 PM, Pauli Virtanen <p...@iki.fi> wrote:
> Thu, 10 Feb 2011 12:16:12 -0600, Robert Kern wrote:
> [clip]
> > One thing that might be worthwhile is to make
> > implementations of sum() and cumsum() that avoid the ufunc machinery and
> > do their iterations more quickly, at least for some common combinations
> > of dtype and contiguity.
>
> I wonder what is the balance between the iterator overhead and the time
> taken in the reduction inner loop. This should be straightforward to
> benchmark.
>
> Apparently, some overhead decreased with the new iterators, since current
> Numpy master outperforms 1.5.1 by a factor of 2 for this benchmark:
>
>     In [8]: %timeit M.sum(1)     # Numpy 1.5.1
>     10 loops, best of 3: 85 ms per loop
>
>     In [8]: %timeit M.sum(1)     # Numpy master
>     10 loops, best of 3: 49.5 ms per loop
>
> I don't think this is explainable by the new memory layout optimizations,
> since M is C-contiguous.
>
> Perhaps there would be room for more optimization, even within the ufunc
> framework?

I hope so. Please let me know if there is anything I can do to help
advance this. (My C skills are already a bit rusty, but at any higher
level I'll try my best to contribute.)

Thanks,
eat

> --
> Pauli Virtanen
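
One rough way to probe the balance mentioned above, between iterator overhead and time spent in the reduction inner loop, is to keep the total number of elements fixed and vary the row length. If per-row overhead matters, M.sum(1) should get slower as the rows get shorter. This is only a sketch: the helper name time_row_sum, the element count, and the shapes are assumptions made for illustration (the shape and dtype of M are not given in this message), and the timings will depend on the NumPy version and the machine.

```python
import timeit

import numpy as np

TOTAL = 10**6  # assumed total element count, held fixed across shapes


def time_row_sum(n_rows, repeat=3, number=5):
    """Return the best per-call time (seconds) of M.sum(1).

    M is a C-contiguous float64 array of shape (n_rows, TOTAL // n_rows),
    a stand-in for the M used in the quoted benchmark.
    """
    M = np.ones((n_rows, TOTAL // n_rows), dtype=np.float64)
    assert M.flags.c_contiguous
    best = min(timeit.repeat(lambda: M.sum(1), repeat=repeat, number=number))
    return best / number


if __name__ == "__main__":
    # Same amount of data each time; only the inner-loop length changes.
    for n_rows in (10, 1000, 100000):
        seconds = time_row_sum(n_rows)
        print("rows=%6d  cols=%6d  %.3f ms per call"
              % (n_rows, TOTAL // n_rows, seconds * 1e3))
```

If the three timings come out roughly equal, the inner loop dominates; a large slowdown for the many-short-rows case would point at per-row overhead instead.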
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion