On Fri, Nov 19, 2010 at 8:19 PM, Charles R Harris <[email protected]> wrote:
> On Fri, Nov 19, 2010 at 1:50 PM, Keith Goodman <[email protected]> wrote:
>
>> On Fri, Nov 19, 2010 at 12:29 PM, Keith Goodman <[email protected]> wrote:
>> > On Fri, Nov 19, 2010 at 12:19 PM, Pauli Virtanen <[email protected]> wrote:
>> >> Fri, 19 Nov 2010 11:19:57 -0800, Keith Goodman wrote:
>> >> [clip]
>> >>> My guess is that having separate underlying functions for each dtype,
>> >>> ndim, and axis would be a nightmare for a large project like Numpy. But
>> >>> manageable for a focused project like nanny.
>> >>
>> >> Might be easier to migrate the nan* functions to using Ufuncs.
>> >>
>> >> Unless I'm missing something,
>> >>
>> >> np.nanmax -> np.fmax.reduce
>> >> np.nanmin -> np.fmin.reduce
>> >>
>> >> For `nansum`, we'd need to add a ufunc `nanadd`, and for
>> >> `nanargmax/min`, we'd need `argfmin/fmax`.
>> >
>> > How about that! I wasn't aware of fmax/fmin. Yes, I'd like a nanadd, please.
>> >
>> >>> arr = np.random.rand(1000, 1000)
>> >>> arr[arr > 0.5] = np.nan
>> >>> np.nanmax(arr)
>> > 0.49999625409581072
>> >>> np.fmax.reduce(arr, axis=None)
>> > <snip>
>> > TypeError: an integer is required
>> >>> np.fmax.reduce(np.fmax.reduce(arr, axis=0), axis=0)
>> > 0.49999625409581072
>> >
>> >>> timeit np.fmax.reduce(np.fmax.reduce(arr, axis=0), axis=0)
>> > 100 loops, best of 3: 12.7 ms per loop
>> >>> timeit np.nanmax(arr)
>> > 10 loops, best of 3: 39.6 ms per loop
>> >
>> >>> timeit np.nanmax(arr, axis=0)
>> > 10 loops, best of 3: 46.5 ms per loop
>> >>> timeit np.fmax.reduce(arr, axis=0)
>> > 100 loops, best of 3: 12.7 ms per loop
>>
>> Cython is faster than np.fmax.reduce.
>>
>> I wrote a cython version of np.nanmax, called nanmax below. (It only
>> handles the 2d, float64, axis=None case, but since the array is large
>> I don't think that explains the time difference.)
>>
>> Note that fmax.reduce is slower than np.nanmax when there are no NaNs:
>>
>> >> arr = np.random.rand(1000, 1000)
>> >> timeit np.nanmax(arr)
>> 100 loops, best of 3: 5.82 ms per loop
>> >> timeit np.fmax.reduce(np.fmax.reduce(arr))
>> 100 loops, best of 3: 9.14 ms per loop
>> >> timeit nanmax(arr)
>> 1000 loops, best of 3: 1.17 ms per loop
>>
>> >> arr[arr > 0.5] = np.nan
>>
>> >> timeit np.nanmax(arr)
>> 10 loops, best of 3: 45.5 ms per loop
>> >> timeit np.fmax.reduce(np.fmax.reduce(arr))
>> 100 loops, best of 3: 12.7 ms per loop
>> >> timeit nanmax(arr)
>> 1000 loops, best of 3: 1.17 ms per loop
>
> There seem to be some odd hardware/compiler dependencies. I get quite a
> different pattern of times:
>
> In [1]: arr = np.random.rand(1000, 1000)
>
> In [2]: timeit np.nanmax(arr)
> 100 loops, best of 3: 10.4 ms per loop
>
> In [3]: timeit np.fmax.reduce(arr.flat)
> 100 loops, best of 3: 2.09 ms per loop
>
> In [4]: arr[arr > 0.5] = np.nan
>
> In [5]: timeit np.nanmax(arr)
> 100 loops, best of 3: 12.9 ms per loop
>
> In [6]: timeit np.fmax.reduce(arr.flat)
> 100 loops, best of 3: 7.09 ms per loop
>
> I've tweaked fmax with the reduce loop option, but the nanmax times don't
> look like yours at all. I'm also a bit surprised that you don't see any
> difference in times when the array contains a lot of nans. I'm running
> on AMD Phenom, gcc 4.4.5.

However, I noticed that the build wants to be -O1 by default. I have my own
CFLAGS that make it -O2, but it looks like Ubuntu's python might be built
with -O1. Hmm. That could certainly cause some odd timings.

Chuck
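
For anyone who wants to repeat the comparison on their own machine, here is a
small self-contained version of the benchmark quoted above. It is only a
sketch: the `bench` helper is just for this example, and absolute times will
differ with hardware, NumPy version, and compiler flags, which is exactly the
point under discussion.

    import numpy as np
    import timeit


    def bench(stmt, number=10, repeat=3):
        """Report the best time for `stmt`, in the ms-per-loop style of %timeit."""
        best = min(timeit.repeat(stmt, setup="from __main__ import np, arr",
                                 repeat=repeat, number=number))
        print("%-55s %.2f ms per loop" % (stmt, 1e3 * best / number))


    # The test array used throughout the thread: 1000x1000 float64,
    # with roughly half the entries set to NaN.
    arr = np.random.rand(1000, 1000)
    arr[arr > 0.5] = np.nan

    bench("np.nanmax(arr)")
    bench("np.nanmax(arr, axis=0)")
    bench("np.fmax.reduce(arr, axis=0)")
    # The thread notes fmax.reduce wanted an integer axis at the time,
    # so the full reduction is done as two axis=0 reductions.
    bench("np.fmax.reduce(np.fmax.reduce(arr, axis=0), axis=0)")
    bench("np.fmax.reduce(arr.flat)")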
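The Cython nanmax mentioned above is not attached to this message. As a rough
illustration of the approach it describes (a single loop specialized for the
2d, float64, axis=None case that skips NaNs), the logic looks roughly like the
pure-Python function below; the reported speedup would come from compiling
essentially this loop with Cython's static typing, so take it as a sketch of
the idea rather than the code that produced the timings.

    import numpy as np


    def nanmax_2d_float64(arr):
        """Sketch of a nanmax specialized for a 2d float64 array, axis=None.

        A Cython version would type `arr` as a 2d float64 buffer and `ai`,
        `amax` as C doubles, so the double loop compiles down to plain C.
        """
        n0, n1 = arr.shape
        amax = -np.inf
        allnan = True
        for i in range(n0):
            for j in range(n1):
                ai = arr[i, j]
                if ai == ai:        # comparison is False only for NaN
                    allnan = False
                    if ai > amax:
                        amax = ai
        return np.nan if allnan else amax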
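On the -O1/-O2 question: one quick way to see which optimization flags the
interpreter was built with (and which distutils reuses, together with any
CFLAGS from the environment, when compiling extensions such as NumPy) is to
inspect the build configuration. A minimal check, assuming a CPython build
that exposes these config variables:

    import sysconfig

    # CFLAGS/OPT record the flags Python itself was compiled with.
    print(sysconfig.get_config_var("CFLAGS"))
    print(sysconfig.get_config_var("OPT"))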
