On Mo, 2015-03-16 at 15:53 +0000, Dave Hirschfeld wrote: > I have a number of large arrays for which I want to compute the mean and > standard deviation over a particular axis - e.g. I want to compute the > statistics for axis=1 as if the other axes were combined so that in the > example below I get two values back > > In [1]: a = randn(30, 2, 10000) > > For the mean this can be done easily like: > > In [2]: a.mean(0).mean(-1) > Out[2]: array([ 0.0007, -0.0009]) >
If you have numpy 1.7+ (which I guess by now is basically always the case), you can do a.mean((0, 1)). Though it isn't actually faster in this example, probably because it has to use buffered iterators and things, but I would guess the performance should be much more stable depending on memory order, etc. then any other method. - Sebastian > > ...but this won't work for the std. Using some transformations we can > come up with something which will work for either: > > In [3]: a.transpose(2,0,1).reshape(-1, 2).mean(axis=0) > Out[3]: array([ 0.0007, -0.0009]) > > In [4]: a.transpose(1,0,2).reshape(2, -1).mean(axis=-1) > Out[4]: array([ 0.0007, -0.0009]) > > > If we look at the performance of these equivalent methods: > > In [5]: %timeit a.transpose(2,0,1).reshape(-1, 2).mean(axis=0) > 100 loops, best of 3: 14.5 ms per loop > > In [6]: %timeit a.transpose(1,0,2).reshape(2, -1).mean(axis=-1) > 100 loops, best of 3: 5.05 ms per loop > > > we can see that the latter version is a clear winner. Investigating > further, both methods appear to copy the data so the performance is > likely down to better cache utilisation. > > In [7]: np.may_share_memory(a, a.transpose(2,0,1).reshape(-1, 2)) > Out[7]: False > > In [8]: np.may_share_memory(a, a.transpose(1,0,2).reshape(2, -1)) > Out[8]: False > > > Both methods are however significantly slower than the initial attempt: > > In [9]: %timeit a.mean(0).mean(-1) > 1000 loops, best of 3: 1.2 ms per loop > > Perhaps because it allocates a smaller temporary? > > For those who like a challenge: is there a faster way to achieve what > I'm after? > > > Cheers, > Dave > > > > > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion >
signature.asc
Description: This is a digitally signed message part
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion