Hi

ndarray.std(axis=1) seems to have memory issues on large 2D arrays. I
first thought I had a performance problem, but it turned out that std()
was using lots of memory and therefore caused heavy swapping.

I want to get an array where element i is the standard deviation of row
i of the 2D array. Running the std() call under valgrind ...

$ valgrind --tool=massif python -c "from numpy import *;
a=reshape(arange(100000*100),(100000,100)).std(axis=1)"

... showed me a peak of 200 MB of memory, while iterating row by row ...

$ valgrind --tool=massif python -c "from numpy import *;
a=array([x.std() for x in reshape(arange(100000*100),(100000,100))])"

... peaked at only 40 MB.

This seems unnecessary, since the output shape is known before the
calculation starts, so the memory for the result could be
preallocated.
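
As a workaround, a preallocated, chunked version keeps the temporaries
small while staying mostly vectorized. This is just a sketch, not what
ndarray.std does internally; row_std and its chunk parameter are names
I made up here:

    import numpy as np

    def row_std(a, chunk=1000):
        # Preallocate the output, then fill it block by block so the
        # temporaries created by std(axis=1) stay bounded by `chunk` rows.
        out = np.empty(a.shape[0])
        for start in range(0, a.shape[0], chunk):
            out[start:start + chunk] = a[start:start + chunk].std(axis=1)
        return out

Tuning chunk trades the per-call Python overhead of the loop against
peak memory use.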


My original problem was to compute a moving average and a moving
standard deviation (120k rows, window length n=1000). For the average
I guess convolve should perform well, but is there anything smart for
std()? For now I use ...

>>> moving_std=array([a[i:i+n].std() for i in range(len(a)-n)])

which seems to perform quite well.
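
If convolve feels awkward, here is a sketch of both using only O(len(a))
temporaries. The moving std uses the running-sums identity
var = mean(x**2) - mean(x)**2, which matches std()'s default but can
lose precision when the data's mean is much larger than its spread;
moving_avg and moving_std are just names I made up:

    import numpy as np

    def moving_avg(a, n):
        # Moving average via convolution with a flat window.
        return np.convolve(a, np.ones(n) / n, mode='valid')

    def moving_std(a, n):
        # Moving std via running sums of x and x**2 over each window.
        c1 = np.cumsum(a, dtype=np.float64)
        c2 = np.cumsum(a * a, dtype=np.float64)
        s1 = c1[n - 1:] - np.concatenate(([0.0], c1[:-n]))
        s2 = c2[n - 1:] - np.concatenate(([0.0], c2[:-n]))
        return np.sqrt(s2 / n - (s1 / n) ** 2)

Note these return len(a)-n+1 values, i.e. every full window, one more
than the list comprehension above (which stops at range(len(a)-n)).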

BR,

//Torgil
