On Sunday 23 March 2008, Francesc Altet wrote:
> On Sunday 23 March 2008, Anne Archibald wrote:
> > On 23/03/2008, Damian Eads <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > I am working on a memory-intensive experiment with very large
> > > arrays so I must be careful when allocating memory. Numpy already
> > > supports a number of in-place operations (+=, *=) making the task
> > > much more manageable. However, it is not obvious to me how I set
> > > values based on a very simple condition.
> > >
> > > The expression
> > >
> > >   y[y<0]=-1
> > >
> > > generates a binary index mask y<0 of the same size as the array
> > > y, which is problematic when y is quite large.
> > >
> > > I was wondering if there was anything like a set_where(A, cmp,
> > > B, setval, [optional elseval]) function where cmp would be a
> > > comparison operator expressed as a string.
> > >
> > > The code below illustrates what I want to do. Admittedly, it
> > > needs to be cleaned up but it's a proof of concept. Does numpy
> > > provide any functions that support the functionality of the code
> > > below?
> >
> > That's a good question, but I'm pretty sure it doesn't, apart from
> > numpy.clip(). The way I'd try to solve that problem would be with
> > the dreaded for loop. Don't iterate over single elements, but if
> > you have a gargantuan array, working in chunks of ten thousand (or
> > whatever) won't have too much overhead:
> >
> > block = 100000
> > for n in arange(0,len(y),block):
> >     yc = y[n:n+block]
> >     yc[yc<0] = -1
> >
> > It's a bit of a pain, but working with arrays that nearly fill RAM
> > *is* a pain, as I'm sure you are all too aware by now.
> >
> > You might look into numexpr, this is the sort of thing it does
> > (though I've never used it and can't say whether it can do this).
>
> Well, Numexpr is designed to minimize the number of temporaries, and
> can do what Damian wants without having to put the mask in a
> temporary. However, the output will require new space. The usage
> should be something like:
>
> In [11]: y = numpy.random.normal(0, 10, 10)
>
> In [12]: numexpr.evaluate('where(y<0, -1, y)')
> Out[12]:
> array([ 7.11784295, -1.        , 10.92876842, -1.        ,
>         0.76092629, -1.        , 14.07021792, -1.        ,
>         5.67173405, 31.28631822])

Oops. I realised that, for this particular case, Numexpr's memory usage
is similar to that of its NumPy counterpart:

  y[:] = numpy.where(y<0, -1, y)

So I think the best option for you is to work in chunks, as Anne
suggested.

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
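
For reference, the chunked approach Anne suggested could be wrapped up roughly as follows. This is only a minimal sketch: the function name set_below_inplace and its parameters are hypothetical (nothing by that name exists in NumPy), and it assumes the condition is a plain "less than a scalar threshold" test. Because basic slicing returns a view, assigning into each chunk modifies y itself, and the temporary boolean mask never covers more than one block of rows at a time.

```python
import numpy as np


def set_below_inplace(y, threshold, value, block=100_000):
    """Set y[i] = value wherever y[i] < threshold, one block at a time.

    Hypothetical helper, not part of NumPy. It walks along the first
    axis of y in slices of `block` rows, so the boolean mask built for
    each step is only as large as one slice.
    """
    for start in range(0, len(y), block):
        chunk = y[start:start + block]    # basic slice: a view, not a copy
        chunk[chunk < threshold] = value  # mask covers this chunk only
    return y


# Small usage example: a tiny block size just to exercise the loop.
y = np.random.normal(0, 10, 10)
set_below_inplace(y, 0, -1, block=4)
```

The block size trades loop overhead against the size of the temporary mask; the default here simply mirrors the value used in Anne's snippet.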