[Numpy-discussion] Overlapping ranges

2009-03-16 Thread Peter Saffrey
I'm trying to file a set of data points, defined by genome coordinates, into bins, also based on genome coordinates. Each data point is (chromosome, start, end, point) and each bin is (chromosome, start, end). I have about 140 million points to file into around 100,000 bins. Both are (roughly)
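
A minimal sketch of one way to do this kind of binning with `np.searchsorted`. It assumes things the truncated post does not confirm: that the bins on a given chromosome are sorted by start and do not overlap, that intervals are half-open, and that a point is filed by its start coordinate only. The function name `assign_bins` is hypothetical.

```python
import numpy as np

def assign_bins(point_starts, bin_starts, bin_ends):
    """Return the bin index for each point, or -1 if it falls in no bin.

    Assumes a single chromosome, bins sorted by start, no overlaps,
    and half-open intervals [start, end).
    """
    point_starts = np.asarray(point_starts)
    bin_starts = np.asarray(bin_starts)
    bin_ends = np.asarray(bin_ends)
    # Index of the last bin whose start is <= the point's start.
    idx = np.searchsorted(bin_starts, point_starts, side="right") - 1
    # Clip so that idx == -1 can still be used to index bin_ends safely.
    safe = np.clip(idx, 0, len(bin_starts) - 1)
    inside = (idx >= 0) & (point_starts < bin_ends[safe])
    return np.where(inside, idx, -1)

bin_starts = np.array([0, 100, 300])
bin_ends = np.array([100, 200, 400])
points = np.array([5, 150, 250, 399, 400])
result = assign_bins(points, bin_starts, bin_ends)
print(result)  # [ 0  1 -1  2 -1]
```

With multiple chromosomes, the same call could be made once per chromosome; the lookup cost is one binary search per point.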

[Numpy-discussion] Medians that ignore values

2008-09-18 Thread Peter Saffrey
I have data from biological experiments that is represented as a list of about 5000 triples. I would like to convert this to a list of the median of each triple. I did some profiling and found that numpy was about 12 times faster for this application than using regular Python lists and a l
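
For illustration, the basic operation (one median per triple, no missing values yet) might look like the sketch below. The sample data is made up.

```python
import numpy as np

# Hypothetical sample data: each row is one triple of measurements.
triples = [(1.0, 5.0, 3.0), (2.0, 2.0, 8.0), (7.0, 4.0, 6.0)]

# axis=1 takes the median across each row, giving one value per triple.
medians = np.median(np.asarray(triples), axis=1)
print(medians)  # [3. 2. 6.]
```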

Re: [Numpy-discussion] Medians that ignore values

2008-09-18 Thread Peter Saffrey
physics.ucf.edu> writes: > Currently the only way you can handle NaNs is by using masked arrays. > Create a mask by doing isfinite(a), then call the masked array > median(). There's an example here: > > http://sd-2116.dedibox.fr/pydocweb/doc/numpy.ma/ > I had looked at masked arrays, bu
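
A sketch of the masked-array approach described in the quoted reply: build the mask from `isfinite` and call the masked median. The sample data is made up, and the calls shown are from current NumPy, so behaviour on the 2008 releases discussed in the thread may differ.

```python
import numpy as np

# Hypothetical sample data with missing measurements marked as NaN.
triples = np.array([[1.0, 2.0, 3.0],
                    [4.0, np.nan, 6.0],
                    [np.nan, np.nan, 9.0]])

# Mask every entry that is not finite (NaN or inf).
masked = np.ma.masked_array(triples, mask=~np.isfinite(triples))

# Median over the unmasked entries of each row.
medians = np.ma.median(masked, axis=1)
print(np.ma.filled(medians, np.nan))  # [2. 5. 9.]
```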

Re: [Numpy-discussion] Medians that ignore values

2008-09-18 Thread Peter Saffrey
Pierre GM gmail.com> writes: > Mmh, typo? > Yes, apologies. I was aiming for thorough, but ended up just careless. It's been a long day. > Ohoh. What version of numpy are you using ? The version in the Ubuntu package repository. It says 1:1.0.4-6ubuntu3. > if you don't give an axis > param

Re: [Numpy-discussion] Medians that ignore values

2008-09-19 Thread Peter Saffrey
David Cournapeau ar.media.kyoto-u.ac.jp> writes: > You can use nanmean (from scipy.stats): > I rejoiced when I saw this answer, because it looks like a function I can just drop in and it works. Unfortunately, nanmedian seems to be quite a bit slower than just using lists (ignoring nan values fr
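
As a point of comparison, a sketch using `np.nanmedian`, assuming a NumPy recent enough to provide it. This is not the `scipy.stats` function discussed in the thread, and nothing here says anything about its speed relative to the list-based approach.

```python
import numpy as np

# Hypothetical sample data; NaN marks a missing value.
triples = np.array([[1.0, np.nan, 3.0],
                    [4.0, 5.0, 6.0],
                    [np.nan, np.nan, 9.0]])

# NaNs are ignored within each row before the median is taken.
medians = np.nanmedian(triples, axis=1)
print(medians)  # [2. 5. 9.]
```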

Re: [Numpy-discussion] Medians that ignore values

2008-09-19 Thread Peter Saffrey
David Cournapeau ar.media.kyoto-u.ac.jp> writes: > It may be that nanmedian is slow. But I would sincerely be surprised if > it were slower than python list, except for some pathological cases, or > maybe a bug in nanmedian. What do your data look like ? (size, number of > nan, etc...) > I've po

Re: [Numpy-discussion] Medians that ignore values

2008-09-19 Thread Peter Saffrey
Pierre GM gmail.com> writes: > I think there were some changes on the C side of numpy between 1.0 and 1.1, > you may have to recompile scipy and matplotlib from sources. What versions > are you using for those 2 packages ? > $ dpkg -l | grep scipy ii python-scipy

Re: [Numpy-discussion] Medians that ignore values

2008-09-19 Thread Peter Saffrey
Alan G Isaac american.edu> writes: > Recently I needed to fill a 2d array with values > from computations that could "go wrong". > I created an array of NaN and then replaced > the elements where the computation produced > a useful value. I then applied ``nanmax``, > to get the maximum of the us
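
The pattern described in the quote (start from an array of NaN, fill in only the results that succeed, then reduce with `nanmax`) might be sketched as follows. The computation `1.0 / x` and the input values are stand-ins chosen for illustration, not taken from the original post.

```python
import numpy as np

# Hypothetical inputs; division by zero plays the role of a
# computation that can "go wrong".
inputs = [[1.0, 0.0, 4.0],
          [2.0, 0.0, 0.5]]

# Start with every slot marked as missing.
results = np.full((2, 3), np.nan)

for i, j in np.ndindex(results.shape):
    try:
        results[i, j] = 1.0 / inputs[i][j]
    except ZeroDivisionError:
        pass  # leave the slot as NaN

# Maximum over the entries that were actually computed.
best = np.nanmax(results)
print(best)  # 2.0
```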

Re: [Numpy-discussion] Medians that ignore values

2008-09-22 Thread Peter Saffrey
David Cournapeau ar.media.kyoto-u.ac.jp> writes: > Still, it is indeed really slow for your case; when I fixed nanmean and > co, I did not know much about numpy, I just wanted them to give the > right answer :) I think this can be made faster, specially for your case > (where the axis along which

Re: [Numpy-discussion] Medians that ignore values

2008-09-22 Thread Peter Saffrey
David Cournapeau ar.media.kyoto-u.ac.jp> writes: > Unfortunately, we can't, because we would lose generality: we need to > compute median on any axis, not only the last one. The proper solution > would be to have a sort/max/min/etc... which knows about nan in numpy, > which is what Chuck and I a

[Numpy-discussion] Standard functions (z-score) on nan (again)

2008-09-25 Thread Peter Saffrey
I've bodged my way through my median problems (see previous postings). Now I need to take a z-score of an array that might contain nans. At the moment, if the array, which is 7000 elements, contains 1 nan or more, all the results come out as nan. My other problem is that my array is indexed from
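
One possible way to get a z-score that tolerates NaN is to compute the mean and standard deviation over the non-NaN entries only. The sketch below assumes a current NumPy with `np.nanmean` and `np.nanstd` (not available in the releases from the time of this post), uses the population standard deviation, and does not address the indexing question that the truncated post goes on to raise. The name `nan_zscore` is hypothetical.

```python
import numpy as np

def nan_zscore(a):
    """Z-score of a 1-D array, ignoring NaN when computing mean and std.

    NaN inputs stay NaN in the output; all other entries are scaled.
    """
    a = np.asarray(a, dtype=float)
    mean = np.nanmean(a)
    std = np.nanstd(a)  # population std (ddof=0), an assumption
    return (a - mean) / std

values = [1.0, 2.0, np.nan, 3.0]
z = nan_zscore(values)
print(z)
```

Only the NaN entry comes out as NaN; the remaining entries have mean 0 and standard deviation 1.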