On Tue, May 31, 2011 at 8:50 PM, Bruce Southey <bsout...@gmail.com> wrote:

> On Tue, May 31, 2011 at 9:26 PM, Charles R Harris
> <charlesr.har...@gmail.com> wrote:
> >
> >
> > On Tue, May 31, 2011 at 8:00 PM, Skipper Seabold <jsseab...@gmail.com>
> > wrote:
> >>
> >> On Tue, May 31, 2011 at 9:53 PM, Warren Weckesser
> >> <warren.weckes...@enthought.com> wrote:
> >> >
> >> >
> > On Tue, May 31, 2011 at 8:36 PM, Skipper Seabold <jsseab...@gmail.com>
> > wrote:
> >> >> I don't know if it's one pass off the top of my head, but I've used
> >> >> percentile for interpercentile ranges.
> >> >>
> >> In [1]: X = np.random.random(1000)
> >>
> >> In [2]: np.percentile(X,[0,100])
> >> Out[2]: [0.00016535235312509222, 0.99961513543316571]
> >>
> >> In [3]: X.min(),X.max()
> >> Out[3]: (0.00016535235312509222, 0.99961513543316571)
> >> >>
> >> >
> >> >
> >> > percentile() isn't one pass; using percentile like that is much
> >> > slower:
> >> >
> >> > In [25]: %timeit np.percentile(X,[0,100])
> >> > 10000 loops, best of 3: 103 us per loop
> >> >
> >> > In [26]: %timeit X.min(),X.max()
> >> > 100000 loops, best of 3: 11.8 us per loop
> >> >
> >>
> >> Probably should've checked that before opening my mouth. Never
> >> actually used it for a minmax, but it is faster than two calls to
> >> scipy.stats.scoreatpercentile. Guess I'm +1 to fast order statistics.
> >>
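For concreteness, a one-pass minmax is easy to sketch in pure Python,
though it would have to be a compiled loop (or ufunc) to actually win;
the version below is illustrative only and elides nan handling:

import numpy as np

def minmax(a):
    # Illustrative single pass: each element is read once, versus
    # twice for a.min() followed by a.max(). Nan handling elided.
    flat = np.asarray(a).ravel()
    if flat.size == 0:
        raise ValueError("minmax of an empty array")
    lo = hi = flat[0]
    for x in flat[1:]:
        if x < lo:
            lo = x
        elif x > hi:
            hi = x
    return lo, hi
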
> >
> > So far the biggest interest seems to be in order statistics of various
> > sorts, so to speak.
> >
> > Order Statistics
> >
> > minmax
> > median
> > k'th element
> > largest/smallest k elements
> >
> > Other Statistics
> >
> > mean/std
> >
> > Nan functions
> >
> > nanadd
> >
> > Chuck
> >
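As a yardstick for the list above, the naive sort-based versions of the
k'th element and largest/smallest k are one-liners; anything new should
beat these, since a proper select is O(n) rather than O(n log n). Sketch
only:

import numpy as np

def kth_smallest(a, k):
    # Naive baseline: full sort is O(n log n); a select would be O(n).
    return np.sort(a, axis=None)[k]

def k_largest(a, k):
    # Naive baseline for the k largest elements.
    return np.sort(a, axis=None)[-k:]
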
>
> How about including all or some of Keith's Bottleneck package?
> He has tried to include some of the discussed functions and tried to
> make them very fast.
>

I don't think they are sufficiently general, as they are limited to two
dimensions. However, I think the moving filters should go into scipy, either
in ndimage or maybe scipy.signal. Some of the others we can still speed up
significantly, for instance nanmedian, by using the new functionality in
numpy; numpy's sort has handled nans for a while now, sorting them to the
end. It looks like call overhead dominates the nanmax times for small
arrays, and that might improve if the ufunc machinery is cleaned up a bit
more; I don't know how far Mark got with that.
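
For instance, a sort-based nanmedian might look something like this sketch
(1-d, float input assumed; relies on sort leaving the nans at the end):

import numpy as np

def nanmedian_1d(a):
    # Sketch only: np.sort places nans at the end, so the non-nan
    # values form a contiguous prefix of the sorted array.
    s = np.sort(a)
    n = s.size - np.isnan(s).sum()  # number of non-nan values
    if n == 0:
        return np.nan
    mid = n // 2
    if n % 2:                       # odd count of non-nan values
        return s[mid]
    return 0.5 * (s[mid - 1] + s[mid])

A compiled version of the same idea would avoid the Python overhead.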

One bit of infrastructure that could be generally helpful is low-level
support for masked arrays, but that is a larger topic.

Chuck