Re: [Numpy-discussion] slicing and aliasing

2012-05-31 Thread Keith Goodman
On Thu, May 31, 2012 at 7:30 AM, Neal Becker wrote: > That is, will: > > u[a:b] = u[c:d] > > always work (assuming the ranges of a:b, c:d are equal, of course) It works most of the time. This thread shows you how to find an example where it does not work: http://mail.scipy.org/pipermail/numpy-d
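A minimal sketch of the safe pattern (values made up for illustration): when the source and destination slices may overlap, copy the right-hand side first so no element is read after it has already been overwritten.

```python
import numpy as np

u = np.arange(10)

# Overlapping assignment: u[2:7] writes into memory that u[0:5] reads.
# An explicit copy of the right-hand side sidesteps the aliasing question.
u[2:7] = u[0:5].copy()

print(u)  # [0 1 0 1 2 3 4 7 8 9]
```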

Re: [Numpy-discussion] timing results (was: record arrays initialization)

2012-05-03 Thread Keith Goodman
On Thu, May 3, 2012 at 3:12 PM, Moroney, Catherine M (388D) wrote: > Here is the python code: > > def single(element, targets): > >    if (isinstance(element, tuple)): >        xelement = element[0] >    elif (isinstance(element, numpy.ndarray)): >        xelement = element >    else: >        re

Re: [Numpy-discussion] timing results (was: record arrays initialization)

2012-05-03 Thread Keith Goodman
On Thu, May 3, 2012 at 12:46 PM, Paul Anton Letnes wrote: > > Could you show us the code? It's hard to tell otherwise. As Keith Goodman > pointed out, if he gets 7.5x with cython, it could be that the Fortran code > could be improved as well. Fortran has a reputation

Re: [Numpy-discussion] timing results (was: record arrays initialization)

2012-05-03 Thread Keith Goodman
On Thu, May 3, 2012 at 10:38 AM, Moroney, Catherine M (388D) wrote: > Actually Fortran with correct array ordering - 13 seconds!  What horrible > python/numpy > mistake am I making to cause such a slowdown? For the type of problem you are working on, I'd flip it around and ask what you are doin

Re: [Numpy-discussion] record arrays initialization

2012-05-03 Thread Keith Goodman
On Wed, May 2, 2012 at 4:46 PM, Kevin Jacobs wrote: > The cKDTree implementation is more than 4 times faster than the brute-force > approach: > > T = scipy.spatial.cKDTree(targets) > > In [11]: %timeit foo1(element, targets)   # Brute force > 1000 loops, best of 3: 385 us per loop > > In [12]: %
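A small sketch of the cKDTree approach from the thread (the toy coordinates below are made up): build the tree once over the targets, then query it per element instead of brute-forcing every distance.

```python
import numpy as np
from scipy.spatial import cKDTree

targets = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])

tree = cKDTree(targets)             # build once, query many times
dist, idx = tree.query([0.9, 1.1])  # nearest target to the query point

print(idx)  # index into `targets` of the nearest neighbor
```

`query` also accepts an array of points, so a whole batch of elements can be matched in one call.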

Re: [Numpy-discussion] all elements equal

2012-03-05 Thread Keith Goodman
On Mon, Mar 5, 2012 at 12:12 PM, Keith Goodman wrote: > On Mon, Mar 5, 2012 at 12:06 PM, Neal Becker wrote: >> But doesn't this one fail on empty array? > > Yes. I'm optimizing for fun, not for corner cases. This should work > for size zero and NaNs:

Re: [Numpy-discussion] all elements equal

2012-03-05 Thread Keith Goodman
On Mon, Mar 5, 2012 at 12:06 PM, Neal Becker wrote: > But doesn't this one fail on empty array? Yes. I'm optimizing for fun, not for corner cases. This should work for size zero and NaNs: @cython.boundscheck(False) @cython.wraparound(False) def allequal(np.ndarray[np.float64_t, ndim=1] a): c

Re: [Numpy-discussion] all elements equal

2012-03-05 Thread Keith Goodman
On Mon, Mar 5, 2012 at 11:52 AM, Benjamin Root wrote: > Another issue to watch out for is if the array is empty.  Technically > speaking, that should be True, but some of the solutions offered so far > would fail in this case. Good point. For fun, here's the speed of a simple cython allclose: I

Re: [Numpy-discussion] all elements equal

2012-03-05 Thread Keith Goodman
On Mon, Mar 5, 2012 at 11:36 AM, wrote: > How about numpy.ptp, to follow this line? I would expect it's single > pass, but wouldn't short circuit compared to cython of Keith I[1] a = np.ones(10) I[2] timeit (a == a[0]).all() 1000 loops, best of 3: 203 us per loop I[3] timeit a.min() == a.max

Re: [Numpy-discussion] all elements equal

2012-03-05 Thread Keith Goodman
On Mon, Mar 5, 2012 at 11:24 AM, Neal Becker wrote: > Keith Goodman wrote: > >> On Mon, Mar 5, 2012 at 11:14 AM, Neal Becker wrote: >>> What is a simple, efficient way to determine if all elements in an array (in >>> my case, 1D) are equal?  How about close? >>

Re: [Numpy-discussion] all elements equal

2012-03-05 Thread Keith Goodman
On Mon, Mar 5, 2012 at 11:14 AM, Neal Becker wrote: > What is a simple, efficient way to determine if all elements in an array (in > my > case, 1D) are equal?  How about close? For the exactly equal case, how about: I[1] a = np.array([1,1,1,1]) I[2] np.unique(a).size O[2] 1  # All equal I[3]
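The truncated idioms from this thread, collected into one runnable sketch. Note the caveat raised elsewhere in the thread: `a == a[0]` treats NaN as unequal to itself, so any NaN makes the comparison-based tests return False.

```python
import numpy as np

a = np.array([1, 1, 1, 1])
b = np.array([1, 2, 1, 1])

assert np.unique(a).size == 1   # one distinct value
assert (a == a[0]).all()        # every element equals the first
assert a.min() == a.max()       # same test via reductions; cf. np.ptp(a) == 0

assert np.unique(b).size > 1    # not all equal
assert b.min() != b.max()
```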

Re: [Numpy-discussion] speed of array creation from tuples

2012-02-27 Thread Keith Goodman
On Mon, Feb 27, 2012 at 2:55 PM, Skipper Seabold wrote: > I am surprised by this (though maybe I shouldn't be?) It's always faster to > use list comprehension to unpack lists of tuples than np.array/asarray? > > [~/] > [1]: X = [tuple(np.random.randint(10,size=2)) for _ in > range(100)] > > [~/] >
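A sketch comparing the constructions under discussion (the sizes here are shrunk; the thread's timings used much larger inputs). All three produce the same array, so the choice is purely a speed question.

```python
import numpy as np

X = [(i, i + 1) for i in range(5)]   # a list of 2-tuples

a1 = np.array(X)                     # direct: convenient but can be slow
a2 = np.array([list(t) for t in X])  # unpack via list comprehension first
a3 = np.fromiter((v for t in X for v in t),
                 dtype=np.int64).reshape(len(X), 2)  # flat iterator, then reshape

assert (a1 == a2).all() and (a1 == a3).all()
```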

Re: [Numpy-discussion] Creating a bool array with Cython

2012-02-26 Thread Keith Goodman
On Sat, Feb 25, 2012 at 7:04 PM, Dag Sverre Seljebotn wrote: > On 02/25/2012 03:26 PM, Keith Goodman wrote: >> Is this a reasonable (and fast) way to create a bool array in cython? >> >>      def makebool(): >>          cdef: >>              int n = 2 >&

[Numpy-discussion] Creating a bool array with Cython

2012-02-25 Thread Keith Goodman
Is this a reasonable (and fast) way to create a bool array in cython? def makebool(): cdef: int n = 2 np.npy_intp *dims = [n] np.ndarray[np.uint8_t, ndim=1] a a = PyArray_EMPTY(1, dims, NPY_UINT8, 0) a[0] = 1 a[1] = 0

Re: [Numpy-discussion] swaxes(0, 1) 10% faster than transpose on 2D matrix?

2012-01-19 Thread Keith Goodman
On Thu, Jan 19, 2012 at 1:37 AM, Mark Bakker wrote: > I noticed that swapaxes(0,1) is consistently (on my system) 10% faster than > transpose on a 2D matrix. Transpose is faster for me. And a.T is faster than a.transpose() perhaps because a.transpose() checks that the inputs make sense? My guess
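Whatever the timing difference, `a.T` and `a.swapaxes(0, 1)` are interchangeable for a 2-D array: both return views of the same data (a small sketch).

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = a.swapaxes(0, 1)

assert (b == a.T).all()   # same values, same shape
b[0, 0] = 99              # b is a view, so this writes into a's data
assert a[0, 0] == 99
```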

Re: [Numpy-discussion] alterNEP - was: missing data discussion round 2

2011-06-30 Thread Keith Goodman
On Thu, Jun 30, 2011 at 10:51 AM, Nathaniel Smith wrote: > On Thu, Jun 30, 2011 at 6:31 AM, Matthew Brett > wrote: >> In the interest of making the discussion as concrete as possible, here >> is my draft of an alternative proposal for NAs and masking, based on >> Nathaniel's comments.  Writing i

Re: [Numpy-discussion] missing data discussion round 2

2011-06-27 Thread Keith Goodman
On Mon, Jun 27, 2011 at 8:55 AM, Mark Wiebe wrote: > First I'd like to thank everyone for all the feedback you're providing, > clearly this is an important topic to many people, and the discussion has > helped clarify the ideas for me. I've renamed and updated the NEP, then > placed it into the ma

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Keith Goodman
On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern wrote: > The alternative proposal would be to add a few new dtypes that are > NA-aware. E.g. an nafloat64 would reserve a particular NaN value > (there are lots of different NaN bit patterns, we'd just reserve one) > that would represent NA. An naint32

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Keith Goodman
On Thu, Jun 23, 2011 at 3:24 PM, Mark Wiebe wrote: > On Thu, Jun 23, 2011 at 5:05 PM, Keith Goodman wrote: >> >> On Thu, Jun 23, 2011 at 1:53 PM, Mark Wiebe wrote: >> > Enthought has asked me to look into the "missing data" problem and how >> > NumPy

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-23 Thread Keith Goodman
On Thu, Jun 23, 2011 at 1:53 PM, Mark Wiebe wrote: > Enthought has asked me to look into the "missing data" problem and how NumPy > could treat it better. I've considered the different ideas of adding dtype > variants with a special signal value and masked arrays, and concluded that > adding masks

Re: [Numpy-discussion] argmax for top N elements

2011-06-22 Thread Keith Goodman
On Wed, Jun 22, 2011 at 12:08 PM, RadimRehurek wrote: >> Date: Wed, 22 Jun 2011 11:30:47 -0400 >> From: Alex Flint >> Subject: [Numpy-discussion] argmax for top N elements >> >> Is it possible to use argmax or something similar to find the locations of >> the largest N elements in a matrix? > > I
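One way to get the locations of the N largest elements (a sketch; `np.argpartition`, added in later NumPy releases than this thread, avoids the full sort):

```python
import numpy as np

x = np.array([1, 9, 3, 7, 5])
n = 2

top = np.argsort(x)[-n:][::-1]      # full sort: indices of the n largest, descending
assert list(top) == [1, 3]          # x[1] == 9, x[3] == 7

part = np.argpartition(x, -n)[-n:]  # O(len(x)): the same indices, unordered
assert set(part) == {1, 3}
```

For a matrix, apply the same to `x.ravel()` and map the flat indices back with `np.unravel_index`.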

Re: [Numpy-discussion] fast SSD

2011-06-21 Thread Keith Goodman
On Tue, Jun 21, 2011 at 5:09 PM, Alex Flint wrote: > Is there a fast way to compute an array of sum-of-squared-differences > between a (small)  K x K array and all K x K sub-arrays of a larger array? > (i.e. each element x,y in the output array is the SSD between the small > array and the sub-arra
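A brute-force sketch using `numpy.lib.stride_tricks.sliding_window_view` (NumPy >= 1.20, well after this thread); for large K, expanding (a-b)^2 into correlation terms and using an FFT is faster.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

big = np.arange(16.0).reshape(4, 4)
tmpl = np.ones((2, 2))

wins = sliding_window_view(big, tmpl.shape)     # shape (3, 3, 2, 2), no copy
ssd = ((wins - tmpl) ** 2).sum(axis=(-2, -1))   # SSD at every offset

assert ssd.shape == (3, 3)
assert ssd[0, 0] == 26.0   # window [[0,1],[4,5]] vs ones: 1 + 0 + 9 + 16
```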

Re: [Numpy-discussion] poor performance of sum with sub-machine-word integer types

2011-06-21 Thread Keith Goodman
On Tue, Jun 21, 2011 at 9:46 AM, Zachary Pincus wrote: > Hello all, > > As a result of the "fast greyscale conversion" thread, I noticed an anomaly > with numpy.ndararray.sum(): summing along certain axes is much slower with > sum() than versus doing it explicitly, but only with integer dtypes a

Re: [Numpy-discussion] numpy type mismatch

2011-06-10 Thread Keith Goodman
On Fri, Jun 10, 2011 at 6:35 PM, Charles R Harris wrote: > On Fri, Jun 10, 2011 at 5:19 PM, Olivier Delalleau wrote: >> But isn't it a bug if numpy.dtype('i') != numpy.dtype('l') on a 32 bit >> computer where both are int32? >> > > Maybe yes, maybe no ;) They have different descriptors, so from

Re: [Numpy-discussion] Returning the same dtype in Cython as np.argmax

2011-06-10 Thread Keith Goodman
On Wed, Jun 8, 2011 at 9:49 PM, Travis Oliphant wrote: > On Jun 7, 2011, at 3:17 PM, Keith Goodman wrote: >> What is the rule to determine the dtype returned by numpy functions >> that return indices such as np.argmax? > > The return type of indices will be np.intp. Thanks,

[Numpy-discussion] Returning the same dtype in Cython as np.argmax

2011-06-07 Thread Keith Goodman
What is the rule to determine the dtype returned by numpy functions that return indices such as np.argmax? I assumed that np.argmax() returned the same dtype as np.int_. That works on linux32/64 and win32 but on win-amd64 np.int_ is int32 and np.argmax() returns int64. Someone suggested using int

[Numpy-discussion] [job] Python Job at Hedge Fund

2011-06-07 Thread Keith Goodman
We are looking for help to predict tomorrow's stock returns. The challenge is model selection in the presence of noisy data. The tools are ubuntu, python, cython, c, numpy, scipy, la, bottleneck, git. A quantitative background and experience or interest in model selection, machine learning, and s

Re: [Numpy-discussion] k maximal elements

2011-06-06 Thread Keith Goodman
On Mon, Jun 6, 2011 at 6:57 AM, gary ruben wrote: > I learn a lot by watching the numpy and scipy lists (today Olivier > taught me about heapq :), but he may not have noticed that Python 2.4 > added an nsmallest method) > > import heapq > q = list(x) > heapq.heapify(q) > k_smallest = heapq.nsmalle
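The heapq approach in runnable form: `nlargest`/`nsmallest` keep only a k-element heap, so they avoid sorting the whole sequence.

```python
import heapq

x = [5, 1, 9, 3, 7]

assert heapq.nlargest(2, x) == [9, 7]    # largest first
assert heapq.nsmallest(2, x) == [1, 3]   # smallest first
```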

Re: [Numpy-discussion] k maximal elements

2011-06-06 Thread Keith Goodman
On Mon, Jun 6, 2011 at 6:44 AM, Keith Goodman wrote: > On Sun, Jun 5, 2011 at 11:15 PM, Alex Ter-Sarkissov > wrote: >> I have a vector of positive integers length n. Is there a simple way (i.e. >> without sorting/ranking) of 'pulling out' k largest (or smallest) values.

Re: [Numpy-discussion] k maximal elements

2011-06-06 Thread Keith Goodman
On Sun, Jun 5, 2011 at 11:15 PM, Alex Ter-Sarkissov wrote: > I have a vector of positive integers length n. Is there a simple way (i.e. > without sorting/ranking) of 'pulling out' k largest (or smallest) values. > Something like > > sum(x[sum(x,1)>(max(sum(x,1)+min(sum(x,1/2,]) > > but smarter Y

Re: [Numpy-discussion] New functions.

2011-06-01 Thread Keith Goodman
On Tue, May 31, 2011 at 8:41 PM, Charles R Harris wrote: > On Tue, May 31, 2011 at 8:50 PM, Bruce Southey wrote: >> How about including all or some of Keith's Bottleneck package? >> He has tried to include some of the discussed functions and tried to >> make them very fast. > > I don't think the

Re: [Numpy-discussion] random seed replicate 2d randn with 1d loop

2011-05-23 Thread Keith Goodman
On Mon, May 23, 2011 at 12:34 PM, wrote: > Obviously I was working by columns, using a transpose worked, but > rewriting to axis=1 instead of axis=0 which should be more efficient > since I had almost all calculations by columns, I needed > params = map(lambda x: np.expand_dims(x, 1), params) >

Re: [Numpy-discussion] random seed replicate 2d randn with 1d loop

2011-05-23 Thread Keith Goodman
On Mon, May 23, 2011 at 11:42 AM, Keith Goodman wrote: > On Mon, May 23, 2011 at 11:33 AM,   wrote: >> I have a function in two versions, one vectorized, one with loop >> >> the vectorized function  gets all randn variables in one big array >> rvs = distr.rvs(

Re: [Numpy-discussion] random seed replicate 2d randn with 1d loop

2011-05-23 Thread Keith Goodman
On Mon, May 23, 2011 at 11:33 AM, wrote: > I have a function in two versions, one vectorized, one with loop > > the vectorized function  gets all randn variables in one big array > rvs = distr.rvs(args, **{'size':(nobs, nrep)}) > > the looping version has: >    for irep in xrange(nrep): >        

Re: [Numpy-discussion] Mapping of dtype to C types

2011-05-09 Thread Keith Goodman
On Mon, May 9, 2011 at 1:46 AM, Pauli Virtanen wrote: > Sun, 08 May 2011 14:45:45 -0700, Keith Goodman wrote: >> I'm writing a function that accepts four possible dtypes: int32, int64, >> float32, float64. The function will call a C extension (wrapped in >> Cython).

[Numpy-discussion] Mapping of dtype to C types

2011-05-08 Thread Keith Goodman
I'm writing a function that accepts four possible dtypes: int32, int64, float32, float64. The function will call a C extension (wrapped in Cython). What are the equivalent C types? int, long, float, double, respectively? Will that work on all systems?
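Short answer from the thread's context: C `int`/`long` widths vary by platform (`long` is typically 8 bytes on 64-bit Linux but 4 on 64-bit Windows), so the fixed-width NumPy C types (`npy_int32`, `npy_int64`, `npy_float32`, `npy_float64`) are the portable mapping. A quick check of the widths the extension must match:

```python
import numpy as np

# Fixed-width dtypes have guaranteed sizes on every platform:
assert np.dtype(np.int32).itemsize == 4
assert np.dtype(np.int64).itemsize == 8
assert np.dtype(np.float32).itemsize == 4   # C float
assert np.dtype(np.float64).itemsize == 8   # C double

# Platform-dependent: 'l' (C long) may be 4 or 8 bytes.
print(np.dtype('l').itemsize)
```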

Re: [Numpy-discussion] ANN: Numpy 1.6.0 release candidate 2

2011-05-03 Thread Keith Goodman
On Tue, May 3, 2011 at 11:18 AM, Ralf Gommers wrote: > I am pleased to announce the availability of the second release > candidate of NumPy 1.6.0. nanmin and nanmax are much faster in Numpy 1.6. Plus they now return an object that has dtype, etc attributes when the input is all NaN. Broke Bottle

Re: [Numpy-discussion] ANN: Numpy 1.6.0 release candidate 2

2011-05-03 Thread Keith Goodman
On Tue, May 3, 2011 at 11:18 AM, Ralf Gommers wrote: > I am pleased to announce the availability of the second release > candidate of NumPy 1.6.0. I get one failure when I run from ipython but not python. In ipython I import a few packages at startup. One of those packages must be changing the n

Re: [Numpy-discussion] Array as Variable using "from cdms2 import MV2 as MV"

2011-04-25 Thread Keith Goodman
On Mon, Apr 25, 2011 at 12:17 AM, wrote: > On Mon, Apr 25, 2011 at 2:50 AM, dileep kunjaai > wrote: >> Dear sir, >> >> I have 2 mxn numpy arrays, say "obs" & "fcst". I have to >> calculate the sum of squares of (obs[i, j]-fcst[i, j]) using from cdms2 >> import MV2 as MV   in CDAT witho

Re: [Numpy-discussion] Extending numpy statistics functions (like mean)

2011-04-11 Thread Keith Goodman
On Mon, Apr 11, 2011 at 2:36 PM, Sergio Pascual wrote: > Hi list. > > For my application, I would like to implement some new statistics > functions over numpy arrays, such as truncated mean. Ideally this new > function should have the same arguments > as numpy.mean: axis, dtype and out. Is there

Re: [Numpy-discussion] argmin and argmax without nan

2011-03-24 Thread Keith Goodman
On Thu, Mar 24, 2011 at 6:19 AM, Ralf Gommers wrote: > 2011/3/24 Dmitrey : >> hi, >> is there any way to get argmin and argmax of an array w/o nans? >> Currently I have > from numpy import * > argmax([10,nan,100]) >> 1 > argmin([10,nan,100]) >> 1 >> But it's not the values I would like

Re: [Numpy-discussion] moving window product

2011-03-21 Thread Keith Goodman
On Mon, Mar 21, 2011 at 11:27 AM, Brent Pedersen wrote: > my current use-case is to do this 24 times on arrays of about 200K elements. > file IO is the major bottleneck. Would using h5py or pytables help? I get about 3 ms for a write-read cycle for a 200K array. That's much faster than the convo

Re: [Numpy-discussion] moving window product

2011-03-21 Thread Keith Goodman
On Mon, Mar 21, 2011 at 10:34 AM, Brent Pedersen wrote: > On Mon, Mar 21, 2011 at 11:19 AM, Keith Goodman wrote: >> On Mon, Mar 21, 2011 at 10:10 AM, Brent Pedersen wrote: >>> hi, is there a way to take the product along a 1-d array in a moving >>> window? -- simila

Re: [Numpy-discussion] moving window product

2011-03-21 Thread Keith Goodman
On Mon, Mar 21, 2011 at 10:10 AM, Brent Pedersen wrote: > hi, is there a way to take the product along a 1-d array in a moving > window? -- similar to convolve, with product in place of sum? > currently, i'm column_stacking the array with offsets of itself into > window_size columns and then takin
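A sketch of a moving-window product with no explicit Python loop, using `sliding_window_view` (NumPy >= 1.20, long after this thread); the `column_stack`-with-offsets approach described in the post computes the same thing.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.array([1.0, 2.0, 3.0, 4.0])

# Product over each length-2 window: [1*2, 2*3, 3*4]
mp = sliding_window_view(a, 2).prod(axis=-1)

assert np.allclose(mp, [2.0, 6.0, 12.0])
```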

Re: [Numpy-discussion] assert_almost_equal bug?

2011-03-12 Thread Keith Goodman
On Sat, Mar 12, 2011 at 4:16 AM, Ralf Gommers wrote: > On Sat, Mar 12, 2011 at 12:10 AM, Keith Goodman wrote: >> assert_almost_equal() and assert_array_almost_equal() raise a >> ValueError instead of an AssertionError when the array contains >> np.inf: > > That

[Numpy-discussion] assert_almost_equal bug?

2011-03-11 Thread Keith Goodman
assert_almost_equal() and assert_array_almost_equal() raise a ValueError instead of an AssertionError when the array contains np.inf: >> a = np.array([[1., 2.], [3., 4.]]) >> b = a.copy() >> np.testing.assert_almost_equal(a, b) >> b[0,0] = np.inf >> np.testing.assert_almost_equal(a, b) ValueError

Re: [Numpy-discussion] How to get the prices of Moving Averages Crosses?

2011-03-01 Thread Keith Goodman
On Tue, Mar 1, 2011 at 8:07 AM, Andre Lopes wrote: > Hi, > > I'm new to Numpy. I'm doing some tests with some Stock Market Quotes > > My struggle right now is "how to get the values of the moving averages > crosses", I send an image in attach to illustrate what I'm trying to > get. > > I'm using t

[Numpy-discussion] When memory access is a bottleneck

2011-02-25 Thread Keith Goodman
A topic that often comes up on the list is that arr.sum(axis=-1) is faster than arr.sum(axis=0). For C ordered arrays, moving along the last axis moves the smallest amount in memory. And moving small amounts in memory keeps the data in cache longer. Can I use that fact to speed up calculations alon
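The cache argument follows from the strides: in a C-ordered array the last axis is contiguous in memory, so a reduction along it touches adjacent bytes, while a reduction along axis 0 jumps a whole row at a time. A small sketch:

```python
import numpy as np

a = np.ones((3, 4))          # C order, float64
assert a.strides == (32, 8)  # 8 bytes per step along the last axis, 32 along the first

# Same results either way; axis=-1 walks contiguous memory, axis=0 does not.
assert np.allclose(a.sum(axis=-1), 4.0)
assert np.allclose(a.sum(axis=0), 3.0)
```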

Re: [Numpy-discussion] NaN value processing in weave.inline code

2011-01-14 Thread Keith Goodman
On Fri, Jan 14, 2011 at 12:03 PM, Joon Ro wrote: > Hi, > I was wondering if it is possible to process (in if statement - check if the > given value is NaN) numpy NaN value inside the weave.inline c code. > > testcode = ''' > if (test(0)) { >       return_val = test(0); > } > ''' > > err = weave.in
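The standard trick, in C as in Python: NaN is the only floating-point value that compares unequal to itself, so `x != x` is a NaN test (and `isnan(x)` from math.h works inside weave's C code). A sketch of the same checks in Python:

```python
import numpy as np

v = np.nan

assert v != v            # NaN != NaN; the portable C test is the same, x != x
assert np.isnan(v)       # the explicit test
assert not np.isnan(1.0)
```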

Re: [Numpy-discussion] Output dtype

2011-01-12 Thread Keith Goodman
On Wed, Jan 12, 2011 at 8:20 AM, Bruce Southey wrote: > On 12/13/2010 04:53 PM, Keith Goodman wrote: >> On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey  wrote: >> >>> Unless something has changed since the docstring was written, this is >>> probably an inherited

Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-11 Thread Keith Goodman
On Tue, Jan 4, 2011 at 8:14 AM, Keith Goodman wrote: > On Tue, Jan 4, 2011 at 8:06 AM, Sebastian Haase wrote: >> On Mon, Jan 3, 2011 at 5:32 PM, Erik Rigtorp wrote: >>> On Mon, Jan 3, 2011 at 11:26, Eric Firing wrote: >>>> Instead of calculating statistics indepen

Re: [Numpy-discussion] aa.astype(int) truncates and doesn't round

2011-01-06 Thread Keith Goodman
On Thu, Jan 6, 2011 at 2:14 AM, wrote: > just something I bumped into and wasn't aware of > aa > array([ 1.,  1.,  1.,  1.,  1.]) aa.astype(int) > array([0, 1, 0, 0, 0]) aa - 1 > array([ -2.22044605e-16,   2.22044605e-16,  -2.22044605e-16, >        -3.33066907e-16,  -3.33066907e-16
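What bit here: `astype(int)` truncates toward zero rather than rounding, so a value infinitesimally below 1.0 becomes 0 even though it prints as 1. Rounding first avoids the surprise (a sketch):

```python
import numpy as np

a = np.array([0.9999999999999999, 1.7, -1.7])

assert list(a.astype(int)) == [0, 1, -1]           # truncates toward zero
assert list(np.rint(a).astype(int)) == [1, 2, -2]  # round to nearest, then convert
```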

Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-04 Thread Keith Goodman
On Tue, Jan 4, 2011 at 8:06 AM, Sebastian Haase wrote: > On Mon, Jan 3, 2011 at 5:32 PM, Erik Rigtorp wrote: >> On Mon, Jan 3, 2011 at 11:26, Eric Firing wrote: >>> Instead of calculating statistics independently each time the window is >>> advanced one data point, the statistics are updated.  I

Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Keith Goodman
On Mon, Jan 3, 2011 at 7:41 AM, Erik Rigtorp wrote: > On Mon, Jan 3, 2011 at 10:36, Keith Goodman wrote: >> On Mon, Jan 3, 2011 at 5:37 AM, Erik Rigtorp wrote: >> >>> It's only a view of the array, no copying is done. Though some >>> operations like np.

Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Keith Goodman
On Mon, Jan 3, 2011 at 5:37 AM, Erik Rigtorp wrote: > It's only a view of the array, no copying is done. Though some > operations like np.std()  will copy the array, but that's more of a > bug. In general It's hard to imagine any speedup gains by copying a > 10GB array. I don't think that np.std

Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Keith Goodman
On Fri, Dec 31, 2010 at 8:29 PM, Erik Rigtorp wrote: > Implementing moving average, moving std and other functions working > over rolling windows using python for loops are slow. This is a > effective stride trick I learned from Keith Goodman's > Bottleneck code but generalized into arrays of >
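A minimal 1-D version of the stride trick described above: `as_strided` adds a trailing window axis by reusing the last stride, so no data is copied and any reduction along `axis=-1` becomes a moving-window statistic.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def rolling_window(a, window):
    """View of `a` with a trailing axis of length `window` (no copy)."""
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return as_strided(a, shape=shape, strides=strides)

a = np.arange(5.0)
w = rolling_window(a, 3)                              # [[0,1,2],[1,2,3],[2,3,4]]
assert np.allclose(w.mean(axis=-1), [1.0, 2.0, 3.0])  # moving average, window 3
```

Because `as_strided` bypasses bounds checking, the view should be treated as read-only.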

Re: [Numpy-discussion] Faster NaN functions

2010-12-31 Thread Keith Goodman
On Fri, Dec 31, 2010 at 8:21 AM, Lev Givon wrote: > Received from Erik Rigtorp on Fri, Dec 31, 2010 at 08:52:53AM EST: >> Hi, >> >> I just send a pull request for some faster NaN functions, >> https://github.com/rigtorp/numpy. >> >> I implemented the following generalized ufuncs: nansum(), nancums

Re: [Numpy-discussion] NumPy C-API equivalent of np.float64()

2010-12-29 Thread Keith Goodman
On Wed, Dec 29, 2010 at 11:54 AM, Pauli Virtanen wrote: > Keith Goodman wrote: >> np.float64 is fast, just hoping someone had a C-API inline version of >> np.float64() that is faster. > > You're looking for PyArrayScalar_New and _ASSIGN. > See > https://github.c

Re: [Numpy-discussion] NumPy C-API equivalent of np.float64()

2010-12-29 Thread Keith Goodman
On Wed, Dec 29, 2010 at 11:43 AM, Matthew Brett wrote: > Hi, > >>> That might be because I'm not understanding you very well, but I was >>> thinking that: >>> >>> cdef dtype descr = PyArray_DescrFromType(NPY_FLOAT64) >>> >>> would give you the float64 dtype that I thought you wanted?  I'm >>> shoo

Re: [Numpy-discussion] NumPy C-API equivalent of np.float64()

2010-12-29 Thread Keith Goodman
On Wed, Dec 29, 2010 at 10:13 AM, Matthew Brett wrote: >>> Forgive me if I haven't understood your question, but can you use >>> PyArray_DescrFromType with e.g  NPY_FLOAT64 ? >> >> I'm pretty hopeless here. I don't know how to put all that together in >> a function. > > That might be because I'm n

Re: [Numpy-discussion] NumPy C-API equivalent of np.float64()

2010-12-29 Thread Keith Goodman
On Wed, Dec 29, 2010 at 9:48 AM, Matthew Brett wrote: > Hi, > > On Wed, Dec 29, 2010 at 5:37 PM, Robert Bradshaw > wrote: >> On Wed, Dec 29, 2010 at 9:05 AM, Keith Goodman wrote: >>> On Tue, Dec 28, 2010 at 11:22 PM, Robert Bradshaw >>> wrote: >&g

Re: [Numpy-discussion] NumPy C-API equivalent of np.float64()

2010-12-29 Thread Keith Goodman
On Wed, Dec 29, 2010 at 9:37 AM, Robert Bradshaw wrote: > On Wed, Dec 29, 2010 at 9:05 AM, Keith Goodman wrote: >> On Tue, Dec 28, 2010 at 11:22 PM, Robert Bradshaw >> wrote: >>> On Tue, Dec 28, 2010 at 8:10 PM, John Salvatier >>> wrote: >>>> Wo

Re: [Numpy-discussion] NumPy C-API equivalent of np.float64()

2010-12-29 Thread Keith Goodman
On Tue, Dec 28, 2010 at 11:22 PM, Robert Bradshaw wrote: > On Tue, Dec 28, 2010 at 8:10 PM, John Salvatier > wrote: >> Wouldn't that be a cast? You do casts in Cython with (expression) >> and that should be the equivalent of float64 I think. > > Or even (expression) if you've cimported numpy > (t

[Numpy-discussion] NumPy C-API equivalent of np.float64()

2010-12-28 Thread Keith Goodman
I'm looking for the C-API equivalent of the np.float64 function, something that I could use inline in a Cython function. I don't know how to write the function. Anyone have one sitting around? I'd like to use it, if it is faster than np.float64 (np.int32, np.float32, ...) in the Bottleneck package

Re: [Numpy-discussion] How construct custom slice

2010-12-27 Thread Keith Goodman
On Mon, Dec 27, 2010 at 10:36 AM, Mario Moura wrote: > Hi Folks > > a = np.zeros((4,3,5,55,5),dtype='|S8') > myLen = 4 # here I use myLen = len(something) > li = [3,2,4] # li from a list.append(something) > sl = slice(0,myLen) > tmpIndex = tuple(li) + sl + 4  # <== Here my problem > a[tmpIndex] >
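The fix for the quoted problem: a slice object can't be concatenated onto a tuple with `+` directly, but it can be wrapped as a tuple element, and the whole tuple then works as an index (a sketch using the poster's shapes):

```python
import numpy as np

a = np.zeros((4, 3, 5, 55, 5), dtype='S8')
li = [3, 2, 4]
sl = slice(0, 4)

idx = tuple(li) + (sl, 4)   # the slice is just one element of the index tuple

assert a[idx].shape == (4,)
assert (a[idx] == a[3, 2, 4, 0:4, 4]).all()
```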

Re: [Numpy-discussion] Output dtype

2010-12-13 Thread Keith Goodman
On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey wrote: > Unless something has changed since the docstring was written, this is > probably an inherited 'bug' from np.mean() as the author expected that > the docstring of mean was correct. For my 'old' 2.0 dev version: > >  >>> np.mean( np.array([[0

Re: [Numpy-discussion] Output dtype

2010-12-13 Thread Keith Goodman
On Mon, Dec 13, 2010 at 12:20 PM, Bruce Southey wrote: > On 12/13/2010 11:59 AM, Keith Goodman wrote: >> From the np.median doc string: "If the input contains integers, or >> floats of smaller precision than 64, then the output data-type is >> float64."

[Numpy-discussion] Output dtype

2010-12-13 Thread Keith Goodman
From the np.median doc string: "If the input contains integers, or floats of smaller precision than 64, then the output data-type is float64." >> arr = np.array([[0,1,2,3,4,5]], dtype='float32') >> np.median(arr, axis=0).dtype dtype('float32') >> np.median(arr, axis=1).dtype dtype('float32'

Re: [Numpy-discussion] np.var() and ddof

2010-12-10 Thread Keith Goodman
On Fri, Dec 10, 2010 at 2:26 PM, wrote: > On Fri, Dec 10, 2010 at 4:42 PM, Keith Goodman wrote: >> Why does ddof=2 and ddof=3 give the same result? >> >>>> np.var([1, 2, 3], ddof=0) >>   0.6666666666666666 >>>> np.var([1, 2, 3], ddof=1) >

[Numpy-discussion] np.var() and ddof

2010-12-10 Thread Keith Goodman
Why does ddof=2 and ddof=3 give the same result? >> np.var([1, 2, 3], ddof=0) 0.6666666666666666 >> np.var([1, 2, 3], ddof=1) 1.0 >> np.var([1, 2, 3], ddof=2) 2.0 >> np.var([1, 2, 3], ddof=3) 2.0 >> np.var([1, 2, 3], ddof=4) -2.0 I expected NaN for ddof=3.
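For reference, what ddof does when it behaves: with [1, 2, 3] the sum of squared deviations from the mean is 2.0, and the variance is that sum divided by n - ddof (a sketch of the well-defined cases; ddof=3 divides by zero, which is the anomaly the post reports).

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])   # sum of squared deviations from the mean: 2.0

assert np.isclose(np.var(x, ddof=0), 2.0 / 3)   # population variance: 2 / 3
assert np.isclose(np.var(x, ddof=1), 1.0)       # sample variance: 2 / 2
assert np.isclose(np.var(x, ddof=2), 2.0)       # 2 / 1
```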

Re: [Numpy-discussion] A Cython apply_along_axis function

2010-12-10 Thread Keith Goodman
On Wed, Dec 1, 2010 at 6:07 PM, Keith Goodman wrote: > On Wed, Dec 1, 2010 at 5:53 PM, David wrote: > >> On 12/02/2010 04:47 AM, Keith Goodman wrote: >>> It's hard to write Cython code that can handle all dtypes and >>> arbitrary number of dimensions. The forme

Re: [Numpy-discussion] A Cython apply_along_axis function

2010-12-01 Thread Keith Goodman
On Wed, Dec 1, 2010 at 5:53 PM, David wrote: > On 12/02/2010 04:47 AM, Keith Goodman wrote: >> It's hard to write Cython code that can handle all dtypes and >> arbitrary number of dimensions. The former is typically dealt with >> using templates, but what do people do

[Numpy-discussion] A Cython apply_along_axis function

2010-12-01 Thread Keith Goodman
It's hard to write Cython code that can handle all dtypes and arbitrary number of dimensions. The former is typically dealt with using templates, but what do people do about the latter? I'm trying to take baby steps towards writing an apply_along_axis function that takes as input a cython function

Re: [Numpy-discussion] Warning: invalid value encountered in subtract

2010-11-30 Thread Keith Goodman
On Tue, Nov 30, 2010 at 2:25 PM, Robert Kern wrote: > On Tue, Nov 30, 2010 at 16:22, Keith Goodman wrote: >> On Tue, Nov 30, 2010 at 1:41 PM, Skipper Seabold wrote: >>> On Tue, Nov 30, 2010 at 1:34 PM, Keith Goodman wrote: >>>> After upgrading from numpy 1.4.

Re: [Numpy-discussion] Warning: invalid value encountered in subtract

2010-11-30 Thread Keith Goodman
On Tue, Nov 30, 2010 at 1:41 PM, Skipper Seabold wrote: > On Tue, Nov 30, 2010 at 1:34 PM, Keith Goodman wrote: >> After upgrading from numpy 1.4.1 to 1.5.1 I get warnings like >> "Warning: invalid value encountered in subtract" when I run unit tests >> (or timei

Re: [Numpy-discussion] A faster median (Wirth's method)

2010-11-30 Thread Keith Goodman
On Tue, Nov 30, 2010 at 11:58 AM, Matthew Brett wrote: > Hi, > > On Tue, Nov 30, 2010 at 11:35 AM, Keith Goodman wrote: >> On Tue, Nov 30, 2010 at 11:25 AM, John Salvatier >> wrote: >>> I am very interested in this result. I have wanted to know how to do an >&

Re: [Numpy-discussion] A faster median (Wirth's method)

2010-11-30 Thread Keith Goodman
On Tue, Nov 30, 2010 at 11:25 AM, John Salvatier wrote: > I am very interested in this result. I have wanted to know how to do an My first thought was to write the reducing function like this cdef np.float64_t namean(np.ndarray[np.float64_t, ndim=1] a): but cython doesn't allow np.ndarray in a

Re: [Numpy-discussion] A faster median (Wirth's method)

2010-11-30 Thread Keith Goodman
On Tue, Sep 1, 2009 at 2:37 PM, Sturla Molden wrote: > Dag Sverre Seljebotn skrev: >> >> Nitpick: This will fail on large arrays. I guess numpy.npy_intp is the >> right type to use in this case? >> > By the way, here is a more polished version, does it look ok? > > http://projects.scipy.org/numpy/

[Numpy-discussion] Warning: invalid value encountered in subtract

2010-11-30 Thread Keith Goodman
After upgrading from numpy 1.4.1 to 1.5.1 I get warnings like "Warning: invalid value encountered in subtract" when I run unit tests (or timeit) using "python -c 'blah'" but not from an interactive session. How can I tell the warnings to go away?

Re: [Numpy-discussion] Does np.std() make two passes through the data?

2010-11-22 Thread Keith Goodman
On Mon, Nov 22, 2010 at 11:00 AM, wrote: > I don't think that works for complex numbers. > (statsmodels has now a preference that calculations work also for > complex numbers) I'm only supporting int32, int64, float64 for now. Getting the other ints and floats should be easy. I don't have plans

Re: [Numpy-discussion] Does np.std() make two passes through the data?

2010-11-22 Thread Keith Goodman
On Mon, Nov 22, 2010 at 10:51 AM, wrote: > On Mon, Nov 22, 2010 at 1:39 PM, Keith Goodman wrote: >> On Mon, Nov 22, 2010 at 10:32 AM,   wrote: >>> On Mon, Nov 22, 2010 at 1:26 PM, Keith Goodman wrote: >>>> On Mon, Nov 22, 2010 at 9:03 AM, Keith Goodman wrote:

Re: [Numpy-discussion] Does np.std() make two passes through the data?

2010-11-22 Thread Keith Goodman
On Mon, Nov 22, 2010 at 10:32 AM, wrote: > On Mon, Nov 22, 2010 at 1:26 PM, Keith Goodman wrote: >> On Mon, Nov 22, 2010 at 9:03 AM, Keith Goodman wrote: >> >>> @cython.boundscheck(False) >>> @cython.wraparound(False) >>> def nanstd_twopass(np

Re: [Numpy-discussion] Does np.std() make two passes through the data?

2010-11-22 Thread Keith Goodman
On Mon, Nov 22, 2010 at 9:03 AM, Keith Goodman wrote: > @cython.boundscheck(False) > @cython.wraparound(False) > def nanstd_twopass(np.ndarray[np.float64_t, ndim=1] a, int ddof): >    "nanstd of 1d numpy array with dtype=np.float64 along axis=0." >    cdef Py_ssize_t i

Re: [Numpy-discussion] Does np.std() make two passes through the data?

2010-11-22 Thread Keith Goodman
On Mon, Nov 22, 2010 at 9:13 AM, wrote: > Two pass would provide precision that we would expect in numpy, but I > don't know if anyone ever tested the NIST problems for basic > statistics. Here are the results for their most difficult dataset. But I guess running one test doesn't mean anything.

Re: [Numpy-discussion] Does np.std() make two passes through the data?

2010-11-22 Thread Keith Goodman
On Sun, Nov 21, 2010 at 5:56 PM, Robert Kern wrote: > On Sun, Nov 21, 2010 at 19:49, Keith Goodman wrote: > >> But this sample gives a difference: >> >>>> a = np.random.rand(100) >>>> a.var() >>   0.080232196646619805 >>>> var(a) >&

Re: [Numpy-discussion] Does np.std() make two passes through the data?

2010-11-21 Thread Keith Goodman
On Sun, Nov 21, 2010 at 4:18 PM, wrote: > On Sun, Nov 21, 2010 at 6:43 PM, Keith Goodman wrote: >> Does np.std() make two passes through the data? >> >> Numpy: >> >>>> arr = np.random.rand(10) >>>> arr.std() >>   0.3008736260967052 >

[Numpy-discussion] Does np.std() make two passes through the data?

2010-11-21 Thread Keith Goodman
Does np.std() make two passes through the data? Numpy: >> arr = np.random.rand(10) >> arr.std() 0.3008736260967052 Looks like an algorithm that makes one pass through the data (one for loop) wouldn't match arr.std(): >> np.sqrt((arr*arr).mean() - arr.mean()**2) 0.30087362609670526 But a
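The comparison in the message above can be reproduced as a short sketch (seeded data, so this is illustrative rather than the exact numbers from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
arr = rng.random(10)

# Textbook one-pass identity: var = E[x^2] - E[x]^2.
one_pass = np.sqrt((arr * arr).mean() - arr.mean() ** 2)

# Two-pass: subtract the mean first, then square the deviations.
two_pass = np.sqrt(((arr - arr.mean()) ** 2).mean())

print(one_pass, two_pass, arr.std())
```

For well-scaled data like this the two agree to machine precision, which is why the single sample in the thread matched. The difference shows up when the mean is large relative to the spread (e.g. `arr + 1e8`): the one-pass formula subtracts two nearly equal large numbers and loses most of its significant digits, while the two-pass result stays accurate.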

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-21 Thread Keith Goodman
On Sun, Nov 21, 2010 at 3:16 PM, Wes McKinney wrote: > What would you say to a single package that contains: > > - NaN-aware NumPy and SciPy functions (nanmean, nanmin, etc.) I'd say yes. > - moving window functions (moving_{count, sum, mean, var, std, etc.}) Yes. BTW, we both do arr=arr.asty

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-21 Thread Keith Goodman
On Sun, Nov 21, 2010 at 12:30 PM, wrote: > On Sun, Nov 21, 2010 at 2:48 PM, Keith Goodman wrote: >> On Sun, Nov 21, 2010 at 10:25 AM, Wes McKinney wrote: >>> On Sat, Nov 20, 2010 at 7:24 PM, Keith Goodman wrote: >>>> On Sat, Nov 20, 2010 at 3:54 PM, Wes McKinney

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-21 Thread Keith Goodman
On Sun, Nov 21, 2010 at 10:25 AM, Wes McKinney wrote: > On Sat, Nov 20, 2010 at 7:24 PM, Keith Goodman wrote: >> On Sat, Nov 20, 2010 at 3:54 PM, Wes McKinney wrote: >> >>> Keith (and others), >>> >>> What would you think about creating a library o

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-20 Thread Keith Goodman
On Sat, Nov 20, 2010 at 3:54 PM, Wes McKinney wrote: > Keith (and others), > > What would you think about creating a library of mostly Cython-based > "domain specific functions"? So stuff like rolling statistical > moments, nan* functions like you have here, and all that-- NumPy-array > only func

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-20 Thread Keith Goodman
On Fri, Nov 19, 2010 at 7:42 PM, Keith Goodman wrote: > I should make a benchmark suite. >> ny.benchit(verbose=False) Nanny performance benchmark Nanny 0.0.1dev Numpy 1.4.1 Speed is numpy time divided by nanny time NaN means all NaNs Speed Test

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 8:33 PM, wrote: >>> -np.inf > -np.inf > False > > If the only value is -np.inf, you will return nan, I guess. > >>> np.nanmax([-np.inf, np.nan]) > -inf That's a great corner case. Thanks, Josef. This looks like it would fix it: change if ai > amax: amax = ai to i
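A sketch of the fix being discussed (a plain-Python stand-in for the Cython loop, not the exact Nanny code): with a strict `>` test, an input containing only `-inf` and NaN never updates the running max away from its `-inf` starting value, so it is indistinguishable from the all-NaN case. Using `>=` plus an all-NaN flag handles both:

```python
import numpy as np

def nanmax_1d(a):
    # Running max that skips NaNs; >= (rather than >) lets an input
    # value of -inf clear the all-NaN flag, since -inf >= -inf is True.
    amax = -np.inf
    allnan = True
    for ai in a:
        if ai >= amax:  # comparison is False whenever ai is NaN
            amax = ai
            allnan = False
    return np.nan if allnan else amax

print(nanmax_1d(np.array([1.0, np.nan, 2.0])))  # 2.0
print(nanmax_1d(np.array([-np.inf, np.nan])))   # -inf
print(nanmax_1d(np.array([np.nan, np.nan])))    # nan
```

This matches `np.nanmax` on the corner case Josef raised: `[-inf, nan]` returns `-inf`, while an all-NaN input returns NaN.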

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 8:05 PM, Charles R Harris wrote: > This doesn't look right: > > @cython.boundscheck(False) > @cython.wraparound(False) > def nanmax_2d_float64_axisNone(np.ndarray[np.float64_t, ndim=2] a): > "nanmax of 2d numpy array with dtype=np.float64 along axis=None." > cdef P

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 7:51 PM, wrote: > > does this give you the correct answer? > 1>np.nan > False > > What's the starting value for amax? -inf? Because "1 > np.nan" is False, the current running max does not get updated, which is what we want. >> import nanny as ny >> np.nanmax([1, np.

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 7:19 PM, Charles R Harris wrote: > > > On Fri, Nov 19, 2010 at 1:50 PM, Keith Goodman wrote: >> >> On Fri, Nov 19, 2010 at 12:29 PM, Keith Goodman >> wrote: >> > On Fri, Nov 19, 2010 at 12:19 PM, Pauli Virtanen wrote: >> >>

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 12:29 PM, Keith Goodman wrote: > On Fri, Nov 19, 2010 at 12:19 PM, Pauli Virtanen wrote: >> Fri, 19 Nov 2010 11:19:57 -0800, Keith Goodman wrote: >> [clip] >>> My guess is that having separate underlying functions for each dtype, >>> nd

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 12:19 PM, Pauli Virtanen wrote: > Fri, 19 Nov 2010 11:19:57 -0800, Keith Goodman wrote: > [clip] >> My guess is that having separate underlying functions for each dtype, >> ndim, and axis would be a nightmare for a large project like Numpy. But >>

Re: [Numpy-discussion] [ANN] Nanny, faster NaN functions

2010-11-19 Thread Keith Goodman
On Fri, Nov 19, 2010 at 12:10 PM, wrote: > What's the speed advantage of nanny compared to np.nansum that you > have if the arrays are larger, say (1000,10) or (1,100) axis=0 ? Good point. In the small examples I showed so far maybe the speed up was all in overhead. Fortunately, that's not
