[Numpy-discussion] bug with mmap'ed datetime64 arrays

2014-02-17 Thread Charles G. Waldman
test case:

    #!/usr/bin/env python
    import numpy as np
    a = np.array(['2014', '2015', '2016'], dtype='datetime64')
    x = np.datetime64('2015')
    print a > x
    np.save('test.npy', a)
    b = np.load('test.npy', mmap_mode='c')
    print b > x

result:

    >>> [False False True]
    Traceback (most recent call last):
      File "", li
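A sketch of one possible workaround (an assumption on my part, not a fix from the thread): force a plain in-memory copy of the loaded data instead of comparing the mmap'ed view directly. Written for Python 3; the affected-version behavior is as reported above.

```python
import os
import tempfile
import numpy as np

a = np.array(['2014', '2015', '2016'], dtype='datetime64')
x = np.datetime64('2015')
print(a > x)                          # [False False True]

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, 'test.npy')
    np.save(path, a)
    # Workaround sketch: np.array(...) materializes the memory-mapped
    # view as a regular ndarray before the comparison runs.
    b = np.array(np.load(path, mmap_mode='c'))
    print(b > x)                      # [False False True]
```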

Re: [Numpy-discussion] Proposal to make power return float, and other such things.

2014-02-17 Thread alex
On Mon, Feb 17, 2014 at 8:13 PM, Charles R Harris wrote: > This is apropos issue #899, where it is suggested that power promote > integers to float. That sounds reasonable to me, but such a change in > behavior makes it a bit iffy. > > Thoughts? After this change, what would be the recommended wa

Re: [Numpy-discussion] Proposal to make power return float, and other such things.

2014-02-17 Thread Alan G Isaac
On 2/17/2014 8:13 PM, Charles R Harris wrote: > This is apropos issue #899, where > it is suggested that power promote integers to float. Even when base and exponent are both positive integers? Alan Isaac

[Numpy-discussion] Proposal to make power return float, and other such things.

2014-02-17 Thread Charles R Harris
This is apropos issue #899, where it is suggested that power promote integers to float. That sounds reasonable to me, but such a change in behavior makes it a bit iffy. Thoughts? Chuck
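For context on the promotion question, plain Python 3 already made this choice for its own `**` operator: integer results stay exact while they can, but a negative exponent promotes to float rather than truncating. This illustrates Python semantics only, not what NumPy does or should do:

```python
# Python 3 `**` on ints: exact while the result is an integer...
print(2 ** 10)       # 1024
# ...but a negative exponent promotes the result to float.
print(2 ** -2)       # 0.25
# Truncating back to int shows what an all-integer power would lose.
print(int(2 ** -2))  # 0
```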

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread David Cournapeau
On Mon, Feb 17, 2014 at 7:31 PM, Julian Taylor < jtaylor.deb...@googlemail.com> wrote: > hi, > I noticed that during some simplistic benchmarks (e.g. > https://github.com/numpy/numpy/issues/4310) a lot of time is spent in > the kernel zeroing pages. > This is because under linux glibc will always

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Stefan Seefeld
On 02/17/2014 06:56 PM, Nathaniel Smith wrote: > On Mon, Feb 17, 2014 at 3:55 PM, Stefan Seefeld wrote: >> On 02/17/2014 03:42 PM, Nathaniel Smith wrote: >>> Another optimization we should consider that might help a lot in the >>> same situations where this would help: for code called from the >>>

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Julian Taylor
On 17.02.2014 22:27, Sturla Molden wrote: > Nathaniel Smith wrote: >> Also, I'd be pretty wary of caching large chunks of unused memory. People >> already have a lot of trouble understanding their program's memory usage, >> and getting rid of 'greedy free' will make this even worse. > > A cache w

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Nathaniel Smith
On Mon, Feb 17, 2014 at 3:55 PM, Stefan Seefeld wrote: > On 02/17/2014 03:42 PM, Nathaniel Smith wrote: >> Another optimization we should consider that might help a lot in the >> same situations where this would help: for code called from the >> cpython eval loop, it's afaict possible to determine

Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function

2014-02-17 Thread josef . pktd
On Mon, Feb 17, 2014 at 4:57 PM, wrote: > On Mon, Feb 17, 2014 at 4:39 PM, Stefan Otte wrote: >> Hey guys, >> >> I wrote myself a little helper function `mdot` which chains np.dot for >> multiple arrays. So I can write >> >> mdot(A, B, C, D, E) >> >> instead of these >> >> A.dot(B).dot(C

Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function

2014-02-17 Thread Eelco Hoogendoorn
considering np.dot takes only its binary positional args and a single defaulted kwarg, passing in a variable number of positional args as a list makes sense. Then just call the builtin reduce on the list, and there you go. I also generally approve of such semantics for binary associative operation

Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function

2014-02-17 Thread Jaime Fernández del Río
Perhaps you could reuse np.dot, by giving its second argument a default None value, and passing a tuple as first argument, i.e. np.dot((a, b, c)) would compute a.dot(b).dot(c), possibly not in that order. As is suggested in the matlab thread linked by Josef, if you do implement an optimal ordering

Re: [Numpy-discussion] Proposal: Chaining np.dot with mdot helper function

2014-02-17 Thread josef . pktd
On Mon, Feb 17, 2014 at 4:39 PM, Stefan Otte wrote: > Hey guys, > > I wrote myself a little helper function `mdot` which chains np.dot for > multiple arrays. So I can write > > mdot(A, B, C, D, E) > > instead of these > > A.dot(B).dot(C).dot(D).dot(E) > np.dot(np.dot(np.dot(np.dot(A, B

[Numpy-discussion] Proposal: Chaining np.dot with mdot helper function

2014-02-17 Thread Stefan Otte
Hey guys, I wrote myself a little helper function `mdot` which chains np.dot for multiple arrays. So I can write mdot(A, B, C, D, E) instead of these A.dot(B).dot(C).dot(D).dot(E) np.dot(np.dot(np.dot(np.dot(A, B), C), D), E) I know you can use `numpy.matrix` to get nicer formulas.
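The chaining Stefan describes (and Eelco's reduce suggestion below it) can be sketched in a few lines. `toy_dot` here is a hypothetical pure-Python stand-in for `np.dot` on nested lists, so the sketch runs without NumPy; it is not Stefan's actual implementation:

```python
from functools import reduce

def toy_dot(a, b):
    """Plain-Python matrix product, standing in for np.dot."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def mdot(*arrays):
    """Chain toy_dot over any number of matrices, left to right."""
    return reduce(toy_dot, arrays)

A = [[1, 0], [0, 1]]   # identity
B = [[2, 0], [0, 2]]
C = [[1, 2], [3, 4]]
print(mdot(A, B, C))   # [[2, 4], [6, 8]], same as toy_dot(toy_dot(A, B), C)
```

Note that a plain left-to-right reduce ignores the optimal-ordering question raised elsewhere in the thread: the matrix-chain multiplication order can change the operation count dramatically, which reduce makes no attempt to exploit.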

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Sturla Molden
Nathaniel Smith wrote: > Also, I'd be pretty wary of caching large chunks of unused memory. People > already have a lot of trouble understanding their program's memory usage, > and getting rid of 'greedy free' will make this even worse. A cache would only be needed when there is a lot of computin

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Stefan Seefeld
On 02/17/2014 03:42 PM, Nathaniel Smith wrote: > Another optimization we should consider that might help a lot in the > same situations where this would help: for code called from the > cpython eval loop, it's afaict possible to determine which inputs are > temporaries by checking their refcnt. In

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Nathaniel Smith
On 17 Feb 2014 15:17, "Sturla Molden" wrote: > > Julian Taylor wrote: > > > When an array is created it tries to get its memory from the cache and > > when its deallocated it returns it to the cache. > > Good idea, however there is already a C function that does this. It uses a > heap to keep the

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Julian Taylor
On 17.02.2014 21:16, Sturla Molden wrote: > Julian Taylor wrote: > >> When an array is created it tries to get its memory from the cache and >> when its deallocated it returns it to the cache. > > Good idea, however there is already a C function that does this. It uses a > heap to keep the cache

Re: [Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Sturla Molden
Julian Taylor wrote: > When an array is created it tries to get its memory from the cache and > when its deallocated it returns it to the cache. Good idea, however there is already a C function that does this. It uses a heap to keep the cached memory blocks sorted according to size. You know it

[Numpy-discussion] allocated memory cache for numpy

2014-02-17 Thread Julian Taylor
hi, I noticed that during some simplistic benchmarks (e.g. https://github.com/numpy/numpy/issues/4310) a lot of time is spent in the kernel zeroing pages. This is because under linux glibc will always allocate large memory blocks with mmap. As these pages can come from other processes the kernel mu
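The caching idea amounts to a size-bucketed free list: deallocated blocks are kept keyed by size and handed back on the next same-size request, so glibc never re-mmaps (and the kernel never re-zeroes) those pages. A toy Python model of the bookkeeping only, not the proposed C implementation:

```python
from collections import defaultdict

class AllocCache:
    """Toy model of a size-bucketed allocation cache."""
    def __init__(self, max_cached=8):
        self.free = defaultdict(list)   # size -> list of cached blocks
        self.max_cached = max_cached    # cap per bucket to bound memory held

    def allocate(self, size):
        bucket = self.free[size]
        if bucket:
            return bucket.pop()         # cache hit: reuse, no fresh mmap/zeroing
        return bytearray(size)          # cache miss: "real" allocation

    def deallocate(self, block):
        bucket = self.free[len(block)]
        if len(bucket) < self.max_cached:
            bucket.append(block)        # keep for reuse instead of freeing

cache = AllocCache()
a = cache.allocate(4096)
cache.deallocate(a)
b = cache.allocate(4096)
print(b is a)   # True: the second request reused the cached block
```

The per-bucket cap mirrors the concern raised later in the thread: an unbounded cache of "greedily kept" blocks makes a program's memory usage hard to reason about.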

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
Sturla Molden wrote: > wrote: > maybe -1 >> >> statsmodels is using np.linalg.pinv which uses svd >> I never heard of any crash (*), and the only time I compared with >> scipy I didn't like the slowdown. > > If you did care about speed in least-squares fitting you would not call QR > or SVD

Re: [Numpy-discussion] argsort speed

2014-02-17 Thread Charles R Harris
On Mon, Feb 17, 2014 at 11:32 AM, Julian Taylor < jtaylor.deb...@googlemail.com> wrote: > On 17.02.2014 15:18, Francesc Alted wrote: > > On 2/17/14, 1:08 AM, josef.p...@gmail.com wrote: > >> On Sun, Feb 16, 2014 at 6:12 PM, Daπid wrote: > >>> On 16 February 2014 23:43, wrote: > What's the f

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
wrote: maybe -1 > > statsmodels is using np.linalg.pinv which uses svd > I never heard of any crash (*), and the only time I compared with > scipy I didn't like the slowdown. If you did care about speed in least-squares fitting you would not call QR or SVD directly, but use the builtin LAPA

Re: [Numpy-discussion] argsort speed

2014-02-17 Thread Julian Taylor
On 17.02.2014 15:18, Francesc Alted wrote: > On 2/17/14, 1:08 AM, josef.p...@gmail.com wrote: >> On Sun, Feb 16, 2014 at 6:12 PM, Daπid wrote: >>> On 16 February 2014 23:43, wrote: What's the fastest argsort for a 1d array with around 28 Million elements, roughly uniformly distributed,

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
Sturla Molden wrote: > Dave Hirschfeld wrote: > >> Even if lapack_lite always performed the isfinite check and threw a python >> error if False, it would be much better than either hanging or segfaulting >> and >> people who care about the isfinite cost probably would be linking to a fast >>

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
Dave Hirschfeld wrote: > Even if lapack_lite always performed the isfinite check and threw a python > error if False, it would be much better than either hanging or segfaulting > and > people who care about the isfinite cost probably would be linking to a fast > lapack anyway. +1 (if I have

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Dave Hirschfeld
Sturla Molden gmail.com> writes: > > gmail.com> wrote: > > > I use official numpy release for development, Windows, 32bit python, > > i.e. MinGW 3.5 and whatever old ATLAS the release includes. > > > > a constant 13% cpu usage is 1/8 th of my 8 virtual cores. > > Based on this and Alex' mess

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
wrote: > I use official numpy release for development, Windows, 32bit python, > i.e. MinGW 3.5 and whatever old ATLAS the release includes. > > a constant 13% cpu usage is 1/8 th of my 8 virtual cores. Based on this and Alex' message it seems the offender is the f2c generated lapack_lite librar

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread josef . pktd
On Mon, Feb 17, 2014 at 10:03 AM, alex wrote: > On Mon, Feb 17, 2014 at 4:49 AM, Dave Hirschfeld wrote: >> alex ncsu.edu> writes: >> >>> >>> Hello list, >>> >>> Here's another idea resurrection from numpy github comments that I've >>> been advised could be posted here for re-discussion. >>> >>>

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread alex
On Mon, Feb 17, 2014 at 4:49 AM, Dave Hirschfeld wrote: > alex ncsu.edu> writes: > >> >> Hello list, >> >> Here's another idea resurrection from numpy github comments that I've >> been advised could be posted here for re-discussion. >> >> The proposal would be to make np.linalg.svd more like scip

Re: [Numpy-discussion] argsort speed

2014-02-17 Thread josef . pktd
On Mon, Feb 17, 2014 at 9:18 AM, Francesc Alted wrote: > On 2/17/14, 1:08 AM, josef.p...@gmail.com wrote: >> On Sun, Feb 16, 2014 at 6:12 PM, Daπid wrote: >>> On 16 February 2014 23:43, wrote: What's the fastest argsort for a 1d array with around 28 Million elements, roughly uniformly

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
Jason Grout wrote: > For what my vote is worth, -1. I thought this was pretty much the > designed difference between the scipy and numpy linalg routines. Scipy > does the checking, and numpy provides the raw speed. Maybe this is > better resolved as a note in the documentation for numpy ab

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Sturla Molden
Dave Hirschfeld wrote: > It certainly shouldn't crash or hang though and for me at least it doesn't - > it returns NaN which immediately suggests to me that I've got bad input > (maybe just because I've seen it before). It might be dependent on the BLAS or LAPACK version. Since you are on Anac

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Jason Grout
On 2/15/14 3:37 PM, alex wrote: > The proposal would be to make np.linalg.svd more like scipy.linalg.svd > with respect to input checking. The argument against the change is > raw speed; if you know that you will never feed non-finite input to > svd, then np.linalg.svd is a bit faster than scipy.l

Re: [Numpy-discussion] svd error checking vs. speed

2014-02-17 Thread Dave Hirschfeld
alex ncsu.edu> writes: > > Hello list, > > Here's another idea resurrection from numpy github comments that I've > been advised could be posted here for re-discussion. > > The proposal would be to make np.linalg.svd more like scipy.linalg.svd > with respect to input checking. The argument aga
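The check under discussion amounts to an `isfinite` scan before handing the matrix to LAPACK, which is what `scipy.linalg.svd` does via its `check_finite` behavior. A minimal pure-Python sketch of the guard (using `math.isfinite` on a flat list rather than `np.isfinite` on an array, so it runs standalone; the error message is my own wording):

```python
import math

def checked_svd_input(values):
    """Raise early on NaN/inf instead of letting LAPACK hang or segfault."""
    if not all(math.isfinite(v) for v in values):
        raise ValueError("array must not contain infs or NaNs")
    return values

checked_svd_input([1.0, 2.0, 3.0])            # finite input passes through
try:
    checked_svd_input([1.0, float('nan')])
except ValueError as e:
    print(e)   # array must not contain infs or NaNs
```

The trade-off debated in the thread is exactly this linear O(n) scan per call versus the raw speed of passing data to LAPACK unchecked.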

Re: [Numpy-discussion] argsort speed

2014-02-17 Thread Francesc Alted
On 2/17/14, 1:08 AM, josef.p...@gmail.com wrote: > On Sun, Feb 16, 2014 at 6:12 PM, Daπid wrote: >> On 16 February 2014 23:43, wrote: >>> What's the fastest argsort for a 1d array with around 28 Million >>> elements, roughly uniformly distributed, random order? >> >> On numpy latest version: >> >
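For reference, the operation being benchmarked throughout this thread: an argsort returns the permutation of indices that sorts the array, rather than the sorted values themselves. A pure-Python equivalent of `np.argsort` (far slower than any of NumPy's sort kinds on 28 million elements, but it shows what is being computed):

```python
def argsort(values):
    """Index permutation that sorts `values`, like np.argsort(values)."""
    return sorted(range(len(values)), key=values.__getitem__)

data = [0.3, 0.1, 0.9, 0.5]
order = argsort(data)
print(order)                      # [1, 0, 3, 2]
print([data[i] for i in order])   # [0.1, 0.3, 0.5, 0.9]
```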