[Numpy-discussion] einsum slow vs (tensor)dot
Hi,

I was just looking at the einsum function. To me, it's a really elegant and clear way of doing array operations, which is the core of what numpy is about. It removes the need to remember a range of functions, some of which I find tricky (e.g. tile).

Unfortunately the present implementation seems about 4-6x slower than dot or tensordot for decent-sized arrays. I suspect this is because the implementation does not use BLAS/LAPACK calls.

cheers,
George Nurser

E.g. (in ipython on Mac OS X 10.6, python 2.7.3, numpy 1.6.2 from macports):

    a = np.arange(600000.).reshape(1500,400)
    b = np.arange(240000.).reshape(400,600)
    c = np.arange(600)
    d = np.arange(400)

    %timeit np.einsum('ij,jk', a, b)
    10 loops, best of 3: 156 ms per loop

    %timeit np.dot(a,b)
    10 loops, best of 3: 27.4 ms per loop

    %timeit np.einsum('i,ij,j',d,b,c)
    1000 loops, best of 3: 709 us per loop

    %timeit np.dot(d,np.dot(b,c))
    1 loops, best of 3: 121 us per loop

or

    abig = np.arange(4800.).reshape(6,8,100)
    bbig = np.arange(1920.).reshape(8,6,40)

    %timeit np.einsum('ijk,jil->kl', abig, bbig)
    1000 loops, best of 3: 425 us per loop

    %timeit np.tensordot(abig,bbig, axes=([1,0],[0,1]))
    1 loops, best of 3: 105 us per loop

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
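Before comparing timings it can help to confirm that the einsum expressions and their dot/tensordot counterparts compute the same thing. The following is a minimal sketch of such a check, reusing the array shapes from the post; the variable names for the results are illustrative only, and it makes no claim about speed.

```python
import numpy as np

# Shapes follow the post; element counts are chosen to match each reshape.
a = np.arange(600000.).reshape(1500, 400)
b = np.arange(240000.).reshape(400, 600)
abig = np.arange(4800.).reshape(6, 8, 100)
bbig = np.arange(1920.).reshape(8, 6, 40)

# Plain matrix product written two ways.
mat_einsum = np.einsum('ij,jk', a, b)
mat_dot = np.dot(a, b)

# Contraction over two axes written two ways.
ten_einsum = np.einsum('ijk,jil->kl', abig, bbig)
ten_dot = np.tensordot(abig, bbig, axes=([1, 0], [0, 1]))

# Compare up to floating-point rounding.
same_matmul = np.allclose(mat_einsum, mat_dot)
same_tensor = np.allclose(ten_einsum, ten_dot)
```

Later NumPy releases added an `optimize` argument to `np.einsum`; whether it closes the gap reported here is not something this sketch measures.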
[Numpy-discussion] np.linalg.lstsq with several columns all 0 = huge x ?
Folks,

np.linalg.lstsq on a random-uniform 50 x 32 matrix A with 3 all-zero columns returns x[:3] = 0 as expected, but with 4 all-zero columns it returns a huge x:

    lstsq (50, 32) with 4 columns all 0:
    [ -3.7e+09 -3.6e+13 -1.9e+13 -2.9e+12 7.3e-01 ...

This may be a roundoff problem, or even a Mac Altivec LAPACK bug, not worth looking into. Oddly, linalg.svd is ok though.

Summary: if you run linalg.lstsq on big arrays, either check max |x| or regularize:

    lstsq( vstack(( A, weight * eye(dim) )), hstack(( b, zeros(dim) )) )

cheers
-- denis
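The regularization suggested above can be sketched as follows. This is only an illustration under assumptions: the random seed, the right-hand side `b`, and the value of `weight` are hypothetical choices, not values from the post, and the huge-x behaviour itself may not reproduce on other platforms.

```python
import numpy as np

rng = np.random.RandomState(0)      # fixed seed, an arbitrary choice
A = rng.uniform(size=(50, 32))
A[:, :4] = 0.0                      # four all-zero columns, as in the post
b = rng.uniform(size=50)            # hypothetical right-hand side

dim = A.shape[1]
weight = 1e-3                       # hypothetical value; tune for your data

# Stack weight * I below A and zeros below b. The extra rows add a
# penalty weight**2 * |x|**2 to the least-squares objective.
A_aug = np.vstack((A, weight * np.eye(dim)))
b_aug = np.hstack((b, np.zeros(dim)))

x_reg = np.linalg.lstsq(A_aug, b_aug, rcond=None)[0]

# The other suggestion from the post: check the largest |x|.
max_abs_x = np.abs(x_reg).max()
```

Note that `vstack` and `hstack` take a single sequence of arrays, hence the inner parentheses.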
Re: [Numpy-discussion] np.linalg.lstsq with several columns all 0 = huge x ?
On Wed, Oct 24, 2012 at 1:33 PM, denis denis-bz...@t-online.de wrote:

> Folks,
>
> np.linalg.lstsq of a random-uniform A 50 x 32 with 3 columns all 0 returns x[:3] 0 as expected, but 4 columns all 0 = huge x:
>
>     lstsq (50, 32) with 4 columns all 0:
>     [ -3.7e+09 -3.6e+13 -1.9e+13 -2.9e+12 7.3e-01 ...
>
> This may be a roundoff problem, or even a Mac Altivec lapack bug, not worth looking into. linalg.svd is ok though, odd.
>
> Summary: if you run linalg.lstsq on big arrays, either check max |x| or regularize, do
>
>     lstsq( vstack(( A, weight * eye(dim) )), hstack(( b, zeros(dim) )) )
>
> cheers
> -- denis

lstsq has an rcond argument that does (I think) essentially the same thing. It might need to be increased in your example.

Josef
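A minimal sketch of the rcond suggestion, assuming the same kind of test matrix as in the original post. The seed, the right-hand side, and the threshold `1e-10` are hypothetical choices for illustration. lstsq treats singular values smaller than rcond times the largest singular value as zero, so the all-zero columns no longer contribute to x.

```python
import numpy as np

rng = np.random.RandomState(0)      # fixed seed, an arbitrary choice
A = rng.uniform(size=(50, 32))
A[:, :4] = 0.0                      # four all-zero columns
b = rng.uniform(size=50)            # hypothetical right-hand side

# Singular values below rcond * (largest singular value) are set to zero.
rcond = 1e-10                       # hypothetical threshold
x, residuals, rank, sv = np.linalg.lstsq(A, b, rcond=rcond)
```

The returned `rank` gives a quick check on how many singular values survived the cutoff.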
[Numpy-discussion] how to pipe into numpy arrays?
As numpy.fromfile seems to require full file-object functionality such as seek, I cannot use it with the sys.stdin pipe. So how could I stream a binary pipe directly into numpy?

I can imagine storing the data in a string and using StringIO, but the files are 3.6 GB, just the binary, and that will most likely be much larger as a string object.

Reading binary files on disk is NOT the problem; I would like to avoid the temporary file if possible.
Re: [Numpy-discussion] how to pipe into numpy arrays?
On Wed, Oct 24, 2012 at 3:00 PM, Michael Aye kmichael@gmail.com wrote:

> As numpy.fromfile seems to require full file object functionalities like seek, I can not use it with the sys.stdin pipe. So how could I stream a binary pipe directly into numpy?
>
> I can imagine storing the data in a string and use StringIO but the files are 3.6 GB large, just the binary, and that will most likely be much more as a string object.
>
> Reading binary files on disk is NOT the problem, I would like to avoid the temporary file if possible.

I haven't tried this myself, but there is a numpy.frombuffer() function as well. Maybe that could be used here?

Cheers!
Ben Root
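One way the frombuffer idea might be applied to a pipe is to read fixed-size chunks and convert each one, which needs only read() and no seek. The sketch below is an assumption-laden illustration, not a tested recipe for 3.6 GB inputs: the helper name `read_stream` and the chunk size are hypothetical, and the final concatenate briefly needs about twice the memory of the data.

```python
import io
import numpy as np

def read_stream(stream, dtype, chunk_bytes=1 << 20):
    """Read a binary stream to EOF and return a 1-D array of `dtype`."""
    itemsize = np.dtype(dtype).itemsize
    chunks = []
    leftover = b""
    while True:
        data = stream.read(chunk_bytes)
        if not data:
            break
        data = leftover + data
        # Only convert whole items; carry any partial item to the next read.
        usable = len(data) - (len(data) % itemsize)
        if usable:
            # frombuffer returns a read-only view; copy so each chunk
            # owns its memory independently of the bytes object.
            chunks.append(np.frombuffer(data[:usable], dtype=dtype).copy())
        leftover = data[usable:]
    if leftover:
        raise ValueError("stream length is not a multiple of the item size")
    if not chunks:
        return np.empty(0, dtype=dtype)
    return np.concatenate(chunks)

# Demonstration with an in-memory stream standing in for the pipe.
expected = np.arange(10, dtype=np.float64)
demo = read_stream(io.BytesIO(expected.tobytes()), np.float64, chunk_bytes=12)
```

For a real pipe the stream would be the binary standard input (`sys.stdin.buffer` on Python 3). The deliberately awkward `chunk_bytes=12` in the demonstration exercises the partial-item handling.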
Re: [Numpy-discussion] Is there a way to reset an accumulate function?
On Wed, Oct 24, 2012 at 4:47 AM, Robert Kern robert.k...@gmail.com wrote:

> How about this?
>
>     def nancumsum(x):
>         nans = np.isnan(x)
>         x = np.array(x)
>         x[nans] = 0
>         reset_idx = np.zeros(len(x), dtype=int)
>         reset_idx[nans] = np.arange(len(x))[nans]
>         reset_idx = np.maximum.accumulate(reset_idx)
>         cumsum = np.cumsum(x)
>         cumsum = cumsum - cumsum[reset_idx]
>         return cumsum

Thank you for putting in the time to look at this.

It doesn't work for the first group of numbers if x[0] is non-zero. Could perhaps concatenate a np.nan at the beginning to force a reset and adjust the returned array to not include the dummy value...

    def nancumsum(x):
        x = np.concatenate(([np.nan], x))
        nans = np.isnan(x)
        x = np.array(x)
        x[nans] = 0
        reset_idx = np.zeros(len(x), dtype=int)
        reset_idx[nans] = np.arange(len(x))[nans]
        reset_idx = np.maximum.accumulate(reset_idx)
        cumsum = np.cumsum(x)
        cumsum = cumsum - cumsum[reset_idx]
        return cumsum[1:]

    >>> a
    array([ 4., 1., 2., 0., 18., 5., 6., 0., 8., 9.], dtype=float32)

If no np.nan, then 'nancumsum' and 'np.cumsum' should be the same...

    >>> np.cumsum(a)
    array([ 4., 5., 7., 7., 25., 30., 36., 36., 44., 53.], dtype=float32)
    >>> nancumsum(a)
    array([ 4., 5., 7., 7., 25., 30., 36., 36., 44., 53.])

    >>> a[3] = np.nan
    >>> np.cumsum(a)
    array([ 4., 5., 7., nan, nan, nan, nan, nan, nan, nan], dtype=float32)
    >>> nancumsum(a)
    array([ 4., 5., 7., 0., 18., 23., 29., 29., 37., 46.])

Excellent!

Kindest regards,
Tim
[Numpy-discussion] error of install numpy on linux redhat.
Hi, All,

I am trying to install numpy from http://www.scipy.org/Download by

    git clone git://github.com/numpy/numpy.git numpy

But when I ran

    python setup.py install

I got:

    SystemError: Cannot compile 'Python.h'. Perhaps you need to install python-dev|python-devel

Where to get python-dev? I tried:

    $ easy_install python-devel
    Searching for python-devel
    Reading http://pypi.python.org/simple/python-devel/
    Couldn't find index page for 'python-devel' (maybe misspelled?)
    Scanning index of all packages (this may take a while)
    Reading http://pypi.python.org/simple/
    No local packages or download links found for python-devel
    error: Could not find suitable distribution for Requirement.parse('python-devel')

and

    $ easy_install python-dev
    Searching for python-dev
    Reading http://pypi.python.org/simple/python-dev/
    Couldn't find index page for 'python-dev' (maybe misspelled?)
    Scanning index of all packages (this may take a while)
    Reading http://pypi.python.org/simple/
    No local packages or download links found for python-dev
    error: Could not find suitable distribution for Requirement.parse('python-dev')
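python-dev and python-devel are operating-system packages that ship the C headers, not PyPI distributions, which is why easy_install cannot find them. On Red Hat style systems they normally come from the distribution's package manager (for example `yum install python-devel`, run as root). A small sketch for checking whether the header that setup.py needs is already present; the helper name `python_header_available` is hypothetical.

```python
import os
import sysconfig

def python_header_available():
    """Return True if Python.h exists in this interpreter's include dir."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.exists(os.path.join(include_dir, "Python.h"))

# True means the development headers are installed for this interpreter.
has_header = python_header_available()
```

If this reports False, the development package for the same Python version that runs setup.py is most likely the missing piece.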