Re: [Numpy-discussion] Numpy correlate
Hi,

On 19/03/2013 08:12, Sudheer Joseph wrote:
> Thank you, Pierre. It appears that numpy.correlate uses the frequency-domain method for computing the CCF. I would like to know how serious the normalization issue is, and what exactly it consists of. I have computed cross-correlations using the function and have been interpreting the results based on it, so it would be helpful if you could tell me whether there is a significant bug in the function.
> with best regards, Sudheer

np.correlate works in the time domain. I started a discussion about a month ago about the way it is implemented:
http://mail.scipy.org/pipermail/numpy-discussion/2013-February/065562.html
Unfortunately I haven't found time to dig deeper into the matter, which requires working in numpy's C code, with which I'm not familiar.

Concerning the normalization of mpl.xcorr, I think what is computed is just fine; it is only the way this normalization is described in the docstring that I find odd:
https://github.com/matplotlib/matplotlib/issues/1835

best,
Pierre

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
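To make the normalization question concrete: np.correlate returns raw sliding dot products with no normalization applied, so a normalized CCF has to be built by hand. A minimal sketch — the convention chosen here (mean-removed series divided by the product of their norms) is one common choice, not necessarily what mpl.xcorr does:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([0.0, 1.0, 0.5, 0.2])

# Raw cross-correlation: sliding dot products, no normalization applied.
raw = np.correlate(x, y, mode='full')   # length len(x) + len(y) - 1

# One common normalization: remove the means and divide by the product
# of the norms; Cauchy-Schwarz then bounds every coefficient by 1.
xm, ym = x - x.mean(), y - y.mean()
ccf = np.correlate(xm, ym, mode='full') / (np.linalg.norm(xm) * np.linalg.norm(ym))
```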
Re: [Numpy-discussion] Dot/inner products with broadcasting?
I tried using this inner1d as an alternative to dot because it uses broadcasting. However, I found something surprising: not only is inner1d much, much slower than dot, it is also slower than einsum, which is far more general:

In [68]: import numpy as np
In [69]: import numpy.core.gufuncs_linalg as gula
In [70]: K = np.random.randn(1000,1000)
In [71]: %timeit gula.inner1d(K[:,np.newaxis,:], np.swapaxes(K,-1,-2)[np.newaxis,:,:])
1 loops, best of 3: 6.05 s per loop
In [72]: %timeit np.dot(K,K)
1 loops, best of 3: 392 ms per loop
In [73]: %timeit np.einsum('ik,kj->ij', K, K)
1 loops, best of 3: 1.24 s per loop

Why is it so? I thought the performance of inner1d would fall somewhere between dot and einsum, probably closer to dot. As it stands, I don't see any reason to use inner1d instead of einsum.

-Jaakko

On 03/15/2013 04:22 PM, Oscar Villellas wrote:
> In fact, there is already an inner1d implemented in numpy.core.umath_tests.inner1d:
> from numpy.core.umath_tests import inner1d
> It should do the trick :)

On Thu, Mar 14, 2013 at 12:54 PM, Jaakko Luttinen jaakko.lutti...@aalto.fi wrote:
> Answering to myself: this pull request seems to implement an inner product with broadcasting (inner1d) and many other useful functions: https://github.com/numpy/numpy/pull/2954/
> -J

On 03/13/2013 04:21 PM, Jaakko Luttinen wrote:
> Hi! How can I compute a dot product (or similar multiply-and-sum operations) efficiently so that broadcasting is utilized? For multi-dimensional arrays, NumPy's inner and dot functions do not match the leading axes and use broadcasting; instead the result has first the leading axes of the first input array and then the leading axes of the second.
> For instance, I would like to compute the following inner product:
> np.sum(A*B, axis=-1)
> But numpy.inner gives:
> A = np.random.randn(2,3,4)
> B = np.random.randn(3,4)
> np.inner(A,B).shape  # -> (2, 3, 3) instead of (2, 3)
> Similarly for the dot product, I would like to compute, for instance:
> np.sum(A[...,:,:,np.newaxis]*B[...,np.newaxis,:,:], axis=-2)
> But numpy.dot gives:
> In [12]: A = np.random.randn(2,3,4); B = np.random.randn(2,4,5)
> In [13]: np.dot(A,B).shape  # -> (2, 3, 2, 5) instead of (2, 3, 5)
> I could use einsum for these operations, but I'm not sure whether it is as efficient as the BLAS-backed(?) dot products. I couldn't find any function that performs this kind of operation: NumPy's functions seem either to flatten the input arrays (vdot, outer) or to use the axes of the input arrays separately (dot, inner, tensordot). Any help?
> Best regards, Jaakko
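For reference, einsum can express both of the broadcasting reductions asked for here; a short sketch verifying the shapes (einsum's ellipsis notation broadcasts the leading axes):

```python
import numpy as np

A = np.random.randn(2, 3, 4)
B = np.random.randn(3, 4)

# Broadcasting inner product over the last axis: shape (2, 3).
inner = np.einsum('...i,...i->...', A, B)
assert np.allclose(inner, np.sum(A * B, axis=-1))

# Broadcasting matrix product over the last two axes: shape (2, 3, 5).
A2 = np.random.randn(2, 3, 4)
B2 = np.random.randn(2, 4, 5)
prod = np.einsum('...ij,...jk->...ik', A2, B2)
assert np.allclose(prod,
                   np.sum(A2[..., :, :, np.newaxis] * B2[..., np.newaxis, :, :],
                          axis=-2))
```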
[Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
I have a small program which builds random matrices of increasing order, inverts each matrix and checks the precision of the product. At some point one would expect operations to fail when the memory capacity is exceeded. In both Python 2.7 and 3.2, matrices of order 3,071 are handled, but not 6,143.

Using wall-clock times on win32, Python 3.2 is slower than Python 2.7. The profiler indicates a problem in the solver.

Done on a Pentium with a 2.7 GHz processor, 2 GB of RAM and 221 GB of free disk space. Both Python 3.2.3 and Python 2.7.3 use numpy 1.6.2. The results are shown below.

Colin W.

Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)]:

order   measure of imprecision   time elapsed (seconds)
    2                    0.097                 0.004143
    5                    2.207                 0.001514
   11                    2.372                 0.001455
   23                    3.318                 0.001608
   47                    4.257                 0.002339
   95                    4.986                 0.005747
  191                    5.788                 0.029974
  383                    6.765                 0.145339
  767                    7.909                 0.841142
 1535                    8.532                 5.793630
 3071                    9.774                39.559540
 6143   Process terminated by a MemoryError

Python 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]:

order   measure of imprecision   time elapsed (seconds)
    2                    0.000                 0.113930
    5                    1.807                 0.001373
   11                    2.395                 0.001468
   23                    3.073                 0.001609
   47                    5.642                 0.002687
   95                    5.745                 0.013510
  191                    5.866                 0.061560
  383                    7.129                 0.418490
  767                    8.240                 3.815713
 1535                    8.735                27.877270
 3071                    9.996               212.545610
 6143   Process terminated by a MemoryError
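The program itself is not shown; the following is a hypothetical reconstruction of the benchmark loop. The exact "measure of imprecision" used above is not specified, so the sum of absolute deviations of A·A⁻¹ from the identity is used here as a stand-in:

```python
import time
import numpy as np

def check(order):
    """Invert a random matrix; return a precision measure and wall-clock time."""
    a = np.random.random((order, order))
    t0 = time.time()
    ainv = np.linalg.inv(a)
    elapsed = time.time() - t0
    # Stand-in precision measure: how far a @ ainv is from the identity.
    residual = np.abs(np.dot(a, ainv) - np.eye(order)).sum()
    return residual, elapsed

for order in (2, 5, 11, 23, 47):
    residual, elapsed = check(order)
    print('order=%5d  residual=%.3e  elapsed=%.6f s' % (order, residual, elapsed))
```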
[Numpy-discussion] numpy array to C API
Greetings. I'm extending our existing C/C++ software with Python/NumPy in order to do extra number crunching. It already works like a charm, calling Python through the C API (Python.h). But what is the proper way of passing double arrays returned from Python/NumPy routines back to C? I came across PyArray, but the compiler warnings tell me it is deprecated, and I don't want to build on legacy facilities. Going forward, what is the intended way of doing this, with neat code on both sides and a minimum of memory-copy gymnastics?

Thanks in advance,
Søren
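On the embedding (C) side, the non-deprecated route is generally PyArray_FROM_OTF to obtain a well-behaved array and PyArray_DATA to reach its buffer, with NPY_NO_DEPRECATED_API defined. The Python-side half of the zero-copy story can be sketched as follows: a C-contiguous float64 array exposes its buffer through the ndarray.ctypes attribute, so C code receives a plain double* aliasing the array data, with no copy:

```python
import ctypes
import numpy as np

# An array as it might come back from a NumPy routine; ascontiguousarray
# guarantees a C-contiguous float64 buffer before export (it copies only
# when the layout actually requires it).
a = np.ascontiguousarray(np.arange(5.0), dtype=np.float64)

# A plain double* aliasing the array's own buffer -- no copy involved.
ptr = a.ctypes.data_as(ctypes.POINTER(ctypes.c_double))

# Writes through the pointer are visible in the array (same memory).
ptr[0] = 42.0
```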
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
Without much detailed knowledge of the topic, I would expect both versions to give very similar timings, as this is essentially a call to an ATLAS function; not much is done in Python itself. Given this, maybe the difference is in ATLAS. How have you installed it?

When you compile ATLAS, it does some machine-specific optimisation, but if you have installed a binary, chances are that your version is optimised for a machine quite different from yours. So the two installations could have been compiled on different machines, and one happens to be better suited to yours. If you want to be sure, I would try compiling ATLAS yourself (this may be difficult), or check the same thing on a very different machine (an AMD processor, a different architecture...).

Just for reference, on Linux a 64-bit Python 2.7 can deal with these matrices easily:

%timeit mat=np.random.random((6143,6143)); matinv= np.linalg.inv(mat); res = np.dot(mat, matinv); diff= res-np.eye(6143); print np.sum(np.abs(diff))
2.41799631031e-05
1.13955868701e-05
3.64338191541e-05
1.13484781021e-05
1 loops, best of 3: 156 s per loop

Intel i5, 4 GB of RAM and an SSD; ATLAS installed from the Fedora repository (I don't run heavy stuff on this computer).

On 20 March 2013 14:46, Colin J. Williams c...@ncf.ca wrote:
> I have a small program which builds random matrices for increasing matrix orders, inverts the matrix and checks the precision of the product. [...]
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
Hi,

It could also be that the two are linked against different libraries, e.g. ATLAS versus the standard reference BLAS. What is the output of numpy.show_config() in the two different Python versions?

Jens

On Wed, Mar 20, 2013 at 2:14 PM, Daπid davidmen...@gmail.com wrote:
> Without much detailed knowledge of the topic, I would expect both versions to give very similar timing, as it is essentially a call to ATLAS function, not much is done in Python. [...]
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
Hi,

win32 does not mean it is a 32-bit Windows: sys.platform always returns 'win32' on both 32-bit and 64-bit Windows, even for 64-bit Python. But that is a good question: is your Python 32 or 64 bits?

Fred

On Wed, Mar 20, 2013 at 10:14 AM, Daπid davidmen...@gmail.com wrote:
> Without much detailed knowledge of the topic, I would expect both versions to give very similar timing, as it is essentially a call to ATLAS function, not much is done in Python. [...]
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On 20/03/2013 10:14 AM, Daπid wrote:
> Without much detailed knowledge of the topic, I would expect both versions to give very similar timing, as it is essentially a call to ATLAS function, not much is done in Python. Given this, maybe the difference is in ATLAS itself. How have you installed it?

I know nothing about what goes on behind the scenes. I am using the win32 binary package.

Colin W.

> When you compile ATLAS, it will do some machine-specific optimisation, but if you have installed a binary chances are that your version is optimised for a machine quite different from yours. [...]
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On 20/03/2013 10:29 AM, Jens Nielsen wrote:
> Could also be that they are linked against different libs such as ATLAS and the standard BLAS. What is the output of numpy.show_config() in the two different python versions?

Thanks for this pointer. The result for Python 2.7:

>>> numpy.show_config()
atlas_threads_info:
    NOT AVAILABLE
blas_opt_info:
    libraries = ['f77blas', 'cblas', 'atlas']
    library_dirs = ['C:\\local\\lib\\yop\\sse3']
    define_macros = [('NO_ATLAS_INFO', -1)]
    language = c
atlas_blas_threads_info:
    NOT AVAILABLE
lapack_opt_info:
    libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
    library_dirs = ['C:\\local\\lib\\yop\\sse3']
    define_macros = [('NO_ATLAS_INFO', -1)]
    language = f77
atlas_info:
    libraries = ['lapack', 'f77blas', 'cblas', 'atlas']
    library_dirs = ['C:\\local\\lib\\yop\\sse3']
    define_macros = [('NO_ATLAS_INFO', -1)]
    language = f77
lapack_mkl_info:
    NOT AVAILABLE
blas_mkl_info:
    NOT AVAILABLE
atlas_blas_info:
    libraries = ['f77blas', 'cblas', 'atlas']
    library_dirs = ['C:\\local\\lib\\yop\\sse3']
    define_macros = [('NO_ATLAS_INFO', -1)]
    language = c
mkl_info:
    NOT AVAILABLE

The result for 3.2:

>>> import numpy
>>> numpy.show_config()
lapack_info:
    NOT AVAILABLE
lapack_opt_info:
    NOT AVAILABLE
blas_info:
    NOT AVAILABLE
atlas_threads_info:
    NOT AVAILABLE
blas_src_info:
    NOT AVAILABLE
atlas_blas_info:
    NOT AVAILABLE
lapack_src_info:
    NOT AVAILABLE
atlas_blas_threads_info:
    NOT AVAILABLE
blas_mkl_info:
    NOT AVAILABLE
blas_opt_info:
    NOT AVAILABLE
atlas_info:
    NOT AVAILABLE
lapack_mkl_info:
    NOT AVAILABLE
mkl_info:
    NOT AVAILABLE

I hope that this helps.

Colin W.
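The contrast above can be checked programmatically; a sketch that just captures what np.show_config() prints. A build where every section reports NOT AVAILABLE, like the 3.2 one, has no optimised BLAS/LAPACK linked in and falls back on numpy's slow internal lapack_lite routines:

```python
import io
import contextlib
import numpy as np

# Capture the textual output of np.show_config() for inspection
# (the exact wording of the output varies across numpy versions).
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    np.show_config()
config = buf.getvalue()

# Crude heuristic: count how many sections report no library at all.
print('sections reporting NOT AVAILABLE:', config.count('NOT AVAILABLE'))
```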
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On 20/03/2013 10:30 AM, Frédéric Bastien wrote: Hi, win32 do not mean it is a 32 bits windows. sys.platform always return win32 on 32bits and 64 bits windows even for python 64 bits. But that is a good question, is your python 32 or 64 bits? 32 bits. Colin W. Fred On Wed, Mar 20, 2013 at 10:14 AM, Daπid davidmen...@gmail.com wrote: Without much detailed knowledge of the topic, I would expect both versions to give very similar timing, as it is essentially a call to ATLAS function, not much is done in Python. Given this, maybe the difference is in ATLAS itself. How have you installed it? When you compile ATLAS, it will do some machine-specific optimisation, but if you have installed a binary chances are that your version is optimised for a machine quite different from yours. So, two different installations could have been compiled in different machines and so one is more suited for your machine. If you want to be sure, I would try to compile ATLAS (this may be difficult) or check the same on a very different machine (like an AMD processor, different architecture...). Just for reference, on Linux Python 2.7 64 bits can deal with these matrices easily. %timeit mat=np.random.random((6143,6143)); matinv= np.linalg.inv(mat); res = np.dot(mat, matinv); diff= res-np.eye(6143); print np.sum(np.abs(diff)) 2.41799631031e-05 1.13955868701e-05 3.64338191541e-05 1.13484781021e-05 1 loops, best of 3: 156 s per loop Intel i5, 4 GB of RAM and SSD. ATLAS installed from Fedora repository (I don't run heavy stuff on this computer). On 20 March 2013 14:46, Colin J. Williams c...@ncf.ca wrote: I have a small program which builds random matrices for increasing matrix orders, inverts the matrix and checks the precision of the product. At some point, one would expect operations to fail, when the memory capacity is exceeded. In both Python 2.7 and 3.2 matrices of order 3,071 area handled, but not 6,143. Using wall-clock times, with win32, Python 3.2 is slower than Python 2.7. 
The profiler indicates a problem in the solver. Done on a Pentium with a 2.7 GHz processor, 2 GB of RAM and 221 GB of free disk space. Both Python 3.2.3 and Python 2.7.3 use numpy 1.6.2. The results are shown below.

Colin W.

2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)]

order=    2  measure of imprecision= 0.097  Time elapsed (seconds)=  0.004143
order=    5  measure of imprecision= 2.207  Time elapsed (seconds)=  0.001514
order=   11  measure of imprecision= 2.372  Time elapsed (seconds)=  0.001455
order=   23  measure of imprecision= 3.318  Time elapsed (seconds)=  0.001608
order=   47  measure of imprecision= 4.257  Time elapsed (seconds)=  0.002339
order=   95  measure of imprecision= 4.986  Time elapsed (seconds)=  0.005747
order=  191  measure of imprecision= 5.788  Time elapsed (seconds)=  0.029974
order=  383  measure of imprecision= 6.765  Time elapsed (seconds)=  0.145339
order=  767  measure of imprecision= 7.909  Time elapsed (seconds)=  0.841142
order= 1535  measure of imprecision= 8.532  Time elapsed (seconds)=  5.793630
order= 3071  measure of imprecision= 9.774  Time elapsed (seconds)= 39.559540
order= 6143  Process terminated by a MemoryError

Above: 2.7.3   Below: Python 3.2.3

3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]

order=    2  measure of imprecision= 0.000  Time elapsed (seconds)=   0.113930
order=    5  measure of imprecision= 1.807  Time elapsed (seconds)=   0.001373
order=   11  measure of imprecision= 2.395  Time elapsed (seconds)=   0.001468
order=   23  measure of imprecision= 3.073  Time elapsed (seconds)=   0.001609
order=   47  measure of imprecision= 5.642  Time elapsed (seconds)=   0.002687
order=   95  measure of imprecision= 5.745  Time elapsed (seconds)=   0.013510
order=  191  measure of imprecision= 5.866  Time elapsed (seconds)=   0.061560
order=  383  measure of imprecision= 7.129  Time elapsed (seconds)=   0.418490
order=  767  measure of imprecision= 8.240  Time elapsed (seconds)=   3.815713
order= 1535  measure of imprecision= 8.735  Time elapsed (seconds)=  27.877270
order= 3071  measure of imprecision= 9.996  Time elapsed (seconds)= 212.545610
order= 6143  Process terminated by a MemoryError

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
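The MemoryError at order 6,143 is consistent with the limits of a 32-bit process. A rough sketch of the arithmetic, assuming the default float64 dtype and that the test keeps a few matrix-sized temporaries alive at once (the exact count of live copies is an assumption, not something reported in the thread):

```python
# Rough memory estimate for the failing case: a 6143 x 6143 float64 matrix.
# float64 is 8 bytes per element; "copies" is an assumed count of
# simultaneously-live matrix-sized arrays (mat, matinv, res, diff).
n = 6143
bytes_per_matrix = n * n * 8
mib = bytes_per_matrix / 2**20
copies = 4
total_mib = copies * mib
print("one matrix: %.0f MiB, four of them: %.0f MiB" % (mib, total_mib))
```

A 32-bit process on Windows typically gets about 2 GiB of usable, often fragmented, address space, so four ~288 MiB matrices plus LAPACK work buffers can plausibly fail to fit, while order 3,071 (~72 MiB per matrix) fits comfortably.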
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
The python3 version is compiled without any optimised library and is falling back on a slow version. Where did you get this installation from?

Jens

On Wed, Mar 20, 2013 at 3:01 PM, Colin J. Williams cjwilliam...@gmail.com wrote:
[...]

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
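Jens's diagnosis, that the Python 3.2 build of numpy has no optimised BLAS behind it, can be checked from Python itself. A sketch; the exact sections printed by the config output vary with the numpy version and build:

```python
import time

import numpy as np

# Show which BLAS/LAPACK numpy was built against. An ATLAS/MKL/OpenBLAS
# entry suggests the fast path is available; "NOT AVAILABLE" entries
# suggest numpy fell back to its slow reference lapack_lite.
np.__config__.show()

# A crude cross-check: time a dot product on a modest matrix and compare
# the number between the two interpreters.
a = np.random.random((1000, 1000))
t0 = time.time()
b = np.dot(a, a)
print("dot took %.3f s" % (time.time() - t0))
```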
Re: [Numpy-discussion] Dot/inner products with broadcasting?
Well, thanks to seberg, I finally noticed that there is a dot product function in this new module numpy.core.gufuncs_linalg; it was just named differently (matrix_multiply instead of dot). However, I may have found a bug in it:

import numpy.core.gufuncs_linalg as gula
A = np.arange(2*2).reshape((2,2))
B = np.arange(2*1).reshape((2,1))
gula.matrix_multiply(A, B)
ValueError: On entry to DGEMM parameter number 10 had an illegal value

-Jaakko

On 03/20/2013 03:33 PM, Jaakko Luttinen wrote:
I tried using this inner1d as an alternative to dot because it uses broadcasting. However, I found something surprising: not only is inner1d much, much slower than dot, it is also slower than einsum, which is much more general:

In [68]: import numpy as np
In [69]: import numpy.core.gufuncs_linalg as gula
In [70]: K = np.random.randn(1000,1000)
In [71]: %timeit gula.inner1d(K[:,np.newaxis,:], np.swapaxes(K,-1,-2)[np.newaxis,:,:])
1 loops, best of 3: 6.05 s per loop
In [72]: %timeit np.dot(K,K)
1 loops, best of 3: 392 ms per loop
In [73]: %timeit np.einsum('ik,kj->ij', K, K)
1 loops, best of 3: 1.24 s per loop

Why is it so? I thought that the performance of inner1d would be somewhere in between dot and einsum, probably closer to dot. Now I don't see any reason to use inner1d instead of einsum.

-Jaakko

On 03/15/2013 04:22 PM, Oscar Villellas wrote:
In fact, there is already an inner1d implemented in numpy.core.umath_tests:

from numpy.core.umath_tests import inner1d

It should do the trick :)

On Thu, Mar 14, 2013 at 12:54 PM, Jaakko Luttinen jaakko.lutti...@aalto.fi wrote:
Answering to myself, this pull request seems to implement an inner product with broadcasting (inner1d) and many other useful functions: https://github.com/numpy/numpy/pull/2954/
-J

On 03/13/2013 04:21 PM, Jaakko Luttinen wrote:
Hi! How can I compute a dot product (or similar multiply-and-sum operations) efficiently so that broadcasting is utilized?
For multi-dimensional arrays, NumPy's inner and dot functions do not match the leading axes and use broadcasting; instead the result has first the leading axes of the first input array and then the leading axes of the second input array. For instance, I would like to compute the following inner product:

np.sum(A*B, axis=-1)

But numpy.inner gives:

A = np.random.randn(2,3,4)
B = np.random.randn(3,4)
np.inner(A,B).shape  # -> (2, 3, 3) instead of (2, 3)

Similarly for the dot product, I would like to compute for instance:

np.sum(A[...,:,:,np.newaxis]*B[...,np.newaxis,:,:], axis=-2)

But numpy.dot gives:

In [12]: A = np.random.randn(2,3,4); B = np.random.randn(2,4,5)
In [13]: np.dot(A,B).shape  # -> (2, 3, 2, 5) instead of (2, 3, 5)

I could use einsum for these operations, but I'm not sure whether that's as efficient as using some BLAS-supported(?) dot products. I couldn't find any function which could perform this kind of operation. NumPy's functions seem to either flatten the input arrays (vdot, outer) or just use the axes of the input arrays separately (dot, inner, tensordot). Any help?

Best regards,
Jaakko

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
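Both of the operations Jaakko asks for can be written with einsum's ellipsis notation, which broadcasts the leading axes exactly as requested; whether it matches BLAS speed is the open question in the thread, but the semantics are right. A sketch:

```python
import numpy as np

# Broadcasting inner product: sum over the last axis, broadcast the rest.
A = np.random.randn(2, 3, 4)
B = np.random.randn(3, 4)
ip = np.einsum('...i,...i->...', A, B)
assert ip.shape == (2, 3)                 # same as np.sum(A*B, axis=-1)
assert np.allclose(ip, np.sum(A * B, axis=-1))

# Broadcasting matrix product: contract the inner axis, broadcast the rest.
A2 = np.random.randn(2, 3, 4)
B2 = np.random.randn(2, 4, 5)
mp = np.einsum('...ij,...jk->...ik', A2, B2)
assert mp.shape == (2, 3, 5)              # not dot's (2, 3, 2, 5)
```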
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On Wed, Mar 20, 2013 at 11:01 AM, Colin J. Williams cjwilliam...@gmail.com wrote:
On 20/03/2013 10:30 AM, Frédéric Bastien wrote:
Hi,

win32 does not mean it is a 32-bit Windows. sys.platform always returns win32 on both 32-bit and 64-bit Windows, even for 64-bit Python. But that is a good question: is your Python 32 or 64 bits?

32 bits.

That explains why you have the memory problem but other people with 64-bit versions do not. So if you want to work with bigger inputs, change to a 64-bit Python.

Fred

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
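As Frédéric points out, sys.platform is "win32" on every Windows Python, so it cannot distinguish a 32-bit interpreter from a 64-bit one. A reliable check is the size of a C pointer:

```python
import struct
import sys

# sys.platform is "win32" for both 32- and 64-bit Windows Pythons.
# The pointer size (or sys.maxsize) tells the interpreter's real bitness.
bits = struct.calcsize("P") * 8
print("this interpreter is %d-bit" % bits)
assert bits in (32, 64)
assert (sys.maxsize > 2**32) == (bits == 64)
```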
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On 20/03/2013 11:06 AM, Jens Nielsen wrote: The python3 version is compiled without any optimised library and is falling back on a slow version. Where did you get this installation from? Jens From the SciPy site. Colin W. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Execution time difference between 2.7 and 3.2 using numpy
On 20/03/2013 11:12 AM, Frédéric Bastien wrote:
On Wed, Mar 20, 2013 at 11:01 AM, Colin J. Williams cjwilliam...@gmail.com wrote:
On 20/03/2013 10:30 AM, Frédéric Bastien wrote:
Hi,

win32 does not mean it is a 32-bit Windows. sys.platform always returns win32 on both 32-bit and 64-bit Windows, even for 64-bit Python. But that is a good question: is your Python 32 or 64 bits?

32 bits.

That explains why you have the memory problem but other people with 64-bit versions do not. So if you want to work with bigger inputs, change to a 64-bit Python.

Fred

But my machine is only 32-bit.

Colin W.

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Add ability to disable the autogeneration of the function signature in a ufunc docstring.
On Fri, Mar 15, 2013 at 4:39 PM, Nathaniel Smith n...@pobox.com wrote:
On Fri, Mar 15, 2013 at 6:47 PM, Warren Weckesser warren.weckes...@gmail.com wrote:
Hi all,

In a recent scipy pull request (https://github.com/scipy/scipy/pull/459), I ran into the problem of ufuncs automatically generating a signature in the docstring using arguments such as 'x' or 'x1, x2'. scipy.special has a lot of ufuncs, and for most of them, there are much more descriptive or conventional argument names than 'x'. For now, we will include a nicer signature in the added docstring, and grudgingly put up with the one generated by the ufunc. In the long term, it would be nice to be able to disable the automatic generation of the signature. I submitted a pull request to numpy to allow that: https://github.com/numpy/numpy/pull/3149 Comments on the pull request would be appreciated.

The functionality seems obviously useful, but adding a magic public attribute to all ufuncs seems like a somewhat clumsy way to expose it? Especially since ufuncs are always created through the C API, including docstring specification, but this can only be set at the Python level? Maybe it's the best option, but it seems worth taking a few minutes to consider alternatives.

Agreed; exposing the flag as part of the public Python ufunc API is unnecessary, since this is something that would rarely, if ever, be changed during the life of the ufunc.

Brainstorming:

- If the first line of the docstring starts with funcname( and ends with ), then that's a signature and we skip adding one (I think sphinx does something like this?). Kinda magic and implicit, but highly backwards compatible.

- Declare that henceforth, signature generation will be disabled by default, and go through and add a special marker like __SIGNATURE__ to all the existing ufunc docstrings, which gets replaced (if present) by the automagically generated signature.
- Give ufunc arguments actual names in general, that work for things like kwargs, and then use those in the automagically generated signature. This is the most work, but it would mean that people don't have to remember to update their non-magic signatures whenever numpy adds a new feature like out= or where=, and it would make the docstrings actually accurate, which right now they aren't:

I'm leaning towards this option. I don't know if there would still be a need to disable the automatic generation of the docstring if it was good enough.

In [7]: np.add.__doc__.split("\n")[0]
Out[7]: 'add(x1, x2[, out])'
In [8]: np.add(x1=1, x2=2)
ValueError: invalid number of arguments

- Allow some special syntax to describe the argument names in the docstring: __ARGNAMES__: a b\n -> add(a, b[, out])

- Something else...

-n

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
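Nathaniel's first option, treating a docstring whose first line looks like funcname(...) as a hand-written signature, can be sketched as a small heuristic. The function name and docstrings below are illustrative only; this is not numpy's actual implementation:

```python
def has_explicit_signature(funcname, doc):
    """Heuristic from the thread: the first docstring line counts as a
    hand-written signature if it starts with 'funcname(' and ends with ')'.
    Sphinx/autodoc uses a similar convention. Sketch only, not numpy code."""
    if not doc:
        return False
    first = doc.lstrip().split("\n", 1)[0].rstrip()
    return first.startswith(funcname + "(") and first.endswith(")")

# With such a first line the auto-generated signature would be suppressed;
# without one, it would still be prepended as today.
assert has_explicit_signature("gamma", "gamma(z[, out])\n\nGamma function.")
assert not has_explicit_signature("gamma", "Compute the Gamma function.")
```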
Re: [Numpy-discussion] Dot/inner products with broadcasting?
Reproduced it. I will take a look at it. That error comes direct from BLAS and shouldn't be happening. I will also look at why inner1d is not performing well. Note: inner1d is implemented with calls to BLAS (dot). I will get back to you later :)

On Wed, Mar 20, 2013 at 4:10 PM, Jaakko Luttinen jaakko.lutti...@aalto.fi wrote:
Well, thanks to seberg, I finally noticed that there is a dot product function in this new module numpy.core.gufuncs_linalg; it was just named differently (matrix_multiply instead of dot). However, I may have found a bug in it:

import numpy.core.gufuncs_linalg as gula
A = np.arange(2*2).reshape((2,2))
B = np.arange(2*1).reshape((2,1))
gula.matrix_multiply(A, B)
ValueError: On entry to DGEMM parameter number 10 had an illegal value

-Jaakko

[...]
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
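The failing case in Jaakko's report is an ordinary (2,2)·(2,1) product, so the expected answer is easy to pin down with np.dot or einsum. A reproduction sketch against which a gufuncs_linalg fix could be checked; numpy.core.gufuncs_linalg was an experimental module from the linked pull request, so it is deliberately not imported here:

```python
import numpy as np

# The inputs from the bug report.
A = np.arange(2 * 2).reshape((2, 2))   # [[0, 1], [2, 3]]
B = np.arange(2 * 1).reshape((2, 1))   # [[0], [1]]

# What matrix_multiply(A, B) should return instead of raising.
expected = np.dot(A, B)
print(expected)                        # [[1], [3]]

# The broadcasting einsum spelling agrees with dot here.
assert np.array_equal(np.einsum('...ij,...jk->...ik', A, B), expected)
assert expected.shape == (2, 1)
```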
Re: [Numpy-discussion] Add ability to disable the autogeneration of the function signature in a ufunc docstring.
On 20 Mar 2013 17:11, Warren Weckesser warren.weckes...@gmail.com wrote:
[...]
I'm leaning towards this option. I don't know if there would still be a need to disable the automatic generation of the docstring if it was good enough.

Certainly it would be nice for ufunc argument handling to better match python argument handling! Just needs someone willing to do the work... *cough* ;-)

-n

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] how to efficiently select multiple slices from an array?
Hey,

On Wed, 2013-03-20 at 16:31 +0100, Andreas Hilboll wrote:
Cross-posting a question I asked on SO (http://stackoverflow.com/q/15527666/152439):

Given an array

d = np.random.randn(100)

and an index array

i = np.random.random_integers(low=3, high=d.size - 5, size=20)

how can I efficiently create a 2d array r with r.shape = (20, 8), such that for all j=0..19, r[j] = d[i[j]-3:i[j]+5]? In my case, the arrays are quite large (~20 instead of 100 and 20), so something quick would be useful.

You can use stride tricks. It is simple to do by hand, but since I got it, maybe just use this: https://gist.github.com/seberg/3866040

d = np.random.randn(100)
windowed_d = rolling_window(d, 8)
i = np.random.randint(len(windowed_d), size=20)
r = windowed_d[i, :]

Or use stride_tricks by hand, with:

windowed_d = np.lib.stride_tricks.as_strided(d, (d.shape[0]-7, 8), (d.strides[0],)*2)

Since the fancy indexing will create a copy, while windowed_d views the same data as the original array, of course that is not the case for the end result.

Regards,
Sebastian

Cheers,
Andreas.

cool, thanks!

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
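Sebastian's recipe, made self-contained and checked against the slice definition from Andreas's question. The rolling_window below is a local re-implementation of the idea in his gist, not the gist itself:

```python
import numpy as np

def rolling_window(a, window):
    # View a 1-d array as overlapping windows of the given length,
    # without copying: both dimensions stride over the same buffer.
    shape = (a.shape[0] - window + 1, window)
    strides = (a.strides[0], a.strides[0])
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

d = np.random.randn(100)
i = np.random.randint(low=3, high=d.size - 5, size=20)

# Window starting at i[j]-3 with length 8 is exactly d[i[j]-3 : i[j]+5];
# the fancy indexing step is what makes the copy.
r = rolling_window(d, 8)[i - 3]
assert r.shape == (20, 8)
assert all(np.array_equal(r[j], d[i[j]-3:i[j]+5]) for j in range(20))
```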
Re: [Numpy-discussion] numpy array to C API
On Wed, Mar 20, 2013 at 9:03 AM, Robert Kern robert.k...@gmail.com wrote: I highly recommend using an existing tool to write this interface, to take care of the reference counting, etc for you. Cython is particularly nice. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR(206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
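Whichever tool is used, the pieces a C extension ultimately consumes are the array's data pointer, shape, and strides, which numpy already exposes. A sketch of inspecting them from the Python side with ctypes; the same fields are what the C API's PyArray_DATA and PyArray_DIMS hand to C code:

```python
import ctypes

import numpy as np

a = np.arange(6, dtype=np.float64).reshape(2, 3)

# The raw data pointer, as C code would see it.
ptr = a.ctypes.data_as(ctypes.POINTER(ctypes.c_double))

# For this C-contiguous array, element [i, j] sits at flat offset i*ncols + j.
assert ptr[1 * 3 + 2] == a[1, 2]

# Metadata that would be passed alongside the pointer.
print(a.shape, a.strides)   # dimensions and byte strides
```

This is only for inspection and quick ctypes-based interfacing; as Chris says, for a real extension a tool like Cython handles the reference counting and type checking for you.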