Re: [Numpy-discussion] New numpy functions: filled, filled_like
On Mon, Jan 14, 2013 at 2:57 PM, Benjamin Root ben.r...@ou.edu wrote: I am also +1 on the idea of having a filled() and filled_like() function (I learned a long time ago to just do a = np.empty() and a.fill() rather than the multiplication trick I learned from Matlab). However, the collision with the masked array API is a non-starter for me. np.const() and np.const_like() probably make the most sense, but I would prefer a verb over a noun. To get an array of 1's, you call np.ones(shape), to get an array of 0's you call np.zeros(shape) so to get an array of val's why not call np.vals(shape, val)? Cheers Robins ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Embedded NumPy LAPACK errors
On Sat, Jan 5, 2013 at 1:03 PM, Robin robi...@gmail.com wrote: If not, is there a reasonable way to build numpy.linalg such that it interfaces with MKL correctly ? I managed to get this to work in the end. Since Matlab uses MKL with ILP64 interface it is not possible to get Numpy to use that without modifications to all the lapack calls. However, I was able to keep the two different versions of lapack seperate. The first step is to build numpy to link statically against MKL. I wasn't sure how to get distutils to do this so I copied all the mkl static .a libaries to a temporary directory and pointed numpy to that to force the issue (so dynamic linking wasn't an option). Even with that it still uses the Lapack from the Matlab dynamic global symbols. The trick was adding the linker flag -Bsymbolic which means lapack_lite calls to lapack use the statically linked local copies. With these changes everything appears to work. There are two test failures (below) which do not appear when running the same Numpy build outside of Matlab but they don't seem so severe. So: [robini@robini2-pc numpy]$ cat site.cfg [mkl] search_static_first = true library_dirs = /tmp/intel64 include_dirs = /opt/intel/mkl/include #mkl_libs = mkl_sequential, mkl_intel_lp64, mkl_core, mkl_lapack95_lp64, mkl_blas95_lp64 mkl_libs = mkl_lapack95, mkl_blas95, mkl_intel_lp64, mkl_sequential, mkl_core, svml, imf, irc lapack_libs = [robini@robini2-pc numpy]$ ls /tmp/intel64/ libimf.a libmkl_gnu_thread.a libirc.a libmkl_intel_ilp64.a libmkl_blacs_ilp64.a libmkl_intel_lp64.a libmkl_blacs_intelmpi_ilp64.a libmkl_intel_sp2dp.a libmkl_blacs_intelmpi_lp64.a libmkl_intel_thread.a libmkl_blacs_lp64.alibmkl_lapack95_ilp64.a libmkl_blacs_openmpi_ilp64.a libmkl_lapack95_lp64.a libmkl_blacs_openmpi_lp64.alibmkl_pgi_thread.a libmkl_blacs_sgimpt_ilp64.alibmkl_scalapack_ilp64.a libmkl_blacs_sgimpt_lp64.a libmkl_scalapack_lp64.a libmkl_blas95_ilp64.a libmkl_sequential.a libmkl_blas95_lp64.a libmkl_solver_ilp64.a libmkl_cdft_core.a libmkl_solver_ilp64_sequential.a libmkl_core.a libmkl_solver_lp64.a libmkl_gf_ilp64.a libmkl_solver_lp64_sequential.a libmkl_gf_lp64.a libsvml.a in numpy/distutils/intelccompiler.py: class IntelEM64TCCompiler(UnixCCompiler): A modified Intel x86_64 compiler compatible with a 64bit gcc built Python. compiler_type = 'intelem' cc_exe = 'icc -m64 -fPIC' cc_args = -fPIC def __init__ (self, verbose=0, dry_run=0, force=0): UnixCCompiler.__init__ (self, verbose,dry_run, force) self.cc_exe = 'icc -m64 -fPIC -O3 -fomit-frame-pointer' compiler = self.cc_exe self.set_executables(compiler=compiler, compiler_so=compiler, compiler_cxx=compiler, linker_exe=compiler, linker_so=compiler + ' -shared -static-intel -Bsymbolic') Test failures (test_special_values also fails outside Matlab, but the other 2 only occur when embedded): == FAIL: test_umath.test_nextafterl -- Traceback (most recent call last): File /opt/epd-7.3/lib/python2.7/site-packages/nose/case.py, line 197, in runTest self.test(*self.arg) File /home/robini/slash/lib/python2.7/site-packages/numpy/testing/decorators.py, line 215, in knownfailer return f(*args, **kwargs) File /home/robini/slash/lib/python2.7/site-packages/numpy/core/tests/test_umath.py, line 1123, in test_nextafterl return _test_nextafter(np.longdouble) File /home/robini/slash/lib/python2.7/site-packages/numpy/core/tests/test_umath.py, line 1108, in _test_nextafter assert np.nextafter(one, two) - one == eps AssertionError == FAIL: test_umath.test_spacingl -- Traceback (most recent call last): File /opt/epd-7.3/lib/python2.7/site-packages/nose/case.py, line 197, in runTest self.test(*self.arg) File /home/robini/slash/lib/python2.7/site-packages/numpy/testing/decorators.py, line 215, in knownfailer return f(*args, **kwargs) File /home/robini/slash/lib/python2.7/site-packages/numpy/core/tests/test_umath.py, line 1149, in test_spacingl return _test_spacing(np.longdouble) File /home/robini/slash/lib/python2.7/site-packages/numpy/core/tests/test_umath.py, line 1132, in _test_spacing assert np.spacing(one) == eps AssertionError == FAIL: test_special_values (test_umath_complex.TestClog) -- Traceback (most recent call last): File /home/robini
Re: [Numpy-discussion] Embedded NumPy LAPACK errors
Coincidently I have been having the same problem this week. Unrelated to the problem, I would suggest looking at pymex which 'wraps' python inside Matlab very nicely, although it has the same problem with duplicate lapack symbols. https://github.com/kw/pymex I have the same problem with Enthough EPD which is built against MKL - but I think the problem is that Intel provide two different interfaces - ILP64 with 64 bit integer indices and LP64 with 32 bit integers. Matlab link against the ILP64 version, whereas Enthought use the LP64 version - so there are still incompatible. Cheers Robin On Sat, Jan 5, 2013 at 10:42 AM, Paul Anton Letnes paul.anton.let...@gmail.com wrote: On 4. jan. 2013, at 21:42, m...@eml.cc wrote: Hiall, I am trying to embed numerical code in a mexFunction, as called by MATLAB, written as a Cython function. NumPy core functions and BLAS work fine, but calls to LAPACK function such as SVD seem to be made against to MATLAB's linked MKL, and this generates MKL errors. When I try this with Octave, it works fine, presumably because it is compiled against the same LAPACK as the NumPy I am embedding. Assuming I haven't made big mistakes up to here, I have the following questions: Is there a way to request numpy.linalg to use a particular LAPACK library, e.g. /usr/lib/liblapack.so ? If not, is there a reasonable way to build numpy.linalg such that it interfaces with MKL correctly ? It's possible, but it's much easier to install one of the pre-built python distributions. Enthought, WinPython and others include precompiled python/numpy/scipy/etc with MKL. If that works for you, I'd recommend that route, as it involves less work. Good luck, Paul ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
On Tue, Jan 24, 2012 at 6:24 PM, David Warde-Farley warde...@iro.umontreal.ca wrote: On Tue, Jan 24, 2012 at 06:00:05AM +0100, Sturla Molden wrote: Den 23.01.2012 22:08, skrev Christoph Gohlke: Maybe this explains the win-amd64 behavior: There are a couple of places in mtrand where array indices and sizes are C long instead of npy_intp, for example in the randint function: https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/mtrand.pyx#L863 Both i and length could overflow here. It should overflow on allocation of more than 2 GB. There is also a lot of C longs in the internal state (line 55-105), as well as the other functions. Producing 2 GB of random ints twice fails: Sturla, since you seem to have access to Win64 machines, do you suppose you could try this code: a = numpy.ones((1, 972)) b = numpy.zeros((4993210,), dtype=int) c = a[b] and verify that there's a whole lot of 0s in the matrix, specifically, c[574519:].sum() 356.0 c[574520:].sum() 0.0 is the case on Linux 64-bit; is it the case on Windows 64? Yes - I get exactly the same numbers in 64 bit windows with 1.6.1. Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] advanced indexing bug with huge arrays?
On Mon, Jan 23, 2012 at 7:55 PM, David Warde-Farley warde...@iro.umontreal.ca wrote: I've reproduced this (rather serious) bug myself and confirmed that it exists in master, and as far back as 1.4.1. I'd really appreciate if someone could reproduce and confirm on another machine, as so far all my testing has been on our single high-memory machine. I see the same behaviour on a Winodows machine with numpy 1.6.1. But I don't think it is an indexing problem - rather something with the random number creation. a itself is already zeros for high indexes. In [8]: b[100:110] Out[8]: array([3429029, 1251819, 4292918, 2249483, 757620, 3977130, 3455449, 2005054, 2565207, 3114930]) In [9]: a[b[100:110]] Out[9]: array([[0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], ..., [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0], [0, 0, 0, ..., 0, 0, 0]], dtype=uint8) In [41]: a[581350:,0].sum() Out[41]: 0 Cheers Robin Thanks, David On Mon, Jan 23, 2012 at 05:23:28AM -0500, David Warde-Farley wrote: A colleague has run into this weird behaviour with NumPy 1.6.1, EPD 7.1-2, on Linux (Fedora Core 14) 64-bit: a = numpy.array(numpy.random.randint(256,size=(500,972)),dtype='uint8') b = numpy.random.randint(500,size=(4993210,)) c = a[b] It seems c is not getting filled in full, namely: In [14]: c[100:].sum() Out[14]: 0 I haven't been able to reproduce this quite yet, I'll try to find a machine with sufficient memory tomorrow. But does anyone have any insight in the mean time? It smells like some kind of integer overflow bug. Thanks, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] upsample or scale an array
Thanks Warren, this is great, and even handles giant arrays just fine if you've got enough RAM. I also just found this StackOverflow post with another solution. a.repeat(2, axis=0).repeat(2, axis=1). http://stackoverflow.com/questions/7525214/how-to-scale-a-numpy-array np.kron lets you do more, but for my simple use case the repeat() method is faster and more ram efficient with large arrays. In [3]: a = np.random.randint(0, 255, (2400, 2400)).astype('uint8') In [4]: timeit a.repeat(2, axis=0).repeat(2, axis=1) 10 loops, best of 3: 182 ms per loop In [5]: timeit np.kron(a, np.ones((2,2), dtype='uint8')) 1 loops, best of 3: 513 ms per loop Or for a 43200x4800 array: In [6]: a = np.random.randint(0, 255, (2400*18, 2400*2)).astype('uint8') In [7]: timeit a.repeat(2, axis=0).repeat(2, axis=1) 1 loops, best of 3: 6.92 s per loop In [8]: timeit np.kron(a, np.ones((2, 2), dtype='uint8')) 1 loops, best of 3: 27.8 s per loop In this case repeat() peaked at about 1gb of ram usage while np.kron hit about 1.7gb. Thanks again Warren. I'd tried way too many variations on reshape and rollaxis, and should have come to the Numpy list a lot sooner! -Robin On Dec 3, 2011, at 12:51 AM, Warren Weckesser wrote: On Sat, Dec 3, 2011 at 12:35 AM, Robin Kraft wrote: I need to take an array - derived from raster GIS data - and upsample or scale it. That is, I need to repeat each value in each dimension so that, for example, a 2x2 array becomes a 4x4 array as follows: [[1, 2], [3, 4]] becomes [[1,1,2,2], [1,1,2,2], [3,3,4,4] [3,3,4,4]] It seems like some combination of np.resize or np.repeat and reshape + rollaxis would do the trick, but I'm at a loss. Many thanks! -Robin Just a day or so ago, Josef Perktold showed one way of accomplishing this using numpy.kron: In [14]: a = arange(12).reshape(3,4) In [15]: a Out[15]: array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]]) In [16]: kron(a, ones((2,2))) Out[16]: array([[ 0., 0., 1., 1., 2., 2., 3., 3.], [ 0., 0., 1., 1., 2., 2., 3., 3.], [ 4., 4., 5., 5., 6., 6., 7., 7.], [ 4., 4., 5., 5., 6., 6., 7., 7.], [ 8., 8., 9., 9., 10., 10., 11., 11.], [ 8., 8., 9., 9., 10., 10., 11., 11.]]) Warren ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] upsample or scale an array
That does repeat the elements, but doesn't get them into the desired order. In [4]: print a [[1 2] [3 4]] In [7]: np.tile(a, 4) Out[7]: array([[1, 2, 1, 2, 1, 2, 1, 2], [3, 4, 3, 4, 3, 4, 3, 4]]) In [8]: np.tile(a, 4).reshape(4,4) Out[8]: array([[1, 2, 1, 2], [1, 2, 1, 2], [3, 4, 3, 4], [3, 4, 3, 4]]) It's close, but I want to repeat the elements along the two axes, effectively stretching it by the lower right corner: array([[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]) It would take some more reshaping/axis rolling to get there, but it seems doable. Anyone know what combination of manipulations would work with the result of np.tile? -Robin On Dec 3, 2011, at 11:05 AM, Olivier Delalleau wrote: You can also use numpy.tile -=- Olivier 2011/12/3 Robin Kraft Thanks Warren, this is great, and even handles giant arrays just fine if you've got enough RAM. I also just found this StackOverflow post with another solution. a.repeat(2, axis=0).repeat(2, axis=1). http://stackoverflow.com/questions/7525214/how-to-scale-a-numpy-array np.kron lets you do more, but for my simple use case the repeat() method is faster and more ram efficient with large arrays. In [3]: a = np.random.randint(0, 255, (2400, 2400)).astype('uint8') In [4]: timeit a.repeat(2, axis=0).repeat(2, axis=1) 10 loops, best of 3: 182 ms per loop In [5]: timeit np.kron(a, np.ones((2,2), dtype='uint8')) 1 loops, best of 3: 513 ms per loop Or for a 43200x4800 array: In [6]: a = np.random.randint(0, 255, (2400*18, 2400*2)).astype('uint8') In [7]: timeit a.repeat(2, axis=0).repeat(2, axis=1) 1 loops, best of 3: 6.92 s per loop In [8]: timeit np.kron(a, np.ones((2, 2), dtype='uint8')) 1 loops, best of 3: 27.8 s per loop In this case repeat() peaked at about 1gb of ram usage while np.kron hit about 1.7gb. Thanks again Warren. I'd tried way too many variations on reshape and rollaxis, and should have come to the Numpy list a lot sooner! -Robin On Dec 3, 2011, at 12:51 AM, Warren Weckesser wrote: On Sat, Dec 3, 2011 at 12:35 AM, Robin Kraft wrote: I need to take an array - derived from raster GIS data - and upsample or scale it. That is, I need to repeat each value in each dimension so that, for example, a 2x2 array becomes a 4x4 array as follows: [[1, 2], [3, 4]] becomes [[1,1,2,2], [1,1,2,2], [3,3,4,4] [3,3,4,4]] It seems like some combination of np.resize or np.repeat and reshape + rollaxis would do the trick, but I'm at a loss. Many thanks! -Robin Just a day or so ago, Josef Perktold showed one way of accomplishing this using numpy.kron: In [14]: a = arange(12).reshape(3,4) In [15]: a Out[15]: array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]]) In [16]: kron(a, ones((2,2))) Out[16]: array([[ 0., 0., 1., 1., 2., 2., 3., 3.], [ 0., 0., 1., 1., 2., 2., 3., 3.], [ 4., 4., 5., 5., 6., 6., 7., 7.], [ 4., 4., 5., 5., 6., 6., 7., 7.], [ 8., 8., 9., 9., 10., 10., 11., 11.], [ 8., 8., 9., 9., 10., 10., 11., 11.]]) Warren ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] upsample or scale an array
Ha! I knew it had to be possible! Thanks Derek. So for and N = 2 (now on my laptop): In [70]: M = 1200 In [69]: N = 2 In [71]: a = np.random.randint(0, 255, (M**2)).reshape(M,-1) In [76]: timeit np.rollaxis(np.tile(a, N**2).reshape(M,N,-1), 2, 1).reshape(M*N,-1) 10 loops, best of 3: 99.1 ms per loop In [78]: timeit a.repeat(2, axis=0).repeat(2, axis=1) 10 loops, best of 3: 85.6 ms per loop In [79]: timeit np.kron(a, np.ones((2,2), 'uint8')) 1 loops, best of 3: 521 ms per loop It turns out np.kron and repeat are pretty straightforward for multi-dimensional data too - scaling or stretching a stacked array representing pixel data over time, for example. Nothing changes for np.kron - it handles the additional dimensionality by itself. With repeat you just tell it to operate on the last two dimensions. So to sum up: 1) np.kron is cool for the simplicity of the code and simple scaling to N dimensions. It's also handy if you want to scale the array elements themselves too. 2) repeat() along the last N axes is a bit more intuitive (i.e. less magical) to me and has a better performance profile. 3) Derek's reshape/rolling solution is almost as fast but it gives me a headache trying to visualize what it's actually doing. I don't want to think about adding another dimension ... Thanks for the help folks. Here's scaling of a hypothetical time series (i.e. 3 axes), where each sub-array represents a month. In [26]: print a [[[1 2] [3 4]] [[1 2] [3 4]] [[1 2] [3 4]]] In [27]: np.kron(a, np.ones((2,2), dtype='uint8')) Out[27]: array([[[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]], [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]], [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]]) In [64]: a.repeat(2, axis=1).repeat(2, axis=2) Out[64]: array([[[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]], [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]], [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]]) On Dec. 3, 2011, at 12:50PM, Derek Homeier wrote: On 03.12.2011, at 6:22PM, Robin Kraft wrote: That does repeat the elements, but doesn't get them into the desired order. In [4]: print a [[1 2] [3 4]] In [7]: np.tile(a, 4) Out[7]: array([[1, 2, 1, 2, 1, 2, 1, 2], [3, 4, 3, 4, 3, 4, 3, 4]]) In [8]: np.tile(a, 4).reshape(4,4) Out[8]: array([[1, 2, 1, 2], [1, 2, 1, 2], [3, 4, 3, 4], [3, 4, 3, 4]]) It's close, but I want to repeat the elements along the two axes, effectively stretching it by the lower right corner: array([[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]) It would take some more reshaping/axis rolling to get there, but it seems doable. Anyone know what combination of manipulations would work with the result of np.tile? Rolling was the keyword: np.rollaxis(np.tile(a, 4).reshape(2,2,-1), 2, 1).reshape(4,4)) [[1 1 2 2] [1 1 2 2] [3 3 4 4] [3 3 4 4]] I leave the generalisation and timing up to you, but it seems for a = np.arange(M**2).reshape(M,-1) np.rollaxis(np.tile(a, N**2).reshape(M,N,-1), 2, 1).reshape(M*N,-1) should do the trick. Cheers, Derek ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] upsample or scale an array
I need to take an array - derived from raster GIS data - and upsample or scale it. That is, I need to repeat each value in each dimension so that, for example, a 2x2 array becomes a 4x4 array as follows: [[1, 2], [3, 4]] becomes [[1,1,2,2], [1,1,2,2], [3,3,4,4] [3,3,4,4]] It seems like some combination of np.resize or np.repeat and reshape + rollaxis would do the trick, but I'm at a loss. Many thanks! -Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython
On Mon, Oct 3, 2011 at 9:42 PM, Yang Zhang yanghates...@gmail.com wrote: It turns out that there's a long-standing problem in numpy that prevents it from being used in embedded CPython environments: Just wanted to make the point for reference that in general Numpy does work fine in (non-threaded) embedded CPython situations, see for example pymex [1] which embeds Python + Numpy in a Matlab mex file and works really well. This seems to a be a problem specific to Jepp. Just wanted to mention it in case it puts someone off trying something unnecessarily in the future. Cheers Robin [1] https://github.com/kw/pymex http://stackoverflow.com/questions/7592565/when-embedding-cpython-in-java-why-does-this-hang/7630992#7630992 http://mail.scipy.org/pipermail/numpy-discussion/2009-July/044046.html Is there any fix or workaround for this? Thanks. -- Yang Zhang http://yz.mit.edu/ ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] f2py and openmp on mac os x with gfortran
I'm not at my Mac to check the exact paths but see if pointing one of the environment variables LD_LIBRARY_PATH or DYLD_LIBRARY_PATH to a directory where the gfortran openmp libraries can be found - this will depend on where you got gfortran from and the version, but you should be able to find it by following the symlink that is the gfortran command and looking for an appropriate lib/ directory near the target of that. Cheers Robin On Wed, Jul 20, 2011 at 9:02 PM, Brandt Belson bbel...@princeton.edu wrote: Hello, I'm struggling to create openmp subroutines. I've simplified the problem down to the subroutine below. -- play.f90 -- subroutine step(soln,n) implicit none integer n,i real*8 soln(n) !f2py intent(in) n !f2py intent(out) soln !f2py depend(n) soln !$OMP PARALLEL DO do i=1,n soln(i) = .1 end do !$OMP END PARALLEL DO end subroutine step I compile this with the command: f2py -c -m play play.f90 --fcompiler=gfortran --f90flags=-fopenmp This completes successfully. When I import the module, I get the following error message. $ python -c 'import play' Traceback (most recent call last): File string, line 1, in module ImportError: dlopen(./play.so, 2): Symbol not found: _GOMP_parallel_end Referenced from: /home/bbelson/Desktop/SOR/play.so Expected in: flat namespace in /home/bbelson/Desktop/SOR/play.so It seems to me that the linking step is broken, however I did not see any other options in the f2py documentation to change the linking step. Did I miss something? Thanks, Brandt ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Using multiprocessing (shared memory) with numpy array multiplication
I didn't have time yesterday but the attached illustrates what I mean about putting the shared data in a module (it should work with the previous myutil). I don't get a big speed up but at least it is faster using multiple subprocesses: Not threaded: 0.450406074524 Using 8 processes: 0.282383 If I add a 1s sleep in the inner product function shows a much more significant improvement (ie if it were a longer computation): Not threaded: 50.6744170189 Using 8 processes: 8.152393 Still not quite linear but certainly an improvement. Cheers Robin On Thu, Jun 16, 2011 at 9:19 PM, Robin robi...@gmail.com wrote: The fact that you are still passing the myutil.arrayList argument to pool.map means the data is still being pickled. All arguments to pool.map are pickled to be passed to the subprocess. You need to change the function so that it accesses the data directly, then just pass indices to pool.map. Cheers Robin On Thu, Jun 16, 2011 at 9:05 PM, Brandt Belson bbel...@princeton.edu wrote: Hi all, Thanks for the replies. As mentioned, I'm parallelizing so that I can take many inner products simultaneously (which I agree is embarrassingly parallel). The library I'm writing asks the user to supply a function that takes two objects and returns their inner product. After all the discussion though it seems this is too simplistic of an approach. Instead, I plan to write this part of the library as if the inner product function supplied by the user uses all available cores (with numpy and/or numexpr built with MKL or LAPACK). As far as using fortran or C and openMP, this probably isn't worth the time it would take, both for me and the user. I've tried increasing the array sizes and found the same trends, so the slowdown isn't only because the arrays are too small to see the benefit of multiprocessing. I wrote the code to be easy for anyone to experiment with, so feel free to play around with what is included in the profiling, the sizes of arrays, functions used, etc. I also tried using handythread.foreach with arraySize = (3000,1000), and found the following: No shared memory, numpy array multiplication took 1.57585811615 seconds Shared memory, numpy array multiplication took 1.25499510765 seconds This is definitely an improvement from multiprocessing, but without knowing any better, I was hoping to see a roughly 8x speedup on my 8-core workstation. Based on what Chris sent, it seems there is some large overhead caused by multiprocessing pickling numpy arrays. To test what Robin mentioned If you are on Linux or Mac then fork works nicely so you have read only shared memory you just have to put it in a module before the fork (so before pool = Pool() ) and then all the subprocesses can access it without any pickling required. ie myutil.data = listofdata p = multiprocessing.Pool(8) def mymapfunc(i): return mydatafunc(myutil.data[i]) p.map(mymapfunc, range(len(myutil.data))) I tried creating the arrayList in the myutil module and using multiprocessing to find the inner products of myutil.arrayList, however this was still slower than not using multiprocessing, so I believe there is still some large overhead. Here are the results: No shared memory, numpy array multiplication took 1.55906510353 seconds Shared memory, numpy array multiplication took 9.8242638 seconds Shared memory, myutil.arrayList numpy array multiplication took 8.77094507217 seconds I'm attaching this code. I'm going to work around this numpy/multiprocessing behavior with numpy/numexpr built with MKL or LAPACK. It would be good to know exactly what's causing this though. It would be nice if there was a way to get the ideal speedup via multiprocessing, regardless of the internal workings of the single-threaded inner product function, as this was the behavior I expected. I imagine other people might come across similar situations, but again I'm going to try to get around this by letting MKL or LAPACK make use of all available cores. Thanks again, Brandt ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion import numpy as np import myutil import multiprocessing import itertools import time as T arraySize = (3000,1000) numArrays = 50 arrayList = [np.random.random(s) for s in itertools.repeat(arraySize,numArrays)] IPs = np.zeros(numArrays) # pool # save data to module _before_ fork myutil.arrayList = arrayList def shared_numpy_inner_product(i): # do numpy inner product on shared module variable #T.sleep(1) return myutil.numpy_inner_product(myutil.arrayList[i]) myutil.shared_numpy_inner_product = shared_numpy_inner_product # 1 thread startTime = T.time() for i in range(len(arrayList)): IPs[i] = myutil.shared_numpy_inner_product(i) endTime = T.time() - startTime print Not threaded: ,endTime # now fork the subprocesses processes=multiprocessing.cpu_count
Re: [Numpy-discussion] Using multiprocessing (shared memory) with numpy array multiplication
On Thu, Jun 16, 2011 at 6:44 PM, Christopher Barker chris.bar...@noaa.gov wrote: 2. There is also the question of when the process pool is spawned. Though I haven't checked, I suspect it happens prior to calling pool.map. But if it does not, this is a factor as well, particularly on Windows (less so on Linux and Apple). It didn't work well on my Mac, so ti's either not an issue, or not Windows-specific, anyway. I am pretty sure that the process pool is spawned when you create the pool object instance. 3. arrayList is serialised by pickling, which has a significan overhead. It's not shared memory either, as the OP's code implies, but the main thing is the slowness of cPickle. I'll bet this is a big issue, and one I'm curious about how to address, I have another problem where I need to multi-process, and I'd love to know a way to pass data to the other process and back *without* going through pickle. maybe memmapped files? If you are on Linux or Mac then fork works nicely so you have read only shared memory you just have to put it in a module before the fork (so before pool = Pool() ) and then all the subprocesses can access it without any pickling required. ie myutil.data = listofdata p = multiprocessing.Pool(8) def mymapfunc(i): return mydatafunc(myutil.data[i]) p.map(mymapfunc, range(len(myutil.data))) Actually that won't work because mymapfunc needs to be in a module so it can be pickled but hopefully you get the idea. Cheers Robin IPs = N.array(innerProductList) 4. numpy.array is a very slow function. The benchmark should preferably not include this overhead. I re-ran, moving that out of the timing loop, and, indeed, it helped a lot, but it still takes longer with the multi-processing. I suspect that the overhead of pickling, etc. is overwhelming the operation itself. That and the load balancing issue that I don't understand! To test this, I did a little experiment -- creating a fake operation, one that simply returns an element from the input array -- so it should take next to no time, and we can time the overhead of the pickling, etc: $ python shared_mem.py Using 2 processes No shared memory, numpy array multiplication took 0.124427080154 seconds Shared memory, numpy array multiplication took 0.586215019226 seconds No shared memory, fake array multiplication took 0.000391006469727 seconds Shared memory, fake array multiplication took 0.54935503006 seconds No shared memory, my array multiplication took 23.5055780411 seconds Shared memory, my array multiplication took 13.0932741165 seconds Bingo! The overhead of the multi-processing takes about .54 seconds, which explains the slowdown for the numpy method not so mysterious after all. Bruce Southey wrote: But if everything is *single-threaded* and thread-safe, then you just create a function and use Anne's very useful handythread.py (http://www.scipy.org/Cookbook/Multithreading). This may be worth a try -- though the GIL could well get in the way. By the way, if the arrays are sufficiently small, there is a lot of overhead involved such that there is more time in communication than computation. yup -- clearly the case here. I wonder if it's just array size though -- won't cPickle time scale with array size? So it may not be size pe-se, but rather how much computation you need for a given size array. -Chris [I've enclosed the OP's slightly altered code] -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Using multiprocessing (shared memory) with numpy array multiplication
The fact that you are still passing the myutil.arrayList argument to pool.map means the data is still being pickled. All arguments to pool.map are pickled to be passed to the subprocess. You need to change the function so that it accesses the data directly, then just pass indices to pool.map. Cheers Robin On Thu, Jun 16, 2011 at 9:05 PM, Brandt Belson bbel...@princeton.edu wrote: Hi all, Thanks for the replies. As mentioned, I'm parallelizing so that I can take many inner products simultaneously (which I agree is embarrassingly parallel). The library I'm writing asks the user to supply a function that takes two objects and returns their inner product. After all the discussion though it seems this is too simplistic of an approach. Instead, I plan to write this part of the library as if the inner product function supplied by the user uses all available cores (with numpy and/or numexpr built with MKL or LAPACK). As far as using fortran or C and openMP, this probably isn't worth the time it would take, both for me and the user. I've tried increasing the array sizes and found the same trends, so the slowdown isn't only because the arrays are too small to see the benefit of multiprocessing. I wrote the code to be easy for anyone to experiment with, so feel free to play around with what is included in the profiling, the sizes of arrays, functions used, etc. I also tried using handythread.foreach with arraySize = (3000,1000), and found the following: No shared memory, numpy array multiplication took 1.57585811615 seconds Shared memory, numpy array multiplication took 1.25499510765 seconds This is definitely an improvement from multiprocessing, but without knowing any better, I was hoping to see a roughly 8x speedup on my 8-core workstation. Based on what Chris sent, it seems there is some large overhead caused by multiprocessing pickling numpy arrays. To test what Robin mentioned If you are on Linux or Mac then fork works nicely so you have read only shared memory you just have to put it in a module before the fork (so before pool = Pool() ) and then all the subprocesses can access it without any pickling required. ie myutil.data = listofdata p = multiprocessing.Pool(8) def mymapfunc(i): return mydatafunc(myutil.data[i]) p.map(mymapfunc, range(len(myutil.data))) I tried creating the arrayList in the myutil module and using multiprocessing to find the inner products of myutil.arrayList, however this was still slower than not using multiprocessing, so I believe there is still some large overhead. Here are the results: No shared memory, numpy array multiplication took 1.55906510353 seconds Shared memory, numpy array multiplication took 9.8242638 seconds Shared memory, myutil.arrayList numpy array multiplication took 8.77094507217 seconds I'm attaching this code. I'm going to work around this numpy/multiprocessing behavior with numpy/numexpr built with MKL or LAPACK. It would be good to know exactly what's causing this though. It would be nice if there was a way to get the ideal speedup via multiprocessing, regardless of the internal workings of the single-threaded inner product function, as this was the behavior I expected. I imagine other people might come across similar situations, but again I'm going to try to get around this by letting MKL or LAPACK make use of all available cores. Thanks again, Brandt ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] How to tell if I succeeded to build numpy with amd, umfpack and lapack
I think numpy doesn't use umfpack. scipy.sparse used to, but now the umfpack stuff has been moved out to a scikit. So you probably won't see anything about those libraries, but if you install scikits.umfpack and it works then you must be linked correctly. Cheers Robin On Fri, Feb 18, 2011 at 11:32 AM, Samuel John sc...@samueljohn.de wrote: Ping. How to tell, if numpy successfully build against libamd.a and libumfpack.a? How do I know if they were successfully linked (statically)? Is it possible from within numpy, like show_config() ? I think show_config() has no information about these in it :-( Anybody? Thanks, Samuel ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Where did the github numpy repository go?
Git is having some kind of major outage: http://status.github.com/ The site and git access is unavailable due to a database failure. We're researching the issue. On Nov 14, 2010, at 3:29 PM, numpy-discussion-requ...@scipy.org wrote: Message: 5 Date: Sun, 14 Nov 2010 13:29:03 -0700 From: Charles R Harris charlesr.har...@gmail.com Subject: [Numpy-discussion] Where did the github numpy repository go? To: numpy-discussion numpy-discussion@scipy.org Message-ID: aanlkti=cpz752y88+sfsoh3wtm9poqphokyxc55dp...@mail.gmail.com Content-Type: text/plain; charset=iso-8859-1 I keep getting page does not exist. Chuck -- next part -- An HTML attachment was scrubbed... URL: http://mail.scipy.org/pipermail/numpy-discussion/attachments/20101114/3381d8a4/attachment.html -- ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion End of NumPy-Discussion Digest, Vol 50, Issue 32 ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] mingw-w64 tutorial ?
On Sun, Aug 22, 2010 at 8:41 AM, Sebastian Haase seb.ha...@gmail.com wrote: Do you know if that contains a C++ compiler ? The first page before it starts the actual download has Visual C++ Compilers grayed out ... !? -Sebastian Ok, apparently I had to install the dot NET Framework 4 from http://msdn.microsoft.com/en-us/netframework/aa569263.aspx first, before then the C++ could be installed. But now setup.py still complains: error: Unable to find vcvarsall.bat and I think it is looking for C:\Program Files (x86)\Microsoft Visual Studio 9.0 while that file got installed in C:\Program Files (x86)\Microsoft Visual Studio 10.0 I don't know how to get the log.debug messages from the setup.py script activated... ? My limited understanding is that there are two different versions of microsoft stuff at the moment: VS 2008 which goes with SDK 3.5 and VS 2010 which goes with SDK 4.0. I think Python is built with 2008 so that might be easier to try. I'm not sure though. While the SDK contains some compilers I think the easiest way to get the compilers working is to first install the VS express edition, which is 32 bit, and then install the SDK which provides 64 bit support. Here are some instructions I lifted from the Matlab website (I haven't tried building Python extensions yet but I was able to embed Python with this setup). To install Visual Studio 2008 Express Edition with all required components: 1. Install Microsoft Visual Studio 2008 Express Edition. The main Visual Studio 2008 Express installer is available from: http://www.microsoft.com/express/Downloads/#Visual_Studio_2008_Express_Downloads This package can be installed using the default options. 2. Install the Microsoft Windows SDK. The Microsoft Windows SDK is available by searching Microsoft's download site, or by going directly to: http://msdn.microsoft.com/en-us/windowsserver/bb980924.aspx or http://www.microsoft.com/downloads/details.aspx?FamilyId=F26B1AA4-741A-433A-9BE5-FA919850BDBFdisplaylang=en Download the Windows Server 2008 .NET 3.5 SDK. Do not install beta or 'Release Candidate' (RC) versions. Also do NOT install Microsoft Windows SDK for Windows 7 and .NET Framework 4 (version 7.1); if you want to use a 7.x version choose the Microsoft Windows SDK for Windows 7 and .NET Framework 3.5 SP1. 2.1. While installing the SDK, you must select x64 Compilers and Tools. For example, in the SDK installer above: On screen Installation Options Select Developer Tools-Visual C++ Compilers. This item has the Feature Description Install the Visual C++ 9.0 Compilers. These compilers allow you to target x86, x64, IA64 processor architectures. 3. To verify that you have all installed components, check that the Microsoft SDK contains the amd64 version of the C/C++ compiler cl.exe. This is usually installed into C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin\amd64\cl.exe Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] [SciPy-Dev] Good-bye, sort of (John Hunter)
On Wed, Aug 18, 2010 at 8:20 AM, Sturla Molden stu...@molden.no wrote: Den 18. aug. 2010 kl. 08.19 skrev Martin Raspaud martin.rasp...@smhi.se: Once upon a time, when my boss wanted me to use matlab, I found myself implementing a python interpreter in matlab... There are just two sane solutions for Matlab: Either embed CPyton in a MEX file, or use Matlab's JVM to run Jython ;) http://vader.cse.lehigh.edu/~perkins/pymex.html Just thought I'd mention another one since this came up: http://github.com/kw/pymex This one works very nicely - it proxies any Python objects so you can use, should you want to, the Matlab IDE as a python interpreter, supports numpy arrays etc. Also cross-platform - I even got it to work with 64 bit matlab/python on windows (in a fork on github). Sturla ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] f2py performance question from a rookie
On Mon, Aug 16, 2010 at 10:12 AM, Vasileios Gkinis v.gki...@nbi.ku.dkwrote: Hi all, This is a question on f2py. I am using a Crank Nicholson scheme to model a diffusion process and in the quest of some extra speed I started playing with f2py. However I do not seem to be able to get any significant boost in the performance of the code. Try adding -DF2PY_REPORT_ON_ARRAY_COPY to the f2py command line. This will cause f2py to report any array copies. If any of the types/ordering of the arrays don't match f2py will silently make a copy - this can really affect performance. Cheers Robin In the following simple example I am building a tridiagonal array with plain python for loops and by calling a simple fortran subroutine that does the same thing. Here goes the python script: import numpy as np import time import learnf90_05 import sys run = sys.argv[1] try: dim = np.float(sys.argv[2]) except: dim = 500 elem = np.ones(dim) ### Python routine if run == Python: t_python_i = time.time() array_python = np.zeros((dim, dim)) for row in np.arange(1,dim-1): array_python[row, row-1] = elem[row] array_python[row, row] = elem[row] array_python[row,row+1] = elem[row] t_python_f = time.time() python_time = t_python_f - t_python_i print(Python time: %0.5e %(python_time)) ###fortran routine elif run == Fortran: t_fortran_i = time.time() fortran_array = learnf90_05.test(j = dim, element = elem) t_fortran_f = time.time() fortran_time = t_fortran_f - t_fortran_i print(Fortran time: %0.5e %(fortran_time)) And the fortran subroutine called test is here: subroutine test(j, element, a) integer, intent(in) :: j integer :: row, col real(kind = 8), intent(in) :: element(j) real(kind = 8), intent(out) :: a(j,j) do row=2,j-1 a(row,row-1) = element(row) a(row, row) = element(row) a(row, row+1) = element(row) enddo end The subroutine is compiled with Gfortran 4.2 on Osx 10.5.8. Running the python script with array sizes 300x300 and 3000x3000 gives for the python plain for loops: dhcp103:learnf vasilis$ python fill_array.py Python 3000 Python time: 1.56063e-01 dhcp103:learnf vasilis$ python fill_array.py Python 300 Python time: 4.82297e-03 and for the fortran subroutine: dhcp103:learnf vasilis$ python fill_array.py Fortran 3000 Fortran time: 1.16298e-01 dhcp103:learnf vasilis$ python fill_array.py Fortran 300 Fortran time: 8.83102e-04 It looks like the gain in performance is rather low compared to tests i have seen elsewhere. Am I missing something here..? Cheers...Vasilis -- Vasileios Gkinis PhD student Center for Ice and Climate http://www.iceandclimate.nbi.ku.dk Niels Bohr Institute http://www.nbi.ku.dk/english/ Juliane Maries Vej 30 2100 Copenhagen Denmark email: v.gki...@nbi.ku.dk v.gki...@nbi.ku.dk? skype: vasgkin Tel: +45 353 25913 -- Vasileios Gkinis PhD student Center for Ice and Climate http://www.iceandclimate.nbi.ku.dk Niels Bohr Institute http://www.nbi.ku.dk/english/ Juliane Maries Vej 30 2100 Copenhagen Denmark email: v.gki...@nbi.ku.dk v.gki...@nbi.ku.dk? skype: vasgkin Tel: +45 353 25913 ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] summarizing blocks of an array using a moving window
Vincent, Pauli, From: Vincent Schut sc...@sarvision.nl - an other option would be some smart reshaping, which finally gives you a [y//2, x//2, 2, 2] array, which you could then reduce to calculate stats (mean, std, etc) on the last two axes. I *think* you'd have to first reshape both x and y axes, and then reposition the axes. An example: a = gdal_array.BandReadAsArray(blabla) y,x = a.shape #y and x need be divideable by 2! b = a.reshape(y/2, 2, x/2, x).transpose(0,2,1,3).reshape(y/2, x/2, 4) bMean = b.mean(axis=-1) bMax = ..etc. You and Pauli agree on this - seems like a good option. - a third option would be to create an index array, which has a unique value per 2x2 square, and then use histogram2d. This would, if you use its 'weight' functionality, at least enable you to get efficient counts and sums/means. Other stats might be hard, though. H I don't get this, but I'll experiment. I've seen some really useful things done with histogram2d so it seems worthwhile to figure out. Message: 6 From: Pauli Virtanen p...@iki.fi Let's say the image looks like this: np.random.randint(0,2, 16).reshape(4,4) array([[0, 0, 0, 1], [0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]]) I want to use a square, non-overlapping moving window for resampling, so that I get a count of all the 1's in each 2x2 window. 0, 0, 0, 1 0, 0, 1, 1 0 3 = 2 0 1, 1, 0, 0 0, 0, 0, 0 In another situation with similar data I'll need the average, or the maximum value, etc.. Non-overlapping windows can be done by reshaping: x = np.array([[0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 0, 0], [1, 1, 0, 0, 1, 1], [0, 0, 0, 0, 1, 1], [1, 0, 1, 0, 1, 1], [0, 0, 1, 0, 0, 0]]) y = x.reshape(3,2,3,2) y2 = y.sum(axis=3).sum(axis=1) This is perfect - and so fast! Thanks! Now I just have to understand why it works ... Can anyone recommend a tutorial on working with (slicing, reshaping, etc.) multi-dimensional arrays? Stefan's slides are beautiful but my brain starts to hurt if I try to follow them line by line. -Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] summarizing blocks of an array using a moving window
Hello all, The short version: For a given NxN array, is there an efficient way to use a moving window to collect a summary statistic on a chunk of the array, and insert it into another array? The long version: I am trying to resample an image loaded with GDAL into an NxN array. Note that this is for statistical purposes, so image quality doesn't matter. For the curious, the image is derived from satellite imagery and displays a map of hotspots of tropical deforestation at 500m resolution. I need to assign a count of these deforestation 'hits' to each pixel in an existing array of 1000m pixels. Let's say the image looks like this: np.random.randint(0,2, 16).reshape(4,4) array([[0, 0, 0, 1], [0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]]) I want to use a square, non-overlapping moving window for resampling, so that I get a count of all the 1's in each 2x2 window. 0, 0, 0, 1 0, 0, 1, 1 0 3 = 2 0 1, 1, 0, 0 0, 0, 0, 0 In another situation with similar data I'll need the average, or the maximum value, etc.. My brute-force method is to loop through the rows and columns to get little chunks of data to process. But must be a more elegant way to do this - it's probably obvious too ... (inelegant way further down). Another way to do it would be to use np.tile to create an array called arr filed with blocks of [[0,1],[2,3]]. I could then use something like this to get the stats I want: d0 = img[np.where(arr==0)] d1 = img[np.where(arr==1)] d2 = img[np.where(arr==2)] d3 = img[np.where(arr==3)] img_out = d0 + d1 + d2 + d3 This would be quite snappy if I could create arr efficiently. Unfortunately np.tile does something akin to np.hstack to create this array, so it isn't square and can't be reshaped appropriately (np.tile(np.arange(2**2).reshape(2, 2), 4)): array([[0, 1, 0, 1, 0, 1, 0, 1], [2, 3, 2, 3, 2, 3, 2, 3]]) Inefficient sample code below. Advice greatly appreciated! -Robin import numpy as np from math import sqrt from time import time def rs(img_size=16, window_size=2): w = window_size # making sure the numbers work if img_size % sqrt(img_size) 0: print Square root of image size must be an integer. print Sqrt =, sqrt(img_size) return elif (img_size / sqrt(img_size)) % w 0: print Square root of image size must be evenly divisible by, w print Sqrt =, sqrt(img_size) print sqrt(img_size), /, w, =, sqrt(img_size)/w return else: rows = int(sqrt(img_size)) cols = int(sqrt(img_size)) # create fake data: ones and zeros img = np.random.randint(0, 2, img_size).reshape(rows, cols) # create empty array for resampled data img_out = np.zeros((rows/w, cols/w), dtype=np.int) t = time() # retreive blocks of pixels in img # insert summary into img_out for r in xrange(0, rows, w): # skip rows based on window size for c in xrange(0, cols, w): # skip columns based on window size # calculate the sum of the values in moving window, #insert value into corresponding pixel in img_out img_out[r/w, c/w] = np.int(np.sum(img[r:r+w, c:c+w])) t1= time() elapsed = t1-t print img shape:, img.shape print img, \n print img_out shape:, img_out.shape print img_out print %.7f seconds % elapsed rs(imgage_size=16, window=2) rs(81, 3) rs(100) #rs(4800*4800) # very slow ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy for Python 3?
On Mon, Jul 19, 2010 at 1:53 PM, Matthieu Brucher matthieu.bruc...@gmail.com wrote: I'm afraid that if you don't know if you have a compiler, you don't have one. This also means you will not be able to compile Numpy, as the official compiler is no longer available. Is this the VS 2008 Express Edition? I saw something posted a while ago about how it was no longer available, but I think it was a mistake as it still seems to be easily available from: http://www.microsoft.com/express/downloads/#2008-All Even 2005 is still available so I imagine there is some time before we have to worry about 2008. http://www.microsoft.com/downloads/details.aspx?familyid=7B0B0339-613A-46E6-AB4D-080D4D4A8C4Edisplaylang=en Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy on windows 64 bit
On Tue, Jul 6, 2010 at 6:57 PM, Christoph Gohlke cgoh...@uci.edu wrote: I proposed a fix at http://projects.scipy.org/numpy/ticket/1535. Does it work for you? Thanks very much... that looks great. Since it works with long's it fixes my problems (I think it will also fix a couple of the failing scipy tests) Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] numpy on windows 64 bit
Hi, I am having some problems with win64 with all my tests failing. I installed amd64 Python from Python.org and numpy and scipy from http://www.lfd.uci.edu/~gohlke/pythonlibs/ I noticed that on windows sys.maxint is the 32bit value (2147483647 ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy on windows 64 bit
On Mon, Jul 5, 2010 at 12:09 PM, David Cournapeau courn...@gmail.com wrote: Short of saying what those failures are, we can't help you, Thanks for reply... Somehow my message got truncated - I had written more detail about the errors! I noticed that on windows sys.maxint is the 32bit value (2147483647 This is not surprising: sys.maxint gives you the max value of a long, which is 32 bits even on 64 bits on windows. I just got to figuring this out... But it makes some problems. The main one I'm having is that I assume because of this problem array shapes are longs instead of ints (ie x.shape[0] is a long). This breaks np.random.permutation(x.shape[1]) which I use all over the place (I opened a ticket for this, #1535). Something I asked in the previous mail that got lost is what is the best cross platform way of doing this? np.random.permutation(int(x.shape[1]))? Actually that and the problems with scipy.sparse (spsolve doesn't work) cover all of the errors I'm seeing... (I detailed those in a seperate mail to the scipy list). Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] OT: request help building pymex win64
On Fri, Jul 2, 2010 at 6:10 PM, David Cournapeau courn...@gmail.com wrote: Also, I would double check the issue is not something else altogether, Hi, I got it working eventually - it was something else altogether! I had made some mistakes in the changes I had made to get it to compile with visual studio that were causing the segfaults. Makes me think of the old phrase - problem is between keyboard and chair. Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] OT: request help building pymex win64
Hi, Sorry for the offtopic post but I wondered if any Windows experts who are familiar with topics like linking python on windows and visual studio runtimes etc. might be able to help. I'm on a bit of a mission to get pymex built for 64 bit windows. Pymex ( http://github.com/kw/pymex ) is a matlab package that embeds the Python interpreter in a mex file and provides a very elegant interface for manipulating python objects from matlab, as well as converting between data times when necessary. It builds easily on mac, linux and win32 with mingw, but I really need it also for 64 bit windows. (It works very well with numpy as well so not completely OT). I have looked at trying to get a 64bit mingw working to build mex files, but that seemed quite difficult, so instead I am trying to build with VS 2008 Express Edition + Windows 7 SDK (for 64 bit support). As far as I can tell this is installed OK as I can build the example mex64 files OK. I have made some modifications to pymex to get it to build under vs 2008 ( http://github.com/robince/pymex/tree/win64 ). And I can get it to build and link (I believe using the implicit dll method of linking against C:\Python26\libs\python26.lib of the amd64 python.org python) without errors, but when I run it seems to segfaults whenever a pointer is passed between the mex side and python26.dll. I asked this stackoverflow question which has some more details (build log) http://stackoverflow.com/questions/3167134/trying-to-embed-python-into-matlab-mex-win64 Anyway I'm completely in the dark but wondered if some of the experts on here would be able to spot something (perhaps to do with incompatible C runtimes - I am not sure what runtime Python is built with but I thought it was VS 2008). Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] OT: request help building pymex win64
On Fri, Jul 2, 2010 at 5:47 PM, David Cournapeau courn...@gmail.com wrote: The problem may be that matlab is built with one runtime, and Python with another Unless your matlab is very recent, it is actually quite likely to be compiled with VS 2005, which means you should use python 2.5 instead (or built python2.6 with VS 2005, but I am not sure it is even possible without herculean efforts). Thanks for your help! I thought of that, but then VS 2008 is an officially supported compiler for the version of matlab I am using (2009a). http://www.mathworks.com/support/compilers/release2009a/win64.html So I thought on the matlab/mex side 2008 should be fine, and I thought since Python is built with 2008 that should also be OK. But obviously something isn't! Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Brooken Toolchain (Mac 10.6)
To build against the python.org 2.5 you need to use the older gcc: export CC=/usr/bin/gcc-4.0 export CXX=/usr/bin/g++-4.0 should do it. By default snow leopard uses 4.2 now, which doesn't support the -Wno-long-double option used when building python. Cheers Robin On Mon, Apr 19, 2010 at 3:55 PM, Patrick Marsh patrickmars...@gmail.com wrote: Greetings, Per my previous email, I'm trying to setup the release process for Numpy on my MacBook Pro. When trying to build Numpy 1.4.1r3 with Python 2.5.4 I get a broken toolchain error (below). I do not get this error when trying to build Numpy with Python 2.6.5 - and there is nothing fundamentally different (that I know of) between my Python 2.5.4 and Python 2.6.5 environments. To address a previous suggestion I received offline, I do have sufficient permissions (I've even tried using sudo, just to make sure) and I do have setuptools installed. Any help, would be appreciated. Patrick compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/include -I/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5 -c' gcc: _configtest.c cc1: error: unrecognized command line option -Wno-long-double cc1: error: unrecognized command line option -Wno-long-double lipo: can't figure out the architecture type of: /var/tmp//ccjTUva4.out cc1: error: unrecognized command line option -Wno-long-double cc1: error: unrecognized command line option -Wno-long-double lipo: can't figure out the architecture type of: /var/tmp//ccjTUva4.out failure. removing: _configtest.c _configtest.o Traceback (most recent call last): File setup.py, line 187, in module setup_package() File setup.py, line 180, in setup_package configuration=configuration ) File /Users/pmarsh/git/numpy.release/numpy/distutils/core.py, line 186, in setup return old_setup(**new_attr) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/core.py, line 151, in setup dist.run_commands() File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/dist.py, line 974, in run_commands self.run_command(cmd) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/dist.py, line 994, in run_command cmd_obj.run() File /Users/pmarsh/git/numpy.release/numpy/distutils/command/build.py, line 37, in run old_build.run(self) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/command/build.py, line 112, in run self.run_command(cmd_name) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/cmd.py, line 333, in run_command self.distribution.run_command(command) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/dist.py, line 994, in run_command cmd_obj.run() File /Users/pmarsh/git/numpy.release/numpy/distutils/command/build_src.py, line 152, in run self.build_sources() File /Users/pmarsh/git/numpy.release/numpy/distutils/command/build_src.py, line 163, in build_sources self.build_library_sources(*libname_info) File /Users/pmarsh/git/numpy.release/numpy/distutils/command/build_src.py, line 298, in build_library_sources sources = self.generate_sources(sources, (lib_name, build_info)) File /Users/pmarsh/git/numpy.release/numpy/distutils/command/build_src.py, line 385, in generate_sources source = func(extension, build_dir) File numpy/core/setup.py, line 657, in get_mathlib_info raise RuntimeError(Broken toolchain: cannot link a simple C program) RuntimeError: Broken toolchain: cannot link a simple C program -- Patrick Marsh Ph.D. Student / NSSL Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Numpy.dot segmentation fault
You can build numpy against Accelerate through macports by specifying the +no_atlas variant. Last time I tried I ran into this issue: http://trac.macports.org/ticket/22201 but it looks like it should be fixed now. Cheers Robin On Mon, Jan 18, 2010 at 8:15 PM, Mark Lescroart lescr...@usc.edu wrote: Hi David (et al), Thanks for the reply. The version of ATLAS I was using was v3.8.3_1, installed via MacPorts, compiled and built on my machine with a gcc4.3 compiler. I uninstalled numpy 1.4 and ATLAS, re-installed (i.e., re- compiled) the same version (3.8.3_1), re-installed numpy (for python2.6, version 1.4), and got the same bug. I don't know if this means that there's something fundamentally wrong with the version of ATLAS on MacPorts (probably less likely) or something wrong with the way my system is configured (probably more likely). If anyone can give me any more insight into how to test my installation of ATLAS, I would be much obliged (I read through a fair bit of the ATLAS installation notes on the ATLAS sourceforge page, and could not figure out how to run the whole test suite ). If possible, I would like to solve this problem within Macports (and thus not with the Accelerate framework). I am using numpy mostly through the pyMVPA package for fMRI multi-voxel analysis, and the pyMVPA package depends on a number of other libraries, and the mess of dependencies is most easily managed within the framework of Macports. Cheers, Mark On Jan 17, 2010, at 8:38 PM, David Cournapeau wrote: Mark Lescroart wrote: Hello, I've encountered a segfault in numpy when trying to compute a dot product for two arrays - see code below. The problem only seems to occur when the arrays reach a certain size. Your atlas is most likely broken. You will have to double-check how you built it, and maybe run the whole test suite (as indicated in the ATLAS installation notes). Note that you can use the Accelerate framework on mac os x, this is much easier to get numpy working on mac, cheers, David ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2
On Fri, Jan 8, 2010 at 2:29 AM, David Warde-Farley d...@cs.toronto.edu wrote: On 5-Jan-10, at 7:02 PM, Christopher Barker wrote: Pretty sure the python.org binaries are 32-bit only. I still think it's sensible to prefer the waiting the rest of this sentence.. ;-) I had meant to say 'sensible to prefer the Python.org version' though in reality I'm a little miffed that Python.org isn't providing Ron's 4- way binaries, since he went to the trouble of adding support for building them. Grumble grumble. My understanding was that 2.6/3.1 will never be buildable as an arch selectable universal binary interpreter (like the apple system python) due to this issue: http://bugs.python.org/issue6834 I think this is only being fixed in 2.7/3.2 so perhaps from then Python will distribute selectable universal builds. (Just mention it in case folks aren't aware of that issue). Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] ma feature request (log2)
Hi, Could we have a ma aware numpy.ma.log2 please, similar to np.ma.log and np.ma.log10? I think it should be as simple as the patch below but perhaps I've missed something: Thanks, Robin --- core.py.orig2009-12-13 15:14:14.0 + +++ core.py 2009-12-13 15:14:53.0 + @@ -66,7 +66,7 @@ 'identity', 'ids', 'indices', 'inner', 'innerproduct', 'isMA', 'isMaskedArray', 'is_mask', 'is_masked', 'isarray', 'left_shift', 'less', 'less_equal', 'load', 'loads', 'log', 'log10', - 'logical_and', 'logical_not', 'logical_or', 'logical_xor', + 'log2', 'logical_and', 'logical_not', 'logical_or', 'logical_xor', 'make_mask', 'make_mask_descr', 'make_mask_none', 'mask_or', 'masked', 'masked_array', 'masked_equal', 'masked_greater', 'masked_greater_equal', 'masked_inside', 'masked_invalid', @@ -1124,6 +1124,8 @@ _DomainGreater(0.0)) log10 = _MaskedUnaryOperation(umath.log10, 1.0, _DomainGreater(0.0)) +log2 = _MaskedUnaryOperation(umath.log2, 1.0, + _DomainGreater(0.0)) tan = _MaskedUnaryOperation(umath.tan, 0.0, _DomainTan(1e-35)) arcsin = _MaskedUnaryOperation(umath.arcsin, 0.0, ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] non-standard standard deviation
On Mon, Nov 30, 2009 at 12:30 AM, Colin J. Williams c...@ncf.ca wrote: On 29-Nov-09 17:13 PM, Dr. Phillip M. Feldman wrote: All of the statistical packages that I am currently using and have used in the past (Matlab, Minitab, R, S-plus) calculate standard deviation using the sqrt(1/(n-1)) normalization, which gives a result that is unbiased when sampling from a normally-distributed population. NumPy uses the sqrt(1/n) normalization. I'm currently using the following code to calculate standard deviations, but would much prefer if this could be fixed in NumPy itself: def mystd(x=numpy.array([]), axis=None): This function calculates the standard deviation of the input using the definition of standard deviation that gives an unbiased result for samples from a normally-distributed population. xd= x - x.mean(axis=axis) return sqrt( (xd*xd).sum(axis=axis) / (numpy.size(x,axis=axis)-1.0) ) Anne Archibald has suggested a work-around. Perhaps ddof could be set, by default to 1 as other values are rarely required. Where the distribution of a variate is not known a priori, then I believe that it can be shown that the n-1 divisor provides the best estimate of the variance. There have been previous discussions on this (but I can't find them now) and I believe the current default was chosen deliberately. I think it is the view of the numpy developers that the n divisor has more desireable properties in most cases than the traditional n-1 - see this paper by Travis Oliphant for details: http://hdl.handle.net/1877/438 Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] bus error in embedded numpy
Hi, I'm trying to embed Python in a MATLAB mex file. I've been coming under some pressure to make my Python code available to my MATLAB colleagues so I am trying to come up with a relatively general way of calling numerical python code from Matlab. I am making some progress - but get a reliable crash from numpy. This only occurs the second time I am loading it. So I Py_Initialize, import numpy, Py_Finalize (all works fine), but then if I clear the mex file (clear funcname in matlab - which calls Py_Finalize through a mexAtExit handler) and try to run the function again (which will reinitialize interpreter and import numpy again) I get the followin stack trace from multisarray.so. Wondered if anyone could through any light. I have already run into this bug http://bugs.python.org/issue6869 which prevents me using ctypes... I wondered if this was related. For now its not such a big problem - I will just avoid unloading the mex function (with clear function). But I thought it might be indicative of a memory leak or some other problem since I think in theory it should work (It does if numpy isn't imported). Cheers Robin Bus error detected at Fri Nov 13 17:11:57 2009 Configuration: MATLAB Version: 7.8.0.347 (R2009a) MATLAB License: 161051 Operating System: Darwin 10.0.0 Darwin Kernel Version 10.0.0: Fri Jul 31 22:47:34 PDT 2009; root:xnu-1456.1.25~1/RELEASE_I386 i386 Window System:The X.Org Foundation (10402000), display /tmp/launch-2p1ZWg/:0 Current Visual: 0x24 (class 4, depth 24) Processor ID: x86 Family 6 Model 15 Stepping 10, GenuineIntel Virtual Machine: Java 1.6.0_15-b03-219 with Apple Inc. Java HotSpot(TM) Client VM mixed mode Default Encoding: ISO-8859-1 Fault Count: 1 Register State: eax = 0001 ebx = 307b12ab ecx = edx = 305ef148 esi = 305ef140 edi = 32a79d60 ebp = b097c6c8 esp = b097c670 eip = 307b14be flg = 00010202 Stack Trace: [0] multiarray.so:PyArray_FromScalar~(0x305ef140, 0, 2, 0x32e23287) + 542 bytes [1] multiarray.so:gentype_nonzero_number~(0x305ef140, 0x2eaa6db0, 0, 0x32e73cfe) + 36 bytes [2] Python:PyObject_IsTrue~(0x305ef140, 0x2ea95de0, 2, 0) + 63 bytes [3] Python:PyEval_EvalFrameEx~(0x33c95160, 0, 0x332941e0, 0) + 12598 bytes [4] Python:PyEval_EvalFrameEx~(0x32a70f80, 0, 0x332941e0, 0) + 24217 bytes [5] Python:PyEval_EvalCodeEx~(0x33f05068, 0x332941e0, 0, 0x32a96794) + 1819 bytes [6] Python:PyEval_EvalFrameEx~(0x32a96640, 0, 0x332941e0, 0) + 16561 bytes [7] Python:PyEval_EvalCodeEx~(0x33f051d0, 0x332941e0, 0, 0x32a6fcc0) + 1819 bytes [8] Python:PyEval_EvalFrameEx~(0x32a6fb60, 0, 0x33bc99c0, 0) + 16561 bytes [9] Python:PyEval_EvalCodeEx~(0x334abda0, 0x33bc99c0, 0, 0x32ad3b24) + 1819 bytes [10] Python:PyEval_EvalFrameEx~(0x32ad39d0, 0, 0x33bc94b0, 0) + 16561 bytes [11] Python:PyEval_EvalCodeEx~(0x332919b0, 0x33bc94b0, 0, 0x33ce5294) + 1819 bytes [12] Python:PyEval_EvalFrameEx~(0x33ce5150, 0, 0x33bc94b0, 0) + 16561 bytes [13] Python:PyEval_EvalCodeEx~(0x332918d8, 0x33bc94b0, 0, 0x34d5156c) + 1819 bytes [14] Python:PyEval_EvalFrameEx~(0x34d51430, 0, 0x33bc9300, 0x33bc9300) + 16561 bytes [15] Python:PyEval_EvalCodeEx~(0x33291968, 0x33bc9300, 0x33bc9300, 0) + 1819 bytes [16] Python:PyEval_EvalCode~(0x33291968, 0x33bc9300, 0x33bc9300, 0x9325378f) + 87 bytes [17] Python:PyImport_ExecCodeModuleEx~(0xb097e4cb numpy.core._internal, 0x33291968, 0xb097dbbf /Library/Frameworks/Python.frame.., 0x332a2000) + 193 bytes [18] Python:load_source_module~(1, 0, 0xb097e418 X™C†, 0xb097e41c) + 726 bytes [19] Python:import_submodule~(0xb097e4d6 _internal, 0x33d1ce1c _internal, 9, 0x32e2cfc9) + 293 bytes [20] Python:load_next~(0xb097e4cb numpy.core._internal, 0xb097e8cc, 0xb097e578, 0x32e8b5e6) + 195 bytes [21] Python:import_module_level~(0x32ee4aa0 TD, 0x, 0xee9e70b3, 0x32e6df4d «E) + 142 bytes [22] Python:PyImport_ImportModuleLevel~(0x33d1ce1c _internal, 0x33b25150, 0x33b25150, 0x32ee4aa0 TD) + 45 bytes [23] Python:builtin___import__~(0, 0x348854e0, 0, 0x32e73cfe) + 156 bytes [24] Python:PyObject_Call~(0x33da9a08, 0x348854e0, 0, 0x32e233cd) + 45 bytes [25] Python:PyEval_CallObjectWithKeywords~(0x33da9a08, 0x348854e0, 0, 0x33b25150) + 112 bytes [26] Python:PyEval_EvalFrameEx~(0x33cd2da0, 0, 0x33b25150, 0x33b25150) + 8138 bytes [27] Python:PyEval_EvalCodeEx~(0x2068, 0x33b25150, 0x33b25150, 0) + 1819 bytes [28] Python:PyEval_EvalCode~(0x2068, 0x33b25150, 0x33b25150, 0x9325378f) + 87 bytes [29] Python:PyImport_ExecCodeModuleEx~(0xb097fadb numpy.core, 0x2068, 0xb097ed7f /Library/Frameworks/Python.frame.., 0x4adc73cc) + 193 bytes [30] Python:load_source_module~(1, 0, 0xb097f5cc, 0) + 726 bytes [31] Python:load_package~(5, 0, 0xb097fa28, 0xb097fa2c) + 427 bytes [32] Python:import_submodule~(0xb097fae1 core
Re: [Numpy-discussion] bus error in embedded numpy
I forgot to add it doesn't happen if Py_Finalize isn't called - if I take that out and just let the OS/matlab unload the mex function then I can run it many times without the error, but it does leak a bit of memory each time. Cheers Robin On Fri, Nov 13, 2009 at 5:23 PM, Robin robi...@gmail.com wrote: Hi, I'm trying to embed Python in a MATLAB mex file. I've been coming under some pressure to make my Python code available to my MATLAB colleagues so I am trying to come up with a relatively general way of calling numerical python code from Matlab. I am making some progress - but get a reliable crash from numpy. This only occurs the second time I am loading it. So I Py_Initialize, import numpy, Py_Finalize (all works fine), but then if I clear the mex file (clear funcname in matlab - which calls Py_Finalize through a mexAtExit handler) and try to run the function again (which will reinitialize interpreter and import numpy again) I get the followin stack trace from multisarray.so. Wondered if anyone could through any light. I have already run into this bug http://bugs.python.org/issue6869 which prevents me using ctypes... I wondered if this was related. For now its not such a big problem - I will just avoid unloading the mex function (with clear function). But I thought it might be indicative of a memory leak or some other problem since I think in theory it should work (It does if numpy isn't imported). Cheers Robin Bus error detected at Fri Nov 13 17:11:57 2009 Configuration: MATLAB Version: 7.8.0.347 (R2009a) MATLAB License: 161051 Operating System: Darwin 10.0.0 Darwin Kernel Version 10.0.0: Fri Jul 31 22:47:34 PDT 2009; root:xnu-1456.1.25~1/RELEASE_I386 i386 Window System: The X.Org Foundation (10402000), display /tmp/launch-2p1ZWg/:0 Current Visual: 0x24 (class 4, depth 24) Processor ID: x86 Family 6 Model 15 Stepping 10, GenuineIntel Virtual Machine: Java 1.6.0_15-b03-219 with Apple Inc. Java HotSpot(TM) Client VM mixed mode Default Encoding: ISO-8859-1 Fault Count: 1 Register State: eax = 0001 ebx = 307b12ab ecx = edx = 305ef148 esi = 305ef140 edi = 32a79d60 ebp = b097c6c8 esp = b097c670 eip = 307b14be flg = 00010202 Stack Trace: [0] multiarray.so:PyArray_FromScalar~(0x305ef140, 0, 2, 0x32e23287) + 542 bytes [1] multiarray.so:gentype_nonzero_number~(0x305ef140, 0x2eaa6db0, 0, 0x32e73cfe) + 36 bytes [2] Python:PyObject_IsTrue~(0x305ef140, 0x2ea95de0, 2, 0) + 63 bytes [3] Python:PyEval_EvalFrameEx~(0x33c95160, 0, 0x332941e0, 0) + 12598 bytes [4] Python:PyEval_EvalFrameEx~(0x32a70f80, 0, 0x332941e0, 0) + 24217 bytes [5] Python:PyEval_EvalCodeEx~(0x33f05068, 0x332941e0, 0, 0x32a96794) + 1819 bytes [6] Python:PyEval_EvalFrameEx~(0x32a96640, 0, 0x332941e0, 0) + 16561 bytes [7] Python:PyEval_EvalCodeEx~(0x33f051d0, 0x332941e0, 0, 0x32a6fcc0) + 1819 bytes [8] Python:PyEval_EvalFrameEx~(0x32a6fb60, 0, 0x33bc99c0, 0) + 16561 bytes [9] Python:PyEval_EvalCodeEx~(0x334abda0, 0x33bc99c0, 0, 0x32ad3b24) + 1819 bytes [10] Python:PyEval_EvalFrameEx~(0x32ad39d0, 0, 0x33bc94b0, 0) + 16561 bytes [11] Python:PyEval_EvalCodeEx~(0x332919b0, 0x33bc94b0, 0, 0x33ce5294) + 1819 bytes [12] Python:PyEval_EvalFrameEx~(0x33ce5150, 0, 0x33bc94b0, 0) + 16561 bytes [13] Python:PyEval_EvalCodeEx~(0x332918d8, 0x33bc94b0, 0, 0x34d5156c) + 1819 bytes [14] Python:PyEval_EvalFrameEx~(0x34d51430, 0, 0x33bc9300, 0x33bc9300) + 16561 bytes [15] Python:PyEval_EvalCodeEx~(0x33291968, 0x33bc9300, 0x33bc9300, 0) + 1819 bytes [16] Python:PyEval_EvalCode~(0x33291968, 0x33bc9300, 0x33bc9300, 0x9325378f) + 87 bytes [17] Python:PyImport_ExecCodeModuleEx~(0xb097e4cb numpy.core._internal, 0x33291968, 0xb097dbbf /Library/Frameworks/Python.frame.., 0x332a2000) + 193 bytes [18] Python:load_source_module~(1, 0, 0xb097e418 X™C†, 0xb097e41c) + 726 bytes [19] Python:import_submodule~(0xb097e4d6 _internal, 0x33d1ce1c _internal, 9, 0x32e2cfc9) + 293 bytes [20] Python:load_next~(0xb097e4cb numpy.core._internal, 0xb097e8cc, 0xb097e578, 0x32e8b5e6) + 195 bytes [21] Python:import_module_level~(0x32ee4aa0 TD, 0x, 0xee9e70b3, 0x32e6df4d «E) + 142 bytes [22] Python:PyImport_ImportModuleLevel~(0x33d1ce1c _internal, 0x33b25150, 0x33b25150, 0x32ee4aa0 TD) + 45 bytes [23] Python:builtin___import__~(0, 0x348854e0, 0, 0x32e73cfe) + 156 bytes [24] Python:PyObject_Call~(0x33da9a08, 0x348854e0, 0, 0x32e233cd) + 45 bytes [25] Python:PyEval_CallObjectWithKeywords~(0x33da9a08, 0x348854e0, 0, 0x33b25150) + 112 bytes [26] Python:PyEval_EvalFrameEx~(0x33cd2da0, 0, 0x33b25150, 0x33b25150) + 8138 bytes [27] Python:PyEval_EvalCodeEx~(0x2068, 0x33b25150, 0x33b25150, 0) + 1819 bytes [28] Python:PyEval_EvalCode~(0x2068, 0x33b25150, 0x33b25150
Re: [Numpy-discussion] bus error in embedded numpy
I'm trying to embed Python in a MATLAB mex file. I've been coming under some pressure to make my Python code available to my MATLAB colleagues so I am trying to come up with a relatively general way of calling numerical python code from Matlab. I get a similar but different error trying to do the same with scipy - (importing scipy after a Py_Finalize and matlab function clear) - this time in umath.so: Bus error detected at Fri Nov 13 17:53:43 2009 Configuration: MATLAB Version: 7.8.0.347 (R2009a) MATLAB License: 161051 Operating System: Darwin 10.0.0 Darwin Kernel Version 10.0.0: Fri Jul 31 22:47:34 PDT 2009; root:xnu-1456.1.25~1/RELEASE_I386 i386 Window System:The X.Org Foundation (10402000), display /tmp/launch-2p1ZWg/:0 Current Visual: 0x24 (class 4, depth 24) Processor ID: x86 Family 6 Model 15 Stepping 10, GenuineIntel Virtual Machine: Java 1.6.0_15-b03-219 with Apple Inc. Java HotSpot(TM) Client VM mixed mode Default Encoding: ISO-8859-1 Fault Count: 1 Register State: eax = 33028046 ebx = 327464cb ecx = 3382b110 edx = 330180b0 esi = 304d6ac0 edi = 336e6660 ebp = b08fa8f8 esp = b08fa8cc eip = 23810006 flg = 00010a92 Stack Trace: [0] 0x23810006(0x336e6660, 0x3352e415 remainder, 0x32824d74, 0) [1] umath.so:initumath~(0xb08faebb numpy.core.umath, 0xb08faec6 umath, 0xb08faa07 /Library/Frameworks/Python.frame.., 0xa043aab0) + 2152 bytes [2] Python:_PyImport_LoadDynamicModule~(0xb08faebb numpy.core.umath, 0xb08faa07 /Library/Frameworks/Python.frame.., 0xa043aab0, 0x331ae7d3) + 153 bytes [3] Python:load_module~(3, 0, 0xb08fae08 ∞™C†, 0xb08fae0c) + 203 bytes [4] Python:import_submodule~(0xb08faec6 umath, 0x326eecd4 umath, 5, 0x33148fc9) + 293 bytes [5] Python:load_next~(0xb08faebb numpy.core.umath, 0xb08fb2bc, 0xb08faf68, 0x331a75e6) + 195 bytes [6] Python:import_module_level~(0x33200aa0, 0x, 0xee9e70b3, 0x33189f4d «E) + 142 bytes [7] Python:PyImport_ImportModuleLevel~(0x326eecd4 umath, 0x336340c0, 0x336340c0, 0x33200aa0) + 45 bytes [8] Python:builtin___import__~(0, 0x33845ea0, 0, 0x3318fcfe) + 156 bytes [9] Python:PyObject_Call~(0x326ba7b0, 0x33845ea0, 0, 0x3313f3cd) + 45 bytes [10] Python:PyEval_CallObjectWithKeywords~(0x326ba7b0, 0x33845ea0, 0, 0x336340c0) + 112 bytes [11] Python:PyEval_EvalFrameEx~(0x304d5fd0, 0, 0x336340c0, 0x336340c0) + 8138 bytes [12] Python:PyEval_EvalCodeEx~(0x30386e30, 0x336340c0, 0x336340c0, 0) + 1819 bytes [13] Python:PyEval_EvalCode~(0x30386e30, 0x336340c0, 0x336340c0, 0x9325378f) + 87 bytes [14] Python:PyImport_ExecCodeModuleEx~(0xb08fc4cb numpy.core, 0x30386e30, 0xb08fb76f /Library/Frameworks/Python.frame.., 0x4adc73cc) + 193 bytes [15] Python:load_source_module~(1, 0, 0xb08fbfbc X™C†0D80†370, 0) + 726 bytes [16] Python:load_package~(5, 0, 0xb08fc418, 0xb08fc41c) + 427 bytes [17] Python:import_submodule~(0xb08fc4d1 core, 0x3384512a core.numeric, 4, 0x33148fc9) + 293 bytes [18] Python:load_next~(0xb08fc4cb numpy.core, 0xb08fc8cc, 9, 0x331a75e6) + 195 bytes [19] Python:import_module_level~(0x33200aa0, 0x, 0x746e6920, 0x33189f4d «E) + 213 bytes [20] Python:PyImport_ImportModuleLevel~(0x33845124 numpy.core.numeric, 0x33634ae0, 0x33634ae0, 0x33200aa0) + 45 bytes [21] Python:builtin___import__~(0, 0x33845f30, 0, 0x3318fcfe) + 156 bytes [22] Python:PyObject_Call~(0x326ba7b0, 0x33845f30, 0, 0x3313f3cd) + 45 bytes [23] Python:PyEval_CallObjectWithKeywords~(0x326ba7b0, 0x33845f30, 0, 0x33634ae0) + 112 bytes [24] Python:PyEval_EvalFrameEx~(0x304caf70, 0, 0x33634ae0, 0x33634ae0) + 8138 bytes [25] Python:PyEval_EvalCodeEx~(0x30386e78, 0x33634ae0, 0x33634ae0, 0) + 1819 bytes [26] Python:PyEval_EvalCode~(0x30386e78, 0x33634ae0, 0x33634ae0, 0x9325378f) + 87 bytes [27] Python:PyImport_ExecCodeModuleEx~(0xb08fd68b numpy.lib.type_check, 0x30386e78, 0xb08fcd7f /Library/Frameworks/Python.frame.., 6771) + 193 bytes [28] Python:load_source_module~(1, 0, 0xb08fd5d8, 0xb08fd5dc) + 726 bytes [29] Python:import_submodule~(0xb08fd695 type_check, 0x3038a494 type_check, 10, 0x33148fc9) + 293 bytes [30] Python:load_next~(0xb08fd68b numpy.lib.type_check, 0xb08fda8c, 0xb08fd738, 0x331a75e6) + 195 bytes [31] Python:import_module_level~(0x32703e30, 0x, 0xee9e70b3, 0x33189f4d «E) + 142 bytes [32] Python:PyImport_ImportModuleLevel~(0x3038a494 type_check, 0x33634150, 0x33634150, 0x32703e30) + 45 bytes [33] Python:builtin___import__~(0, 0x33845540, 0, 0x3318fcfe) + 156 bytes [34] Python:PyObject_Call~(0x326ba7b0, 0x33845540, 0, 0x3313f3cd) + 45 bytes [35] Python:PyEval_CallObjectWithKeywords~(0x326ba7b0, 0x33845540, 0, 0x33634150) + 112 bytes [36] Python:PyEval_EvalFrameEx~(0x304b4fd0, 0, 0x33634150, 0x33634150) + 8138 bytes [37] Python:PyEval_EvalCodeEx~(0x32727bf0, 0x33634150, 0x33634150, 0) + 1819
Re: [Numpy-discussion] bus error in embedded numpy
On Fri, Nov 13, 2009 at 6:13 PM, Robert Kern robert.k...@gmail.com wrote: On Fri, Nov 13, 2009 at 12:05, Travis Oliphant oliph...@enthought.com wrote: I wonder if this is related to the fact that you can't unload a dynamically linked module (like NumPy has). So, when you call Py_Finalize you are not really finalizing your usage of Python extension modules. I'm not sure though. Right. We do some global things when numpy is imported. Since there is no unload step for extension modules, we can't undo them. The second time the interpreter starts up, it doesn't know that numpy has already been loaded and that numpy shouldn't try to do those global things again. Thanks for the quick responses... I'm sure you're right - in fact it looks like there was a similar issue here: http://www.mathworks.co.uk/matlabcentral/newsreader/view_thread/262089 I had assumed when matlab unloads the mex function it would also unload python - but it looks like other dynamic libs pulled in from the mex function (in this case python and in turn numpy) aren't unloaded... I wonder if it would be possible to link python statically to my mex function, so it really is unloaded when the mex function is... but I'm getting a bit out of my depth with linker options, and I guess numpy is always loaded dynamically anyway and will stick around. Easy enough to work around it anyway - but just wanted to check what was going on. Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] bus error in embedded numpy
On Fri, Nov 13, 2009 at 6:50 PM, Pauli Virtanen pav...@iki.fi wrote: Fri, 13 Nov 2009 17:23:19 +, Robin wrote: I'm trying to embed Python in a MATLAB mex file. I've been coming under some pressure to make my Python code available to my MATLAB colleagues so I am trying to come up with a relatively general way of calling numerical python code from Matlab. Out of curiosity, how are you handling the Matlab RTLD_GLOBAL issue. Last time [1] I did something similar, I had to hack around it in an ugly way. .. [1] http://www.iki.fi/pav/software/pythoncall/index.html Actually I was completely unaware of it (the RTLD_GLOBAL thing). I had googled for a while trying to find a similar project (I had assumed someone must have done it) but somehow didn't find your pythoncall project. It's great - the interface is very close to what I had in mind, but it's much more comprehensive then anything I thought of (I was hoping to handle just contiguous numpy arrays). How does the RTLD_GLOBAL thing manifest itself? So far I have only a very basic start which basically consists of: cmd = mxArrayToString(prhs[0]); PyRun_SimpleString(cmd); but I haven't noticed anything not working - I can run numpy testsuite, and do simple commands as expected (initiliase arrays in the interpreter, run numpy functions on them). Perhaps recent versions of Matlab behave differently (I am using R2009a on a mac). So far the only remotely tricky thing I did was redirect sys.stdout and sys.stderr to a wrapper that uses mexPrintf so output goes to the matlab console. Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] bus error in embedded numpy
On Fri, Nov 13, 2009 at 7:48 PM, Sturla Molden stu...@molden.no wrote: Robin skrev: I had assumed when matlab unloads the mex function it would also unload python - but it looks like other dynamic libs pulled in from the mex function (in this case python and in turn numpy) aren't unloaded... Matlab MEX functions are DLLs, Python interpreter is a DLL, NumPy .pyd files are DLLs... If you cannot unload Python or NumPy, you cannot unload MEX functions either. The same OS constraints apply. Ah, I hadn't realised it was an OS constraint - I thought it was possible to unload dlls - and that was why matlab provides the clear function. mex automatically clears a function when you rebuild it - I thought that was how you can rebuild and reload mex functions without restarting matlab. I thought it was just a quirk of matlab that there was no way to unload other libraries pulled in through the mex function. I must admit to being a bit out of my depth with this stuff though so I stand corrected. If you are using Windows, I would recommend that you expose your NumPy code as a COM object using pywin32, and make a Matlab wrapper (ordinary .m file) that calls the COM object. If you are not using Windows, you could create an XMLRPC server in Python and call that using Matlab's built-in Java VM. Or you can spawn a separate Python process from Matlab, and communicate using pipes or a tcp/ip socket (Matlab's Java or MEX). There are many solutions that are easier than embedding Python in Matlab. I really want a cross platform solution so that rules out COM. I thought about using web services or something which I guess would be the easiest way to do communication through a tcp socket (least work since I could use existing web services libraries on both sides). But actually I have found it pretty easy to embed Python so far... with about 5 lines of code I'm able run simple strings and with a small cython module I get stdout to the console. I haven't tried passing data back and forth yet but from Pauli's pythoncall it doesn't look like that is too difficult. I was sort of hoping that eventually I could create a numpy array from a matlab data pointer - at least for read only input - so that I could use python code to work on large amounts of data without so much memory overhead (returning smaller results by copying). Do you think I'm likely to run into more problems? I got the feeling from asking questions on IRC that embedding Python is kind of discouraged but I thought in this case it seemed the neatest way. Be careful when you are using file handles. You have to be sure that Matlab, Python and NumPy are all linked against the same CRT. I was aware of this - I thought I would be OK on the mac - at least Python and Numpy and my mex function are built with apple gcc although I'm not sure about Matlab. I guess Windows will be more difficult... But in any case I don't plan to pass any file handles around. Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] f2py-users list not working
On Wed, Nov 4, 2009 at 2:38 PM, George Nurser gnur...@googlemail.com wrote: Fortran can accept preprocessor directives, but f2py cannot. You first need to preprocess a .F (or .F90) file to create a .f (or .f90) file which you then pass to f2py The way I preprocess the .F file is to have statements like int int*INTSIZE :: i,j,k So preprocess file.F e.g. in gfortran with gfortran -E -DINTSIZE=8 file.F -o outdir/file.f The outdir is necessary in a case-insensitive file system (like default mac OSX) to prevent the .f files overwriting the .F file. Alternatively, it may be possible to use some other suffix than .f, but I've not tried that. Then f2py file.f That's great thanks very much! It's more or less exactly what I was hoping for. I wonder if it's possible to get distutils to do the preprocess step from a setup.py script? Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] strange performance on mac 2.5/2.6 32/64 bit
Hi, I'm not sure if this is of much interest but it's been really puzzling me so I thought I'd ask. In an earlier post I described how I was surprised a simple f2py wrapped fortran bincount was 4x faster than np.bincount - but that differential only seemed to be on my mac; on moving to linux they both took more or less the same time. I'm trying to work out if it is worth moving some of my bottlenecks to fortran (most of which are np builtins). So far it looks like it is - but only on my mac and only 32bit (see below). Well the only explanation I thought was that the gcc-4.0 used to build numpy on a mac didn't perform so well, so after upgrading to snow leopard I've been trying to look at this again. I was hoping I could get the equivalent performance on my mac, like on linux, which would result in the np c stuff being a couple of times faster. So far, with Python 2.6.3 in 64 bit - numpy seems to be significantly slower and my fortran code _much_ slower - even from the same compiler. Can anyone help me understand what is going on? I have only been able to build 32 bit numpy against 2.5.4 with apple gcc-4.0 and 64 bit numpy against 2.6.3 universal with gcc-4.2. I haven't been able to get a numpy I can import on 2.6.3 in 32 bit mode ( http://projects.scipy.org/numpy/ticket/1221 ). Here are the results for python.org 32 bit 2.5.4, numpy compiled with apple gcc 4.0, f2py using att gfortran 4.2: In [2]: timeit x = np.random.random_integers(0,1023,1).astype(int) 1 loops, best of 3: 2.86 s per loop In [3]: x = np.random.random_integers(0,1023,1).astype(int) In [4]: timeit np.bincount(x) 1 loops, best of 3: 435 ms per loop In [6]: timeit gf42.bincount(x,1024) 10 loops, best of 3: 129 ms per loop In [7]: np.__version__ Out[7]: '1.4.0.dev7618' And for self-built (apple gcc 4.2) 64 bit 2.6.3, numpy compiled with apple gcc 4.2, f2py using the same att gfortran 4.2: In [3]: timeit x = np.random.random_integers(0,1023,1).astype(int) 1 loops, best of 3: 3.91 s per loop # 37% slower than 32bit In [4]: x = np.random.random_integers(0,1023,1).astype(int) In [5]: timeit np.bincount(x) 1 loops, best of 3: 582 ms per loop # 34 % slower than 32 bit In [8]: timeit gf42_64.bincount(x,1024) 1 loops, best of 3: 803 ms per loop # 522% slower than 32 bit So why is there this big difference in performance? I'd really like to know why the fortran compiled with the same compiler is so much slower in 64 bit mode. As far as I can tell the flags used are the same. Also why is numpy slower. I was surprised the I was able to import the 64 bit universal module built with f2py from 2.6 inside 32 bit 3.5 and there it was quick again - so it seems the x64_64 code generated by the fortran compiler is much slower (rather than any wrappers or such). I tried using some more recent gfortrans from macports - but could only use them to build modules against the 64 bit python/numpy since I couldn't find a way to get f2py to force 32 bit output. But the performance was more or less the same (always several times slower the 32 bit att gfortran). Any advice appreciated. Cheers Robin subroutine bincount (x,c,n,m) implicit none integer, intent(in) :: n,m integer, dimension(0:n-1), intent(in) :: x integer, dimension(0:m-1), intent(out) :: c integer :: i c = 0 do i = 0, n-1 c(x(i)) = c(x(i)) + 1 end do end ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] strange performance on mac 2.5/2.6 32/64 bit
After some more pootling about I figured out a lot of the performance loss comes from using 32 bit integers by default when compiles 64 bit. I asked this question on stackoverflow: http://stackoverflow.com/questions/1668899/fortran-32-bit-64-bit-performance-portability is there any way to use fortran with f2py from python in a way that doesn't require the code to be changed depending on platform? Or should I just pack it all in and use weave? Robin On Tue, Nov 3, 2009 at 4:29 PM, Robin robi...@gmail.com wrote: Hi, I'm not sure if this is of much interest but it's been really puzzling me so I thought I'd ask. In an earlier post I described how I was surprised a simple f2py wrapped fortran bincount was 4x faster than np.bincount - but that differential only seemed to be on my mac; on moving to linux they both took more or less the same time. I'm trying to work out if it is worth moving some of my bottlenecks to fortran (most of which are np builtins). So far it looks like it is - but only on my mac and only 32bit (see below). Well the only explanation I thought was that the gcc-4.0 used to build numpy on a mac didn't perform so well, so after upgrading to snow leopard I've been trying to look at this again. I was hoping I could get the equivalent performance on my mac, like on linux, which would result in the np c stuff being a couple of times faster. So far, with Python 2.6.3 in 64 bit - numpy seems to be significantly slower and my fortran code _much_ slower - even from the same compiler. Can anyone help me understand what is going on? I have only been able to build 32 bit numpy against 2.5.4 with apple gcc-4.0 and 64 bit numpy against 2.6.3 universal with gcc-4.2. I haven't been able to get a numpy I can import on 2.6.3 in 32 bit mode ( http://projects.scipy.org/numpy/ticket/1221 ). Here are the results for python.org 32 bit 2.5.4, numpy compiled with apple gcc 4.0, f2py using att gfortran 4.2: In [2]: timeit x = np.random.random_integers(0,1023,1).astype(int) 1 loops, best of 3: 2.86 s per loop In [3]: x = np.random.random_integers(0,1023,1).astype(int) In [4]: timeit np.bincount(x) 1 loops, best of 3: 435 ms per loop In [6]: timeit gf42.bincount(x,1024) 10 loops, best of 3: 129 ms per loop In [7]: np.__version__ Out[7]: '1.4.0.dev7618' And for self-built (apple gcc 4.2) 64 bit 2.6.3, numpy compiled with apple gcc 4.2, f2py using the same att gfortran 4.2: In [3]: timeit x = np.random.random_integers(0,1023,1).astype(int) 1 loops, best of 3: 3.91 s per loop # 37% slower than 32bit In [4]: x = np.random.random_integers(0,1023,1).astype(int) In [5]: timeit np.bincount(x) 1 loops, best of 3: 582 ms per loop # 34 % slower than 32 bit In [8]: timeit gf42_64.bincount(x,1024) 1 loops, best of 3: 803 ms per loop # 522% slower than 32 bit So why is there this big difference in performance? I'd really like to know why the fortran compiled with the same compiler is so much slower in 64 bit mode. As far as I can tell the flags used are the same. Also why is numpy slower. I was surprised the I was able to import the 64 bit universal module built with f2py from 2.6 inside 32 bit 3.5 and there it was quick again - so it seems the x64_64 code generated by the fortran compiler is much slower (rather than any wrappers or such). I tried using some more recent gfortrans from macports - but could only use them to build modules against the 64 bit python/numpy since I couldn't find a way to get f2py to force 32 bit output. But the performance was more or less the same (always several times slower the 32 bit att gfortran). Any advice appreciated. Cheers Robin subroutine bincount (x,c,n,m) implicit none integer, intent(in) :: n,m integer, dimension(0:n-1), intent(in) :: x integer, dimension(0:m-1), intent(out) :: c integer :: i c = 0 do i = 0, n-1 c(x(i)) = c(x(i)) + 1 end do end ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] strange performance on mac 2.5/2.6 32/64 bit
On Tue, Nov 3, 2009 at 6:14 PM, Robin robi...@gmail.com wrote: After some more pootling about I figured out a lot of the performance loss comes from using 32 bit integers by default when compiles 64 bit. I asked this question on stackoverflow: http://stackoverflow.com/questions/1668899/fortran-32-bit-64-bit-performance-portability is there any way to use fortran with f2py from python in a way that doesn't require the code to be changed depending on platform? Including the -DF2PY_REPORT_ON_ARRAY_COPY option showed that the big performance hit was from f2py copying the arrays to cast from 64 bit to 32 bit. Is there a recommended way to easily write fortran extensions that work on both 32bit and 64bit machines (something like using -fdefault-int-8 and f2py not casting on a 64 bit platform, and not using the option and not casting on a 32 bit platform)? Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] f2py error with -DF2PY_REPORT_ATEXIT
Hi, When I try to build a fortran module with f2py from '1.4.0.dev7618' with gfortran 4.2.3 from att.com and apple gcc 4.2 on snow leopard I get the following error when I try to use -DF2PY_REPORT_ATEXIT: In [1]: import gfint --- ImportError Traceback (most recent call last) /Users/robince/fortran/ipython console in module() ImportError: dlopen(./gfint.so, 2): Symbol not found: _on_exit Referenced from: /Users/robince/fortran/gfint.so Expected in: dynamic lookup Is numpy trac the place for f2py tickets? Or is there an obvious fix. Inspecting gfint.so shows the same symbols for both architectures, and _on_exit is listed there with a U which I guess means undefined. Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] f2py error with -DF2PY_REPORT_ATEXIT
Just noticed this is only supported on linux - sorry for the noise (not having a very good day today!) Robin On Tue, Nov 3, 2009 at 6:47 PM, Robin robi...@gmail.com wrote: Hi, When I try to build a fortran module with f2py from '1.4.0.dev7618' with gfortran 4.2.3 from att.com and apple gcc 4.2 on snow leopard I get the following error when I try to use -DF2PY_REPORT_ATEXIT: In [1]: import gfint --- ImportError Traceback (most recent call last) /Users/robince/fortran/ipython console in module() ImportError: dlopen(./gfint.so, 2): Symbol not found: _on_exit Referenced from: /Users/robince/fortran/gfint.so Expected in: dynamic lookup Is numpy trac the place for f2py tickets? Or is there an obvious fix. Inspecting gfint.so shows the same symbols for both architectures, and _on_exit is listed there with a U which I guess means undefined. Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] f2py-users list not working
I just tried to send the message below to f2py-users - f2py-us...@cens.ioc.ee, but delivery failed. Not sure where else to report this so hopefully here is ok. Cheers Robin -- Forwarded message -- From: Mail Delivery Subsystem mailer-dae...@googlemail.com Date: Tue, Nov 3, 2009 at 9:40 PM Subject: Delivery Status Notification (Failure) To: robi...@gmail.com Delivery to the following recipient failed permanently: f2py-us...@cens.ioc.ee Technical details of permanent failure: Google tried to deliver your message, but it was rejected by the recipient domain. We recommend contacting the other email provider for further information about the cause of this error. The error that the other server returned was: 550 550 Unrouteable address (state 14). - Original message - Subject: writing module for 32/64 bit From: Robin robi...@gmail.com To: f2py-us...@cens.ioc.ee Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Hi, I would like to write some subroutines in fortran involving integers and distribute them in a small package. Users might be on 32 bit or 64 bit. Is there an easy or recommended way to approach this? Ideally I would like to work with the native integer type - but if I use 'integer' in fortran it is always 32 bit and f2py converts input aways when numpy is 64 bit. If I use integer*8 in the code then its fine for 64 bit, but on 32 bit platforms, both f2py has to convert and its not the native integer size. What I (think) I'd like to do is be able to use the native platform integer type, like numpy does, and then not worry about. I found there are options like -fdefault-int-8 for gfortran, but when I add that stuff breaks (bus error) since I guess f2py doesn't know about it and is converting and passing 32 bits anyway. Is there any way around this, or do I need to maintain 2 different versions with different fortran code for 32bit/64bit? Or is it possible to acheive something like this with preprossor #ifdefs? (not sure how this works with fortran, or if f2py would be aware of it). Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] f2py-users list not working
On Tue, Nov 3, 2009 at 10:05 PM, Charles R Harris charlesr.har...@gmail.com wrote: I believe f2py development has moved elsewhere, the new mailing list seems to be at http://groups.google.com/group/f2py-dev?pli=1 . It doesn't look to be very active. Perhaps the summer additions to cython adding some fortran support might be useful to you. I don't know much about that but perhaps Dag will offer some advice. Thanks - from the archives of f2py-users http://cens.ioc.ee/pipermail/f2py-users/ it looks like there was activity as recently as October (15th) which is why I assumed it was still supposed to be working. Certainly more traffic more recently than the google group. Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] recommended way to run numpy on snow leopard
On Fri, Oct 23, 2009 at 9:09 AM, David Warde-Farley d...@cs.toronto.eduwrote: The Python.org sources for 2.6.x has a script in the Mac/ subdirectory (I think, or in the build tools) for building a 4-way universal binary (i386, x86_64, ppc and ppc64). You can rather easily build it (just run the script) and it will produce executables of the form python (or python2.6) suffixed with -32 or -64 to run in one mode or the other. So, python-32 (or python2.6-32) will get you 32 bit Python, which will work with wxPython using wxMac, or python-64, which will not (but will do everything in 64-bit mode). I've successfully gotten svn numpy to build 4-way using such a 4-way Python. After having some trouble I decided to try this way to build universal 32/64 bit intel framework build and just use that as my main python for my work. (Had some problems with macports and virtualenv, I want to leave the system one alone and theres no 64 bit python.org build). Just in case any one else tries this - there is a problem where it's impossible to select the 32 bit architecture: http://bugs.python.org/issue6834 It might be possible to work around or use the alternative pythonw.c in the ticket - but it won't be fixed in a release until 2.7. Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] recommended way to run numpy on snow leopard
Hi, I was wondering what the recommended way to run numpy/scipy on mac os x 10.6 is. I understood previously it was recommended to use python.org python and keep everything seperate from the system python, which worked well. But now I would like to have a 64 bit python and numpy, and there isn't one available from python.org. Also I think the python.org ones are built against the 10.4 SDK which I understand requires using gcc 4.0 - I was keen to try 4.2 to see if some of the differential performance I've seen between c and fortran goes away. So I guess the choices are system python with a virtualenv or a macports built python (or a hand built python). I'm thinking of using macports at the moment but I'm not sure how to handle preventing macports numpy from installing so I can use svn numpy. I'm not sure how the virtualenv will work with packaged installers - (ie how could I tell the wx installer to install into the virtualenv). I was wondering what others do? Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] recommended way to run numpy on snow leopard
On Wed, Oct 21, 2009 at 10:28 AM, David Cournapeau da...@ar.media.kyoto-u.ac.jp wrote: Robin wrote: Hi, I was wondering what the recommended way to run numpy/scipy on mac os x 10.6 is. I understood previously it was recommended to use python.org python and keep everything seperate from the system python, which worked well. You can simply use the --user option to the install command: instead of installing in /System, it will install numpy (or any other package) in $HOME/.local, and you don't need to update PYTHONPATH, as python knows about this location. Thanks - that looks ideal. I take it $HOME/.local is searched first so numpy will be used fromt here in preference to the system numpy. My only worry is with installer packages - I'm thinking mainly of wxpython. Is there a way I can get that package to install in $HOME/.local. (The installer only seems to let you choose a drive). Also - if I build for example vim against the system python, will I be able to see packages in $HOME/.local from the python interpreter inside vim? Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] recommended way to run numpy on snow leopard
Thanks... On Wed, Oct 21, 2009 at 11:41 AM, David Cournapeau da...@ar.media.kyoto-u.ac.jp wrote: Robin wrote: Thanks - that looks ideal. I take it $HOME/.local is searched first so numpy will be used fromt here in preference to the system numpy. Yes, unless framework-enabled python does something 'fishy' (I think framework vs convention python have different rules w.r.t. sys.path). As always, in doubt, you should check with numpy.__file__ which one is loaded. It seems it does... the built in numpy which is in '/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python', comes before $HOME/.local in sys.path so I think system numpy will always be picked up over my own installed version. robin-mbp:~ robince$ /usr/bin/python2.6 -c import sys; print sys.path ['', '/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python26.zip', '/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6', '/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/plat-darwin', '/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/plat-mac', '/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/plat-mac/lib-scriptpackages', '/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python', '/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-tk', '/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-old', '/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload', '/Users/robince/.local/lib/python2.6/site-packages', '/Library/Python/2.6/site-packages', '/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/PyObjC', '/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/wx-2.8-mac-unicode'] So I guess virtualenv or macports? Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] fortran vs numpy on mac/linux - gcc performance?
Hi, I have been looking at moving some of my bottleneck functions to fortran with f2py. To get started I tried some simple things, and was surprised they performend so much better than the number builtins - which I assumed would be c and would be quite fast. On my Macbook pro laptop (Intel core 2 duo) I got the following results. Numpy is built with xcode gcc 4.0.1 and gfortran is 4.2.3 - fortran code for shuffle and bincount below: In [1]: x = np.random.random_integers(0,1023,100).astype(int) In [2]: import ftest In [3]: timeit np.bincount(x) 100 loops, best of 3: 3.97 ms per loop In [4]: timeit ftest.bincount(x,1024) 1000 loops, best of 3: 1.15 ms per loop In [5]: timeit np.random.shuffle(x) 1 loops, best of 3: 605 ms per loop In [6]: timeit ftest.shuffle(x) 10 loops, best of 3: 139 ms per loop So fortran was about 4 times faster for these loops - similarly faster than cython as well. So I was really happy as these are two of my biggest bottlenecks, but when I moved a linux workstation I got different results. Here with gcc/gfortran 4.3.3 : In [3]: x = np.random.random_integers(0,1023,100).astype(int) In [4]: timeit np.bincount(x) 100 loops, best of 3: 8.18 ms per loop In [5]: timeit ftest.bincount(x,1024) 100 loops, best of 3: 8.25 ms per loop In [6]: In [7]: timeit np.random.shuffle(x) 1 loops, best of 3: 379 ms per loop In [8]: timeit ftest.shuffle(x) 10 loops, best of 3: 172 ms per loop So shuffle is a bit faster, but bincount is now the same as fortran. The only thing I can think is that it is due to much better performance of the more recent c compiler. I think this would also explain why f2py extension was performing so much better than cython on the mac. So my question is - is there a way to build numpy with a more recent compiler on leopard? (I guess I could upgrade to snow leopard now) - Could I make the numpy install use gcc-4.2 from xcode or would it break stuff? Could I use gcc 4.3.3 from macports? It would be great to get a 4x speed up on all numpy c loops! (already just these two functions I use a lot would make a big difference). Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] fortran vs numpy on mac/linux - gcc performance?
Forgot to include the fortran code used: jm-g26b101:fortran robince$ cat test.f95 subroutine bincount (x,c,n,m) implicit none integer, intent(in) :: n,m integer, dimension(0:n-1), intent(in) :: x integer, dimension(0:m-1), intent(out) :: c integer :: i c = 0 do i = 0, n-1 c(x(i)) = c(x(i)) + 1 end do end subroutine shuffle (x,s,n) implicit none integer, intent(in) :: n integer, dimension(n), intent(in) :: x integer, dimension(n), intent(out) :: s integer :: i,randpos,temp real :: r ! copy input s = x call init_random_seed() ! knuth shuffle from http://rosettacode.org/wiki/Knuth_shuffle#Fortran do i = n, 2, -1 call random_number(r) randpos = int(r * i) + 1 temp = s(randpos) s(randpos) = s(i) s(i) = temp end do end subroutine init_random_seed() ! init_random_seed from gfortran documentation integer :: i, n, clock integer, dimension(:), allocatable :: seed call random_seed(size = n) allocate(seed(n)) call system_clock(count=clock) seed = clock + 37 * (/ (i - 1, i = 1, n) /) call random_seed(put = seed) deallocate(seed) end subroutine ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] extension questions: f2py and cython
Hi, Sent this last week but checking the archives it appears not to have got through. Hopefully this will work... I am looking at moving some of my code to fortran to use with f2py. To get started I used this simple example: SUBROUTINE bincount (x,c,n,m) IMPLICIT NONE INTEGER, INTENT(IN) :: n,m INTEGER, DIMENSION(n), INTENT(IN) :: x INTEGER, DIMENSION(0:m-1), INTENT(OUT) :: c INTEGER :: i DO i = 1, n c(x(i)) = c(x(i)) + 1 END DO END It performs well: In [1]: x = np.random.random_integers(0,1023,100).astype(int) In [4]: timeit test.bincount(x,1024) 1000 loops, best of 3: 1.16 ms per loop In [5]: timeit np.bincount(x) 100 loops, best of 3: 4 ms per loop I'm guessing most of the benefit comes from less checking + not having to find the maximum value (which I pass as parameter m). But I have some questions. It seems to work as is, but I don't set c to zeros anywhere. Can I assume arrays created by f2py are zero? Is this the recommended way to use f2py with arrays? (I initially tried using assumed arrays with DIMENSION(:) but it couldn't get it to work). Also I'm quite new to fortran - what would be the advantages, if any, of using a FORALL instead of DO in a situation like this? I guess with 1D arrays it doesn't make a copy since ordering is not a problem, but if it was 2D arrays am I right in thinking that if I passed in a C order array it would automatically make a copy to F order. What about the return - will I get a number array in F order, or will it automatically be copied to C order? (I guess I will see but I haven't got onto trying that yet). What if I wanted to keep all the array creation in numpy - ie call it as the fortran subroutine bincount(x,c) and have c modified in place? Should I be using !f2py comments? I wasn't clear if these are needed - it seems to work as is but could they give any improvement? For comparison I tried the same thing in cython - after a couple of iterations with not typing things properly I ended up with: import numpy as np cimport numpy as np cimport cython @cython.boundscheck(False) def bincount(np.ndarray[np.int_t, ndim=1] x not None,int m): cdef int n = x.shape[0] cdef unsigned int i cdef np.ndarray[np.int_t, ndim=1] c = np.zeros(m,dtype=np.int) for i from 0 = i n: c[unsigned intx[i]] += 1 return c which now performs a bit better than np.bincount, but still significantly slower than the fortran. Is this to be expected or am I missing something in the cython? In [14]: timeit ctest.bincount(x,1024) 100 loops, best of 3: 3.31 ms per loop Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] extension questions: f2py and cython
Hi, I have another question about distributing a Python extension which uses f2py wrapped code. Ideally I'd like to keep pure Python/Numpy alternatives and just use fortran version if available - but I think that should be OK. I'm more worried about distributing binaries on Windows - I think on Mac/Linux it would be ok to have a fortran compiler required and build it - but on Windows I guess one should really distribute binaries. What is the recommended (free) fortran 95 compiler for use with f2py on windows (gfortan with cygwin?) Is it possible to get f2py to build a static library on windows so I can just distribute that? Or will I need to include library files from the compiler? How many different binary versions does one need to support common recent windows setups? I guess I need a different binary for each major python version and 32/64 bits (ie 2.5 32bit, 2.6 32bit, 2.5 64bit, 2.6 64bit). Is this right, or would different binaries be required for XP, Vista, 7 etc. ? Can anyone point me to a smallish Python package that includes fortran code in this way that I could look to for inspiration? Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] extension questions: f2py and cython
Hi, Thanks On Thu, Oct 15, 2009 at 4:10 PM, Andrew Hawryluk hawr...@novachem.com wrote: But I have some questions. It seems to work as is, but I don't set c to zeros anywhere. Can I assume arrays created by f2py are zero? As I understand it, uninitialized variables in Fortran are compiler/system-dependent. Some compilers initialize values to zero, many leave the previous contents of the memory in place. It is safest to never use the value of an uninitialized variable. But in this case I understood it was initialised or created by the f2py wrapper first and then passed to the fortran subroutine - so I wondered how f2py creates it (I think I traced it to array_from_pyobj() but I couldn't really understand what it was doing or whether it would always be zeros). I guess as you say though it is always safer to initialize explicitly Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Not enough storage for memmap on 32 bit WinXP for accumulated file size above approx. 1 GB
On Mon, Jul 27, 2009 at 11:37 AM, Kim Hansenslaun...@gmail.com wrote: The machine is new and shiny with loads of processing power and many TB of HDD storage. I am however bound to 32 bits Win XP OS as there are some other costum made third-party and very expensive applications running on that machine (which generate the large files I analyze), which can only run on 32 bits, oh well You could think about using some kind of virtualisation - this is exactly the sort of situation where I find it really useful. You can run a 64 bit host OS, then have 32 bit XP as a 'guest' in VMware or Virtualbox or some other virtualisation software. With recent CPU's there is very little performance penalty (running 32bit on a 64bit host) and it can be very convenient (it is easy to map network drives between guest and host which perform very well since the 'network' is virtual) Cheers Robin ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
On Mon, Jun 8, 2009 at 7:14 PM, David Warde-Farleyd...@cs.toronto.edu wrote: On 8-Jun-09, at 8:33 AM, Jason Rennie wrote: Note that EM can be very slow to converge: That's absolutely true, but EM for PCA can be a life saver in cases where diagonalizing (or even computing) the full covariance matrix is not a realistic option. Diagonalization can be a lot of wasted effort if all you care about are a few leading eigenvectors. EM also lets you deal with missing values in a principled way, which I don't think you can do with standard SVD. EM certainly isn't a magic bullet but there are circumstances where it's appropriate. I'm a big fan of the ECG paper too. :) Hi, I've been following this with interest... although I'm not really familiar with the area. At the risk of drifting further off topic I wondered if anyone could recommend an accessible review of these kind of dimensionality reduction techniques... I am familiar with PCA and know of diffusion maps and ICA and others, but I'd never heard of EM and I don't really have any idea how they relate to each other and which might be better for one job or the other... so some sort of primer would be really handy. Cheers Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] NameError: name 'numeric' is not defined
Hi, I just updated to latest numpy svn: In [10]: numpy.__version__ Out[10]: '1.4.0.dev7039' It seemed to build fine, but I am getting a lot of errors testing it: -- Ran 178 tests in 0.655s FAILED (errors=138) Out[8]: nose.result.TextTestResult run=178 errors=138 failures=0 Almost all the errors look the same: == ERROR: test_shape (test_ctypeslib.TestNdpointer) -- Traceback (most recent call last): File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/tests/test_ctypeslib.py, line 83, in test_shape self.assert_(p.from_param(np.array([[1,2]]))) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ctypeslib.py, line 171, in from_param return obj.ctypes File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/__init__.py, line 27, in module __all__ += numeric.__all__ NameError: name 'numeric' is not defined I haven't seen this before - is it something wrong with my build or the current svn state? I am using macports python 2.5.4 on os x 10.5.7 Cheers Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] NameError: name 'numeric' is not defined
On Sun, Jun 7, 2009 at 12:53 AM, Robinrobi...@gmail.com wrote: I haven't seen this before - is it something wrong with my build or the current svn state? I am using macports python 2.5.4 on os x 10.5.7 Hmmm... after rebuilding from the same version the problem seems to have gone away... sorry for the noise... Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Problem with correlate
On Tue, Jun 2, 2009 at 11:36 AM, David Cournapeau courn...@gmail.com wrote: Done in r7031 - correlate/PyArray_Correlate should be unchanged, and acorrelate/PyArray_Acorrelate implement the conventional definitions, I don't know if it's been discussed before but while people are thinking about/changing correlate I thought I'd like to request as a user a matlab style xcorr function (basically with the functionality of the matlab version). I don't know if this is a deliberate emission, but it is often one of the first things my colleagues try when I get them using Python, and as far as I know there isn't really a good answer. There is xcorr in pylab, but it isn't vectorised like xcorr from matlab... Cheers Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] copy and paste arrays from matlab
[crossposted to numpy-discussion and mlabwrap-user] Hi, Please find attached Python code for the opposite direction - ie format Python arrays for copy and pasting into an interactive Matlab session. It doesn't look as nice because newlines are row seperators in matlab so I put everything on one line. Also theres no way to input 2D arrays in Matlab that I know of without using reshape. In [286]: from mmat import mmat In [289]: x = rand(4,2) In [290]: mmat(x,'%2.3f') [ 0.897 0.074 ; 0.005 0.174 ; 0.207 0.736 ; 0.453 0.111 ] In [287]: mmat(x,'%2.3f') reshape([ [ 0.405 0.361 0.609 ; 0.249 0.275 0.620 ; 0.740 0.754 0.699 ; 0.280 0.053 0.181 ] [ 0.796 0.114 0.720 ; 0.296 0.692 0.352 ; 0.218 0.894 0.818 ; 0.709 0.946 0.860 ] ],[ 4 3 2 ]) In [288]: mmat(x) reshape([ [ 4.046905655728e-01 3.605995195844e-01 6.089653771166e-01 ; 2.491999503702e-01 2.751880043180e-01 6.199629932480e-01 ; 7.401974485581e-01 7.537929345351e-01 6.991798908866e-01 ; 2.800494872019e-01 5.258468515210e-02 1.812706305994e-01 ] [ 7.957907133899e-01 1.144010574386e-01 7.203522053853e-01 ; 2.962977637560e-01 6.920657079182e-01 3.522371076632e-01 ; 2.181950954650e-01 8.936401263709e-01 8.177351741233e-01 ; 7.092517323839e-01 9.458774967489e-01 8.595104463863e-01 ] ],[ 4 3 2 ]) Hope someone else finds it useful. Cheers Robin On Tue, May 12, 2009 at 2:12 PM, Robin robi...@gmail.com wrote: [crossposted to numpy-discussion and mlabwrap-user] Hi, I wrote a little utility class in Matlab that inherits from double and overloads the display function so you can easily print matlab arrays of arbitrary dimension in Numpy format for easy copy and pasting. I have to work a lot with other peoples code - and while mlabwrap and reading and writing is great, sometimes I find it easier and quicker just to copy and paste smaller arrays between interactive sessions. Anyway you put it in your Matlab path then you can do x = rand(2,3,4,5); a = array(x) You can specify the fprintf style format string either in the constructor or after: a = array(x,'%2.6f') a.format = '%2.2f' eg: x = rand(4,3,2); array(x) ans = array([[[2.071566461449581e-01, 3.501602151029837e-02], [1.589135260727248e-01, 3.766891927380323e-01], [8.757206127846399e-01, 7.259276565938600e-01]], [[7.570839415557700e-01, 3.974969411279816e-02], [8.109207856487061e-01, 5.043242527988604e-01], [6.351863794630047e-01, 7.013280585980169e-01]], [[8.863281096304466e-01, 9.885678912262633e-01], [4.765077527169480e-01, 7.634956792870943e-01], [9.728134909163066e-02, 4.588908258125032e-01]], [[4.722298594969571e-01, 6.861815984603373e-01], [1.162875322461844e-01, 4.887479677951201e-02], [9.084394562396312e-01, 5.822948089552498e-01]]]) It's a while since I've tried to do anything like this in Matlab and I must admit I found it pretty painful, so I hope it can be useful to someone else! I will try and do one for Python for copying and pasting to Matlab, but I'm expecting that to be a lot easier! Cheers Robin mmat.py Description: Binary data ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] copy and paste arrays from matlab
[crossposted to numpy-discussion and mlabwrap-user] Hi, I wrote a little utility class in Matlab that inherits from double and overloads the display function so you can easily print matlab arrays of arbitrary dimension in Numpy format for easy copy and pasting. I have to work a lot with other peoples code - and while mlabwrap and reading and writing is great, sometimes I find it easier and quicker just to copy and paste smaller arrays between interactive sessions. Anyway you put it in your Matlab path then you can do x = rand(2,3,4,5); a = array(x) You can specify the fprintf style format string either in the constructor or after: a = array(x,'%2.6f') a.format = '%2.2f' eg: x = rand(4,3,2); array(x) ans = array([[[2.071566461449581e-01, 3.501602151029837e-02], [1.589135260727248e-01, 3.766891927380323e-01], [8.757206127846399e-01, 7.259276565938600e-01]], [[7.570839415557700e-01, 3.974969411279816e-02], [8.109207856487061e-01, 5.043242527988604e-01], [6.351863794630047e-01, 7.013280585980169e-01]], [[8.863281096304466e-01, 9.885678912262633e-01], [4.765077527169480e-01, 7.634956792870943e-01], [9.728134909163066e-02, 4.588908258125032e-01]], [[4.722298594969571e-01, 6.861815984603373e-01], [1.162875322461844e-01, 4.887479677951201e-02], [9.084394562396312e-01, 5.822948089552498e-01]]]) It's a while since I've tried to do anything like this in Matlab and I must admit I found it pretty painful, so I hope it can be useful to someone else! I will try and do one for Python for copying and pasting to Matlab, but I'm expecting that to be a lot easier! Cheers Robin array.m Description: Binary data ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] setterr question
Hi, I have been using seterr to try to catch where Nans are appearing in my analysis. I used all: 'warn' which worked the first time I ran the function, but as specified in the documentation it only warns 'once only'. Is there a way I can reset the count so it will warn again, without loosing my session? Thanks Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] setterr question
2009/4/29 Stéfan van der Walt ste...@sun.ac.za: 2009/4/29 Robin robi...@gmail.com: I have been using seterr to try to catch where Nans are appearing in my analysis. I used all: 'warn' which worked the first time I ran the function, but as specified in the documentation it only warns 'once only'. Is there a way I can reset the count so it will warn again, without loosing my session? Try import warnings warnings.simplefilter('always') Thanks very much, I thought it would be an easy one! Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Built Lapack, Atlas from source.... now numpy.linalg.eig() hangs at 100% CPU
2009/3/28 Chris Colbert sccolb...@gmail.com: Alright, building numpy against atlas from the repositories works, but this atlas only contains the single threaded libraries. So i would like to get my build working completely. It doesn't help at all with your problem - but I thought I'd point out there are other ways to exploit multicore machines than using threaded ATLAS (if that is your goal). For example, I use single threaded libraries and control parallel execution myself using multiprocessing module (this is easier for simple batch jobs, but might not be appropriate for your case). There is some information about this on the wiki: http://scipy.org/ParallelProgramming Cheers Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] indexing question
Hi, I have an indexing problem, and I know it's a bit lazy to ask the list, sometime when people do interesting tricks come up so I hope no one minds! I have a 2D array X.shape = (a,b) and I want to change it into new array which is shape (2,(a*b)) which has the following form: [ X[0,0], X[0,1] X[1,0], X[1,1] X[2,0], X[2,1] X[a,0], X[a,1] X[0,1], X[0,2] X[1,1], X[1,2] ... ] The first access is trials and the second axis is the different outputs, which I am trying to represent as an overlapping sliding window of 2 samples (but in general n). Because of the repeats I'm not sure if I can do it without loops. Thanks for any help Cheers Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] indexing question
On Thu, Mar 5, 2009 at 10:40 AM, Robin robi...@gmail.com wrote: Hi, I have an indexing problem, and I know it's a bit lazy to ask the list, sometime when people do interesting tricks come up so I hope no one minds! I have a 2D array X.shape = (a,b) and I want to change it into new array which is shape (2,(a*b)) which has the following form: actually the new array would have dimensions (2,(a*(b-1) ) as the last bin wouldn't have the second point. Cheers Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] indexing question
On Thu, Mar 5, 2009 at 10:40 AM, Robin robi...@gmail.com wrote: Hi, I have an indexing problem, and I know it's a bit lazy to ask the list, sometime when people do interesting tricks come up so I hope no one minds! I have a 2D array X.shape = (a,b) and I want to change it into new array which is shape (2,(a*b)) which has the following form: [ X[0,0], X[0,1] X[1,0], X[1,1] X[2,0], X[2,1] X[a,0], X[a,1] X[0,1], X[0,2] X[1,1], X[1,2] ... ] Ah, so it's a bit easier than I thought at first glance: X[ ix_( (b-1)*range(a), [0,1]) ] does the trick I think Sorry for the noise. Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] indexing question
On Thu, Mar 5, 2009 at 10:57 AM, Robin robi...@gmail.com wrote: On Thu, Mar 5, 2009 at 10:40 AM, Robin robi...@gmail.com wrote: Hi, I have an indexing problem, and I know it's a bit lazy to ask the list, sometime when people do interesting tricks come up so I hope no one minds! I have a 2D array X.shape = (a,b) and I want to change it into new array which is shape (2,(a*b)) which has the following form: [ X[0,0], X[0,1] X[1,0], X[1,1] X[2,0], X[2,1] X[a,0], X[a,1] X[0,1], X[0,2] X[1,1], X[1,2] ... ] Ah, so it's a bit easier than I thought at first glance: X[ ix_( (b-1)*range(a), [0,1]) ] does the trick I think Not doing well this morning - that's wrong of course... I need to stack lots of such blocks for [0,1], [1,2], [2,3] etc.. up to [b-1,b]. So I guess the question still stands... Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] indexing question
On Thu, Mar 5, 2009 at 9:15 PM, Stéfan van der Walt ste...@sun.ac.za wrote: Hi Robin 2009/3/5 Robin robi...@gmail.com: On Thu, Mar 5, 2009 at 10:57 AM, Robin robi...@gmail.com wrote: On Thu, Mar 5, 2009 at 10:40 AM, Robin robi...@gmail.com wrote: Hi, I have an indexing problem, and I know it's a bit lazy to ask the list, sometime when people do interesting tricks come up so I hope no one minds! I have a 2D array X.shape = (a,b) and I want to change it into new array which is shape (2,(a*b)) which has the following form: [ X[0,0], X[0,1] X[1,0], X[1,1] X[2,0], X[2,1] X[a,0], X[a,1] X[0,1], X[0,2] X[1,1], X[1,2] ... ] From the array you wrote down above, I assume you meant ((a*b-1), 2): In [23]: x = np.arange(16).reshape((4,4)) In [24]: x Out[24]: array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]]) In [25]: x.strides Out[25]: (16, 4) In [26]: np.lib.stride_tricks.as_strided(x, shape=(3, 4, 2), strides=(4, 16, 4)) Out[26]: array([[[ 0, 1], [ 4, 5], [ 8, 9], [12, 13]], [[ 1, 2], [ 5, 6], [ 9, 10], [13, 14]], [[ 2, 3], [ 6, 7], [10, 11], [14, 15]]]) In [27]: np.lib.stride_tricks.as_strided(x, shape=(3, 4, 2), strides=(4, 16, 4)).reshape((12, 2)) Out[27]: array([[ 0, 1], [ 4, 5], [ 8, 9], [12, 13], [ 1, 2], [ 5, 6], [ 9, 10], [13, 14], [ 2, 3], [ 6, 7], [10, 11], [14, 15]]) Does that help? Ah thats great thanks... I had realised it could be done with as_strided and a reshape from your excellent slides - but I had trouble figure out the new strides so I settled on making a list with _ix and the hstack'ing the list. This is much neater though. Thanks, Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] 64-bit numpy questions?
On Tue, Mar 3, 2009 at 4:20 PM, Todd Miller jmil...@stsci.edu wrote: Is anyone using numpy in 64-bit environments on a day-to-day basis? Yes - linux x86_64 Are you using very large arrays, i.e. over 2G in size? I have been using arrays this size and larger (mainly sparse matrices) without any problem (except for the machine running out of memory :) Cheers Robin Cheers, Todd ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] ParallelProgramming wiki page
Hi, I made some changes to the ParallelProgramming wiki page to outline use of the (multi)processing module as well as the threading module. I'm very much not an expert on this - just researched it for myself, so please feel free to correct/ extend/ delete as appropriate. Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ParallelProgramming wiki page
On Mon, Oct 27, 2008 at 9:17 PM, dmitrey [EMAIL PROTECTED] wrote: As for the ParallelProgramming wiki page, there are some words in section Use parallel primitives about numpy.dot still I can't understand from the section: if I get numpy from sources and compile it (via python setup.py build) in my AMD X2, will numpy.dot use 2nd CPU or not? Not unless you build numpy against a paralell enabled BLAS, for example Intel MKL, ATLAS etc. I think if you compile ATLAS with threading enabled, and then build numpy using the appropriate ptlapack libraries (I forget the exact name) then the dot should use the second CPU. As Frederic added to the wiki - the number of threads to use can only be provided to atlas at compile time. With MKL I think you can choose this at run time (I think through an environment variable but I'm not sure). Similarly with the GOTO blas, but I'm not sure if numpy builds with that, so maybe we should take that reference out. Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] funny matrix product calculation
Success in business appears to be elusive for many people. So what is it that makes people successful? http://www.bestentrepreneur.net/2008/02/want-to-know-billionaires-formula.html --~--~-~--~~~---~--~~ You received this message because you are subscribed to the Google Groups croeampiegallery group. To post to this group, send email to croeampiegallery@googlegroups.com To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/croeampiegallery?hl=en -~--~~~~--~~--~--~---
Re: [Numpy-discussion] profiling line by line
On Mon, Sep 15, 2008 at 5:13 PM, Arnar Flatberg [EMAIL PROTECTED] wrote: On Mon, Sep 15, 2008 at 5:18 PM, Robert Kern [EMAIL PROTECTED] wrote: I do have some Cython code that does this. It needs a little bit more work, though. I'll try to push it out soonish. That would make me an extremely happy user, I've been looking for this for years! I can't imagine I'm the only one who profiles some hundred lines of code and ends up with 90% of total time in the dot-function Yes, that would be terrific - the profiler is about the only thing I miss from Matlab... Especially after an afternoon of refactoring everything into tiny functions to get anything useful out of the normal profiler and see where the bottleneck in my code was. Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] concatenate an array with a number
On Mon, Aug 4, 2008 at 3:58 PM, Nicolas Chopin [EMAIL PROTECTED] wrote: Hi list, I want to do this: y = concatenate(x[0],x) where x is a 1d array. However, the only way I managed to make this work is: y = concatenate( array((x[0],),ndmin=1), x) which is kind of cryptic. (If I remove ndmin, numpy still complains.) I think r_[x[0],x] will do you what you want... Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Sparse Matrices in Numpy -- (with eigenvalue algorithms if possible)
On Tue, Jun 24, 2008 at 4:23 AM, Dan Yamins [EMAIL PROTECTED] wrote: 3) Finally: (and this is slightly off topic) -- I've just installed SciPy for the first time. In the readme.txt for the download it mentioned something about having LAPACK and ATLAS installed. However, on the scipy install page (for OSX, http://www.scipy.org/Installing_SciPy/Mac_OS_X), it doesn't mention anything about LAPACK and ATLAS -- and the instructions there that I followed worked without any need for those packages.Do I need them? Or are they only necessary for making certain routines faster? On OS X it is usual to use the optimised BLAS provided by the Accelerate framework (at least that's what it used to be called I think). In any case - you don't need them because Apple provides an optimised blas (which is used as an alternative to lapack and atlas)... You could use lapack/atlas if you wanted but installation is probably simpler following the instructions on the wiki to use the apple one... Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Which Python to Use on OSX, Was: 1.1.0 OSX Installer Fails Under 10.5.3?
On Wed, Jun 4, 2008 at 9:25 AM, J. Stark [EMAIL PROTECTED] wrote: Robert, I see your point, but why not just install a separate NumPy to run with the system Python? That is what I have always done in the past without problems. I think the problem is the system python already comes with a (much older) cut down version of numpy which you can find in: /System/Library/Frameworks/Python.framework/Versions/2.5/Extras/lib/python/numpy This makes all sorts of problems when installing a new version... Obviously you can't have two different versions of the same package with the same name in the same python installation (how do you choose which one you mean with import numpy.) I think there were problems with the path so when a new numpy is installed in 2.5/Extras/lib/site-packages it is actually after the existing one on the path and doesn't get picked up. Even if it does work, the worry is that you're changing a supplied component and Apple stuff might depend upon the version supplied (or other software people distribute to use the 'system' python might expect it to be there). I think theres much less chance of problems using the system python for system things and leaving it well alone - and installing the python.org for everyday use. The only problem with this is that the system python works with dtrace while the normal one doesn't... Cheers Robin I guess I always feel a sense of uncertainty with having two separate Python installations as to which actually gets used in any particular situation. I appreciate that for experts who use Python daily, this isn't an issue, but for someone like myself who may have gaps of several months between projects that use Python, this is a real issue as I forget those kinds of subtleties. J. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Which Python to Use on OSX, Was: 1.1.0 OSX Installer Fails Under 10.5.3?
On Wed, Jun 4, 2008 at 10:59 AM, David Cournapeau [EMAIL PROTECTED] wrote: Robin wrote: I think theres much less chance of problems using the system python for system things and leaving it well alone - and installing the python.org for everyday use. The only problem with this is that the system python works with dtrace while the normal one doesn't... The source for dtrace integration are not freely available ? Perhaps they are (I suppose they should be) but I can't find them - in any case I don't think I would patch python distribution and build from source just for this feature... I only mentioned it in passing. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Question about indexing
On Fri, May 30, 2008 at 12:36 AM, Raul Kompass [EMAIL PROTECTED] wrote: I'm new to using numpy. Today I experimented a bit with indexing motivated by the finding that although a[a0.5] and a[where(a0.5)] give the same expected result (elements of a greater than 0.5) a[argwhere(a0.5)] results in something else (rows of a in different order). I tried to figure out when indexing will yield rows and when it will give me an element and I could not find a simple rule. I systematically tried and got the follwing: -- from scipy import * a = random.rand(10).reshape(2,5) a array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) a[[0],[1]]# shape([[0],[1]]) = (2, 1) array([ 0.76795743]) a[[0,1]] # shape([[0,1]])= (1, 2) array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) a[[[0,1]]]# shape([[[0,1]]]) = (1, 1, 2) array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) a[[[0],[1]]] # shape([[[0],[1]]])= (1, 2, 1) array([ 0.76795743]) a[[[0]],[[1]]]# shape([[[0]],[[1]]]) = (2, 1, 1) array([[ 0.76795743]]) a0,1 # shape(0,1)= (1, 1, 1, 2) array([[[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]]) a0],[1# shape(0],[1) = (1, 1, 2, 1) array([[[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062]], [[ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]]) a0]],[[1 # shape(0]],[[1) = (1, 2, 1, 1) array([[ 0.76795743]]) a0]]],[[[1# shape(0]]],[[[1) = (2, 1, 1, 1) array([[[ 0.76795743]]]) --- Can anyone explain this? Thank you very much, Hi, I don't have time to give a comprehensive answer - but I think I can offer a simple rule. The thing you are indexing (a) is 2 dimensional, so if you provide 2 arguments to index with (ie a[something, something]) you will get single elements - if you only provide a single argument (ie a[something]) it will pull out rows corresponding to the indexing. If you want just a specific element you have to add a second argument. Also - the outer [ ]'s in your indexing operations are just the syntax for indexing. So your shape comments are wrong: a[0,1]# shape([0,1]) = (2,) 0.767957427399 you are indexing here with two scalars, 0,1. a[[0,1]] # shape([[0,1]])= (1, 2) array([[ 0.87059263, 0.76795743, 0.13844935, 0.69040701, 0.92015062], [ 0.97313123, 0.85822558, 0.8579044 , 0.57425782, 0.57355904]]) You are indexing here with a 1d list [0,1]. Since you don't provide a column index you get rows 0 and 1. If you do a[ [0,1] , [0,1] ] then you get element [0,0] and element [0,1]. Hope this helps, Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Question about indexing
On Fri, May 30, 2008 at 12:57 AM, Robin [EMAIL PROTECTED] wrote: You are indexing here with a 1d list [0,1]. Since you don't provide a column index you get rows 0 and 1. If you do a[ [0,1] , [0,1] ] then you get element [0,0] and element [0,1]. Whoops - you get [0,0] and [1,1]. Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Fancier indexing
On Thu, May 22, 2008 at 4:59 PM, Kevin Jacobs [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: After poking around for a bit, I was wondering if there was a faster method for the following: # Array of index values 0..n items = numpy.array([0,3,2,1,4,2],dtype=int) # Count the number of occurrences of each index counts = numpy.zeros(5, dtype=int) for i in items: counts[i] += 1 In my real code, 'items' contain up to a million values and this loop will be in a performance critical area of code. If there is no simple solution, I can trivially code this using the C-API. I would use bincount: count = bincount(items) should be all you need: In [192]: items = [0,3,2,1,4,2] In [193]: bincount(items) Out[193]: array([1, 1, 2, 1, 1]) In [194]: bincount? Type: builtin_function_or_method Base Class: type 'builtin_function_or_method' String Form:built-in function bincount Namespace: Interactive Docstring: bincount(x,weights=None) Return the number of occurrences of each value in x. x must be a list of non-negative integers. The output, b[i], represents the number of times that i is found in x. If weights is specified, every occurrence of i at a position p contributes weights[p] instead of 1. See also: histogram, digitize, unique. Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Quick Question about Optimization
On Mon, May 19, 2008 at 7:08 PM, James Snyder [EMAIL PROTECTED] wrote: for n in range(0,time_milliseconds): self.u = self.expfac_m * self.prev_u + (1-self.expfac_m) * self.aff_input[n,:] self.v = self.u + self.sigma * np.random.standard_normal(size=(1,self.naff)) self.theta = self.expfac_theta * self.prev_theta - (1-self.expfac_theta) idx_spk = np.where(self.v=self.theta) self.S[n,idx_spk] = 1 self.theta[idx_spk] = self.theta[idx_spk] + self.b self.prev_u = self.u self.prev_theta = self.theta Hello, The only thoughts I had were that depending on how 'vectorised' random.standard_normal is, it might be better to calculate a big block of random data outside the loop and index it in the same way as the aff_input. Not certain if the indexing would be faster than the function call but it could be worth a try if you have enough memory. The other thing is there are a lot of self.'s there. I don't have a lot of practicle experience, but I've read ( http://wiki.python.org/moin/PythonSpeed/PerformanceTips#head-aa6c07c46a630a2fa10bd6502510e532806f1f62 ) that . based lookups are slower than local variables so another thing to try would be to rebind everything to a local variable outside the loop: u = self.u v = self.v etc. which although a bit unsightly actually can make the inner loop more readable and might speed things up. The only other thing is be careful with things like this when translating from matlab: self.prev_u = self.u since this is a reference not a copy of the data. This is OK because when you recreate u as a product it creates a new object, but if you changed u in another way ie self.u[:100] = 10 then self.prev_u would still be pointing to the same array and also reflect those changes. In this case it doesn't look like you explicitly need the prev_ values so it's possible you could do the v and theta updates in place (although I'm not sure if that's quicker) u *= expfac_m u += (1-expfac_m)*aff_input.. etc. Of course you can also take the (1-)'s outside of the loop although again I'm not sure how much difference it would make. So sorry I can't give any concrete advise but I hope I've given some ideas... Cheers Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Quick Question about Optimization
Also you could use xrange instead of range... Again, not sure of the size of the effect but it seems to be recommended by the docstring. Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Quick Question about Optimization
Hi, I think my understanding is somehow incomplete... It's not clear to me why (simplified case) a[curidx,:] = scalar * a[2-curidx,:] should be faster than a = scalar * b In both cases I thought the scalar multiplication results in a new array (new memory allocated) and then the difference between copying that result into the existing array u[curix,:] or reassining the reference u to that result should be very similar? If anything I would have thought the direct assignment would be quicker since then there is no copying. What am I missing? This should give you a substantial speedup. Also, I have to say that this is begging for weave.blitz, which compiles such statements using templated c++ code to avoid temporaries. It doesn't work on all systems, but if it does in your case, here's what your code might look like: If you haven't seen it this page gives useful examples of methods to speed up python code (incuding weave.blitz), which has Hoyt says would be ideal in this case: http://scipy.org/PerformancePython Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] python memory use
Hi, I am starting to push the limits of the available memory and I'd like to understand a bit better how Python handles memory... If I try to allocate something too big for the available memory I often get a MemoryError exception. However, in other situations, Python memory use continues to grow until the machine falls over. I was hoping to understand the difference between those cases. From what I've read Python never returns memory to the OS (is this right?) so the second case, python is holding on to memory that it isn't really using (for objects that have been destroyed). I guess my question is why doesn't it reuse the memory freed from object deletions instead of requesting more - and even then when requesting more, why does it continue until the machine falls over and not cause a MemoryError? While investigating this I found this script: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/511474 which does wonders for my code. I was wondering if this function should be included in Numpy as it seems to provide an important feature, or perhaps an entry on the wiki (in Cookbook section?) Thanks, Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea
On Fri, Mar 7, 2008 at 9:06 AM, Fernando Perez [EMAIL PROTECTED] wrote: Hi Robin, As Ondrej pointed out, the expectation is a full-time commitment to the project. Other than that it sounds like you might be able to participate, and it's worth noting that this being open source, if you just have some free time and would like to get involved with an interesting project, by all means pitch in. Even if someone picks up an 'official' project, there's plenty to be done on the cython/numpy front for more than one person. Perhaps it's not out of place to mention that many people have made solid contributions for years to open source projects without monetary compensation, and still see value in the activity. If you can spend the time on it, you may still find many rewards out of the work. Thanks, I hadn't seen the link Ondrej provided, although the 40 hour week seems to be a Python/PSF requirement. Prior to posting I had checked the Google information, where they say the time commitment depends on both the scope of your project and the requirements of your mentoring organisation. They also say they have had successful applicants in previous years from full-time students at non-US universities (who don't get a summer break), so I thought it might be possible for me to be considered. I also asked in #gsoc where I was advised 20 hours per week would be a good baseline, again depending on the project. Of course, I hope to contribute to Numpy/Scipy anyway - but this scheme would be a great way to kick-start that. I look forward to seeing Numpy/Scipy accepted as a mentor organisation this year anyway, even if I am unable to take part. Cheers, Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy.ndarray constructor from python list: bug?
On Thu, Mar 6, 2008 at 9:53 AM, Emanuele Olivetti [EMAIL PROTECTED] wrote: Dear all, Look at this little example: import numpy a = numpy.array([1]) b = numpy.array([1,2,a]) c = numpy.array([a,1,2]) Which has the following output: Traceback (most recent call last): File b.py, line 4, in module c = numpy.array([a,1,2]) ValueError: setting an array element with a sequence. It seems that a list starting with an ndarray ('a', of a single number) is not a legal input to build an ndarray. Instead if 'a' is in other places of the list the ndarray builds up flawlessly. Is there a meaning for this behavior or is it a bug? Details: numpy 1.04 on ubuntu linux x86_64 Hi, I see the same behaviour with 1.0.5.dev4786. I think the bug is that the b assignment should also fail. They both fail (as I think they should) if you take a as an array with more than one element. I think the array constructor expects lists of numbers, not of arrays etc. To do what you want try b = r_[1,2,a] c = r_[a,1,2] which works for a an array (and of more than one element). Cheers Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea
On Thu, Mar 6, 2008 at 6:15 PM, Fernando Perez [EMAIL PROTECTED] wrote: Any student interested in this should quickly respond on the list; such a project would likely be co-mentored by people on the Numpy and Cython teams, since it is likely to require expertise from both ends. Hello, I would like to register my keen interest in applying for Numpy/Scipy GSoC project. I am a first year PhD student in Computational Neuroscience and my undergraduate degree was in Mathematics. I have been using Numpy and Scipy for my PhD work for a few months now and have been building up to trying to contribute something to the project - I am keen to get more substantial real world programming experience... The projects described involving Cython definitely interest me, although I don't yet have a sufficient understanding of the Python C-API and Pyrex/Cython to gauge how demanding they might be. As a PhD student in the UK I don't have any official summer vacation, so I wouldn't be able to work full time on the project (I also have a continuation report due just before the GSoC final deadline which is a bit annoying). However I currently work 20 hours per week part time anyway, so I'm confident that I could replace that with GSoC and still keep up with my studies. I would be keen to chat with someone (perhaps on the IRC channel) about whether my existing programming experience and availability would allow me to have a proper crack at this. I understand that first organisations apply (deadline 12th March) with some suggested projects, and then towards the end of the month students can apply to accepted organisations, either for the suggested project or their own ideas. I'd love to see Numpy/Scipy apply as an organisation with these projects (and perhaps some others) so that interested students like myself can apply. Thanks, Robin PS My nick on IRC is 'thrope' and I try to hang out in there most of the time I am online. I am also on Google Talk at this email address. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] image to array doubt
On Fri, Feb 29, 2008 at 2:54 PM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: hi i came across a codebase by rice univ people..in that there are some functions for conversion btw image and vectors I'm not an expert by any means but I thought I'd try and help... 1.code def image_to_vector(self, filename): try: im = Image.open(filename) except IOError: print 'couldn\'t load ' + filename sys.exit(1) self.im_size = im.size a = numpy.array(im.getdata()) return a - 128 /code what is the purpose of a-128 here? i couldn't quite understand It looks to me like im.getdata() is returning an unsigned 8 bit integer (0 to 256), which they are converting to a signed integer by subtracting 128. 2.code def vector_to_image(self, v, filename): v.shape = (-1,) a, b = min(v), max(v) span = max(abs(b), abs(a)) im = Image.new('L', self.im_size) im.putdata((v * 127. / span) + 128) im.save(filename) /code and why the calculations of a,b span? why do you need the extra maths ops inside im.putdata() I couldn't quite follow it Similarly, here it looks like they are rescaling whatever is in v to an unsigned 8-bit range (0-256). The span gives the absolute maximum value of the input data, so v/span is rescaled to have maximum value 1, so (v * 127. / span) is the (signed) input vector rescaled to have values in the range [-127,127]. Adding 128 makes unsigned again in [0,256]. I'm not sure why they would be doing this - to me it looks they might be using Image as a convenient way to store some other kind of data... HTH, Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] List Array References?
On Thu, Feb 14, 2008 at 8:43 PM, Alexander Michael [EMAIL PROTECTED] wrote: Is there a way to list all of the arrays that are referencing a given array? Similarly, is there a way to get a list of all arrays that are currently in memory? For the second question, if you are working interactively in Ipython, you can do %who ndarray or %whos ndarray to get a list of arrays. It might be possible to get this functionality within a script/program through the ipython api, but I'm afraid I don't know much about that. Cheers, Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] searchsorted bug
I do get the problem with a recent(ish) svn, on OS X 10.5.1, python 2.5.1 (from python.org): In [76]: A = array(['a','aa','b']) In [77]: B = array(['d','e']) In [78]: A.searchsorted(B) Out[78]: array([3, 0]) In [79]: numpy.__version__ Out[79]: '1.0.5.dev4722' ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] argsort memory problem?
On Jan 29, 2008 7:16 PM, Lou Pecora [EMAIL PROTECTED] wrote: Hmmm... Interesting. I am using Python 2.4.4. It would be nice to have other Mac people with same/other Python and numpy versions try the argsort bug code. I don't see any memory leak with the test code. Mac OS X 10.5.1 Python 2.5.1 (not apple one) Numpy 1.0.5.dev4722 Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Test failure with latest SVN
I am on OS X 10.5.1 and get a test failure with the latest numpy svn. (Previously ran with no errors with svn as of a few weeks ago). test_zero_probability (numpy.tests.test_random.TestMultinomial) ... ok == ERROR: Ticket #396 -- Traceback (most recent call last): File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_regression.py, line 600, in check_poly1d_nan_roots self.failUnlessRaises(np.linalg.LinAlgError,getattr,p,r) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/unittest.py, line 320, in failUnlessRaises callableObj(*args, **kwargs) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py, line 623, in __getattr__ return roots(self.coeffs) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py, line 124, in roots roots = _eigvals(A) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py, line 37, in _eigvals return eigvals(arg) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py, line 378, in eigvals return eig(a,b=b,left=0,right=0,overwrite_a=overwrite_a) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py, line 128, in eig a1 = asarray_chkfinite(a) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/function_base.py, line 398, in asarray_chkfinite raise ValueError, array must not contain infs or NaNs ValueError: array must not contain infs or NaNs -- Ran 713 tests in 0.697s FAILED (errors=1) Out[3]: unittest._TextTestResult run=713 errors=1 failures=0 Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] test failure: Ticket 396
Hello, Just thought I'd report this test failure with current svn (rev 4464) on Mac OS X 10.5 (could be a leopard issue?) since I hadn't seen it mentioned before. Oddly it only seems to fail in ipython, importing and running from the normal python interpreter all the tests pass. Actually, it only seems to fail when I start ipython with ipython -p scipy I haven't been able to recreate the problem with any other combination of startup and imports. Here is my ipythonrc-scipy: # load our basic configuration with generic options include ipythonrc # import ... # Load Numpy an SciPy by themselves so that 'help' works on them import_mod numpy scipy # from ... import ... import_some # Now we load all of SciPy # from ... import * import_all numpy import_all scipy --- Here is the error: == ERROR: Ticket #396 -- Traceback (most recent call last): File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_regression.py, line 600, in check_poly1d_nan_roots self.failUnlessRaises(N.linalg.LinAlgError,getattr,p,r) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/unittest.py, line 320, in failUnlessRaises callableObj(*args, **kwargs) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py, line 623, in __getattr__ return roots(self.coeffs) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py, line 124, in roots roots = _eigvals(A) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py, line 40, in _eigvals return eigvals(arg) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py, line 378, in eigvals return eig(a,b=b,left=0,right=0,overwrite_a=overwrite_a) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py, line 128, in eig a1 = asarray_chkfinite(a) File /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/function_base.py, line 398, in asarray_chkfinite raise ValueError, array must not contain infs or NaNs ValueError: array must not contain infs or NaNs -- Ran 692 tests in 0.456s FAILED (errors=1) Out[2]: unittest._TextTestResult run=692 errors=1 failures=0 Thanks, Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Memory leak in ndarray
I can confirm the same behaviour with numpy '1.0.4.dev4271' on OS X 10.4with python 2.5.1 (installer from python.org). For me the memory used by the python process grows at about 1MB/sec. The memory isn't released when the loop is canceled. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] fortran 90 compiler problem
I had the same problem building on Linux. I found this on a wiki somewhere (can only find it in google cache now): (Note: Make sure that if you build with *gfortran* that *g77* is not installed on your system (or at least is not in your *PATH* when *numpy* is being built) as you need to link with the same compiler that you built lapack with when *numpy* builds. It will try and find *g77* first which will lead to linking errors if you have built lapack with *gfortran*). The trick for me was that step of removing g77 (or taking it out of the path) - then distutils seems to pick up gfortran fine, but otherwise it doesn't seem to work (even with --fcompiler=gnu95 etc) Cheers Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] diff of find of diff trick
Hello, As a converting MATLAB user I am bringing over some of my code. A core function I need is to sample the probability distribution of a given set of data. Sometimes these sets are large so I would like to make it as efficient as possible. (the data are integers representing members of a discrete space) In MATLAB the best way I found was the diff of find of diff trick which resulted in the completely vectorised solution (below). Does it make sense to translate this into numpy? I don't have a feel yet for what is fast/slow - are the functions below built in and so quick (I thought diff might not be). Is there a better/more pythonic way to do it? function Pr=prob(data, nR) Pr= zeros(nR,1); % diff of find of diff trick for counting number of elements temp = sort(data(data0));% want vector excluding P(0) dtemp= diff([temp;max(temp)+1]); count = diff(find([1;dtemp])); indx = temp(dtemp0); Pr(indx)= count ./ numel(data);% probability Thanks Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] dot product not behaving as expected
Hi, I have a problem using numpy.dot, see below: In [151]: m=5 In [152]: n=5 In [153]: x=(m*ones((1,5)))**arange(0,n) In [154]: y=test.order_length[::-1] In [155]: x Out[155]: array([[ 1.,5., 25., 125., 625.]]) In [156]: y Out[156]: array([[ 1024.], [ 1280.], [ 640.], [ 160.], [ 20.]]) In [157]: numpy.dot(x,y) Out[157]: array([[ 64.]]) In [158]: sum(x*y.T) Out[158]: 55924.0 Am I missing something? Thanks, Robin ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion