Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Robin
On Mon, Jan 14, 2013 at 2:57 PM, Benjamin Root ben.r...@ou.edu wrote:
 I am also +1 on the idea of having a filled() and filled_like() function (I
 learned a long time ago to just do a = np.empty() and a.fill() rather than
 the multiplication trick I learned from Matlab).  However, the collision
 with the masked array API is a non-starter for me.  np.const() and
 np.const_like() probably make the most sense, but I would prefer a verb over
 a noun.

To get an array of 1's you call np.ones(shape), and to get an array of
0's you call np.zeros(shape) - so to get an array of val's, why not call
np.vals(shape, val)?
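A minimal sketch of such a helper (the name vals is only the suggestion
above, not an existing numpy function), built on the empty()/fill() idiom
Benjamin mentions:

import numpy as np

def vals(shape, val, dtype=None):
    # allocate uninitialised memory, then fill it, avoiding the
    # np.ones(shape) * val multiplication trick
    a = np.empty(shape, dtype=dtype)
    a.fill(val)
    return a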

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Embedded NumPy LAPACK errors

2013-01-08 Thread Robin
On Sat, Jan 5, 2013 at 1:03 PM, Robin robi...@gmail.com wrote:
 If not, is there a reasonable way to build numpy.linalg such that
 it interfaces with MKL correctly ?

I managed to get this to work in the end. Since Matlab uses MKL with the
ILP64 interface, it is not possible to get Numpy to use that without
modifications to all the lapack calls. However, I was able to keep the
two different versions of lapack separate.

The first step is to build numpy to link statically against MKL. I
wasn't sure how to get distutils to do this so I copied all the MKL
static .a libraries to a temporary directory and pointed numpy to that
to force the issue (so dynamic linking wasn't an option).

Even with that it still resolves Lapack symbols from Matlab's dynamically
loaded globals. The trick was adding the linker flag -Bsymbolic, which
means lapack_lite calls to lapack use the statically linked local copies.
With these changes everything appears to work. There are two test
failures (below) which do not appear when running the same Numpy build
outside of Matlab, but they don't seem severe.

So:
[robini@robini2-pc numpy]$ cat site.cfg
[mkl]
search_static_first = true
library_dirs = /tmp/intel64
include_dirs = /opt/intel/mkl/include
#mkl_libs = mkl_sequential, mkl_intel_lp64, mkl_core, mkl_lapack95_lp64, mkl_blas95_lp64
mkl_libs = mkl_lapack95, mkl_blas95, mkl_intel_lp64, mkl_sequential, mkl_core, svml, imf, irc
lapack_libs =

[robini@robini2-pc numpy]$ ls /tmp/intel64/
libimf.a                       libmkl_gnu_thread.a
libirc.a                       libmkl_intel_ilp64.a
libmkl_blacs_ilp64.a           libmkl_intel_lp64.a
libmkl_blacs_intelmpi_ilp64.a  libmkl_intel_sp2dp.a
libmkl_blacs_intelmpi_lp64.a   libmkl_intel_thread.a
libmkl_blacs_lp64.a            libmkl_lapack95_ilp64.a
libmkl_blacs_openmpi_ilp64.a   libmkl_lapack95_lp64.a
libmkl_blacs_openmpi_lp64.a    libmkl_pgi_thread.a
libmkl_blacs_sgimpt_ilp64.a    libmkl_scalapack_ilp64.a
libmkl_blacs_sgimpt_lp64.a     libmkl_scalapack_lp64.a
libmkl_blas95_ilp64.a          libmkl_sequential.a
libmkl_blas95_lp64.a           libmkl_solver_ilp64.a
libmkl_cdft_core.a             libmkl_solver_ilp64_sequential.a
libmkl_core.a                  libmkl_solver_lp64.a
libmkl_gf_ilp64.a              libmkl_solver_lp64_sequential.a
libmkl_gf_lp64.a               libsvml.a

in numpy/distutils/intelccompiler.py:

class IntelEM64TCCompiler(UnixCCompiler):
    """ A modified Intel x86_64 compiler compatible with a 64bit gcc built Python. """
    compiler_type = 'intelem'
    cc_exe = 'icc -m64 -fPIC'
    cc_args = "-fPIC"
    def __init__ (self, verbose=0, dry_run=0, force=0):
        UnixCCompiler.__init__ (self, verbose, dry_run, force)
        self.cc_exe = 'icc -m64 -fPIC -O3 -fomit-frame-pointer'
        compiler = self.cc_exe
        self.set_executables(compiler=compiler,
                             compiler_so=compiler,
                             compiler_cxx=compiler,
                             linker_exe=compiler,
                             linker_so=compiler + ' -shared -static-intel -Bsymbolic')

Test failures (test_special_values also fails outside Matlab, but the
other 2 only occur when embedded):
======================================================================
FAIL: test_umath.test_nextafterl
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/epd-7.3/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/robini/slash/lib/python2.7/site-packages/numpy/testing/decorators.py", line 215, in knownfailer
    return f(*args, **kwargs)
  File "/home/robini/slash/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 1123, in test_nextafterl
    return _test_nextafter(np.longdouble)
  File "/home/robini/slash/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 1108, in _test_nextafter
    assert np.nextafter(one, two) - one == eps
AssertionError

======================================================================
FAIL: test_umath.test_spacingl
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/epd-7.3/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/robini/slash/lib/python2.7/site-packages/numpy/testing/decorators.py", line 215, in knownfailer
    return f(*args, **kwargs)
  File "/home/robini/slash/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 1149, in test_spacingl
    return _test_spacing(np.longdouble)
  File "/home/robini/slash/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 1132, in _test_spacing
    assert np.spacing(one) == eps
AssertionError

======================================================================
FAIL: test_special_values (test_umath_complex.TestClog)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/robini

Re: [Numpy-discussion] Embedded NumPy LAPACK errors

2013-01-05 Thread Robin
Coincidentally I have been having the same problem this week. Unrelated
to the problem, I would suggest looking at pymex, which 'wraps' python
inside Matlab very nicely, although it has the same problem with
duplicate lapack symbols.

https://github.com/kw/pymex

I have the same problem with Enthought EPD which is built against MKL -
but I think the problem is that Intel provide two different interfaces
- ILP64 with 64 bit integer indices and LP64 with 32 bit integers.
Matlab links against the ILP64 version, whereas Enthought uses the LP64
version - so they are still incompatible.
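A sketch of why the two builds clash even though they export the same
symbol names (illustrative only - the integer widths are the point):

import ctypes

# The same LAPACK symbol (e.g. dgesv_) reads every integer argument at a
# different width under the two interfaces, so an LP64-built NumPy calling
# into Matlab's ILP64 MKL passes wrongly-sized arguments.
n_lp64  = ctypes.c_int32(972)   # what an LP64 caller passes
n_ilp64 = ctypes.c_int64(972)   # what an ILP64 library expects to read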

Cheers

Robin

On Sat, Jan 5, 2013 at 10:42 AM, Paul Anton Letnes
paul.anton.let...@gmail.com wrote:

 On 4. jan. 2013, at 21:42, m...@eml.cc wrote:

 Hi all,


 I am trying to embed numerical code in a mexFunction,
 as called by MATLAB, written as a Cython function.

 NumPy core functions and BLAS work fine, but calls to LAPACK
 functions such as SVD seem to be made against MATLAB's linked
 MKL, and this generates MKL errors. When I try this with
 Octave, it works fine, presumably because it is compiled against
 the same LAPACK as the NumPy I am embedding.


 Assuming I haven't made big mistakes up to here, I have the
 following questions:

 Is there a way to request numpy.linalg to use a particular
 LAPACK library, e.g. /usr/lib/liblapack.so ?

 If not, is there a reasonable way to build numpy.linalg such that
 it interfaces with MKL correctly ?

 It's possible, but it's much easier to install one of the pre-built python 
 distributions. Enthought, WinPython and others include precompiled 
 python/numpy/scipy/etc with MKL. If that works for you, I'd recommend that 
 route, as it involves less work.

 Good luck,
 Paul

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] advanced indexing bug with huge arrays?

2012-01-24 Thread Robin
On Tue, Jan 24, 2012 at 6:24 PM, David Warde-Farley
warde...@iro.umontreal.ca wrote:
 On Tue, Jan 24, 2012 at 06:00:05AM +0100, Sturla Molden wrote:
 On 23.01.2012 22:08, Christoph Gohlke wrote:
 
  Maybe this explains the win-amd64 behavior: There are a couple of places
  in mtrand where array indices and sizes are C long instead of npy_intp,
  for example in the randint function:
 
  https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/mtrand.pyx#L863
 
 

 Both i and length could overflow here. It should overflow on allocation
 of more than 2 GB.

 There is also a lot of C longs in the internal state (line 55-105), as
 well as the other functions.

 Producing 2 GB of random ints twice fails:

 Sturla, since you seem to have access to Win64 machines, do you suppose you
 could try this code:

 a = numpy.ones((1, 972))
 b = numpy.zeros((4993210,), dtype=int)
 c = a[b]

 and verify that there's a whole lot of 0s in the matrix, specifically,

 >>> c[574519:].sum()
 356.0
 >>> c[574520:].sum()
 0.0

 is the case on Linux 64-bit; is it the case on Windows 64?

Yes - I get exactly the same numbers on 64-bit Windows with 1.6.1.
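A quick arithmetic check consistent with the overflow hypothesis above (a
sketch, for illustration):

n_elements = 4993210 * 972      # elements in c = a[b]: 4,853,400,120
print n_elements > 2**31 - 1    # True - exceeds INT32_MAX, so a 32-bit
                                # C long used as an index or counter wraps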

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] advanced indexing bug with huge arrays?

2012-01-23 Thread Robin
On Mon, Jan 23, 2012 at 7:55 PM, David Warde-Farley
warde...@iro.umontreal.ca wrote:
 I've reproduced this (rather serious) bug myself and confirmed that it exists
 in master, and as far back as 1.4.1.

 I'd really appreciate if someone could reproduce and confirm on another
 machine, as so far all my testing has been on our single high-memory machine.

I see the same behaviour on a Windows machine with numpy 1.6.1. But I
don't think it is an indexing problem - rather something with the
random number generation: a itself is already zeros for high indexes.

In [8]: b[100:110]
Out[8]:
array([3429029, 1251819, 4292918, 2249483,  757620, 3977130, 3455449,
   2005054, 2565207, 3114930])

In [9]: a[b[100:110]]
Out[9]:
array([[0, 0, 0, ..., 0, 0, 0],
   [0, 0, 0, ..., 0, 0, 0],
   [0, 0, 0, ..., 0, 0, 0],
   ...,
   [0, 0, 0, ..., 0, 0, 0],
   [0, 0, 0, ..., 0, 0, 0],
   [0, 0, 0, ..., 0, 0, 0]], dtype=uint8)

In [41]: a[581350:,0].sum()
Out[41]: 0

Cheers

Robin

 Thanks,
 David

 On Mon, Jan 23, 2012 at 05:23:28AM -0500, David Warde-Farley wrote:
 A colleague has run into this weird behaviour with NumPy 1.6.1, EPD 7.1-2, 
 on Linux (Fedora Core 14) 64-bit:

  a = numpy.array(numpy.random.randint(256,size=(500,972)),dtype='uint8')
  b = numpy.random.randint(500,size=(4993210,))
  c = a[b]

 It seems c is not getting filled in full, namely:

  In [14]: c[100:].sum()
  Out[14]: 0

 I haven't been able to reproduce this quite yet, I'll try to find a machine 
 with sufficient memory tomorrow. But does anyone have any insight in the 
 mean time? It smells like some kind of integer overflow bug.

 Thanks,

 David
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] upsample or scale an array

2011-12-03 Thread Robin Kraft
Thanks Warren, this is great, and even handles giant arrays just fine if you've 
got enough RAM.

I also just found this StackOverflow post with another solution.

a.repeat(2, axis=0).repeat(2, axis=1). 
http://stackoverflow.com/questions/7525214/how-to-scale-a-numpy-array

np.kron lets you do more, but for my simple use case the repeat() method is
faster and more RAM-efficient with large arrays.

In [3]: a = np.random.randint(0, 255, (2400, 2400)).astype('uint8')

In [4]: timeit a.repeat(2, axis=0).repeat(2, axis=1)
10 loops, best of 3: 182 ms per loop

In [5]: timeit np.kron(a, np.ones((2,2), dtype='uint8'))
1 loops, best of 3: 513 ms per loop


Or for a 43200x4800 array:

In [6]: a = np.random.randint(0, 255, (2400*18, 2400*2)).astype('uint8')

In [7]: timeit a.repeat(2, axis=0).repeat(2, axis=1)
1 loops, best of 3: 6.92 s per loop

In [8]: timeit np.kron(a, np.ones((2, 2), dtype='uint8'))
1 loops, best of 3: 27.8 s per loop

In this case repeat() peaked at about 1 GB of RAM usage while np.kron hit about
1.7 GB.

Thanks again Warren. I'd tried way too many variations on reshape and rollaxis, 
and should have come to the Numpy list a lot sooner!

-Robin


On Dec 3, 2011, at 12:51 AM, Warren Weckesser wrote:
 On Sat, Dec 3, 2011 at 12:35 AM, Robin Kraft wrote:
 
  I need to take an array - derived from raster GIS data - and upsample or
  scale it. That is, I need to repeat each value in each dimension so that,
  for example, a 2x2 array becomes a 4x4 array as follows:
 
  [[1, 2],
   [3, 4]]
 
  becomes
 
  [[1,1,2,2],
   [1,1,2,2],
   [3,3,4,4],
   [3,3,4,4]]
 
  It seems like some combination of np.resize or np.repeat and reshape +
  rollaxis would do the trick, but I'm at a loss.
 
  Many thanks!
 
  -Robin
 
 
 
 Just a day or so ago, Josef Perktold showed one way of accomplishing this
 using numpy.kron:
 
 In [14]: a = arange(12).reshape(3,4)
 
 In [15]: a
 Out[15]:
 array([[ 0,  1,  2,  3],
[ 4,  5,  6,  7],
[ 8,  9, 10, 11]])
 
 In [16]: kron(a, ones((2,2)))
 Out[16]:
 array([[  0.,   0.,   1.,   1.,   2.,   2.,   3.,   3.],
[  0.,   0.,   1.,   1.,   2.,   2.,   3.,   3.],
[  4.,   4.,   5.,   5.,   6.,   6.,   7.,   7.],
[  4.,   4.,   5.,   5.,   6.,   6.,   7.,   7.],
[  8.,   8.,   9.,   9.,  10.,  10.,  11.,  11.],
[  8.,   8.,   9.,   9.,  10.,  10.,  11.,  11.]])
 
 
 Warren


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] upsample or scale an array

2011-12-03 Thread Robin Kraft
That does repeat the elements, but doesn't get them into the desired order.

In [4]: print a
[[1 2]
 [3 4]]

In [7]: np.tile(a, 4)
Out[7]: 
array([[1, 2, 1, 2, 1, 2, 1, 2],
   [3, 4, 3, 4, 3, 4, 3, 4]])

In [8]: np.tile(a, 4).reshape(4,4)
Out[8]: 
array([[1, 2, 1, 2],
   [1, 2, 1, 2],
   [3, 4, 3, 4],
   [3, 4, 3, 4]])

It's close, but I want to repeat the elements along the two axes, effectively 
stretching it by the lower right corner:

array([[1, 1, 2, 2],
   [1, 1, 2, 2],
   [3, 3, 4, 4],
   [3, 3, 4, 4]])

It would take some more reshaping/axis rolling to get there, but it seems 
doable.

Anyone know what combination of manipulations would work with the result of 
np.tile?

-Robin



On Dec 3, 2011, at 11:05 AM, Olivier Delalleau wrote:

 You can also use numpy.tile
 
 -=- Olivier
 
 2011/12/3 Robin Kraft

 Thanks Warren, this is great, and even handles giant arrays just fine if 
 you've got enough RAM.
 
 I also just found this StackOverflow post with another solution.
 
 a.repeat(2, axis=0).repeat(2, axis=1). 
 http://stackoverflow.com/questions/7525214/how-to-scale-a-numpy-array
 
 np.kron lets you do more, but for my simple use case the repeat() method is
 faster and more RAM-efficient with large arrays.
 
 In [3]: a = np.random.randint(0, 255, (2400, 2400)).astype('uint8')
 
 In [4]: timeit a.repeat(2, axis=0).repeat(2, axis=1)
 10 loops, best of 3: 182 ms per loop
 
 In [5]: timeit np.kron(a, np.ones((2,2), dtype='uint8'))
 1 loops, best of 3: 513 ms per loop
 
 
 Or for a 43200x4800 array:
 
 In [6]: a = np.random.randint(0, 255, (2400*18, 2400*2)).astype('uint8')
 
 In [7]: timeit a.repeat(2, axis=0).repeat(2, axis=1)
 1 loops, best of 3: 6.92 s per loop
 
 In [8]: timeit np.kron(a, np.ones((2, 2), dtype='uint8'))
 1 loops, best of 3: 27.8 s per loop
 
 In this case repeat() peaked at about 1 GB of RAM usage while np.kron hit
 about 1.7 GB.
 
 Thanks again Warren. I'd tried way too many variations on reshape and 
 rollaxis, and should have come to the Numpy list a lot sooner!
 
 -Robin
 
 
 On Dec 3, 2011, at 12:51 AM, Warren Weckesser wrote:
 On Sat, Dec 3, 2011 at 12:35 AM, Robin Kraft wrote:
 
  I need to take an array - derived from raster GIS data - and upsample or
  scale it. That is, I need to repeat each value in each dimension so that,
  for example, a 2x2 array becomes a 4x4 array as follows:
 
  [[1, 2],
   [3, 4]]
 
  becomes
 
  [[1,1,2,2],
   [1,1,2,2],
   [3,3,4,4],
   [3,3,4,4]]
 
  It seems like some combination of np.resize or np.repeat and reshape +
  rollaxis would do the trick, but I'm at a loss.
 
  Many thanks!
 
  -Robin
 
 
 
 Just a day or so ago, Josef Perktold showed one way of accomplishing this
 using numpy.kron:
 
 In [14]: a = arange(12).reshape(3,4)
 
 In [15]: a
 Out[15]:
 array([[ 0,  1,  2,  3],
[ 4,  5,  6,  7],
[ 8,  9, 10, 11]])
 
 In [16]: kron(a, ones((2,2)))
 Out[16]:
 array([[  0.,   0.,   1.,   1.,   2.,   2.,   3.,   3.],
[  0.,   0.,   1.,   1.,   2.,   2.,   3.,   3.],
[  4.,   4.,   5.,   5.,   6.,   6.,   7.,   7.],
[  4.,   4.,   5.,   5.,   6.,   6.,   7.,   7.],
[  8.,   8.,   9.,   9.,  10.,  10.,  11.,  11.],
[  8.,   8.,   9.,   9.,  10.,  10.,  11.,  11.]])
 
 
 Warren
 
 
 
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] upsample or scale an array

2011-12-03 Thread Robin Kraft
Ha! I knew it had to be possible! Thanks Derek. So for M = 1200 and N = 2 (now
on my laptop):

In [70]: M = 1200
In [69]: N = 2
In [71]: a = np.random.randint(0, 255, (M**2)).reshape(M,-1)

In [76]: timeit np.rollaxis(np.tile(a, N**2).reshape(M,N,-1), 2, 1).reshape(M*N,-1)
10 loops, best of 3: 99.1 ms per loop

In [78]: timeit a.repeat(2, axis=0).repeat(2, axis=1)
10 loops, best of 3: 85.6 ms per loop

In [79]: timeit np.kron(a, np.ones((2,2), 'uint8'))
1 loops, best of 3: 521 ms per loop

It turns out np.kron and repeat are pretty straightforward for 
multi-dimensional data too - scaling or stretching a stacked array representing 
pixel data over time, for example. Nothing changes for np.kron - it handles the 
additional dimensionality by itself. With repeat you just tell it to operate on 
the last two dimensions.

So to sum up:

1) np.kron is cool for the simplicity of the code and simple scaling to N 
dimensions. It's also handy if you want to scale the array elements themselves 
too.
2) repeat() along the last N axes is a bit more intuitive (i.e. less magical) 
to me and has a better performance profile. 
3) Derek's reshape/rolling solution is almost as fast but it gives me a 
headache trying to visualize what it's actually doing. I don't want to think 
about adding another dimension ...

Thanks for the help folks. Here's scaling of a hypothetical time series (i.e. 3 
axes), where each sub-array represents a month.


In [26]: print a
[[[1 2]
  [3 4]]

 [[1 2]
  [3 4]]

 [[1 2]
  [3 4]]]

In [27]: np.kron(a, np.ones((2,2), dtype='uint8'))
Out[27]: 
array([[[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]],

   [[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]],

   [[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]]])

In [64]: a.repeat(2, axis=1).repeat(2, axis=2)
Out[64]: 
array([[[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]],

   [[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]],

   [[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]]])

On Dec. 3, 2011, at 12:50PM, Derek Homeier wrote:

 On 03.12.2011, at 6:22PM, Robin Kraft wrote:
 
  That does repeat the elements, but doesn't get them into the desired order.
  
  In [4]: print a
  [[1 2]
   [3 4]]
  
  In [7]: np.tile(a, 4)
  Out[7]: 
  array([[1, 2, 1, 2, 1, 2, 1, 2],
 [3, 4, 3, 4, 3, 4, 3, 4]])
  
  In [8]: np.tile(a, 4).reshape(4,4)
  Out[8]: 
  array([[1, 2, 1, 2],
 [1, 2, 1, 2],
 [3, 4, 3, 4],
 [3, 4, 3, 4]])
  
  It's close, but I want to repeat the elements along the two axes, 
  effectively stretching it by the lower right corner:
  
  array([[1, 1, 2, 2],
 [1, 1, 2, 2],
 [3, 3, 4, 4],
 [3, 3, 4, 4]])
  
  It would take some more reshaping/axis rolling to get there, but it seems 
  doable.
  
  Anyone know what combination of manipulations would work with the result of 
  np.tile?
  
 Rolling was the keyword:
 
 print(np.rollaxis(np.tile(a, 4).reshape(2,2,-1), 2, 1).reshape(4,4))
 [[1 1 2 2]
  [1 1 2 2]
  [3 3 4 4]
  [3 3 4 4]]
 
 I leave the generalisation and timing up to you, but it seems for 
 a = np.arange(M**2).reshape(M,-1)
 
 np.rollaxis(np.tile(a, N**2).reshape(M,N,-1), 2, 1).reshape(M*N,-1) 
 
 should do the trick.
 
 Cheers,
   Derek
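A quick sanity check (a sketch) that the recipe above agrees with the
repeat() approach from earlier in the thread:

import numpy as np

M, N = 4, 2
a = np.arange(M**2).reshape(M, -1)
tiled = np.rollaxis(np.tile(a, N**2).reshape(M, N, -1), 2, 1).reshape(M*N, -1)
repeated = a.repeat(N, axis=0).repeat(N, axis=1)
print np.array_equal(tiled, repeated)   # True - both stretch a by N along each axis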
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] upsample or scale an array

2011-12-02 Thread Robin Kraft
I need to take an array - derived from raster GIS data - and upsample or scale 
it. That is, I need to repeat each value in each dimension so that, for 
example, a 2x2 array becomes a 4x4 array as follows:

[[1, 2],
 [3, 4]]

becomes

[[1,1,2,2],
 [1,1,2,2],
 [3,3,4,4],
 [3,3,4,4]]

It seems like some combination of np.resize or np.repeat and reshape + rollaxis 
would do the trick, but I'm at a loss.

Many thanks!

-Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Long-standing issue with using numpy in embedded CPython

2011-10-04 Thread Robin
On Mon, Oct 3, 2011 at 9:42 PM, Yang Zhang yanghates...@gmail.com wrote:
 It turns out that there's a long-standing problem in numpy that
 prevents it from being used in embedded CPython environments:

Just wanted to make the point for reference that in general Numpy does
work fine in (non-threaded) embedded CPython situations, see for
example pymex [1] which embeds Python + Numpy in a Matlab mex file and
works really well.

This seems to be a problem specific to Jepp.

Just wanted to mention it in case it needlessly puts someone off trying
something similar in the future.

Cheers

Robin

[1] https://github.com/kw/pymex


 http://stackoverflow.com/questions/7592565/when-embedding-cpython-in-java-why-does-this-hang/7630992#7630992
 http://mail.scipy.org/pipermail/numpy-discussion/2009-July/044046.html
 Is there any fix or workaround for this?  Thanks.
 --
 Yang Zhang
 http://yz.mit.edu/
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] f2py and openmp on mac os x with gfortran

2011-07-20 Thread Robin
I'm not at my Mac to check the exact paths, but see if pointing one of
the environment variables
LD_LIBRARY_PATH
or
DYLD_LIBRARY_PATH
to a directory where the gfortran OpenMP libraries can be found helps.
This will depend on where you got gfortran from and the version, but you
should be able to find it by following the symlink that is the
gfortran command and looking for an appropriate lib/ directory near
the target of that.

Cheers

Robin

On Wed, Jul 20, 2011 at 9:02 PM, Brandt Belson bbel...@princeton.edu wrote:
 Hello,
 I'm struggling to create openmp subroutines. I've simplified the problem
 down to the subroutine below.
 -- play.f90 --
 subroutine step(soln,n)
   implicit none
   integer n,i
   real*8 soln(n)

   !f2py intent(in) n
   !f2py intent(out) soln
   !f2py depend(n) soln
 !$OMP PARALLEL DO
   do i=1,n
     soln(i) = .1
   end do
 !$OMP END PARALLEL DO
 end subroutine step

 I compile this with the command:
 f2py -c -m play play.f90 --fcompiler=gfortran --f90flags=-fopenmp
 This completes successfully. When I import the module, I get the following
 error message.
 $ python -c 'import play'
 Traceback (most recent call last):
   File "<string>", line 1, in <module>
 ImportError: dlopen(./play.so, 2): Symbol not found: _GOMP_parallel_end
   Referenced from: /home/bbelson/Desktop/SOR/play.so
   Expected in: flat namespace
  in /home/bbelson/Desktop/SOR/play.so
 It seems to me that the linking step is broken, however I did not see any
 other options in the f2py documentation to change the linking step. Did I
 miss something?
 Thanks,
 Brandt
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Using multiprocessing (shared memory) with numpy array multiplication

2011-06-17 Thread Robin
I didn't have time yesterday but the attached illustrates what I mean
about putting the shared data in a module (it should work with the
previous myutil).

I don't get a big speed up but at least it is faster using multiple
subprocesses:

Not threaded:  0.450406074524
Using 8 processes: 0.282383

If I add a 1s sleep in the inner product function it shows a much more
significant improvement (ie if it were a longer computation):
Not threaded:  50.6744170189
Using 8 processes: 8.152393

Still not quite linear but certainly an improvement.

Cheers

Robin

On Thu, Jun 16, 2011 at 9:19 PM, Robin robi...@gmail.com wrote:
 The fact that you are still passing the myutil.arrayList argument to
 pool.map means the data is still being pickled. All arguments to
 pool.map are pickled to be passed to the subprocess. You need to
 change the function so that it accesses the data directly, then just
 pass indices to pool.map.

 Cheers

 Robin

 On Thu, Jun 16, 2011 at 9:05 PM, Brandt Belson bbel...@princeton.edu wrote:
 Hi all,
 Thanks for the replies. As mentioned, I'm parallelizing so that I can take
 many inner products simultaneously (which I agree is embarrassingly
 parallel). The library I'm writing asks the user to supply a function that
 takes two objects and returns their inner product. After all the discussion
 though it seems this is too simplistic of an approach. Instead, I plan to
 write this part of the library as if the inner product function supplied by
 the user uses all available cores (with numpy and/or numexpr built with MKL
 or LAPACK).
 As far as using fortran or C and openMP, this probably isn't worth the time
 it would take, both for me and the user.
 I've tried increasing the array sizes and found the same trends, so the
 slowdown isn't only because the arrays are too small to see the benefit of
 multiprocessing. I wrote the code to be easy for anyone to experiment with,
 so feel free to play around with what is included in the profiling, the
 sizes of arrays, functions used, etc.
 I also tried using handythread.foreach with arraySize = (3000,1000), and
 found the following:
 No shared memory, numpy array multiplication took 1.57585811615 seconds
 Shared memory, numpy array multiplication took 1.25499510765 seconds
 This is definitely an improvement from multiprocessing, but without knowing
 any better, I was hoping to see a roughly 8x speedup on my 8-core
 workstation.
 Based on what Chris sent, it seems there is some large overhead caused by
 multiprocessing pickling numpy arrays. To test what Robin mentioned
 If you are on Linux or Mac then fork works nicely so you have read
 only shared memory you just have to put it in a module before the fork
 (so before pool = Pool() ) and then all the subprocesses can access it
 without any pickling required. ie
 myutil.data = listofdata
 p = multiprocessing.Pool(8)
 def mymapfunc(i):
   return mydatafunc(myutil.data[i])

 p.map(mymapfunc, range(len(myutil.data)))
 I tried creating the arrayList in the myutil module and using
 multiprocessing to find the inner products of myutil.arrayList, however this
 was still slower than not using multiprocessing, so I believe there is still
 some large overhead. Here are the results:
 No shared memory, numpy array multiplication took 1.55906510353 seconds
 Shared memory, numpy array multiplication took 9.8242638 seconds
 Shared memory, myutil.arrayList numpy array multiplication took
 8.77094507217 seconds
 I'm attaching this code.
 I'm going to work around this numpy/multiprocessing behavior with
 numpy/numexpr built with MKL or LAPACK. It would be good to know exactly
 what's causing this though. It would be nice if there was a way to get the
 ideal speedup via multiprocessing, regardless of the internal workings of
 the single-threaded inner product function, as this was the behavior I
 expected. I imagine other people might come across similar situations, but
 again I'm going to try to get around this by letting MKL or LAPACK make use
 of all available cores.
 Thanks again,
 Brandt
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion




import numpy as np
import myutil
import multiprocessing
import itertools
import time as T


arraySize = (3000,1000)
numArrays = 50

arrayList = [np.random.random(s) for s in itertools.repeat(arraySize,numArrays)]
IPs = np.zeros(numArrays)

# pool
# save data to module _before_ fork
myutil.arrayList = arrayList

def shared_numpy_inner_product(i):
    # do numpy inner product on shared module variable
    #T.sleep(1)
    return myutil.numpy_inner_product(myutil.arrayList[i])

myutil.shared_numpy_inner_product = shared_numpy_inner_product

# 1 thread
startTime = T.time()
for i in range(len(arrayList)):
    IPs[i] = myutil.shared_numpy_inner_product(i)
endTime = T.time() - startTime
print "Not threaded: ", endTime

# now fork the subprocesses
processes=multiprocessing.cpu_count

Re: [Numpy-discussion] Using multiprocessing (shared memory) with numpy array multiplication

2011-06-16 Thread Robin
On Thu, Jun 16, 2011 at 6:44 PM, Christopher Barker
chris.bar...@noaa.gov wrote:
 2. There is also the question of when the process pool is spawned. Though
 I haven't checked, I suspect it happens prior to calling pool.map. But if it
 does not, this is a factor as well, particularly on Windows (less so on
 Linux and Apple).

 It didn't work well on my Mac, so ti's either not an issue, or not
 Windows-specific, anyway.

I am pretty sure that the process pool is spawned when you create the
pool object instance.


 3.  arrayList is serialised by pickling, which has a significant
 overhead.  It's not shared memory either, as the OP's code implies, but the
 main thing is the slowness of cPickle.

 I'll bet this is a big issue, and one I'm curious about how to address, I
 have another problem where I need to multi-process, and I'd love to know a
 way to pass data to the other process and back *without* going through
 pickle. maybe memmapped files?

If you are on Linux or Mac then fork works nicely so you have read
only shared memory you just have to put it in a module before the fork
(so before pool = Pool() ) and then all the subprocesses can access it
without any pickling required. ie
myutil.data = listofdata
p = multiprocessing.Pool(8)
def mymapfunc(i):
    return mydatafunc(myutil.data[i])

p.map(mymapfunc, range(len(myutil.data)))

Actually that won't work because mymapfunc needs to be in a module so
it can be pickled but hopefully you get the idea.
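A fuller sketch of the same idea with the map function living in an
importable module (the module name mymodule is hypothetical):

# --- mymodule.py ---
import numpy as np

data = [np.random.random((300, 100)) for _ in range(8)]  # created before the fork

def inner_product(i):
    # workers inherit data through fork(); only the index i gets pickled
    a = data[i]
    return (a * a).sum()

# --- driver script ---
import multiprocessing
import mymodule

if __name__ == '__main__':
    pool = multiprocessing.Pool(8)   # fork happens here, after data exists
    print pool.map(mymodule.inner_product, range(len(mymodule.data)))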

Cheers

Robin




 IPs = N.array(innerProductList)

 4.  numpy.array is a very slow function. The benchmark should preferably
 not include this overhead.

 I re-ran, moving that out of the timing loop, and, indeed, it helped a lot,
 but it still takes longer with the multi-processing.

 I suspect that the overhead of pickling, etc. is overwhelming the operation
 itself. That and the load balancing issue that I don't understand!

 To test this, I did a little experiment -- creating a fake operation, one
 that simply returns an element from the input array -- so it should take
 next to no time, and we can time the overhead of the pickling, etc:

 $ python shared_mem.py

 Using 2 processes
 No shared memory, numpy array multiplication took 0.124427080154 seconds
 Shared memory, numpy array multiplication took 0.586215019226 seconds

 No shared memory, fake array multiplication took 0.000391006469727 seconds
 Shared memory, fake array multiplication took 0.54935503006 seconds

 No shared memory, my array multiplication took 23.5055780411 seconds
 Shared memory, my array multiplication took 13.0932741165 seconds

 Bingo!

 The overhead of the multi-processing takes about .54 seconds, which explains
 the slowdown for the numpy method

 not so mysterious after all.

 Bruce Southey wrote:

 But if everything is *single-threaded* and thread-safe, then you just
 create a function and use Anne's very useful handythread.py
 (http://www.scipy.org/Cookbook/Multithreading).

 This may be worth a try -- though the GIL could well get in the way.

 By the way, if the arrays are sufficiently small, there is a lot of
 overhead involved such that there is more time in communication than
 computation.

 yup -- clearly the case here. I wonder if it's just array size though --
 won't cPickle time scale with array size? So it may not be size per se, but
 rather how much computation you need for a given size array.

 -Chris

 [I've enclosed the OP's slightly altered code]




 --
 Christopher Barker, Ph.D.
 Oceanographer

 Emergency Response Division
 NOAA/NOS/ORR            (206) 526-6959   voice
 7600 Sand Point Way NE   (206) 526-6329   fax
 Seattle, WA  98115       (206) 526-6317   main reception

 chris.bar...@noaa.gov

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Using multiprocessing (shared memory) with numpy array multiplication

2011-06-16 Thread Robin
The fact that you are still passing the myutil.arrayList argument to
pool.map means the data is still being pickled. All arguments to
pool.map are pickled to be passed to the subprocess. You need to
change the function so that it accesses the data directly, then just
pass indices to pool.map.

Cheers

Robin

On Thu, Jun 16, 2011 at 9:05 PM, Brandt Belson bbel...@princeton.edu wrote:
 Hi all,
 Thanks for the replies. As mentioned, I'm parallelizing so that I can take
 many inner products simultaneously (which I agree is embarrassingly
 parallel). The library I'm writing asks the user to supply a function that
 takes two objects and returns their inner product. After all the discussion
 though it seems this is too simplistic of an approach. Instead, I plan to
 write this part of the library as if the inner product function supplied by
 the user uses all available cores (with numpy and/or numexpr built with MKL
 or LAPACK).
 As far as using fortran or C and openMP, this probably isn't worth the time
 it would take, both for me and the user.
 I've tried increasing the array sizes and found the same trends, so the
 slowdown isn't only because the arrays are too small to see the benefit of
 multiprocessing. I wrote the code to be easy for anyone to experiment with,
 so feel free to play around with what is included in the profiling, the
 sizes of arrays, functions used, etc.
 I also tried using handythread.foreach with arraySize = (3000,1000), and
 found the following:
 No shared memory, numpy array multiplication took 1.57585811615 seconds
 Shared memory, numpy array multiplication took 1.25499510765 seconds
 This is definitely an improvement from multiprocessing, but without knowing
 any better, I was hoping to see a roughly 8x speedup on my 8-core
 workstation.
 Based on what Chris sent, it seems there is some large overhead caused by
 multiprocessing pickling numpy arrays. To test what Robin mentioned
 If you are on Linux or Mac then fork works nicely so you have read
 only shared memory you just have to put it in a module before the fork
 (so before pool = Pool() ) and then all the subprocesses can access it
 without any pickling required. ie
 myutil.data = listofdata
 p = multiprocessing.Pool(8)
 def mymapfunc(i):
   return mydatafunc(myutil.data[i])

 p.map(mymapfunc, range(len(myutil.data)))
 I tried creating the arrayList in the myutil module and using
 multiprocessing to find the inner products of myutil.arrayList, however this
 was still slower than not using multiprocessing, so I believe there is still
 some large overhead. Here are the results:
 No shared memory, numpy array multiplication took 1.55906510353 seconds
 Shared memory, numpy array multiplication took 9.8242638 seconds
 Shared memory, myutil.arrayList numpy array multiplication took
 8.77094507217 seconds
 I'm attaching this code.
 I'm going to work around this numpy/multiprocessing behavior with
 numpy/numexpr built with MKL or LAPACK. It would be good to know exactly
 what's causing this though. It would be nice if there was a way to get the
 ideal speedup via multiprocessing, regardless of the internal workings of
 the single-threaded inner product function, as this was the behavior I
 expected. I imagine other people might come across similar situations, but
 again I'm going to try to get around this by letting MKL or LAPACK make use
 of all available cores.
 Thanks again,
 Brandt
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] How to tell if I succeeded to build numpy with amd, umfpack and lapack

2011-02-18 Thread Robin
I think numpy doesn't use umfpack. scipy.sparse used to, but now the
umfpack stuff has been moved out to a scikit.
So you probably won't see anything about those libraries, but if you
install scikits.umfpack and it works then you must be linked
correctly.

Cheers

Robin

On Fri, Feb 18, 2011 at 11:32 AM, Samuel John sc...@samueljohn.de wrote:
 Ping.

 How to tell, if numpy successfully build against libamd.a and libumfpack.a?
 How do I know if they were successfully linked (statically)?
 Is it possible from within numpy, like show_config() ?
 I think show_config() has no information about these in it :-(

 Anybody?

 Thanks,
  Samuel
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Where did the github numpy repository go?

2010-11-14 Thread Robin Kraft
GitHub is having some kind of major outage:

http://status.github.com/

The site and git access is unavailable due to a database failure. We're 
researching the issue.

On Nov 14, 2010, at 3:29 PM, numpy-discussion-requ...@scipy.org wrote:
 
 Message: 5
 Date: Sun, 14 Nov 2010 13:29:03 -0700
 From: Charles R Harris charlesr.har...@gmail.com
 Subject: [Numpy-discussion] Where did the github numpy repository go?
 To: numpy-discussion numpy-discussion@scipy.org
 Message-ID:
   aanlkti=cpz752y88+sfsoh3wtm9poqphokyxc55dp...@mail.gmail.com
 Content-Type: text/plain; charset=iso-8859-1
 
 I keep getting page does not exist.
 
 Chuck
 -- next part --
 An HTML attachment was scrubbed...
 URL: 
 http://mail.scipy.org/pipermail/numpy-discussion/attachments/20101114/3381d8a4/attachment.html
  
 
 --
 
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
 
 
 End of NumPy-Discussion Digest, Vol 50, Issue 32
 

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] mingw-w64 tutorial ?

2010-08-22 Thread Robin
On Sun, Aug 22, 2010 at 8:41 AM, Sebastian Haase seb.ha...@gmail.com wrote:
 Do you know if that contains a C++ compiler ?  The first page before
 it starts the actual download has Visual C++ Compilers grayed out
 ... !?

 -Sebastian

 Ok, apparently I had to install the dot NET Framework 4 from
 http://msdn.microsoft.com/en-us/netframework/aa569263.aspx
 first, before then the C++ could be installed.
 But now setup.py still complains:
       error: Unable to find vcvarsall.bat
 and I think it is looking for
 C:\Program Files (x86)\Microsoft Visual Studio 9.0
 while that file got installed in
 C:\Program Files (x86)\Microsoft Visual Studio 10.0
 I don't know how to get the log.debug messages from the setup.py
 script activated...
 ?

My limited understanding is that there are two different versions of
Microsoft stuff at the moment:
VS 2008, which goes with SDK 3.5, and VS 2010, which goes with SDK 4.0. I
think Python is built with 2008 so that might be easier to try. I'm
not sure though.

While the SDK contains some compilers I think the easiest way to get
the compilers working is to first install the VS express edition,
which is 32 bit, and then install the SDK which provides 64 bit
support. Here are some instructions I lifted from the Matlab website
(I haven't tried building Python extensions yet but I was able to
embed Python with this setup).

To install Visual Studio 2008 Express Edition with all required components:
1. Install Microsoft Visual Studio 2008 Express Edition. The main
Visual Studio 2008 Express installer is available from:
http://www.microsoft.com/express/Downloads/#Visual_Studio_2008_Express_Downloads
This package can be installed using the default options.

2. Install the Microsoft Windows SDK. The Microsoft Windows SDK is
available by searching Microsoft's download site, or by going directly
to:
http://msdn.microsoft.com/en-us/windowsserver/bb980924.aspx
or
http://www.microsoft.com/downloads/details.aspx?FamilyId=F26B1AA4-741A-433A-9BE5-FA919850BDBF&displaylang=en

Download the Windows Server 2008 & .NET 3.5 SDK. Do not install beta
or 'Release Candidate' (RC) versions. Also do NOT install Microsoft
Windows SDK for Windows 7 and .NET Framework 4 (version 7.1); if you
want to use a 7.x version choose the Microsoft Windows SDK for
Windows 7 and .NET Framework 3.5 SP1.

2.1. While installing the SDK, you must select x64 Compilers and
Tools. For example, in the SDK installer above:
On the screen "Installation Options",
select "Developer Tools" -> "Visual C++ Compilers".
This item has the Feature Description "Install the Visual C++ 9.0
Compilers. These compilers allow you to target x86, x64, IA64
processor architectures."

3. To verify that you have all installed components, check that the
Microsoft SDK contains the amd64 version of the C/C++ compiler
cl.exe. This is usually installed into


C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin\amd64\cl.exe


Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] [SciPy-Dev] Good-bye, sort of (John Hunter)

2010-08-18 Thread Robin
On Wed, Aug 18, 2010 at 8:20 AM, Sturla Molden stu...@molden.no wrote:
 Den 18. aug. 2010 kl. 08.19 skrev Martin Raspaud
 martin.rasp...@smhi.se:

 Once upon a time, when my boss wanted me to use matlab, I found myself
 implementing a python interpreter in matlab...


 There are just two sane solutions for Matlab: Either embed CPython in a
 MEX file, or use Matlab's JVM to run Jython ;)

 http://vader.cse.lehigh.edu/~perkins/pymex.html

Just thought I'd mention another one since this came up:

http://github.com/kw/pymex
This one works very nicely - it proxies any Python objects so you can
use, should you want to, the Matlab IDE as a python interpreter,
supports numpy arrays etc. Also cross-platform - I even got it to work
with 64 bit matlab/python on windows (in a fork on github).




 Sturla
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] f2py performance question from a rookie

2010-08-16 Thread Robin
On Mon, Aug 16, 2010 at 10:12 AM, Vasileios Gkinis v.gki...@nbi.ku.dk wrote:

  Hi all,

 This is a question on f2py.
 I am using a Crank-Nicolson scheme to model a diffusion process and in the
 quest for some extra speed I started playing with f2py.
 However I do not seem to be able to get any significant boost in the
 performance of the code.


Try adding
 -DF2PY_REPORT_ON_ARRAY_COPY
to the f2py command line.

This will cause f2py to report any array copies. If any of the
types/ordering of the arrays don't match f2py will silently make a copy -
this can really affect performance.
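For example (a sketch - a real(kind=8) dummy argument maps to float64, so
passing anything else makes the wrapper copy first):

import numpy as np

elem = np.ones(500, dtype=np.float32)   # wrong dtype: the wrapper silently copies
elem = np.ones(500, dtype=np.float64)   # matches real(kind=8): no copy needed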

Cheers

Robin



 In the following simple example I am building a tridiagonal array with
 plain python for loops and by calling a simple fortran subroutine that does
 the same thing.

 Here goes the python script:

 import numpy as np
 import time
 import learnf90_05
 import sys

 run = sys.argv[1]

 try:
     dim = np.float(sys.argv[2])
 except:
     dim = 500

 elem = np.ones(dim)

 ### Python routine

 if run == "Python":
     t_python_i = time.time()
     array_python = np.zeros((dim, dim))
     for row in np.arange(1, dim-1):
         array_python[row, row-1] = elem[row]
         array_python[row, row] = elem[row]
         array_python[row, row+1] = elem[row]

     t_python_f = time.time()
     python_time = t_python_f - t_python_i
     print("Python time: %0.5e" % python_time)

 ### Fortran routine

 elif run == "Fortran":
     t_fortran_i = time.time()
     fortran_array = learnf90_05.test(j=dim, element=elem)
     t_fortran_f = time.time()
     fortran_time = t_fortran_f - t_fortran_i
     print("Fortran time: %0.5e" % fortran_time)

 And the fortran subroutine called test is here:

 subroutine test(j, element, a)
 integer, intent(in) :: j
 integer :: row, col
 real(kind = 8), intent(in) :: element(j)
 real(kind = 8), intent(out) :: a(j,j)

 do row=2,j-1
     a(row,row-1) = element(row)
     a(row, row) = element(row)
     a(row, row+1) = element(row)
 enddo

 end

 The subroutine is compiled with gfortran 4.2 on OS X 10.5.8.

 Running the python script with array sizes 300x300 and 3000x3000 gives for
 the python plain for loops:

 dhcp103:learnf vasilis$ python fill_array.py Python 3000
 Python time: 1.56063e-01
 dhcp103:learnf vasilis$ python fill_array.py Python 300
 Python time: 4.82297e-03

 and for the fortran subroutine:

 dhcp103:learnf vasilis$ python fill_array.py Fortran 3000
 Fortran time: 1.16298e-01
 dhcp103:learnf vasilis$ python fill_array.py Fortran 300
 Fortran time: 8.83102e-04

 It looks like the gain in performance is rather low compared to tests I
 have seen elsewhere.

 Am I missing something here..?

 Cheers...Vasilis


 --
 Vasileios Gkinis, PhD student
 Center for Ice and Climate (http://www.iceandclimate.nbi.ku.dk)
 Niels Bohr Institute (http://www.nbi.ku.dk/english/)
 Juliane Maries Vej 30
 2100 Copenhagen
 Denmark
 email: v.gki...@nbi.ku.dk  skype: vasgkin  Tel: +45 353 25913

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] summarizing blocks of an array using a moving window

2010-07-22 Thread Robin Kraft
Vincent, Pauli,


 From: Vincent Schut sc...@sarvision.nl

 - another option would be some smart reshaping, which finally gives you
 a [y//2, x//2, 2, 2] array, which you could then reduce to calculate
 stats (mean, std, etc) on the last two axes.  I *think* you'd have to
 first reshape both x and y axes, and then reposition the axes. An
 example:
 a = gdal_array.BandReadAsArray(blabla)
 y,x = a.shape # y and x need to be divisible by 2!
 b = a.reshape(y/2, 2, x/2, 2).transpose(0,2,1,3).reshape(y/2, x/2, 4)
 bMean = b.mean(axis=-1)
 bMax = ...etc.


You and Pauli agree on this - seems like a good option.


 - a third option would be to create an index array, which has a unique 
 value per 2x2 square, and then use histogram2d. This would, if you use 
 its 'weight' functionality, at least enable you to get efficient counts 
 and sums/means. Other stats might be hard, though.

Hmm - I don't get this, but I'll experiment. I've seen some really useful
things done with histogram2d so it seems worthwhile to figure out.


 Message: 6
 From: Pauli Virtanen p...@iki.fi

  Let's say the image looks like this: np.random.randint(0,2,
  16).reshape(4,4)
  
  array([[0, 0, 0, 1],
 [0, 0, 1, 1],
 [1, 1, 0, 0],
 [0, 0, 0, 0]])
  
  I want to use a square, non-overlapping moving window for resampling, so
  that I get a count of all the 1's in each 2x2 window.
  
   0, 0,   0, 1
   0, 0,   1, 1        0  3
                ==>
   1, 1,   0, 0        2  0
   0, 0,   0, 0
  
  In another situation with similar data I'll need the average, or the
  maximum value, etc..
 
 Non-overlapping windows can be done by reshaping:
 
 x = np.array([[0, 0, 0, 1, 1, 1],
   [0, 0, 1, 1, 0, 0],
   [1, 1, 0, 0, 1, 1],
   [0, 0, 0, 0, 1, 1],
   [1, 0, 1, 0, 1, 1],
   [0, 0, 1, 0, 0, 0]])
 
 y = x.reshape(3,2,3,2)
 y2 = y.sum(axis=3).sum(axis=1)


This is perfect - and so fast! Thanks! Now I just have to understand why it 
works ...
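A sketch of why it works, using the 4x4 example from the thread: the
reshape splits each axis into a (block, within-block) pair, and summing
over the two within-block axes collapses each 2x2 window.

import numpy as np

x = np.array([[0, 0, 0, 1],
              [0, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 0, 0, 0]])

y = x.reshape(2, 2, 2, 2)        # (row block, row in block, col block, col in block)
print y.sum(axis=3).sum(axis=1)  # [[0 3]
                                 #  [2 0]] - the per-window counts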

Can anyone recommend a tutorial on working with (slicing, reshaping, etc.) 
multi-dimensional arrays? Stefan's slides are beautiful but my brain starts to 
hurt if I try to follow them line by line.

-Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] summarizing blocks of an array using a moving window

2010-07-21 Thread Robin Kraft
Hello all,

The short version: For a given NxN array, is there an efficient way to use a 
moving window to collect a summary statistic on a chunk of the array, and 
insert it into another array?

The long version: I am trying to resample an image loaded with GDAL into an NxN 
array. Note that this is for statistical purposes, so image quality doesn't 
matter. For the curious, the image is derived from satellite imagery and 
displays a map of hotspots of tropical deforestation at 500m resolution. I need 
to assign a count of these deforestation 'hits' to each pixel in an existing 
array of 1000m pixels.


Let's say the image looks like this: np.random.randint(0,2, 16).reshape(4,4)

array([[0, 0, 0, 1],
   [0, 0, 1, 1],
   [1, 1, 0, 0],
   [0, 0, 0, 0]])

I want to use a square, non-overlapping moving window for resampling, so that I 
get a count of all the 1's in each 2x2 window.

0, 0,   0, 1
0, 0,   1, 1        0  3
             ==>
1, 1,   0, 0        2  0
0, 0,   0, 0

In another situation with similar data I'll need the average, or the maximum 
value, etc..

My brute-force method is to loop through the rows and columns to get little
chunks of data to process. But there must be a more elegant way to do this -
it's probably obvious too ... (inelegant way further down).

Another way to do it would be to use np.tile to create an array called arr
filled with blocks of [[0,1],[2,3]]. I could then use something like this to get
the stats I want:
d0 = img[np.where(arr==0)]
d1 = img[np.where(arr==1)]
d2 = img[np.where(arr==2)]
d3 = img[np.where(arr==3)]

img_out = d0 + d1 + d2 + d3

This would be quite snappy if I could create arr efficiently. Unfortunately 
np.tile does something akin to np.hstack to create this array, so it isn't 
square and can't be reshaped appropriately (np.tile(np.arange(2**2).reshape(2, 
2), 4)):

array([[0, 1, 0, 1, 0, 1, 0, 1],
   [2, 3, 2, 3, 2, 3, 2, 3]])

Inefficient sample code below. Advice greatly appreciated!

-Robin


import numpy as np
from math import sqrt
from time import time

def rs(img_size=16, window_size=2):
    w = window_size

    # making sure the numbers work

    if img_size % sqrt(img_size) > 0:
        print "Square root of image size must be an integer."
        print "Sqrt =", sqrt(img_size)

        return

    elif (img_size / sqrt(img_size)) % w > 0:
        print "Square root of image size must be evenly divisible by", w
        print "Sqrt =", sqrt(img_size)
        print sqrt(img_size), "/", w, "=", sqrt(img_size)/w

        return

    else:

        rows = int(sqrt(img_size))
        cols = int(sqrt(img_size))

        # create fake data: ones and zeros
        img = np.random.randint(0, 2, img_size).reshape(rows, cols)

        # create empty array for resampled data
        img_out = np.zeros((rows/w, cols/w), dtype=np.int)

        t = time()

        # retrieve blocks of pixels in img
        # insert summary into img_out

        for r in xrange(0, rows, w): # skip rows based on window size
            for c in xrange(0, cols, w): # skip columns based on window size

                # calculate the sum of the values in moving window,
                # insert value into corresponding pixel in img_out

                img_out[r/w, c/w] = np.int(np.sum(img[r:r+w, c:c+w]))

        t1 = time()
        elapsed = t1 - t
        print "img shape:", img.shape
        print img, "\n"
        print "img_out shape:", img_out.shape
        print img_out

        print "%.7f seconds" % elapsed

rs(img_size=16, window_size=2)
rs(81, 3)
rs(100)
#rs(4800*4800) # very slow
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy for Python 3?

2010-07-19 Thread Robin
On Mon, Jul 19, 2010 at 1:53 PM, Matthieu Brucher
matthieu.bruc...@gmail.com wrote:
 I'm afraid that if you don't know if you have a compiler, you don't
 have one. This also means you will not be able to compile Numpy, as
 the official compiler is no longer available.

Is this the VS 2008 Express Edition? I saw something posted a while
ago about how it was no longer available, but I think it was a mistake
as it still seems to be easily available from:

http://www.microsoft.com/express/downloads/#2008-All

Even 2005 is still available so I imagine there is some time before we
have to worry about 2008.

http://www.microsoft.com/downloads/details.aspx?familyid=7B0B0339-613A-46E6-AB4D-080D4D4A8C4E&displaylang=en

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy on windows 64 bit

2010-07-06 Thread Robin
On Tue, Jul 6, 2010 at 6:57 PM, Christoph Gohlke cgoh...@uci.edu wrote:

 I proposed a fix at http://projects.scipy.org/numpy/ticket/1535. Does it
 work for you?

Thanks very much... that looks great. Since it works with longs it
fixes my problems (I think it will also fix a couple of the failing
scipy tests).

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] numpy on windows 64 bit

2010-07-05 Thread Robin
Hi,

I am having some problems with win64 with all my tests failing.

I installed amd64 Python from Python.org and numpy and scipy from
http://www.lfd.uci.edu/~gohlke/pythonlibs/

I noticed that on windows sys.maxint is the 32bit value (2147483647
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy on windows 64 bit

2010-07-05 Thread Robin
On Mon, Jul 5, 2010 at 12:09 PM, David Cournapeau courn...@gmail.com wrote:

 Short of saying what those failures are, we can't help you,

Thanks for reply... Somehow my message got truncated - I had written
more detail about the errors!

 I noticed that on windows sys.maxint is the 32bit value (2147483647

 This is not surprising: sys.maxint gives you the max value of a long,
 which is 32 bits even on 64 bits on windows.

I just got to figuring this out... But it causes some problems. The
main one I'm having is that, I assume because of this, array
shapes are longs instead of ints (ie x.shape[0] is a long).

This breaks np.random.permutation(x.shape[1]), which I use all over the
place (I opened a ticket for this, #1535). Something I asked in the
previous mail that got lost: what is the best cross-platform way of
doing this?
np.random.permutation(int(x.shape[1]))?
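For example (a sketch of that workaround):

import numpy as np

x = np.zeros((3, 5))
perm = np.random.permutation(int(x.shape[1]))   # int() works whether shape[1]
                                                # is an int or a long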

Actually that and the problems with scipy.sparse (spsolve doesn't
work) cover all of the errors I'm seeing... (I detailed those in a
separate mail to the scipy list).

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] OT: request help building pymex win64

2010-07-03 Thread Robin
On Fri, Jul 2, 2010 at 6:10 PM, David Cournapeau courn...@gmail.com wrote:
 Also, I would double check the issue is not something else altogether,

Hi,

I got it working eventually - it was something else altogether! I had
made some mistakes in the changes I had made to get it to compile with
visual studio that were causing the segfaults. Makes me think of the
old phrase - problem is between keyboard and chair.

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] OT: request help building pymex win64

2010-07-02 Thread Robin
Hi,

Sorry for the offtopic post but I wondered if any Windows experts who
are familiar with topics like linking python on windows and visual
studio runtimes etc. might be able to help.

I'm on a bit of a mission to get pymex built for 64 bit windows. Pymex
( http://github.com/kw/pymex ) is a matlab package that embeds the
Python interpreter in a mex file and provides a very elegant interface
for manipulating python objects from matlab, as well as converting
between data times when necessary. It builds easily on mac, linux and
win32 with mingw, but I really need it also for 64 bit windows. (It
works very well with numpy as well so not completely OT).

I have looked at trying to get a 64-bit mingw working to build mex
files, but that seemed quite difficult, so instead I am trying to
build with VS 2008 Express Edition + Windows 7 SDK (for 64 bit
support). As far as I can tell this is installed OK as I can build the
example 64-bit mex files OK.

I have made some modifications to pymex to get it to build under vs
2008 ( http://github.com/robince/pymex/tree/win64 ).

And I can get it to build and link (I believe using the implicit dll
method of linking against C:\Python26\libs\python26.lib of the amd64
python.org python) without errors, but when I run it, it segfaults
whenever a pointer is passed between the mex side and python26.dll.

I asked this stackoverflow question which has some more details (build log)
http://stackoverflow.com/questions/3167134/trying-to-embed-python-into-matlab-mex-win64

Anyway I'm completely in the dark but wondered if some of the experts
on here would be able to spot something (perhaps to do with
incompatible C runtimes - I am not sure what runtime Python is built
with but I thought it was VS 2008).

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] OT: request help building pymex win64

2010-07-02 Thread Robin
On Fri, Jul 2, 2010 at 5:47 PM, David Cournapeau courn...@gmail.com wrote:

 The problem may be that matlab is built with one runtime, and Python
 with another Unless your matlab is very recent, it is actually
 quite likely to be compiled with VS 2005, which means you should use
 python 2.5 instead (or built python2.6 with VS 2005, but I am not sure
 it is even possible without herculean efforts).

Thanks for your help!

I thought of that, but then VS 2008 is an officially supported
compiler for the version of matlab I am using (2009a).
http://www.mathworks.com/support/compilers/release2009a/win64.html

So I thought on the matlab/mex side 2008 should be fine, and I thought
since Python is built with 2008 that should also be OK.  But obviously
something isn't!

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Brooken Toolchain (Mac 10.6)

2010-04-20 Thread Robin
To build against the python.org 2.5 you need to use the older gcc:

export CC=/usr/bin/gcc-4.0
export CXX=/usr/bin/g++-4.0

should do it. By default snow leopard uses 4.2 now, which doesn't
support the -Wno-long-double option used when building python.

Cheers

Robin

On Mon, Apr 19, 2010 at 3:55 PM, Patrick Marsh patrickmars...@gmail.com wrote:
 Greetings,
 Per my previous email, I'm trying to setup the release process for Numpy on
 my MacBook Pro.  When trying to build Numpy 1.4.1r3 with Python 2.5.4 I get
 a broken toolchain error (below).  I do not get this error when trying to
 build Numpy with Python 2.6.5 - and there is nothing fundamentally different
 (that I know of) between my Python 2.5.4 and Python 2.6.5 environments.  To
 address a previous suggestion I received offline, I do have sufficient
 permissions (I've even tried using sudo, just to make sure) and I do have
 setuptools installed.
 Any help, would be appreciated.

 Patrick

 
 compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core
 -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath
 -Inumpy/core/include
 -I/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5 -c'
 gcc: _configtest.c
 cc1: error: unrecognized command line option -Wno-long-double
 cc1: error: unrecognized command line option -Wno-long-double
 lipo: can't figure out the architecture type of: /var/tmp//ccjTUva4.out
 cc1: error: unrecognized command line option -Wno-long-double
 cc1: error: unrecognized command line option -Wno-long-double
 lipo: can't figure out the architecture type of: /var/tmp//ccjTUva4.out
 failure.
 removing: _configtest.c _configtest.o
 Traceback (most recent call last):
   File setup.py, line 187, in module
     setup_package()
   File setup.py, line 180, in setup_package
     configuration=configuration )
   File /Users/pmarsh/git/numpy.release/numpy/distutils/core.py, line 186,
 in setup
     return old_setup(**new_attr)
   File
 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/core.py,
 line 151, in setup
     dist.run_commands()
   File
 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/dist.py,
 line 974, in run_commands
     self.run_command(cmd)
   File
 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/dist.py,
 line 994, in run_command
     cmd_obj.run()
   File /Users/pmarsh/git/numpy.release/numpy/distutils/command/build.py,
 line 37, in run
     old_build.run(self)
   File
 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/command/build.py,
 line 112, in run
     self.run_command(cmd_name)
   File
 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/cmd.py,
 line 333, in run_command
     self.distribution.run_command(command)
   File
 /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/distutils/dist.py,
 line 994, in run_command
     cmd_obj.run()
   File
 /Users/pmarsh/git/numpy.release/numpy/distutils/command/build_src.py, line
 152, in run
     self.build_sources()
   File
 /Users/pmarsh/git/numpy.release/numpy/distutils/command/build_src.py, line
 163, in build_sources
     self.build_library_sources(*libname_info)
   File
 /Users/pmarsh/git/numpy.release/numpy/distutils/command/build_src.py, line
 298, in build_library_sources
     sources = self.generate_sources(sources, (lib_name, build_info))
   File
 /Users/pmarsh/git/numpy.release/numpy/distutils/command/build_src.py, line
 385, in generate_sources
     source = func(extension, build_dir)
   File numpy/core/setup.py, line 657, in get_mathlib_info
     raise RuntimeError(Broken toolchain: cannot link a simple C program)
 RuntimeError: Broken toolchain: cannot link a simple C program


 --
 Patrick Marsh
 Ph.D. Student / NSSL Liaison to the HWT
 School of Meteorology / University of Oklahoma
 Cooperative Institute for Mesoscale Meteorological Studies
 National Severe Storms Laboratory
 http://www.patricktmarsh.com


 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy.dot segmentation fault

2010-01-18 Thread Robin
You can build numpy against Accelerate through macports by specifying
the +no_atlas variant.

Last time I tried I ran into this issue:
http://trac.macports.org/ticket/22201

but it looks like it should be fixed now.

Cheers

Robin


On Mon, Jan 18, 2010 at 8:15 PM, Mark Lescroart lescr...@usc.edu wrote:
 Hi David (et al),

 Thanks for the reply. The version of ATLAS I was using was v3.8.3_1,
 installed via MacPorts, compiled and built on my machine with a gcc4.3
 compiler. I uninstalled numpy 1.4 and ATLAS, re-installed (i.e., re-
 compiled) the same version (3.8.3_1), re-installed numpy (for
 python2.6, version 1.4), and got the same bug.

 I don't know if this means that there's something fundamentally wrong
 with the version of ATLAS on MacPorts (probably less likely) or
 something wrong with the way my system is configured (probably more
 likely). If anyone can give me any more insight into how to test my
 installation of ATLAS, I would be much obliged (I read through a fair
 bit of the ATLAS installation notes on the ATLAS sourceforge page, and
 could not figure out how to run the whole test suite).

 If possible, I would like to solve this problem within Macports (and
 thus not with the Accelerate framework). I am using numpy mostly
 through the pyMVPA package for fMRI multi-voxel analysis, and the
 pyMVPA package depends on a number of other libraries, and the mess of
 dependencies is most easily managed within the framework of Macports.

 Cheers,

 Mark

 On Jan 17, 2010, at 8:38 PM, David Cournapeau wrote:

 Mark Lescroart wrote:
 Hello,

 I've encountered a segfault in numpy when trying to compute a dot
 product for two arrays - see code below. The problem only seems to
 occur
 when the arrays reach a certain size.

 Your atlas is most likely broken. You will have to double-check how
 you
 built it, and maybe run the whole test suite (as indicated in the
 ATLAS
 installation notes).

 Note that you can use the Accelerate framework on mac os x, this is
 much
 easier to get numpy working on mac,

 cheers,

 David
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] 1.4.0 installer fails on OSX 10.6.2

2010-01-08 Thread Robin
On Fri, Jan 8, 2010 at 2:29 AM, David Warde-Farley d...@cs.toronto.edu wrote:
 On 5-Jan-10, at 7:02 PM, Christopher Barker wrote:

 Pretty sure the python.org binaries are 32-bit only. I still think
 it's sensible to prefer the

 waiting for the rest of this sentence... ;-)

 I had meant to say 'sensible to prefer the Python.org version' though
 in reality I'm a little miffed that Python.org isn't providing Ron's 4-
 way binaries, since he went to the trouble of adding support for
 building them. Grumble grumble.

My understanding was that 2.6/3.1 will never be buildable as an arch
selectable universal binary interpreter (like the apple system python)
due to this issue:
http://bugs.python.org/issue6834

I think this is only being fixed in 2.7/3.2 so perhaps from then
Python will distribute selectable universal builds.

(Just mention it in case folks aren't aware of that issue).

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] ma feature request (log2)

2009-12-13 Thread Robin
Hi,

Could we have a ma aware numpy.ma.log2 please, similar to np.ma.log
and np.ma.log10?

I think it should be as simple as the patch below but perhaps I've
missed something:

Thanks,

Robin

--- core.py.orig2009-12-13 15:14:14.0 +
+++ core.py 2009-12-13 15:14:53.0 +
@@ -66,7 +66,7 @@
'identity', 'ids', 'indices', 'inner', 'innerproduct',
'isMA', 'isMaskedArray', 'is_mask', 'is_masked', 'isarray',
'left_shift', 'less', 'less_equal', 'load', 'loads', 'log', 'log10',
-   'logical_and', 'logical_not', 'logical_or', 'logical_xor',
+   'log2', 'logical_and', 'logical_not', 'logical_or', 'logical_xor',
'make_mask', 'make_mask_descr', 'make_mask_none', 'mask_or',
'masked', 'masked_array', 'masked_equal', 'masked_greater',
'masked_greater_equal', 'masked_inside', 'masked_invalid',
@@ -1124,6 +1124,8 @@
 _DomainGreater(0.0))
 log10 = _MaskedUnaryOperation(umath.log10, 1.0,
   _DomainGreater(0.0))
+log2 = _MaskedUnaryOperation(umath.log2, 1.0,
+  _DomainGreater(0.0))
 tan = _MaskedUnaryOperation(umath.tan, 0.0,
 _DomainTan(1e-35))
 arcsin = _MaskedUnaryOperation(umath.arcsin, 0.0,
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] non-standard standard deviation

2009-11-29 Thread Robin
On Mon, Nov 30, 2009 at 12:30 AM, Colin J. Williams c...@ncf.ca wrote:
 On 29-Nov-09 17:13 PM, Dr. Phillip M. Feldman wrote:
 All of the statistical packages that I am currently using and have used in
 the past (Matlab, Minitab, R, S-plus) calculate standard deviation using the
 sqrt(1/(n-1)) normalization, which gives a result that is unbiased when
 sampling from a normally-distributed population.  NumPy uses the sqrt(1/n)
 normalization.  I'm currently using the following code to calculate standard
 deviations, but would much prefer if this could be fixed in NumPy itself:

 def mystd(x=numpy.array([]), axis=None):
     """This function calculates the standard deviation of the input using
     the definition of standard deviation that gives an unbiased result
     for samples from a normally-distributed population."""
     xd = x - x.mean(axis=axis)
     return sqrt( (xd*xd).sum(axis=axis) / (numpy.size(x,axis=axis)-1.0) )

 Anne Archibald has suggested a work-around.  Perhaps ddof could be set,
 by default to
 1 as other values are rarely required.

 Where the distribution of a variate is not known a priori, then I
 believe that it can be shown
 that the n-1 divisor provides the best estimate of the variance.

There have been previous discussions on this (but I can't find them
now) and I believe the current default was chosen deliberately. I
think it is the view of the numpy developers that the n divisor has
more desirable properties in most cases than the traditional n-1 -
see this paper by Travis Oliphant for details:
http://hdl.handle.net/1877/438

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] bus error in embedded numpy

2009-11-13 Thread Robin
Hi,

I'm trying to embed Python in a MATLAB mex file. I've been coming
under some pressure to make my Python code available to my MATLAB
colleagues so I am trying to come up with a relatively general way of
calling numerical python code from Matlab.

I am making some progress - but get a reliable crash from numpy. This
only occurs the second time I am loading it. So I Py_Initialize,
import numpy, Py_Finalize (all works fine), but then if I clear the
mex file (clear funcname in matlab - which calls Py_Finalize through a
mexAtExit handler) and try to run the function again (which will
reinitialize the interpreter and import numpy again) I get the following
stack trace from multiarray.so. Wondered if anyone could throw any
light. I have already run into this bug
http://bugs.python.org/issue6869 which prevents me using ctypes... I
wondered if this was related.

For now its not such a big problem - I will just avoid unloading the
mex function (with clear function). But I thought it might be
indicative of a memory leak or some other problem since I think in
theory it should work (It does if numpy isn't imported).

Cheers

Robin


 Bus error detected at Fri Nov 13 17:11:57 2009


Configuration:
  MATLAB Version:   7.8.0.347 (R2009a)
  MATLAB License:   161051
  Operating System: Darwin 10.0.0 Darwin Kernel Version 10.0.0: Fri
Jul 31 22:47:34 PDT 2009; root:xnu-1456.1.25~1/RELEASE_I386 i386
  Window System:The X.Org Foundation (10402000), display
/tmp/launch-2p1ZWg/:0
  Current Visual:   0x24 (class 4, depth 24)
  Processor ID: x86 Family 6 Model 15 Stepping 10, GenuineIntel
  Virtual Machine:  Java 1.6.0_15-b03-219 with Apple Inc. Java
HotSpot(TM) Client VM mixed mode
  Default Encoding:  ISO-8859-1

Fault Count: 1

Register State:
  eax = 0001  ebx = 307b12ab
  ecx =   edx = 305ef148
  esi = 305ef140  edi = 32a79d60
  ebp = b097c6c8  esp = b097c670
  eip = 307b14be  flg = 00010202

Stack Trace:
  [0] multiarray.so:PyArray_FromScalar~(0x305ef140, 0, 2, 0x32e23287)
+ 542 bytes
  [1] multiarray.so:gentype_nonzero_number~(0x305ef140, 0x2eaa6db0, 0,
0x32e73cfe) + 36 bytes
  [2] Python:PyObject_IsTrue~(0x305ef140, 0x2ea95de0, 2, 0) + 63 bytes
  [3] Python:PyEval_EvalFrameEx~(0x33c95160, 0, 0x332941e0, 0) + 12598 bytes
  [4] Python:PyEval_EvalFrameEx~(0x32a70f80, 0, 0x332941e0, 0) + 24217 bytes
  [5] Python:PyEval_EvalCodeEx~(0x33f05068, 0x332941e0, 0, 0x32a96794)
+ 1819 bytes
  [6] Python:PyEval_EvalFrameEx~(0x32a96640, 0, 0x332941e0, 0) + 16561 bytes
  [7] Python:PyEval_EvalCodeEx~(0x33f051d0, 0x332941e0, 0, 0x32a6fcc0)
+ 1819 bytes
  [8] Python:PyEval_EvalFrameEx~(0x32a6fb60, 0, 0x33bc99c0, 0) + 16561 bytes
  [9] Python:PyEval_EvalCodeEx~(0x334abda0, 0x33bc99c0, 0, 0x32ad3b24)
+ 1819 bytes
  [10] Python:PyEval_EvalFrameEx~(0x32ad39d0, 0, 0x33bc94b0, 0) + 16561 bytes
  [11] Python:PyEval_EvalCodeEx~(0x332919b0, 0x33bc94b0, 0,
0x33ce5294) + 1819 bytes
  [12] Python:PyEval_EvalFrameEx~(0x33ce5150, 0, 0x33bc94b0, 0) + 16561 bytes
  [13] Python:PyEval_EvalCodeEx~(0x332918d8, 0x33bc94b0, 0,
0x34d5156c) + 1819 bytes
  [14] Python:PyEval_EvalFrameEx~(0x34d51430, 0, 0x33bc9300,
0x33bc9300) + 16561 bytes
  [15] Python:PyEval_EvalCodeEx~(0x33291968, 0x33bc9300, 0x33bc9300,
0) + 1819 bytes
  [16] Python:PyEval_EvalCode~(0x33291968, 0x33bc9300, 0x33bc9300,
0x9325378f) + 87 bytes
  [17] Python:PyImport_ExecCodeModuleEx~(0xb097e4cb
numpy.core._internal, 0x33291968, 0xb097dbbf
/Library/Frameworks/Python.frame.., 0x332a2000) + 193 bytes
  [18] Python:load_source_module~(1, 0, 0xb097e418 X™C†, 0xb097e41c)
+ 726 bytes
  [19] Python:import_submodule~(0xb097e4d6 _internal, 0x33d1ce1c
_internal, 9, 0x32e2cfc9) + 293 bytes
  [20] Python:load_next~(0xb097e4cb numpy.core._internal,
0xb097e8cc, 0xb097e578, 0x32e8b5e6) + 195 bytes
  [21] Python:import_module_level~(0x32ee4aa0 TD, 0x,
0xee9e70b3, 0x32e6df4d «E) + 142 bytes
  [22] Python:PyImport_ImportModuleLevel~(0x33d1ce1c _internal,
0x33b25150, 0x33b25150, 0x32ee4aa0 TD) + 45 bytes
  [23] Python:builtin___import__~(0, 0x348854e0, 0, 0x32e73cfe) + 156 bytes
  [24] Python:PyObject_Call~(0x33da9a08, 0x348854e0, 0, 0x32e233cd) + 45 bytes
  [25] Python:PyEval_CallObjectWithKeywords~(0x33da9a08, 0x348854e0,
0, 0x33b25150) + 112 bytes
  [26] Python:PyEval_EvalFrameEx~(0x33cd2da0, 0, 0x33b25150,
0x33b25150) + 8138 bytes
  [27] Python:PyEval_EvalCodeEx~(0x2068, 0x33b25150, 0x33b25150,
0) + 1819 bytes
  [28] Python:PyEval_EvalCode~(0x2068, 0x33b25150, 0x33b25150,
0x9325378f) + 87 bytes
  [29] Python:PyImport_ExecCodeModuleEx~(0xb097fadb numpy.core,
0x2068, 0xb097ed7f /Library/Frameworks/Python.frame..,
0x4adc73cc) + 193 bytes
  [30] Python:load_source_module~(1, 0, 0xb097f5cc, 0) + 726 bytes
  [31] Python:load_package~(5, 0, 0xb097fa28, 0xb097fa2c) + 427 bytes
  [32] Python:import_submodule~(0xb097fae1 core

Re: [Numpy-discussion] bus error in embedded numpy

2009-11-13 Thread Robin
I forgot to add that it doesn't happen if Py_Finalize isn't called - if I
take that out and just let the OS/matlab unload the mex function then
I can run it many times without the error, but it does leak a bit of
memory each time.

Cheers

Robin

On Fri, Nov 13, 2009 at 5:23 PM, Robin robi...@gmail.com wrote:
 Hi,

 I'm trying to embed Python in a MATLAB mex file. I've been coming
 under some pressure to make my Python code available to my MATLAB
 colleagues so I am trying to come up with a relatively general way of
 calling numerical python code from Matlab.

 I am making some progress - but get a reliable crash from numpy. This
 only occurs the second time I am loading it. So I Py_Initialize,
 import numpy, Py_Finalize (all works fine), but then if I clear the
 mex file (clear funcname in matlab - which calls Py_Finalize through a
 mexAtExit handler) and try to run the function again (which will
 reinitialize the interpreter and import numpy again) I get the following
 stack trace from multiarray.so. Wondered if anyone could throw any
 light. I have already run into this bug
 http://bugs.python.org/issue6869 which prevents me using ctypes... I
 wondered if this was related.

 For now its not such a big problem - I will just avoid unloading the
 mex function (with clear function). But I thought it might be
 indicative of a memory leak or some other problem since I think in
 theory it should work (It does if numpy isn't imported).

 Cheers

 Robin

 
             Bus error detected at Fri Nov 13 17:11:57 2009
 

 Configuration:
  MATLAB Version:   7.8.0.347 (R2009a)
  MATLAB License:   161051
  Operating System: Darwin 10.0.0 Darwin Kernel Version 10.0.0: Fri
 Jul 31 22:47:34 PDT 2009; root:xnu-1456.1.25~1/RELEASE_I386 i386
  Window System:    The X.Org Foundation (10402000), display
 /tmp/launch-2p1ZWg/:0
  Current Visual:   0x24 (class 4, depth 24)
  Processor ID:     x86 Family 6 Model 15 Stepping 10, GenuineIntel
  Virtual Machine:  Java 1.6.0_15-b03-219 with Apple Inc. Java
 HotSpot(TM) Client VM mixed mode
  Default Encoding:  ISO-8859-1

 Fault Count: 1

 Register State:
  eax = 0001  ebx = 307b12ab
  ecx =   edx = 305ef148
  esi = 305ef140  edi = 32a79d60
  ebp = b097c6c8  esp = b097c670
  eip = 307b14be  flg = 00010202

 Stack Trace:
  [0] multiarray.so:PyArray_FromScalar~(0x305ef140, 0, 2, 0x32e23287)
 + 542 bytes
  [1] multiarray.so:gentype_nonzero_number~(0x305ef140, 0x2eaa6db0, 0,
 0x32e73cfe) + 36 bytes
  [2] Python:PyObject_IsTrue~(0x305ef140, 0x2ea95de0, 2, 0) + 63 bytes
  [3] Python:PyEval_EvalFrameEx~(0x33c95160, 0, 0x332941e0, 0) + 12598 bytes
  [4] Python:PyEval_EvalFrameEx~(0x32a70f80, 0, 0x332941e0, 0) + 24217 bytes
  [5] Python:PyEval_EvalCodeEx~(0x33f05068, 0x332941e0, 0, 0x32a96794)
 + 1819 bytes
  [6] Python:PyEval_EvalFrameEx~(0x32a96640, 0, 0x332941e0, 0) + 16561 bytes
  [7] Python:PyEval_EvalCodeEx~(0x33f051d0, 0x332941e0, 0, 0x32a6fcc0)
 + 1819 bytes
  [8] Python:PyEval_EvalFrameEx~(0x32a6fb60, 0, 0x33bc99c0, 0) + 16561 bytes
  [9] Python:PyEval_EvalCodeEx~(0x334abda0, 0x33bc99c0, 0, 0x32ad3b24)
 + 1819 bytes
  [10] Python:PyEval_EvalFrameEx~(0x32ad39d0, 0, 0x33bc94b0, 0) + 16561 bytes
  [11] Python:PyEval_EvalCodeEx~(0x332919b0, 0x33bc94b0, 0,
 0x33ce5294) + 1819 bytes
  [12] Python:PyEval_EvalFrameEx~(0x33ce5150, 0, 0x33bc94b0, 0) + 16561 bytes
  [13] Python:PyEval_EvalCodeEx~(0x332918d8, 0x33bc94b0, 0,
 0x34d5156c) + 1819 bytes
  [14] Python:PyEval_EvalFrameEx~(0x34d51430, 0, 0x33bc9300,
 0x33bc9300) + 16561 bytes
  [15] Python:PyEval_EvalCodeEx~(0x33291968, 0x33bc9300, 0x33bc9300,
 0) + 1819 bytes
  [16] Python:PyEval_EvalCode~(0x33291968, 0x33bc9300, 0x33bc9300,
 0x9325378f) + 87 bytes
  [17] Python:PyImport_ExecCodeModuleEx~(0xb097e4cb
 numpy.core._internal, 0x33291968, 0xb097dbbf
 /Library/Frameworks/Python.frame.., 0x332a2000) + 193 bytes
  [18] Python:load_source_module~(1, 0, 0xb097e418 X™C†, 0xb097e41c)
 + 726 bytes
  [19] Python:import_submodule~(0xb097e4d6 _internal, 0x33d1ce1c
 _internal, 9, 0x32e2cfc9) + 293 bytes
  [20] Python:load_next~(0xb097e4cb numpy.core._internal,
 0xb097e8cc, 0xb097e578, 0x32e8b5e6) + 195 bytes
  [21] Python:import_module_level~(0x32ee4aa0 TD, 0x,
 0xee9e70b3, 0x32e6df4d «E) + 142 bytes
  [22] Python:PyImport_ImportModuleLevel~(0x33d1ce1c _internal,
 0x33b25150, 0x33b25150, 0x32ee4aa0 TD) + 45 bytes
  [23] Python:builtin___import__~(0, 0x348854e0, 0, 0x32e73cfe) + 156 bytes
  [24] Python:PyObject_Call~(0x33da9a08, 0x348854e0, 0, 0x32e233cd) + 45 bytes
  [25] Python:PyEval_CallObjectWithKeywords~(0x33da9a08, 0x348854e0,
 0, 0x33b25150) + 112 bytes
  [26] Python:PyEval_EvalFrameEx~(0x33cd2da0, 0, 0x33b25150,
 0x33b25150) + 8138 bytes
  [27] Python:PyEval_EvalCodeEx~(0x2068, 0x33b25150, 0x33b25150,
 0) + 1819 bytes
  [28] Python:PyEval_EvalCode~(0x2068, 0x33b25150, 0x33b25150

Re: [Numpy-discussion] bus error in embedded numpy

2009-11-13 Thread Robin
 I'm trying to embed Python in a MATLAB mex file. I've been coming
 under some pressure to make my Python code available to my MATLAB
 colleagues so I am trying to come up with a relatively general way of
 calling numerical python code from Matlab.

I get a similar but different error trying to do the same with scipy -
(importing scipy after a Py_Finalize and matlab function clear) - this
time in umath.so:


 Bus error detected at Fri Nov 13 17:53:43 2009


Configuration:
  MATLAB Version:   7.8.0.347 (R2009a)
  MATLAB License:   161051
  Operating System: Darwin 10.0.0 Darwin Kernel Version 10.0.0: Fri
Jul 31 22:47:34 PDT 2009; root:xnu-1456.1.25~1/RELEASE_I386 i386
  Window System:The X.Org Foundation (10402000), display
/tmp/launch-2p1ZWg/:0
  Current Visual:   0x24 (class 4, depth 24)
  Processor ID: x86 Family 6 Model 15 Stepping 10, GenuineIntel
  Virtual Machine:  Java 1.6.0_15-b03-219 with Apple Inc. Java
HotSpot(TM) Client VM mixed mode
  Default Encoding:  ISO-8859-1

Fault Count: 1

Register State:
  eax = 33028046  ebx = 327464cb
  ecx = 3382b110  edx = 330180b0
  esi = 304d6ac0  edi = 336e6660
  ebp = b08fa8f8  esp = b08fa8cc
  eip = 23810006  flg = 00010a92

Stack Trace:
  [0] 0x23810006(0x336e6660, 0x3352e415 remainder, 0x32824d74, 0)
  [1] umath.so:initumath~(0xb08faebb numpy.core.umath, 0xb08faec6
umath, 0xb08faa07 /Library/Frameworks/Python.frame.., 0xa043aab0)
+ 2152 bytes
  [2] Python:_PyImport_LoadDynamicModule~(0xb08faebb
numpy.core.umath, 0xb08faa07 /Library/Frameworks/Python.frame..,
0xa043aab0, 0x331ae7d3) + 153 bytes
  [3] Python:load_module~(3, 0, 0xb08fae08 ∞™C†, 0xb08fae0c) + 203 bytes
  [4] Python:import_submodule~(0xb08faec6 umath, 0x326eecd4 umath,
5, 0x33148fc9) + 293 bytes
  [5] Python:load_next~(0xb08faebb numpy.core.umath, 0xb08fb2bc,
0xb08faf68, 0x331a75e6) + 195 bytes
  [6] Python:import_module_level~(0x33200aa0, 0x, 0xee9e70b3,
0x33189f4d «E) + 142 bytes
  [7] Python:PyImport_ImportModuleLevel~(0x326eecd4 umath,
0x336340c0, 0x336340c0, 0x33200aa0) + 45 bytes
  [8] Python:builtin___import__~(0, 0x33845ea0, 0, 0x3318fcfe) + 156 bytes
  [9] Python:PyObject_Call~(0x326ba7b0, 0x33845ea0, 0, 0x3313f3cd) + 45 bytes
  [10] Python:PyEval_CallObjectWithKeywords~(0x326ba7b0, 0x33845ea0,
0, 0x336340c0) + 112 bytes
  [11] Python:PyEval_EvalFrameEx~(0x304d5fd0, 0, 0x336340c0,
0x336340c0) + 8138 bytes
  [12] Python:PyEval_EvalCodeEx~(0x30386e30, 0x336340c0, 0x336340c0,
0) + 1819 bytes
  [13] Python:PyEval_EvalCode~(0x30386e30, 0x336340c0, 0x336340c0,
0x9325378f) + 87 bytes
  [14] Python:PyImport_ExecCodeModuleEx~(0xb08fc4cb numpy.core,
0x30386e30, 0xb08fb76f /Library/Frameworks/Python.frame..,
0x4adc73cc) + 193 bytes
  [15] Python:load_source_module~(1, 0, 0xb08fbfbc X™C†0D80†370, 0)
+ 726 bytes
  [16] Python:load_package~(5, 0, 0xb08fc418, 0xb08fc41c) + 427 bytes
  [17] Python:import_submodule~(0xb08fc4d1 core, 0x3384512a
core.numeric, 4, 0x33148fc9) + 293 bytes
  [18] Python:load_next~(0xb08fc4cb numpy.core, 0xb08fc8cc, 9,
0x331a75e6) + 195 bytes
  [19] Python:import_module_level~(0x33200aa0, 0x, 0x746e6920,
0x33189f4d «E) + 213 bytes
  [20] Python:PyImport_ImportModuleLevel~(0x33845124
numpy.core.numeric, 0x33634ae0, 0x33634ae0, 0x33200aa0) + 45 bytes
  [21] Python:builtin___import__~(0, 0x33845f30, 0, 0x3318fcfe) + 156 bytes
  [22] Python:PyObject_Call~(0x326ba7b0, 0x33845f30, 0, 0x3313f3cd) + 45 bytes
  [23] Python:PyEval_CallObjectWithKeywords~(0x326ba7b0, 0x33845f30,
0, 0x33634ae0) + 112 bytes
  [24] Python:PyEval_EvalFrameEx~(0x304caf70, 0, 0x33634ae0,
0x33634ae0) + 8138 bytes
  [25] Python:PyEval_EvalCodeEx~(0x30386e78, 0x33634ae0, 0x33634ae0,
0) + 1819 bytes
  [26] Python:PyEval_EvalCode~(0x30386e78, 0x33634ae0, 0x33634ae0,
0x9325378f) + 87 bytes
  [27] Python:PyImport_ExecCodeModuleEx~(0xb08fd68b
numpy.lib.type_check, 0x30386e78, 0xb08fcd7f
/Library/Frameworks/Python.frame.., 6771) + 193 bytes
  [28] Python:load_source_module~(1, 0, 0xb08fd5d8, 0xb08fd5dc) + 726 bytes
  [29] Python:import_submodule~(0xb08fd695 type_check, 0x3038a494
type_check, 10, 0x33148fc9) + 293 bytes
  [30] Python:load_next~(0xb08fd68b numpy.lib.type_check,
0xb08fda8c, 0xb08fd738, 0x331a75e6) + 195 bytes
  [31] Python:import_module_level~(0x32703e30, 0x, 0xee9e70b3,
0x33189f4d «E) + 142 bytes
  [32] Python:PyImport_ImportModuleLevel~(0x3038a494 type_check,
0x33634150, 0x33634150, 0x32703e30) + 45 bytes
  [33] Python:builtin___import__~(0, 0x33845540, 0, 0x3318fcfe) + 156 bytes
  [34] Python:PyObject_Call~(0x326ba7b0, 0x33845540, 0, 0x3313f3cd) + 45 bytes
  [35] Python:PyEval_CallObjectWithKeywords~(0x326ba7b0, 0x33845540,
0, 0x33634150) + 112 bytes
  [36] Python:PyEval_EvalFrameEx~(0x304b4fd0, 0, 0x33634150,
0x33634150) + 8138 bytes
  [37] Python:PyEval_EvalCodeEx~(0x32727bf0, 0x33634150, 0x33634150,
0) + 1819 

Re: [Numpy-discussion] bus error in embedded numpy

2009-11-13 Thread Robin
On Fri, Nov 13, 2009 at 6:13 PM, Robert Kern robert.k...@gmail.com wrote:
 On Fri, Nov 13, 2009 at 12:05, Travis Oliphant oliph...@enthought.com wrote:
 I wonder if this is related to the fact that you can't unload a
 dynamically linked module (like NumPy has).   So, when you call
 Py_Finalize you are not really finalizing your usage of Python
 extension modules.

 I'm not sure though.

 Right. We do some global things when numpy is imported. Since there is
 no unload step for extension modules, we can't undo them. The second
 time the interpreter starts up, it doesn't know that numpy has already
 been loaded and that numpy shouldn't try to do those global things
 again.

Thanks for the quick responses...

I'm sure you're right - in fact it looks like there was a similar issue here:
http://www.mathworks.co.uk/matlabcentral/newsreader/view_thread/262089

I had assumed when matlab unloads the mex function it would also
unload python - but it looks like other dynamic libs pulled in from
the mex function (in this case python and in turn numpy) aren't
unloaded...

I wonder if it would be possible to link python statically to my mex
function, so it really is unloaded when the mex function is... but I'm
getting a bit out of my depth with linker options, and I guess numpy
is always loaded dynamically anyway and will stick around.

Easy enough to work around it anyway - but just wanted to check what
was going on.

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bus error in embedded numpy

2009-11-13 Thread Robin
On Fri, Nov 13, 2009 at 6:50 PM, Pauli Virtanen pav...@iki.fi wrote:
 Fri, 13 Nov 2009 17:23:19 +, Robin wrote:
 I'm trying to embed Python in a MATLAB mex file. I've been coming under
 some pressure to make my Python code available to my MATLAB colleagues
 so I am trying to come up with a relatively general way of calling
 numerical python code from Matlab.

 Out of curiosity, how are you handling the Matlab RTLD_GLOBAL issue. Last
 time [1] I did something similar, I had to hack around it in an ugly way.

 .. [1] http://www.iki.fi/pav/software/pythoncall/index.html

Actually I was completely unaware of it (the RTLD_GLOBAL thing). I had
googled for a while trying to find a similar project (I had assumed
someone must have done it) but somehow didn't find your pythoncall
project. It's great - the interface is very close to what I had in
mind, but it's much more comprehensive than anything I thought of (I
was hoping to handle just contiguous numpy arrays).

How does the RTLD_GLOBAL thing manifest itself? So far I have only a
very basic start which basically consists of:
cmd = mxArrayToString(prhs[0]);
PyRun_SimpleString(cmd);
but I haven't noticed anything not working - I can run the numpy
test suite, and do simple commands as expected (initialise arrays in
the interpreter, run numpy functions on them). Perhaps recent versions
of Matlab behave differently (I am using R2009a on a mac).

So far the only remotely tricky thing I did was redirect sys.stdout
and sys.stderr to a wrapper that uses mexPrintf so output goes to the
matlab console.

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bus error in embedded numpy

2009-11-13 Thread Robin
On Fri, Nov 13, 2009 at 7:48 PM, Sturla Molden stu...@molden.no wrote:
 Robin wrote:
 I had assumed when matlab unloads the mex function it would also
 unload python - but it looks like other dynamic libs pulled in from
 the mex function (in this case python and in turn numpy) aren't
 unloaded...

 Matlab MEX functions are DLLs, Python interpreter is a DLL, NumPy .pyd
 files are DLLs... If you cannot unload Python or NumPy, you cannot
 unload MEX functions either. The same OS constraints apply.

Ah, I hadn't realised it was an OS constraint - I thought it was
possible to unload dlls - and that was why matlab provides the clear
function. mex automatically clears a function when you rebuild it - I
thought that was how you can rebuild and reload mex functions without
restarting matlab. I thought it was just a quirk of matlab that there
was no way to unload other libraries pulled in through the mex
function. I must admit to being a bit out of my depth with this stuff
though so I stand corrected.

 If you are using Windows, I would recommend that you expose your NumPy
 code as a COM object using pywin32, and make a Matlab wrapper (ordinary
 .m file) that calls the COM object. If you are not using Windows, you
 could create an XMLRPC server in Python and call that using Matlab's
 built-in Java VM.

 Or you can spawn a separate Python process from Matlab, and communicate
 using pipes or a tcp/ip socket (Matlab's Java or MEX).

 There are many solutions that are easier than embedding Python in Matlab.

I really want a cross platform solution so that rules out COM. I
thought about using web services or something which I guess would be
the easiest way to do communication through a tcp socket (least work
since I could use existing web services libraries on both sides). But
actually I have found it pretty easy to embed Python so far... with
about 5 lines of code I'm able to run simple strings, and with a small
cython module I get stdout to the console. I haven't tried passing
data back and forth yet but from Pauli's pythoncall it doesn't look
like that is too difficult. I was sort of hoping that eventually I
could create a numpy array from a matlab data pointer - at least for
read only input - so that I could use python code to work on large
amounts of data without so much memory overhead (returning smaller
results by copying).

Do you think I'm likely to run into more problems? I got the feeling
from asking questions on IRC that embedding Python is kind of
discouraged but I thought in this case it seemed the neatest way.

 Be careful when you are using file handles. You have to be sure that
 Matlab, Python and NumPy are all linked against the same CRT.

I was aware of this - I thought I would be OK on the mac - at least
Python and Numpy and my mex function are built with apple gcc although
I'm not sure about Matlab. I guess Windows will be more difficult...
But in any case I don't plan to pass any file handles around.

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] f2py-users list not working

2009-11-04 Thread Robin
On Wed, Nov 4, 2009 at 2:38 PM, George Nurser gnur...@googlemail.com wrote:
 Fortran can accept preprocessor directives, but f2py cannot.
 You first need to preprocess a .F (or .F90) file to create a .f (or
 .f90) file which you then pass to f2py
 The way I preprocess the .F file is to have statements like
 integer*INTSIZE :: i,j,k

 So preprocess file.F e.g. in gfortran with
 gfortran -E -DINTSIZE=8 file.F  -o outdir/file.f

 The outdir is necessary in a case-insensitive file system (like
 default mac OSX) to prevent the .f files overwriting the .F file.
 Alternatively, it may be possible to use some other suffix than .f,
 but I've not tried that.

 Then f2py file.f

That's great thanks very much! It's more or less exactly what I was hoping for.

I wonder if it's possible to get distutils to do the preprocess step
from a setup.py script?

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] strange performance on mac 2.5/2.6 32/64 bit

2009-11-03 Thread Robin
Hi,

I'm not sure if this is of much interest but it's been really puzzling
me so I thought I'd ask.

In an earlier post I described how I was surprised a simple f2py
wrapped fortran bincount was 4x faster than np.bincount - but that
differential only seemed to be on my mac; on moving to linux they both
took more or less the same time. I'm trying to work out if it is worth
moving some of my bottlenecks to fortran (most of which are np
builtins). So far it looks like it is - but only on my mac and only
32bit (see below).
Well, the only explanation I could think of was that the gcc-4.0 used to build
numpy on a mac didn't perform so well, so after upgrading to snow
leopard I've been trying to look at this again. I was hoping I could
get the equivalent performance on my mac, like on linux, which would
result in the np c stuff being a couple of times faster.

So far, with Python 2.6.3 in 64 bit - numpy seems to be significantly
slower and my fortran code _much_ slower - even from the same
compiler. Can anyone help me understand what is going on?

I have only been able to build 32 bit numpy against 2.5.4 with apple
gcc-4.0 and 64 bit numpy against 2.6.3 universal with gcc-4.2. I
haven't been able to get a numpy I can import on 2.6.3 in 32 bit mode
( http://projects.scipy.org/numpy/ticket/1221 ).

Here are the results for python.org 32 bit 2.5.4, numpy compiled with
apple gcc 4.0, f2py using att gfortran 4.2:
In [2]: timeit x = np.random.random_integers(0,1023,1).astype(int)
1 loops, best of 3: 2.86 s per loop
In [3]: x = np.random.random_integers(0,1023,1).astype(int)
In [4]: timeit np.bincount(x)
1 loops, best of 3: 435 ms per loop
In [6]: timeit gf42.bincount(x,1024)
10 loops, best of 3: 129 ms per loop
In [7]: np.__version__
Out[7]: '1.4.0.dev7618'

And for self-built (apple gcc 4.2) 64 bit 2.6.3, numpy compiled with
apple gcc 4.2, f2py using the same att gfortran 4.2:
In [3]: timeit x = np.random.random_integers(0,1023,1).astype(int)
1 loops, best of 3: 3.91 s per loop  # 37% slower than 32bit
In [4]: x = np.random.random_integers(0,1023,1).astype(int)
In [5]: timeit np.bincount(x)
1 loops, best of 3: 582 ms per loop # 34 % slower than 32 bit
In [8]: timeit gf42_64.bincount(x,1024)
1 loops, best of 3: 803 ms per loop # 522% slower than 32 bit


So why is there this big difference in performance? I'd really like to
know why the fortran compiled with the same compiler is so much slower
in 64 bit mode. As far as I can tell the flags used are the same. Also
why is numpy slower? I was surprised that I was able to import the 64
bit universal module built with f2py from 2.6 inside 32 bit 2.5 and
there it was quick again - so it seems the x86_64 code generated by
the fortran compiler is much slower (rather than any wrappers or
such).

I tried using some more recent gfortrans from macports - but could
only use them to build modules against the 64 bit python/numpy since I
couldn't find a way to get f2py to force 32 bit output. But the
performance was more or less the same (always several times slower than the
32 bit att gfortran).

Any advice appreciated.

Cheers

Robin


subroutine bincount (x,c,n,m)
implicit none
integer, intent(in) :: n,m
integer, dimension(0:n-1), intent(in) :: x
integer, dimension(0:m-1), intent(out) :: c
integer :: i

c = 0
do i = 0, n-1
c(x(i)) = c(x(i)) + 1
end do
end
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] strange performance on mac 2.5/2.6 32/64 bit

2009-11-03 Thread Robin
After some more pootling about I figured out a lot of the performance
loss comes from using 32 bit integers by default when compiling 64 bit.
I asked this question on stackoverflow:
http://stackoverflow.com/questions/1668899/fortran-32-bit-64-bit-performance-portability

is there any way to use fortran with f2py from python in a way that
doesn't require the code to be changed depending on platform?

Or should I just pack it all in and use weave?

Robin

On Tue, Nov 3, 2009 at 4:29 PM, Robin robi...@gmail.com wrote:
 Hi,

 I'm not sure if this is of much interest but it's been really puzzling
 me so I thought I'd ask.

 In an earlier post I described how I was surprised a simple f2py
 wrapped fortran bincount was 4x faster than np.bincount - but that
 differential only seemed to be on my mac; on moving to linux they both
 took more or less the same time. I'm trying to work out if it is worth
 moving some of my bottlenecks to fortran (most of which are np
 builtins). So far it looks like it is - but only on my mac and only
 32bit (see below).
 Well, the only explanation I could think of was that the gcc-4.0 used to build
 numpy on a mac didn't perform so well, so after upgrading to snow
 leopard I've been trying to look at this again. I was hoping I could
 get the equivalent performance on my mac, like on linux, which would
 result in the np c stuff being a couple of times faster.

 So far, with Python 2.6.3 in 64 bit - numpy seems to be significantly
 slower and my fortran code _much_ slower - even from the same
 compiler. Can anyone help me understand what is going on?

 I have only been able to build 32 bit numpy against 2.5.4 with apple
 gcc-4.0 and 64 bit numpy against 2.6.3 universal with gcc-4.2. I
 haven't been able to get a numpy I can import on 2.6.3 in 32 bit mode
 ( http://projects.scipy.org/numpy/ticket/1221 ).

 Here are the results for python.org 32 bit 2.5.4, numpy compiled with
 apple gcc 4.0, f2py using att gfortran 4.2:
 In [2]: timeit x = np.random.random_integers(0,1023,1).astype(int)
 1 loops, best of 3: 2.86 s per loop
 In [3]: x = np.random.random_integers(0,1023,1).astype(int)
 In [4]: timeit np.bincount(x)
 1 loops, best of 3: 435 ms per loop
 In [6]: timeit gf42.bincount(x,1024)
 10 loops, best of 3: 129 ms per loop
 In [7]: np.__version__
 Out[7]: '1.4.0.dev7618'

 And for self-built (apple gcc 4.2) 64 bit 2.6.3, numpy compiled with
 apple gcc 4.2, f2py using the same att gfortran 4.2:
 In [3]: timeit x = np.random.random_integers(0,1023,1).astype(int)
 1 loops, best of 3: 3.91 s per loop  # 37% slower than 32bit
 In [4]: x = np.random.random_integers(0,1023,1).astype(int)
 In [5]: timeit np.bincount(x)
 1 loops, best of 3: 582 ms per loop # 34 % slower than 32 bit
 In [8]: timeit gf42_64.bincount(x,1024)
 1 loops, best of 3: 803 ms per loop # 522% slower than 32 bit


 So why is there this big difference in performance? I'd really like to
 know why the fortran compiled with the same compiler is so much slower
 in 64 bit mode. As far as I can tell the flags used are the same. Also
 why is numpy slower? I was surprised that I was able to import the 64
 bit universal module built with f2py from 2.6 inside 32 bit 2.5 and
 there it was quick again - so it seems the x86_64 code generated by
 the fortran compiler is much slower (rather than any wrappers or
 such).

 I tried using some more recent gfortrans from macports - but could
 only use them to build modules against the 64 bit python/numpy since I
 couldn't find a way to get f2py to force 32 bit output. But the
 performance was more or less the same (always several times slower than the
 32 bit att gfortran).

 Any advice appreciated.

 Cheers

 Robin

 
 subroutine bincount (x,c,n,m)
    implicit none
    integer, intent(in) :: n,m
    integer, dimension(0:n-1), intent(in) :: x
    integer, dimension(0:m-1), intent(out) :: c
    integer :: i

    c = 0
    do i = 0, n-1
        c(x(i)) = c(x(i)) + 1
    end do
 end

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] strange performance on mac 2.5/2.6 32/64 bit

2009-11-03 Thread Robin
On Tue, Nov 3, 2009 at 6:14 PM, Robin robi...@gmail.com wrote:
 After some more pootling about I figured out a lot of the performance
 loss comes from using 32 bit integers by default when compiling 64 bit.
 I asked this question on stackoverflow:
 http://stackoverflow.com/questions/1668899/fortran-32-bit-64-bit-performance-portability

 is there any way to use fortran with f2py from python in a way that
 doesn't require the code to be changed depending on platform?

Including the -DF2PY_REPORT_ON_ARRAY_COPY option showed that the big
performance hit was from f2py copying the arrays to cast from 64 bit
to 32 bit.

Is there a recommended way to easily write fortran extensions that
work on both 32bit and 64bit machines (something like using
-fdefault-int-8 and f2py not casting on a 64 bit platform, and not
using the option and not casting on a 32 bit platform)?

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] f2py error with -DF2PY_REPORT_ATEXIT

2009-11-03 Thread Robin
Hi,

When I try to build a fortran module with f2py from '1.4.0.dev7618'
with gfortran 4.2.3 from att.com and apple gcc 4.2 on snow leopard I
get the following error when I try to use -DF2PY_REPORT_ATEXIT:

In [1]: import gfint
---
ImportError   Traceback (most recent call last)

/Users/robince/fortran/<ipython console> in <module>()

ImportError: dlopen(./gfint.so, 2): Symbol not found: _on_exit
  Referenced from: /Users/robince/fortran/gfint.so
  Expected in: dynamic lookup

Is numpy trac the place for f2py tickets? Or is there an obvious fix?
Inspecting gfint.so shows the same symbols for both architectures, and
_on_exit is listed there with a U which I guess means undefined.

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] f2py error with -DF2PY_REPORT_ATEXIT

2009-11-03 Thread Robin
Just noticed this is only supported on linux - sorry for the noise
(not having a very good day today!)

Robin

On Tue, Nov 3, 2009 at 6:47 PM, Robin robi...@gmail.com wrote:
 Hi,

 When I try to build a fortran module with f2py from '1.4.0.dev7618'
 with gfortran 4.2.3 from att.com and apple gcc 4.2 on snow leopard I
 get the following error when I try to use -DF2PY_REPORT_ATEXIT:

 In [1]: import gfint
 ---
 ImportError                               Traceback (most recent call last)

 /Users/robince/fortran/<ipython console> in <module>()

 ImportError: dlopen(./gfint.so, 2): Symbol not found: _on_exit
  Referenced from: /Users/robince/fortran/gfint.so
  Expected in: dynamic lookup

 Is numpy trac the place for f2py tickets? Or is there an obvious fix?
 Inspecting gfint.so shows the same symbols for both architectures, and
 _on_exit is listed there with a U which I guess means undefined.

 Cheers

 Robin

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] f2py-users list not working

2009-11-03 Thread Robin
I just tried to send the message below to f2py-users -
f2py-us...@cens.ioc.ee, but delivery failed.

Not sure where else to report this so hopefully here is ok.

Cheers

Robin


-- Forwarded message --
From: Mail Delivery Subsystem mailer-dae...@googlemail.com
Date: Tue, Nov 3, 2009 at 9:40 PM
Subject: Delivery Status Notification (Failure)
To: robi...@gmail.com


Delivery to the following recipient failed permanently:

    f2py-us...@cens.ioc.ee

Technical details of permanent failure:
Google tried to deliver your message, but it was rejected by the
recipient domain. We recommend contacting the other email provider for
further information about the cause of this error. The error that the
other server returned was: 550 550 Unrouteable address (state 14).

- Original message -
Subject: writing module for 32/64 bit
From: Robin robi...@gmail.com
To: f2py-us...@cens.ioc.ee
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Hi,

I would like to write some subroutines in fortran involving integers
and distribute them in a small package.  Users might be on 32 bit or
64 bit.

Is there an easy or recommended way to approach this? Ideally I would
like to work with the native integer type - but if I use 'integer' in
fortran it is always 32 bit and f2py converts input aways when numpy
is 64 bit. If I use integer*8 in the code then its fine for 64 bit,
but on 32 bit platforms, both f2py has to convert and its not the
native integer size.

What I (think) I'd like to do is be able to use the native platform
integer type, like numpy does, and then not worry about it. I found there
are options like -fdefault-int-8 for gfortran, but when I add that
stuff breaks (bus error) since I guess f2py doesn't know about it and
is converting and passing 32 bits anyway.

Is there any way around this, or do I need to maintain 2 different
versions with different fortran code for 32bit/64bit? Or is it
possible to achieve something like this with preprocessor #ifdefs? (not
sure how this works with fortran, or if f2py would be aware of it).

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] f2py-users list not working

2009-11-03 Thread Robin
On Tue, Nov 3, 2009 at 10:05 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
 I believe f2py development has moved elsewhere, the new mailing list seems
 to be at http://groups.google.com/group/f2py-dev?pli=1 . It doesn't look to
 be very active. Perhaps the summer additions to cython adding some fortran
 support might be useful to you. I don't know much about that but perhaps Dag
 will offer some advice.

Thanks - from the archives of f2py-users
http://cens.ioc.ee/pipermail/f2py-users/ it looks like there was
activity as recently as October (15th) which is why I assumed it was
still supposed to be working. Certainly more traffic more recently
than the google group.

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] recommended way to run numpy on snow leopard

2009-10-29 Thread Robin
On Fri, Oct 23, 2009 at 9:09 AM, David Warde-Farley d...@cs.toronto.eduwrote:

 The Python.org sources for 2.6.x has a script in the Mac/ subdirectory
 (I think, or in the build tools) for building a 4-way universal binary
 (i386, x86_64, ppc and ppc64). You can rather easily build it (just
 run the script) and it will produce executables of the form python (or
 python2.6) suffixed with -32 or -64 to run in one mode or the other.
 So, python-32 (or python2.6-32) will get you 32 bit Python, which will
 work with wxPython using wxMac, or python-64, which will not (but will
 do everything in 64-bit mode). I've successfully gotten svn numpy to
 build 4-way using such a 4-way Python.


After having some trouble I decided to try this way of building a universal
32/64 bit intel framework build and just use that as my main python for my
work. (Had some problems with macports and virtualenv; I want to leave the
system one alone and there's no 64 bit python.org build.)

Just in case anyone else tries this - there is a problem where it's
impossible to select the 32 bit architecture:
http://bugs.python.org/issue6834
It might be possible to work around or use the alternative pythonw.c in the
ticket - but it won't be fixed in a release until 2.7.

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] recommended way to run numpy on snow leopard

2009-10-21 Thread Robin
Hi,

I was wondering what the recommended way to run numpy/scipy on mac os
x 10.6 is. I understood previously it was recommended to use
python.org python and keep everything seperate from the system python,
which worked well. But now I would like to have a 64 bit python and
numpy, and there isn't one available from python.org. Also I think the
python.org ones are built against the 10.4 SDK which I understand
requires using gcc 4.0 - I was keen to try 4.2 to see if some of the
differential performance I've seen between c and fortran goes away.

So I guess the choices are system python with a virtualenv or a
macports built python (or a hand built python).

I'm thinking of using macports at the moment but I'm not sure how to
prevent the macports numpy from installing so I can use svn
numpy. I'm not sure how the virtualenv will work with packaged
installers - (ie how could I tell the wx installer to install into the
virtualenv).

I was wondering what others do?

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] recommended way to run numpy on snow leopard

2009-10-21 Thread Robin
On Wed, Oct 21, 2009 at 10:28 AM, David Cournapeau
da...@ar.media.kyoto-u.ac.jp wrote:
 Robin wrote:
 Hi,

 I was wondering what the recommended way to run numpy/scipy on mac os
 x 10.6 is. I understood previously it was recommended to use
 python.org python and keep everything seperate from the system python,
 which worked well.

 You can simply use the --user option to the install command: instead of
 installing in /System, it will install numpy (or any other package) in
 $HOME/.local, and you don't need to update PYTHONPATH, as python knows
 about this location.

Thanks - that looks ideal. I take it $HOME/.local is searched first so
numpy will be used from there in preference to the system numpy.

My only worry is with installer packages - I'm thinking mainly of
wxpython. Is there a way I can get that package to install in
$HOME/.local? (The installer only seems to let you choose a drive).
Also - if I build for example vim against the system python, will I be
able to see packages in $HOME/.local from the python interpreter
inside vim?

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] recommended way to run numpy on snow leopard

2009-10-21 Thread Robin
Thanks...

On Wed, Oct 21, 2009 at 11:41 AM, David Cournapeau
da...@ar.media.kyoto-u.ac.jp wrote:
 Robin wrote:

 Thanks - that looks ideal. I take it $HOME/.local is searched first so
 numpy will be used from there in preference to the system numpy.


 Yes, unless framework-enabled python does something 'fishy' (I think
 framework vs convention python have different rules w.r.t. sys.path). As
 always, in doubt, you should check with numpy.__file__ which one is loaded.

It seems it does... the built-in numpy, which is in
'/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python',
comes before $HOME/.local in sys.path, so I think the system numpy will
always be picked up over my own installed version.

robin-mbp:~ robince$ /usr/bin/python2.6 -c "import sys; print sys.path"
['', 
'/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python26.zip',
'/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6',
'/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/plat-darwin',
'/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/plat-mac',
'/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/plat-mac/lib-scriptpackages',
'/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python',
'/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-tk',
'/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-old',
'/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload',
'/Users/robince/.local/lib/python2.6/site-packages',
'/Library/Python/2.6/site-packages',
'/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/PyObjC',
'/System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/wx-2.8-mac-unicode']

So I guess virtualenv or macports?

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] fortran vs numpy on mac/linux - gcc performance?

2009-10-19 Thread Robin
Hi,

I have been looking at moving some of my bottleneck functions to
fortran with f2py. To get started I tried some simple things, and was
surprised they performed so much better than the numpy builtins -
which I assumed would be C and would be quite fast.

On my Macbook pro laptop (Intel core 2 duo) I got the following
results. Numpy is built with xcode gcc 4.0.1 and gfortran is 4.2.3 -
fortran code for shuffle and bincount below:
In [1]: x = np.random.random_integers(0,1023,100).astype(int)
In [2]: import ftest
In [3]: timeit np.bincount(x)
100 loops, best of 3: 3.97 ms per loop
In [4]: timeit ftest.bincount(x,1024)
1000 loops, best of 3: 1.15 ms per loop
In [5]: timeit np.random.shuffle(x)
1 loops, best of 3: 605 ms per loop
In [6]: timeit ftest.shuffle(x)
10 loops, best of 3: 139 ms per loop

So fortran was about 4 times faster for these loops - similarly faster
than cython as well. So I was really happy as these are two of my
biggest bottlenecks, but when I moved to a linux workstation I got
different results. Here with gcc/gfortran 4.3.3 :
In [3]: x = np.random.random_integers(0,1023,100).astype(int)
In [4]: timeit np.bincount(x)
100 loops, best of 3: 8.18 ms per loop
In [5]: timeit ftest.bincount(x,1024)
100 loops, best of 3: 8.25 ms per loop
In [6]:
In [7]: timeit np.random.shuffle(x)
1 loops, best of 3: 379 ms per loop
In [8]: timeit ftest.shuffle(x)
10 loops, best of 3: 172 ms per loop

So shuffle is a bit faster, but bincount is now the same as fortran.
The only thing I can think is that it is due to much better
performance of the more recent C compiler. I think this would also
explain why the f2py extension was performing so much better than cython
on the mac.

So my question is -  is there a way to build numpy with a more recent
compiler on leopard? (I guess I could upgrade to snow leopard now) -
Could I make the numpy install use gcc-4.2 from xcode or would it
break stuff? Could I use gcc 4.3.3 from macports? It would be great to
get a 4x speed up on all numpy c loops! (already just these two
functions I use a lot would make a big difference).
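
(The first thing I will probably try - I believe plain distutils picks
up the CC environment variable, though I'm not sure numpy's build
honours it everywhere:)

CC=gcc-4.2 python setup.py build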

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] fortran vs numpy on mac/linux - gcc performance?

2009-10-19 Thread Robin
Forgot to include the fortran code used:

jm-g26b101:fortran robince$ cat test.f95
subroutine bincount (x,c,n,m)
implicit none
integer, intent(in) :: n,m
integer, dimension(0:n-1), intent(in) :: x
integer, dimension(0:m-1), intent(out) :: c
integer :: i

c = 0
do i = 0, n-1
    c(x(i)) = c(x(i)) + 1
end do
end


subroutine shuffle (x,s,n)
implicit none
integer, intent(in) :: n
integer, dimension(n), intent(in) :: x
integer, dimension(n), intent(out) :: s
integer :: i,randpos,temp
real :: r

! copy input
s = x
call init_random_seed()
! knuth shuffle from http://rosettacode.org/wiki/Knuth_shuffle#Fortran
do i = n, 2, -1
    call random_number(r)
    randpos = int(r * i) + 1
    temp = s(randpos)
    s(randpos) = s(i)
    s(i) = temp
end do
end


subroutine init_random_seed()
! init_random_seed from gfortran documentation
integer :: i, n, clock
integer, dimension(:), allocatable :: seed

call random_seed(size = n)
allocate(seed(n))

call system_clock(count=clock)

seed = clock + 37 * (/ (i - 1, i = 1, n) /)
call random_seed(put = seed)

deallocate(seed)
end subroutine
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] extension questions: f2py and cython

2009-10-15 Thread Robin
Hi,

Sent this last week but checking the archives it appears not to have
got through. Hopefully this will work...

I am looking at moving some of my code to fortran to use with f2py. To
get started I used this simple example:

SUBROUTINE bincount (x,c,n,m)
 IMPLICIT NONE
 INTEGER, INTENT(IN) :: n,m
 INTEGER, DIMENSION(n), INTENT(IN) :: x
 INTEGER, DIMENSION(0:m-1), INTENT(OUT) :: c
 INTEGER :: i

 DO i = 1, n
   c(x(i)) = c(x(i)) + 1
 END DO
END

It performs well:
In [1]: x = np.random.random_integers(0,1023,1000000).astype(int)
In [4]: timeit test.bincount(x,1024)
1000 loops, best of 3: 1.16 ms per loop
In [5]: timeit np.bincount(x)
100 loops, best of 3: 4 ms per loop

I'm guessing most of the benefit comes from less checking + not having
to find the maximum value (which I pass as parameter m).

But I have some questions. It seems to work as is, but I don't set c
to zeros anywhere. Can I assume arrays created by f2py are zero? Is
this the recommended way to use f2py with arrays? (I initially tried
using assumed-shape arrays with DIMENSION(:) but I couldn't get it to
work).
Also I'm quite new to fortran - what would be the advantages,
if any, of using a FORALL instead of DO in a situation like this?
I guess with 1D arrays it doesn't make a copy since ordering is not a
problem, but if it were a 2D array am I right in thinking that if I
passed in a C order array it would automatically make a copy to F
order. What about the return - will I get a numpy array in F order,
or will it automatically be copied to C order? (I guess I will see but
I haven't got onto trying that yet).
What if I wanted to keep all the array creation in numpy - ie call it
as the fortran subroutine bincount(x,c) and have c modified in place?
Should I be using !f2py comments? I wasn't clear if these are needed -
it seems to work as is but could they give any improvement?
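
For the in-place version I have in mind something like this from the
Python side - an untested sketch, assuming I change c to intent(inout)
in the Fortran and rebuild:

import numpy as np
import test   # the f2py-built module from above

x = np.random.random_integers(0, 1023, 1000000).astype(np.int32)
c = np.zeros(1024, dtype=np.int32)   # dtype must match the Fortran integer kind
test.bincount(x, c)                  # with intent(inout), c is filled in place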

For comparison I tried the same thing in cython - after a couple of
iterations with not typing things properly I ended up with:

import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)
def bincount(np.ndarray[np.int_t, ndim=1] x not None, int m):
    cdef int n = x.shape[0]
    cdef unsigned int i
    cdef np.ndarray[np.int_t, ndim=1] c = np.zeros(m, dtype=np.int)

    for i from 0 <= i < n:
        c[<unsigned int>x[i]] += 1

    return c

which now performs a bit better than np.bincount, but still
significantly slower than the fortran. Is this to be expected or am I
missing something in the cython?

In [14]: timeit ctest.bincount(x,1024)
100 loops, best of 3: 3.31 ms per loop

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] extension questions: f2py and cython

2009-10-15 Thread Robin
Hi,

I have another question about distributing a Python extension which
uses f2py wrapped code. Ideally I'd like to keep pure Python/Numpy
alternatives and just use fortran version if available - but I think
that should be OK.

I'm more worried about distributing binaries on Windows - I think on
Mac/Linux it would be ok to have a fortran compiler required and build
it - but on Windows I guess one should really distribute binaries.

What is the recommended (free) fortran 95 compiler for use with f2py
on windows (gfortran with cygwin?)
Is it possible to get f2py to build a static library on windows so I
can just distribute that? Or will I need to include library files from
the compiler?
How many different binary versions does one need to support common
recent windows setups? I guess I need a different binary for each
major python version and 32/64 bits (ie 2.5 32bit, 2.6 32bit, 2.5
64bit, 2.6 64bit). Is this right, or would different binaries be
required for XP, Vista, 7 etc. ?

Can anyone point me to a smallish Python package that includes fortran
code in this way that I could look to for inspiration?

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] extension questions: f2py and cython

2009-10-15 Thread Robin
Hi,

Thanks

On Thu, Oct 15, 2009 at 4:10 PM, Andrew Hawryluk hawr...@novachem.com wrote:
 But I have some questions. It seems to work as is, but I don't set c
 to
 zeros anywhere. Can I assume arrays created by f2py are zero?

 As I understand it, uninitialized variables in Fortran are
 compiler/system-dependent. Some compilers initialize values to zero,
 many leave the previous contents of the memory in place. It is safest to
 never use the value of an uninitialized variable.

But in this case I understood it was initialised or created by the
f2py wrapper first and then passed to the fortran subroutine - so I
wondered how f2py creates it (I think I traced it to
array_from_pyobj() but I couldn't really understand what it was doing
or whether it would always be zeros). I guess as you say though it is
always safer to initialize explicitly.

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Not enough storage for memmap on 32 bit WinXP for accumulated file size above approx. 1 GB

2009-07-27 Thread Robin
On Mon, Jul 27, 2009 at 11:37 AM, Kim Hansenslaun...@gmail.com wrote:
 The machine is new and shiny with loads of processing power and many
 TB of HDD storage. I am however bound to 32 bits Win XP OS as there
 are some other custom-made third-party and very expensive applications
 running on that machine (which generate the large files I analyze),
 which can only run on 32 bits, oh well

You could think about using some kind of virtualisation - this is
exactly the sort of situation where I find it really useful.

You can run a 64 bit host OS, then have 32 bit XP as a 'guest' in
VMware or Virtualbox or some other virtualisation software. With
recent CPUs there is very little performance penalty (running 32bit
on a 64bit host) and it can be very convenient (it is easy to map
network drives between guest and host which perform very well since
the 'network' is virtual)

Cheers

Robin
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] performance matrix multiplication vs. matlab

2009-06-09 Thread Robin
On Mon, Jun 8, 2009 at 7:14 PM, David Warde-Farleyd...@cs.toronto.edu wrote:

 On 8-Jun-09, at 8:33 AM, Jason Rennie wrote:

 Note that EM can be very slow to converge:

 That's absolutely true, but EM for PCA can be a life saver in cases where
 diagonalizing (or even computing) the full covariance matrix is not a
 realistic option. Diagonalization can be a lot of wasted effort if all you
 care about are a few leading eigenvectors. EM also lets you deal with
 missing values in a principled way, which I don't think you can do with
 standard SVD.

 EM certainly isn't a magic bullet but there are circumstances where it's
 appropriate. I'm a big fan of the ECG paper too. :)

Hi,

I've been following this with interest... although I'm not really
familiar with the area. At the risk of drifting further off topic I
wondered if anyone could recommend an accessible review of these kind
of dimensionality reduction techniques... I am familiar with PCA and
know of diffusion maps and ICA and others, but I'd never heard of EM
and I don't really have any idea how they relate to each other and
which might be better for one job or the other... so some sort of
primer would be really handy.

Cheers

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] NameError: name 'numeric' is not defined

2009-06-06 Thread Robin
Hi,

I just updated to latest numpy svn:
In [10]: numpy.__version__
Out[10]: '1.4.0.dev7039'

It seemed to build fine, but I am getting a lot of errors testing it:
----------------------------------------------------------------------
Ran 178 tests in 0.655s

FAILED (errors=138)
Out[8]: <nose.result.TextTestResult run=178 errors=138 failures=0>

Almost all the errors look the same:
======================================================================
ERROR: test_shape (test_ctypeslib.TestNdpointer)
----------------------------------------------------------------------
Traceback (most recent call last):
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/tests/test_ctypeslib.py,
line 83, in test_shape
self.assert_(p.from_param(np.array([[1,2]])))
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/ctypeslib.py,
line 171, in from_param
return obj.ctypes
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/__init__.py,
line 27, in <module>
__all__ += numeric.__all__
NameError: name 'numeric' is not defined

I haven't seen this before - is it something wrong with my build or
the current svn state? I am using macports python 2.5.4 on os x 10.5.7

Cheers

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NameError: name 'numeric' is not defined

2009-06-06 Thread Robin
On Sun, Jun 7, 2009 at 12:53 AM, Robinrobi...@gmail.com wrote:
 I haven't seen this before - is it something wrong with my build or
 the current svn state? I am using macports python 2.5.4 on os x 10.5.7

Hmmm... after rebuilding from the same version the problem seems to
have gone away... sorry for the noise...

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Problem with correlate

2009-06-02 Thread Robin
On Tue, Jun 2, 2009 at 11:36 AM, David Cournapeau courn...@gmail.com wrote:

 Done in r7031 - correlate/PyArray_Correlate should be unchanged, and
 acorrelate/PyArray_Acorrelate implement the conventional definitions,

I don't know if it's been discussed before but while people are
thinking about/changing correlate I thought I'd like to request as a
user a matlab style xcorr function (basically with the functionality
of the matlab version).

I don't know if this is a deliberate omission, but it is often one of
the first things my colleagues try when I get them using Python, and
as far as I know there isn't really a good answer. There is xcorr in
pylab, but it isn't vectorised like xcorr from matlab...

Cheers

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] copy and paste arrays from matlab

2009-05-13 Thread Robin
[crossposted to numpy-discussion and mlabwrap-user]

Hi,

Please find attached Python code for the opposite direction - ie
format Python arrays for copy and pasting into an interactive Matlab
session.

It doesn't look as nice because newlines are row separators in matlab
so I put everything on one line. Also there's no way that I know of to
input arrays with more than 2 dimensions in Matlab without using reshape.

In [286]: from mmat import mmat
In [289]: x = rand(4,2)
In [290]: mmat(x,'%2.3f')
[ 0.897 0.074 ;   0.005 0.174 ;   0.207 0.736 ;   0.453 0.111 ]
In [287]: mmat(x,'%2.3f')
reshape([  [ 0.405 0.361 0.609 ;   0.249 0.275 0.620 ;   0.740 0.754
0.699 ;   0.280 0.053 0.181 ] [ 0.796 0.114 0.720 ;   0.296 0.692
0.352 ;   0.218 0.894 0.818 ;   0.709 0.946 0.860 ] ],[ 4 3 2 ])
In [288]: mmat(x)
reshape([  [ 4.046905655728e-01 3.605995195844e-01 6.089653771166e-01
;   2.491999503702e-01 2.751880043180e-01 6.199629932480e-01 ;
7.401974485581e-01 7.537929345351e-01 6.991798908866e-01 ;
2.800494872019e-01 5.258468515210e-02 1.812706305994e-01 ] [
7.957907133899e-01 1.144010574386e-01 7.203522053853e-01 ;
2.962977637560e-01 6.920657079182e-01 3.522371076632e-01 ;
2.181950954650e-01 8.936401263709e-01 8.177351741233e-01 ;
7.092517323839e-01 9.458774967489e-01 8.595104463863e-01 ] ],[ 4 3 2
])

Hope someone else finds it useful.

Cheers

Robin

On Tue, May 12, 2009 at 2:12 PM, Robin robi...@gmail.com wrote:
 [crossposted to numpy-discussion and mlabwrap-user]

 Hi,

 I wrote a little utility class in Matlab that inherits from double and
 overloads the display function so you can easily print matlab arrays
 of arbitrary dimension in Numpy format for easy copy and pasting.

 I have to work a lot with other people's code - and while mlabwrap and
 reading and writing is great, sometimes I find it easier and quicker
 just to copy and paste smaller arrays between interactive sessions.

 Anyway you put it in your Matlab path then you can do
 x = rand(2,3,4,5);
 a = array(x)

 You can specify the fprintf style format string either in the
 constructor or after:
 a = array(x,'%2.6f')
 a.format = '%2.2f'

 eg:
 x = rand(4,3,2);
 array(x)
 ans =

 array([[[2.071566461449581e-01, 3.501602151029837e-02],
        [1.589135260727248e-01, 3.766891927380323e-01],
        [8.757206127846399e-01, 7.259276565938600e-01]],

       [[7.570839415557700e-01, 3.974969411279816e-02],
        [8.109207856487061e-01, 5.043242527988604e-01],
        [6.351863794630047e-01, 7.013280585980169e-01]],

       [[8.863281096304466e-01, 9.885678912262633e-01],
        [4.765077527169480e-01, 7.634956792870943e-01],
        [9.728134909163066e-02, 4.588908258125032e-01]],

       [[4.722298594969571e-01, 6.861815984603373e-01],
        [1.162875322461844e-01, 4.887479677951201e-02],
        [9.084394562396312e-01, 5.822948089552498e-01]]])

 It's a while since I've tried to do anything like this in Matlab and I
 must admit I found it pretty painful, so I hope it can be useful to
 someone else!

 I will try and do one for Python for copying and pasting to Matlab,
 but I'm expecting that to be a lot easier!

 Cheers

 Robin



mmat.py
Description: Binary data
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] copy and paste arrays from matlab

2009-05-12 Thread Robin
[crossposted to numpy-discussion and mlabwrap-user]

Hi,

I wrote a little utility class in Matlab that inherits from double and
overloads the display function so you can easily print matlab arrays
of arbitrary dimension in Numpy format for easy copy and pasting.

I have to work a lot with other people's code - and while mlabwrap and
reading and writing is great, sometimes I find it easier and quicker
just to copy and paste smaller arrays between interactive sessions.

Anyway you put it in your Matlab path then you can do
x = rand(2,3,4,5);
a = array(x)

You can specify the fprintf style format string either in the
constructor or after:
a = array(x,'%2.6f')
a.format = '%2.2f'

eg:
 x = rand(4,3,2);
 array(x)
ans =

array([[[2.071566461449581e-01, 3.501602151029837e-02],
        [1.589135260727248e-01, 3.766891927380323e-01],
        [8.757206127846399e-01, 7.259276565938600e-01]],

       [[7.570839415557700e-01, 3.974969411279816e-02],
        [8.109207856487061e-01, 5.043242527988604e-01],
        [6.351863794630047e-01, 7.013280585980169e-01]],

       [[8.863281096304466e-01, 9.885678912262633e-01],
        [4.765077527169480e-01, 7.634956792870943e-01],
        [9.728134909163066e-02, 4.588908258125032e-01]],

       [[4.722298594969571e-01, 6.861815984603373e-01],
        [1.162875322461844e-01, 4.887479677951201e-02],
        [9.084394562396312e-01, 5.822948089552498e-01]]])

It's a while since I've tried to do anything like this in Matlab and I
must admit I found it pretty painful, so I hope it can be useful to
someone else!

I will try and do one for Python for copying and pasting to Matlab,
but I'm expecting that to be a lot easier!

Cheers

Robin


array.m
Description: Binary data
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] setterr question

2009-04-29 Thread Robin
Hi,

I have been using seterr to try to catch where Nans are appearing in
my analysis.

I used all: 'warn' which worked the first time I ran the function, but
as specified in the documentation it only warns 'once only'. Is there
a way I can reset the count so it will warn again, without losing my
session?

Thanks

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] setterr question

2009-04-29 Thread Robin
2009/4/29 Stéfan van der Walt ste...@sun.ac.za:
 2009/4/29 Robin robi...@gmail.com:
 I have been using seterr to try to catch where Nans are appearing in
 my analysis.

 I used all: 'warn' which worked the first time I ran the function, but
 as specified in the documentation it only warns 'once only'. Is there
 a way I can reset the count so it will warn again, without loosing my
 session?

 Try

 import warnings
 warnings.simplefilter('always')

Thanks very much, I thought it would be an easy one!

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Built Lapack, Atlas from source.... now numpy.linalg.eig() hangs at 100% CPU

2009-03-28 Thread Robin
2009/3/28 Chris Colbert sccolb...@gmail.com:
 Alright, building numpy against atlas from the repositories works, but this
 atlas only contains the single threaded libraries. So i would like to get my
 build working completely.

It doesn't help at all with your problem - but I thought I'd point out
there are other ways to exploit multicore machines than using threaded
ATLAS (if that is your goal).

For example, I use single threaded libraries and control parallel
execution myself using multiprocessing module (this is easier for
simple batch jobs, but might not be appropriate for your case).
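
As a toy sketch of the kind of thing I mean (names and sizes made up;
each worker runs its own single-threaded linear algebra):

import numpy as np
from multiprocessing import Pool

def job(seed):
    # each process does an independent piece of work
    rng = np.random.RandomState(seed)
    a = rng.rand(500, 500)
    return np.linalg.eigvals(a).real.max()

if __name__ == '__main__':
    pool = Pool(processes=2)           # roughly one process per core
    print(pool.map(job, range(4)))     # the four jobs run in parallel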

There is some information about this on the wiki:
http://scipy.org/ParallelProgramming

Cheers

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] indexing question

2009-03-05 Thread Robin
Hi,

I have an indexing problem, and I know it's a bit lazy to ask the
list, but sometimes when people do, interesting tricks come up, so I
hope no one minds!

I have a 2D array X.shape = (a,b)

and I want to change it into new array which is shape (2,(a*b)) which
has the following form:
[  X[0,0], X[0,1]
   X[1,0], X[1,1]
   X[2,0], X[2,1]
   ...
   X[a,0], X[a,1]
   X[0,1], X[0,2]
   X[1,1], X[1,2]
...
]

The first axis is trials and the second axis is the different
outputs, which I am trying to represent as an overlapping sliding
window of 2 samples (but in general n). Because of the repeats I'm not
sure if I can do it without loops.

Thanks for any help

 Cheers

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] indexing question

2009-03-05 Thread Robin
On Thu, Mar 5, 2009 at 10:40 AM, Robin robi...@gmail.com wrote:
 Hi,

 I have an indexing problem, and I know it's a bit lazy to ask the
 list, sometime when people do interesting tricks come up so I hope no
 one minds!

 I have a 2D array X.shape = (a,b)

 and I want to change it into new array which is shape (2,(a*b)) which
 has the following form:


actually the new array would have dimensions (2, a*(b-1)) as the last
bin wouldn't have the second point.

Cheers

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] indexing question

2009-03-05 Thread Robin
On Thu, Mar 5, 2009 at 10:40 AM, Robin robi...@gmail.com wrote:
 Hi,

 I have an indexing problem, and I know it's a bit lazy to ask the
 list, sometime when people do interesting tricks come up so I hope no
 one minds!

 I have a 2D array X.shape = (a,b)

 and I want to change it into new array which is shape (2,(a*b)) which
 has the following form:
 [  X[0,0], X[0,1]
   X[1,0], X[1,1]
   X[2,0], X[2,1]
   ...
   X[a,0], X[a,1]
   X[0,1], X[0,2]
   X[1,1], X[1,2]
 ...
 ]


Ah, so it's a bit easier than I thought at first glance:

X[ ix_( (b-1)*range(a), [0,1]) ]
does the trick I think

Sorry for the noise.

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] indexing question

2009-03-05 Thread Robin
On Thu, Mar 5, 2009 at 10:57 AM, Robin robi...@gmail.com wrote:
 On Thu, Mar 5, 2009 at 10:40 AM, Robin robi...@gmail.com wrote:
 Hi,

 I have an indexing problem, and I know it's a bit lazy to ask the
 list, sometime when people do interesting tricks come up so I hope no
 one minds!

 I have a 2D array X.shape = (a,b)

 and I want to change it into new array which is shape (2,(a*b)) which
 has the following form:
 [  X[0,0], X[0,1]
   X[1,0], X[1,1]
   X[2,0], X[2,1]
   ...
   X[a,0], X[a,1]
   X[0,1], X[0,2]
   X[1,1], X[1,2]
 ...
 ]


 Ah, so it's a bit easier than I thought at first glance:

 X[ ix_( (b-1)*range(a), [0,1]) ]
 does the trick I think

Not doing well this morning - that's wrong of course...  I need to
stack lots of such blocks for [0,1], [1,2], [2,3] etc.. up to [b-1,b].
So I guess the question still stands...

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] indexing question

2009-03-05 Thread Robin
On Thu, Mar 5, 2009 at 9:15 PM, Stéfan van der Walt ste...@sun.ac.za wrote:
 Hi Robin

 2009/3/5 Robin robi...@gmail.com:
 On Thu, Mar 5, 2009 at 10:57 AM, Robin robi...@gmail.com wrote:
 On Thu, Mar 5, 2009 at 10:40 AM, Robin robi...@gmail.com wrote:
 Hi,

 I have an indexing problem, and I know it's a bit lazy to ask the
 list, sometime when people do interesting tricks come up so I hope no
 one minds!

 I have a 2D array X.shape = (a,b)

 and I want to change it into new array which is shape (2,(a*b)) which
 has the following form:
 [  X[0,0], X[0,1]
   X[1,0], X[1,1]
   X[2,0], X[2,1]
   ...
   X[a,0], X[a,1]
   X[0,1], X[0,2]
   X[1,1], X[1,2]
 ...
 ]


 From the array you wrote down above, I assume you meant ((a*(b-1)), 2):

 In [23]: x = np.arange(16).reshape((4,4))

 In [24]: x
 Out[24]:
 array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

 In [25]: x.strides
 Out[25]: (16, 4)

 In [26]: np.lib.stride_tricks.as_strided(x, shape=(3, 4, 2), strides=(4, 16, 
 4))
 Out[26]:
 array([[[ 0,  1],
        [ 4,  5],
        [ 8,  9],
        [12, 13]],

       [[ 1,  2],
        [ 5,  6],
        [ 9, 10],
        [13, 14]],

       [[ 2,  3],
        [ 6,  7],
        [10, 11],
        [14, 15]]])

 In [27]: np.lib.stride_tricks.as_strided(x, shape=(3, 4, 2),
 strides=(4, 16, 4)).reshape((12, 2))
 Out[27]:
 array([[ 0,  1],
       [ 4,  5],
       [ 8,  9],
       [12, 13],
       [ 1,  2],
       [ 5,  6],
       [ 9, 10],
       [13, 14],
       [ 2,  3],
       [ 6,  7],
       [10, 11],
       [14, 15]])

 Does that help?

Ah that's great thanks... I had realised it could be done with
as_strided and a reshape from your excellent slides - but I had
trouble figuring out the new strides so I settled on making a list with
ix_ and then hstack'ing the list.

This is much neater though.
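
For a general window length n I guess the same pattern becomes (untested):

import numpy as np
from numpy.lib import stride_tricks

def sliding(x, n):
    # overlapping length-n windows along the second axis, stacked as rows
    a, b = x.shape
    s0, s1 = x.strides
    w = stride_tricks.as_strided(x, shape=(b - n + 1, a, n),
                                 strides=(s1, s0, s1))
    return w.reshape(((b - n + 1) * a, n))   # reshape copies, giving the 2D result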

Thanks,

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] 64-bit numpy questions?

2009-03-03 Thread Robin
On Tue, Mar 3, 2009 at 4:20 PM, Todd Miller jmil...@stsci.edu wrote:
 Is anyone using numpy in 64-bit
 environments on a day-to-day basis?

Yes - linux x86_64

 Are you using very large arrays,
 i.e.  over 2G in size?

I have been using arrays this size and larger (mainly sparse matrices)
without any problem (except for the machine running out of memory :)

Cheers

Robin


 Cheers,
 Todd


 ___
 Numpy-discussion mailing list
 Numpy-discussion@scipy.org
 http://projects.scipy.org/mailman/listinfo/numpy-discussion

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] ParallelProgramming wiki page

2008-10-27 Thread Robin
Hi,

I made some changes to the ParallelProgramming wiki page to outline
use of the (multi)processing module as well as the threading module.

I'm very much not an expert on this - just researched it for myself,
so please feel free to correct/ extend/ delete as appropriate.

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ParallelProgramming wiki page

2008-10-27 Thread Robin
On Mon, Oct 27, 2008 at 9:17 PM, dmitrey [EMAIL PROTECTED] wrote:
 As for the ParallelProgramming wiki page, there are some words in
 section "Use parallel primitives" about numpy.dot, still I can't
 understand from the section: if I get numpy from sources and compile it
 (via python setup.py build) in my AMD X2, will numpy.dot use 2nd CPU
 or not?

Not unless you build numpy against a parallel-enabled BLAS, for
example Intel MKL, ATLAS etc.

I think if you compile ATLAS with threading enabled, and then build
numpy using the appropriate ptlapack libraries (I forget the exact
name) then the dot should use the second CPU. As Frederic added to the
wiki - the number of threads to use can only be provided to atlas at
compile time.

With MKL I think you can choose this at run time (I think through an
environment variable but I'm not sure).

Similarly with the GOTO blas, but I'm not sure if numpy builds with
that, so maybe we should take that reference out.

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] profiling line by line

2008-09-15 Thread Robin
On Mon, Sep 15, 2008 at 5:13 PM, Arnar Flatberg
[EMAIL PROTECTED] wrote:
 On Mon, Sep 15, 2008 at 5:18 PM, Robert Kern [EMAIL PROTECTED] wrote:

 I do have some Cython code that
 does this. It needs a little bit more work, though. I'll try to push
 it out soonish.

 That would make me an extremely happy user, I've been looking for this for
 years!
 I can't imagine I'm the only one who profiles some hundred lines of code and
 ends up with 90% of total time in the dot-function

Yes, that would be terrific - the profiler is about the only thing I
miss from Matlab... Especially after an afternoon of refactoring
everything into tiny functions to get anything useful out of the
normal profiler and see where the bottleneck in my code was.

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] concatenate an array with a number

2008-08-04 Thread Robin
On Mon, Aug 4, 2008 at 3:58 PM, Nicolas Chopin
[EMAIL PROTECTED] wrote:
 Hi list,
 I want to do this:

 y = concatenate(x[0],x)

 where x is a 1d array.
 However, the only way I managed to make this work is:

 y = concatenate( array((x[0],),ndmin=1), x)

 which is kind of cryptic. (If I remove ndmin, numpy still complains.)

I think r_[x[0],x] will do what you want...

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Sparse Matrices in Numpy -- (with eigenvalue algorithms if possible)

2008-06-23 Thread Robin
On Tue, Jun 24, 2008 at 4:23 AM, Dan Yamins [EMAIL PROTECTED] wrote:
 3) Finally: (and this is slightly off topic) -- I've just installed SciPy
 for the first time.  In the readme.txt for the download it mentioned
 something about having LAPACK and ATLAS installed.  However, on the scipy
 install page (for OSX, http://www.scipy.org/Installing_SciPy/Mac_OS_X), it
 doesn't mention anything about LAPACK and ATLAS -- and the instructions
 there that I followed worked without any need for those packages.Do I
 need them?  Or are they only necessary for making certain routines faster?

On OS X it is usual to use the optimised BLAS provided by the
Accelerate framework (at least that's what it used to be called I
think). In any case - you don't need them because Apple provides an
optimised blas (which is used as an alternative to lapack and
atlas)... You could use lapack/atlas if you wanted but installation is
probably simpler following the instructions on the wiki to use the
apple one...

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Which Python to Use on OSX, Was: 1.1.0 OSX Installer Fails Under 10.5.3?

2008-06-04 Thread Robin
On Wed, Jun 4, 2008 at 9:25 AM, J. Stark [EMAIL PROTECTED] wrote:
 Robert,

 I see your point, but why not just install a separate NumPy to run
 with the system Python? That is what I have always done in the past
 without problems.

I think the problem is the system python already comes with a (much
older) cut down version of numpy which you can find in:
/System/Library/Frameworks/Python.framework/Versions/2.5/Extras/lib/python/numpy
This causes all sorts of problems when installing a new version...
Obviously you can't have two different versions of the same package
with the same name in the same python installation (how do you choose
which one you mean with 'import numpy'?)
I think there were problems with the path so when a new numpy is
installed in 2.5/Extras/lib/site-packages it is actually after the
existing one on the path and doesn't get picked up. Even if it does
work, the worry is that you're changing a supplied component and Apple
stuff might depend upon the version supplied (or other software people
distribute to use the 'system' python might expect it to be there).

I think there's much less chance of problems using the system python
for system things and leaving it well alone - and installing the
python.org for everyday use. The only problem with this is that the
system python works with dtrace while the normal one doesn't...

Cheers

Robin

 I guess I always feel a sense of uncertainty with having two separate
 Python installations as to which actually gets used in any particular
 situation. I appreciate that for experts who use Python daily, this
 isn't an issue, but for someone like myself who may have gaps of
 several months between projects that use Python, this is a real issue
 as I forget those kinds of subtleties.

 J.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Which Python to Use on OSX, Was: 1.1.0 OSX Installer Fails Under 10.5.3?

2008-06-04 Thread Robin
On Wed, Jun 4, 2008 at 10:59 AM, David Cournapeau
[EMAIL PROTECTED] wrote:
 Robin wrote:
 I think there's much less chance of problems using the system python
 for system things and leaving it well alone - and installing the
 python.org for everyday use. The only problem with this is that the
 system python works with dtrace while the normal one doesn't...


 The source for dtrace integration are not freely available ?

Perhaps they are (I suppose they should be) but I can't find them - in
any case I don't think I would patch the python distribution and build
from source just for this feature... I only mentioned it in passing.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Question about indexing

2008-05-29 Thread Robin
On Fri, May 30, 2008 at 12:36 AM, Raul Kompass [EMAIL PROTECTED] wrote:
 I'm new to using numpy. Today I experimented a bit with indexing
 motivated by the finding that although
 a[a>0.5] and a[where(a>0.5)] give the same expected result (elements of
 a greater than 0.5)
 a[argwhere(a>0.5)] results in something else (rows of a in different order).

 I tried to figure out when indexing will yield rows and when it will
 give me an element and I could not find a simple rule.

 I systematically tried and got the follwing:
 --
   from scipy import *
   a = random.rand(10).reshape(2,5)
   a
 array([[ 0.87059263,  0.76795743,  0.13844935,  0.69040701,  0.92015062],
   [ 0.97313123,  0.85822558,  0.8579044 ,  0.57425782,  0.57355904]])



   a[[0],[1]]# shape([[0],[1]])  = (2, 1)
 array([ 0.76795743])

   a[[0,1]]  # shape([[0,1]])= (1, 2)
 array([[ 0.87059263,  0.76795743,  0.13844935,  0.69040701,  0.92015062],
   [ 0.97313123,  0.85822558,  0.8579044 ,  0.57425782,  0.57355904]])

   a[[[0,1]]]# shape([[[0,1]]])  = (1, 1, 2)
 array([[ 0.87059263,  0.76795743,  0.13844935,  0.69040701,  0.92015062],
   [ 0.97313123,  0.85822558,  0.8579044 ,  0.57425782,  0.57355904]])

   a[[[0],[1]]]  # shape([[[0],[1]]])= (1, 2, 1)
 array([ 0.76795743])

   a[[[0]],[[1]]]# shape([[[0]],[[1]]])  = (2, 1, 1)
 array([[ 0.76795743]])

   a[[[[0,1]]]]  # shape([[[[0,1]]]])  = (1, 1, 1, 2)
 array([[[[ 0.87059263,  0.76795743,  0.13844935,  0.69040701,  0.92015062],
  [ 0.97313123,  0.85822558,  0.8579044 ,  0.57425782,  0.57355904]]]])

   a[[[[0],[1]]]]  # shape([[[[0],[1]]]])  = (1, 1, 2, 1)
 array([[[[ 0.87059263,  0.76795743,  0.13844935,  0.69040701,  0.92015062]],

   [[ 0.97313123,  0.85822558,  0.8579044 ,  0.57425782,  0.57355904]]]])

   a[[[[0]],[[1]]]]  # shape([[[[0]],[[1]]]]) = (1, 2, 1, 1)
 array([[ 0.76795743]])

   a[[[[0]]],[[[1]]]]  # shape([[[[0]]],[[[1]]]]) = (2, 1, 1, 1)
 array([[[ 0.76795743]]])
 ---

 Can anyone explain this?

 Thank you very much,

Hi,

I don't have time to give a comprehensive answer - but I think I can
offer a simple rule. The thing you are indexing (a) is 2 dimensional,
so if you provide 2 arguments to index with (ie a[something,
something]) you will get single elements - if you only provide a
single argument (ie a[something]) it will pull out rows corresponding
to the indexing. If you want just a specific element you have to add a
second argument.

Also - the outer [ ]'s in your indexing operations are just the syntax
for indexing. So your shape comments are wrong:

   a[0,1]# shape([0,1])  = (2,)
 0.767957427399
you are indexing here with two scalars, 0,1.

   a[[0,1]]  # shape([[0,1]])= (1, 2)
 array([[ 0.87059263,  0.76795743,  0.13844935,  0.69040701,  0.92015062],
   [ 0.97313123,  0.85822558,  0.8579044 ,  0.57425782,  0.57355904]])

You are indexing here with a 1d list [0,1]. Since you don't provide a
column index you get rows 0 and 1.
If you do a[ [0,1] , [0,1] ] then you get element [0,0] and element [0,1].

Hope this helps,

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Question about indexing

2008-05-29 Thread Robin
On Fri, May 30, 2008 at 12:57 AM, Robin [EMAIL PROTECTED] wrote:
 You are indexing here with a 1d list [0,1]. Since you don't provide a
 column index you get rows 0 and 1.
 If you do a[ [0,1] , [0,1] ] then you get element [0,0] and element [0,1].

Whoops - you get [0,0] and [1,1].

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fancier indexing

2008-05-22 Thread Robin
On Thu, May 22, 2008 at 4:59 PM, Kevin Jacobs [EMAIL PROTECTED]
[EMAIL PROTECTED] wrote:
 After poking around for a bit, I was wondering if there was a faster method
 for the following:

 # Array of index values 0..n
 items = numpy.array([0,3,2,1,4,2],dtype=int)

 # Count the number of occurrences of each index
 counts = numpy.zeros(5, dtype=int)
 for i in items:
   counts[i] += 1

 In my real code, 'items' contain up to a million values and this loop will
 be in a performance critical area of code.  If there is no simple solution,
 I can trivially code this using the C-API.

I would use bincount:
count = bincount(items)
should be all you need:


In [192]: items = [0,3,2,1,4,2]

In [193]: bincount(items)
Out[193]: array([1, 1, 2, 1, 1])

In [194]: bincount?
Type:           builtin_function_or_method
Base Class:     <type 'builtin_function_or_method'>
String Form:    <built-in function bincount>
Namespace:  Interactive
Docstring:
bincount(x,weights=None)

Return the number of occurrences of each value in x.

x must be a list of non-negative integers.  The output, b[i],
represents the number of times that i is found in x.  If weights
is specified, every occurrence of i at a position p contributes
weights[p] instead of 1.

See also: histogram, digitize, unique.

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Quick Question about Optimization

2008-05-19 Thread Robin
On Mon, May 19, 2008 at 7:08 PM, James Snyder [EMAIL PROTECTED] wrote:

for n in range(0, time_milliseconds):
    self.u = self.expfac_m * self.prev_u + (1-self.expfac_m) * self.aff_input[n,:]
    self.v = self.u + self.sigma * np.random.standard_normal(size=(1,self.naff))
    self.theta = self.expfac_theta * self.prev_theta - (1-self.expfac_theta)

    idx_spk = np.where(self.v >= self.theta)

    self.S[n,idx_spk] = 1
    self.theta[idx_spk] = self.theta[idx_spk] + self.b

    self.prev_u = self.u
    self.prev_theta = self.theta

Hello,

The only thoughts I had were that depending on how 'vectorised'
random.standard_normal is, it might be better to calculate a big block
of random data outside the loop and index it in the same way as the
aff_input. Not certain if the indexing would be faster than the
function call but it could be worth a try if you have enough memory.

The other thing is there are a lot of self.'s there. I don't have a
lot of practical experience, but I've read (
http://wiki.python.org/moin/PythonSpeed/PerformanceTips#head-aa6c07c46a630a2fa10bd6502510e532806f1f62
) that . based lookups are slower than local variables so another
thing to try would be to rebind everything to a local variable outside
the loop:
u = self.u
v = self.v
etc. which although a bit unsightly actually can make the inner loop
more readable and might speed things up.
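
Putting those two ideas together, an untested sketch (same names as
your code, assuming numpy is imported as np):

# inside the method, before the loop
u, theta = self.prev_u, self.prev_theta
expfac_m, expfac_theta = self.expfac_m, self.expfac_theta
one_m, one_t = 1 - expfac_m, 1 - expfac_theta
aff_input, S, b = self.aff_input, self.S, self.b

# one big block of noise generated up front, indexed instead of
# calling standard_normal every iteration
noise = self.sigma * np.random.standard_normal((time_milliseconds, self.naff))

for n in xrange(time_milliseconds):
    u = expfac_m * u + one_m * aff_input[n, :]
    v = u + noise[n, :]
    theta = expfac_theta * theta - one_t
    idx_spk = np.where(v >= theta)
    S[n, idx_spk] = 1
    theta[idx_spk] += b

self.prev_u, self.prev_theta = u, theta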

The only other thing is be careful with things like this when
translating from matlab:
self.prev_u = self.u
since this is a reference not a copy of the data. This is OK because
when you recreate u as a product it creates a new object, but if you
changed u in another way ie self.u[:100] = 10 then self.prev_u would
still be pointing to the same array and also reflect those changes.
In this case it doesn't look like you explicitly need the prev_ values
so it's possible you could do the v and theta updates in place
(although I'm not sure if that's quicker)
u *= expfac_m
u += (1-expfac_m)*aff_input.. etc.
Of course you can also take the (1-)'s outside of the loop although
again I'm not sure how much difference it would make.

So sorry I can't give any concrete advice but I hope I've given some ideas...

Cheers

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Quick Question about Optimization

2008-05-19 Thread Robin
Also you could use xrange instead of range...

Again, not sure of the size of the effect but it seems to be
recommended by the docstring.

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Quick Question about Optimization

2008-05-19 Thread Robin
Hi,

I think my understanding is somehow incomplete... It's not clear to me
why (simplified case)

a[curidx,:] = scalar * a[2-curidx,:]
should be faster than
a = scalar * b

In both cases I thought the scalar multiplication results in a new
array (new memory allocated) and then the difference between copying
that result into the existing array u[curix,:] or reassining the
reference u to that result should be very similar?

If anything I would have thought the direct assignment would be
quicker since then there is no copying.

What am I missing?

 This should give you a substantial speedup.  Also, I have to say that
 this is begging for weave.blitz, which compiles such statements using
 templated c++ code to avoid temporaries.  It doesn't work on all
 systems, but if it does in your case, here's what your code might look
 like:

If you haven't seen it, this page gives useful examples of methods to
speed up python code (including weave.blitz), which as Hoyt says would
be ideal in this case:
http://scipy.org/PerformancePython
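
A tiny example of the difference I mean (the last line uses the
optional ufunc output argument, which I believe avoids the temporary
entirely):

import numpy as np

b = np.random.rand(3, 10)
a = np.empty((3, 10))

a = 2.0 * b              # allocates a new array and rebinds the name a
a[:] = 2.0 * b           # fills a's existing memory, but 2.0*b is still a temporary
np.multiply(b, 2.0, a)   # writes the product straight into a, no temporary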

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] python memory use

2008-05-03 Thread Robin
Hi,

I am starting to push the limits of the available memory and I'd like
to understand a bit better how Python handles memory...

If I try to allocate something too big for the available memory I
often get a MemoryError exception. However, in other situations,
Python memory use continues to grow until the machine falls over. I
was hoping to understand the difference between those cases. From what
I've read Python never returns memory to the OS (is this right?) so
the second case, python is holding on to memory that it isn't really
using (for objects that have been destroyed). I guess my question is
why doesn't it reuse the memory freed from object deletions instead of
requesting more - and even then when requesting more, why does it
continue until the machine falls over and not cause a MemoryError?

While investigating this I found this script:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/511474
which does wonders for my code. I was wondering if this function
should be included in Numpy as it seems to provide an important
feature, or perhaps an entry on the wiki (in Cookbook section?)

Thanks,

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea

2008-03-07 Thread Robin
On Fri, Mar 7, 2008 at 9:06 AM, Fernando Perez [EMAIL PROTECTED] wrote:
 Hi Robin,

  As Ondrej pointed out, the expectation is a full-time commitment to
  the project.  Other than that it sounds like you might be able to
  participate, and it's worth noting that this being open source, if you
  just have some free time and would like to get involved with an
  interesting project, by all means pitch in.  Even if someone picks up
  an 'official' project, there's plenty to be done on the cython/numpy
  front for more than one person.

  Perhaps it's not out of place to mention that many people have made
  solid contributions for years to open source projects without monetary
  compensation, and still see value in the activity.  If you can spend
  the time on it, you may still find many rewards out of the work.

Thanks,

I hadn't seen the link Ondrej provided, although the 40 hour week
seems to be a Python/PSF requirement. Prior to posting I had checked
the Google information, where they say the time commitment depends on
both the scope of your project and the requirements of your mentoring
organisation. They also say they have had successful applicants in
previous years from full-time students at non-US universities (who
don't get a summer break), so I thought it might be possible for me to
be considered. I also asked in #gsoc where I was advised 20 hours per
week would be a good baseline, again depending on the project.

Of course, I hope to contribute to Numpy/Scipy anyway - but this
scheme would be a great way to kick-start that.

I look forward to seeing Numpy/Scipy accepted as a mentor organisation
this year anyway, even if I am unable to take part.

Cheers,

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.ndarray constructor from python list: bug?

2008-03-06 Thread Robin
On Thu, Mar 6, 2008 at 9:53 AM, Emanuele Olivetti
[EMAIL PROTECTED] wrote:
 Dear all,

  Look at this little example:
  
  import numpy
  a = numpy.array([1])
  b = numpy.array([1,2,a])
  c = numpy.array([a,1,2])
  
  Which has the following output:
  
  Traceback (most recent call last):
   File "b.py", line 4, in <module>
 c = numpy.array([a,1,2])
  ValueError: setting an array element with a sequence.
  

  It seems that a list starting with an ndarray ('a', of
  a single number) is not a legal input to build an ndarray.
  Instead if 'a' is in other places of the list the ndarray
  builds up flawlessly.

  Is there a meaning for this behavior or is it a bug?

  Details: numpy 1.04 on ubuntu linux x86_64

Hi,

I see the same behaviour with 1.0.5.dev4786. I think the bug is that
the b assignment should also fail. They both fail (as I think they
should) if you take a as an array with more than one element.

I think the array constructor expects lists of numbers, not of arrays
etc. To do what you want try

b = r_[1,2,a]
c = r_[a,1,2]
which works for a an array (and of more than one element).

Cheers

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy/Cython Google Summer of Code project idea

2008-03-06 Thread Robin
On Thu, Mar 6, 2008 at 6:15 PM, Fernando Perez [EMAIL PROTECTED] wrote:
   Any student interested in this should quickly respond on the list;
   such a project would likely be co-mentored by people on the Numpy and
   Cython teams, since it is likely to require expertise from both ends.

Hello,

I would like to register my keen interest in applying for Numpy/Scipy
GSoC project. I am a first year PhD student in Computational
Neuroscience and my undergraduate degree was in Mathematics.

I have been using Numpy and Scipy for my PhD work for a few months now
and have been building up to trying to contribute something to the
project - I am keen to get more substantial real world programming
experience... The projects described involving Cython definitely
interest me, although I don't yet have a sufficient understanding of
the Python C-API and Pyrex/Cython to gauge how demanding they might
be.

As a PhD student in the UK I don't have any official summer vacation,
so I wouldn't be able to work full time on the project (I also have a
continuation report due just before the GSoC final deadline which is a
bit annoying). However I currently work 20 hours per week part time
anyway, so I'm confident that I could replace that with GSoC and still
keep up with my studies.

I would be keen to chat with someone (perhaps on the IRC channel)
about whether my existing programming experience and availability
would allow me to have a proper crack at this.

I understand that first organisations apply (deadline 12th March) with
some suggested projects, and then towards the end of the month
students can apply to accepted organisations, either for the suggested
project or their own ideas. I'd love to see Numpy/Scipy apply as an
organisation with these projects (and perhaps some others) so that
interested students like myself can apply.

Thanks,

Robin

PS
My nick on IRC is 'thrope' and I try to hang out in there most of the
time I am online. I am also on Google Talk at this email address.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] image to array doubt

2008-02-29 Thread Robin
On Fri, Feb 29, 2008 at 2:54 PM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
 hi
 i came across a codebase by rice univ people.. in that there are some
 functions for conversion between image and vectors

I'm not an expert by any means but I thought I'd try and help...

  1.code
  def image_to_vector(self, filename):
 try:
 im = Image.open(filename)
 except IOError:
 print 'couldn\'t load ' + filename
 sys.exit(1)
 self.im_size = im.size
 a = numpy.array(im.getdata())
 return a - 128
  /code

  what is the purpose of a-128 here? i couldn't quite understand

It looks to me like im.getdata() is returning an unsigned 8 bit
integer (0 to 255), which they are converting to a signed integer by
subtracting 128.
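
e.g. a quick check:

import numpy as np
a = np.array([0, 128, 255])   # unsigned 8-bit pixel values
print(a - 128)                # [-128    0  127] -- recentred around zero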


  2.code
  def vector_to_image(self, v, filename):
 v.shape = (-1,)
 a, b = min(v), max(v)
 span = max(abs(b), abs(a))
 im = Image.new('L', self.im_size)
 im.putdata((v * 127. / span) + 128)
 im.save(filename)
   /code

  and why the calculations of a,b span? why do you need the extra  maths
  ops inside im.putdata() I couldn't quite follow it

Similarly, here it looks like they are rescaling whatever is in v to
an unsigned 8-bit range (0-255). The span gives the absolute maximum
value of the input data, so v/span is rescaled to have maximum value
1, so (v * 127. / span) is the (signed) input vector rescaled to have
values in the range [-127,127]. Adding 128 makes it unsigned again in
[1,255].
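
In numpy terms the rescaling is something like (sketch):

import numpy as np

v = np.random.randn(100)                   # signed data
span = max(abs(v.min()), abs(v.max()))     # largest absolute value
pix = (v * 127.0 / span) + 128             # now in [1, 255], ok for mode 'L'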

I'm not sure why they would be doing this - to me it looks they might
be using Image as a convenient way to store some other kind of data...

HTH,

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] List Array References?

2008-02-15 Thread Robin
On Thu, Feb 14, 2008 at 8:43 PM, Alexander Michael [EMAIL PROTECTED] wrote:
 Is there a way to list all of the arrays that are referencing a given
  array? Similarly, is there a way to get a list of all arrays that are
  currently in memory?

For the second question, if you are working interactively in Ipython, you can do
%who ndarray
or
%whos ndarray
to get a list of arrays.

It might be possible to get this functionality within a script/program
through the ipython api, but I'm afraid I don't know much about that.

Cheers,

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] searchsorted bug

2008-01-31 Thread Robin
I do get the problem with a recent(ish) svn, on OS X 10.5.1, python
2.5.1 (from python.org):

In [76]: A = array(['a','aa','b'])
In [77]: B = array(['d','e'])
In [78]: A.searchsorted(B)
Out[78]: array([3, 0])
In [79]: numpy.__version__
Out[79]: '1.0.5.dev4722'
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] argsort memory problem?

2008-01-29 Thread Robin
On Jan 29, 2008 7:16 PM, Lou Pecora [EMAIL PROTECTED] wrote:
 Hmmm... Interesting.  I am using Python 2.4.4.  It
 would be nice to have other Mac people with same/other
 Python and numpy versions try the argsort bug code.

I don't see any memory leak with the test code.
Mac OS X 10.5.1
Python 2.5.1 (not apple one)
Numpy 1.0.5.dev4722

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Test failure with latest SVN

2008-01-16 Thread Robin
I am on OS X 10.5.1 and get a test failure with the latest numpy svn.
(Previously ran with no errors with svn as of a few weeks ago).

test_zero_probability (numpy.tests.test_random.TestMultinomial) ... ok

======================================================================
ERROR: Ticket #396
----------------------------------------------------------------------
Traceback (most recent call last):
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_regression.py,
line 600, in check_poly1d_nan_roots
self.failUnlessRaises(np.linalg.LinAlgError,getattr,p,r)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/unittest.py,
line 320, in failUnlessRaises
callableObj(*args, **kwargs)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py,
line 623, in __getattr__
return roots(self.coeffs)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py,
line 124, in roots
roots = _eigvals(A)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py,
line 37, in _eigvals
return eigvals(arg)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py,
line 378, in eigvals
return eig(a,b=b,left=0,right=0,overwrite_a=overwrite_a)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py,
line 128, in eig
a1 = asarray_chkfinite(a)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/function_base.py,
line 398, in asarray_chkfinite
raise ValueError, array must not contain infs or NaNs
ValueError: array must not contain infs or NaNs

----------------------------------------------------------------------
Ran 713 tests in 0.697s

FAILED (errors=1)
Out[3]: <unittest._TextTestResult run=713 errors=1 failures=0>

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] test failure: Ticket 396

2007-11-16 Thread Robin
Hello,

Just thought I'd report this test failure with current svn (rev 4464)
on Mac OS X 10.5 (could be a leopard issue?) since I hadn't seen it
mentioned before.
Oddly it only seems to fail in ipython, importing and running from the
normal python interpreter all the tests pass.

Actually, it only seems to fail when I start ipython with
ipython -p scipy
I haven't been able to recreate the problem with any other combination
of startup and imports.

Here is my ipythonrc-scipy:
# load our basic configuration with generic options
include ipythonrc

# import ...
# Load Numpy an SciPy by themselves so that 'help' works on them
import_mod numpy scipy

# from ... import ...
import_some

# Now we load all of SciPy
# from ... import *
import_all numpy
import_all scipy

---

Here is the error:
======================================================================
ERROR: Ticket #396
----------------------------------------------------------------------
Traceback (most recent call last):
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/core/tests/test_regression.py,
line 600, in check_poly1d_nan_roots
self.failUnlessRaises(N.linalg.LinAlgError,getattr,p,r)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/unittest.py,
line 320, in failUnlessRaises
callableObj(*args, **kwargs)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py,
line 623, in __getattr__
return roots(self.coeffs)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py,
line 124, in roots
roots = _eigvals(A)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/polynomial.py,
line 40, in _eigvals
return eigvals(arg)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py,
line 378, in eigvals
return eig(a,b=b,left=0,right=0,overwrite_a=overwrite_a)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/linalg/decomp.py,
line 128, in eig
a1 = asarray_chkfinite(a)
  File 
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/function_base.py,
line 398, in asarray_chkfinite
raise ValueError, array must not contain infs or NaNs
ValueError: array must not contain infs or NaNs

--
Ran 692 tests in 0.456s

FAILED (errors=1)
Out[2]: unittest._TextTestResult run=692 errors=1 failures=0
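
Note that the traceback runs through scipy/linalg/decomp.py rather than
numpy.linalg, so with the scipy profile loaded the polynomial code is
calling SciPy's eigvals, which validates its input with asarray_chkfinite
and raises ValueError where numpy's own eigvals raises the LinAlgError the
test expects. A minimal sketch of that difference, using current module
paths (the NaN-handling behaviour of both functions is assumed from the
traceback and the test's expectation):

import numpy as np
import scipy.linalg

a = np.array([[np.nan, 1.0],
              [0.0,    1.0]])

try:
    np.linalg.eigvals(a)
except np.linalg.LinAlgError as e:
    print("numpy:", e)        # LinAlgError, what the test expects

try:
    scipy.linalg.eigvals(a)   # checks finiteness via asarray_chkfinite
except ValueError as e:
    print("scipy:", e)        # ValueError, what the test actually gets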

Thanks,

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Memory leak in ndarray

2007-10-26 Thread Robin
I can confirm the same behaviour with numpy '1.0.4.dev4271' on OS X 10.4
with Python 2.5.1 (installer from python.org).

For me the memory used by the Python process grows at about 1 MB/sec. The
memory isn't released when the loop is cancelled.
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] fortran 90 compiler problem

2007-10-10 Thread Robin
I had the same problem building on Linux. I found this on a wiki somewhere
(I can only find it in the Google cache now):

(Note: make sure that if you build with *gfortran*, *g77* is not installed
on your system (or at least is not in your *PATH* when *numpy* is being
built), as you need to link with the same compiler that you built lapack
with when *numpy* builds. It will try to find *g77* first, which will lead
to linking errors if you have built lapack with *gfortran*.)

The trick for me was that step of removing g77 (or taking it out of the
path) - then distutils seems to pick up gfortran fine, but otherwise it
doesn't seem to work (even with --fcompiler=gnu95 etc.).
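
If it helps anyone else, numpy's distutils can report which Fortran
compilers it actually finds; a quick check, assuming a NumPy that still
ships numpy.distutils (it has been deprecated in recent releases), run
before building to confirm g77 has really dropped out of the search:

# print the Fortran compilers numpy's distutils can locate on this
# system ('gnu95' is gfortran, 'gnu' is g77)
from numpy.distutils.fcompiler import show_fcompilers
show_fcompilers()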

Cheers

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] diff of find of diff trick

2007-10-09 Thread Robin
Hello,

As a converting MATLAB user I am bringing over some of my code. A core
function I need is to estimate the probability distribution of a given set
of data. Sometimes these data sets are large, so I would like to make it as
efficient as possible. (The data are integers representing members of a
discrete space.)

In MATLAB the best way I found was the "diff of find of diff" trick, which
resulted in the completely vectorised solution below. Does it make sense
to translate this into numpy? I don't have a feel yet for what is fast or
slow - are the functions below built in and therefore quick? (I thought
diff might not be.)

Is there a better/more pythonic way to do it?


function Pr = prob(data, nR)

Pr = zeros(nR,1);
% diff of find of diff trick for counting the number of each element
temp = sort(data(data>0));        % want vector excluding P(0)
dtemp = diff([temp; max(temp)+1]);
count = diff(find([1; dtemp]));
indx = temp(dtemp>0);
Pr(indx) = count ./ numel(data);  % probability
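
For what it's worth, current NumPy can do the counting step directly with
np.bincount, avoiding the sort entirely; a rough translation of the
function above (the minlength argument appeared in a later NumPy release
than the one discussed in this thread):

import numpy as np

def prob(data, nR):
    """Estimate P(1..nR) from integer data, ignoring zeros
    (a rough NumPy translation of the MATLAB prob() above)."""
    data = np.asarray(data).ravel()
    # bincount does the run-length counting in one pass: counts[k]
    # is the number of occurrences of the value k
    counts = np.bincount(data[data > 0], minlength=nR + 1)
    return counts[1:nR + 1] / float(data.size)

For example, prob([1, 2, 2, 3], 3) gives [0.25, 0.5, 0.25].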


Thanks

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] dot product not behaving as expected

2007-10-08 Thread Robin
Hi,

I have a problem using numpy.dot, see below:

In [151]: m=5

In [152]: n=5

In [153]: x=(m*ones((1,5)))**arange(0,n)

In [154]: y=test.order_length[::-1]

In [155]: x
Out[155]: array([[   1.,    5.,   25.,  125.,  625.]])

In [156]: y
Out[156]:
array([[ 1024.],
   [ 1280.],
   [  640.],
   [  160.],
   [   20.]])

In [157]: numpy.dot(x,y)
Out[157]: array([[ 64.]])

In [158]: sum(x*y.T)
Out[158]: 55924.0

Am I missing something?
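
(For reference, the (1,5) by (5,1) product here works out to
1*1024 + 5*1280 + 25*640 + 125*160 + 625*20 = 55924, agreeing with
sum(x*y.T), so the [[ 64.]] above does look wrong. A minimal check with
the same values, assuming a healthy NumPy install:)

import numpy as np

x = np.array([[1., 5., 25., 125., 625.]])                 # shape (1, 5)
y = np.array([[1024.], [1280.], [640.], [160.], [20.]])   # shape (5, 1)

print(np.dot(x, y))     # expected: [[ 55924.]]
print(np.sum(x * y.T))  # 55924.0, the same inner product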

Thanks,

Robin
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

