[Numpy-discussion] upsample or scale an array

2011-12-03 Thread Robin Kraft
Thanks Warren, this is great, and even handles giant arrays just fine if you've 
got enough RAM.

I also just found this StackOverflow post with another solution:

a.repeat(2, axis=0).repeat(2, axis=1)
http://stackoverflow.com/questions/7525214/how-to-scale-a-numpy-array

np.kron lets you do more, but for my simple use case the repeat() method is 
faster and more RAM-efficient with large arrays.

In [3]: a = np.random.randint(0, 255, (2400, 2400)).astype('uint8')

In [4]: timeit a.repeat(2, axis=0).repeat(2, axis=1)
10 loops, best of 3: 182 ms per loop

In [5]: timeit np.kron(a, np.ones((2,2), dtype='uint8'))
1 loops, best of 3: 513 ms per loop


Or for a 43200x4800 array:

In [6]: a = np.random.randint(0, 255, (2400*18, 2400*2)).astype('uint8')

In [7]: timeit a.repeat(2, axis=0).repeat(2, axis=1)
1 loops, best of 3: 6.92 s per loop

In [8]: timeit np.kron(a, np.ones((2, 2), dtype='uint8'))
1 loops, best of 3: 27.8 s per loop

In this case repeat() peaked at about 1 GB of RAM usage while np.kron hit about 
1.7 GB.
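
For reference, the two approaches timed above can be wrapped in small helpers and checked against each other. This is only a minimal sketch, not code from the thread: the names upsample_repeat and upsample_kron and the factor parameter are made up for illustration.

```python
import numpy as np

def upsample_repeat(a, factor):
    """Repeat every element `factor` times along both axes of a 2-D array."""
    return a.repeat(factor, axis=0).repeat(factor, axis=1)

def upsample_kron(a, factor):
    """Same result via the Kronecker product with a block of ones."""
    # Match the dtype of `a` so the ones block does not promote to float64.
    return np.kron(a, np.ones((factor, factor), dtype=a.dtype))

a = np.array([[1, 2],
              [3, 4]], dtype='uint8')
r = upsample_repeat(a, 2)
k = upsample_kron(a, 2)
print(r)
print(np.array_equal(r, k))
```

Timings and peak memory depend on the machine and NumPy version, so the numbers above are best re-measured locally rather than taken as general.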

Thanks again Warren. I'd tried way too many variations on reshape and rollaxis, 
and should have come to the Numpy list a lot sooner!

-Robin


On Dec 3, 2011, at 12:51 AM, Warren Weckesser wrote:
 On Sat, Dec 3, 2011 at 12:35 AM, Robin Kraft wrote:
 
  I need to take an array - derived from raster GIS data - and upsample or
  scale it. That is, I need to repeat each value in each dimension so that,
  for example, a 2x2 array becomes a 4x4 array as follows:
 
  [[1, 2],
   [3, 4]]
 
  becomes
 
  [[1,1,2,2],
   [1,1,2,2],
   [3,3,4,4],
   [3,3,4,4]]
 
  It seems like some combination of np.resize or np.repeat and reshape +
  rollaxis would do the trick, but I'm at a loss.
 
  Many thanks!
 
  -Robin
 
 
 
 Just a day or so ago, Josef Perktold showed one way of accomplishing this
 using numpy.kron:
 
 In [14]: a = arange(12).reshape(3,4)
 
 In [15]: a
 Out[15]:
 array([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
 
 In [16]: kron(a, ones((2,2)))
 Out[16]:
 array([[  0.,   0.,   1.,   1.,   2.,   2.,   3.,   3.],
        [  0.,   0.,   1.,   1.,   2.,   2.,   3.,   3.],
        [  4.,   4.,   5.,   5.,   6.,   6.,   7.,   7.],
        [  4.,   4.,   5.,   5.,   6.,   6.,   7.,   7.],
        [  8.,   8.,   9.,   9.,  10.,  10.,  11.,  11.],
        [  8.,   8.,   9.,   9.,  10.,  10.,  11.,  11.]])
 
 
 Warren


___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] upsample or scale an array

2011-12-03 Thread Olivier Delalleau
You can also use numpy.tile

-=- Olivier



Re: [Numpy-discussion] upsample or scale an array

2011-12-03 Thread Robin Kraft
That does repeat the elements, but doesn't get them into the desired order.

In [4]: print a
[[1 2]
 [3 4]]

In [7]: np.tile(a, 4)
Out[7]: 
array([[1, 2, 1, 2, 1, 2, 1, 2],
       [3, 4, 3, 4, 3, 4, 3, 4]])

In [8]: np.tile(a, 4).reshape(4,4)
Out[8]: 
array([[1, 2, 1, 2],
       [1, 2, 1, 2],
       [3, 4, 3, 4],
       [3, 4, 3, 4]])

It's close, but I want to repeat the elements along the two axes, effectively 
stretching it by the lower right corner:

array([[1, 1, 2, 2],
       [1, 1, 2, 2],
       [3, 3, 4, 4],
       [3, 3, 4, 4]])

It would take some more reshaping/axis rolling to get there, but it seems 
doable.

Anyone know what combination of manipulations would work with the result of 
np.tile?

-Robin





Re: [Numpy-discussion] upsample or scale an array

2011-12-03 Thread Olivier Delalleau
Ah sorry, I hadn't read carefully enough what you were trying to achieve. I
think the double repeat solution looks like your best option then.

-=- Olivier



Re: [Numpy-discussion] upsample or scale an array

2011-12-03 Thread Derek Homeier
On 03.12.2011, at 6:22PM, Robin Kraft wrote:

 Anyone know what combination of manipulations would work with the result of 
 np.tile?
 
Rolling was the keyword:

print(np.rollaxis(np.tile(a, 4).reshape(2,2,-1), 2, 1).reshape(4,4))
[[1 1 2 2]
 [1 1 2 2]
 [3 3 4 4]
 [3 3 4 4]]

I leave the generalisation and timing up to you, but it seems for 
a = np.arange(M**2).reshape(M,-1)

np.rollaxis(np.tile(a, N**2).reshape(M,N,-1), 2, 1).reshape(M*N,-1) 

should do the trick.
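
The generalisation above can be checked against the double-repeat result. A minimal sketch, assuming a square M x M input as in the expression above; the helper name upsample_tile is hypothetical and not part of NumPy.

```python
import numpy as np

def upsample_tile(a, n):
    """Upsample a square M x M array by a factor n using tile + rollaxis."""
    m = a.shape[0]
    # Tile along the last axis, then split each row into n chunks.
    tiled = np.tile(a, n**2).reshape(m, n, -1)       # shape (m, n, m*n)
    # Move the last axis in front of the chunk axis, then flatten to 2-D.
    return np.rollaxis(tiled, 2, 1).reshape(m * n, -1)

M, N = 3, 2
a = np.arange(M**2).reshape(M, -1)
out = upsample_tile(a, N)
ref = a.repeat(N, axis=0).repeat(N, axis=1)
print(np.array_equal(out, ref))
```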

Cheers,
Derek



Re: [Numpy-discussion] upsample or scale an array

2011-12-03 Thread Derek Homeier
On 03.12.2011, at 6:47PM, Olivier Delalleau wrote:

 Ah sorry, I hadn't read carefully enough what you were trying to achieve. I 
 think the double repeat solution looks like your best option then.

Considering that it is a lot shorter than fixing the tile() result, you 
are probably right (I've only now looked closer at the repeat() 
solution ;-). I'd still be interested in the performance - since I think 
none of the reshape or rollaxis operations actually move any data 
in memory (for numpy  1.6), it might still be faster. 
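
Whether each step copies data can be checked rather than guessed. The sketch below assumes a NumPy recent enough to provide np.shares_memory (an assumption; it is newer than the release discussed here) and reports, for each intermediate, whether it still shares a buffer with the tiled array. The variable names are illustrative only.

```python
import numpy as np

M, N = 3, 2
a = np.arange(M**2).reshape(M, -1)

tiled = np.tile(a, N**2)              # new array of shape (M, M*N*N)
cube = tiled.reshape(M, N, -1)        # reshape of a contiguous array
rolled = np.rollaxis(cube, 2, 1)      # axis permutation only
final = rolled.reshape(M * N, -1)     # reshape of a non-contiguous array

cube_is_view = np.shares_memory(cube, tiled)
rolled_is_view = np.shares_memory(rolled, tiled)
final_is_view = np.shares_memory(final, tiled)
print(cube_is_view, rolled_is_view, final_is_view)
```

If the last flag comes out False, the final reshape had to copy because the rolled array is no longer contiguous, on top of the allocation done by tile() itself. Timing is still the only reliable way to compare the approaches.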

Cheers,
Derek


Re: [Numpy-discussion] upsample or scale an array

2011-12-03 Thread Robin Kraft
Ha! I knew it had to be possible! Thanks Derek. So for M = 1200 and N = 2 (now on my 
laptop):

In [70]: M = 1200
In [69]: N = 2
In [71]: a = np.random.randint(0, 255, (M**2)).reshape(M,-1)

In [76]: timeit np.rollaxis(np.tile(a, N**2).reshape(M,N,-1), 2, 1).reshape(M*N,-1)
10 loops, best of 3: 99.1 ms per loop

In [78]: timeit a.repeat(2, axis=0).repeat(2, axis=1)
10 loops, best of 3: 85.6 ms per loop

In [79]: timeit np.kron(a, np.ones((2,2), 'uint8'))
1 loops, best of 3: 521 ms per loop

It turns out np.kron and repeat are pretty straightforward for 
multi-dimensional data too - scaling or stretching a stacked array representing 
pixel data over time, for example. Nothing changes for np.kron - it handles the 
additional dimensionality by itself. With repeat you just tell it to operate on 
the last two dimensions.

So to sum up:

1) np.kron is cool for the simplicity of the code and simple scaling to N 
dimensions. It's also handy if you want to scale the array elements themselves 
too.
2) repeat() along the last N axes is a bit more intuitive (i.e. less magical) 
to me and has a better performance profile. 
3) Derek's reshape/rolling solution is almost as fast but it gives me a 
headache trying to visualize what it's actually doing. I don't want to think 
about adding another dimension ...
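
Point 1 mentions scaling the element values as well. A small sketch of what that could look like: the second argument to np.kron need not be all ones. The values in block are arbitrary and chosen only to make the effect visible.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
# Each element of `a` is multiplied by the whole 2x2 block, so the block
# sets the upsampling factor and rescales the values at the same time.
block = np.array([[1, 10],
                  [100, 1000]])
scaled = np.kron(a, block)
print(scaled)
```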

Thanks for the help folks. Here's scaling of a hypothetical time series (i.e. 3 
axes), where each sub-array represents a month.


In [26]: print a
[[[1 2]
  [3 4]]

 [[1 2]
  [3 4]]

 [[1 2]
  [3 4]]]

In [27]: np.kron(a, np.ones((2,2), dtype='uint8'))
Out[27]: 
array([[[1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4]],

       [[1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4]],

       [[1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4]]])

In [64]: a.repeat(2, axis=1).repeat(2, axis=2)
Out[64]: 
array([[[1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4]],

       [[1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4]],

       [[1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4]]])
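
The "operate on the last two dimensions" idea can be written once with negative axis indices so it works for any number of leading axes. A minimal sketch; the helper name upsample_last_two is made up for illustration, and np.stack is assumed to be available (it is in current NumPy releases).

```python
import numpy as np

def upsample_last_two(a, factor):
    """Repeat elements along the last two axes, leaving leading axes alone."""
    return a.repeat(factor, axis=-2).repeat(factor, axis=-1)

month = np.array([[1, 2],
                  [3, 4]], dtype='uint8')
series = np.stack([month, month, month])      # shape (3, 2, 2)
up = upsample_last_two(series, 2)             # shape (3, 4, 4)
same = np.array_equal(up, np.kron(series, np.ones((2, 2), dtype='uint8')))
print(up.shape, same)
```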



[Numpy-discussion] bug in PyArray_GetCastFunc

2011-12-03 Thread Geoffrey Irving
When attempting to cast to a user-defined type, PyArray_GetCastFunc looks
up the cast function in the dictionary but doesn't check whether the entry
exists.  This causes segfaults.  Here's a patch.

Geoffrey

diff --git a/numpy/core/src/multiarray/convert_datatype.c b/numpy/core/src/multiarray/convert_datatype.c
index 818d558..4b8f38b 100644
--- a/numpy/core/src/multiarray/convert_datatype.c
+++ b/numpy/core/src/multiarray/convert_datatype.c
@@ -81,7 +81,7 @@ PyArray_GetCastFunc(PyArray_Descr *descr, int type_num)
             key = PyInt_FromLong(type_num);
             cobj = PyDict_GetItem(obj, key);
             Py_DECREF(key);
-            if (NpyCapsule_Check(cobj)) {
+            if (cobj && NpyCapsule_Check(cobj)) {
                 castfunc = NpyCapsule_AsVoidPtr(cobj);
             }
         }


[Numpy-discussion] NumPy Governance

2011-12-03 Thread Travis Oliphant

Hi everyone, 

There have been some wonderfully vigorous discussions over the past few months 
that have made it clear that we need some clarity about how decisions will be 
made in the NumPy community.   

When we were a smaller bunch of people it seemed easier to come to an agreement 
and things pretty much evolved based on (mostly) consensus and who was 
available to actually do the work. 

There is a need for a more clear structure so that we know how decisions will 
get made and so that code can move forward while paying attention to the 
current user-base.   There has been a steering committee structure for SciPy 
in the past, and I have certainly been prone to lump both NumPy and SciPy 
together given that I have a strong interest in and have spent a great amount 
of time working on both projects.  Others have also spent time on both 
projects. 

However, I think it is critical at this stage to clearly separate the projects 
and define a governing structure that is fair and agreeable for NumPy.   SciPy 
has multiple modules and will probably need structure around each module 
independently.  For now, I wanted to open up a discussion to see what people 
thought about NumPy's governance.

My initial thoughts: 

* discussions happen as they do now on the mailing list
* a small group of developers (5-11) constitute the board and major 
decisions are made by vote of that group (not just simple majority --- needs at 
least 2/3 +1 votes). 
* votes are +1/+0/-0/-1  
* if a topic is difficult to resolve it is moved off the main list and 
discussed on a separate board mailing list --- these should be rare, but 
parts of the NA discussion would probably qualify
* This board mailing list is publicly viewable but only board 
members may post. 
* The board is renewed and adjusted each year --- based on nomination 
and 2/3 vote of the current board until board is at 11.  
* The chairman of the board is voted by a majority of the board and has 
veto power unless over-ridden by 3/4 of the board.
* Petitions to remove people off the board can be made by 50+ 
independent reverse nominations (hopefully people will just withdraw if they 
are no longer active). 

All of these points are open for discussion.  I just thought I would start the 
conversation.   I will be much more active this next year with NumPy and will 
be very interested in the direction NumPy is taking.  I'm hoping to discern 
by this conversation, who else is very interested in the direction of NumPy so 
that the first board can be formally constituted.  

Best regards,

-Travis




[Numpy-discussion] failure to register ufunc loops for user defined types

2011-12-03 Thread Geoffrey Irving
Hello,

I'm trying to add a fixed precision rational number dtype to numpy,
and am running into an issue trying to register ufunc loops.  The code
in question looks like

int npy_rational = PyArray_RegisterDataType(rational_descr);
PyObject* equal = ... // extract equal object from the imported numpy module
int types[3] = {npy_rational,npy_rational,NPY_BOOL};
if (PyUFunc_RegisterLoopForType((PyUFuncObject*)ufunc,npy_rational,rational_ufunc_##name,_types,0)<0)
    return 0;

In Python 2.6.7 with the latest numpy from git, I get

>>> from rational import *
>>> i = array([rational(5,3)])
>>> i
array([5/3], dtype=rational)
>>> equal(i,i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: ufunc 'equal' not supported for the input types, and
the inputs could not be safely coerced to any supported types
according to the casting rule ''safe''

The same thing happens with (rational,rational)->rational ufuncs like multiply.

The full extension module code is here:

https://github.com/girving/poker/blob/rational/rational.cpp

I realize this isn't much information to go on, but let me know if
anything comes to mind in terms of possible reasons or further tests
to run.  Unfortunately it looks like the ufunc ntypes and types
properties aren't updated based on user-defined loops, so I'm not yet
sure if the problem is in registry or resolution.

It's also possible someone else hit this before:
http://projects.scipy.org/numpy/ticket/1913.

Thanks,
Geoffrey


Re: [Numpy-discussion] NumPy Governance

2011-12-03 Thread Matthew Brett
Hi Travis,

Thanks very much for starting this discussion.

You have probably seen that my preference would be for all discussions
to be public - in the sense that all can contribute.  So, it seems
reasonable to me to have 'board' as you describe, but that the board
should vote on the same mailing list as the rest of the discussion.
Having a separate mailing list for discussion makes the separation
overt between those with a granted voice and those without, and I
would hope for a structure which emphasized discussion in an open
forum.

Put another way, what advantage would having a separate public mailing
list have?

How does this governance compare to that of - say - Linux or Python or Debian?

My worry will be that it will be too tempting to terminate discussions
and proceed to resolve by vote, when voting (as Karl Vogel describes)
may still do harm.

What will be the position - maybe I mean your position - on consensus
as Nathaniel has described it?  I feel the masked array discussion
would have been more productive (and maybe shorter and more to the
point) if there had been some rule-of-thumb that every effort is made
to reach consensus before proceeding to implementation - or a vote.

For example, in the masked array discussion, I would have liked to be
able to say 'hold on, we have a rule that we try our best to reach
consensus; I do not feel we have done that yet'.

See you,

Matthew

I guess that the


Re: [Numpy-discussion] NumPy Governance

2011-12-03 Thread Travis Oliphant
I like the idea of trying to reach consensus first. The only point of 
having a board is to have some way to resolve issues should consensus not be 
reachable.   Believe me,  I'm not that excited about a separate mailing list.   
It would be great if we could resolve everything on a single list. 

-Travis



---
Travis Oliphant
Enthought, Inc.
oliph...@enthought.com
1-512-536-1057
http://www.enthought.com





[Numpy-discussion] Convert datetime64 to python datetime.datetime in numpy 1.6.1?

2011-12-03 Thread Warren Weckesser
In numpy 1.6.1, what's the most straightforward way to convert a datetime64
to a python datetime.datetime?  E.g. I have

In [1]: d = datetime64("2011-12-03 12:34:56.75")

In [2]: d
Out[2]: 2011-12-03 12:34:56.75

I want the same time as a datetime.datetime instance.  My best hack so far
is to parse repr(d) with datetime.datetime.strptime:

In [3]: import datetime

In [4]: dt = datetime.datetime.strptime(repr(d), '%Y-%m-%d %H:%M:%S.%f')

In [5]: dt
Out[5]: datetime.datetime(2011, 12, 3, 12, 34, 56, 75)

That works--unless there are no microseconds, in which case .%f must be
removed from the format string--but there must be a better way.

Warren
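
One way to cope with the missing-microseconds case described above is to try
the format with .%f first and fall back to the format without it.  This is
only a sketch using the standard library; the helper name parse_timestamp is
hypothetical, and it assumes the string has the same layout as repr(d) above.

```python
import datetime


def parse_timestamp(text):
    """Parse 'YYYY-MM-DD HH:MM:SS[.ffffff]' into a datetime.datetime.

    Hypothetical helper: tries the fractional-seconds format first and
    falls back to whole seconds if that does not match.
    """
    for fmt in ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S"):
        try:
            return datetime.datetime.strptime(text, fmt)
        except ValueError:
            # This format did not match; try the next one.
            continue
    raise ValueError("unrecognized timestamp: %r" % (text,))


with_fraction = parse_timestamp("2011-12-03 12:34:56.75")
without_fraction = parse_timestamp("2011-12-03 12:34:56")
print(with_fraction)
print(without_fraction)
```

This does not answer the question for 1.6.1 itself.  On later NumPy releases
(an assumption beyond the version asked about),
d.astype('datetime64[us]').item() should return a datetime.datetime directly,
with no string parsing.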


Re: [Numpy-discussion] NumPy Governance

2011-12-03 Thread Charles R Harris
On Sat, Dec 3, 2011 at 7:18 PM, Travis Oliphant teoliph...@gmail.com wrote:


 Hi everyone,

 There have been some wonderfully vigorous discussions over the past few
 months that have made it clear that we need some clarity about how
 decisions will be made in the NumPy community.

 When we were a smaller bunch of people it seemed easier to come to an
 agreement and things pretty much evolved based on (mostly) consensus and
 who was available to actually do the work.

 There is a need for a clearer structure so that we know how decisions
 will get made and so that code can move forward while paying attention to
 the current user-base.   There has been a steering committee structure
 for SciPy in the past, and I have certainly been prone to lump both NumPy
 and SciPy together given that I have a strong interest in and have spent a
 great amount of time working on both projects.  Others have also spent
 time on both projects.

 However, I think it is critical at this stage to clearly separate the
 projects and define a governing structure that is fair and agreeable for
 NumPy.   SciPy has multiple modules and will probably need structure around
 each module independently.  For now, I wanted to open up a discussion to
 see what people thought about NumPy's governance.

 My initial thoughts:

* discussions happen as they do now on the mailing list
* a small group of developers (5-11) constitute the board and
 major decisions are made by vote of that group (not just simple majority
 --- needs at least 2/3 +1 votes).
* votes are +1/+0/-0/-1
* if a topic is difficult to resolve it is moved off the main list
 and discussed on a separate board mailing list --- these should be rare,
 but parts of the NA discussion would probably qualify
* This board mailing list is publicly viewable but only board
 members may post.
* The board is renewed and adjusted each year --- based on
 nomination and 2/3 vote of the current board until board is at 11.
* The chairman of the board is voted by a majority of the board and
 has veto power unless over-ridden by 3/4 of the board.
* Petitions to remove people from the board can be made by 50+
 independent reverse nominations (hopefully people will just withdraw if
 they are no longer active).

 All of these points are open for discussion.  I just thought I would start
 the conversation.   I will be much more active this next year with NumPy
 and will be very interested in the direction NumPy is taking.  I'm hoping
 to discern by this conversation, who else is very interested in the
 direction of NumPy so that the first board can be formally constituted.


If the purpose of the board is to resolve controversies, the 2/3
requirement is going to cause problems. The reason majority votes are
usually used and that committees are set up with an odd number of members
is that nothing gets resolved otherwise. Doing nothing is not a solution to
missing consensus.  Furthermore, at the current time, I don't think there
are 5 active developers, let alone 11.  With hard work you might scrape
together two or three. Having 5 or 11 people making decisions for the two
or three actually doing the work isn't going to go over well. I would
propose a technical board of one or three people who can step in if an
issue looks like it needs outside intervention. And I would suggest at least
one of the members be someone from the outside but familiar with the
project, say someone like Fernando. The one member model is if we decide to
go with a benevolent dictator. Note that for the smaller boards both the
2/3 and majority votes would be the same number of people ;)

Chuck
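
To make the arithmetic behind that last remark concrete, here is a small
sketch comparing the two thresholds for the board sizes mentioned in this
thread.  It assumes "majority" means more than half of all members and "2/3"
means at least two thirds of all members, rounded up; the function names are
hypothetical and the proposal does not pin down either definition.

```python
def simple_majority(board_size):
    """Smallest number of votes that is more than half the board."""
    return board_size // 2 + 1


def two_thirds(board_size):
    """Smallest number of votes that is at least 2/3 of the board (ceiling)."""
    return (2 * board_size + 2) // 3


# Board sizes discussed in the thread: 1 or 3 (small board), 5 to 11 (proposal).
for size in (1, 3, 5, 11):
    print(size, simple_majority(size), two_thirds(size))
```

Under these assumptions the two thresholds coincide for boards of one or
three and only diverge at five members and above.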