Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Eelco Hoogendoorn
Thanks Warren, I think these are sensible additions.

I would argue to treat the None-False condition as an error. Indeed I agree
one might argue the correct behavior is to 'shuffle' the singleton block of
data, which does nothing; but it's more likely to come up as an unintended
error than as a natural outcome of parametrized behavior.

On Sun, Oct 12, 2014 at 3:31 AM, John Zwinck jzwi...@gmail.com wrote:

 On Sun, Oct 12, 2014 at 6:51 AM, Warren Weckesser
 warren.weckes...@gmail.com wrote:
  I created an issue on github for an enhancement
  to numpy.random.shuffle:
  https://github.com/numpy/numpy/issues/5173

 I like this idea.  I was a bit surprised there wasn't something like
 this already.

  A small wart in this API is the meaning of
 
    shuffle(a, independent=False, axis=None)
 
  It could be argued that the correct behavior is to leave the
  array unchanged. (The current behavior can be interpreted as
  shuffling a 1-d sequence of monolithic blobs; the axis argument
  specifies which axis of the array corresponds to the
  sequence index.  Then `axis=None` means the argument is
  a single monolithic blob, so there is nothing to shuffle.)
  Or an error could be raised.

 Let's think about it from the other direction: if a user wants to
 shuffle all the elements as if it were 1-d, as you point out they
 could do this:

   shuffle(a, axis=None, independent=True)

 But that's a lot of typing.  Maybe we should just let this do the same
 thing:

   shuffle(a, axis=None)

 That seems to be in keeping with the other APIs taking axis as you
 mentioned.  To me, independent has no relevance when the array is
 1-d, it can simply be ignored.

 John Zwinck
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion



Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread John Zwinck
On Sun, Oct 12, 2014 at 3:51 PM, Eelco Hoogendoorn
hoogendoorn.ee...@gmail.com wrote:
 I would argue to treat the None-False condition as an error. Indeed I agree
 one might argue the correct behavior is to 'shuffle' the singleton block of
 data, which does nothing; but it's more likely to come up as an unintended
 error than as a natural outcome of parametrized behavior.

I'm interested to know why you think axis=None should raise an error
if independent=False when independent=False is the default.  What I
mean is, if someone uses this function and wants axis=None (which
seems not totally unusual), why force them to always type in the
boilerplate independent=True to make it work?

John Zwinck


Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Stefan van der Walt
Hi Warren

On 2014-10-12 00:51:56, Warren Weckesser warren.weckes...@gmail.com wrote:
 A small wart in this API is the meaning of

   shuffle(a, independent=False, axis=None)

 It could be argued that the correct behavior is to leave the
 array unchanged.

I like the suggested changes.  Since independent loses its meaning
when axis is None, I would expect this to have the same effect as
`shuffle(a, independent=True, axis=None)`.  I think a shuffle function
that doesn't shuffle will confuse a lot of people!

Stéfan


Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Eelco Hoogendoorn
yeah, a shuffle function that does not shuffle indeed seems like a major
source of bugs to me.

Indeed one could argue that setting axis=None should suffice to give a
clear enough declaration of intent; though I wouldn't mind typing the extra
bit to ensure consistent semantics.



Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Robert Kern
On Sat, Oct 11, 2014 at 11:51 PM, Warren Weckesser
warren.weckes...@gmail.com wrote:

 A small wart in this API is the meaning of

   shuffle(a, independent=False, axis=None)

 It could be argued that the correct behavior is to leave the
 array unchanged. (The current behavior can be interpreted as
 shuffling a 1-d sequence of monolithic blobs; the axis argument
 specifies which axis of the array corresponds to the
 sequence index.  Then `axis=None` means the argument is
 a single monolithic blob, so there is nothing to shuffle.)
 Or an error could be raised.

 What do you think?

It seems to me a perfectly good reason to have two methods instead of
one. I can't imagine when I wouldn't be using a literal True or False
for this, so it really should be two different methods.

That said, I would just make the axis=None behavior the same for both
methods. axis=None does *not* mean treat this like a single
monolithic blob in any of the axis=-having methods; it means flatten
the array and do the operation on the single flattened axis. I think
the latter behavior is a reasonable interpretation of axis=None for
both methods.
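
The existing `axis=None` convention described above can be checked directly. This is a minimal sketch; `np.sort` and `np.sum` are picked here only as representative axis-taking functions and are not named in the thread.

```python
import numpy as np

a = np.array([[3, 1, 2],
              [0, 5, 4]])

# axis=None: the array is flattened first, then the operation runs
# on the single flattened axis.
flat_sorted = np.sort(a, axis=None)   # 1-d result with all six elements
total = np.sum(a, axis=None)          # sum over every element

# Integer axis: each 1-d slice along that axis is handled separately.
row_sorted = np.sort(a, axis=1)

print(flat_sorted.shape, total, row_sorted.shape)
```

Under this reading, `axis=None` for a shuffle would mean "shuffle all elements of the flattened array", not "treat the array as one blob".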

-- 
Robert Kern


Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Warren Weckesser
On Sun, Oct 12, 2014 at 7:57 AM, Robert Kern robert.k...@gmail.com wrote:

 On Sat, Oct 11, 2014 at 11:51 PM, Warren Weckesser
 warren.weckes...@gmail.com wrote:

  A small wart in this API is the meaning of
 
    shuffle(a, independent=False, axis=None)
 
  It could be argued that the correct behavior is to leave the
  array unchanged. (The current behavior can be interpreted as
  shuffling a 1-d sequence of monolithic blobs; the axis argument
  specifies which axis of the array corresponds to the
  sequence index.  Then `axis=None` means the argument is
  a single monolithic blob, so there is nothing to shuffle.)
  Or an error could be raised.
 
  What do you think?

 It seems to me a perfectly good reason to have two methods instead of
 one. I can't imagine when I wouldn't be using a literal True or False
 for this, so it really should be two different methods.



I agree, and my first inclination was to propose a different method (and I
had the bikeshedding conversation with myself about the name: disarrange,
scramble, disorder, randomize, ashuffle, some other variation of
the word shuffle, ...), but I figured the first thing folks would say is
"Why not just add options to shuffle?"  So, choose your battles and all
that.

What do other folks think of making a separate method?



 That said, I would just make the axis=None behavior the same for both
 methods. axis=None does *not* mean treat this like a single
 monolithic blob in any of the axis=-having methods; it means flatten
 the array and do the operation on the single flattened axis. I think
 the latter behavior is a reasonable interpretation of axis=None for
 both methods.



Sounds good to me.

Warren






Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread josef.pktd
On Sun, Oct 12, 2014 at 10:54 AM, Warren Weckesser
warren.weckes...@gmail.com wrote:


 On Sun, Oct 12, 2014 at 7:57 AM, Robert Kern robert.k...@gmail.com wrote:

 On Sat, Oct 11, 2014 at 11:51 PM, Warren Weckesser
 warren.weckes...@gmail.com wrote:

  A small wart in this API is the meaning of
 
    shuffle(a, independent=False, axis=None)
 
  It could be argued that the correct behavior is to leave the
  array unchanged. (The current behavior can be interpreted as
  shuffling a 1-d sequence of monolithic blobs; the axis argument
  specifies which axis of the array corresponds to the
  sequence index.  Then `axis=None` means the argument is
  a single monolithic blob, so there is nothing to shuffle.)
  Or an error could be raised.
 
  What do you think?

 It seems to me a perfectly good reason to have two methods instead of
 one. I can't imagine when I wouldn't be using a literal True or False
 for this, so it really should be two different methods.



 I agree, and my first inclination was to propose a different method (and I
 had the bikeshedding conversation with myself about the name: disarrange,
 scramble, disorder, randomize, ashuffle, some other variation of the
 word shuffle, ...), but I figured the first thing folks would say is Why
 not just add options to shuffle?  So, choose your battles and all that.

 What do other folks think of making a separate method?

I'm not a fan of many similar functions.

What's the difference between permute, shuffle and scramble?
And how do I find or remember which is which?





 That said, I would just make the axis=None behavior the same for both
 methods. axis=None does *not* mean treat this like a single
 monolithic blob in any of the axis=-having methods; it means flatten
 the array and do the operation on the single flattened axis. I think
 the latter behavior is a reasonable interpretation of axis=None for
 both methods.



 Sounds good to me.

+1 (since all the arguments have been already given)


Josef
- Why does sort treat columns independently instead of sorting rows?
- because there is lexsort
- Oh, lexsort, I haven't thought about it in 5 years. It's not even next
to sort in the pop up code completion





Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Warren Weckesser
On Sun, Oct 12, 2014 at 11:20 AM, josef.p...@gmail.com wrote:

 On Sun, Oct 12, 2014 at 10:54 AM, Warren Weckesser
 warren.weckes...@gmail.com wrote:
 
 
  On Sun, Oct 12, 2014 at 7:57 AM, Robert Kern robert.k...@gmail.com
 wrote:
 
  On Sat, Oct 11, 2014 at 11:51 PM, Warren Weckesser
  warren.weckes...@gmail.com wrote:
 
   A small wart in this API is the meaning of
  
     shuffle(a, independent=False, axis=None)
  
   It could be argued that the correct behavior is to leave the
   array unchanged. (The current behavior can be interpreted as
   shuffling a 1-d sequence of monolithic blobs; the axis argument
   specifies which axis of the array corresponds to the
   sequence index.  Then `axis=None` means the argument is
   a single monolithic blob, so there is nothing to shuffle.)
   Or an error could be raised.
  
   What do you think?
 
  It seems to me a perfectly good reason to have two methods instead of
  one. I can't imagine when I wouldn't be using a literal True or False
  for this, so it really should be two different methods.
 
 
 
  I agree, and my first inclination was to propose a different method
  (and I had the bikeshedding conversation with myself about the name:
  disarrange, scramble, disorder, randomize, ashuffle, some other
  variation of the word shuffle, ...), but I figured the first thing
  folks would say is "Why not just add options to shuffle?"  So, choose
  your battles and all that.
 
  What do other folks think of making a separate method?

 I'm not a fan of many similar functions.

 What's the difference between permute, shuffle and scramble?



The difference between `shuffle` and the new method being proposed is
explained in the first email in this thread.
`np.random.permutation` with an array argument returns a shuffled copy of
the array; it does not modify its argument. (It should also get an `axis`
argument when `shuffle` gets an `axis` argument.)
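
The in-place versus copy distinction can be seen with the two existing functions. A minimal sketch using the legacy `np.random` interface discussed in this thread; the variable names are only for illustration.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)
original = a.copy()

# permutation returns a shuffled copy (rows reordered); `a` is untouched.
b = np.random.permutation(a)
unchanged = np.array_equal(a, original)

# shuffle reorders the rows of `a` in place and returns None.
result = np.random.shuffle(a)

# In both cases the rows stay intact; only their order changes.
rows_original = sorted(map(tuple, original.tolist()))
rows_copy = sorted(map(tuple, b.tolist()))
rows_inplace = sorted(map(tuple, a.tolist()))
print(unchanged, result, rows_copy == rows_original, rows_inplace == rows_original)
```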


 And how do I find or remember which is which?



You could start with `doc(np.random)` (or `np.random?` in ipython).

Warren





 
 
 


Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread josef.pktd
On Sun, Oct 12, 2014 at 11:33 AM, Warren Weckesser
warren.weckes...@gmail.com wrote:


 On Sun, Oct 12, 2014 at 11:20 AM, josef.p...@gmail.com wrote:

 On Sun, Oct 12, 2014 at 10:54 AM, Warren Weckesser
 warren.weckes...@gmail.com wrote:
 
 
  On Sun, Oct 12, 2014 at 7:57 AM, Robert Kern robert.k...@gmail.com
  wrote:
 
  On Sat, Oct 11, 2014 at 11:51 PM, Warren Weckesser
  warren.weckes...@gmail.com wrote:
 
   A small wart in this API is the meaning of
  
     shuffle(a, independent=False, axis=None)
  
   It could be argued that the correct behavior is to leave the
   array unchanged. (The current behavior can be interpreted as
   shuffling a 1-d sequence of monolithic blobs; the axis argument
   specifies which axis of the array corresponds to the
   sequence index.  Then `axis=None` means the argument is
   a single monolithic blob, so there is nothing to shuffle.)
   Or an error could be raised.
  
   What do you think?
 
  It seems to me a perfectly good reason to have two methods instead of
  one. I can't imagine when I wouldn't be using a literal True or False
  for this, so it really should be two different methods.
 
 
 
  I agree, and my first inclination was to propose a different method
  (and I had the bikeshedding conversation with myself about the name:
  disarrange, scramble, disorder, randomize, ashuffle, some other
  variation of the word shuffle, ...), but I figured the first thing
  folks would say is "Why not just add options to shuffle?"  So, choose
  your battles and all that.
 
  What do other folks think of making a separate method?

 I'm not a fan of many similar functions.

 What's the difference between permute, shuffle and scramble?



 The difference between `shuffle` and the new method being proposed is
 explained in the first email in this thread.
 `np.random.permutation` with an array argument returns a shuffled copy of
 the array; it does not modify its argument. (It should also get an `axis`
 argument when `shuffle` gets an `axis` argument.)


 And how do I find or remember which is which?



 You could start with `doc(np.random)` (or `np.random?` in ipython).

If you have to check the docstring each time, then there is something wrong.
In my opinion all docstrings should be read only once.

It's like a Windows program where the GUI menus are not **self-explanatory**.

What did Save-As do?

Josef





Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Warren Weckesser
On Sat, Oct 11, 2014 at 6:51 PM, Warren Weckesser 
warren.weckes...@gmail.com wrote:

 I created an issue on github for an enhancement
 to numpy.random.shuffle:
 https://github.com/numpy/numpy/issues/5173
 I'd like to get some feedback on the idea.

 Currently, `shuffle` shuffles the first dimension of an array
 in-place.  For example, shuffling a 2D array shuffles the rows:

 In [227]: a
 Out[227]:
 array([[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11]])

 In [228]: np.random.shuffle(a)

 In [229]: a
 Out[229]:
 array([[ 0,  1,  2],
        [ 9, 10, 11],
        [ 3,  4,  5],
        [ 6,  7,  8]])


 To add an axis keyword, we could (in effect) apply `shuffle` to
 `a.swapaxes(axis, 0)`.  For a 2-D array, `axis=1` would shuffle
 the columns:

 In [232]: a = np.arange(15).reshape(3,5)

 In [233]: a
 Out[233]:
 array([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]])

 In [234]: axis = 1

 In [235]: np.random.shuffle(a.swapaxes(axis, 0))

 In [236]: a
 Out[236]:
 array([[ 3,  2,  4,  0,  1],
        [ 8,  7,  9,  5,  6],
        [13, 12, 14, 10, 11]])

 So that's the first part--adding an `axis` keyword.
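
The `swapaxes` approach shown in the session above could be wrapped in a small helper. This is only a sketch: `shuffle_along_axis` is a hypothetical name, not an existing NumPy function, and it relies on `swapaxes` returning a view of the array.

```python
import numpy as np

def shuffle_along_axis(a, axis=0):
    """Shuffle `a` in place along `axis`, keeping the slices intact.

    Hypothetical helper (not part of NumPy): it applies the existing
    np.random.shuffle to a view whose first axis is `axis`.
    """
    # swapaxes returns a view, so shuffling the view modifies `a` itself.
    np.random.shuffle(a.swapaxes(axis, 0))

a = np.arange(15).reshape(3, 5)
shuffle_along_axis(a, axis=1)
# Columns are moved as whole units: every row sees the same permutation.
print(a)
```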

 The other part of the enhancement request is to add a shuffle
 behavior that shuffles the 1-d slices *independently*.  That is,
 for a 2-d array, shuffling with `axis=0` would apply a different
 shuffle to each column.  In the github issue, I defined a
 function called `disarrange` that implements this behavior:

 In [240]: a
 Out[240]:
 array([[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11],
        [12, 13, 14]])

 In [241]: disarrange(a, axis=0)

 In [242]: a
 Out[242]:
 array([[ 6,  1,  2],
        [ 3, 13, 14],
        [ 9, 10,  5],
        [12,  7,  8],
        [ 0,  4, 11]])

 Note that each column has been shuffled independently.

 This behavior is analogous to how `sort` handles the `axis`
 keyword.  `sort` sorts the 1-d slices along the given axis
 independently.

 In the github issue, I suggested the following signature
 for `shuffle` (but I'm not too fond of the name `independent`):

   def shuffle(a, independent=False, axis=0)

 If `independent` is False, the current behavior of `shuffle`
 is used.  If `independent` is True, each 1-d slice is shuffled
 independently (in the same way that `sort` sorts each 1-d
 slice).
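
One way the independent behavior might be implemented is to loop over the 1-d slices and shuffle each one separately. The sketch below is in the spirit of `disarrange` but is not claimed to be the code from the github issue; the name `shuffle_independent` and the loop-based approach are assumptions.

```python
import numpy as np

def shuffle_independent(a, axis=0):
    """Shuffle each 1-d slice of `a` along `axis` independently, in place.

    Hypothetical sketch, not an existing NumPy function.
    """
    # Move the target axis last so that b[idx] is one 1-d slice.
    # b is a view, so shuffling its slices modifies `a` itself.
    b = a.swapaxes(axis, -1)
    for idx in np.ndindex(b.shape[:-1]):
        np.random.shuffle(b[idx])

a = np.arange(15).reshape(5, 3)
shuffle_independent(a, axis=0)
# Each column still holds its original values, in a column-specific order.
print(a)
```

A Python-level loop over slices is simple but slow for arrays with many slices; it is used here only to make the semantics explicit.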

 Like most functions that take an `axis` argument, `axis=None`
 means to shuffle the flattened array.  With `independent=True`,
 it would act like `np.random.shuffle(a.flat)`, e.g.

 In [247]: a
 Out[247]:
 array([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]])

 In [248]: np.random.shuffle(a.flat)

 In [249]: a
 Out[249]:
 array([[ 0, 14,  9,  1, 13],
        [ 2,  8,  5,  3,  4],
        [ 6, 10,  7, 12, 11]])


 A small wart in this API is the meaning of

   shuffle(a, independent=False, axis=None)

 It could be argued that the correct behavior is to leave the
 array unchanged. (The current behavior can be interpreted as
 shuffling a 1-d sequence of monolithic blobs; the axis argument
 specifies which axis of the array corresponds to the
 sequence index.  Then `axis=None` means the argument is
 a single monolithic blob, so there is nothing to shuffle.)
 Or an error could be raised.

 What do you think?

 Warren




It is clear from the comments so far that, when `axis` is None, the result
should be a shuffle of all the elements in the array, for both methods of
shuffling (whether implemented as a new method or with a boolean argument
to `shuffle`).  Forget I ever suggested doing nothing or raising an error.
:)
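
The agreed `axis=None` behavior (shuffle all the elements) could look like the following. This is a rough sketch only: `shuffle_flat` is a hypothetical name, and it assumes the array is laid out so that `reshape(-1)` returns a view rather than a copy.

```python
import numpy as np

def shuffle_flat(a):
    """Shuffle all elements of `a` in place, as if it were 1-d.

    Hypothetical helper illustrating the axis=None case.
    """
    flat = a.reshape(-1)
    # reshape may silently copy (e.g. for some non-contiguous arrays);
    # in that case an in-place shuffle through `flat` would be lost.
    if a.size and not np.shares_memory(flat, a):
        raise ValueError("reshape made a copy; cannot shuffle in place")
    np.random.shuffle(flat)

a = np.arange(15).reshape(3, 5)
shuffle_flat(a)
# The shape is unchanged; the 15 values are spread over the whole array.
print(a)
```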

Josef's comment reminded me that `numpy.random.permutation` returns a
shuffled copy of the array (when its argument is an array).  This function
should also get an `axis` argument.  `permutation` shuffles the same way
`shuffle` does--it simply makes a copy and then calls `shuffle` on the
copy.  If a new method is added for the new shuffling style, then it would
be consistent to also add a new method that uses the new shuffling style
and returns a copy of the shuffled array.  We would then have four
methods:

                        In-place     Copy
Current shuffle style   shuffle      permutation
New shuffle style       (name TBD)   (name TBD)

(All of them will have an `axis` argument.)

I suspect this will make some folks prefer the approach of adding a boolean
argument to `shuffle` and `permutation`.

Warren


Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Sebastian

On 2014-10-12 16:54, Warren Weckesser wrote:


 On Sun, Oct 12, 2014 at 7:57 AM, Robert Kern robert.k...@gmail.com wrote:

 On Sat, Oct 11, 2014 at 11:51 PM, Warren Weckesser
 warren.weckes...@gmail.com wrote:

  A small wart in this API is the meaning of
 
    shuffle(a, independent=False, axis=None)
 
  It could be argued that the correct behavior is to leave the
  array unchanged. (The current behavior can be interpreted as
  shuffling a 1-d sequence of monolithic blobs; the axis argument
  specifies which axis of the array corresponds to the
  sequence index.  Then `axis=None` means the argument is
  a single monolithic blob, so there is nothing to shuffle.)
  Or an error could be raised.
 
  What do you think?

 It seems to me a perfectly good reason to have two methods instead of
 one. I can't imagine when I wouldn't be using a literal True or False
 for this, so it really should be two different methods.



 I agree, and my first inclination was to propose a different method
 (and I had the bikeshedding conversation with myself about the name:
 disarrange, scramble, disorder, randomize, ashuffle, some
 other variation of the word shuffle, ...), but I figured the first
 thing folks would say is Why not just add options to shuffle?  So,
 choose your battles and all that.

 What do other folks think of making a separate method?

I'm not a fan of more methods with similar functionality in Numpy. It's
already hard to keep track of the existing functions and all their possible
applications and variants. The axis=None proposal for shuffling all
items is very intuitive.

I think we don't want to take the path of matlab: a huge number of
powerful functions, but few people know of their possibilities.

regards,
Sebastian




Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread josef.pktd
On Sun, Oct 12, 2014 at 12:14 PM, Warren Weckesser
warren.weckes...@gmail.com wrote:



 It is clear from the comments so far that, when `axis` is None, the result
 should be a shuffle of all the elements in the array, for both methods of
 shuffling (whether implemented as a new method or with a boolean argument to
 `shuffle`).  Forget I ever suggested doing nothing or raising an error. :)

 Josef's comment reminded me that `numpy.random.permutation`

which kind of proves my point

I sometimes have problems finding `shuffle` because I want a function
that does permutation.

Josef

 returns a
 shuffled copy of the array (when its argument is an array).  This function
 should also get an `axis` argument.  `permutation` shuffles the same way
 `shuffle` does--it simply makes a copy and then calls `shuffle` on the copy.
 If a new method is added for the new shuffling style, then it would be
 consistent to also add a new method that uses the new shuffling style and
 returns a copy of the shuffled array.  We would then have four
 methods:

                         In-place     Copy
 Current shuffle style   shuffle      permutation
 New shuffle style       (name TBD)   (name TBD)

 (All of them will have an `axis` argument.)

 I suspect this will make some folks prefer the approach of adding a boolean
 argument to `shuffle` and `permutation`.

 Warren



Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Warren Weckesser
On Sun, Oct 12, 2014 at 12:14 PM, Warren Weckesser 
warren.weckes...@gmail.com wrote:



 On Sat, Oct 11, 2014 at 6:51 PM, Warren Weckesser 
 warren.weckes...@gmail.com wrote:

 I created an issue on github for an enhancement
 to numpy.random.shuffle:
 https://github.com/numpy/numpy/issues/5173
 I'd like to get some feedback on the idea.

 Currently, `shuffle` shuffles the first dimension of an array
 in-place.  For example, shuffling a 2D array shuffles the rows:

 In [227]: a
 Out[227]:
 array([[ 0,  1,  2],
[ 3,  4,  5],
[ 6,  7,  8],
[ 9, 10, 11]])

 In [228]: np.random.shuffle(a)

 In [229]: a
 Out[229]:
 array([[ 0,  1,  2],
[ 9, 10, 11],
[ 3,  4,  5],
[ 6,  7,  8]])


 To add an axis keyword, we could (in effect) apply `shuffle` to
 `a.swapaxes(axis, 0)`.  For a 2-D array, `axis=1` would shuffles
 the columns:

 In [232]: a = np.arange(15).reshape(3,5)

 In [233]: a
 Out[233]:
 array([[ 0,  1,  2,  3,  4],
[ 5,  6,  7,  8,  9],
[10, 11, 12, 13, 14]])

 In [234]: axis = 1

 In [235]: np.random.shuffle(a.swapaxes(axis, 0))

 In [236]: a
 Out[236]:
 array([[ 3,  2,  4,  0,  1],
[ 8,  7,  9,  5,  6],
[13, 12, 14, 10, 11]])

 So that's the first part--adding an `axis` keyword.

 The other part of the enhancement request is to add a shuffle
 behavior that shuffles the 1-d slices *independently*.  That is,
 for a 2-d array, shuffling with `axis=0` would apply a different
 shuffle to each column.  In the github issue, I defined a
 function called `disarrange` that implements this behavior:

 In [240]: a
 Out[240]:
 array([[ 0,  1,  2],
[ 3,  4,  5],
[ 6,  7,  8],
[ 9, 10, 11],
[12, 13, 14]])

 In [241]: disarrange(a, axis=0)

 In [242]: a
 Out[242]:
 array([[ 6,  1,  2],
[ 3, 13, 14],
[ 9, 10,  5],
[12,  7,  8],
[ 0,  4, 11]])

 Note that each column has been shuffled independently.

 This behavior is analogous to how `sort` handles the `axis`
 keyword.  `sort` sorts the 1-d slices along the given axis
 independently.

 In the github issue, I suggested the following signature
 for `shuffle` (but I'm not too fond of the name `independent`):

   def shuffle(a, independent=False, axis=0)

 If `independent` is False, the current behavior of `shuffle`
 is used.  If `independent` is True, each 1-d slice is shuffled
 independently (in the same way that `sort` sorts each 1-d
 slice).

 Like most functions that take an `axis` argument, `axis=None`
 means to shuffle the flattened array.  With `independent=True`,
 it would act like `np.random.shuffle(a.flat)`, e.g.

 In [247]: a
 Out[247]:
 array([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]])

 In [248]: np.random.shuffle(a.flat)

 In [249]: a
 Out[249]:
 array([[ 0, 14,  9,  1, 13],
        [ 2,  8,  5,  3,  4],
        [ 6, 10,  7, 12, 11]])


 A small wart in this API is the meaning of

   shuffle(a, independent=False, axis=None)

 It could be argued that the correct behavior is to leave the
 array unchanged. (The current behavior can be interpreted as
 shuffling a 1-d sequence of monolithic blobs; the axis argument
 specifies which axis of the array corresponds to the
 sequence index.  Then `axis=None` means the argument is
 a single monolithic blob, so there is nothing to shuffle.)
 Or an error could be raised.

 What do you think?

 Warren




 It is clear from the comments so far that, when `axis` is None, the result
 should be a shuffle of all the elements in the array, for both methods of
 shuffling (whether implemented as a new method or with a boolean argument
 to `shuffle`).  Forget I ever suggested doing nothing or raising an error.
 :)

 Josef's comment reminded me that `numpy.random.permutation` returns a
 shuffled copy of the array (when its argument is an array).  This function
 should also get an `axis` argument.  `permutation` shuffles the same way
 `shuffle` does--it simply makes a copy and then calls `shuffle` on the
 copy.  If a new method is added for the new shuffling style, then it would
 be consistent to also add a new method that uses the new shuffling style
 and returns a copy of the shuffled array.  We would then have four
 methods:

                         In-place     Copy
 Current shuffle style   shuffle      permutation
 New shuffle style       (name TBD)   (name TBD)

 (All of them will have an `axis` argument.)



That table makes me think that, *if* we go with new methods, the names
should be `shuffleXXX` and `permutationXXX`, where `XXX` is a common suffix
that is to be determined.  That will ensure that the names appear together
in alphabetical lists, and should show up together as options in
tab-completion or code-completion.

Warren


 I suspect this will make some folks prefer the approach of adding a
 boolean argument to `shuffle` and `permutation`.

 Warren



[Numpy-discussion] [ANN] bcolz 0.7.2

2014-10-12 Thread Valentin Haenel

======================
Announcing bcolz 0.7.2
======================

What's new
==========

This is a maintenance release that fixes various bits and pieces.
Importantly, compatibility with Numpy 1.9 and Cython 0.21 has been fixed
and the test suite no longer segfaults on 32 bit UNIX. Feature-wise a new
``carray.view()`` method has been introduced which allows carrays to
share the same raw data.

``bcolz`` is a renaming of the ``carray`` project.  The new goals for
the project are to create simple, yet flexible compressed containers,
that can live either on-disk or in-memory, and with some
high-performance iterators (like `iter()`, `where()`) for querying them.

Together, bcolz and the Blosc compressor are finally fulfilling the
promise of accelerating memory I/O, at least for some real scenarios:

http://nbviewer.ipython.org/github/Blosc/movielens-bench/blob/master/querying-ep14.ipynb#Plots

For more detailed info, see the release notes in:
https://github.com/Blosc/bcolz/wiki/Release-Notes


What it is
==========

bcolz provides columnar and compressed data containers.  Column storage
allows for efficiently querying tables with a large number of columns.
It also allows for cheap addition and removal of columns.  In addition,
bcolz objects are compressed by default for reducing memory/disk I/O
needs.  The compression process is carried out internally by Blosc, a
high-performance compressor that is optimized for binary data.

bcolz can use numexpr internally so as to accelerate many vector and
query operations (although it can use pure NumPy for doing so too).
numexpr optimizes the memory usage and uses several cores for doing the
computations, so it is blazing fast.  Moreover, the carray/ctable
containers can be disk-based, and it is possible to use them for
seamlessly performing out-of-memory computations.

bcolz has minimal dependencies (NumPy), comes with an exhaustive test
suite and fully supports both 32-bit and 64-bit platforms.  Also, it is
typically tested on both UNIX and Windows operating systems.


Installing
==========

bcolz is in the PyPI repository, so installing it is easy::

    $ pip install -U bcolz


Resources
=========

Visit the main bcolz site repository at:
http://github.com/Blosc/bcolz

Manual:
http://bcolz.blosc.org

Home of Blosc compressor:
http://blosc.org

User's mail list:
bc...@googlegroups.com
http://groups.google.com/group/bcolz

License is the new BSD:
https://github.com/Blosc/bcolz/blob/master/LICENSES/BCOLZ.txt




  **Enjoy data!**

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Detect if array has been transposed

2014-10-12 Thread Mads Ipsen
Hi,

In part of my C++ code, I often do

void foo(PyObject * matrix) {
    // do stuff
}

where matrix is an m x n numpy array created on the Python side, and
foo() is invoked, e.g., as

a = numpy.array([[1,2],[3,5]])
foo(a)

However, if you call transpose() on a, some care should be taken, since 
numpy's internal matrix data first gets transposed on demand. In that 
case I must do

a = numpy.array([[1,2],[3,5]])
a = a.transpose()
foo(a.copy())

to make sure the correct data of the array gets transferred to the C++ 
side.

Is there any way for me to detect (on the Python side) that transpose() 
has been invoked on the matrix, and thereby only do the copy operation 
when it really is needed? For example

if a_has_transposed_data:
 foo(a.copy())
else:
 foo(a)

Best regards,

Mads

-- 
+-+
| Mads Ipsen  |
+--+--+
| Gåsebæksvej 7, 4. tv | phone:  +45-29716388 |
| DK-2500 Valby| email:  mads.ip...@gmail.com |
| Denmark  | map  :   www.tinyurl.com/ns52fpa |
+--+--+


Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Jaime Fernández del Río
On Sun, Oct 12, 2014 at 9:29 AM, Warren Weckesser 
warren.weckes...@gmail.com wrote:



 On Sun, Oct 12, 2014 at 12:14 PM, Warren Weckesser 
 warren.weckes...@gmail.com wrote:



 On Sat, Oct 11, 2014 at 6:51 PM, Warren Weckesser 
 warren.weckes...@gmail.com wrote:

 I created an issue on github for an enhancement
 to numpy.random.shuffle:
 https://github.com/numpy/numpy/issues/5173
 I'd like to get some feedback on the idea.

 Currently, `shuffle` shuffles the first dimension of an array
 in-place.  For example, shuffling a 2D array shuffles the rows:

 In [227]: a
 Out[227]:
 array([[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11]])

 In [228]: np.random.shuffle(a)

 In [229]: a
 Out[229]:
 array([[ 0,  1,  2],
        [ 9, 10, 11],
        [ 3,  4,  5],
        [ 6,  7,  8]])


 To add an axis keyword, we could (in effect) apply `shuffle` to
 `a.swapaxes(axis, 0)`.  For a 2-D array, `axis=1` would shuffle
 the columns:

 In [232]: a = np.arange(15).reshape(3,5)

 In [233]: a
 Out[233]:
 array([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]])

 In [234]: axis = 1

 In [235]: np.random.shuffle(a.swapaxes(axis, 0))

 In [236]: a
 Out[236]:
 array([[ 3,  2,  4,  0,  1],
        [ 8,  7,  9,  5,  6],
        [13, 12, 14, 10, 11]])

 So that's the first part--adding an `axis` keyword.

 The other part of the enhancement request is to add a shuffle
 behavior that shuffles the 1-d slices *independently*.  That is,
 for a 2-d array, shuffling with `axis=0` would apply a different
 shuffle to each column.  In the github issue, I defined a
 function called `disarrange` that implements this behavior:

 In [240]: a
 Out[240]:
 array([[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11],
        [12, 13, 14]])

 In [241]: disarrange(a, axis=0)

 In [242]: a
 Out[242]:
 array([[ 6,  1,  2],
        [ 3, 13, 14],
        [ 9, 10,  5],
        [12,  7,  8],
        [ 0,  4, 11]])

 Note that each column has been shuffled independently.

 This behavior is analogous to how `sort` handles the `axis`
 keyword.  `sort` sorts the 1-d slices along the given axis
 independently.

 In the github issue, I suggested the following signature
 for `shuffle` (but I'm not too fond of the name `independent`):

   def shuffle(a, independent=False, axis=0)

 If `independent` is False, the current behavior of `shuffle`
 is used.  If `independent` is True, each 1-d slice is shuffled
 independently (in the same way that `sort` sorts each 1-d
 slice).

 Like most functions that take an `axis` argument, `axis=None`
 means to shuffle the flattened array.  With `independent=True`,
 it would act like `np.random.shuffle(a.flat)`, e.g.

 In [247]: a
 Out[247]:
 array([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]])

 In [248]: np.random.shuffle(a.flat)

 In [249]: a
 Out[249]:
 array([[ 0, 14,  9,  1, 13],
        [ 2,  8,  5,  3,  4],
        [ 6, 10,  7, 12, 11]])


 A small wart in this API is the meaning of

   shuffle(a, independent=False, axis=None)

 It could be argued that the correct behavior is to leave the
 array unchanged. (The current behavior can be interpreted as
 shuffling a 1-d sequence of monolithic blobs; the axis argument
 specifies which axis of the array corresponds to the
 sequence index.  Then `axis=None` means the argument is
 a single monolithic blob, so there is nothing to shuffle.)
 Or an error could be raised.

 What do you think?

 Warren




 It is clear from the comments so far that, when `axis` is None, the
 result should be a shuffle of all the elements in the array, for both
 methods of shuffling (whether implemented as a new method or with a boolean
 argument to `shuffle`).  Forget I ever suggested doing nothing or raising
 an error. :)

 Josef's comment reminded me that `numpy.random.permutation` returns a
 shuffled copy of the array (when its argument is an array).  This function
 should also get an `axis` argument.  `permutation` shuffles the same way
 `shuffle` does--it simply makes a copy and then calls `shuffle` on the
 copy.  If a new method is added for the new shuffling style, then it would
 be consistent to also add a new method that uses the new shuffling style
 and returns a copy of the shuffled array.  We would then have four
 methods:

                         In-place     Copy
 Current shuffle style   shuffle      permutation
 New shuffle style       (name TBD)   (name TBD)

 (All of them will have an `axis` argument.)



 That table makes me think that, *if* we go with new methods, the names
 should be `shuffleXXX` and `permutationXXX`, where `XXX` is a common suffix
 that is to be determined.  That will ensure that the names appear together
 in alphabetical lists, and should show up together as options in
 tab-completion or code-completion.


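Just to add some noise to a productive conversation: if you add a 'copy'
flag to shuffle, then all the functionality is in one place, and
'permutation' can either be deprecated, or trivially implemented in terms
of the new 'shuffle'.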

Re: [Numpy-discussion] Detect if array has been transposed

2014-10-12 Thread Pauli Virtanen
12.10.2014, 20:19, Mads Ipsen wrote:
 Is there any way for me to detect (on the Python side) that transpose() 
 has been invoked on the matrix, and thereby only do the copy operation 
 when it really is needed? 

The correct way to do this is to, either:

In your C code check PyArray_IS_C_CONTIGUOUS(obj) and raise an error if
it is not. In addition, on the Python side, check for
`a.flags.c_contiguous` and make a copy if it is not.

OR

In your C code, get a handle to the array using PyArray_FromAny (or
PyArray_FROM_OTF) with NPY_ARRAY_C_CONTIGUOUS requirement set so that it
makes a copy when necessary.
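The Python-side half of the first option might look like the sketch below. The helper name `as_c_input` is hypothetical; the only numpy features it uses are the `flags.c_contiguous` attribute and `copy(order="C")`.

```python
import numpy as np

a = np.array([[1, 2], [3, 5]])
t = a.transpose()   # a view on the same memory with swapped strides

print(a.flags.c_contiguous)  # True
print(t.flags.c_contiguous)  # False

def as_c_input(arr):
    """Return `arr` unchanged if it is C-contiguous, else a C-ordered copy.

    Hypothetical helper for illustration only.
    """
    return arr if arr.flags.c_contiguous else arr.copy(order="C")

safe = as_c_input(t)
print(safe.flags.c_contiguous)  # True
```

Note that the flag describes the memory layout, not whether transpose() was called: a transposed array with a length-1 dimension can still be C-contiguous, and in that case no copy is needed.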




Re: [Numpy-discussion] Detect if array has been transposed

2014-10-12 Thread Eric Firing
On 2014/10/12, 8:29 AM, Pauli Virtanen wrote:
 12.10.2014, 20:19, Mads Ipsen wrote:
 Is there any way for me to detect (on the Python side) that transpose()
 has been invoked on the matrix, and thereby only do the copy operation
 when it really is needed?

 The correct way to do this is to, either:

 In your C code check PyArray_IS_C_CONTIGUOUS(obj) and raise an error if
 it is not. In addition, on the Python side, check for
 `a.flags.c_contiguous` and make a copy if it is not.

 OR

 In your C code, get a handle to the array using PyArray_FromAny (or
 PyArray_FROM_OTF) with NPY_ARRAY_C_CONTIGUOUS requirement set so that it
 makes a copy when necessary.


or let numpy handle it on the python side:

foo(numpy.ascontiguousarray(a))
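A short check of what `ascontiguousarray` does in the two cases (a sketch; the variable names are only for illustration):

```python
import numpy as np

a = np.array([[1, 2], [3, 5]])
t = a.T                               # transposed view, not C-contiguous

same = np.ascontiguousarray(a)        # already contiguous: no new data
fixed = np.ascontiguousarray(t)       # not contiguous: data is copied

print(np.shares_memory(same, a))      # True
print(np.shares_memory(fixed, t))     # False
print(fixed.flags.c_contiguous)       # True
```

So the copy only happens when it is needed, which is what the original question asked for.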





Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Stephan Hoyer
On Sun, Oct 12, 2014 at 10:56 AM, Jaime Fernández del Río 
jaime.f...@gmail.com wrote:

 Just to add some noise to a productive conversation: if you add a 'copy'
 flag to shuffle, then all the functionality is in one place, and
 'permutation' can either be deprecated, or trivially implemented in terms
 of the new 'shuffle'.


+1

Unfortunately, shuffle has the better name, but permutation has the better
default behavior.

(also, I think inplace might be a less ambiguous name for the argument
than copy)
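To make the single-function idea concrete, here is a rough sketch. Both the function name `shuffle_with_flag` and the keyword `inplace` are assumptions for illustration; nothing like this exists in numpy.random.

```python
import numpy as np

def shuffle_with_flag(a, inplace=True):
    """Shuffle `a` along its first axis.

    Hypothetical sketch: with inplace=True it behaves like
    np.random.shuffle and returns None; with inplace=False it leaves
    `a` alone and returns a shuffled copy, like np.random.permutation.
    """
    target = a if inplace else np.array(a, copy=True)
    np.random.shuffle(target)
    return None if inplace else target

a = np.arange(10)
b = shuffle_with_flag(a, inplace=False)   # `a` is untouched
shuffle_with_flag(a)                      # `a` is shuffled in place
```

The default chosen here keeps the current behavior of `shuffle`; the point about `permutation` having the better default would argue for `inplace=False` instead.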


Re: [Numpy-discussion] Detect if array has been transposed

2014-10-12 Thread Pauli Virtanen
12.10.2014, 22:16, Eric Firing wrote:
 On 2014/10/12, 8:29 AM, Pauli Virtanen wrote:
 12.10.2014, 20:19, Mads Ipsen wrote:
 Is there any way for me to detect (on the Python side) that transpose()
 has been invoked on the matrix, and thereby only do the copy operation
 when it really is needed?

 The correct way to do this is to, either:

 In your C code check PyArray_IS_C_CONTIGUOUS(obj) and raise an error if
 it is not. In addition, on the Python side, check for
 `a.flags.c_contiguous` and make a copy if it is not.

 OR

 In your C code, get a handle to the array using PyArray_FromAny (or
 PyArray_FROM_OTF) with NPY_ARRAY_C_CONTIGUOUS requirement set so that it
 makes a copy when necessary.
 
 or let numpy handle it on the python side:
 
 foo(numpy.ascontiguousarray(a))

Yes, but the C code really should check that the input array is
C-contiguous, if it only works for C-contiguous inputs.

-- 
Pauli Virtanen



Re: [Numpy-discussion] Detect if array has been transposed

2014-10-12 Thread Nathaniel Smith
On Mon, Oct 13, 2014 at 12:07 AM, Pauli Virtanen p...@iki.fi wrote:
 12.10.2014, 22:16, Eric Firing wrote:
 On 2014/10/12, 8:29 AM, Pauli Virtanen wrote:
 12.10.2014, 20:19, Mads Ipsen wrote:
 Is there any way for me to detect (on the Python side) that transpose()
 has been invoked on the matrix, and thereby only do the copy operation
 when it really is needed?

 The correct way to do this is to, either:

 In your C code check PyArray_IS_C_CONTIGUOUS(obj) and raise an error if
 it is not. In addition, on the Python side, check for
 `a.flags.c_contiguous` and make a copy if it is not.

 OR

 In your C code, get a handle to the array using PyArray_FromAny (or
 PyArray_FROM_OTF) with NPY_ARRAY_C_CONTIGUOUS requirement set so that it
 makes a copy when necessary.

 or let numpy handle it on the python side:

 foo(numpy.ascontiguousarray(a))

 Yes, but the C code really should check that the input array is
 C-contiguous, if it only works for C-contiguous inputs.

I.e. your original instructions were correct, but instead of checking
a.flags.c_contiguous by hand etc. the OP should just call
ascontiguousarray which takes care of that part.

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org


Re: [Numpy-discussion] Installing numpy without root access

2014-10-12 Thread Lahiru Samarakoon
Guys, any advice is highly appreciated. I am a little new to building in
Linux.
Thanks,
Lahiru

On Sat, Oct 11, 2014 at 9:43 AM, Lahiru Samarakoon lahir...@gmail.com
wrote:

 I switched to numpy-1.8.2. Now I am getting the following error. I am using
 the LAPACK that comes with the ATLAS installation. Can this be a problem?

 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/home/svu/a0095654/.local/lib/python2.7/site-packages/numpy/__init__.py", line 170, in <module>
     from . import add_newdocs
   File "/home/svu/a0095654/.local/lib/python2.7/site-packages/numpy/add_newdocs.py", line 13, in <module>
     from numpy.lib import add_newdoc
   File "/home/svu/a0095654/.local/lib/python2.7/site-packages/numpy/lib/__init__.py", line 18, in <module>
     from .polynomial import *
   File "/home/svu/a0095654/.local/lib/python2.7/site-packages/numpy/lib/polynomial.py", line 19, in <module>
     from numpy.linalg import eigvals, lstsq, inv
   File "/home/svu/a0095654/.local/lib/python2.7/site-packages/numpy/linalg/__init__.py", line 51, in <module>
     from .linalg import *
   File "/home/svu/a0095654/.local/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 29, in <module>
     from numpy.linalg import lapack_lite, _umath_linalg
 ImportError: /home/svu/a0095654/.local/lib/python2.7/site-packages/numpy/linalg/lapack_lite.so: undefined symbol: zgesdd_

 On Sat, Oct 11, 2014 at 1:30 AM, Julian Taylor 
 jtaylor.deb...@googlemail.com wrote:

 On 10.10.2014 19:26, Lahiru Samarakoon wrote:
  Red Hat Enterprise Linux release 5.8
  gcc (GCC) 4.1.2
 
  I am also trying to install numpy 1.9.

 that is the broken platform, please try the master branch or the
 maintenance/1.9.x branch, those should work now.

 Are there volunteers to report this to redhat?

 
  On Sat, Oct 11, 2014 at 12:59 AM, Julian Taylor
  jtaylor.deb...@googlemail.com
  wrote:
 
  On 10.10.2014 18:51, Lahiru Samarakoon wrote:
   Dear all,
  
   I am trying to install numpy without root access. So I am
 building from
   the source.  I have installed atlas which also has lapack with
 it.  I
   changed the site.cfg file as given below
  
   [DEFAULT]
   library_dirs = /home/svu/a0095654/ATLAS/build/lib
   include_dirs = /home/svu/a0095654/ATLAS/build/include
  
  
   However, I am getting a segmentation fault when importing numpy.
  
   Please advise. I also put the build log file at the end of the
 email if
   necessary.
 
 
  Which platform are you working on? Which compiler version?
  We just solved a segfault on import on red hat 5 gcc 4.1.2. Very
 likely
  caused by a compiler bug. See
 https://github.com/numpy/numpy/issues/5163
 
  The build log is complaining about your atlas being too small,
 possibly
  the installation is broken?
 
 





Re: [Numpy-discussion] Request for enhancement to numpy.random.shuffle

2014-10-12 Thread Nathaniel Smith
On Sun, Oct 12, 2014 at 5:14 PM, Sebastian se...@sebix.at wrote:

 On 2014-10-12 16:54, Warren Weckesser wrote:


 On Sun, Oct 12, 2014 at 7:57 AM, Robert Kern robert.k...@gmail.com wrote:

 On Sat, Oct 11, 2014 at 11:51 PM, Warren Weckesser
 warren.weckes...@gmail.com wrote:

  A small wart in this API is the meaning of
 
shuffle(a, independent=False, axis=None)
 
  It could be argued that the correct behavior is to leave the
  array unchanged. (The current behavior can be interpreted as
  shuffling a 1-d sequence of monolithic blobs; the axis argument
  specifies which axis of the array corresponds to the
  sequence index.  Then `axis=None` means the argument is
  a single monolithic blob, so there is nothing to shuffle.)
  Or an error could be raised.
 
  What do you think?

 It seems to me a perfectly good reason to have two methods instead of
 one. I can't imagine when I wouldn't be using a literal True or False
 for this, so it really should be two different methods.



 I agree, and my first inclination was to propose a different method
 (and I had the bikeshedding conversation with myself about the name:
 "disarrange", "scramble", "disorder", "randomize", "ashuffle", some
 other variation of the word "shuffle", ...), but I figured the first
 thing folks would say is "Why not just add options to shuffle?"  So,
 choose your battles and all that.

 What do other folks think of making a separate method?

 I'm not a fan of more methods with similar functionality in Numpy. It's
 already hard to keep an overview of the existing functions and all their
 possible applications and variants. The axis=None proposal for shuffling
 all items is very intuitive.

 I think we don't want to take the path of matlab: a huge amount of
 powerful functions, but few people know of their powerful possibilities.

I totally agree with this principle, but I think this is an exception
to the rule, b/c unfortunately in this case the function that we *do*
have is weird and inconsistent with how most other functions in numpy
work. It doesn't vectorize! Cf. 'sort' or how a 'shuffle' gufunc
(k)->(k) would work. Also, it's easy to implement the current
'shuffle' in terms of any 1d shuffle function, with no explicit loops,
whereas Warren's disarrange requires an explicit loop. So, we really
implemented the wrong one, oops. What this means going forward,
though, is that our only options are either to implement both
behaviours with two functions, or else to give up on having the more
natural behaviour altogether. I think the former is the lesser of two
evils.
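To illustrate the point about building the current 'shuffle' on top of a plain 1d shuffle: shuffle an index vector, then reorder the slices with it. This is a sketch only; `shuffle_blocks` is a made-up name, and `np.random.shuffle` stands in for "any 1d shuffle function".

```python
import numpy as np

def shuffle_blocks(a, axis=0):
    """Reorder whole slices of `a` along `axis`, in place, with no Python loop.

    Hypothetical helper; only a 1d shuffle of the index vector is needed.
    """
    idx = np.arange(a.shape[axis])
    np.random.shuffle(idx)                 # the only random step, on a 1d array
    # np.take builds a reordered copy, which is then written back into `a`.
    a[...] = np.take(a, idx, axis=axis)

a = np.arange(12).reshape(4, 3)
shuffle_blocks(a, axis=0)
print(a)  # rows appear in a new order, each row intact
```

The independent per-slice shuffle has no equally direct reduction to a single 1d shuffle, since each slice needs its own permutation.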

Regarding names: shuffle/permutation is a terrible naming convention
IMHO and shouldn't be propagated further. We already have a good
naming convention for in-place vs. copy: sort vs. sorted, reverse vs.
reversed, etc.

So, how about:

"scramble" + "scrambled" shuffle individual entries within each
row/column/..., as in Warren's suggestion.

"shuffle" + "shuffled" to do what shuffle, permutation do now (mnemonic:
these break a 2d array into a bunch of 1d "cards", and then shuffle
those cards).

"permuted" remains indefinitely, with the docstring: "Deprecated alias
for 'shuffled'."

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org