Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Stephan Hoyer
On Wed, Apr 1, 2015 at 7:06 AM, Jaime Fernández del Río 
jaime.f...@gmail.com wrote:

 Is there any other package implementing non-orthogonal indexing aside from
 numpy?


I think we can safely say that NumPy's implementation of broadcasting
indexing is unique :).

The issue is that many other packages rely on numpy for implementation of
custom array objects (e.g., scipy.sparse and scipy.io.netcdf). It's not
immediately obvious what sort of indexing these objects represent.

If the functionality is lacking, e.g., use of slices in `np.ix_`, I'm all
 for improving that to provide the full functionality of orthogonal
 indexing. I just need a little more convincing that those new
 attributes/indexers are going to ever see any real use.


Orthogonal indexing is close to the norm for packages that implement
labeled data structures, both because it's easier to understand and
implement, and because it's difficult to maintain associations with labels
through complex broadcasting indexing.

Unfortunately, the lack of a full featured implementation of orthogonal
indexing has led to that wheel being reinvented at least three times (in
Iris, xray [1] and pandas). So it would be nice to have a canonical
implementation that supports slices and integers in numpy for that reason
alone. This could be done by building on the existing `np.ix_` function,
but a new indexer seems more elegant: there's just much less noise with
`arr.ix_[:1, 2, [3]]` than `arr[np.ix_(slice(1), 2, [3])]`.

It's also well known that indexing with __getitem__ can be much slower than
np.take. It seems plausible to me that a careful implementation of
orthogonal indexing could close or eliminate this speed gap, because the
model for orthogonal indexing is so much simpler than that for broadcasting
indexing: each element of the key tuple can be applied separately along the
corresponding axis.
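
A minimal sketch of that model (plain Python on top of the public numpy
API, not a proposal for the actual implementation; `orthogonal_getitem` is
a hypothetical helper name):

import numpy as np

def orthogonal_getitem(arr, key):
    # Apply each element of the key tuple along its own axis, with no
    # broadcasting between index arrays.
    out = arr
    axis = 0
    for k in key:
        if isinstance(k, slice):
            out = out[(slice(None),) * axis + (k,)]
            axis += 1                             # slices keep their axis
        elif isinstance(k, (int, np.integer)):
            out = np.take(out, k, axis=axis)      # scalar integers drop the axis
        else:
            out = np.take(out, np.asarray(k), axis=axis)
            axis += 1                             # index arrays keep their axis
    return out

arr = np.arange(60).reshape(3, 4, 5)
# what the proposed arr.ix_[:1, 2, [3]] would mean:
print(orthogonal_getitem(arr, (slice(1), 2, [3])).shape)   # (1, 1)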

So I think there could be a real benefit to having the feature in numpy. In
particular, if somebody is up for implementing it in C or Cython, I would
be very pleased.

 Cheers,
Stephan

[1] Here is my implementation of remapping from orthogonal to broadcasting
indexing. It works, but it's a real mess, especially because I try to
optimize by minimizing the number of times slices are converted into arrays:
https://github.com/xray/xray/blob/0d164d848401209971ded33aea2880c1fdc892cb/xray/core/indexing.py#L68


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Sebastian Berg
On Thu, 2015-04-02 at 01:29 -0700, Stephan Hoyer wrote:
 On Wed, Apr 1, 2015 at 7:06 AM, Jaime Fernández del Río
 jaime.f...@gmail.com wrote:
 Is there any other package implementing non-orthogonal
 indexing aside from numpy?
 
 
 I think we can safely say that NumPy's implementation of broadcasting
 indexing is unique :).
 
 
 The issue is that many other packages rely on numpy for implementation
 of custom array objects (e.g., scipy.sparse and scipy.io.netcdf). It's
 not immediately obvious what sort of indexing these objects represent.
 
 
 If the functionality is lacking, e.g., use of slices in
 `np.ix_`, I'm all for improving that to provide the full
 functionality of orthogonal indexing. I just need a little
 more convincing that those new attributes/indexers are going
 to ever see any real use.
 
 
 
 Orthogonal indexing is close to the norm for packages that implement
 labeled data structures, both because it's easier to understand and
 implement, and because it's difficult to maintain associations with
 labels through complex broadcasting indexing.
 
 
 Unfortunately, the lack of a full featured implementation of
 orthogonal indexing has led to that wheel being reinvented at least
 three times (in Iris, xray [1] and pandas). So it would be nice to
 have a canonical implementation that supports slices and integers in
 numpy for that reason alone. This could be done by building on the
 existing `np.ix_` function, but a new indexer seems more elegant:
 there's just much less noise with `arr.ix_[:1, 2, [3]]` than
 `arr[np.ix_(slice(1), 2, [3])]`.
 
 
 It's also well known that indexing with __getitem__ can be much slower
 than np.take. It seems plausible to me that a careful implementation
 of orthogonal indexing could close or eliminate this speed gap,
 because the model for orthogonal indexing is so much simpler than that
 for broadcasting indexing: each element of the key tuple can be
 applied separately along the corresponding axis.
 

Wrong (sorry, couldn't resist ;)): since 1.9, take is not typically
faster unless you have a small subspace (the subspace being the
non-indexed/slice-indexed axes, though I guess a small subspace is common
in some cases, e.g. an Nx3 array); it should typically be noticeably
slower for large subspaces at the moment.
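
A minimal way to time this yourself (no numbers asserted here; results
depend on the numpy version and hardware, and the shapes below are
arbitrary):

import numpy as np
from timeit import timeit

idx = np.random.randint(0, 10000, size=1000)
small = np.random.rand(10000, 3)      # small subspace: trailing shape (3,)
large = np.random.rand(10000, 1000)   # large subspace: trailing shape (1000,)

for a in (small, large):
    t_getitem = timeit(lambda: a[idx], number=100)
    t_take = timeit(lambda: a.take(idx, axis=0), number=100)
    print(a.shape, t_getitem, t_take)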

Anyway, unfortunately while orthogonal indexing may seem simpler, as you
probably noticed, mapping it fully featured to advanced indexing does
not seem like a walk in the park due to how axis remapping works when
you have a combination of slices and advanced indices.

It might be possible to basically implement a second MapIterSwapaxis in
addition to adding extra axes to the inputs (which I think would need a
post-processing step, but that is not that bad). If you do that, you can
mostly reuse the current machinery and avoid most of the really annoying
code blocks which set up the iterators for the various special cases.
Otherwise, for hacking it of course you can replace the slices by arrays
as well ;).
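
A minimal sketch of that hack (assuming the key contains only slices and
1-D integer sequences; scalar integers would need extra special-casing, and
`orthogonal_via_ix_` is just a hypothetical name):

import numpy as np

def orthogonal_via_ix_(arr, key):
    # Replace each slice by the explicit indices it selects, then let
    # np.ix_ build the broadcasting index arrays.
    expanded = [np.arange(n)[k] if isinstance(k, slice) else np.asarray(k)
                for k, n in zip(key, arr.shape)]
    return arr[np.ix_(*expanded)]

a = np.arange(24).reshape(4, 6)
print(orthogonal_via_ix_(a, (slice(None, None, 2), [0, 3])).shape)   # (2, 2)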

 
 So I think there could be a real benefit to having the feature in
 numpy. In particular, if somebody is up for implementing it in C or
 Cython, I would be very pleased.
 
 
  Cheers,
 
 Stephan
 
 
 [1] Here is my implementation of remapping from orthogonal to
 broadcasting indexing. It works, but it's a real mess, especially
 because I try to optimize by minimizing the number of times slices are
 converted into arrays:
 https://github.com/xray/xray/blob/0d164d848401209971ded33aea2880c1fdc892cb/xray/core/indexing.py#L68
 
 
 


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Colin J. Williams


On 02-Apr-15 4:35 PM, Eric Firing wrote:
 On 2015/04/02 10:22 AM, josef.p...@gmail.com wrote:
 Swapping the axis when slices are mixed with fancy indexing was a
 design mistake, IMO. But not fancy indexing itself.
 I'm not saying there should be no fancy indexing capability; I am saying
 that it should be available through a function or method, rather than
 via the square brackets.  Square brackets should do things that people
 expect them to do--the most common and easy-to-understand style of indexing.

 Eric
+1


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Eric Firing
On 2015/04/02 10:22 AM, josef.p...@gmail.com wrote:
 Swapping the axis when slices are mixed with fancy indexing was a
 design mistake, IMO. But not fancy indexing itself.

I'm not saying there should be no fancy indexing capability; I am saying 
that it should be available through a function or method, rather than 
via the square brackets.  Square brackets should do things that people 
expect them to do--the most common and easy-to-understand style of indexing.

Eric


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread josef.pktd
On Thu, Apr 2, 2015 at 2:03 PM, Eric Firing efir...@hawaii.edu wrote:
 On 2015/04/02 4:15 AM, Jaime Fernández del Río wrote:
 We probably need more traction on the "should this be done?" discussion
 than on the "can this be done?" one, the need for a reordering of the
 axes swings me slightly in favor, but I mostly don't see it yet.

 As a long-time user of numpy, and an advocate and teacher of Python for
 science, here is my perspective:

 Fancy indexing is a horrible design mistake--a case of cleverness run
 amok.  As you can read in the Numpy documentation, it is hard to
 explain, hard to understand, hard to remember.  Its use easily leads to
 unreadable code and hard-to-see errors.  Here is the essence of an
 example that a student presented me with just this week, in the context
 of reordering eigenvectors based on argsort applied to eigenvalues:

 In [25]: xx = np.arange(2*3*4).reshape((2, 3, 4))

 In [26]: ii = np.arange(4)

 In [27]: print(xx[0])
 [[ 0  1  2  3]
   [ 4  5  6  7]
   [ 8  9 10 11]]

 In [28]: print(xx[0, :, ii])
 [[ 0  4  8]
   [ 1  5  9]
   [ 2  6 10]
   [ 3  7 11]]

 Quickly now, how many numpy users would look at that last expression and
 say, "Of course, that is equivalent to transposing xx[0]"?  And, "Of
 course that expression should give a completely different result from
 xx[0][:, ii]."?

 I would guess it would be less than 1%.  That should tell you right away
 that we have a real problem here.  Fancy indexing can't be *read* by a
 sub-genius--it has to be laboriously figured out piece by piece, with
 frequent reference to the baffling descriptions in the Numpy docs.

 So I think you should turn the question around and ask, "What is the
 actual real-world use case for fancy indexing?  How often does real
 code rely on it?"  I have taken advantage of it occasionally, maybe you
 have too, but I think a survey of existing code would show that the need
 for it is *far* less common than the need for simple orthogonal
 indexing.  That tells me that it is fancy indexing, not orthogonal
 indexing, that should be available through a function and/or special
 indexing attribute.  The question is then how to make that transition.


Swapping the axis when slices are mixed with fancy indexing was a
design mistake, IMO. But not fancy indexing itself.

>>> np.triu_indices(5)
(array([0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4], dtype=int64),
 array([0, 1, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4, 3, 4, 4], dtype=int64))
>>> m = np.arange(25).reshape(5, 5)[np.triu_indices(5)]
>>> m
array([ 0,  1,  2,  3,  4,  6,  7,  8,  9, 12, 13, 14, 18, 19, 24])

>>> m2 = np.zeros((5,5))
>>> m2[np.triu_indices(5)] = m
>>> m2
array([[  0.,   1.,   2.,   3.,   4.],
       [  0.,   6.,   7.,   8.,   9.],
       [  0.,   0.,  12.,  13.,  14.],
       [  0.,   0.,   0.,  18.,  19.],
       [  0.,   0.,   0.,   0.,  24.]])

(I don't remember what's fancy in indexing, just that broadcasting
rules apply.)
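
A short sketch of the broadcasting part, reconstructing m2 from the session
above: a (5, 1) row index broadcast against a (1, 5) column index spans the
full grid, which is exactly the kind of index pair np.ix_ builds.

import numpy as np

m2 = np.zeros((5, 5))
m2[np.triu_indices(5)] = np.arange(25).reshape(5, 5)[np.triu_indices(5)]

rows = np.arange(5)[:, None]                 # shape (5, 1)
cols = np.arange(5)[None, :]                 # shape (1, 5)
print(np.array_equal(m2[rows, cols], m2))    # True: indices broadcast to (5, 5)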

Josef



 Eric







Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread josef.pktd
On Thu, Apr 2, 2015 at 10:30 PM, Matthew Brett matthew.br...@gmail.com wrote:
 Hi,

 On Thu, Apr 2, 2015 at 6:09 PM,  josef.p...@gmail.com wrote:
 On Thu, Apr 2, 2015 at 8:02 PM, Eric Firing efir...@hawaii.edu wrote:
 On 2015/04/02 1:14 PM, Hanno Klemm wrote:
 Well, I have written quite a bit of code that relies on fancy
 indexing, and I think the question, if the behaviour of the []
 operator should be changed has sailed with numpy now at version 1.9.
 Given the amount of packages that rely on numpy, changing this
 fundamental behaviour would not be a clever move.

 Are you *positive* that there is no clever way to make a transition?
 It's not worth any further thought?

 I guess it would be similar to python 3 string versus bytes, but
 without the overwhelming benefits.

 I don't think I would be in favor of deprecating fancy indexing even
 if it were possible. In general, my impression is that if there is a
 trade-off in numpy between powerful machinery versus easy to learn and
 teach, then the design philosophy went in favor of power.

 I think numpy indexing is not too difficult and follows a consistent
 pattern, and I completely avoid mixing slices and index arrays with
 ndim > 2.

 I'm sure y'all are totally on top of this, but for myself, I would
 like to distinguish:

 * fancy indexing with boolean arrays - I use it all the time and don't
 get confused;
 * fancy indexing with non-boolean arrays - horrendously confusing,
 almost never use it, except on a single axis when I can't confuse it
 with orthogonal indexing:

 In [3]: a = np.arange(24).reshape(6, 4)

 In [4]: a
 Out[4]:
 array([[ 0,  1,  2,  3],
[ 4,  5,  6,  7],
[ 8,  9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])

 In [5]: a[[1, 2, 4]]
 Out[5]:
 array([[ 4,  5,  6,  7],
[ 8,  9, 10, 11],
[16, 17, 18, 19]])

 I also remember a discussion with Travis O where he was also saying
 that this indexing was confusing and that it would be good if there
 was some way to transition to what he called outer product indexing (I
 think that's the same as 'orthogonal' indexing).

 I think it should be DOA, except as a discussion topic for numpy 3000.

 I think there are two proposals here:

 1) Add some syntactic sugar to allow orthogonal indexing of numpy
 arrays, no backward compatibility break.

 That seems like a very good idea to me - were there any big objections to 
 that?

 2) Over some long time period, move the default behavior of np.array
 non-boolean indexing from the current behavior to the orthogonal
 behavior.

 That is going to be very tough, because it will cause very confusing
 breakage of legacy code.

 On the other hand, maybe it is worth going some way towards that, like this:

 * implement orthogonal indexing as a method arr.sensible_index[...]
 * implement the current non-boolean fancy indexing behavior as a
 method - arr.crazy_index[...]
 * deprecate non-boolean fancy indexing as standard arr[...] indexing;
 * wait a long time;
 * remove non-boolean fancy indexing as standard arr[...] (errors are
 preferable to change in behavior)

 Then if we are brave we could:

 * wait a very long time;
 * make orthogonal indexing the default.

 But the not-brave steps above seem less controversial, and fairly reasonable.

 What about that as an approach?

I also thought the transition would have to be something like that or
a clear break point, like numpy 3.0. I would be in favor of something
like this for the axis swapping case with ndim > 2.

However, before going to that, you would still have to provide a list
of behaviors that will be deprecated, and make a poll in various
libraries for how much it is actually used.

My impression is that fancy indexing is used more often than
orthogonal indexing (beyond the trivial case x[:, idx]).
Also, many use cases for orthogonal indexing moved to using pandas, and
numpy is left with non-orthogonal indexing use cases.
And third, fancy indexing is a superset of orthogonal indexing (with
proper broadcasting), and you still need to justify why everyone
should be restricted to the subset instead of a voluntary constraint
to use code that is easier to understand.

I checked numpy.random.choice which I would have implemented with
fancy indexing, but it uses only `take`, AFAICS.

Switching to using an explicit method is not really a problem for
maintained library code, but I still don't really see why we should do
this.

Josef


 Cheers,

 Matthew


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Matthew Brett
Hi,

On Thu, Apr 2, 2015 at 8:20 PM, Jaime Fernández del Río
jaime.f...@gmail.com wrote:
 On Thu, Apr 2, 2015 at 7:30 PM, Matthew Brett matthew.br...@gmail.com
 wrote:

 Hi,

 On Thu, Apr 2, 2015 at 6:09 PM,  josef.p...@gmail.com wrote:
  On Thu, Apr 2, 2015 at 8:02 PM, Eric Firing efir...@hawaii.edu wrote:
  On 2015/04/02 1:14 PM, Hanno Klemm wrote:
  Well, I have written quite a bit of code that relies on fancy
  indexing, and I think the question, if the behaviour of the []
  operator should be changed has sailed with numpy now at version 1.9.
  Given the amount of packages that rely on numpy, changing this
  fundamental behaviour would not be a clever move.
 
  Are you *positive* that there is no clever way to make a transition?
  It's not worth any further thought?
 
  I guess it would be similar to python 3 string versus bytes, but
  without the overwhelming benefits.
 
  I don't think I would be in favor of deprecating fancy indexing even
  if it were possible. In general, my impression is that if there is a
  trade-off in numpy between powerful machinery versus easy to learn and
  teach, then the design philosophy went in favor of power.
 
  I think numpy indexing is not too difficult and follows a consistent
  pattern, and I completely avoid mixing slices and index arrays with
  ndim > 2.

 I'm sure y'all are totally on top of this, but for myself, I would
 like to distinguish:

 * fancy indexing with boolean arrays - I use it all the time and don't
 get confused;
 * fancy indexing with non-boolean arrays - horrendously confusing,
 almost never use it, except on a single axis when I can't confuse it
 with orthogonal indexing:

 In [3]: a = np.arange(24).reshape(6, 4)

 In [4]: a
 Out[4]:
 array([[ 0,  1,  2,  3],
[ 4,  5,  6,  7],
[ 8,  9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])

 In [5]: a[[1, 2, 4]]
 Out[5]:
 array([[ 4,  5,  6,  7],
[ 8,  9, 10, 11],
[16, 17, 18, 19]])

 I also remember a discussion with Travis O where he was also saying
 that this indexing was confusing and that it would be good if there
 was some way to transition to what he called outer product indexing (I
 think that's the same as 'orthogonal' indexing).

  I think it should be DOA, except as a discussion topic for numpy 3000.

 I think there are two proposals here:

 1) Add some syntactic sugar to allow orthogonal indexing of numpy
 arrays, no backward compatibility break.

 That seems like a very good idea to me - were there any big objections to
 that?

 2) Over some long time period, move the default behavior of np.array
 non-boolean indexing from the current behavior to the orthogonal
 behavior.

 That is going to be very tough, because it will cause very confusing
 breakage of legacy code.

 On the other hand, maybe it is worth going some way towards that, like
 this:

 * implement orthogonal indexing as a method arr.sensible_index[...]
 * implement the current non-boolean fancy indexing behavior as a
 method - arr.crazy_index[...]
 * deprecate non-boolean fancy indexing as standard arr[...] indexing;
 * wait a long time;
 * remove non-boolean fancy indexing as standard arr[...] (errors are
 preferable to change in behavior)

 Then if we are brave we could:

 * wait a very long time;
 * make orthogonal indexing the default.

 But the not-brave steps above seem less controversial, and fairly
 reasonable.

 What about that as an approach?


 Your option 1 was what was being discussed before the posse was assembled to
 bring fancy indexing before justice... ;-)

Yes, sorry - I was trying to bring the argument back there.

 My background is in image processing, and I have used fancy indexing in all
 its fanciness far more often than orthogonal or outer product indexing. I
 actually have a vivid memory of the moment I fell in love with NumPy: after
 seeing a code snippet that ran a huge image through a look-up table by
 indexing the LUT with the image. Beautifully simple. And here is a younger
 me, learning to ride NumPy without the training wheels.

 Another obvious use case that you can find all over the place in
 scikit-image is drawing a curve on an image from the coordinates.

No question at all that it does have its uses - but then again, no-one
thinks that it should not be available, only, maybe, in the very far
future, not what you get by default...

Cheers,

Matthew


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread josef.pktd
On Thu, Apr 2, 2015 at 11:30 PM, Nathaniel Smith n...@pobox.com wrote:
 On Thu, Apr 2, 2015 at 6:35 PM,  josef.p...@gmail.com wrote:
 (I thought about this because I was looking at accessing off-diagonal
 elements, m2[np.arange(4), np.arange(4) + 1] )

 Psst: np.diagonal(m2, offset=1)

It was just an example (banded or Toeplitz).
(I know how indexing works, kind of, but don't remember exactly what diag
or other functions are doing.)

>>> m2b = m2.copy()
>>> m2b[np.arange(4), np.arange(4) + 1]
array([  1.,   7.,  13.,  19.])
>>> m2b[np.arange(4), np.arange(4) + 1] = np.nan
>>> m2b
array([[  0.,  nan,   2.,   3.,   4.],
       [  0.,   6.,  nan,   8.,   9.],
       [  0.,   0.,  12.,  nan,  14.],
       [  0.,   0.,   0.,  18.,  nan],
       [  0.,   0.,   0.,   0.,  24.]])

>>> m2c = m2.copy()
>>> np.diagonal(m2c, offset=1) = np.nan
SyntaxError: can't assign to function call
>>> dd = np.diagonal(m2c, offset=1)
>>> dd[:] = np.nan
Traceback (most recent call last):
  File "<pyshell#89>", line 1, in <module>
    dd[:] = np.nan
ValueError: assignment destination is read-only
>>> np.__version__
'1.9.2rc1'

>>> m2d = m2.copy()
>>> m2d[np.arange(4)[::-1], np.arange(4) + 1] = np.nan

Josef


 --
 Nathaniel J. Smith -- http://vorpus.org


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Jaime Fernández del Río
On Thu, Apr 2, 2015 at 7:30 PM, Matthew Brett matthew.br...@gmail.com
wrote:

 Hi,

 On Thu, Apr 2, 2015 at 6:09 PM,  josef.p...@gmail.com wrote:
  On Thu, Apr 2, 2015 at 8:02 PM, Eric Firing efir...@hawaii.edu wrote:
  On 2015/04/02 1:14 PM, Hanno Klemm wrote:
  Well, I have written quite a bit of code that relies on fancy
  indexing, and I think the question, if the behaviour of the []
  operator should be changed has sailed with numpy now at version 1.9.
  Given the amount of packages that rely on numpy, changing this
  fundamental behaviour would not be a clever move.
 
  Are you *positive* that there is no clever way to make a transition?
  It's not worth any further thought?
 
  I guess it would be similar to python 3 string versus bytes, but
  without the overwhelming benefits.
 
  I don't think I would be in favor of deprecating fancy indexing even
  if it were possible. In general, my impression is that if there is a
  trade-off in numpy between powerful machinery versus easy to learn and
  teach, then the design philosophy went in favor of power.
 
  I think numpy indexing is not too difficult and follows a consistent
  pattern, and I completely avoid mixing slices and index arrays with
  ndim > 2.

 I'm sure y'all are totally on top of this, but for myself, I would
 like to distinguish:

 * fancy indexing with boolean arrays - I use it all the time and don't
 get confused;
 * fancy indexing with non-boolean arrays - horrendously confusing,
 almost never use it, except on a single axis when I can't confuse it
 with orthogonal indexing:

 In [3]: a = np.arange(24).reshape(6, 4)

 In [4]: a
 Out[4]:
 array([[ 0,  1,  2,  3],
[ 4,  5,  6,  7],
[ 8,  9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])

 In [5]: a[[1, 2, 4]]
 Out[5]:
 array([[ 4,  5,  6,  7],
[ 8,  9, 10, 11],
[16, 17, 18, 19]])

 I also remember a discussion with Travis O where he was also saying
 that this indexing was confusing and that it would be good if there
 was some way to transition to what he called outer product indexing (I
 think that's the same as 'orthogonal' indexing).

  I think it should be DOA, except as a discussion topic for numpy 3000.

 I think there are two proposals here:

 1) Add some syntactic sugar to allow orthogonal indexing of numpy
 arrays, no backward compatibility break.

 That seems like a very good idea to me - were there any big objections to
 that?

 2) Over some long time period, move the default behavior of np.array
 non-boolean indexing from the current behavior to the orthogonal
 behavior.

 That is going to be very tough, because it will cause very confusing
 breakage of legacy code.

 On the other hand, maybe it is worth going some way towards that, like
 this:

 * implement orthogonal indexing as a method arr.sensible_index[...]
 * implement the current non-boolean fancy indexing behavior as a
 method - arr.crazy_index[...]
 * deprecate non-boolean fancy indexing as standard arr[...] indexing;
 * wait a long time;
 * remove non-boolean fancy indexing as standard arr[...] (errors are
 preferable to change in behavior)

 Then if we are brave we could:

 * wait a very long time;
 * make orthogonal indexing the default.

 But the not-brave steps above seem less controversial, and fairly
 reasonable.

 What about that as an approach?


Your option 1 was what was being discussed before the posse was assembled
to bring fancy indexing before justice... ;-)

My background is in image processing, and I have used fancy indexing in all
its fanciness far more often than orthogonal or outer product indexing. I
actually have a vivid memory of the moment I fell in love with NumPy: after
seeing a code snippet that ran a huge image through a look-up table by
indexing the LUT with the image. Beautifully simple. And here
http://stackoverflow.com/questions/12014186/fancier-fancy-indexing-in-numpy
is a younger me, learning to ride NumPy without the training wheels.

Another obvious use case that you can find all over the place in
scikit-image is drawing a curve on an image from the coordinates.
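
Roughly what those two idioms look like (a sketch with made-up shapes and a
made-up gamma LUT, not code taken from scikit-image):

import numpy as np

# Run an image through a look-up table: the LUT is indexed with the whole
# image, and the result keeps the image's shape.
image = np.random.randint(0, 256, size=(480, 640)).astype(np.uint8)
lut = (255 * (np.arange(256) / 255.0) ** 0.5).astype(np.uint8)   # e.g. a gamma curve
out = lut[image]                      # shape (480, 640)

# Draw a curve on an image from coordinate arrays: one pixel is set per
# (row, col) pair.
canvas = np.zeros((100, 100))
rows = np.arange(100)
cols = (50 + 30 * np.sin(rows / 10.0)).astype(int)
canvas[rows, cols] = 1.0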

If there is such strong agreement on an orthogonal indexer, we might as
well go ahead and implement it. But before considering any bolder steps, we
should probably give it a couple of releases to see how many people out
there really use it.

Jaime

P.S. As an aside on the remapping of axes when arrays and slices are mixed,
there really is no better way. Once you realize that the array indexing a
dimension does not have to be 1-D, it becomes clear that what seems like
the obvious way does not generalize to the general case. E.g.:

One may rightfully think that:

>>> a = np.arange(60).reshape(3, 4, 5)
>>> a[np.array([1])[:, None], ::2, [0, 1, 3]].shape
(1, 3, 2)

should not reorder the axes, and return an array of shape (1, 2, 3). But
what do you do in the following case?

>>> idx0 = np.random.randint(3, size=(10, 1, 10))
>>> idx2 = np.random.randint(5, size=(1, 20, 

[Numpy-discussion] SciPy 2015 Conference Updates - Call for talks extended to 4/10, registration open, keynotes announced, John Hunter Plotting Contest

2015-04-02 Thread Courtenay Godshall (Enthought)

---

**LAST CALL FOR SCIPY 2015 TALK AND POSTER SUBMISSIONS - EXTENSION TO 4/10**


---

SciPy 2015 will include 3 major topic tracks and 7 mini-symposia tracks.
Submit a proposal on the SciPy 2015 website: http://scipy2015.scipy.org. If
you have any questions or comments, feel free to contact us at
scipy-organiz...@scipy.org. You can also follow @scipyconf on Twitter or
sign up for the mailing list on the website for the latest updates!

 

Major topic tracks include:

- Scientific Computing in Python (General track)

- Python in Data Science

- Quantitative Finance and Computational Social Sciences

 

Mini-symposia will include the applications of Python in:

- Astronomy and astrophysics

- Computational life and medical sciences

- Engineering

- Geographic information systems (GIS)

- Geophysics

- Oceanography and meteorology

- Visualization, vision and imaging

 

--

**SCIPY 2015 REGISTRATION IS OPEN**

Please register ASAP to help us get a good headcount and open the conference
to as many people as we can. PLUS, everyone who registers before May 15 will
not only get early bird discounts, but will also be entered in a drawing for
a free registration (via refund or extra)! Register on the website at
http://scipy2015.scipy.org 

 

--

**SCIPY 2015 KEYNOTE SPEAKERS ANNOUNCED**

Keynote speakers were just announced and include Wes McKinney, author of
Pandas; Chris Wiggins, Chief Data Scientist for The New York Times; and Jake
VanderPlas, director of research at the University of Washington's eScience
Institute and core contributor to a number of scientific Python libraries
including scikit-learn and AstroML.

 

--

**ENTER THE SCIPY JOHN HUNTER EXCELLENCE IN PLOTTING CONTEST - DUE 4/13**

In memory of John Hunter, creator of matplotlib, we are pleased to announce
the Third Annual SciPy John Hunter Excellence in Plotting Competition. This
open competition aims to highlight the importance of quality plotting to
scientific progress and showcase the capabilities of the current generation
of plotting software. Participants are invited to submit scientific plots to
be judged by a panel. The winning entries will be announced and displayed at
the conference. John Hunter's family is graciously sponsoring cash prizes up
to $1,000 for the winners. We look forward to exciting submissions that push
the boundaries of plotting! See details here:
http://scipy2015.scipy.org/ehome/115969/276538/ Entries must be submitted by
April 13, 2015 via e-mail to plotting-cont...@scipy.org 

 

--

**CALENDAR AND IMPORTANT DATES**

--Sprint, Birds of a Feather, Financial Aid and Talk submissions are open
NOW

--Apr 10, 2015: Talk and Poster submission deadline

--Apr 13, 2015: Plotting contest submissions due

--Apr 15, 2015: Financial aid application deadline

--Apr 17, 2015: Tutorial schedule announced

--May 1, 2015: General conference speakers & schedule announced

--May 15, 2015 (or 150 registrants): Early-bird registration ends

--Jun 1, 2015: BoF submission deadline

--Jul 6-7, 2015: SciPy 2015 Tutorials

--Jul 8-10, 2015: SciPy 2015 General Conference

--Jul 11-12, 2015: SciPy 2015 Sprints

 



Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread josef.pktd
On Thu, Apr 2, 2015 at 9:09 PM,  josef.p...@gmail.com wrote:
 On Thu, Apr 2, 2015 at 8:02 PM, Eric Firing efir...@hawaii.edu wrote:
 On 2015/04/02 1:14 PM, Hanno Klemm wrote:
 Well, I have written quite a bit of code that relies on fancy
 indexing, and I think the question, if the behaviour of the []
 operator should be changed has sailed with numpy now at version 1.9.
 Given the amount of packages that rely on numpy, changing this
 fundamental behaviour would not be a clever move.

 Are you *positive* that there is no clever way to make a transition?
 It's not worth any further thought?

 I guess it would be similar to python 3 string versus bytes, but
 without the overwhelming benefits.

 I don't think I would be in favor of deprecating fancy indexing even
 if it were possible. In general, my impression is that if there is a
 trade-off in numpy between powerful machinery versus easy to learn and
 teach, then the design philosophy went in favor of power.

 I think numpy indexing is not too difficult and follows a consistent
 pattern, and I completely avoid mixing slices and index arrays with
 ndim > 2.

 I think it should be DOA, except as a discussion topic for numpy 3000.

 just my opinion


is this fancy?

>>> vals
array([6, 5, 4, 1, 2, 3])
>>> a+b
array([[3, 2, 1, 0],
       [4, 3, 2, 1],
       [5, 4, 3, 2]])
>>> vals[a+b]
array([[1, 4, 5, 6],
       [2, 1, 4, 5],
       [3, 2, 1, 4]])

https://github.com/scipy/scipy/blob/v0.14.0/scipy/linalg/special_matrices.py#L178

(I thought about this because I was looking at accessing off-diagonal
elements, m2[np.arange(4), np.arange(4) + 1] )


How would you find all the code that would not be correct anymore with
a changed definition of indexing and slicing, if there is insufficient
test coverage and it doesn't raise an exception?
If we find it, who fixes all the legacy code? (I don't think it will
be minor unless there is a new method `fix_[...]` (fancy ix).)

Josef


 Josef



 If people want to implement orthogonal indexing with another method,
 by all means I might use it at some point in the future. However,
 adding even more complexity to the behaviour of the bracket slicing
 is probably not a good idea.

 I'm not advocating adding even more complexity, I'm trying to think
 about ways to make it *less* complex from the typical user's standpoint.

 Eric


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread josef.pktd
On Thu, Apr 2, 2015 at 8:02 PM, Eric Firing efir...@hawaii.edu wrote:
 On 2015/04/02 1:14 PM, Hanno Klemm wrote:
 Well, I have written quite a bit of code that relies on fancy
 indexing, and I think the question, if the behaviour of the []
 operator should be changed has sailed with numpy now at version 1.9.
 Given the amount of packages that rely on numpy, changing this
 fundamental behaviour would not be a clever move.

 Are you *positive* that there is no clever way to make a transition?
 It's not worth any further thought?

I guess it would be similar to python 3 string versus bytes, but
without the overwhelming benefits.

I don't think I would be in favor of deprecating fancy indexing even
if it were possible. In general, my impression is that if there is a
trade-off in numpy between powerful machinery versus easy to learn and
teach, then the design philosophy went in favor of power.

I think numpy indexing is not too difficult and follows a consistent
pattern, and I completely avoid mixing slices and index arrays with
ndim > 2.

I think it should be DOA, except as a discussion topic for numpy 3000.

just my opinion

Josef



 If people want to implement orthogonal indexing with another method,
 by all means I might use it at some point in the future. However,
 adding even more complexity to the behaviour of the bracket slicing
 is probably not a good idea.

 I'm not advocating adding even more complexity, I'm trying to think
 about ways to make it *less* complex from the typical user's standpoint.

 Eric


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Matthew Brett
Hi,

On Thu, Apr 2, 2015 at 6:09 PM,  josef.p...@gmail.com wrote:
 On Thu, Apr 2, 2015 at 8:02 PM, Eric Firing efir...@hawaii.edu wrote:
 On 2015/04/02 1:14 PM, Hanno Klemm wrote:
 Well, I have written quite a bit of code that relies on fancy
 indexing, and I think the question, if the behaviour of the []
 operator should be changed has sailed with numpy now at version 1.9.
 Given the amount of packages that rely on numpy, changing this
 fundamental behaviour would not be a clever move.

 Are you *positive* that there is no clever way to make a transition?
 It's not worth any further thought?

 I guess it would be similar to python 3 string versus bytes, but
 without the overwhelming benefits.

 I don't think I would be in favor of deprecating fancy indexing even
 if it were possible. In general, my impression is that if there is a
 trade-off in numpy between powerful machinery versus easy to learn and
 teach, then the design philosophy went in favor of power.

 I think numpy indexing is not too difficult and follows a consistent
 pattern, and I completely avoid mixing slices and index arrays with
 ndim > 2.

I'm sure y'all are totally on top of this, but for myself, I would
like to distinguish:

* fancy indexing with boolean arrays - I use it all the time and don't
get confused;
* fancy indexing with non-boolean arrays - horrendously confusing,
almost never use it, except on a single axis when I can't confuse it
with orthogonal indexing:

In [3]: a = np.arange(24).reshape(6, 4)

In [4]: a
Out[4]:
array([[ 0,  1,  2,  3],
   [ 4,  5,  6,  7],
   [ 8,  9, 10, 11],
   [12, 13, 14, 15],
   [16, 17, 18, 19],
   [20, 21, 22, 23]])

In [5]: a[[1, 2, 4]]
Out[5]:
array([[ 4,  5,  6,  7],
   [ 8,  9, 10, 11],
   [16, 17, 18, 19]])

I also remember a discussion with Travis O where he was also saying
that this indexing was confusing and that it would be good if there
was some way to transition to what he called outer product indexing (I
think that's the same as 'orthogonal' indexing).

 I think it should be DOA, except as a discussion topic for numpy 3000.

I think there are two proposals here:

1) Add some syntactic sugar to allow orthogonal indexing of numpy
arrays, no backward compatibility break.

That seems like a very good idea to me - were there any big objections to that?

2) Over some long time period, move the default behavior of np.array
non-boolean indexing from the current behavior to the orthogonal
behavior.

That is going to be very tough, because it will cause very confusing
breakage of legacy code.

On the other hand, maybe it is worth going some way towards that, like this:

* implement orthogonal indexing as a method arr.sensible_index[...]
* implement the current non-boolean fancy indexing behavior as a
method - arr.crazy_index[...]
* deprecate non-boolean fancy indexing as standard arr[...] indexing;
* wait a long time;
* remove non-boolean fancy indexing as standard arr[...] (errors are
preferable to change in behavior)

Then if we are brave we could:

* wait a very long time;
* make orthogonal indexing the default.

But the not-brave steps above seem less controversial, and fairly reasonable.

What about that as an approach?

Cheers,

Matthew


Re: [Numpy-discussion] IDE's for numpy development?

2015-04-02 Thread Charles R Harris
On Thu, Apr 2, 2015 at 7:46 AM, David Cournapeau courn...@gmail.com wrote:



 On Wed, Apr 1, 2015 at 7:43 PM, Charles R Harris 
 charlesr.har...@gmail.com wrote:



 On Wed, Apr 1, 2015 at 11:55 AM, Sturla Molden sturla.mol...@gmail.com
 wrote:

 Charles R Harris charlesr.har...@gmail.com wrote:

  I'd be
  interested in information from anyone with experience in using such an
 IDE
  and ideas of how Numpy might make using some of the common IDEs easier.
 
  Thoughts?

 I guess we could include project files for Visual Studio (and perhaps
 Eclipse?), like Python does. But then we would need to make sure the
 different build systems are kept in sync, and it will be a PITA for those
 who do not use Windows and Visual Studio. It is already bad enough with
 Distutils and Bento. I, for one, would really prefer if there only was
 one
 build process to care about. One should also note that a Visual Studio
 project is the only supported build process for Python on Windows. So
 they
 are not using this in addition to something else.

 Eclipse is better than Visual Studio for mixed Python and C development.
 It
 is also cross-platform.

 cmake needs to be mentioned too. It is not fully integrated with Visual
 Studio, but better than having multiple build processes.


 Mark chose cmake for DyND because it supported Visual Studio projects.
 OTOH, he said it was a PITA to program.


 I concur on that:  For the 350+ packages we support at Enthought, cmake
 has been a higher pain point than any other build tool (that is including
 custom ones). And we only support mainstream platforms.

  But the real question for me is what does "Visual Studio support" mean?
  Does it really mean solution files?


I have no useful experience with Visual Studio, so don't really know, but
solution files sound like a step in the right direction. What do solution
files provide?

Chuck


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Hanno Klemm

 On 03 Apr 2015, at 00:04, Colin J. Williams c...@ncf.ca wrote:
 
 
 
 On 02-Apr-15 4:35 PM, Eric Firing wrote:
 On 2015/04/02 10:22 AM, josef.p...@gmail.com wrote:
 Swapping the axis when slices are mixed with fancy indexing was a
 design mistake, IMO. But not fancy indexing itself.
 I'm not saying there should be no fancy indexing capability; I am saying
 that it should be available through a function or method, rather than
 via the square brackets.  Square brackets should do things that people
 expect them to do--the most common and easy-to-understand style of indexing.
 
 Eric
 +1

Well, I have written quite a bit of code that relies on fancy indexing, and I 
think the question of whether the behaviour of the [] operator should be changed has 
sailed with numpy now at version 1.9. Given the amount of packages that rely on 
numpy, changing this fundamental behaviour would not be a clever move. 

If people want to implement orthogonal indexing with another method, by all 
means I might use it at some point in the future. However, adding even more 
complexity to the behaviour of the bracket slicing is probably not a good idea.

Hanno




Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Eric Firing
On 2015/04/02 1:14 PM, Hanno Klemm wrote:
 Well, I have written quite a bit of code that relies on fancy
 indexing, and I think the question, if the behaviour of the []
 operator should be changed has sailed with numpy now at version 1.9.
 Given the amount of packages that rely on numpy, changing this
 fundamental behaviour would not be a clever move.

Are you *positive* that there is no clever way to make a transition? 
It's not worth any further thought?


 If people want to implement orthogonal indexing with another method,
 by all means I might use it at some point in the future. However,
 adding even more complexity to the behaviour of the bracket slicing
 is probably not a good idea.

I'm not advocating adding even more complexity, I'm trying to think 
about ways to make it *less* complex from the typical user's standpoint.

Eric


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Nathaniel Smith
On Thu, Apr 2, 2015 at 6:35 PM,  josef.p...@gmail.com wrote:
 (I thought about this because I was looking at accessing off-diagonal
 elements, m2[np.arange(4), np.arange(4) + 1] )

Psst: np.diagonal(m2, offset=1)

-- 
Nathaniel J. Smith -- http://vorpus.org


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Benjamin Root
The distinction that boolean indexing has over the other 2 methods of
indexing is that it can guarantee that it references a position at most
once. Slicing and scalar indexes are also this way, which is why these methods
allow for in-place assignments. I don't see boolean indexing as an
extension of orthogonal indexing because of that.
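
A small illustration of that guarantee (sketch; the unbuffered accumulation
is what np.add.at was added for):

import numpy as np

a = np.zeros(5)
a[np.array([True, False, True, False, True])] += 1   # each position at most once
print(a)                          # [ 1.  0.  1.  0.  1.]

b = np.zeros(5)
b[np.array([0, 0, 0])] += 1       # position 0 referenced three times...
print(b[0])                       # 1.0, not 3.0: the fancy += is buffered
np.add.at(b, [0, 0, 0], 1)        # ...whereas the unbuffered variant accumulates
print(b[0])                       # 4.0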

Ben Root

On Thu, Apr 2, 2015 at 2:41 PM, Stephan Hoyer sho...@gmail.com wrote:

 On Thu, Apr 2, 2015 at 11:03 AM, Eric Firing efir...@hawaii.edu wrote:

 Fancy indexing is a horrible design mistake--a case of cleverness run
 amok.  As you can read in the Numpy documentation, it is hard to
 explain, hard to understand, hard to remember.


 Well put!

  I also failed to correctly predict your example.


  So I think you should turn the question around and ask, "What is the
  actual real-world use case for fancy indexing?  How often does real
  code rely on it?"


  I'll just note that indexing with a boolean array with the same shape as
  the array (e.g., x[x > 0] when x has greater than 1 dimension) technically
  falls outside a strict interpretation of orthogonal indexing. But there's
  not any ambiguity in adding that as an extension to orthogonal indexing
  (which otherwise does not allow ndim > 1), so I think your point still
 stands.

 Stephan



Re: [Numpy-discussion] IDE's for numpy development?

2015-04-02 Thread David Cournapeau
On Wed, Apr 1, 2015 at 7:43 PM, Charles R Harris charlesr.har...@gmail.com
wrote:



 On Wed, Apr 1, 2015 at 11:55 AM, Sturla Molden sturla.mol...@gmail.com
 wrote:

 Charles R Harris charlesr.har...@gmail.com wrote:

  I'd be
  interested in information from anyone with experience in using such an
 IDE
  and ideas of how Numpy might make using some of the common IDEs easier.
 
  Thoughts?

 I guess we could include project files for Visual Studio (and perhaps
 Eclipse?), like Python does. But then we would need to make sure the
 different build systems are kept in sync, and it will be a PITA for those
 who do not use Windows and Visual Studio. It is already bad enough with
 Distutils and Bento. I, for one, would really prefer if there only was one
 build process to care about. One should also note that a Visual Studio
 project is the only supported build process for Python on Windows. So they
 are not using this in addition to something else.

 Eclipse is better than Visual Studio for mixed Python and C development.
 It
 is also cross-platform.

 cmake needs to be mentioned too. It is not fully integrated with Visual
 Studio, but better than having multiple build processes.


 Mark chose cmake for DyND because it supported Visual Studio projects.
 OTOH, he said it was a PITA to program.


I concur on that:  For the 350+ packages we support at Enthought, cmake has
been a higher pain point than any other build tool (that is including
custom ones). And we only support mainstream platforms.

But the real question for me is what does "Visual Studio support" mean? Does
it really mean solution files?

David


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Jaime Fernández del Río
On Thu, Apr 2, 2015 at 1:29 AM, Stephan Hoyer sho...@gmail.com wrote:

 On Wed, Apr 1, 2015 at 7:06 AM, Jaime Fernández del Río 
 jaime.f...@gmail.com wrote:

 Is there any other package implementing non-orthogonal indexing aside
 from numpy?


 I think we can safely say that NumPy's implementation of broadcasting
 indexing is unique :).

 The issue is that many other packages rely on numpy for implementation of
 custom array objects (e.g., scipy.sparse and scipy.io.netcdf). It's not
 immediately obvious what sort of indexing these objects represent.

 If the functionality is lacking, e.g., use of slices in `np.ix_`, I'm all
 for improving that to provide the full functionality of orthogonal
 indexing. I just need a little more convincing that those new
 attributes/indexers are going to ever see any real use.


 Orthogonal indexing is close to the norm for packages that implement
 labeled data structures, both because it's easier to understand and
 implement, and because it's difficult to maintain associations with labels
 through complex broadcasting indexing.

 Unfortunately, the lack of a full featured implementation of orthogonal
 indexing has led to that wheel being reinvented at least three times (in
 Iris, xray [1] and pandas). So it would be nice to have a canonical
 implementation that supports slices and integers in numpy for that reason
 alone. This could be done by building on the existing `np.ix_` function,
 but a new indexer seems more elegant: there's just much less noise with
 `arr.ix_[:1, 2, [3]]` than `arr[np.ix_(slice(1), 2, [3])]`.

 It's also well known that indexing with __getitem__ can be much slower
 than np.take. It seems plausible to me that a careful implementation of
 orthogonal indexing could close or eliminate this speed gap, because the
 model for orthogonal indexing is so much simpler than that for broadcasting
 indexing: each element of the key tuple can be applied separately along the
 corresponding axis.

 So I think there could be a real benefit to having the feature in numpy.
 In particular, if somebody is up for implementing it in C or Cython, I
 would be very pleased.

  Cheers,
 Stephan

 [1] Here is my implementation of remapping from orthogonal to broadcasting
 indexing. It works, but it's a real mess, especially because I try to
 optimize by minimizing the number of times slices are converted into arrays:

 https://github.com/xray/xray/blob/0d164d848401209971ded33aea2880c1fdc892cb/xray/core/indexing.py#L68


I believe you can leave all slices unchanged if you later reshuffle your
axes. Basically all the fancy-indexed axes go in the front of the shape in
order, and the subspace follows, e.g.:

>>> a = np.arange(60).reshape(3, 4, 5)
>>> a[np.array([1])[:, None], ::2, np.array([1, 2, 3])].shape
(1, 3, 2)

So you would need to swap the second and last axes and be done. You would
not get a contiguous array without a copy, but that's a different story.
Assigning to an orthogonally indexed subarray is an entirely different
beast, not sure if there is a use case for that.
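
A sketch of that reshuffling on the example above (the transpose order is
specific to this particular key; a general remapping would have to track
where the broadcast dimensions end up):

import numpy as np

a = np.arange(60).reshape(3, 4, 5)
mixed = a[np.array([1])[:, None], ::2, np.array([1, 2, 3])]   # shape (1, 3, 2)

# Move the slice-produced axis back between the two fancy axes...
reordered = mixed.transpose(0, 2, 1)                          # shape (1, 2, 3)
# ...which matches the orthogonal result built with explicit arrays:
orthogonal = a[np.ix_([1], np.arange(4)[::2], [1, 2, 3])]     # shape (1, 2, 3)
print(np.array_equal(reordered, orthogonal))                  # True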

We probably need more traction on the "should this be done?" discussion
than on the "can this be done?" one, the need for a reordering of the axes
swings me slightly in favor, but I mostly don't see it yet. Nathaniel
usually has good insights on who we are, where do we come from, where are
we going to, type of questions, would be good to have him chime in.

Jaime

-- 
(\__/)
( O.o)
(  ) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
de dominación mundial.


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Stephan Hoyer
On Thu, Apr 2, 2015 at 11:03 AM, Eric Firing efir...@hawaii.edu wrote:

 Fancy indexing is a horrible design mistake--a case of cleverness run
 amok.  As you can read in the Numpy documentation, it is hard to
 explain, hard to understand, hard to remember.


Well put!

I also failed to correctly predict your example.


 So I think you should turn the question around and ask, "What is the
 actual real-world use case for fancy indexing?  How often does real
 code rely on it?"


I'll just note that indexing with a boolean array with the same shape as
the array (e.g., x[x > 0] when x has greater than 1 dimension) technically
falls outside a strict interpretation of orthogonal indexing. But there's
not any ambiguity in adding that as an extension to orthogonal indexing
(which otherwise does not allow ndim > 1), so I think your point still
stands.
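
For concreteness, a tiny sketch of the distinction (the masks are arbitrary):

import numpy as np

x = np.arange(6).reshape(2, 3) - 3     # [[-3, -2, -1], [ 0,  1,  2]]
print(x[x > 0])                        # 1-D result from a full-shape mask: [1 2]

# An orthogonal-style version uses one 1-D boolean mask per axis instead:
row_mask = np.array([True, False])
col_mask = np.array([False, True, True])
print(x[np.ix_(row_mask, col_mask)])   # 2-D result, shape (1, 2): [[-2 -1]]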

Stephan


Re: [Numpy-discussion] Advanced indexing: fancy vs. orthogonal

2015-04-02 Thread Eric Firing
On 2015/04/02 4:15 AM, Jaime Fernández del Río wrote:
 We probably need more traction on the "should this be done?" discussion
 than on the "can this be done?" one, the need for a reordering of the
 axes swings me slightly in favor, but I mostly don't see it yet.

As a long-time user of numpy, and an advocate and teacher of Python for 
science, here is my perspective:

Fancy indexing is a horrible design mistake--a case of cleverness run 
amok.  As you can read in the Numpy documentation, it is hard to 
explain, hard to understand, hard to remember.  Its use easily leads to 
unreadable code and hard-to-see errors.  Here is the essence of an 
example that a student presented me with just this week, in the context 
of reordering eigenvectors based on argsort applied to eigenvalues:

In [25]: xx = np.arange(2*3*4).reshape((2, 3, 4))

In [26]: ii = np.arange(4)

In [27]: print(xx[0])
[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

In [28]: print(xx[0, :, ii])
[[ 0  4  8]
  [ 1  5  9]
  [ 2  6 10]
  [ 3  7 11]]

Quickly now, how many numpy users would look at that last expression and
say, "Of course, that is equivalent to transposing xx[0]"?  And, "Of
course that expression should give a completely different result from
xx[0][:, ii]."?

I would guess it would be less than 1%.  That should tell you right away 
that we have a real problem here.  Fancy indexing can't be *read* by a 
sub-genius--it has to be laboriously figured out piece by piece, with 
frequent reference to the baffling descriptions in the Numpy docs.
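
For comparison, the orthogonal reading that most readers expect can already
be spelled with np.ix_ (a sketch; the integer 0 has to be wrapped in a list
and the sliced axis spelled out explicitly):

import numpy as np

xx = np.arange(2*3*4).reshape((2, 3, 4))
ii = np.arange(4)

print(np.array_equal(xx[0, :, ii], xx[0][:, ii].T))       # True: the fancy form transposes
print(np.array_equal(xx[np.ix_([0], np.arange(3), ii)][0],
                     xx[0][:, ii]))                       # True: np.ix_ gives the orthogonal result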

So I think you should turn the question around and ask, "What is the
actual real-world use case for fancy indexing?  How often does real
code rely on it?"  I have taken advantage of it occasionally, maybe you
have too, but I think a survey of existing code would show that the need 
for it is *far* less common than the need for simple orthogonal 
indexing.  That tells me that it is fancy indexing, not orthogonal 
indexing, that should be available through a function and/or special 
indexing attribute.  The question is then how to make that transition.

Eric




