[Numpy-discussion] Parlez-vous français?

2020-07-06 Thread Inessa Pawson
As most of you know, the inaugural NumPy community survey is currently
underway.

Fabrice Silva kindly offered his help to translate the survey questionnaire
into French to maximize the participation of NumPy users and developers
from the French-speaking countries. He has already completed his part of
the translation (in less than 24 hours!).

We are looking for another French-speaking volunteer to finalize the
translation process of this document. If you are available, please email me
at ine...@albuscode.org.

--
Every good wish,
Inessa Pawson
NumPy survey team
ine...@albuscode.org
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] What is up with raw boolean indices (like a[False])?

2020-07-06 Thread Aaron Meurer
I've been trying to figure out this behavior. It doesn't seem to be
documented at https://numpy.org/doc/stable/reference/arrays.indexing.html

>>> a = np.empty((2, 3))
>>> a.shape
(2, 3)
>>> a[True].shape
(1, 2, 3)
>>> a[False].shape
(0, 2, 3)

It seems like indexing with a raw boolean (True or False) adds an axis
with a dimension 1 or 0, resp.

Except it only works once:

>>> a[:,False]
array([], shape=(2, 0, 3), dtype=float64)
>>> a[:,False, False]
array([], shape=(2, 0, 3), dtype=float64)
>>> a[:,False,True].shape
(2, 0, 3)
>>> a[:,True,False].shape
(2, 0, 3)

The docs say "A single boolean index array is practically identical to
x[obj.nonzero()]". I have a hard time seeing this as an extension of
that, since indexing by `np.nonzero(False)` or `np.nonzero(True)`
*replaces* the given axis.

>>> a[np.nonzero(True)].shape
(1, 3)
>>> a[np.nonzero(False)].shape
(0, 3)
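
A minimal sketch of the contrast, for anyone following along. It assumes a recent NumPy, where calling `np.nonzero` directly on a 0-D value is deprecated (and may raise), so `np.atleast_1d` is used here as a stand-in to obtain the same index tuples as in the session above:

```python
import numpy as np

a = np.empty((2, 3))

# For a 1-D mask, the documented equivalence a[mask] ~ a[mask.nonzero()] holds.
mask = np.array([True, False])
mask_shape = a[mask].shape                # (1, 3)
nonzero_shape = a[mask.nonzero()].shape   # (1, 3)

# Stand-ins for np.nonzero(True) / np.nonzero(False) (assumption: 0-D input
# to np.nonzero is not accepted everywhere, so promote to 1-D first).
idx_true = np.atleast_1d(True).nonzero()    # tuple holding one index array [0]
idx_false = np.atleast_1d(False).nonzero()  # tuple holding one empty index array

replaced_true = a[idx_true].shape     # (1, 3): the first axis is replaced
replaced_false = a[idx_false].shape   # (0, 3)

inserted_true = a[True].shape         # (1, 2, 3): an axis is inserted instead
inserted_false = a[False].shape       # (0, 2, 3)
```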

I think at best this behavior should be documented. I'm trying to
understand the motivation for it, or if it's even intentional. And in
particular, why do multiple boolean indices not insert multiple axes?
It would actually be useful to be able to generically add length 0
axes using an index, similar to how `newaxis` adds a length 1 axis.

Aaron Meurer


Re: [Numpy-discussion] What is up with raw boolean indices (like a[False])?

2020-07-06 Thread Sebastian Berg
On Mon, 2020-07-06 at 12:39 -0600, Aaron Meurer wrote:
> I've been trying to figure out this behavior. It doesn't seem to be
> documented at 
> https://numpy.org/doc/stable/reference/arrays.indexing.html
> 
> >>> a = np.empty((2, 3))
> >>> a.shape
> (2, 3)
> >>> a[True].shape
> (1, 2, 3)
> >>> a[False].shape
> (0, 2, 3)
> 
> It seems like indexing with a raw boolean (True or False) adds an axis
> with a dimension 1 or 0, resp.
> 
> Except it only works once:
> 
> >>> a[:,False]
> array([], shape=(2, 0, 3), dtype=float64)
> >>> a[:,False, False]
> array([], shape=(2, 0, 3), dtype=float64)
> >>> a[:,False,True].shape
> (2, 0, 3)
> >>> a[:,True,False].shape
> (2, 0, 3)
> 
> The docs say "A single boolean index array is practically identical to
> x[obj.nonzero()]". I have a hard time seeing this as an extension of
> that, since indexing by `np.nonzero(False)` or `np.nonzero(True)`
> *replaces* the given axis.
> 
> >>> a[np.nonzero(True)].shape
> (1, 3)
> >>> a[np.nonzero(False)].shape
> (0, 3)
> 
> I think at best this behavior should be documented. I'm trying to
> understand the motivation for it, or if it's even intentional. And in
> particular, why do multiple boolean indices not insert multiple axes?
> It would actually be useful to be able to generically add length 0
> axes using an index, similar to how `newaxis` adds a length 1 axis.

It's fully intentional, as it is the correct generalization from an N-D
boolean index to include a 0-D boolean index.
To be fair, there is a footnote in the "Detailed notes" saying that
"the nonzero equivalence for Boolean arrays does not hold for zero
dimensional boolean arrays". This is for technical reasons, since
`nonzero` does not do useful things for 0-D input.


In any case, a boolean index always does the following:

1. It will *remove as many dimensions as the index has, because this
   is the number of dimensions effectively indexed by it*
2. It will add a single new dimension at the same place.  The length of
   this new dimension is the number of `True` elements.
3. If you have multiple advanced indices you get annoying broadcasting
   of all of these. That is *always* confusing for boolean indices.
   0-D should not be too special there...

And this generalizes to 0-D just as well, even if it may be a bit
surprising at first.
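
The three rules above can be sketched on a small example. This is only an illustration: the variable names are made up here, and the shapes for the combined 0-D case are the ones already reported earlier in this thread.

```python
import numpy as np

a = np.empty((2, 3))

# Rules 1 and 2 with a 2-D mask: both dimensions are removed and replaced
# by a single dimension whose length is the number of True elements.
mask2d = np.array([[True, False, True],
                   [False, False, True]])
shape_2d = a[mask2d].shape        # (3,)

# A 1-D mask on the second axis: that one axis is removed and the new
# dimension (two True elements) appears in the same place.
mask1d = np.array([True, False, True])
shape_1d = a[:, mask1d].shape     # (2, 2)

# A 0-D mask: zero dimensions are removed, and one dimension of length
# 1 (True) or 0 (False) is added at the position of the index.
shape_true = a[True].shape        # (1, 2, 3)
shape_false = a[False].shape      # (0, 2, 3)
shape_mid = a[:, False].shape     # (2, 0, 3)

# Rule 3: several 0-D boolean indices broadcast against each other, so
# they still produce only one new dimension rather than one each.
shape_tf = a[:, True, False].shape    # (2, 0, 3)
shape_ff = a[:, False, False].shape   # (2, 0, 3)
```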


I have written much of this more clearly once before in this NEP, which
may be a good read to _really_ understand it:

https://numpy.org/neps/nep-0021-advanced-indexing.html

In general, I wonder whether going into much depth about how 0-D arrays
are not actually handled very specially is a good idea.  Yes, it's
confusing on its own, but it also seems a bit like overloading the user
with unnecessary knowledge?

Cheers,

Sebastian



> 
> Aaron Meurer


Re: [Numpy-discussion] What is up with raw boolean indices (like a[False])?

2020-07-06 Thread Aaron Meurer
> It's fully intentional, as it is the correct generalization from an N-D
> boolean index to include a 0-D boolean index.
> To be fair, there is a footnote in the "Detailed notes" saying that
> "the nonzero equivalence for Boolean arrays does not hold for zero
> dimensional boolean arrays". This is for technical reasons, since
> `nonzero` does not do useful things for 0-D input.
>
> In any case, a boolean index always does the following:
> 1. It will *remove as many dimensions as the index has, because this
>    is the number of dimensions effectively indexed by it*
> 2. It will add a single new dimension at the same place.  The length of
>    this new dimension is the number of `True` elements.
> 3. If you have multiple advanced indices you get annoying broadcasting
>    of all of these. That is *always* confusing for boolean indices.
>    0-D should not be too special there...
> And this generalizes to 0-D just as well, even if it may be a bit
> surprising at first.

I guess if those are the base rules for boolean indices this makes
sense. So that brings up the question then, is there a way to add
arbitrary empty dimensions using an index?
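
One possible workaround, offered as a sketch rather than an answer to the single-index question: insert a length-1 axis and then slice it down to length 0. The helper name `with_empty_axis` is hypothetical and not part of NumPy; it only combines `np.expand_dims` with an empty slice.

```python
import numpy as np

def with_empty_axis(arr, axis):
    """Hypothetical helper: return a view of `arr` with a length-0 axis at `axis`."""
    expanded = np.expand_dims(arr, axis)     # adds a length-1 axis at `axis`
    index = [slice(None)] * expanded.ndim    # keep every other axis whole
    index[axis] = slice(0, 0)                # shrink the new axis to length 0
    return expanded[tuple(index)]

a = np.empty((2, 3))

# Applying the helper repeatedly adds as many empty axes as needed,
# which a[..., False, False] does not do.
b = with_empty_axis(with_empty_axis(a, 0), 2)   # shape (0, 2, 0, 3)
```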

>
> I have written much of this more clearly once before in this NEP, which
> may be a good read to _really_ understand it:
> https://numpy.org/neps/nep-0021-advanced-indexing.html
> In general, I wonder whether going into much depth about how 0-D arrays
> are not actually handled very specially is a good idea.  Yes, it's
> confusing on its own, but it also seems a bit like overloading the user
> with unnecessary knowledge?

The page I referenced is already written like a very highly technical
document, so I think it should embrace that and fully describe the
spec of NumPy indexing. NumPy could use more user-friendly
documentation for indexing, but that page ain't it. FWIW, I wrote some
documentation on slices of my own here
https://quansight.github.io/ndindex/slices.html. I eventually plan to
extend this to all forms of NumPy indexing. Anyway, the three bullet
points you mentioned above would be helpful to include in the docs.

> Cheers,
> Sebastian