On Thu, Aug 25, 2016 at 4:37 PM, Sebastian Berg <sebast...@sipsolutions.net> wrote:
> On Thu, 2016-08-25 at 10:36 -0400, Joseph Fox-Rabinovitz wrote:
> > This issue recently came up on Stack Overflow:
> > http://stackoverflow.com/questions/39145795/masking-a-series-with-a-boolean-array.
> > The poster attempted to index an ndarray with a pandas boolean Series
> > object (all False), but the result was as if he had indexed with an
> > array of integer zeros.
> >
> > Can someone explain this behavior? I can see two obvious
> > possibilities:
> > 1. ndarray checks if the input to __getitem__ is of exactly the right
> >    type, not using isinstance.
> > 2. pandas actually uses a wider datatype than boolean internally, so
> >    indexing with the series is in fact indexing with an integer array.
>
> You are overthinking it ;). The reason is quite simply that the logic
> used to be:
>
>  * Boolean array? -> think about boolean indexing.
>  * Everything "array-like" (not caught earlier) -> convert to `intp`
>    array and do integer indexing.
>
> Now you might wonder why, but probably it is quite simply because
> boolean indexing was tacked on later.
>
> - Sebastian
>
> > In my attempt to reproduce the poster's results, I got the following
> > warning:
> > FutureWarning: in the future, boolean array-likes will be handled as
> > a boolean array index
> > This indicates that the issue is probably #1 and that a fix is
> > already on the way. Please correct me if I am wrong. Also, where does
> > the code for ndarray.__getitem__ live?
> > Thanks,
> > -Joe

This makes perfect sense. I would like to help fix it if a fix is desired and has not been done already. Could you point me to where the "Boolean array?, etc." decision happens? I have had trouble navigating to `__getitem__` (which I assume is somewhere in the np.core.multiarray C code).

-Joe
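
For illustration, the old dispatch described in the quoted reply can be sketched in Python roughly as follows. This is only an emulation under stated assumptions, not the actual C implementation: `legacy_index` and `current_index` are hypothetical helper names, and a plain list of bools stands in for the pandas Series from the Stack Overflow question.

```python
import numpy as np


def legacy_index(arr, index):
    """Hypothetical emulation of the old rule: only a real boolean
    ndarray is treated as a mask; any other array-like is cast to intp."""
    if isinstance(index, np.ndarray) and index.dtype == np.bool_:
        return arr[index]  # boolean (mask) indexing
    # Everything else "array-like": False -> 0, True -> 1, then integer indexing.
    return arr[np.asarray(index).astype(np.intp)]


def current_index(arr, index):
    """Hypothetical emulation of the behavior the FutureWarning announces:
    boolean array-likes are handled as a boolean array index."""
    idx = np.asarray(index)
    if idx.dtype == np.bool_:
        return arr[idx]
    return arr[idx.astype(np.intp)]


a = np.array([10, 20, 30])
mask = [False, False, False]  # boolean array-like, but not an ndarray

old_result = legacy_index(a, mask)   # selects element 0 three times
new_result = current_index(a, mask)  # selects nothing
```

Under the old rule the all-False mask becomes the integer index [0, 0, 0], which would explain the result the poster saw.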
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion