On Tue, 2020-10-27 at 17:15 -0600, Aaron Meurer wrote:
> For ndindex (https://quansight.github.io/ndindex/), the biggest issue
> with the API is that to use an ndindex object to actually index an
> array, you have to use a[idx.raw] instead of a[idx]. This is because
> for NumPy arrays, you cannot allow custom objects to be indices. The
> exception is objects that define __index__, but this only works for
> integer indices. If __index__ returns anything other than an integer,
> you get an IndexError. This is annoying because it's easy to forget to
> do this when working with the ndindex API, and the error message from
> NumPy isn't informative about what went wrong unless you know to
> expect it.
>
> I'd like to propose an API that would allow custom objects to define
> how they should be converted to a standard NumPy index, similar to
> __index__ but that supports all index types. I think there are two
> options here:
>
> - Allow __index__ to return any index type, not just integers. This is
>   the simplest because it reuses an existing API, and __index__ is the
>   best possible name for this API. However, I'm not sure, but this may
>   actually conflict with the text of PEP 357
>   (https://www.python.org/dev/peps/pep-0357/). Also, some other APIs
>   use __index__ to check if something is an indexable integer, which
>   wouldn't accept generic index. For example, elements of a slice can
>   be any object that defines __index__.
>
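The integer-only behaviour of __index__ mentioned in the quoted text can be seen with the standard library alone. The following is a minimal sketch; the classes `AsInt` and `AsSlice` are made up for illustration and are not part of ndindex or NumPy. Plain Python raises a TypeError for a non-integer return value, whereas the quoted message reports an IndexError when the object is used on a NumPy array.

```python
import operator


class AsInt:
    """Hypothetical object that converts losslessly to the integer 2."""

    def __index__(self):
        return 2


class AsSlice:
    """Hypothetical object that tries to return a slice from __index__."""

    def __index__(self):
        return slice(1, 3)


data = [10, 20, 30, 40]

# An integer-valued __index__ is accepted both as a direct index
# and as an element of a slice.
item = data[AsInt()]
tail = data[AsInt():]

# A non-integer return value is rejected by the interpreter itself,
# before any container gets to interpret it.
try:
    operator.index(AsSlice())
    rejected = False
except TypeError:
    rejected = True
```

This is why reusing __index__ for general indices would affect code that relies on it to mean "this object is an integer".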
`__index__` converts to an integer (safely). There is an assumption
that the integer is good for indexing, but I think the name shouldn't
be taken to mean it is specific to indexing (even if that was the main
motivation).

> - Add a new __numpy_index__ API that works like
>
>   def __numpy_index__(self):
>       return <tuple, integer, slice, newaxis, ellipsis, or integer or
>               boolean array>
>
>   In NumPy, __getitem__ and __setitem__ on ndarray would first check if
>   the input index type is one of the known types as it currently does,
>   then it would try __index__, and if neither of those succeeds, it
>   would call __numpy_index__(index) and use that.

Do you anticipate just:

    arr[index]

or also:

    arr[index1, index2]

Would you expect pandas or array-like objects to support this as well?
If we only do `arr[index]`, might subclassing tuple be sufficient?
Do you have any thoughts on how this might play out with a potential
`arr.oindex[...]`?

Adding either to NumPy is probably fairly straightforward, although I
would prefer either to not slow down every single indexing operation
for an extremely niche use-case (which is likely possible) or to have
timings showing that the slowdown is insignificant.

What might help me is understanding `ndindex` itself better, since this
seems like asking to add a protocol that may very well be used by only
this one project.

>
> Note: there is a more general way that NumPy arrays could allow
> __getitem__ to be defined on custom objects, which I am NOT proposing.
> Instead of an API that returns one of the current predefined index
> types (tuple, integer, slice, newaxis, ellipsis, or integer or boolean
> array), there could instead be an API that takes the array as input
> and returns another array (or view) as an output. This would allow an
> object to define itself as an index in arbitrary ways, even if such an
> index would not actually be possible via traditional indexing. There
> are definitely some interesting ideas that could be done with this,
> but this idea would be much more complicated, and isn't something that
> I need. Unless the community feels that a more general API like this
> would be preferred, I would suggest deferring something like it to a
> later discussion.
>
> What would be the best way to go about getting something like this
> implemented? Is it simple enough that we can just work out the details
> here and on a pull request, or should I write a NEP?

A short NEP may make sense, at least if this is supposed to be a
generic protocol for general array-likes, which I guess it would have
to be ready for.

Cheers,

Sebastian

>
> Aaron Meurer
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
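To make the lookup order proposed for __numpy_index__ concrete, here is a rough pure-Python sketch of it. Nothing in it exists in NumPy: `resolve_index`, `KNOWN_TYPES` and `EveryOther` are hypothetical names, a plain list stands in for an ndarray so that the code runs without NumPy, and resolving tuple elements one by one is only one possible answer to the `arr[index1, index2]` question, not something the proposal specifies.

```python
import operator

# Types accepted as-is. This is a stand-in for NumPy's built-in index
# types; integer and boolean arrays are left out to avoid needing NumPy.
KNOWN_TYPES = (int, slice, type(None), type(Ellipsis))


def resolve_index(index):
    """Convert `index` to a standard index (hypothetical helper)."""
    if isinstance(index, tuple):
        # Assumption: each element of a tuple index is resolved separately.
        return tuple(resolve_index(element) for element in index)
    if isinstance(index, KNOWN_TYPES):
        return index
    try:
        # Second step: the existing integer protocol.
        return operator.index(index)
    except TypeError:
        pass
    # Last step: the proposed protocol, looked up on the type.
    hook = getattr(type(index), "__numpy_index__", None)
    if hook is None:
        raise IndexError(f"unsupported index type: {type(index).__name__}")
    return hook(index)


class EveryOther:
    """Hypothetical custom index object selecting every second element."""

    def __numpy_index__(self):
        return slice(None, None, 2)


data = [10, 20, 30, 40, 50]
picked = data[resolve_index(EveryOther())]
pair = resolve_index((EveryOther(), 1))
```

In this sketch the extra attribute lookup only happens after the known types and __index__ have been tried, so the common cases are not affected by it. Whether a real implementation inside ndarray could keep the cost that low would have to be measured.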