Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods
On Mon, Apr 22, 2019 at 2:20 PM Ralf Gommers wrote:

> On Mon, Apr 22, 2019 at 9:26 PM Nathaniel Smith wrote:
>
>> Your last email didn't really clarify anything for me. I get that
>> np.func.__numpy_implementation__ is intended to have the semantics of
>> numpy's implementation of func, but that doesn't tell me much :-). And
>> also, that's exactly the definition of np.func, isn't it?

My understanding of the protocol we came up with in NEP-18 is that every
NumPy function (that takes array-like arguments) now has two parts to its
implementation:

1. The NEP-18 part involving calling the dispatcher function, and checking
   for/calling __array_function__ attributes on array-like arguments. This
   part is documented in NEP-18.
2. The original function definition, which is called if either (a) no
   __array_function__ attributes exist, or (b) the only __array_function__
   attribute is numpy.ndarray.__array_function__. This part is documented
   in the docstring of the NumPy function.

"__numpy_implementation__" provides a short-cut to (2) without (1). That's
it.

OK, thinking about this a little bit more, there is one other (rare)
difference: in cases where a function has deprecated arguments, we are
currently only issuing the deprecation warnings in the dispatcher function,
rather than in both the dispatcher and the implementation. This is all the
more reason to discourage users from calling __numpy_implementation__
directly (I'll update the NEP), but it's still fine to call
__numpy_implementation__ from within __array_function__ methods themselves.

I guess the other option would be to make it programmatically impossible to
access implementations outside of __array_function__, by making
numpy_implementation an argument used to call __array_function__() rather
than making it an attribute on NumPy functions. I don't like this as much,
for two reasons:

1. It would break every existing implementation of __array_function__
   before it launches. We did reserve the right to do this, but it's still
   a little unfriendly to our early adopters.
2. There are still cases where users will prefer to call
   np.concatenate.__numpy_implementation__ for extra performance, even
   knowing that they will miss any hypothetical deprecation warnings and
   removed/renamed function arguments.

>> You're talking about ~doubling the size of numpy's API,
>
> I think we can already get both the NEP 18 wrapped functions and their
> underlying implementations today, based on the value of
> NUMPY_EXPERIMENTAL_ARRAY_FUNCTION.
>
> It looks to me like all this proposed change does is bypass a
> do-very-little wrapper.

This is how I think of it.

>> and don't seem able to even articulate what the new API's commitments
>> are. This still makes me nervous. Maybe it should have a NEP? What's your
>> testing strategy for all the new functions?
>
> The current decorator mechanism already checks that the signatures match,
> so it shouldn't be possible to get a mismatch. So probably not much is
> needed beyond some assert_equal(np.func(...),
> np.func.__numpy_implementation__(...)) checks.
>
> @Stephan the PR for the NEP change is very hard to parse. Maybe easier to
> just open a PR with an implementation for one or a few functions +
> associated tests?

Sure, here's a full implementation (with tests):
https://github.com/numpy/numpy/pull/13389

I have not included tests on every numpy function, but we didn't write
those for each NumPy function with __array_function__ overrides, either --
the judgment was that the changes are mechanistic enough that writing a
unit test for each function would not be worthwhile.

Also you'll note that my PR includes only a single change to
np.ndarray.__array_function__ (swapping out __wrapped__ ->
__numpy_implementation__). This is because we had actually already changed
the implementation of ndarray.__array_function__ without updating the NEP,
per prior discussion on the mailing list [1]. The existing use of the
__wrapped__ attribute is an undocumented optimization / implementation
detail.

[1] https://mail.python.org/pipermail/numpy-discussion/2018-November/078912.html

> Cheers,
> Ralf
>
>> On Mon, Apr 22, 2019, 09:22 Stephan Hoyer wrote:
>>
>>> Are there still concerns here? If not, I would love to move ahead with
>>> these changes so we can get this into NumPy 1.17.
>>>
>>> On Tue, Apr 16, 2019 at 10:23 AM Stephan Hoyer wrote:
>>>
>>>> __numpy_implementation__ is indeed simply a slot for third-parties to
>>>> access NumPy's implementation. It should be considered "NumPy's
>>>> current implementation", not "NumPy's implementation as of 1.14". Of
>>>> course, in practice these will remain very similar, because we are
>>>> already very conservative about how we change NumPy.
>>>>
>>>> I would love to have clean well-defined coercion semantics for every
>>>> NumPy function, which would be implicitly adopted by
>>>> `__numpy_implementation__` (e.g., we could say that every function
>>>> always coerces its arguments wit
Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods
On Mon, Apr 22, 2019 at 9:26 PM Nathaniel Smith wrote:

> Your last email didn't really clarify anything for me. I get that
> np.func.__numpy_implementation__ is intended to have the semantics of
> numpy's implementation of func, but that doesn't tell me much :-). And
> also, that's exactly the definition of np.func, isn't it?
>
> You're talking about ~doubling the size of numpy's API,

I think we can already get both the NEP 18 wrapped functions and their
underlying implementations today, based on the value of
NUMPY_EXPERIMENTAL_ARRAY_FUNCTION.

It looks to me like all this proposed change does is bypass a
do-very-little wrapper.

> and don't seem able to even articulate what the new API's commitments
> are. This still makes me nervous. Maybe it should have a NEP? What's your
> testing strategy for all the new functions?

The current decorator mechanism already checks that the signatures match,
so it shouldn't be possible to get a mismatch. So probably not much is
needed beyond some assert_equal(np.func(...),
np.func.__numpy_implementation__(...)) checks.

@Stephan the PR for the NEP change is very hard to parse. Maybe easier to
just open a PR with an implementation for one or a few functions +
associated tests?

Cheers,
Ralf

> On Mon, Apr 22, 2019, 09:22 Stephan Hoyer wrote:
>
>> Are there still concerns here? If not, I would love to move ahead with
>> these changes so we can get this into NumPy 1.17.
>>
>> On Tue, Apr 16, 2019 at 10:23 AM Stephan Hoyer wrote:
>>
>>> __numpy_implementation__ is indeed simply a slot for third-parties to
>>> access NumPy's implementation. It should be considered "NumPy's current
>>> implementation", not "NumPy's implementation as of 1.14". Of course, in
>>> practice these will remain very similar, because we are already very
>>> conservative about how we change NumPy.
>>>
>>> I would love to have clean well-defined coercion semantics for every
>>> NumPy function, which would be implicitly adopted by
>>> `__numpy_implementation__` (e.g., we could say that every function
>>> always coerces its arguments with `np.asarray()`). But I think that's
>>> an orthogonal issue. We have been supporting some ad-hoc duck typing in
>>> NumPy for a long time (e.g., the `.sum()` method which is called by
>>> `np.sum()`). Removing that would require a deprecation cycle, which may
>>> indeed be warranted once we're sure we're happy with
>>> __array_function__. But I don't think the deprecation cycle will be any
>>> worse if the implementation is also exposed via
>>> `__numpy_implementation__`.
>>>
>>> We should definitely still think about a cleaner "core" implementation
>>> of NumPy functions in terms of a minimal core. One recent example of
>>> this can be found in JAX (see
>>> https://github.com/google/jax/blob/04b45e4086249bad691a33438e8bb6fcf639d001/jax/numpy/lax_numpy.py).
>>> This would be something appropriate to put into a more generic function
>>> attribute on NumPy functions, perhaps `__array_implementation__`. But I
>>> don't think formalizing `__numpy_implementation__` as a way to get
>>> access to NumPy's default implementation will limit our future options
>>> here.
>>>
>>> Cheers,
>>> Stephan
>>>
>>> On Tue, Apr 16, 2019 at 6:44 AM Marten van Kerkwijk
>>> <m.h.vankerkw...@gmail.com> wrote:
>>>
>>>> I somewhat share Nathaniel's worry that by providing
>>>> `__numpy_implementation__` we essentially get stuck with the
>>>> implementations we have currently, rather than having the hoped-for
>>>> freedom to remove all the `np.asarray` coercion. In that respect, an
>>>> advantage of using `_wrapped` is that it is clearly a private method,
>>>> so anybody is automatically forewarned that this can change.
>>>>
>>>> In principle, ndarray.__array_function__ would be more logical, but as
>>>> noted in the PR, the problem is that it is non-trivial for a regular
>>>> __array_function__ implementation to coerce all the arguments to
>>>> ndarray itself.
>>>>
>>>> Which suggests that perhaps what is missing is a general routine that
>>>> does that, i.e., that re-uses the dispatcher.
>>>>
>>>> -- Marten

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
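
For readers who want to see the testing idea from this message in concrete
form, here is a minimal sketch. It assumes nothing about NumPy's real
internals: `array_function_dispatch` below is a toy pure-Python stand-in,
and `_parameter_layout`, `_total_dispatcher` and `total` are hypothetical
names made up for illustration. Only the attribute name
`__numpy_implementation__` comes from the proposal under discussion. The
sketch rejects a dispatcher whose signature disagrees with the
implementation, then runs the suggested equality check between the wrapped
function and its implementation.

```python
import functools
import inspect


def _parameter_layout(func):
    """Return (name, kind) for every parameter, ignoring default values."""
    return [(p.name, p.kind) for p in inspect.signature(func).parameters.values()]


def array_function_dispatch(dispatcher):
    """Toy decorator (hypothetical): wrap `implementation` behind `dispatcher`."""
    def decorator(implementation):
        # Reject the pairing up front if the two signatures disagree.
        if _parameter_layout(dispatcher) != _parameter_layout(implementation):
            raise RuntimeError("dispatcher and implementation signatures differ")

        @functools.wraps(implementation)
        def public_api(*args, **kwargs):
            # Calling the dispatcher validates the arguments; a real wrapper
            # would also look for __array_function__ overrides at this point.
            dispatcher(*args, **kwargs)
            return implementation(*args, **kwargs)

        # Attribute name taken from the proposal discussed in this thread.
        public_api.__numpy_implementation__ = implementation
        return public_api
    return decorator


def _total_dispatcher(values, start=None):
    # Dispatchers return the arguments that may carry overrides.
    return (values,)


@array_function_dispatch(_total_dispatcher)
def total(values, start=0):
    """Sum `values`, beginning from `start`."""
    return sum(values, start)


# The equivalence check suggested above, applied to the toy function.
for call_args in [([1, 2, 3],), ([1.5, 2.5], 10), ([],)]:
    assert total(*call_args) == total.__numpy_implementation__(*call_args)
```

Because the wrapper in this sketch does so little, the check is nearly
trivial, which seems to be the point made above: the interesting failure
mode is a signature mismatch, and that is caught when the function is
decorated rather than when it is called.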
Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods
Your last email didn't really clarify anything for me. I get that
np.func.__numpy_implementation__ is intended to have the semantics of
numpy's implementation of func, but that doesn't tell me much :-). And
also, that's exactly the definition of np.func, isn't it?

You're talking about ~doubling the size of numpy's API, and don't seem able
to even articulate what the new API's commitments are. This still makes me
nervous. Maybe it should have a NEP? What's your testing strategy for all
the new functions?

On Mon, Apr 22, 2019, 09:22 Stephan Hoyer wrote:

> Are there still concerns here? If not, I would love to move ahead with
> these changes so we can get this into NumPy 1.17.
>
> On Tue, Apr 16, 2019 at 10:23 AM Stephan Hoyer wrote:
>
>> __numpy_implementation__ is indeed simply a slot for third-parties to
>> access NumPy's implementation. It should be considered "NumPy's current
>> implementation", not "NumPy's implementation as of 1.14". Of course, in
>> practice these will remain very similar, because we are already very
>> conservative about how we change NumPy.
>>
>> I would love to have clean well-defined coercion semantics for every
>> NumPy function, which would be implicitly adopted by
>> `__numpy_implementation__` (e.g., we could say that every function always
>> coerces its arguments with `np.asarray()`). But I think that's an
>> orthogonal issue. We have been supporting some ad-hoc duck typing in
>> NumPy for a long time (e.g., the `.sum()` method which is called by
>> `np.sum()`). Removing that would require a deprecation cycle, which may
>> indeed be warranted once we're sure we're happy with __array_function__.
>> But I don't think the deprecation cycle will be any worse if the
>> implementation is also exposed via `__numpy_implementation__`.
>>
>> We should definitely still think about a cleaner "core" implementation of
>> NumPy functions in terms of a minimal core. One recent example of this
>> can be found in JAX (see
>> https://github.com/google/jax/blob/04b45e4086249bad691a33438e8bb6fcf639d001/jax/numpy/lax_numpy.py).
>> This would be something appropriate to put into a more generic function
>> attribute on NumPy functions, perhaps `__array_implementation__`. But I
>> don't think formalizing `__numpy_implementation__` as a way to get access
>> to NumPy's default implementation will limit our future options here.
>>
>> Cheers,
>> Stephan
>>
>> On Tue, Apr 16, 2019 at 6:44 AM Marten van Kerkwijk
>> <m.h.vankerkw...@gmail.com> wrote:
>>
>>> I somewhat share Nathaniel's worry that by providing
>>> `__numpy_implementation__` we essentially get stuck with the
>>> implementations we have currently, rather than having the hoped-for
>>> freedom to remove all the `np.asarray` coercion. In that respect, an
>>> advantage of using `_wrapped` is that it is clearly a private method, so
>>> anybody is automatically forewarned that this can change.
>>>
>>> In principle, ndarray.__array_function__ would be more logical, but as
>>> noted in the PR, the problem is that it is non-trivial for a regular
>>> __array_function__ implementation to coerce all the arguments to ndarray
>>> itself.
>>>
>>> Which suggests that perhaps what is missing is a general routine that
>>> does that, i.e., that re-uses the dispatcher.
>>>
>>> -- Marten
Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods
Are there still concerns here? If not, I would love to move ahead with
these changes so we can get this into NumPy 1.17.

On Tue, Apr 16, 2019 at 10:23 AM Stephan Hoyer wrote:

> __numpy_implementation__ is indeed simply a slot for third-parties to
> access NumPy's implementation. It should be considered "NumPy's current
> implementation", not "NumPy's implementation as of 1.14". Of course, in
> practice these will remain very similar, because we are already very
> conservative about how we change NumPy.
>
> I would love to have clean well-defined coercion semantics for every NumPy
> function, which would be implicitly adopted by `__numpy_implementation__`
> (e.g., we could say that every function always coerces its arguments with
> `np.asarray()`). But I think that's an orthogonal issue. We have been
> supporting some ad-hoc duck typing in NumPy for a long time (e.g., the
> `.sum()` method which is called by `np.sum()`). Removing that would
> require a deprecation cycle, which may indeed be warranted once we're sure
> we're happy with __array_function__. But I don't think the deprecation
> cycle will be any worse if the implementation is also exposed via
> `__numpy_implementation__`.
>
> We should definitely still think about a cleaner "core" implementation of
> NumPy functions in terms of a minimal core. One recent example of this can
> be found in JAX (see
> https://github.com/google/jax/blob/04b45e4086249bad691a33438e8bb6fcf639d001/jax/numpy/lax_numpy.py).
> This would be something appropriate to put into a more generic function
> attribute on NumPy functions, perhaps `__array_implementation__`. But I
> don't think formalizing `__numpy_implementation__` as a way to get access
> to NumPy's default implementation will limit our future options here.
>
> Cheers,
> Stephan
>
> On Tue, Apr 16, 2019 at 6:44 AM Marten van Kerkwijk
> <m.h.vankerkw...@gmail.com> wrote:
>
>> I somewhat share Nathaniel's worry that by providing
>> `__numpy_implementation__` we essentially get stuck with the
>> implementations we have currently, rather than having the hoped-for
>> freedom to remove all the `np.asarray` coercion. In that respect, an
>> advantage of using `_wrapped` is that it is clearly a private method, so
>> anybody is automatically forewarned that this can change.
>>
>> In principle, ndarray.__array_function__ would be more logical, but as
>> noted in the PR, the problem is that it is non-trivial for a regular
>> __array_function__ implementation to coerce all the arguments to ndarray
>> itself.
>>
>> Which suggests that perhaps what is missing is a general routine that
>> does that, i.e., that re-uses the dispatcher.
>>
>> -- Marten
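
To make the "slot for third-parties" idea in this thread concrete, here is
a rough pure-Python sketch of how a duck array might use it from inside its
own __array_function__ method. Everything here is a toy under stated
assumptions: plain lists stand in for ndarrays, `array_function_dispatch`
is a simplified imitation of NEP-18 dispatch (it ignores subclass ordering,
for example), and `concatenate`, `_concatenate_dispatcher` and `Tagged` are
hypothetical names. Only `__numpy_implementation__` is taken from the
proposal itself.

```python
import functools


def array_function_dispatch(dispatcher):
    """Toy, pure-Python imitation of NEP-18 dispatch (not NumPy's real code)."""
    def decorator(implementation):
        @functools.wraps(implementation)
        def public_api(*args, **kwargs):
            # Part 1: look for __array_function__ overrides on relevant args.
            overriding = [a for a in dispatcher(*args, **kwargs)
                          if hasattr(type(a), "__array_function__")]
            if not overriding:
                # Part 2: nobody overrides, so run the original definition.
                return implementation(*args, **kwargs)
            types = tuple({type(a) for a in overriding})
            for arg in overriding:
                result = arg.__array_function__(public_api, types, args, kwargs)
                if result is not NotImplemented:
                    return result
            raise TypeError(
                "no implementation found for {}".format(public_api.__name__))

        # Shortcut to part 2 that skips part 1 entirely.
        public_api.__numpy_implementation__ = implementation
        return public_api
    return decorator


def _concatenate_dispatcher(sequences):
    return tuple(sequences)


@array_function_dispatch(_concatenate_dispatcher)
def concatenate(sequences):
    """Join plain Python lists (a stand-in for np.concatenate)."""
    out = []
    for seq in sequences:
        out.extend(seq)
    return out


class Tagged:
    """Hypothetical duck array: a list of values plus a unit label."""

    def __init__(self, data, unit):
        self.data = list(data)
        self.unit = unit

    def __array_function__(self, func, types, args, kwargs):
        if not all(issubclass(t, Tagged) for t in types):
            return NotImplemented
        if func is concatenate:
            (sequences,) = args
            # Unwrap by hand, then go straight to the implementation
            # without re-entering the dispatch loop.
            raw = [s.data if isinstance(s, Tagged) else s for s in sequences]
            return Tagged(func.__numpy_implementation__(raw), self.unit)
        return NotImplemented


joined = concatenate([Tagged([1, 2], "m"), Tagged([3], "m")])
print(joined.data, joined.unit)
```

The manual unwrapping step in `Tagged.__array_function__` is the part
Marten describes as non-trivial in general: each override has to know which
arguments of each function need coercing, which is information the
dispatcher already encodes.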
Re: [Numpy-discussion] Boolean arrays with nulls?
On Thu, Apr 18, 2019 at 10:52 AM Stuart Reynolds wrote:

> Is float8 a thing?

no, but np.float16 is -- so at least only twice as much memory as you need
:-)

    array([ nan,  inf, -inf], dtype=float16)

I think masked arrays are going to be just as much, as they need to carry
the mask.

-CHB

> On Thu, Apr 18, 2019 at 9:46 AM Stefan van der Walt wrote:
>
>> Hi Stuart,
>>
>> On Thu, 18 Apr 2019 09:12:31 -0700, Stuart Reynolds wrote:
>> > Is there an efficient way to represent bool arrays with null entries?
>>
>> You can use the bool dtype:
>>
>>     In [5]: x = np.array([True, False, True])
>>
>>     In [6]: x
>>     Out[6]: array([ True, False,  True])
>>
>>     In [7]: x.dtype
>>     Out[7]: dtype('bool')
>>
>> You should note that this stores one True/False value per byte, so it is
>> not optimal in terms of memory use. There is no easy way to do
>> bit-arrays with NumPy, because we use strides to determine how to move
>> from one memory location to the next.
>>
>> See also:
>> https://www.reddit.com/r/Python/comments/5oatp5/one_bit_data_type_in_numpy/
>>
>> > What I’m hoping for is that there’s a structure that is ‘viewed’ as
>> > nan-able float data, but backed by a more efficient structure
>> > internally.
>>
>> There are good implementations of this idea, such as:
>>
>> https://github.com/ilanschnell/bitarray
>>
>> Those structures cannot typically utilize the NumPy machinery, though.
>> With the new array function interface, you should at least be able to
>> build something that has something close to the NumPy API.
>>
>> Best regards,
>> Stéfan

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception

chris.bar...@noaa.gov
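
As an illustration of the kind of compact structure asked about in this
thread, here is a small pure-Python sketch that stores True / False / null
in two bits per entry (one validity bit, one value bit), so four entries
share a byte. The class name `NullableBoolArray` and its whole interface
are hypothetical, invented for this example. It is not a NumPy dtype and,
as noted above, a structure like this cannot use the NumPy machinery
directly.

```python
class NullableBoolArray:
    """Hypothetical fixed-length array of True / False / None, 2 bits each."""

    # High bit marks the entry as valid, low bit holds the boolean value.
    _ENCODE = {None: 0b00, False: 0b10, True: 0b11}
    _DECODE = {0b00: None, 0b10: False, 0b11: True}

    def __init__(self, values):
        values = list(values)
        self._length = len(values)
        # Four 2-bit entries fit in one byte; round up.
        self._buf = bytearray((self._length + 3) // 4)
        for i, v in enumerate(values):
            self[i] = v

    def __len__(self):
        return self._length

    @property
    def nbytes(self):
        """Size of the packed buffer in bytes."""
        return len(self._buf)

    def _locate(self, index):
        # Negative indices are not supported in this sketch.
        if not 0 <= index < self._length:
            raise IndexError("index out of range")
        return divmod(index, 4)  # (byte position, slot within that byte)

    def __getitem__(self, index):
        byte, slot = self._locate(index)
        return self._DECODE[(self._buf[byte] >> (2 * slot)) & 0b11]

    def __setitem__(self, index, value):
        byte, slot = self._locate(index)
        shift = 2 * slot
        cleared = self._buf[byte] & ~(0b11 << shift) & 0xFF
        self._buf[byte] = cleared | (self._ENCODE[value] << shift)


flags = NullableBoolArray([True, None, False, True, None])
print(list(flags), flags.nbytes)
```

For comparison, the float16 approach mentioned above spends two bytes per
entry, and a bool array plus a separate byte mask spends two as well. The
packed form trades that memory for per-element Python overhead on every
access, so it only pays off when the data is large and rarely touched.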