Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods

2019-04-22 Thread Stephan Hoyer
On Mon, Apr 22, 2019 at 2:20 PM Ralf Gommers  wrote:

>
>
> On Mon, Apr 22, 2019 at 9:26 PM Nathaniel Smith  wrote:
>
>> Your last email didn't really clarify anything for me. I get that
>> np.func.__numpy_implementation__ is intended to have the semantics of
>> numpy's implementation of func, but that doesn't tell me much :-). And
>> also, that's exactly the definition of np.func, isn't it?
>>
>
My understanding of the protocol we came up with in NEP-18 is that every
NumPy function (that takes array-like arguments) now has two parts to its
implementation:
1. The NEP-18 part involving calling the dispatcher function, and checking
for/calling __array_function__ attributes on array-like arguments. This
part is documented in NEP-18.
2. The original function definition, which is called if either (a) no
__array_function__ attributes exist, or (b) the only __array_function__
attribute is numpy.ndarray.__array_function__. This part is documented in
the docstring of the NumPy function.

"__numpy_implementation__" provides a short-cut to (2) without (1). That's
it.
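
To make the two parts concrete, here is a toy sketch in plain Python. It is not NumPy's actual decorator: `array_function_dispatch` below is a simplified stand-in, and `total`, `_total_dispatcher` and `Logged` are hypothetical names used only for illustration.

```python
import functools


def array_function_dispatch(dispatcher):
    """Toy stand-in for the NEP-18 decorator (simplified, hypothetical)."""
    def decorator(implementation):
        @functools.wraps(implementation)
        def public_api(*args, **kwargs):
            # Part 1: the dispatcher returns the array-like arguments, and we
            # look for __array_function__ overrides on their types.
            relevant_args = dispatcher(*args, **kwargs)
            types = tuple({type(arg) for arg in relevant_args})
            found_override = False
            for arg in relevant_args:
                override = getattr(type(arg), "__array_function__", None)
                if override is None:
                    continue
                found_override = True
                result = override(arg, public_api, types, args, kwargs)
                if result is not NotImplemented:
                    return result
            if found_override:
                raise TypeError("no override handled this call")
            # Part 2: no overrides exist, so run the original definition.
            return implementation(*args, **kwargs)

        # The proposed short-cut: part 2 without part 1.
        public_api.__numpy_implementation__ = implementation
        return public_api
    return decorator


def _total_dispatcher(values):
    return (values,)


@array_function_dispatch(_total_dispatcher)
def total(values):
    return sum(values)


class Logged:
    """Hypothetical duck array that unwraps itself and defers to part 2."""

    def __init__(self, values):
        self.values = values

    def __array_function__(self, func, types, args, kwargs):
        unwrapped = [a.values if isinstance(a, Logged) else a for a in args]
        return ("logged", func.__numpy_implementation__(*unwrapped, **kwargs))


plain_result = total([1, 2, 3])            # goes through part 1, then part 2
logged_result = total(Logged([1, 2, 3]))   # handled by the override
direct_result = total.__numpy_implementation__([1, 2, 3])  # part 2 only
```

In this sketch the override calls the short-cut from inside `__array_function__`, so the call does not re-enter dispatch.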

OK, thinking about this a little bit more, there is one other (rare)
difference: in cases where a function has deprecated arguments, we are
currently only issuing the deprecation warnings in the dispatcher function,
rather than in both the dispatcher and the implementation. This is all the
more reason to discourage users from calling __numpy_implementation__
directly (I'll update the NEP), but it's still fine to call
__numpy_implementation__ from within __array_function__ methods themselves.
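
A toy illustration of that difference, in plain Python rather than NumPy itself. All names here (`scale`, `factor`, the deprecated `multiplier` keyword) are made up for the example, and the override lookup that a real dispatcher feeds into is left out.

```python
import warnings


def _scale_dispatcher(values, factor=1, multiplier=None):
    # The deprecation warning lives only here, in the dispatcher.
    if multiplier is not None:
        warnings.warn("`multiplier` is deprecated, use `factor`",
                      DeprecationWarning, stacklevel=3)
    return (values,)


def _scale_implementation(values, factor=1, multiplier=None):
    # The implementation still honours the old keyword, but silently.
    if multiplier is not None:
        factor = multiplier
    return [v * factor for v in values]


def scale(values, factor=1, multiplier=None):
    # Public function: run the dispatcher first, then the implementation.
    _scale_dispatcher(values, factor, multiplier)
    return _scale_implementation(values, factor, multiplier)


scale.__numpy_implementation__ = _scale_implementation

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    via_public = scale([1, 2], multiplier=3)
    warnings_from_public = len(caught)
    via_shortcut = scale.__numpy_implementation__([1, 2], multiplier=3)
    warnings_from_shortcut = len(caught) - warnings_from_public
```

Both calls return the same values, but only the public call emits the warning.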

I guess the other option would be to make it programmatically impossible to
access implementations outside of __array_function__, by making
numpy_implementation an argument used to call __array_function__() rather
than making it an attribute on NumPy functions. I don't like this as much,
for two reasons:
1. It would break every existing implementation of __array_function__
before it launches. We did reserve the right to do this, but it's still a
little unfriendly to our early adopters.
2. There are still cases where users will prefer to call
np.concatenate.__numpy_implementation__ for extra performance, even knowing
that they will miss any hypothetical deprecation warnings and
removed/renamed function arguments.

>> You're talking about ~doubling the size of numpy's API,
>>
>
> I think we can already get both the NEP 18 wrapped functions and their
> underlying implementations today, based on the value of 
> NUMPY_EXPERIMENTAL_ARRAY_FUNCTION.
>
> It looks to me like all this proposed change does is bypass a
> do-very-little wrapper.
>

This is how I think of it.

>> and don't seem able to even articulate what the new API's commitments are.
>> This still makes me nervous. Maybe it should have a NEP? What's your
>> testing strategy for all the new functions?
>>
>
> The current decorator mechanism already checks that the signatures match,
> so it shouldn't be possible to get a mismatch. So probably not much is
> needed beyond some assert_equal(np.func(...),
> np.func.__numpy_implementation__(...)) checks.
>
> @Stephan the PR for the NEP change is very hard to parse. Maybe easier to
> just open a PR with an implementation for one or a few functions +
> associated tests?
>

Sure, here's a full implementation (with tests):
https://github.com/numpy/numpy/pull/13389

I have not included tests on every numpy function, but we didn't write
those for each NumPy function with __array_function__ overrides, either --
the judgment was that the changes are mechanistic enough that writing a
unit test for each function would not be worthwhile.

Also you'll note that my PR includes only a single change to
np.ndarray.__array_function__ (swapping out __wrapped__ ->
__numpy_implementation__). This is because we had actually already changed
the implementation of ndarray.__array_function__ without updating the NEP,
per prior discussion on the mailing list [1]. The existing use of the
__wrapped__ attribute is an undocumented optimization / implementation
detail.

[1]
https://mail.python.org/pipermail/numpy-discussion/2018-November/078912.html


> Cheers,
> Ralf
>
>
>
>> On Mon, Apr 22, 2019, 09:22 Stephan Hoyer  wrote:
>>
>>> Are there still concerns here? If not, I would love to move ahead with
>>> these changes so we can get this into NumPy 1.17.
>>>
>>> On Tue, Apr 16, 2019 at 10:23 AM Stephan Hoyer  wrote:
>>>
 __numpy_implementation__ is indeed simply a slot for third-parties to
 access NumPy's implementation. It should be considered "NumPy's current
 implementation", not "NumPy's implementation as of 1.14". Of course, in
 practice these will remain very similar, because we are already very
 conservative about how we change NumPy.

 I would love to have clean well-defined coercion semantics for every
 NumPy function, which would be implicitly adopted by
 `__numpy_implementation__` (e.g., we could say that every function always
 coerces its arguments wit

Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods

2019-04-22 Thread Ralf Gommers
On Mon, Apr 22, 2019 at 9:26 PM Nathaniel Smith  wrote:

> Your last email didn't really clarify anything for me. I get that
> np.func.__numpy_implementation__ is intended to have the semantics of
> numpy's implementation of func, but that doesn't tell me much :-). And
> also, that's exactly the definition of np.func, isn't it?
>
> You're talking about ~doubling the size of numpy's API,
>

I think we can already get both the NEP 18 wrapped functions and their
underlying implementations today, based on the value of
NUMPY_EXPERIMENTAL_ARRAY_FUNCTION.

It looks to me like all this proposed change does is bypass a
do-very-little wrapper.

and don't seem able to even articulate what the new API's commitments are.
> This still makes me nervous. Maybe it should have a NEP? What's your
> testing strategy for all the new functions?
>

The current decorator mechanism already checks that the signatures match,
so it shouldn't be possible to get a mismatch. So probably not much is
needed beyond some assert_equal(np.func(...),
np.func.__numpy_implementation__(...)) checks.
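
A rough sketch of what such a check could look like, written against a toy function so that it runs on its own. The names `clip`, `_clip_implementation` and `assert_matches_implementation` are hypothetical; a real test would presumably use numpy.testing.assert_equal on actual NumPy functions.

```python
import inspect


def _clip_implementation(values, lo, hi):
    return [min(max(v, lo), hi) for v in values]


def clip(values, lo, hi):
    # Stand-in for the dispatching public function.
    return _clip_implementation(values, lo, hi)


clip.__numpy_implementation__ = _clip_implementation


def assert_matches_implementation(func, *args, **kwargs):
    """Check that func and its non-dispatched implementation agree."""
    implementation = func.__numpy_implementation__
    # Signatures should match, as with the existing decorator check.
    assert inspect.signature(func) == inspect.signature(implementation)
    expected = implementation(*args, **kwargs)
    actual = func(*args, **kwargs)
    assert actual == expected, (func.__name__, actual, expected)


# One (function, args, kwargs) entry per case worth covering.
CASES = [
    (clip, ([5, -1, 12], 0, 10), {}),
    (clip, ([0.5, 2.5], 1, 2), {}),
]

for func, args, kwargs in CASES:
    assert_matches_implementation(func, *args, **kwargs)
```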

@Stephan the PR for the NEP change is very hard to parse. Maybe easier to
just open a PR with an implementation for one or a few functions +
associated tests?

Cheers,
Ralf



> On Mon, Apr 22, 2019, 09:22 Stephan Hoyer  wrote:
>
>> Are there still concerns here? If not, I would love to move ahead with
>> these changes so we can get this into NumPy 1.17.
>>
>> On Tue, Apr 16, 2019 at 10:23 AM Stephan Hoyer  wrote:
>>
>>> __numpy_implementation__ is indeed simply a slot for third-parties to
>>> access NumPy's implementation. It should be considered "NumPy's current
>>> implementation", not "NumPy's implementation as of 1.14". Of course, in
>>> practice these will remain very similar, because we are already very
>>> conservative about how we change NumPy.
>>>
>>> I would love to have clean well-defined coercion semantics for every
>>> NumPy function, which would be implicitly adopted by
>>> `__numpy_implementation__` (e.g., we could say that every function always
>>> coerces its arguments with `np.asarray()`). But I think that's an
>>> orthogonal issue. We have been supporting some ad-hoc duck typing in NumPy
>>> for a long time (e.g., the `.sum()` method which is called by `np.sum()`).
>>> Removing that would require a deprecation cycle, which may indeed be
>>> warranted once we're sure we're happy with __array_function__. But I don't
>>> think the deprecation cycle will be any worse if the implementation is also
>>> exposed via `__numpy_implementation__`.
>>>
>>> We should definitely still think about a cleaner "core" implementation
>>> of NumPy functions in terms of a minimal core. One recent example of this
>>> can be found in JAX (see
>>> https://github.com/google/jax/blob/04b45e4086249bad691a33438e8bb6fcf639d001/jax/numpy/lax_numpy.py).
>>> This would be something appropriate to put into a more generic function
>>> attribute on NumPy functions, perhaps `__array_implementation__`. But I
>>> don't think formalizing `__numpy_implementation__` as a way to get access
>>> to NumPy's default implementation will limit our future options here.
>>>
>>> Cheers,
>>> Stephan
>>>
>>>
>>> On Tue, Apr 16, 2019 at 6:44 AM Marten van Kerkwijk <
>>> m.h.vankerkw...@gmail.com> wrote:
>>>

 I somewhat share Nathaniel's worry that by providing
 `__numpy_implementation__` we essentially get stuck with the
 implementations we have currently, rather than having the hoped-for freedom
 to remove all the `np.asarray` coercion. In that respect, an advantage of
 using `_wrapped` is that it is clearly a private method, so anybody is
 automatically forewarned that this can change.

 In principle, ndarray.__array_function__ would be more logical, but as
 noted in the PR, the problem is that it is non-trivial for a regular
 __array_function__ implementation to coerce all the arguments to ndarray
 itself.

 Which suggests that perhaps what is missing is a general routine that
 does that, i.e., that re-uses the dispatcher.

 -- Marten
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@python.org
 https://mail.python.org/mailman/listinfo/numpy-discussion



Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods

2019-04-22 Thread Nathaniel Smith
Your last email didn't really clarify anything for me. I get that
np.func.__numpy_implementation__ is intended to have the semantics of
numpy's implementation of func, but that doesn't tell me much :-). And
also, that's exactly the definition of np.func, isn't it?

You're talking about ~doubling the size of numpy's API, and don't seem able
to even articulate what the new API's commitments are. This still makes me
nervous. Maybe it should have a NEP? What's your testing strategy for all
the new functions?

On Mon, Apr 22, 2019, 09:22 Stephan Hoyer  wrote:

> Are there still concerns here? If not, I would love to move ahead with
> these changes so we can get this into NumPy 1.17.
>
> On Tue, Apr 16, 2019 at 10:23 AM Stephan Hoyer  wrote:
>
>> __numpy_implementation__ is indeed simply a slot for third-parties to
>> access NumPy's implementation. It should be considered "NumPy's current
>> implementation", not "NumPy's implementation as of 1.14". Of course, in
>> practice these will remain very similar, because we are already very
>> conservative about how we change NumPy.
>>
>> I would love to have clean well-defined coercion semantics for every
>> NumPy function, which would be implicitly adopted by
>> `__numpy_implementation__` (e.g., we could say that every function always
>> coerces its arguments with `np.asarray()`). But I think that's an
>> orthogonal issue. We have been supporting some ad-hoc duck typing in NumPy
>> for a long time (e.g., the `.sum()` method which is called by `np.sum()`).
>> Removing that would require a deprecation cycle, which may indeed be
>> warranted once we're sure we're happy with __array_function__. But I don't
>> think the deprecation cycle will be any worse if the implementation is also
>> exposed via `__numpy_implementation__`.
>>
>> We should definitely still think about a cleaner "core" implementation of
>> NumPy functions in terms of a minimal core. One recent example of this can
>> be found in JAX (see
>> https://github.com/google/jax/blob/04b45e4086249bad691a33438e8bb6fcf639d001/jax/numpy/lax_numpy.py).
>> This would be something appropriate to put into a more generic function
>> attribute on NumPy functions, perhaps `__array_implementation__`. But I
>> don't think formalizing `__numpy_implementation__` as a way to get access
>> to NumPy's default implementation will limit our future options here.
>>
>> Cheers,
>> Stephan
>>
>>
>> On Tue, Apr 16, 2019 at 6:44 AM Marten van Kerkwijk <
>> m.h.vankerkw...@gmail.com> wrote:
>>
>>>
>>> I somewhat share Nathaniel's worry that by providing
>>> `__numpy_implementation__` we essentially get stuck with the
>>> implementations we have currently, rather than having the hoped-for freedom
>>> to remove all the `np.asarray` coercion. In that respect, an advantage of
>>> using `_wrapped` is that it is clearly a private method, so anybody is
>>> automatically forewarned that this can change.
>>>
>>> In principle, ndarray.__array_function__ would be more logical, but as
>>> noted in the PR, the problem is that it is non-trivial for a regular
>>> __array_function__ implementation to coerce all the arguments to ndarray
>>> itself.
>>>
>>> Which suggests that perhaps what is missing is a general routine that
>>> does that, i.e., that re-uses the dispatcher.
>>>
>>> -- Marten


Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods

2019-04-22 Thread Stephan Hoyer
Are there still concerns here? If not, I would love to move ahead with
these changes so we can get this into NumPy 1.17.

On Tue, Apr 16, 2019 at 10:23 AM Stephan Hoyer  wrote:

> __numpy_implementation__ is indeed simply a slot for third-parties to
> access NumPy's implementation. It should be considered "NumPy's current
> implementation", not "NumPy's implementation as of 1.14". Of course, in
> practice these will remain very similar, because we are already very
> conservative about how we change NumPy.
>
> I would love to have clean well-defined coercion semantics for every NumPy
> function, which would be implicitly adopted by `__numpy_implementation__`
> (e.g., we could say that every function always coerces its arguments with
> `np.asarray()`). But I think that's an orthogonal issue. We have been
> supporting some ad-hoc duck typing in NumPy for a long time (e.g., the
> `.sum()` method which is called by `np.sum()`). Removing that would require
> a deprecation cycle, which may indeed be warranted once we're sure we're
> happy with __array_function__. But I don't think the deprecation cycle will
> be any worse if the implementation is also exposed via
> `__numpy_implementation__`.
>
> We should definitely still think about a cleaner "core" implementation of
> NumPy functions in terms of a minimal core. One recent example of this can
> be found in JAX (see
> https://github.com/google/jax/blob/04b45e4086249bad691a33438e8bb6fcf639d001/jax/numpy/lax_numpy.py).
> This would be something appropriate to put into a more generic function
> attribute on NumPy functions, perhaps `__array_implementation__`. But I
> don't think formalizing `__numpy_implementation__` as a way to get access
> to NumPy's default implementation will limit our future options here.
>
> Cheers,
> Stephan
>
>
> On Tue, Apr 16, 2019 at 6:44 AM Marten van Kerkwijk <
> m.h.vankerkw...@gmail.com> wrote:
>
>>
>> I somewhat share Nathaniel's worry that by providing
>> `__numpy_implementation__` we essentially get stuck with the
>> implementations we have currently, rather than having the hoped-for freedom
>> to remove all the `np.asarray` coercion. In that respect, an advantage of
>> using `_wrapped` is that it is clearly a private method, so anybody is
>> automatically forewarned that this can change.
>>
>> In principle, ndarray.__array_function__ would be more logical, but as
>> noted in the PR, the problem is that it is non-trivial for a regular
>> __array_function__ implementation to coerce all the arguments to ndarray
>> itself.
>>
>> Which suggests that perhaps what is missing is a general routine that
>> does that, i.e., that re-uses the dispatcher.
>>
>> -- Marten


Re: [Numpy-discussion] Boolean arrays with nulls?

2019-04-22 Thread Chris Barker
On Thu, Apr 18, 2019 at 10:52 AM Stuart Reynolds 
wrote:

> Is float8 a thing?
>

no, but np.float16 is -- so at least only twice as much memory as you need
:-)

array([ nan,  inf, -inf], dtype=float16)

I think masked arrays are going to be just as much, as they need to carry
the mask.
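
A small sketch of that memory comparison, assuming NaN is used to mark the null entries in the float16 version. The sample `values` list is made up for illustration.

```python
import numpy as np

# Hypothetical input: booleans with one missing entry.
values = [True, None, False, True]

# float16 with NaN standing in for null: two bytes per entry.
as_float16 = np.array(
    [np.nan if v is None else float(v) for v in values], dtype=np.float16
)
is_null = np.isnan(as_float16)

# Masked bool array: one byte of data plus one byte of mask per entry.
masked = np.ma.masked_array(
    [False if v is None else v for v in values],
    mask=[v is None for v in values],
)

bytes_float16 = as_float16.nbytes
bytes_masked = masked.data.nbytes + np.ma.getmaskarray(masked).nbytes
```

For this input both layouts come to two bytes per entry.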

-CHB



>
> On Thu, Apr 18, 2019 at 9:46 AM Stefan van der Walt 
> wrote:
>
>> Hi Stuart,
>>
>> On Thu, 18 Apr 2019 09:12:31 -0700, Stuart Reynolds wrote:
>> > Is there an efficient way to represent bool arrays with null entries?
>>
>> You can use the bool dtype:
>>
>> In [5]: x = np.array([True, False, True])
>>
>> In [6]: x
>> Out[6]: array([ True, False,  True])
>>
>> In [7]: x.dtype
>> Out[7]: dtype('bool')
>>
>> You should note that this stores one True/False value per byte, so it is
>> not optimal in terms of memory use.  There is no easy way to do
>> bit-arrays with NumPy, because we use strides to determine how to move
>> from one memory location to the next.
>>
>> See also:
>> https://www.reddit.com/r/Python/comments/5oatp5/one_bit_data_type_in_numpy/
>>
>> > What I’m hoping for is that there’s a structure that is ‘viewed’ as
>> > nan-able float data, but backed by a more efficient structure
>> > internally.
>>
>> There are good implementations of this idea, such as:
>>
>> https://github.com/ilanschnell/bitarray
>>
>> Those structures cannot typically utilize the NumPy machinery, though.
>> With the new array function interface, you should at least be able to
>> build something that has something close to the NumPy API.
>>
>> Best regards,
>> Stéfan
>


-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov