On Wed, Jun 12, 2019 at 5:55 PM Marten van Kerkwijk <
m.h.vankerkw...@gmail.com> wrote:

> Hi Ralf,
>
> You're right, the problem is with the added keyword argument (which would
> appear also if we did not still have to support the old .sum method
> override but just dispatched to __array_ufunc__ with `np.add.reduce` -
> maybe worse given that it seems likely the reduce method has seen much less
> testing in  __array_ufunc__ implementations).
>
> Still, I do think the question stands: we implement a `nansum` for our
> ndarray class a certain way, and provide ways to override it (three now, in
> fact). Is it really reasonable to expect that we wait 4 versions for other
> packages to keep up with this, and thus get stuck with given internal
> implementations?
>
> Aside: note that the present version of the nanfunctions relies on turning
> the arguments into arrays and copying 0s into it - that suggests that
> currently they do not work for duck arrays like Dask.
>

Agreed. We could safely rewrite things to use np.asarray(), without any
need to worry about backends compatibility. From an API perspective,
nothing would change -- we already cast inputs into base numpy arrays
inside the _replace_nan() routine.


>
> All the best,
>
> Marten
>
> On Wed, Jun 12, 2019 at 4:32 PM Ralf Gommers <ralf.gomm...@gmail.com>
> wrote:
>
>>
>>
>> On Wed, Jun 12, 2019 at 12:02 AM Stefan van der Walt <
>> stef...@berkeley.edu> wrote:
>>
>>> On Tue, 11 Jun 2019 15:10:16 -0400, Marten van Kerkwijk wrote:
>>> > In a way, I brought it up mostly as a concrete example of an internal
>>> > implementation which we cannot change to an objectively cleaner one
>>> because
>>> > other packages rely on an out-of-date numpy API.
>>>
>>
>> I think this is not the right way to describe the problem (see below).
>>
>>
>>> This, and the comments Nathaniel made on the array function thread, are
>>> important to take note of.  Would it be worth updating NEP 18 with a
>>> list of pitfalls?  Or should this be a new informational NEP that
>>> discusses—on a higher level—the benefits, risks, and design
>>> considerations of providing protocols?
>>>
>>
>> That would be a nice thing to do (the higher level one), but in this case
>> I think the issue has little to do with NEP 18. The summary of the issue in
>> this thread is a little brief, so let me try to clarify.
>>
>> 1. np.sum gained a new `where=` keyword in 1.17.0
>> 2. using np.sum(x) will detect a `x.sum` method if it's present and try
>> to use that
>> 3. the `_wrapreduction` utility that forwards the function to the method
>> will compare signatures of np.sum and x.sum, and throw an error if there's
>> a mismatch for any keywords that have a value other than the default
>> np._NoValue
>>
>> Code to check this:
>> >>> x1 = np.arange(5)
>> >>> x2 = np.asmatrix(x1)
>> >>> np.sum(x1)  # works
>> >>> np.sum(x2)  # works
>> >>> np.sum(x1, where=x1>3)  # works
>> >>> np.sum(x2, where=x2>3)  # _wrapreduction throws TypeError
>> ...
>> TypeError: sum() got an unexpected keyword argument 'where'
>>
>> Note that this is not specific to np.matrix. Using pandas.Series you also
>> get a TypeError:
>> >>> y = pd.Series(x1)
>> >>> np.sum(y)  # works
>> >>> np.sum(y, where=y>3)  # pandas throws TypeError
>> ...
>> TypeError: sum() got an unexpected keyword argument 'where'
>>
>> The issue is that when we have this kind of forwarding logic,
>> irrespective of how it's implemented, new keywords cannot be used until the
>> array-like objects with the methods that get forwarded to gain the same
>> keyword.
>>
>> tl;dr this is simply a cost we have to be aware of when either proposing
>> to add new keywords, or when proposing any kind of dispatching logic (in
>> this case `_wrapreduction`).
>>
>> Regarding internal use of  `np.sum(..., where=)`: this should not be done
>> until at least 4-5 versions from now, and preferably way longer than that.
>> Because doing so will break already released versions of Pandas, Dask, and
>> other libraries with array-like objects.
>>
>> Cheers,
>> Ralf
>>
>>
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion

Reply via email to