On Wed, Jun 12, 2019 at 5:55 PM Marten van Kerkwijk < m.h.vankerkw...@gmail.com> wrote:
> Hi Ralf, > > You're right, the problem is with the added keyword argument (which would > appear also if we did not still have to support the old .sum method > override but just dispatched to __array_ufunc__ with `np.add.reduce` - > maybe worse given that it seems likely the reduce method has seen much less > testing in __array_ufunc__ implementations). > > Still, I do think the question stands: we implement a `nansum` for our > ndarray class a certain way, and provide ways to override it (three now, in > fact). Is it really reasonable to expect that we wait 4 versions for other > packages to keep up with this, and thus get stuck with given internal > implementations? > > Aside: note that the present version of the nanfunctions relies on turning > the arguments into arrays and copying 0s into it - that suggests that > currently they do not work for duck arrays like Dask. > Agreed. We could safely rewrite things to use np.asarray(), without any need to worry about backends compatibility. From an API perspective, nothing would change -- we already cast inputs into base numpy arrays inside the _replace_nan() routine. > > All the best, > > Marten > > On Wed, Jun 12, 2019 at 4:32 PM Ralf Gommers <ralf.gomm...@gmail.com> > wrote: > >> >> >> On Wed, Jun 12, 2019 at 12:02 AM Stefan van der Walt < >> stef...@berkeley.edu> wrote: >> >>> On Tue, 11 Jun 2019 15:10:16 -0400, Marten van Kerkwijk wrote: >>> > In a way, I brought it up mostly as a concrete example of an internal >>> > implementation which we cannot change to an objectively cleaner one >>> because >>> > other packages rely on an out-of-date numpy API. >>> >> >> I think this is not the right way to describe the problem (see below). >> >> >>> This, and the comments Nathaniel made on the array function thread, are >>> important to take note of. Would it be worth updating NEP 18 with a >>> list of pitfalls? Or should this be a new informational NEP that >>> discusses—on a higher level—the benefits, risks, and design >>> considerations of providing protocols? >>> >> >> That would be a nice thing to do (the higher level one), but in this case >> I think the issue has little to do with NEP 18. The summary of the issue in >> this thread is a little brief, so let me try to clarify. >> >> 1. np.sum gained a new `where=` keyword in 1.17.0 >> 2. using np.sum(x) will detect a `x.sum` method if it's present and try >> to use that >> 3. the `_wrapreduction` utility that forwards the function to the method >> will compare signatures of np.sum and x.sum, and throw an error if there's >> a mismatch for any keywords that have a value other than the default >> np._NoValue >> >> Code to check this: >> >>> x1 = np.arange(5) >> >>> x2 = np.asmatrix(x1) >> >>> np.sum(x1) # works >> >>> np.sum(x2) # works >> >>> np.sum(x1, where=x1>3) # works >> >>> np.sum(x2, where=x2>3) # _wrapreduction throws TypeError >> ... >> TypeError: sum() got an unexpected keyword argument 'where' >> >> Note that this is not specific to np.matrix. Using pandas.Series you also >> get a TypeError: >> >>> y = pd.Series(x1) >> >>> np.sum(y) # works >> >>> np.sum(y, where=y>3) # pandas throws TypeError >> ... >> TypeError: sum() got an unexpected keyword argument 'where' >> >> The issue is that when we have this kind of forwarding logic, >> irrespective of how it's implemented, new keywords cannot be used until the >> array-like objects with the methods that get forwarded to gain the same >> keyword. >> >> tl;dr this is simply a cost we have to be aware of when either proposing >> to add new keywords, or when proposing any kind of dispatching logic (in >> this case `_wrapreduction`). >> >> Regarding internal use of `np.sum(..., where=)`: this should not be done >> until at least 4-5 versions from now, and preferably way longer than that. >> Because doing so will break already released versions of Pandas, Dask, and >> other libraries with array-like objects. >> >> Cheers, >> Ralf >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion@python.org >> https://mail.python.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@python.org > https://mail.python.org/mailman/listinfo/numpy-discussion >
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion