OK, fair enough to just drop this!  In this particular case, I do not care
that much about duck-typing, just about making the implementation cleaner
(which it is if one doesn't have to worry about diversion via a `.sum()`
method).

In a way, I brought it up mostly as a concrete example of an internal
implementation which we cannot change to an objectively cleaner one because
other packages rely on an out-of-date numpy API.

-- Marten

On Tue, Jun 11, 2019 at 11:53 AM Sebastian Berg <sebast...@sipsolutions.net>
wrote:

> On Tue, 2019-06-11 at 10:56 +0200, Ralf Gommers wrote:
> >
> >
> > On Mon, Jun 10, 2019 at 7:47 PM Marten van Kerkwijk <
> > m.h.vankerkw...@gmail.com> wrote:
> > > Hi All,
> > >
> > > In https://github.com/numpy/numpy/pull/12801, Tyler has been trying
> > > to use the new `where` argument for reductions to implement
> > > `nansum`, etc., using simplifications that boil down to
> > > `np.sum(..., where=~isnan(...))`.
> > >
> > > A problem that occurs is that `np.sum` will use a `.sum` method if
> > > that is present, and for matrix, the `.sum` method does not take a
> > > `where` argument. Since the `where` argument has been introduced
> > > only recently, similar problems may happen for array mimics that
> > > implement their own `.sum` method.
> > >
> > > The question now is what to do, with options being:
> > > 1. Let's stick with the existing implementation; the speed-up is
> > > not that great anyway.
> > > 2. Use try/except around the new implementation and use the old one
> > > if it fails.
> > > 3. As (2), but emit a deprecation warning. This will help array
> > > mimics, but not matrix (unless someone makes a PR; would we even
> > > accept it?);
> > > 4. Use the new implementation. `matrix` should be gone anyway and
> > > array mimics can either update their `.sum()` method or override
> > > `np.nansum` with `__array_function__`.
> > >
> > > Personally, I'd prefer (4), but realize that (3) is probably the
> > > more safer approach, even if it is really annoying to still be
> > > taking into account matrix deficiencies.
> > >
> >
> > Honestly, I agree with Tyler's assessment when he closed the PR:
> > "there's not much evidence of remarkable performance improvement, and
> > many of the other nan functions would have more complicated
> > implementation details that are unlikely to make the relative
> > performance much better. It doesn't seem to be a priority either."
> >
> > The code also got more rather than less complex. So why do this? Yes
> > you give a reason why it may help duck arrays, but it also breaks
> > things apparently.
>
> There could probably be reasons for this, since replacing with 0 may
> not work always for object arrays. But that is rather hypothetical
> (e.g. using add as string concatenation).
>
> Calling `np.add.reduce` instead of the sum method could be slightly
> more compatible, but could have its own set of issue.
> But I suppose it is true, unless we plan on adding specialized `where`
> loops or a nanadd ufunc itself the gain is likely so small that it it
> arguably may make more sense to defer this until there is an an actual
> benefit (and less pain due to adoption of the new protocols).
>
> Best,
>
> Sebastian
>
> >
> > Re matrix: it's not deprecated yet and is still fairly widely used,
> > so yes we'd accept a PR and no we should not break it gratuitously.
> >
> > Cheers,
> > Ralf
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion

Reply via email to