OK, fair enough to just drop this! In this particular case, I do not care that much about duck-typing, just about making the implementation cleaner (which it is if one doesn't have to worry about diversion via a `.sum()` method).
In a way, I brought it up mostly as a concrete example of an internal implementation which we cannot change to an objectively cleaner one because other packages rely on an out-of-date numpy API. -- Marten On Tue, Jun 11, 2019 at 11:53 AM Sebastian Berg <sebast...@sipsolutions.net> wrote: > On Tue, 2019-06-11 at 10:56 +0200, Ralf Gommers wrote: > > > > > > On Mon, Jun 10, 2019 at 7:47 PM Marten van Kerkwijk < > > m.h.vankerkw...@gmail.com> wrote: > > > Hi All, > > > > > > In https://github.com/numpy/numpy/pull/12801, Tyler has been trying > > > to use the new `where` argument for reductions to implement > > > `nansum`, etc., using simplifications that boil down to > > > `np.sum(..., where=~isnan(...))`. > > > > > > A problem that occurs is that `np.sum` will use a `.sum` method if > > > that is present, and for matrix, the `.sum` method does not take a > > > `where` argument. Since the `where` argument has been introduced > > > only recently, similar problems may happen for array mimics that > > > implement their own `.sum` method. > > > > > > The question now is what to do, with options being: > > > 1. Let's stick with the existing implementation; the speed-up is > > > not that great anyway. > > > 2. Use try/except around the new implementation and use the old one > > > if it fails. > > > 3. As (2), but emit a deprecation warning. This will help array > > > mimics, but not matrix (unless someone makes a PR; would we even > > > accept it?); > > > 4. Use the new implementation. `matrix` should be gone anyway and > > > array mimics can either update their `.sum()` method or override > > > `np.nansum` with `__array_function__`. > > > > > > Personally, I'd prefer (4), but realize that (3) is probably the > > > more safer approach, even if it is really annoying to still be > > > taking into account matrix deficiencies. > > > > > > > Honestly, I agree with Tyler's assessment when he closed the PR: > > "there's not much evidence of remarkable performance improvement, and > > many of the other nan functions would have more complicated > > implementation details that are unlikely to make the relative > > performance much better. It doesn't seem to be a priority either." > > > > The code also got more rather than less complex. So why do this? Yes > > you give a reason why it may help duck arrays, but it also breaks > > things apparently. > > There could probably be reasons for this, since replacing with 0 may > not work always for object arrays. But that is rather hypothetical > (e.g. using add as string concatenation). > > Calling `np.add.reduce` instead of the sum method could be slightly > more compatible, but could have its own set of issue. > But I suppose it is true, unless we plan on adding specialized `where` > loops or a nanadd ufunc itself the gain is likely so small that it it > arguably may make more sense to defer this until there is an an actual > benefit (and less pain due to adoption of the new protocols). > > Best, > > Sebastian > > > > > Re matrix: it's not deprecated yet and is still fairly widely used, > > so yes we'd accept a PR and no we should not break it gratuitously. > > > > Cheers, > > Ralf > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion@python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@python.org > https://mail.python.org/mailman/listinfo/numpy-discussion >
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion