On 05/23/2018 04:02 PM, Eric Firing wrote:
> Bad or missing values (and situations where one wants to use a mask to
> operate on a subset of an array) are found in many domains of real life;
> do you really want python users in those domains to have to fall back on
> Matlab-style reliance on nans and/or manual mask manipulations, as the
> new maskedarray package is sidelined?

I also think that missing value support is important to include inside
numpy, just as it is included in other numerical packages like R and Julia.

The time is ripe to write a new and better MaskedArray, because
__array_ufunc__ exists now. A few months ago some other numpy devs and I
played with rewriting MA using __array_ufunc__ and fixing all the bugs
and inconsistencies we have discovered over time (e.g., getting rid of
the Masked constant). Both Eric and I started working on some code
changes, but never submitted PRs. There is a little discussion here
(there was more elsewhere that I can't find now):

https://github.com/numpy/numpy/pull/9792#issuecomment-333346420
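
To make the idea concrete, here is a minimal sketch of how
__array_ufunc__ lets a container carry a mask through ufunc calls. The
class name MaskedSketch and its data/mask attributes are hypothetical,
made up for illustration; this is not numpy.ma.MaskedArray and not the
code from the discussion above. It only handles plain single-output
ufunc calls without out=, and it assumes a result element is masked
whenever any input element is masked.

```python
import numpy as np


class MaskedSketch:
    """Hypothetical minimal masked container (illustration only)."""

    def __init__(self, data, mask=None):
        self.data = np.asarray(data)
        if mask is None:
            mask = np.zeros(self.data.shape, dtype=bool)
        # Store an owned boolean mask with the same shape as the data.
        self.mask = np.broadcast_to(
            np.asarray(mask, dtype=bool), self.data.shape
        ).copy()

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # This sketch only supports ordinary calls such as np.add(a, b).
        if method != "__call__" or "out" in kwargs:
            return NotImplemented

        datas, masks = [], []
        for x in inputs:
            if isinstance(x, MaskedSketch):
                datas.append(x.data)
                masks.append(x.mask)
            else:
                datas.append(x)

        # Apply the ufunc to the raw data, masked slots included.
        result = ufunc(*datas, **kwargs)
        if isinstance(result, tuple):
            return NotImplemented  # multi-output ufuncs are not handled

        # Assumed rule: masked if any input was masked at that position.
        mask = masks[0]
        for m in masks[1:]:
            mask = mask | m
        mask = np.broadcast_to(mask, np.shape(result))
        return MaskedSketch(result, mask)


a = MaskedSketch([1.0, 2.0, 3.0], mask=[False, True, False])
b = np.array([10.0, 20.0, 30.0])
c = np.add(a, b)  # dispatches to MaskedSketch.__array_ufunc__
```

Because the mask travels with the object, c.mask still flags the second
element, and no nan sentinel or manual mask bookkeeping is needed at the
call site.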

As I say there, numpy's current MA support is pretty poor compared to
R's; Wes McKinney partly justified his desire to move pandas away from
numpy because of it. We have a lot to gain by implementing it nicely.

We already have an NEP discussing possible ways forward:
https://docs.scipy.org/doc/numpy-1.14.0/neps/missing-data.html

I was pretty excited by the discussion linked above, and still am. I
want to get back to it after I finish more immediate priorities:
the printing/loading/saving fixes and the structured array fixes.

But Masked-Array-2 is on my list of desired long-term enhancements for
numpy.

Allan

