Re: [Numpy-discussion] new MaskedArray class

Stephan Hoyer Mon, 24 Jun 2019 16:21:56 -0700

On Mon, Jun 24, 2019 at 3:56 PM Allan Haldane <[email protected]>
wrote:


> I'm not at all set on that behavior and we can do something else. For
> now, I chose this way since it seemed to best match the "IGNORE" mask
> behavior.
>
> The behavior you described further above where the output row/col would
> be masked corresponds better to "NA" (propagating) mask behavior, which
> I am leaving for later implementation.


This does seem like a clean way to *implement* things, but from a user
perspective I'm not sure I would want separate classes for "IGNORE" vs "NA"
masks.

I tend to think of "IGNORE" vs "NA" as descriptions of particular
operations rather than of the data itself. There is a spectrum of ways to
handle missing data, and the right way to propagate missing values is
often highly context dependent. The right place to set this is in the
functions where operations are defined, not on classes that may be defined
far away from where the computation happens. For example, pandas has a
"min_count" parameter in functions for intermediate use-cases between
"IGNORE" and "NA" semantics, e.g., "take an average, unless the number of
data points is fewer than min_count."
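
To make the "spectrum" concrete, here is a rough sketch of a reduction that
takes the missing-data policy as a function argument rather than from the
array's type. The helper name masked_mean and its signature are hypothetical,
made up for illustration; this is not an existing NumPy or pandas API, and the
mask is just a plain boolean array where True marks a missing entry.

```python
import numpy as np

def masked_mean(values, mask, min_count=0):
    """Mean of the unmasked entries, or NaN if fewer than min_count are valid.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(values, dtype=float)
    valid = ~np.asarray(mask, dtype=bool)   # True where data is present
    n_valid = int(valid.sum())
    if n_valid == 0 or n_valid < min_count:
        return float("nan")
    return float(values[valid].mean())

data = [1.0, 2.0, 3.0, 4.0]
mask = [False, True, True, False]   # two of the four entries are missing

# "IGNORE"-like: skip missing entries whenever at least one value is valid.
ignore_result = masked_mean(data, mask, min_count=0)

# "NA"-like: require every entry to be valid, so any missing value propagates.
na_result = masked_mean(data, mask, min_count=len(data))

# Intermediate: accept the result only if at least two entries are valid.
middle_result = masked_mean(data, mask, min_count=2)
```

With the same data and the same mask, the caller picks the semantics at the
call site, which is the design point argued above.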

Are there examples of existing projects that define separate user-facing
types for different styles of masks?

_______________________________________________
NumPy-Discussion mailing list
[email protected]
https://mail.python.org/mailman/listinfo/numpy-discussion
