On Sun, Jun 23, 2019 at 11:55 PM Marten van Kerkwijk <m.h.vankerkw...@gmail.com> wrote:

>> Your proposal would be something like np.sum(array,
>> where=np.ones_like(array))? This seems rather verbose for a common
>> operation. Perhaps np.sum(array, where=True) would work, making use of
>> broadcasting? (I haven't actually checked whether this is well-defined yet.)
>
> I think we'd need to consider separately the operation on the mask and on
> the data. In my proposal, the data would always do `np.sum(array,
> where=~mask)`, while how the mask would propagate might depend on the mask
> itself, i.e., we'd have different mask types for `skipna=True` (default)
> and `False` ("contagious") reductions, which differed in doing
> `logical_and.reduce` or `logical_or.reduce` on the mask.

OK, I think I finally understand what you're getting at. So suppose this is
how we implement it internally. Would we really insist on a user creating a
new MaskedArray with a new mask object, e.g., with a GreedyMask? We could add
sugar for this, but certainly array.greedy_masked().sum() is significantly
less clear than array.sum(skipna=False).

I'm also a little concerned about a proliferation of MaskedArray/Mask types.
New types are significantly harder to understand than new functions (or new
arguments on existing functions). I don't know if we have enough distinct use
cases for this many types.

>> Are there use-cases for propagating masks separately from data? If not, it
>> might make sense to only define mask operations along with data, which
>> could be much simpler.
>
> I had only thought about separating out the concern of mask propagation
> from the "MaskedArray" class to the mask proper, but it might indeed make
> things easier if the mask also did any required preparation for passing
> things on to the data (such as adjusting the "where" argument in a
> reduction). I also like that this way the mask can determine even before
> the data what functionality is available (i.e., it could be the place from
> which to return `NotImplemented` for a ufunc.at call with a masked index
> argument).

You're going to have to come up with something more compelling than
"separation of concerns" to convince me that this extra Mask abstraction is
worthwhile. On its own, I think a separate Mask class would only obfuscate
MaskedArray functions.

For example, compare these two implementations of add:

    def add1(x, y):
        return MaskedArray(x.data + y.data, x.mask | y.mask)

    def add2(x, y):
        return MaskedArray(x.data + y.data, x.mask + y.mask)

The second version requires that you *also* know how Mask classes work, and
how they implement +. So now you need to look in at least twice as many
places to understand add() for MaskedArray objects.
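
A minimal sketch of the semantics discussed in this message, under stated assumptions: `MaskedArray` here is a hypothetical two-field stand-in (not `numpy.ma.MaskedArray`), `masked_sum` is an illustrative helper name, and `mask` is taken to be True where data is missing. It shows that `where=True` does broadcast in `np.sum` (the `where` argument needs NumPy 1.17 or later), and that the `skipna=True` and `skipna=False` reductions can share the same data reduction and differ only in whether the mask is reduced with `logical_and` or `logical_or`.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(eq=False)
class MaskedArray:
    # Hypothetical stand-in for illustration; mask is True where data is missing.
    data: np.ndarray
    mask: np.ndarray


def add1(x, y):
    # Mask handling is spelled out here: the result is masked if either input is.
    return MaskedArray(x.data + y.data, x.mask | y.mask)


def masked_sum(arr, skipna=True):
    # Illustrative helper (an assumption, not an existing NumPy function).
    # The data reduction is identical in both modes: masked elements are skipped.
    total = np.sum(arr.data, where=~arr.mask)
    if skipna:
        # The result is masked only if every element was masked.
        result_mask = np.logical_and.reduce(arr.mask, axis=None)
    else:
        # "Contagious": a single masked element masks the result.
        result_mask = np.logical_or.reduce(arr.mask, axis=None)
    return float(total), bool(result_mask)


x = MaskedArray(np.array([1.0, 2.0, 3.0]), np.array([False, True, False]))
y = MaskedArray(np.array([10.0, 20.0, 30.0]), np.array([False, False, True]))

print(add1(x, y).mask)              # [False  True  True]
print(masked_sum(x, skipna=True))   # (4.0, False)
print(masked_sum(x, skipna=False))  # (4.0, True)
print(np.sum(x.data, where=True))   # 6.0, the scalar True broadcasts
```

With everything in one place, the mask rule for each operation can be read directly off the function body, which is the point made about `add1` above.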
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion