On Thu, Jun 20, 2019 at 12:44 PM Allan Haldane <allanhald...@gmail.com
<mailto:allanhald...@gmail.com>> wrote:
On 6/19/19 10:19 PM, Marten van Kerkwijk wrote:
> Hi Allan,
>
> This is very impressive! I could get the tests that I wrote for my
class
> pass with yours using Quantity with what I would consider very minimal
> changes. I only could not find a good way to unmask data (I like the
> idea of setting the mask on some elements via `ma[item] = X`); is this
> on purpose?
Yes, I want to make it difficult for the user to access the garbage
values under the mask, which are often clobbered values. The only way to
"remove" a masked value is by replacing it with a new non-masked value.
> Anyway, it would seem easily at the point where I should comment
on your
> repository rather than in the mailing list!
To make further progress on this encapsulation idea I need a more
complete ducktype to pass into MaskedArray to test, so that's what I'll
work on next, when I have time. I'll either try to finish my
ArrayCollection type, or try making a simple NDunit ducktype
piggybacking on astropy's Unit.
Best,
Allan
>
> All the best,
>
> Marten
>
>
> On Wed, Jun 19, 2019 at 5:45 PM Allan Haldane
<allanhald...@gmail.com <mailto:allanhald...@gmail.com>
> <mailto:allanhald...@gmail.com <mailto:allanhald...@gmail.com>>>
wrote:
>
> On 6/18/19 2:04 PM, Marten van Kerkwijk wrote:
> >
> >
> > On Tue, Jun 18, 2019 at 12:55 PM Allan Haldane
> <allanhald...@gmail.com <mailto:allanhald...@gmail.com>
<mailto:allanhald...@gmail.com <mailto:allanhald...@gmail.com>>
> > <mailto:allanhald...@gmail.com
<mailto:allanhald...@gmail.com> <mailto:allanhald...@gmail.com
<mailto:allanhald...@gmail.com>>>>
> wrote:
> > <snip>
> >
> > > This may be too much to ask from the initializer, but, if
> so, it still
> > > seems most useful if it is made as easy as possible to do,
> say, `class
> > > MaskedQuantity(Masked, Quantity): <very few overrides>`.
> >
> > Currently MaskedArray does not accept ducktypes as
underlying
> arrays,
> > but I think it shouldn't be too hard to modify it to do so.
> Good idea!
> >
> >
> > Looking back at my trial, I see that I also never got to
duck arrays -
> > only ndarray subclasses - though I tried to make the code as
> agnostic as
> > possible.
> >
> > (Trial at
> >
>
https://github.com/astropy/astropy/compare/master...mhvk:utils-masked-class?expand=1)
> >
> > I already partly navigated this mixin-issue in the
> > "MaskedArrayCollection" class, which essentially does
> > ArrayCollection(MaskedArray(array)), and only takes about 30
> lines of
> > boilerplate. That's the backwards encapsulation order from
> what you want
> > though.
> >
> >
> > Yes, indeed, from a quick trial `MaskedArray(np.arange(3.) *
u.m,
> > mask=[True, False, False])` does indeed not have a `.unit`
attribute
> > (and cannot represent itself...); I'm not at all sure that my
> method of
> > just creating a mixed class is anything but a recipe for
disaster,
> though!
>
> Based on your suggestion I worked on this a little today, and
now my
> MaskedArray more easily encapsulates both ducktypes and ndarray
> subclasses (pushed to repo). Here's an example I got working
with masked
> units using unyt:
>
> [1]: from MaskedArray import X, MaskedArray, MaskedScalar
>
> [2]: from unyt import m, km
>
> [3]: import numpy as np
>
> [4]: uarr = MaskedArray([1., 2., 3.]*km, mask=[0,1,0])
>
> [5]: uarr
>
> MaskedArray([1., X , 3.])
> [6]: uarr + 1*m
>
> MaskedArray([1.001, X , 3.001])
> [7]: uarr.filled()
>
> unyt_array([1., 0., 3.], 'km')
> [8]: np.concatenate([uarr, 2*uarr]).filled()
> unyt_array([1., 0., 3., 2., 0., 6.], '(dimensionless)')
>
> The catch is the ducktype/subclass has to rigorously follow
numpy's
> indexing rules, including distinguishing 0d arrays from
scalars. For now
> only I used unyt in the example above since it happens to be
less strict
> about dimensionless operations than astropy.units which trips
up my
> repr code. (see below for example with astropy.units). Note in
the last
> line I lost the dimensions, but that is because unyt does not
handle
> np.concatenate. To get that to work we need a true ducktype
for units.
>
> The example above doesn't expose the ".units" attribute
outside the
> MaskedArray, and it doesn't print the units in the repr. But
you can
> access them using "filled".
>
> While I could make MaskedArray forward unknown attribute
accesses to the
> encapsulated array, that seems a bit dangerous/bug-prone at first
> glance, so probably I want to require the user to make a
MaskedArray
> subclass to do so. I've just started playing with that
(probably buggy),
> and Ive attached subclass examples for astropy.unit and unyt,
with some
> example output below.
>
> Cheers,
> Allan
>
>
>
> Example using the attached astropy unit subclass:
>
> >>> from astropy.units import m, km, s
> >>> uarr = MaskedQ(np.ones(3), units=km, mask=[0,1,0])
> >>> uarr
> MaskedQ([1., X , 1.], units=km)
> >>> uarr.units
> km
> >>> uarr + (1*m)
> MaskedQ([1.001, X , 1.001], units=km)
> >>> uarr/(1*s)
> MaskedQ([1., X , 1.], units=km / s)
> >>> (uarr*(1*m))[1:]
> MaskedQ([X , 1.], units=km m)
> >>> np.add.outer(uarr, uarr)
> MaskedQ([[2., X , 2.],
> [X , X , X ],
> [2., X , 2.]], units=km)
> >>> print(uarr)
> [1. X 1.] km m
>
> Cheers,
> Allan
>
>
> > > Even if this impossible, I think it is conceptually useful
> to think
> > > about what the masking class should do. My sense is that,
> e.g., it
> > > should not attempt to decide when an operation
succeeds or not,
> > but just
> > > "or together" input masks for regular, multiple-input
functions,
> > and let
> > > the underlying arrays skip elements for reductions by
using
> `where`
> > > (hey, I did implement that for a reason... ;-). In
> particular, it
> > > suggests one should not have things like domains and all
> that (I never
> > > understood why `MaskedArray` did that). If one wants more,
> the class
> > > should provide a method that updates the mask (a sensible
> default
> > might
> > > be `mask |= ~np.isfinite(result)` - here, the class
being masked
> > should
> > > logically support ufuncs and functions, so it can
decide what
> > "isfinite"
> > > means).
> >
> > I agree it would be nice to remove domains. It would
make life
> easier,
> > and I could remove a lot of twiddly code! I kept it in
for now to
> > minimize the behavior changes from the old MaskedArray.
> >
> >
> > That makes sense. Could be separated out to a
backwards-compatibility
> > class later.
> >
> >
> > > In any case, I would think that a basic truth should
be that
> > everything
> > > has a mask with a shape consistent with the data, so
> > > 1. Each complex numbers has just one mask, and setting
> `a.imag` with a
> > > masked array should definitely propagate the mask.
> > > 2. For a masked array with structured dtype, I'd
similarly say
> > that the
> > > default is for a mask to have the same shape as the array.
> But that
> > > something like your collection makes sense for the case
> where one
> > wants
> > > to mask items in a structure.
> >
> > Agreed that we should have a single bool per complex or
structured
> > element, and the mask shape is the same as the array shape.
> That's how I
> > implemented it. But there is still a problem with
complex.imag
> > assignment:
> >
> > >>> a = MaskedArray([1j, 2, X])
> > >>> i = a.imag
> > >>> i[:] = MaskedArray([1, X, 1])
> >
> > If we make the last line copy the mask to the original
array, what
> > should the real part of a[2] be? Conversely, if we don't
copy
> the mask,
> > what should the imag part of a[1] be? It seems like we might
> "want" the
> > masks to be OR'd instead, but then should i[2] be masked
after
> we just
> > set it to 1?
> >
> > Ah, I see the issue now... Easiest to implement and closest
in analogy
> > to a regular view would be to just let it unmask a[2] (with
> whatever is
> > in real; user beware!).
> >
> > Perhaps better would be to special-case such that `imag`
returns a
> > read-only view of the mask. Making `imag` itself read-only would
> prevent
> > possibly reasonable things like `i[np.isclose(i, 0)] = 0` - but
> there is
> > no reason this should update the mask.
> >
> > Still, neither is really satisfactory...
> >
> >
> >
> > > p.s. I started trying to implement the above "Mixin"
class; will
> > try to
> > > clean that up a bit so that at least it uses `where` and
> push it up.
> >
> > I played with "where", but didn't include it since 1.17
is not
> released.
> > To avoid duplication of effort, I've attached a diff of
what I
> tried. I
> > actually get a slight slowdown of about 10% by using
where...
> >
> >
> > Your implementation is indeed quite similar to what I got in
> > __array_ufunc__ (though one should "&" the where with ~mask).
> >
> > I think the main benefit is not to presume that whatever is
underneath
> > understands 0 or 1, i.e., avoid filling.
> >
> >
> > If you make progress with the mixin, a push is welcome. I
> imagine a
> > problem is going to be that np.isscalar doesn't work to
detect
> duck
> > scalars.
> >
> > I fear that in my attempts I've simply decided that only
array scalars
> > exist...
> >
> > -- Marten
> >
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org
<mailto:NumPy-Discussion@python.org>
<mailto:NumPy-Discussion@python.org
<mailto:NumPy-Discussion@python.org>>
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> >
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
<mailto:NumPy-Discussion@python.org>
<mailto:NumPy-Discussion@python.org
<mailto:NumPy-Discussion@python.org>>
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org <mailto:NumPy-Discussion@python.org>
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org <mailto:NumPy-Discussion@python.org>
https://mail.python.org/mailman/listinfo/numpy-discussion
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion