On Thu, Jun 23, 2011 at 8:00 PM, Pierre GM <pgmdevl...@gmail.com> wrote:
>
> On Jun 24, 2011, at 2:42 AM, Mark Wiebe wrote:
>
> > On Thu, Jun 23, 2011 at 7:28 PM, Pierre GM <pgmdevl...@gmail.com> wrote:
> > Sorry y'all, I'm just commenting bits by bits:
> >
> > "One key problem is a lack of orthogonality with other features, for
> > instance creating a masked array with physical quantities can't be done
> > because both are separate subclasses of ndarray. The only reasonable way
> > to deal with this is to move the mask into the core ndarray."
> >
> > Meh. I did try to make it easy to use masked arrays on top of
> > subclasses. There's even some tests in the suite to that effect
> > (test_subclassing). I'm not buying the argument.
> > About moving mask in the core ndarray: I had suggested back in the days
> > to have a mask flag/property built-in ndarrays (which would *really*
> > have simplified the game), but this suggestion was dismissed very
> > quickly as adding too much overload. I had to agree. I'm just a tad
> > surprised the wind has changed on that matter.
> >
> > Ok, I'll have to change that section then. :)
> >
> > I don't remember seeing mention of this ability in the documentation,
> > but I may not have been reading closely enough for that part.
>
> Or played with it ;)

True, I haven't played with it all that much, but the amount I've used it
and the amount I've wrestled with it during 1.6 development certainly make
me feel I know something about it. ;)

> > "In the current masked array, calculations are done for the whole
> > array, then masks are patched up afterwords. This means that invalid
> > calculations sitting in masked elements can raise warnings or
> > exceptions even though they shouldn't, so the ufunc error handling
> > mechanism can't be relied on."
> >
> > Well, there's a reason for that. Initially, I tried to guess what the
> > mask of the output should be from the mask of the inputs, the objective
> > being to avoid getting NaNs in the C array. That was easy in most cases,
> > but it turned out it wasn't always possible (the `power` one caused me a
> > lot of issues, if I recall correctly). So, for performance issues (to
> > avoid a lot of expensive tests), I fell back on the old concept of
> > "compute them all, they'll be sorted afterwards".
> > Of course, that's rather clumsy an approach. But it works not too badly
> > when in pure Python. No doubt that a proper C implementation would work
> > faster.
> > Oh, about using NaNs for invalid data ? Well, can't work with integers.
> >
> > In my proposal, NaNs stay as unmasked NaN values, instead of turning
> > into masked values. This is necessary for uniform treatment of all
> > dtypes, but a subclass could override this behavior with an extra mask
> > modification after arithmetic operations.
>
> No problem with that...
>
> > `mask` property:
> > Nothing to add to it. It's basically what we have now (except for the
> > opposite convention).
> >
> > Working with masked values:
> > I recall some strong points back in the days for not using None to
> > represent missing values...
> > Adding a maskedstr argument to array2string ? Mmh... I prefer a global
> > flag like we have now.
> >
> > I'm not really a fan of all the global state that NumPy keeps, I guess
> > I'm trying to stamp that out bit by bit as well where I can...
>
> Pretty convenient to define a default once for all, though.
>

Maybe it needs to go in both places.

> > Design questions:
> > Adding `masked` or whatever we call it to a number/array should result
> > is masked/a fully masked array, period. That way, we can have an idea
> > that something was wrong with the initial dataset.
> >
> > I'm not sure I understand what you mean, in the design adding a mask
> > means setting "a.mask = True", "a.mask = False", or "a.mask = <boolean
> > array>" in general.
>
> I mean that:
> 0 + ma.masked = ma.masked
> ma.array([1,2,3], mask=False) + ma.masked = ma.array([1,2,3], mask=[True,True,True])
>
> By extension, any operation involving a masked value should result in a
> masked value.
>

R appears to consistently follow the model Nathaniel pointed out, and
adopting the same one seems like a good idea to me. With the model in
place, the desired result of these operations follows fairly naturally.

> > hardmask: I never used the feature myself. I wonder if anyone did.
> > Still, it's a nice idea...
> >
> > Ok, I'll leave that out of the initial design unless someone comes up
> > with some strong use cases.
>
> Oh, it doesn't eat bread (as we say in French), so you can leave it where
> it is...
>

Yeah, numpy.ma isn't going to disappear in a puff of smoke.

-Mark

> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>