On Thu, Jun 30, 2011 at 11:51 AM, Matthew Brett <matthew.br...@gmail.com> wrote:

> Hi,
>
> On Thu, Jun 30, 2011 at 6:46 PM, Lluís <xscr...@gmx.net> wrote:
> > Ok, I think it's time to step back and reformulate the problem by
> > completely ignoring the implementation.
> >
> > Here we have 2 "generic" concepts (i.e., applicable to R), plus another
> > extra concept that is exclusive to numpy:
> >
> > * Assigning np.NA to an array cannot be undone unless through explicit
> >   assignment (i.e., assigning a new arbitrary value, or saving a copy of
> >   the original array before assigning np.NA).
> >
> > * np.NA values propagate by default, unless ufuncs have the "skipna =
> >   True" argument (or the other way around, it doesn't really matter to
> >   this discussion). In order to avoid passing the argument on each
> >   ufunc, we either have some per-array variable for the default "skipna"
> >   value (undesirable) or we can make a trivial ndarray subclass that
> >   will set the "skipna" argument on all ufuncs through the
> >   "_ufunc_wrapper_" mechanism.
> >
> >
> > Now, numpy has the concept of views, which adds some more goodies to the
> > list of concepts:
> >
> > * With views, two arrays can share the same physical data, so that
> >   assignments to any of them will be seen by the others (including NA
> >   values).
> >
> > The creation of a view is explicitly stated by the user, so its
> > behaviour should not be perceived as odd (after all, you asked for a
> > view).
> >
> > The good thing is that with views you can avoid costly array copies if
> > you're careful when writing into these views.
> >
> >
> > Now, you can add a new concept: local/temporal/transient missing data.
> >
> > We can take an existing array and create a view with the new argument
> > "transientna = True".
> >
> > Here, both the view and the "transientna = True" are explicitly stated
> > by the user, so it is assumed that she already knows what this is all
> > about.
> >
> > The difference with a regular view is that you also explicitly asked for
> > local/temporal/transient NA values.
> >
> > * Assigning np.NA to an array view with "transientna = True" will
> >   *not* be seen by any of the other views (nor the "original" array),
> >   but anything else will still work "as usual".
> >
> > After all, this is what *you* asked for when using the "transientna =
> > True" argument.
> >
> >
> > To conclude, say that others *must not* care about whether the arrays
> > they're working with have transient NA values. This way, I can create a
> > view with transient NAs, set to NA some uninteresting data, and pass it
> > to a routine written by someone else that sets to NA elements that, for
> > example, are beyond a certain threshold from the mean of the elements.
> >
> > This would be equivalent to storing a copy of the original array before
> > passing it to this 3rd party function, only that "transientna", just as
> > views, provides some handy shortcuts to avoid copies.
> >
> >
> > My main point here is that views and local/temporal/transient NAs are
> > all *explicitly* requested, so that their behaviour should not appear as
> > something unexpected.
> >
> > Is there an agreement on this?
>
> Absolutely, if by 'transientna' you mean 'masked'. The discussion is
> whether the NA API should be the same as the masking API. The thing
> you are describing is what masking is for, and what it's always been
> for, as far as I can see. We're arguing that to call this
> 'transientna' instead of 'masked' confuses two concepts that are
> different, to no good purpose.
>

It's a hammer. If you want to hammer nails, fine; if you want to hammer a
bit of tubing flat, fine. It's a tool, the hammer concept if you will.

Chuck
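
For anyone following along, here is a minimal sketch of the behaviour being debated, written against the existing numpy.ma module rather than any proposed NA API. The "transientna = True" argument from the quoted message is only a proposal and does not exist in NumPy; the masked array below merely stands in for it, on the assumption that "transient NA" and "masked" describe the same behaviour.

```python
import numpy as np

# The "original" array; the last value is an outlier we want to ignore.
a = np.array([1.0, 2.0, 3.0, 100.0])

# A plain view shares the same physical data with `a`.
v = a.view()
v[0] = 10.0            # this write is visible through `a` as well

# A masked array built without copying shares the data but owns its mask.
m = np.ma.masked_array(a, copy=False)
m[3] = np.ma.masked    # hides the outlier in `m` only; `a` keeps the value
m[1] = 20.0            # an ordinary write still goes through to `a`

mean_all = a.mean()      # uses every element, outlier included
mean_masked = m.mean()   # skips the masked element
```

Here `a[3]` still holds 100.0 after the masking, while `m.mean()` ignores it, which is roughly the "not seen by the original, everything else as usual" behaviour described in the quoted message, obtained without copying the data.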
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion