On Fri, Oct 28, 2011 at 5:05 PM, Chris.Barker <chris.bar...@noaa.gov> wrote:
> On 10/28/11 11:37 AM, Matthew Brett wrote:
> > The main motivation for the alterNEP was our strong feeling that
> > separating ABSENT and IGNORE was easier to comprehend and cleaner.
>
> I don't know about easier to comprehend, or cleaner, but it is more
> feature-full.
>
> I see two issues here:
>
> 1) being able to distinguish between "ignore" and "not valid"
> -- and being able to stop ignoring an ignored value.
>
> This could quite easily be accomplished with a mask approach -- indeed
> with 8 bits, you could have 8 different possible masked states (not that
> I'm suggesting that, at least not in core numpy.)
>
> However, with a bit-pattern approach, you simply can't implement
> "ignore". Once it's been set, the previous value is lost.
>
> 2) data size: A full mask takes extra space, sometimes a substantial
> amount -- so a bit-pattern approach would be nice.
>
> I like the idea (that I think Mark attempted to implement) that the
> implementation should be hidden from the user - not necessarily entirely
> hidden, but subtle enough that the casual user wouldn't need to care
> about it.
>

I believe the main reason it is hidden from the user is so that the
implementation can be changed without impacting existing applications.
What I would like to see at this point is folks trying out the software
and asking questions on the list like: "I want to do A and tried B,
which didn't work. Any suggestions?" In short, I want people to actually
use the software to see what issues arise so that we can fix things up.

Memory use is a known problem. One way to start addressing it might be
to implement a "bit" array type. It might even be possible to prototype
that on top of the existing types.
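
As a rough illustration of prototyping bit storage on top of the existing
types, here is a minimal sketch that packs a boolean mask into uint8
storage with np.packbits. The example data and the get_bit helper are
assumptions made up for illustration; this is not how the NA mask is
implemented.

```python
import numpy as np

# Hypothetical example data: True marks a masked element.
mask = np.array([False, True, False, False, True,
                 False, False, False, True, False])

# A plain boolean array spends one byte per element.
full_bytes = mask.nbytes

# Packed storage spends one bit per element, rounded up to whole bytes.
packed = np.packbits(mask)
packed_bytes = packed.nbytes

# Unpacking needs the original length, because the last byte is
# zero-padded.
restored = np.unpackbits(packed)[:mask.size].astype(bool)


def get_bit(packed, i):
    """Read flag i from packed storage (hypothetical helper).

    np.packbits defaults to big-endian bit order, so flag 0 is the
    most significant bit of byte 0.
    """
    return bool((packed[i >> 3] >> (7 - (i & 7))) & 1)
```

For these ten flags the packed form needs two bytes instead of ten, at
the cost of shifting and masking on every element access.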
Views make bit arrays a bit more interesting ;)

> In that case, I think if we could decide that we want both "ignore" and
> "not valid" (and it seems there is a fair bit of interest in that), then
> we can proceed with a mask-based approach, and develop an API that makes
> as little reference to the mask as possible.
>
> Then a bit-pattern approach could be developed that uses the same API --
> it would not have the "ignore" option at all, but would be the same for
> the "not valid" option.
>
> When I write this it seems entirely too complicated for both the
> developers and users, but maybe it's not -- it could be analogous to
> what we have now: arrays can be Fortran or C ordered, contiguous or not,
> be views on other arrays or not. To really make numpy dance, you need to
> understand all that, but you can also do a whole lot, and write a lot of
> generic code, without digging into that.
>
> If we do all that, maybe there could be a sparse mask implementation,
> etc. as well.
>

Chuck
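
To make the mask versus bit-pattern distinction in the quoted text
concrete, here is a minimal sketch in plain NumPy. The flag names IGNORE
and NOT_VALID, the one-byte flag layout, and the use of NaN as the
reserved bit pattern are all assumptions for illustration, not the API
under discussion.

```python
import numpy as np

# Hypothetical flag bits in a one-byte-per-element mask. A byte has room
# for eight independent flags like these.
IGNORE = np.uint8(0x01)
NOT_VALID = np.uint8(0x02)

values = np.array([1.0, 2.0, 3.0, 4.0])
flags = np.zeros(values.shape, dtype=np.uint8)

# Mask approach: the payload is left alone, only the flags change.
flags[1] |= IGNORE
flags[3] |= NOT_VALID
sum_ignoring = values[flags == 0].sum()   # skips elements 1 and 3

# "Stop ignoring" is just clearing the bit; the old value is still there.
flags[1] &= ~IGNORE
sum_restored = values[flags == 0].sum()   # skips only element 3

# Bit-pattern approach, with NaN standing in for the reserved pattern:
# marking an element overwrites its payload.
pattern = values.copy()
pattern[1] = np.nan
pattern[3] = np.nan
pattern_sum = np.nansum(pattern)          # skips elements 1 and 3

# There is no flag to clear here. The original 2.0 is gone, and the two
# marked elements cannot be told apart.
```

The mask version costs one extra byte per element but keeps the payload,
which is what allows an ignored value to be un-ignored. The bit-pattern
version costs no extra space but can only express "not valid".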
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion