On Sat, Apr 28, 2012 at 10:58 AM, Neal Becker <ndbeck...@gmail.com> wrote:
> Nathaniel Smith wrote:
>
> > On Sat, Apr 28, 2012 at 7:38 AM, Richard Hattersley
> > <rhatters...@gmail.com> wrote:
> >> So, assuming numpy.ndarray became a strict subclass of some new masked
> >> array, it looks plausible that adding just a few checks to numpy.ndarray
> >> to exclude the masked superclass would prevent much downstream code from
> >> accidentally operating on masked arrays.
> >
> > I think the main point I was trying to make is that it's the existence
> > and content of these checks that matters. They don't necessarily have
> > any relation at all to which thing Python calls a "superclass" or a
> > "subclass".
> >
> > -- Nathaniel
>
> I don't agree with the argument that ma should be a superclass of ndarray.
> It is ma that is adding features. That makes it a subclass. We're not
> talking mathematics here.
>

It isn't a subclass either. In a true subclass, anything that worked on the
base class would work equally well on a subclass *without modification*.
Basically, it's an independent class with special functions that can handle
combinations and ufuncs. Look at all the functions exported in
numpy/ma/core.py. Inheritance really isn't a concept appropriate to this
case. Pretty much all the functions are rewritten for masked arrays. That is
one reason maintenance is a hassle: lots of things have to be maintained in
two places.

> There is a well-known disease of OOP where everything seems to bubble up to
> the top of the class hierarchy - so that the base class becomes bloated to
> support every feature needed by subclasses. I believe that's considered
> poor design.
>
> Is there a way to support ma as a subclass of ndarray, without introducing
> overhead into ndarray? Without having given this much real thought, I do
> have some idea. What are the operations that we need on arrays? The most
> basic are:
>
> 1. element access
> 2. get size (shape)
>
> In an OO design, these would be virtual functions (or in C, pointers to
> functions). But this would introduce unacceptable overhead.
>

Sure, and you would still have two different versions of almost every
function.

> In a generic programming design (C++ templates), we would essentially
> generate 2 copies of every function, one that operates on plain arrays, and
> one that operates on masked arrays, each using the appropriate function for
> element access, shape, etc. This way, no unneeded overhead is introduced
> (although the code size is increased - but this is probably of little
> consequence on a modern demand-paged OS).
>
> Following this approach, ma and ndarray don't have to have any inheritance
> relation. OTOH, inheritance is probably useful since there are many common
> features to ma and ndarray, and a lot of code could be shared.
>

Not many common behaviours. Analogous behaviours, perhaps. And since
everything ends up written twice, the best way to share code is to do it in
the base class.

Chuck
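
P.S. To make the "without modification" point concrete, here is a minimal
sketch in plain Python. It is only an illustration under stated assumptions:
PlainArray, MaskedArray, total and masked_total are hypothetical names made
up for this example, not NumPy APIs, and the real numpy.ma code is far more
involved.

```python
class PlainArray:
    """Hypothetical stand-in for a plain array: just a list of values."""

    def __init__(self, values):
        self.values = list(values)


class MaskedArray(PlainArray):
    """Hypothetical masked array, declared as a subclass of PlainArray."""

    def __init__(self, values, mask):
        super().__init__(values)
        self.mask = list(mask)  # True marks an element to be ignored


def total(a):
    # Written against the base class: it knows nothing about masks.
    return sum(a.values)


def masked_total(a):
    # The same operation, rewritten so that masked elements are skipped.
    return sum(v for v, m in zip(a.values, a.mask) if not m)


m = MaskedArray([1, 2, 1000], mask=[False, False, True])
naive = total(m)           # runs without error, but counts the masked 1000
correct = masked_total(m)  # skips the masked element
print(naive, correct)
```

The base-class function accepts the subclass instance and still gives an
answer that ignores the mask, so the operation ends up written twice.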
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion