Travis Oliphant writes:
> This concept has as one use-case, the deferred arrays that Mark Wiebe
> has proposed.

Interesting, I hadn't read about that. In fact, I was playing around with a proxy wrapper for ndarrays not long ago, in order to build a tree of deferred operations that can later be optimized through numexpr once __str__ or __repr__ is called on such a deferred object.

The idea was to have something like:

    a = np.array(...)
    a = defereval(a)   # returns a proxy wrapper for known methods of np.ndarray
    b = 10 + a ** 2
    print(b)           # here the tree of deferred operations is flattened
                       # into a string that numexpr can use

I didn't play much with it, but proxying all methods but __str__ and __repr__ (thus iterating on the original a.__dict__) seemed to suffice.

The benefit I see in building this into ndarray itself is that ndarray would then be the hourglass waist of the framework. Subclassing ndarray is moderately complex right now, so I think that having a way to move some of these subclasses below the hourglass waist, without having to deal with the overloading of ndarray's UI, would be a big step forward towards simpler extension code.

So, having near-zero knowledge of the internals of numpy and of all the new features that have been discussed here, my naive view of what the stack should contain is:

* ndarray subclasses

  Overload indexing (e.g., data_array's named dimension elements), translating any fancy indexing into ndarray's "native" indexing methods.

  Overload user representation (e.g., show some extra info when printing an array).

* ndarray slicing and numeric operations

  A central point for slicing/indexing (the output should be either views or copies).

  A central point to control the deferral of operations (both native ones and extensions; see below). In fact, I see deferred operations as just a form of copy-on-write/evaluate-on-access views (COW must be used when one of the input operands of a deferred tree of operations is modified after it has been captured into such a tree).

* numeric operations extensions

  Numeric operations should be first-class if deferred operation evaluation is to be taken to its full potential, and thus they should be aware of an "operation evaluation engine" (as well as the other way around).

  If they are not (and they should be able not to be), two things can happen:

  - for those based only on first-class operations, it is just the root of a subtree

  - if more complex operations are performed (explicit looping?), they simply diminish the range of possibilities for optimizing operation evaluation (actually producing multiple evaluation trees, or maybe simply forcing evaluation).

* operation evaluation engine

  This would take care of evaluating the operation tree, while performing optimizations on it.

  Fortunately, if a sensible interface is established between this and first-class numeric operations, a first implementation can provide just the naive evaluation, and further optimizations can be provided behind the scenes.

  Such optimizations would provide things like operation tree simplification/reorganization, blocking (a la numexpr) and parallelization of computations.

* storage access extensions

  Slicing in ndarray should be aware of objects represented by means other than "plain strided memory buffers": e.g., the compressed array case (where decompression could be treated with a sliding window), or deferred operation evaluation itself.

In fact, as you pointed out with the MEMORY flag, both storage and operation evaluation can be subject to the common concept of deferral (accessing a compressed array is just another form of accessing computed contents, like accessing elements on a deferred array).

I just hope these are not all just obvious observations of what has already been said.

Lluis

PS: sorry for the unnecessarily long mail

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth
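
For illustration only, the deferred-proxy idea described in the message above could be sketched roughly as follows. This is not the code the author experimented with: Deferred, defereval, to_expr and evaluate are hypothetical names, only a handful of binary operators are proxied, and the "engine" is just the naive recursive evaluation the message mentions as a possible first implementation. The flattened string is meant to resemble the kind of expression numexpr accepts, but numexpr itself is not called here.

```python
import operator

# All names below (Deferred, defereval, to_expr, evaluate) are hypothetical;
# none of them is part of NumPy or numexpr.
_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "**": operator.pow,
}


class Deferred:
    """A node in a tree of deferred operations (leaf or binary operation)."""

    def __init__(self, op=None, args=(), value=None, name=None):
        self.op = op        # operator symbol, or None for a leaf
        self.args = args    # (left, right) child nodes for an operation
        self.value = value  # wrapped operand for a leaf
        self.name = name    # variable name used in the flattened string

    def _binary(self, op, other, reflected=False):
        # Wrap plain operands as leaves, then record the operation
        # instead of computing it.
        left = self
        right = other if isinstance(other, Deferred) else Deferred(value=other)
        if reflected:
            left, right = right, left
        return Deferred(op=op, args=(left, right))

    def __add__(self, o): return self._binary("+", o)
    def __radd__(self, o): return self._binary("+", o, reflected=True)
    def __sub__(self, o): return self._binary("-", o)
    def __rsub__(self, o): return self._binary("-", o, reflected=True)
    def __mul__(self, o): return self._binary("*", o)
    def __rmul__(self, o): return self._binary("*", o, reflected=True)
    def __truediv__(self, o): return self._binary("/", o)
    def __rtruediv__(self, o): return self._binary("/", o, reflected=True)
    def __pow__(self, o): return self._binary("**", o)
    def __rpow__(self, o): return self._binary("**", o, reflected=True)

    def __str__(self):
        # Evaluation is forced only when the object is printed.
        return str(evaluate(self))


def defereval(value, name):
    """Wrap an operand as a named leaf of a deferred tree."""
    return Deferred(value=value, name=name)


def to_expr(node, operands=None):
    """Flatten a tree into (expression string, dict of named operands)."""
    if operands is None:
        operands = {}
    if node.op is None:
        if node.name is None and isinstance(node.value, (int, float)):
            return repr(node.value), operands      # inline scalar constant
        name = node.name or "_c%d" % len(operands)  # auto-name other leaves
        operands[name] = node.value
        return name, operands
    left, _ = to_expr(node.args[0], operands)
    right, _ = to_expr(node.args[1], operands)
    return "(%s %s %s)" % (left, node.op, right), operands


def evaluate(node):
    """Naive engine: walk the tree and apply each operation in turn."""
    if node.op is None:
        return node.value
    left, right = (evaluate(arg) for arg in node.args)
    return _OPS[node.op](left, right)


a = defereval(3.0, "a")
b = 10 + a ** 2          # nothing is computed yet
expr, operands = to_expr(b)
```

Here expr holds the flattened string and operands maps variable names to the wrapped values, so a smarter engine could hand both to an expression compiler instead of calling evaluate. The sketch is exercised above with a plain float; using it with real ndarrays would need extra care so that ndarray's own operators defer to the proxy.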