On Sat, May 12, 2012 at 3:55 PM, Dag Sverre Seljebotn < d.s.seljeb...@astro.uio.no> wrote:
> On 05/11/2012 03:37 PM, mark florisson wrote: > > On 11 May 2012 12:13, Dag Sverre Seljebotn<d.s.seljeb...@astro.uio.no> > wrote: > >> (NumPy devs: I know, I get too many ideas. But this time I *really* > believe > >> in it, I think this is going to be *huge*. And if Mark F. likes it it's > not > >> going to be without manpower; and as his mentor I'd pitch in too here > and > >> there.) > >> > >> (Mark F.: I believe this is *very* relevant to your GSoC. I certainly > don't > >> want to micro-manage your GSoC, just have your take.) > >> > >> Travis, thank you very much for those good words in the "NA-mask > >> interactions..." thread. It put most of my concerns away. If anybody is > >> leaning towards for opaqueness because of its OOP purity, I want to > refer to > >> C++ and its walled-garden of ideological purity -- it has, what, 3-4 > >> different OOP array libraries, neither of which is able to out-compete > the > >> other. Meanwhile the rest of the world happily cooperates using > pointers, > >> strides, CSR and CSC. > >> > >> Now, there are limits to what you can do with strides and pointers. > Noone's > >> denying the need for more. In my mind that's an API where you can do > >> fetch_block and put_block of cache-sized, N-dimensional blocks on an > array; > >> but it might be something slightly different. > >> > >> Here's what I'm asking: DO NOT simply keep extending ndarray and the > NumPy C > >> API to deal with this issue. > >> > >> What we need is duck-typing/polymorphism at the C level. If you keep > >> extending ndarray and the NumPy C API, what we'll have is a one-to-many > >> relationship: One provider of array technology, multiple consumers (with > >> hooks, I'm sure, but all implementations of the hook concept in the > NumPy > >> world I've seen so far are a total disaster!). > >> > >> What I think we need instead is something like PEP 3118 for the > "abstract" > >> array that is only available block-wise with getters and setters. On the > >> Cython list we've decided that what we want for CEP 1000 (for boxing > >> callbacks etc.) is to extend PyTypeObject with our own fields; we could > >> create CEP 1001 to solve this issue and make any Python object an > exporter > >> of "block-getter/setter-arrays" (better name needed). > >> > >> What would be exported is (of course) a simple vtable: > >> > >> typedef struct { > >> int (*get_block)(void *ctx, ssize_t *upper_left, ssize_t > *lower_right, > >> ...); > >> ... > >> } block_getter_setter_array_vtable; > >> > >> Let's please discuss the details *after* the fundamentals. But the > reason I > >> put void* there instead of PyObject* is that I hope this could be used > >> beyond the Python world (say, Python<->Julia); the void* would be > handed to > >> you at the time you receive the vtable (however we handle that). > > > > I suppose it would also be useful to have some way of predicting the > > output format polymorphically for the caller. E.g. dense * > > block_diagonal results in block diagonal, but dense + block_diagonal > > results in dense, etc. It might be useful for the caller to know > > whether it needs to allocate a sparse, dense or block-structured > > array. Or maybe the polymorphic function could even do the allocation. > > This needs to happen recursively of course, to avoid intermediate > > temporaries. The compiler could easily handle that, and so could numpy > > when it gets lazy evaluation. > > Ah. But that depends too on the computation to be performed too; a) > elementwise, b) axis-wise reductions, c) linear algebra... > > In my oomatrix code (please don't look at it, it's shameful) I do this > using multiple dispatch. > > I'd rather ignore this for as long as we can, only implementing "a[:] = > ..." -- I can't see how decisions here would trickle down to the API > that's used in the kernel, it's more like a pre-phase, and better > treated orthogonally. > > > I think if the heavy lifting of allocating output arrays and exporting > > these arrays work in numpy, then support in Cython could use that (I > > can already hear certain people object to more complicated array stuff > > in Cython :). Even better here would be an external project that each > > our projects could use (I still think the nditer sorting functionality > > of arrays should be numpy-agnostic and externally available). > > I agree with the separate project idea. It's trivial for NumPy to > incorporate that as one of its methods for exporting arrays, and I don't > think it makes sense to either build it into Cython, or outright depend > on NumPy. > > Here's what I'd like (working title: NumBridge?). > > - Mission: Be the "double* + shape + strides" in a world where that is > no longer enough, by providing tight, focused APIs/ABIs that are usable > across C/Fortran/Python. > > I basically want something I can quickly acquire from a NumPy array, > then pass it into my C code without dragging along all the cruft that I > don't need. > > - Written in pure C + specs, usable without Python > > - PEP 3118 "done right", basically semi-standardize the internal > Cython memoryview ABI and get something that's passable on stack > > - Get block get/put API > > - Iterator APIs > > - Utility code for exporters and clients (iteration code, axis > reordering, etc.) > > Is the scope of that insane, or is it at least worth a shot to see how > bad it is? Beyond figuring out a small subset that can be done first, > and whether performance considerations must be taken or not, there's two > complicating factors: Pluggable dtypes, memory management. Perhaps you > could come to Oslo for a couple of days to brainstorm... > > There have been musings on this list along those lines with the idea that numpy/ufuncs would be built on top of that base, so it isn't crazy ;) Perhaps it is time to take a more serious look at it. Especially if there is help to get it implemented and made available through tools such as Cython. Chuck
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion