On Wed, May 13, 2009 at 2:16 AM, Dag Sverre Seljebotn <[email protected]> wrote: > Hoyt Koepke wrote: >> Hello Robert & Dag, >> >> I'm sending this to you guys first; feel free to bump it to the whole >> list if you think it's appropriate. > > Doing so; I think at least it should be available to Kurt as well.
Thanks -- glad to see more interest in improving numerics in Cython. I don't have much to comment on other than a small pot-shot below :-) > >> I'm planning on making time this weekend to attend part of the >> upcoming sage days, and I definitely want to participate in the coding >> sprint and get my hands dirty with some cython development. >> >> What I'd like to work on, which would be quite relevant in my research >> code, is to write an optimization routine that unrolls numpy array >> expressions involving slices, array operators like +-/*, and scalar >> values into a set of nested loops that avoids the use of any temporary >> array copies. Thus something like: >> >> A[0, a:b] = B[0, a:b]*C[a:2*b:2] + D + n >> >> would get turned into some code that checked broadcasting requirements >> and then into a loop that ran from a to b and looped over the >> appropriate vertices. Thus we would basically replicate what blitz++ >> does with templates or fortran 90 does (so I've heard). I've sketched >> out an approach to actually coding it up and it seems doable for a >> sage days coding project. > > Great that you want to do this :-) > > I think doing it for a Sage days project is a bit ambitious, but you > might be able to get a good start on it. It also depends on what route > is taken, see below. > >> I guess how I see this coming out then is that I'd create a routine >> that takes the cython AST for this expression and a few configuration >> parameters and returns the code for an inline function that accepts as >> parameters all the needed buffer parameters (data ptr, strides, shape, >> and slice parameters if there are some needed (e.g. a, b above), >> constants, etc. ), and does the appropriate looping and copying. >> Part of the reason I want to do the inline function approach is that I >> want this code to be minimally wrapped up in cython's internals, as a >> tested routine to generate such code might be more generally >> applicable. Though if there's a stronger argument in favor of the >> other way, that's okay too. > > I have given this a lot of thought as well. My thoughts diverge from > yours in that I want to improve Cython numerics in lot of areas, and > that the above would come more as a natural consequence than the first > step. What I'll scetch below is thus a more "patient" route. > > My experience is that I'm not sure how good results one can get if one > is set from the beginning on getting tangible results during a > conference. A concrete goal like the above seems great for things like a > GSoC going over one or two months, but perhaps not a week -- the danger > is that one skips important parts of the design etc. just in order to > get some example working, but in the end nothing but the example > works.... (I hope I don't kill your enthusiasm off!, I just want the > perspective noted :-) ). > > Also I should not that I don't have much time the rest of this week to > be involved in designing this. > > Anyway, here's my thoughts. It's pretty much designed to give the > feeling of "oh no not all this, I just want to do something simple and > get this done" -- that's OK, just read it and write your rejoinder :-) > > A) There's a clash with existing semantics to be resolved. For instance > if you have a NumPy "matrix" instance, then * does matrix multiplication > instead, and just implementing this now would break code. > > http://wiki.cython.org/enhancements/buffersyntax > http://thread.gmane.org/gmane.comp.python.numeric.general/28439 > > contains my thoughts on how this could be resolved. I'll use this syntax > below. > > B) I think the feature could be broken down in smaller parts: > > 1) Efficient slicing (#178): > > cdef int[:,:] a > cdef int[:] b = a[3,:] > > perhaps also broadcasting? > > cdef int[:,:,:] c = a # ? > > 2) Efficient ndenumerate for-loop construct. This is the "engine" of > what you want to achieve. > > from cython import ndenumerate > cdef int[:,:,:] a = ... > cdef int[:,:,:,:] b = ... > > # a[...] = 2 * b[3,...] > for idx, value in ndenumerate(a, b[3,...]): > a[idx] = 2 * b[3, idx] > > Or, perhaps > > for a_item, b_item in ndenumerate(a, b[3,...]): > a_item.set(2*b_item.get()) > > or similar. The code making sure that the dimension with the smallest > strides is iterated innermost would be generated here (likely one would > have a while loop around a for loop; no matter how many dimensions there > are; see also NumPy's ND-generic iterators... we could also allow this > to work for "cdef int[...] a", i.e. no declared number of dimensions). > > 3) Transform "a[...] = 2 * b[3, ...]" into the construct in 2). This is > easily done using a transform once 2) is done. > > 3) Ufuncs. I was imagining it like this: > > cdef int multiply(int x, int y): return x * y > > ... > cdef int[:,:] a = ..., b = .. > a[...] = multiply(a, b) # Calls for each item! > > So basically "every function with primitive arguments is an ufunc". > (Fortran has a similar concept, but you need to declare that the > function has no side-effects -- I say let's just make N invocations part > of the spec of how this works.) > > Obviously this is very easily implemented once 2) is in place. The above strikes me as a bit too much magic under the hood -- perhaps I can get used to it. My initial preference would be to have some indication that this is a 'ufuncable' function, or is intended to be so; Cython could then check to be sure the dummy arguments & return type are primitive. (Although I reckon this check would be done during compilation once someone tries to pass a numpy array to a cdef function with primitive-type arguments.) Kurt _______________________________________________ Cython-dev mailing list [email protected] http://codespeak.net/mailman/listinfo/cython-dev
