On Wed, May 13, 2009 at 2:16 AM, Dag Sverre Seljebotn
<[email protected]> wrote:
> Hoyt Koepke wrote:
>> Hello Robert & Dag,
>>
>> I'm sending this to you guys first; feel free to bump it to the whole
>> list if you think it's appropriate.
>
> Doing so; I think at least it should be available to Kurt as well.

Thanks -- glad to see more interest in improving numerics in Cython.
I don't have much to comment on other than a small pot-shot below :-)

>
>> I'm planning on making time this weekend to attend part of the
>> upcoming sage days, and I definitely want to participate in the coding
>> sprint and get my hands dirty with some cython development.
>>
>> What I'd like to work on, which would be quite relevant in my research
>> code, is to write an optimization routine that unrolls numpy array
>> expressions involving slices, array operators like +-/*, and scalar
>> values into a set of nested loops that avoids the use of any temporary
>> array copies.  Thus something like:
>>
>> A[0, a:b] = B[0, a:b]*C[a:2*b:2] + D + n
>>
>> would get turned into some code that checked broadcasting requirements
>> and then into a loop that ran from a to b and looped over the
>> appropriate vertices.   Thus we would basically replicate what blitz++
>> does with templates or fortran 90 does (so I've heard).  I've sketched
>> out an approach to actually coding it up and it seems doable for a
>> sage days coding project.
>
> Great that you want to do this :-)
>
> I think doing it for a Sage days project is a bit ambitious, but you
> might be able to get a good start on it. It also depends on what route
> is taken, see below.
>
>> I guess how I see this coming out then is that I'd create a routine
>> that takes the cython AST for this expression and a few configuration
>> parameters and returns the code for an inline function that accepts as
>> parameters all the needed buffer parameters (data ptr, strides, shape,
>> and slice parameters if there are some needed (e.g. a, b above),
>> constants, etc. ),  and does the appropriate looping and copying.
>> Part of the reason I want to do the inline function approach is that I
>> want this code to be minimally wrapped up in cython's internals, as a
>> tested routine to generate such code might be more generally
>> applicable.  Though if there's a stronger argument in favor of the
>> other way, that's okay too.
>
> I have given this a lot of thought as well. My thoughts diverge from
> yours in that I want to improve Cython numerics in lot of areas, and
> that the above would come more as a natural consequence than the first
> step. What I'll scetch below is thus a more "patient" route.
>
> My experience is that I'm not sure how good results one can get if one
> is set from the beginning on getting tangible results during a
> conference. A concrete goal like the above seems great for things like a
> GSoC going over one or two months, but perhaps not a week -- the danger
> is that one skips important parts of the design etc. just in order to
> get some example working, but in the end nothing but the example
> works.... (I hope I don't kill your enthusiasm off!, I just want the
> perspective noted :-) ).
>
> Also I should not that I don't have much time the rest of this week to
> be involved in designing this.
>
> Anyway, here's my thoughts. It's pretty much designed to give the
> feeling of "oh no not all this, I just want to do something simple and
> get this done" -- that's OK, just read it and write your rejoinder :-)
>
> A) There's a clash with existing semantics to be resolved. For instance
> if you have a NumPy "matrix" instance, then * does matrix multiplication
> instead, and just implementing this now would break code.
>
> http://wiki.cython.org/enhancements/buffersyntax
> http://thread.gmane.org/gmane.comp.python.numeric.general/28439
>
> contains my thoughts on how this could be resolved. I'll use this syntax
> below.
>
> B) I think the feature could be broken down in smaller parts:
>
> 1) Efficient slicing (#178):
>
> cdef int[:,:] a
> cdef int[:] b = a[3,:]
>
> perhaps also broadcasting?
>
> cdef int[:,:,:] c = a # ?
>
> 2) Efficient ndenumerate for-loop construct. This is the "engine" of
> what you want to achieve.
>
> from cython import ndenumerate
> cdef int[:,:,:] a = ...
> cdef int[:,:,:,:] b = ...
>
> # a[...] = 2 * b[3,...]
> for idx, value in ndenumerate(a, b[3,...]):
>     a[idx] = 2 * b[3, idx]
>
> Or, perhaps
>
> for a_item, b_item in ndenumerate(a, b[3,...]):
>     a_item.set(2*b_item.get())
>
> or similar. The code making sure that the dimension with the smallest
> strides is iterated innermost would be generated here (likely one would
> have a while loop around a for loop; no matter how many dimensions there
> are; see also NumPy's ND-generic iterators... we could also allow this
> to work for "cdef int[...] a", i.e. no declared number of dimensions).
>
> 3) Transform "a[...] = 2 * b[3, ...]" into the construct in 2). This is
> easily done using a transform once 2) is done.
>
> 3) Ufuncs. I was imagining it like this:
>
> cdef int multiply(int x, int y): return x * y
>
> ...
> cdef int[:,:] a = ..., b = ..
> a[...] = multiply(a, b) # Calls for each item!
>
> So basically "every function with primitive arguments is an ufunc".
> (Fortran has a similar concept, but you need to declare that the
> function has no side-effects -- I say let's just make N invocations part
> of the spec of how this works.)
>
> Obviously this is very easily implemented once 2) is in place.

The above strikes me as a bit too much magic under the hood -- perhaps
I can get used to it.  My initial preference would be to have some
indication that this is a 'ufuncable' function, or is intended to be
so; Cython could then check to be sure the dummy arguments & return
type are primitive.  (Although I reckon this check would be done
during compilation once someone tries to pass a numpy array to a cdef
function with primitive-type arguments.)

Kurt
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev

Reply via email to