On Wed, May 13, 2009 at 3:16 AM, Dag Sverre Seljebotn <
[email protected]> wrote:

> > I'm planning on making time this weekend to attend part of the
> > upcoming sage days, and I definitely want to participate in the coding
> > sprint and get my hands dirty with some cython development.
> >
> > What I'd like to work on, which would be quite relevant in my research
> > code, is to write an optimization routine that unrolls numpy array
> > expressions involving slices, array operators like +-/*, and scalar
> > values into a set of nested loops that avoids the use of any temporary
> > array copies.  Thus something like:
> >
> > A[0, a:b] = B[0, a:b]*C[a:2*b:2] + D + n
> >
> > would get turned into some code that checked broadcasting requirements
> > and then into a loop that ran from a to b and looped over the
> > appropriate vertices.   Thus we would basically replicate what blitz++
> > does with templates or fortran 90 does (so I've heard).  I've sketched
> > out an approach to actually coding it up and it seems doable for a
> > sage days coding project.


I think this would be very very useful. But it seems reinventing the wheel
to implement the actual vector math. It is not that trivial. Just as one
exmaple: sometimes intermediate buffers are faster so you need to have a
heuristic to decide that.
Why not using a C++ template library for that? One could use Blitz++ but I
think Eigen (eigen.tuxfamily.org) is newer and significant faster (e.g.
automatically uses SSE). Since everything is in a include files no external
libary requirement is added.


> A) There's a clash with existing semantics to be resolved. For instance
> if you have a NumPy "matrix" instance, then * does matrix multiplication
> instead, and just implementing this now would break code.
>
> http://wiki.cython.org/enhancements/buffersyntax
> http://thread.gmane.org/gmane.comp.python.numeric.general/28439
>
> contains my thoughts on how this could be resolved. I'll use this syntax
> below.


I think this could solved by allowing to define custom mapping of
operators/methods of python types/classes to C functions. With that one
could define that for Numpy ndarray the element wise multiplication is used
and for NumPy matrix the matrix multiplication is used. And both could be
translated into calls of Eigen or some other vector library.

This would allow to directly transform normal NumPy, which tries to avoid
using loops, into highly optimized C with SSE without any intermediate
buffers.

Roland


-- 
ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309
_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev

Reply via email to