Roland Schulz wrote:
> On Wed, May 13, 2009 at 3:16 AM, Dag Sverre Seljebotn
> <[email protected]> wrote:
>
>> I'm planning on making time this weekend to attend part of the
>> upcoming Sage days, and I definitely want to participate in the
>> coding sprint and get my hands dirty with some Cython development.
>>
>> What I'd like to work on, which would be quite relevant in my
>> research code, is to write an optimization routine that unrolls
>> NumPy array expressions involving slices, array operators like
>> +-/*, and scalar values into a set of nested loops that avoid the
>> use of any temporary array copies. Thus something like:
>>
>>     A[0, a:b] = B[0, a:b]*C[a:2*b:2] + D + n
>>
>> would get turned into some code that checked broadcasting
>> requirements, and then into a loop that ran from a to b over the
>> appropriate elements. Thus we would basically replicate what
>> Blitz++ does with templates, or what Fortran 90 does (so I've
>> heard). I've sketched out an approach to actually coding it up,
>> and it seems doable for a Sage days coding project.
>
> I think this would be very, very useful. But it seems like reinventing
> the wheel to implement the actual vector math. It is not that trivial.
> Just as one example: sometimes intermediate buffers are faster, so you
> need a heuristic to decide that.
>
> Why not use a C++ template library for that? One could use Blitz++,
> but I think Eigen (http://eigen.tuxfamily.org) is newer and
> significantly faster (e.g. it automatically uses SSE). Since
> everything is in include files, no external library requirement is
> added.
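To make the transformation quoted above concrete, here is a plain-Python sketch of the "before" and "after": the vectorized expression (which allocates a temporary for each intermediate result) versus the single fused loop the proposed optimization would emit. The shapes and values are made up for illustration (a = 0, b = 3 is chosen so the slices broadcast); the real generated code would of course be C, with the broadcasting check done at runtime.

```python
import numpy as np

a, b, n = 0, 3, 10
A = np.zeros((2, 8), dtype=np.int64)
B = np.arange(16, dtype=np.int64).reshape(2, 8)
C = np.arange(8, dtype=np.int64)
D = np.array([100, 200, 300], dtype=np.int64)

# Vectorized form: NumPy allocates a temporary for B[0, a:b] * C[a:2*b:2],
# another for "+ D", and another for "+ n", before the final assignment.
A[0, a:b] = B[0, a:b] * C[a:2*b:2] + D + n

# Hand-unrolled form, as the optimization would emit: one fused loop,
# no temporary arrays at all.
A2 = np.zeros((2, 8), dtype=np.int64)
for i in range(b - a):
    A2[0, a + i] = B[0, a + i] * C[a + 2 * i] + D[i] + n

assert (A == A2).all()
```

The point of fusing is that each intermediate in the vectorized form is a full array traversal plus an allocation, while the loop touches each output element exactly once.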
Thanks for pitching in; those are some very interesting thoughts! Here's my immediate reaction:

- We should avoid reinventing the wheel at all costs. Heuristics for
  intermediate buffers in Cython are probably a no-go.
- Still, relying explicitly on one specific C++ library makes me uneasy.
- So perhaps what is needed here is some kind of plugin system. Getting
  the basics working (just unoptimized, "what you would have written in
  C") is more trivial, would get us up and running fast, and would allow
  operation without an external library. Then, given a good protocol for
  it, one would ideally be able to easily add support for other backends
  as well. (These would all be required to produce the same result; they
  would just differ in implementation detail.)

On one hand, one shouldn't over-design; on the other hand, a vector math library is a fairly sophisticated thing, and one could argue that such things should be made swappable if possible.

>> A) There's a clash with existing semantics to be resolved. For
>> instance, if you have a NumPy "matrix" instance, then * does matrix
>> multiplication instead, and just implementing this now would break
>> code.
>>
>> http://wiki.cython.org/enhancements/buffersyntax
>> http://thread.gmane.org/gmane.comp.python.numeric.general/28439
>>
>> contain my thoughts on how this could be resolved. I'll use this
>> syntax below.
>
> I think this could be solved by allowing one to define custom mappings
> of operators/methods of Python types/classes to C functions. With that,
> one could specify that for a NumPy ndarray the element-wise
> multiplication is used, and for a NumPy matrix the matrix
> multiplication is used. And both could be translated into calls to
> Eigen or some other vector library.
>
> This would allow one to directly transform normal NumPy code, which
> tries to avoid using loops, into highly optimized C with SSE, without
> any intermediate buffers.
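For reference, the semantic clash quoted above is easy to demonstrate in plain NumPy (the values here are made up for illustration): the same `*` operator means two different things depending on the runtime type.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
m = np.matrix([[1, 2], [3, 4]])  # ndarray subclass

ew = x * x  # ndarray: element-wise product
mm = m * m  # matrix subclass: matrix product

assert ew.tolist() == [[1, 4], [9, 16]]
assert np.asarray(mm).tolist() == [[7, 10], [15, 22]]
```

So any compile-time decision that `*` on a buffer-typed argument means element-wise multiplication silently changes behaviour when a matrix instance is passed in, which is exactly the compile-time vs. runtime mismatch the thread is wrestling with.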
Well, if you write

    def f(np.ndarray[int] arr):
        print arr * arr

    f(np.matrix(...))  # subclass, so allowed

then the Python semantics of that call would be np.dot, but the Cython version would use np.multiply. The problem is that we really need to know the behaviour at compile time. Should we perhaps disallow subclasses when a buffer is declared?

Another perspective is that it is nice to have a consistent way of working with any Python 3 buffer. This advantage is not obvious yet, as there are very few such buffers, but I think that in time Python might contain more and more of what NumPy does now (i.e. the functions in the math module might become "ufuncs" able to operate on any PEP 3118 buffer, and so on).

But I'm definitely not certain here; more input welcome!

-- 
Dag Sverre

_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
