On Aug 28, 2009, at 12:45 AM, Mattia Ziulu wrote:
> Hello everybody!
> I present you this little problem that is slowing down a project of
> mine
> (I have actually already found a way to bypass it, but it is kinda
> hack-ish).
> So, I'm wrapping a C library using Cython, in order to expose its
> functionalities to Python via a simple from [module] import *, and
> until now
> everything is looking pretty good. The underlying library is a
> simple matrix
> library, declared as
>
> typedef struct MatrixFloat{
>     int rows;
>     int columns;
>     float *data;
> } MatrixFloat;
>
> As you probably can see, the main headache is the actual data
> access. In C,
> if I have a MatrixFloat instance named foo, I can access its data via
> conventional offset-based indexing, so e.g. foo.data[0][0] would
> actually
> be foo.data[0 * columns + 0] and so on and so forth.
> Now, let's say that I don't want this for my Python interface, but
> I'd like
> instead to do something like the conventional foo.data[0][0]
> (because of the
> goodies that Python provides, such as slicing and row assignment
> and the likes).
> After some thought and research (I am new to both Cython and
> Python) I found out
> that the common (and perhaps only?) way to expose the data
> contained in the
> wrapped struct is to use properties, and so I did something like this:
>
> cdef class MatrixFloat:
>     cdef cMatrixFloat.MatrixFloat *ptr
>
>     def __init__( self ):
>         self.ptr = cMatrixFloat.MatrixFloat_new()
>
>     property rows:
>         def __get__(self):
>             return self.ptr.rows
>
>         def __set__(self, n):
>             self.ptr.rows = n
>
>     property columns:
>         def __get__(self):
>             return self.ptr.columns
>
>         def __set__(self, n):
>             self.ptr.columns = n
>
>     property data:
>         def __get__(self):
>             a = []
>             for i in xrange( self.rows*self.columns ):
>                 a.append( self.ptr.data[i] )
>             data = [ [ None for _ in range(self.columns) ] for _ in range(self.rows) ]
>             for i in xrange(self.columns):
>                 if i < self.rows:
>                     data[i] = a[ i*self.columns : i*self.columns+self.columns ]
>             return data
First, implement __getitem__ and __setitem__, so you can do
foo[1,5] = 17. Second, how about

    property data:
        def __get__(self):
            cdef int i, j
            return [ [ self.ptr.data[i + self.columns*j] for i in range(self.columns) ]
                     for j in range(self.rows) ]

which should at least be faster (though you're still doing a lot of
float -> Python object conversions).
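
To make the first suggestion concrete, here is a minimal pure-Python
sketch of tuple indexing over a flat row-major buffer. It is only a
stand-in: FlatMatrix, _offset and tolist are hypothetical names, and
array('f') plays the role of the C float *data. In the real cdef class
the same two methods would read and write self.ptr.data directly.

```python
from array import array


class FlatMatrix:
    """Hypothetical pure-Python stand-in for the wrapped MatrixFloat."""

    def __init__(self, rows, columns):
        self.rows = rows
        self.columns = columns
        # 'f' = C float; plays the role of the float *data buffer.
        self._data = array("f", [0.0] * (rows * columns))

    def _offset(self, key):
        # foo[r, c] arrives here as the tuple (r, c).
        r, c = key
        if not (0 <= r < self.rows and 0 <= c < self.columns):
            raise IndexError("matrix index out of range")
        return r * self.columns + c  # row-major offset

    def __getitem__(self, key):
        return self._data[self._offset(key)]

    def __setitem__(self, key, value):
        self._data[self._offset(key)] = value

    def tolist(self):
        # Same nested-list shape as the `data` property above.
        n = self.columns
        return [self._data[r * n:(r + 1) * n].tolist() for r in range(self.rows)]


foo = FlatMatrix(3, 6)
foo[1, 5] = 17
print(foo[1, 5])
```

Element access this way touches a single float, so nothing is copied
until somebody explicitly asks for the whole matrix as a list.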
>
>
> [... plus other methods like MatrixFloat_new() used above,
> MatrixFloat_init() to initialize
> the fields and MatrixFloat_randPopulate() to populate the
> matrix with random
> values ...]
>
> This is good, and provides the functionality I want, except
> it's *really* slow
> (and it should be, since there are, after all, 3 or 4 for loops).
> I am somewhat limited in my choices. For example, I can't simply
> fill a list / list
> of lists / array during the initialization phase because usually
> the matrices are
> created using the methods outlined in the snippet above. It doesn't
> help also that
> the methods available for the properties (namely __get__ and
> __set__ ) don't accept
> more than one parameter.
> I partially solved this situation using the methods exposed by my
> library, like
> MatrixFloat_setElem() and _getElem(), but they do not provide
> anything fancy like
> slicing, row assignment or even sheer performance, and also
> foo.data[0][0] = n (which
> I think I can't do using properties) is easier than
> MatrixFloat_setElem( foo, 0, 0, n).
> I was wondering if anyone had to deal with a case like this one
> before and if so if
> he/she had found a simpler and more elegant solution!
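
On foo.data[0][0] = n: a property can support this if its __get__
returns a small view object that writes through to the buffer instead
of a copied list. The following is a rough pure-Python sketch under
that assumption; RowView and MatrixData are hypothetical names, and
array('f') again stands in for float *data.

```python
from array import array


class RowView:
    """Hypothetical write-through view of one row of a flat buffer."""

    def __init__(self, buf, start, length):
        self._buf, self._start, self._length = buf, start, length

    def __len__(self):
        return self._length

    def _offsets(self, key):
        # Map a row-local index or slice to offsets in the flat buffer.
        if isinstance(key, slice):
            return [self._start + i for i in range(*key.indices(self._length))]
        if key < 0:
            key += self._length
        if not 0 <= key < self._length:
            raise IndexError("column index out of range")
        return self._start + key

    def __getitem__(self, key):
        off = self._offsets(key)
        if isinstance(off, list):
            return [self._buf[i] for i in off]
        return self._buf[off]

    def __setitem__(self, key, value):
        off = self._offsets(key)
        if isinstance(off, list):
            values = list(value)
            if len(values) != len(off):
                raise ValueError("slice assignment must keep the row length")
            for i, v in zip(off, values):
                self._buf[i] = v
        else:
            self._buf[off] = value


class MatrixData:
    """What a `data` property could return instead of a copied list."""

    def __init__(self, buf, rows, columns):
        self._buf, self.rows, self.columns = buf, rows, columns

    def __len__(self):
        return self.rows

    def __getitem__(self, r):
        if not 0 <= r < self.rows:
            raise IndexError("row index out of range")
        return RowView(self._buf, r * self.columns, self.columns)

    def __setitem__(self, r, values):
        self[r][:] = values  # row assignment


buf = array("f", [0.0] * 6)      # stands in for float *data, 2 x 3
data = MatrixData(buf, 2, 3)
data[0][0] = 5.0                 # writes buf[0]
data[1] = [1.0, 2.0, 3.0]        # writes buf[3:6]
print(data[1][1:], buf.tolist())
```

Only the elements actually touched are converted to Python objects, so
the cost of building the full list of lists on every access goes away.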
>
> “Greenspun's Tenth Rule of Programming: any sufficiently
> complicated C or Fortran program contains an ad hoc informally-
> specified bug-ridden slow implementation of half of Common Lisp.”
> _______________________________________________
> Cython-dev mailing list
> [email protected]
> http://codespeak.net/mailman/listinfo/cython-dev