On May 9, 2008, at 3:13 AM, Dag Sverre Seljebotn wrote:

> I think there's a fundamental flaw in my NumPy proposal, which needs
> correction. I've thought about it as "polishing the access to the
> extension type", so that after
>
> cdef object x = numpy.zeros([3,3])
> cdef numpy.ndarray y = x
>
> y will behave efficiently when treated like a Python object.
>
> There are problems with this way of thinking, though: "print y.strides"
> will not work as it is a pointer-array, while "print x.strides" will
> work as it is a tuple (one has to do "print (<object>y).strides", which
> is not going to fly with the users of my NumPy project).

I think the correct way to handle this (if we want to) is to make  
y.strides transform into (<object>y).strides when coerced to a  
PyObject (assuming that y.strides is not a type we already know how  
to turn into an object).
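
To make the rule concrete, here is a toy Python model of the decision the compiler would make when an attribute access has to be coerced to an object. It is only a sketch: the function name, the set of "convertible" C types, and the to_object() placeholder are all made up for illustration and are not Cython internals.

```python
# Toy model of the proposed coercion rule; not Cython's actual implementation.
# CONVERTIBLE_C_TYPES and to_object() are illustrative assumptions.
CONVERTIBLE_C_TYPES = {"int", "long", "double", "char*"}


def coerce_attr_to_object(var, attr, struct_fields):
    """Return the expression to emit when `var.attr` is needed as an object.

    struct_fields maps struct member names to their C type names.
    """
    c_type = struct_fields.get(attr)
    if c_type in CONVERTIBLE_C_TYPES:
        # The C member can be converted directly, so keep the fast access.
        return "to_object(%s.%s)" % (var, attr)
    # Unknown or non-convertible C type: fall back to Python attribute lookup.
    return "(<object>%s).%s" % (var, attr)


# Assumed struct layout for the example: strides is a pointer, nd is an int.
fields = {"strides": "npy_intp*", "nd": "int"}
print(coerce_attr_to_object("y", "strides", fields))
print(coerce_attr_to_object("y", "nd", fields))
```

With this rule, "print y.strides" would go through the Python-level attribute, while y.strides[i] in a C context would still use the pointer.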

> On the
> other hand, one has to remember to do "print y.shape" for tuple-access
> but "y.dimensions[i]" for speedy, non-Python access. (And this
> difference in behaviour comes entirely from strides having a name clash
> while shape/dimensions do not).

This is a NumPy API issue; it has nothing to do with Cython.

>
> So while my approach has been to make the y variable act "more like a
> numpy array", I now think this is flawed. Optimizations through typing
> and access to extension structs should probably instead be treated as
> fundamentally different things, and the typical NumPy user shouldn't
> deal with a reference to the extension struct even if typing for speed
> is wanted.
>
> I'll now propose a solution for this. It has been proposed before in a
> different form (pxd shadowing etc.); I hope I succeed better now.
>
> I think that what is wanted here is to /speed up how NumPy objects are
> accessed/, and the extension type only comes into it peripherally. So
> I'd like a new syntax for modifying the compile-time behaviour of how
> objects are treated. It could look something like the following.
>
> I'm calling the keyword "compiletimefeatures", but that is for lack of a
> better word. I also use cython_ndarray to avoid namespace clashes (I have
> ideas for allowing "ndarray" directly, but I'd like to leave that out of
> the discussion for now).
>
> Anyway, numpy.ndarray is an extension type like before, while
> cython_ndarray is a new "type specifier" providing extra compile-time
> optimizations to the variable that carries its type. numpy.pxd:

[...]

> Flame away :-) (yes, I see that this could be confusing to OOP. However,
> it is no worse than the current situation; one cannot really override
> extension type struct items either.)

You asked for it... I think this is a very bad idea. The "compile  
time features" of a type belong to the type, and putting them in some  
overlay (with a different name, that one has to know about) seems  
counterintuitive. And I really don't see any advantages (assuming the  
name clashing can be solved as above).

Here is a very simple prototype of what I think could be done in the  
pxd:

---------a.pxd----------
cdef class A:
     cdef int len
     cdef int* data

     cdef inline [final?] int __getitem__(A a, int i):
         """
         Note that subtypes can't override this.
         """
         # Valid indices are 0 .. a.len - 1.
         if i < 0 or i >= a.len:
             raise IndexError
         return a.data[i]

---------b.pyx--------

from a cimport A
# I'll leave the init function to your imagination.
cdef A(len=10) a = A([1,2,3,4,5,6,7,8,9,10])
# The code from __getitem__ gets inlined here, and since len is known,
# a.len is resolved to 10 at compile time.
print a[9]

(Here a.len is looked up first on the compile-time type of a and,
failing that, on the runtime type of a. The compile-time attributes need
not be struct members, but if they are not, they must be specified,
because the "runtime" lookup would fail.)
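
The lookup order can be modelled in a few lines of Python. This is only a sketch of the rule as described, not of how the compiler would implement it; the lookup() helper and its (source, value) return convention are invented for the example.

```python
# Toy model of the two-stage lookup: compile-time attributes first, then the
# runtime object. lookup() and its return format are illustrative assumptions.
class A(object):
    def __init__(self, data):
        self.data = list(data)
        self.len = len(self.data)


def lookup(obj, compile_time_attrs, name):
    """Resolve `name` for a variable declared with known compile-time attributes."""
    if name in compile_time_attrs:
        # Known at compile time: the value can be substituted as a constant.
        return ("compile-time", compile_time_attrs[name])
    # Otherwise fall back to the runtime type; this raises AttributeError if
    # the name is not a real member, which is why such names must be specified.
    return ("runtime", getattr(obj, name))


a = A([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
known = {"len": 10}  # corresponds to the declaration "cdef A(len=10) a"
print(lookup(a, known, "len"))
print(lookup(a, known, "data"))
```

A name that is neither declared at compile time nor present on the object fails in the second step, which matches the requirement that non-member compile-time attributes always be given explicitly.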

- Robert
