Re: [Cython] Python type optimizations (NumPy GSoC-related)

Robert Bradshaw Mon, 12 May 2008 10:31:13 -0700

On May 11, 2008, at 2:00 PM, Dag Sverre Seljebotn wrote:

> Ok, been thinking some more. To sum up, what we're after is some way of
> having the following work (A is a class with a value field, printme is a
> method somehow selected for inlining and optimization for Cython):
>
> cdef A a = get_a()
> a.printme()
> a.printme()
>
> turns into
>
> cdef A a = get_a()
> print a.value
> print a.value
>
> , while this:
>
> cdef A(value=23) a = get_a()
> a.printme()
> a.printme()
>
> would be turned into
>
> cdef A a = get_a()
> if a.value != 23: raise TypeError(...) # line (a)
> print 23 # line (b)
> print 23
>
> (This is the final result, not saying it is a recipe for how it happens
> magically).
>
> So, what is happening is that we want to set up some assumptions about
> an object (without actually changing the object), and have the code
> that follows be generated using those assumptions.


Yes, this is exactly what I was saying, and I think it'll be very
easy to implement. Sorry if it wasn't totally clear.
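
A minimal sketch of the transformation quoted above, modeled in plain
Python rather than Cython. The function names (generic,
specialized_for_23) and the output list are hypothetical stand-ins for
what a compiler might emit; they are not part of any proposal in this
thread.

```python
# Plain-Python model of the quoted transformation; all names are hypothetical.

class A:
    def __init__(self, value):
        self.value = value


def generic(a, out):
    # Roughly what `cdef A a` allows: printme() is inlined,
    # but every call still reads the field.
    out.append(a.value)
    out.append(a.value)


def specialized_for_23(a, out):
    # Roughly what `cdef A(value=23) a` could allow.
    if a.value != 23:      # line (a): check the assumption once
        raise TypeError("expected an A with value == 23")
    out.append(23)         # line (b): the field read becomes a constant
    out.append(23)
```

The object itself is never modified; only the code generated after the
check changes.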

>
> I have some separate proposals building further on Robert's ideas here.
>
> 1) An explicit method for the assumptions bit. I.e:
>
> cdef class A:
>      cdef int value
>
>      cdef __assume__(self, int value):
>          if self.value != value: raise AssumptionError(...)
>
> The argument list to this function is basically a more explicit
> declaration of type arguments. The reason I want this so explicitly
> (rather than using a more general __coerce__, which must probably also be
> added) is that it hints strongly to the writer of the class A to treat
> self.value as "const". Once an assumption is made by having __assume__
> called, you cannot "back out" and change self.value. This seems to make
> it explicit that value should be treated as a const after construction;
> and it leaves the contract for the name, number and types of "type
> arguments" on the class creator side rather than the caller side.
>
> In addition to just checking assumptions, this method can also put
> general constraints on the arguments (i.e., "if value < 0: raise
> ValueError(...)").
>
> Important: This does *not* address how one can make optimizations later
> on, line (b), it is simply a way to insert the assumption line, line
> (a). Also, __assume__ can simply be called in the normal way.
>
> 2) With this in place, it seems ok to follow Robert's proposal and
> automatically treat fields having the same name as type arguments as
> known compile-time. The parameter list to __assume__ restricts which
> fields can be used.
>
> I am still thinking about something more explicit like
>
> cdef class A:
>      cdef int value
>      cdef int not_possible_typearg
>
>      __typearguments__ = ["value", "compiletimeonlyarg"]
>
> but it doesn't seem strictly necessary as the argument list to
> __assume__ can serve the same role; and in a possibly more dynamic way, i.e.

I'm generally opposed to adding extra keywords like  
__typearguments__, that's why I've been writing A(len=11) rather than  
A(11).
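
As a rough illustration of why named type arguments need no separate
declaration, here is a plain-Python sketch. The helper
check_assumptions and the class Vec are hypothetical and exist only for
this example; the keyword names themselves identify which fields are
being assumed.

```python
# Hypothetical helper: keyword names double as field names, so no
# separate list of "type arguments" has to be declared on the class.

def check_assumptions(obj, **assumed):
    """Raise TypeError unless every named attribute has the assumed value."""
    for name, expected in assumed.items():
        actual = getattr(obj, name)
        if actual != expected:
            raise TypeError(
                "assumed %s == %r, got %r" % (name, expected, actual)
            )
    return obj


class Vec:
    def __init__(self, n):
        self.len = n


v = check_assumptions(Vec(11), len=11)   # passes; models A(len=11)
```

With a positional form such as A(11), something else would have to say
which field the 11 refers to.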

> cdef A(constant_alpha=True, alpha=4) a = x # ok
> print a.alpha # ok to optimize...
>
> cdef A(constant_alpha=False, alpha=4) a = x # __assume__ might raise...
> # .. run-time error because of invalid combination,
> # .. so code below will never run, which is lucky because
> # .. alpha is now for some reason changing constantly.
> print a.alpha # Will be optimized but never run
>
> 3) I still want to throw __init__ into the mix. The main reason: For
> type inference, it would be nice if
>
> A = ndarray(shape=(4, 4), dtype=float64, buffer=arr)
>
> would automatically (because type arguments are somehow interlinked with
> constructor arguments) be type-inferred to
>
> cdef ndarray(shape=(4,4), dtype=float64) A
> A = ndarray(shape=(4,4), dtype=float64, buffer=arr)
>
> If so, keeping () for the type argument syntax would also make more
> sense and be less confusing.
>
> Perhaps all that is needed is this rule at the type-inference stage:
>
>   * A call to a known constructor (of a type which is a candidate for
> typing) obviously leads to typing it explicitly
>   * At the same time, any arguments passed to the constructor are checked
> for a match with the __assume__ signature -- the arguments that
> __assume__ can take are then put into the type arguments list.
>
> (If so, we can pretty much ignore this for now, but we have a "defense"
> for using the () syntax.)

I had actually thought about the "assume" perspective too. The two
issues I have with it are that it adds an extra "special" method,
__assume__, which could just as well be written

cdef A a = get_a()
assume(a.len = 11)

and also that it requires full control-flow analysis to do any
reasoning (e.g. there's the variable before the assumption, the
variable after it, and the variable which, depending on branching, may
or may not have been certified to have a given property). Would
further __assume__ calls then be illegal? Or just ones that
contradict? It just gets a lot messier than simply adding the data to
the compile-time type of the object. Another problem is that it
specifically requires the class author to declare ahead of time what
compile-time assumptions can be made, rather than letting the user of
the .pxd file specify them for explicit optimization.
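
To make the control-flow concern concrete, here is a small Python
sketch of the bookkeeping a compiler would need. Everything in it is an
assumption for illustration: facts are modeled as a dict per variable,
and the policy that only contradicting assumptions are rejected is just
one of the two options raised above, not a decision.

```python
# Hypothetical model of flow-sensitive assumption tracking.

def assume(facts, name, value):
    """Return the facts known after an assumption; reject contradictions."""
    if name in facts and facts[name] != value:
        raise ValueError("contradicts earlier assumption on %s" % name)
    new_facts = dict(facts)
    new_facts[name] = value
    return new_facts


def join(facts_a, facts_b):
    """Facts that survive a branch merge: only those both paths agree on."""
    return {
        name: value
        for name, value in facts_a.items()
        if name in facts_b and facts_b[name] == value
    }


before = {}                           # the variable before the assumption
after = assume(before, "len", 11)     # the variable after it
maybe = join(after, before)           # assumed on only one branch
```

Here maybe ends up empty: once one branch skips the assumption, nothing
can be relied on after the merge. Attaching the data to the declared
type avoids tracking these three states.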

The mapping of __init__ parameters to type parameters (for use with  
type inference) could be arbitrarily complicated, and I don't know  
how to do that without having the compiler actually execute code at  
compile time.
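
A small Python sketch of the kind of constructor that makes this
mapping hard. The class Grid and its transpose flag are hypothetical,
chosen only to show a field whose value is computed from several
constructor arguments.

```python
# Hypothetical class: the stored shape is not simply the `shape` argument.

class Grid:
    def __init__(self, shape, transpose=False):
        # The field depends on two arguments and on logic inside __init__.
        if transpose:
            self.shape = tuple(reversed(shape))
        else:
            self.shape = tuple(shape)


g1 = Grid((2, 3))
g2 = Grid((2, 3), transpose=True)
```

To infer a type argument such as shape=(3, 2) from the call that
creates g2, the compiler would have to evaluate the body of __init__,
and it could do so only when transpose is itself known at compile time.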

- Robert




_______________________________________________
Cython-dev mailing list
[email protected]
http://codespeak.net/mailman/listinfo/cython-dev
