Stefan Behnel wrote:
> Hi,
>
> I mentioned this idea before, but I think it's worth its own thread.
>
> The main problem with the current type inference mechanism for existing
> code is that the semantics of assignments to untyped names change, i.e.
>
>     cdef int i = 5
>     x = i
>
> will type 'x' as C int with type inference enabled. This will break
> existing code, as I expect most Pyrex/Cython code to depend on the above
> idiom for creating a Python object from a C type.
>
> Robert, please correct me, but IIUC, the above problem was the main reason
> not to enable type inference by default (except for some remaining bugs,
> but those will go away over time).
>
> However, type inference has many other virtues, e.g. for code like this:
>
>     l = []
>     for x in range(100):
>         item = do_some_complex_calculation(x)
>         l.append(item)
>
> If the type of 'l' was inferred to be a list in this code, "l.append" would
> get optimised into the obvious C-API call (potentially even without a None
> check), instead of generating additional fallback code for cases where l is
> not a list. The same applies to the various other optimisations for builtin
> types.
>
> Another use case is for extension types. Currently, when instantiating an
> extension type, the result variable needs to be typed if you want to access
> its C methods and fields later on. With type inference enabled, the
> following would work without additional typing:
>
>     cdef class MyExt:
>         cdef int i
>
>     x = MyExt()
>     print x.i
>
> And, for the case that 'i' was actually declared 'public' or 'readonly',
> the access would go through the C field instead of the Python descriptor.
>
> So my proposal is to enable type inference by default (after fixing the
> remaining bugs), but only for Python types (including extension types).
> That should break *very* little code, but would give a major improvement in
> terms of both usability and performance.
>
> The existing directive would then only switch between full inference when
> enabled (including C types) and no type inference at all (when disabled
> explicitly).
>
> Comments?
>

You might already have implied this, but I thought I'd point out that
this is more general than your examples imply. When you say

    x = MyExt()

then the important characteristics are

 1) MyExt is a callable with MyExt as return type
 2) That callable is early-bound

So, for instance, it's OK to type here:

    cdef MyExt factoryfunc(): ...

    x = factoryfunc()

Moving on to native types:

 - Python's floating point type is double, so it should be perfectly safe
   to do

       cdef double f(): ...
       x = f() # infer x as double

   By the same argument one might actually want to do

       cdef float f(): ...
       x = f() # infer x as *double* -- because that's what Python would use

 - For integers, I think we should have a directive "bigintegers" or
   similar. With it turned off, we blatantly assume that no integer
   computation will overflow ssize_t. We are then free to infer any integer
   type as ssize_t without losing semantics under this constraint; i.e.

       x = returns_short() # infer x as ssize_t
       x = returns_size_t() # hmm, not sure...probably no inference?

   This is particularly useful for loops, as looping variables can all be
   set to ssize_t without further ado.

 - Long term, control flow analysis can be used to tighten this up:

       x = f()
       call_function(x)
       # do not use x again
       # => safe to type x as return value of f(), regardless of type

Dag Sverre
_______________________________________________
Cython-dev mailing list
http://codespeak.net/mailman/listinfo/cython-dev
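
To make the rules above concrete, here is a rough sketch in plain Python that models which type would be inferred for "x = f()" from the declared return type of f. This is only an illustration of the proposal, not how Cython implements inference: the function name infer_assignment_type, the type-name strings, and the choice of which integer types count as "fits in ssize_t" are all assumptions made for the example. The size_t case is left as "object" (no inference), since that case was left open above.

```python
# Hypothetical model of the proposed rules; names and type sets are assumptions.
FLOAT_TYPES = {"float", "double"}
SIGNED_INT_TYPES = {"short", "int", "ssize_t"}  # assumed to fit in ssize_t


def infer_assignment_type(return_type, early_bound=True, bigintegers=False,
                          extension_types=()):
    """Return the type inferred for x in 'x = f()', or 'object' if none."""
    # Rule 2: the callable must be early-bound, otherwise nothing is known.
    if not early_bound:
        return "object"
    # Rule 1: a callable returning an extension type types x as that type.
    if return_type in extension_types:
        return return_type
    # Python floats are C doubles, so both float and double widen to double.
    if return_type in FLOAT_TYPES:
        return "double"
    # With "bigintegers" off, assume no overflow of ssize_t and widen.
    if return_type in SIGNED_INT_TYPES and not bigintegers:
        return "ssize_t"
    # Everything else (including size_t) stays a Python object.
    return "object"


if __name__ == "__main__":
    for declared in ("MyExt", "float", "short", "size_t"):
        inferred = infer_assignment_type(declared, extension_types={"MyExt"})
        print(declared, "->", inferred)
```

With bigintegers=True the integer cases fall back to "object" as well, which matches the idea that overflow semantics must then be preserved.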
