On 04/10/2012 03:10 PM, Dag Sverre Seljebotn wrote:
> On 04/10/2012 03:00 PM, Nathaniel Smith wrote:
>> On Tue, Apr 10, 2012 at 1:39 PM, Dag Sverre Seljebotn
>> <d.s.seljeb...@astro.uio.no> wrote:
>>> On 04/10/2012 12:37 PM, Nathaniel Smith wrote:
>>>> On Tue, Apr 10, 2012 at 1:57 AM, Travis Oliphant <tra...@continuum.io>
>>>> wrote:
>>>>> On Apr 9, 2012, at 7:21 PM, Nathaniel Smith wrote:
>>>>>
>>>>> ...isn't this an operation that will be performed once per compiled
>>>>> function? Is the overhead of the easy, robust method (calling
>>>>> ctypes.cast) actually measurable as compared to, you know, running an
>>>>> optimizing compiler?
>>>>>
>>>>> Yes, there can be significant overhead. The compiler is run once and
>>>>> creates the function. This function is then potentially used many,
>>>>> many times. Also, it is entirely conceivable that the "build" step
>>>>> happens at a separate "compilation" time, and Numba actually loads a
>>>>> pre-compiled version of the function from disk, which it then uses at
>>>>> run-time.
>>>>>
>>>>> I have been playing with a version of this using scipy.integrate, and
>>>>> unfortunately the overhead of ctypes.cast is rather significant --- to
>>>>> the point of making the code path using these function pointers
>>>>> useless, when without the ctypes.cast overhead the speed-up is 3-5x.
>>>>
>>>> Ah, I was assuming that you'd do the cast once, outside of the inner
>>>> loop (at the same time you did type-compatibility checking and so
>>>> forth).
>>>>
>>>>> In general, I think NumPy will need its own simple function-pointer
>>>>> object to use when handing raw function pointers between Python and C.
>>>>> SciPy can then re-use this object, which also has a useful C-API for
>>>>> things like signature checking. I have seen that ctypes is nice but
>>>>> very slow and without a compelling C-API.
>>>>
>>>> Sounds reasonable to me. Probably nicer than violating ctypes's
>>>> abstraction boundary, and with no real downsides.
>>>>
>>>>> The kind of new C-level cfuncptr object I imagine has attributes:
>>>>>
>>>>>     void *func_ptr;
>>>>>     char *signature;  /* something like "dd->d" to indicate a function
>>>>>                          that takes two doubles and returns a double */
>>>>
>>>> This looks like it's setting us up for trouble later. We already have
>>>> a robust mechanism for describing types -- dtypes. We should use that
>>>> instead of inventing Yet Another baby type system. We'll need to
>>>> convert between this representation and dtypes anyway if you want to
>>>> use these pointers for ufunc loops... and if we just use dtypes from
>>>> the start, we'll avoid having to break the API the first time someone
>>>> wants to pass a struct or an array or something.
>>>
>>> For some of the things we'd like to do with Cython down the line,
>>> something very fast like what Travis describes is exactly what we need;
>>> specifically, if you have Cython code like
>>>
>>>     cdef double f(func):
>>>         return func(3.4)
>>>
>>> that may NOT be called in a loop.
>>>
>>> But I do agree that this sounds like overkill for NumPy+numba at the
>>> moment; certainly for scipy.integrate, where you can amortize over N
>>> function samples. But Travis perhaps has a use case I didn't think of.
>>
>> It sounds sort of like you're disagreeing with me, but I can't tell
>> about what, so maybe I was unclear :-).
>>
>> All I was saying was that a list of dtype objects was probably a better
>> way to write down a function signature than some ad-hoc string
>> language. In both cases you'd do some type-compatibility checking up
>> front and then use C calling afterwards, and I don't see why
>> type-checking would be faster or slower for one representation than
>> the other. (Certainly one wouldn't have to support all possible dtypes
Rereading this, perhaps this is the statement you seek: Yes, doing a
simple strcmp is much, much faster than jumping all around in memory to
check the equality of two lists of dtypes. If it is a string less than 8
bytes in length, with the comparison string known at compile time (the
Cython case), then the comparison is only a couple of CPU instructions,
as you can check 64 bits at a time.

Dag

>> up front; the point is just that they give us more room to grow
>> later.)
>
> My point was that with Cython you'd get cases where there is no
> "up front"; you have to check-and-call as essentially one operation.
> The Cython code above would result in something like this:
>
>     if (strcmp("dd->d", signature) == 0) {
>         /* guess on signature and have fast C dispatch for exact match */
>     }
>     else {
>         /* fall back to calling as a Python object */
>     }
>
> The strcmp would probably be inlined and unrolled, but you get the idea.
>
> With LLVM available, and if Cython started to use it, we could generate
> more such branches on the fly, making it more attractive.
>
> Dag
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion