Re: [Numpy-discussion] Getting C-function pointers from Python to C
On 04/12/2012 11:51 PM, Dag Sverre Seljebotn wrote:
> On 04/12/2012 11:13 PM, Travis Oliphant wrote:
>> Dag, thanks for the link to your CEP. This is the first time I've seen it. That CEP seems along the lines of what I was thinking of. We can make scipy follow that CEP and NumPy as well in places that it needs function pointers.
> Great. I'll pass this message on to the Cython list and see if anybody wants to provide input (but given the scope, it should be minor tweaks and easy to accommodate in whatever code you write).

Getting back with a status update on this: the thread is still rolling and benchmarks are being taken on the Cython list, so I think it will take some more time. This CEP will be incredibly important for Cython; e.g., if NumPy starts supporting it, then

    from numpy import sin

    cdef double f(double x):
        return sin(x * x)

won't be that much slower than early-binding directly with sin. It could take another couple of weeks. So for Numba I think just starting with whatever is fastest is the way to go for now, and then hopefully one can have the CEP done and things ported over before a Numba or SciPy release gets into the wild without conforming to it.

Dag

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Getting C-function pointers from Python to C
Thanks for the status update. A couple of weeks is a fine timeline to wait. Are you envisioning that the ufuncs in NumPy would have the nativecall attribute?

-- Travis Oliphant (on a mobile) 512-826-7480

On Apr 19, 2012, at 6:00 AM, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote:
> [...]
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On 04/19/2012 04:17 PM, Travis Oliphant wrote:
> Thanks for the status update. A couple of weeks is a fine timeline to wait. Are you envisioning that the ufuncs in NumPy would have the nativecall attribute?

I'm envisioning that they would be able to support CEP 1000, yes, but I don't think they would necessarily use the ufunc machinery or existing implementation -- just a namespace mechanism, a way of getting np.sin to map to the libc (or npymath?) sin function.

In Numba, if somebody writes:

    from numpy import sin

    @numba
    def f(x):
        return sin(x * x)

then, at numbafication-time, you can presumably lift the sin object out of the module scope, look at it, and through CEP 1000 figure out that there's a d->d function pointer inside the sin object, and use that a) for type inference, b) to embed a jump to the function in the generated code. (If somebody rebinds sin after the numba decorator has run, they get to keep the pieces.)

I don't know how you planned on doing this now; perhaps special-casing a few NumPy functions? The nice thing is that through CEP 1000, numba can transparently support whatever special function I write in Cython, and dispatch to it *fast*.

Dag
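[Editor's sketch] The compile-time lifting described above can be mimicked in plain Python. This is a hypothetical illustration only: the FakeUfunc class, the table layout, and the resolve_native helper are all invented, loosely following the __nativecall__ spelling used elsewhere in the thread.

```python
# Hypothetical sketch of compile-time lifting: probe an object bound in a
# function's global scope for a table of (signature -> raw pointer) entries.
class FakeUfunc:
    """Stand-in for a CEP-1000-aware object exposing typed pointers."""
    def __init__(self, table):
        self.__nativecall__ = table  # e.g. {"d->d": <raw address>}

fake_sin = FakeUfunc({"d->d": 0x1234})

def f(x):
    return fake_sin  # body irrelevant; only f's globals are inspected

def resolve_native(func, name, signature):
    """Return a raw pointer address if `name` in func's globals offers
    `signature` natively, else None (meaning: fall back to a slow call)."""
    obj = func.__globals__.get(name)
    table = getattr(obj, "__nativecall__", None)
    return table.get(signature) if table else None

assert resolve_native(f, "fake_sin", "d->d") == 0x1234
assert resolve_native(f, "fake_sin", "ff->f") is None
```

A real consumer would use the recovered address both for type inference and for emitting a direct call, as the message describes.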
Re: [Numpy-discussion] Getting C-function pointers from Python to C
Example:

    lib = ctypes.CDLL('libm.dylib')
    address_as_integer = ctypes.cast(lib.sin, ctypes.c_void_p).value

Excellent! Sorry for the hijack, thanks for the ride, Nadav.
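[Editor's sketch] The snippet above is macOS-specific ('libm.dylib'); a portable variant of the same round trip, looking the library up via ctypes.util (the Linux fallback name is an assumption):

```python
import ctypes
import ctypes.util
import math

# Locate the C math library portably instead of hard-coding 'libm.dylib'.
libm_path = ctypes.util.find_library("m")
libm = ctypes.CDLL(libm_path if libm_path else "libm.so.6")

# The raw C function pointer, extracted as a plain integer.
address_as_integer = ctypes.cast(libm.sin, ctypes.c_void_p).value

# Round-trip: rebuild a typed callable from the bare address and call it.
sin_type = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)
c_sin = sin_type(address_as_integer)
assert abs(c_sin(1.0) - math.sin(1.0)) < 1e-12
```

This is exactly the integer hand-off discussed below: nothing stops you from passing a bogus address, which is why the thread debates wrapping it in a checked container.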
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On Wed, Apr 11, 2012 at 10:23 PM, Travis Oliphant <teoliph...@gmail.com> wrote:
> In the mean-time, I think we could do as Robert essentially suggested and just use Capsule Objects around an agreed-upon simple C-structure:
>
>     int id;      /* Some number that can be used as a type-check */
>     void *func;
>     char *string;
>
> We can then just create some nice functions to go to and from this form in NumPy ctypeslib and then use this while the Python PEP gets written and adopted.
>
>> What is not clear to me is how one gets from the Python callable to the capsule.
>
> This varies substantially based on the tool. Numba would do its work and create the capsule object using its approach. Cython would use a different approach. I would also propose to have in NumPy some basic functions that go back and forth between this representation, ctypes, and any other useful representations that might emerge.
>
>> Or do you simply intend to pass a non-callable capsule as an argument in place of the callback?
>
> I had simply intended to allow a non-callable capsule argument to be passed in instead of another call-back to any SciPy or NumPy function that can take a raw C-function pointer.

If the cython folks are worried about type-checking overhead, then PyCapsule seems sub-optimal, because it's unnecessarily complicated to determine what sort of PyCapsule you have, and then extract the actual C struct. (At a minimum, it requires two calls to non-inlineable functions, plus an unnecessary pointer indirection.) A tiny little custom class in a tiny little library that everyone can share might be better? (Bonus: a custom class could define a __call__ method that used ctypes to call the function directly, for interactive convenience/testing/etc.)

-- Nathaniel
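[Editor's sketch] The capsule-around-a-struct scheme quoted above can be prototyped from Python using ctypes and the CPython capsule C-API. The capsule name, id value, and dummy function address here are invented for illustration; a real producer would fill in a live function pointer.

```python
import ctypes

# The agreed-upon struct from the message: { int id; void *func; char *string; }.
class CFuncCapsuleContents(ctypes.Structure):
    _fields_ = [("id", ctypes.c_int),         # type-check magic number
                ("func", ctypes.c_void_p),    # the raw C function pointer
                ("string", ctypes.c_char_p)]  # signature, e.g. b"dd->d"

capi = ctypes.pythonapi
capi.PyCapsule_New.restype = ctypes.py_object
capi.PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
capi.PyCapsule_GetPointer.restype = ctypes.c_void_p
capi.PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]

# Producer side (Numba/Cython would do this in C).  `contents` must stay
# alive as long as the capsule is used; the address is a dummy here.
contents = CFuncCapsuleContents(0x1234, 0xDEADBEEF, b"dd->d")
capsule = capi.PyCapsule_New(ctypes.addressof(contents), b"c_function_pointer", None)

# Consumer side (e.g. scipy.integrate.quad): unwrap, then type-check the id
# and signature before trusting the pointer.
raw = capi.PyCapsule_GetPointer(capsule, b"c_function_pointer")
seen = CFuncCapsuleContents.from_address(raw)
assert seen.id == 0x1234 and seen.string == b"dd->d"
```

Note how the consumer needs two C-API calls plus a dereference before it can even check the signature, which is exactly the overhead Nathaniel objects to.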
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On 04/12/2012 07:24 PM, Nathaniel Smith wrote:
> If the cython folks are worried about type-checking overhead, then PyCapsule seems sub-optimal, because it's unnecessarily complicated to determine what sort of PyCapsule you have, and then extract the actual C struct. (At a minimum, it requires two calls to non-inlineable functions, plus an unnecessary pointer indirection.)

I think this discussion is moot -- the way I reverse-engineer Travis is that there's no time for a cross-project discussion about this now. That's not too bad; Cython will go its own way (eventually), and perhaps we can merge in the future... But for the entertainment value:

In my CEP [1] I describe two access mechanisms, one slow (for which I think capsules is fine), and a faster one. Obviously, only the slow mechanism will be implemented first. So the only things I'd like changed in how Travis wants to do this are:

a) Storing the signature string data in the struct, rather than behind a char*:

    void *func;
    char string[1];  /* variable-size-allocated and null-terminated */

b) Allowing for multiple signatures, i.e. "dd->d", "ff->f", in the same capsule.

> A tiny little custom class in a tiny little library that everyone can share might be better? (Bonus: a custom class could define a __call__ method that used ctypes to call the function directly, for interactive convenience/testing/etc.)

Having NumPy and Cython depend on a common library, and getting that to work for users, seems rather utopic to me. And if I propose that Cython have a hard dependency on NumPy for a feature as basic as calling a callback object, then certain people will be very angry. Anyway, in my CEP I went to great pains to avoid having to do this, with a global registration mechanism for multiple such types.

Regarding your idea for the __call__, that's the exact opposite of what I'm doing in the CEP. I'm pretty sure that what I described is what we want for Cython; we will never tell our users to pass capsules around. What I want is this:

    @numba
    def f(x): return 2 * x

    @cython.inline
    def g(x): return 3 * x

    print f(3)
    print g(3)
    print scipy.integrate.quad(f, 0.2, 3)  # fast!
    print scipy.integrate.quad(g, 0.2, 3)  # fast!

    # If you really want a capsule:
    print f.__nativecall__

Dag

[1] http://wiki.cython.org/enhancements/cep1000
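[Editor's sketch] Points (a) and (b) above can be modeled as a small table of entries with the signature stored inline. The fixed 16-byte field, entry names, and dummy addresses are illustrative assumptions, not the CEP's actual layout (which uses a variable-size allocation).

```python
import ctypes

# One block holding several (signature, funcptr) entries, with the signature
# stored inline so a lookup needs no pointer chase to a separate string.
class Entry(ctypes.Structure):
    _fields_ = [("func", ctypes.c_void_p),
                ("signature", ctypes.c_char * 16)]  # inline, null-terminated

def find(entries, wanted):
    # Linear scan over the signatures; returns the raw address or None.
    for e in entries:
        if e.signature == wanted:
            return e.func
    return None

# Dummy addresses standing in for real function pointers.
table = (Entry * 2)(Entry(0x1000, b"dd->d"), Entry(0x2000, b"ff->f"))
assert find(table, b"ff->f") == 0x2000
assert find(table, b"ll->l") is None
```

A caller that knows its desired signature at compile time can scan such a table once and cache the pointer, which is what makes the fast path competitive with early binding.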
Re: [Numpy-discussion] Getting C-function pointers from Python to C
Dag,

Thanks for the link to your CEP. This is the first time I've seen it. You probably referenced it before, but I hadn't seen it. That CEP seems along the lines of what I was thinking of. We can make scipy follow that CEP and NumPy as well in places that it needs function pointers. I can certainly get behind it with Numba and recommend it to SciPy (and write the scipy.integrate.quad function to support it). Thanks for the CEP.

-Travis

On Apr 12, 2012, at 2:08 PM, Dag Sverre Seljebotn wrote:
> [...]
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On 04/12/2012 11:13 PM, Travis Oliphant wrote:
> Thanks for the link to your CEP. This is the first time I've seen it. That CEP seems along the lines of what I was thinking of. We can make scipy follow that CEP and NumPy as well in places that it needs function pointers. I can certainly get behind it with Numba and recommend it to SciPy (and write the scipy.integrate.quad function to support it). Thanks for the CEP.

Great. I'll pass this message on to the Cython list and see if anybody wants to provide input (but given the scope, it should be minor tweaks and easy to accommodate in whatever code you write). You will fill in more of the holes as you implement this in Numba and SciPy of course (my feeling is they will support it before Cython; let's say I hope this happens within the next year).

Dag
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On Apr 12, 2012, at 4:51 PM, Dag Sverre Seljebotn wrote:
> Great. I'll pass this message on to the Cython list and see if anybody wants to provide input (but given the scope, it should be minor tweaks and easy to accommodate in whatever code you write). You will fill in more of the holes as you implement this in Numba and SciPy of course (my feeling is they will support it before Cython; let's say I hope this happens within the next year).

Very nice. This will help immensely, I think. It's actually just what I was looking for.

Just to be clear: by "pad to sizeof(void*) alignment", do you mean that after the first 2 bytes there are (sizeof(void*) - 2) bytes before the first function pointer in the memory block pointed to by the PyCObject / Capsule?

Thanks,

-Travis
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On 04/12/2012 11:55 PM, Travis Oliphant wrote:
> Just to be clear: by "pad to sizeof(void*) alignment", do you mean that after the first 2 bytes there are (sizeof(void*) - 2) bytes before the first function pointer in the memory block pointed to by the PyCObject / Capsule?

You are only right if the starting address of the data is divisible by sizeof(void*). On 64-bit you would do something like

    func_ptr = (func_ptr_t*)(((uintptr_t)descriptor & ~(uintptr_t)7) + 8);

Hmm, not sure if I like it any longer. I don't know a priori how much alignment really matters either on modern CPUs (but in the Cython case, we would like this to be somewhat competitive with compile-time binding, so it does merit checking, I think).

Dag
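[Editor's sketch] The padding debate above boils down to round-up-to-pointer-alignment arithmetic. A minimal numeric illustration (the 2-byte header follows the thread; the function name is invented):

```python
import ctypes

# Alignment unit: sizeof(void*) on this platform (8 on 64-bit, 4 on 32-bit).
ALIGN = ctypes.sizeof(ctypes.c_void_p)

def first_funcptr_addr(base_addr, header_size=2):
    """Address of the first function pointer: skip the header, then round
    up to the next pointer-aligned address (no-op if already aligned)."""
    addr = base_addr + header_size
    return (addr + ALIGN - 1) & ~(ALIGN - 1)

# An aligned base: the 2-byte header fits inside one pointer-sized slot,
# so the pointer lands exactly one slot past the base.
assert first_funcptr_addr(0) == ALIGN
assert first_funcptr_addr(8 * ALIGN) == 9 * ALIGN

# A misaligned base still yields a properly aligned result.
assert first_funcptr_addr(7) % ALIGN == 0
```

This is why Travis's "(sizeof(void*) - 2) bytes of padding" only holds when the block itself starts pointer-aligned, which Dag's masking expression enforces.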
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On 04/10/2012 02:11 AM, Travis Oliphant wrote:
> Hi all,
>
> Some of you are aware of Numba. Numba allows you to create the equivalent of C functions dynamically from Python. One purpose of this system is to allow NumPy to take these functions and use them in operations like ufuncs, generalized ufuncs, file-reading, fancy-indexing, and so forth. There are actually many use-cases that one can imagine for such things.
>
> One question is how do you pass this function pointer to the C-side. On the Python side, Numba allows you to get the raw integer address of the equivalent C-function pointer that it just created out of the Python code. One can think of this as a 32- or 64-bit integer that you can cast to a C-function pointer.
>
> Now, how should this C-function pointer be passed from Python to NumPy? One approach is just to pass it as an integer --- in other words, have an API in C that accepts an integer as the first argument that the internal function interprets as a C-function pointer. This is essentially what ctypes does when creating a ctypes function pointer out of:
>
>     func = ctypes.CFUNCTYPE(restype, *argtypes)(integer)
>
> Of course the problem with this is that you can easily hand it integers which don't make sense and which will cause a segfault when control is passed to this function.
>
> We could also piggy-back on top of ctypes and assume that a ctypes function-pointer object is passed in. This allows some error-checking at least and also has the benefit that one could use ctypes to access a C-function library where these functions were defined. I'm leaning towards this approach.
>
> Now, the issue is how to get the C-function pointer (that npy_intp integer) back and hand it off internally. Unfortunately, ctypes does not make it very easy to get this address (that I can see). There is no ctypes C-API, for example. There are two potential options:
>
> 1) Create an API for such ctypes function pointers in NumPy and use the ctypes object structure. If ctypes were to ever change its object structure we would have to adapt this API. Something like this is what is envisioned here:
>
>     typedef struct {
>         PyObject_HEAD
>         char *b_ptr;
>     } _cfuncptr_object;
>
> then the function pointer is:
>
>     (*((void **)(((_sp_cfuncptr_object *)(obj))->b_ptr)))
>
> which could be wrapped up into a nice little NumPy C-API call like
>
>     void *Npy_ctypes_funcptr(obj);
>
> 2) Use the Python API of ctypes to do the same thing. This has the advantage of not needing to mirror the simple _cfuncptr_object structure in NumPy, but it is *much* slower to get the address. It basically does the equivalent of
>
>     ctypes.cast(obj, ctypes.c_void_p).value
>
> There is working code for this in the ctypes_callback branch of my scipy fork on github.
>
> I would like to propose two things:
>
> * creating a Npy_ctypes_funcptr(obj) function in the C-API of NumPy, and
> * implementing it with the simple pointer dereference above (option #1)
>
> Thoughts?

I really hope we can find some project-neutral common ground, so that lots of tools (Cython, f2py, numba, C extensions in NumPy and SciPy) can agree on how to unbox callables. A new extension type in NumPy would not fit this bill, I feel.

I've created a specification for this; if a number of projects (the ones mentioned above) agree on this or something similar and implement support, we could propose a PEP and do it properly once it has proven itself.

http://wiki.cython.org/enhancements/cep1000

In Cython, this may take the form

    def call_callback(object func):
        cdef double (*typed_func)(int)
        typed_func = func
        return typed_func(4)

...it would be awesome if passing a Numba-compiled function just worked in this example.

Dag
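[Editor's sketch] Option #2 in the quoted message can be exercised entirely from Python. The callable below is invented, standing in for a Numba-generated function; everything else uses only documented ctypes API.

```python
import ctypes

# A ctypes function pointer wrapping a Python callable, standing in for a
# Numba-generated C function of signature double (*)(double).
DD_FUNC = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

@DD_FUNC
def twice(x):
    return 2.0 * x

# Option #2: pull the raw address out via the ctypes Python API.
addr = ctypes.cast(twice, ctypes.c_void_p).value
assert addr

# A consumer can cast the integer back into a typed callable.  This is the
# unchecked integer hand-off the message warns about: nothing validates the
# address, and the original object (twice) must stay alive while it is used.
rebuilt = DD_FUNC(addr)
assert rebuilt(3.0) == 6.0
```

Option #1 avoids the cost of this Python-level cast by reading the b_ptr slot of the ctypes object directly in C, at the price of depending on ctypes' internal layout.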
Re: [Numpy-discussion] Getting C-function pointers from Python to C
Dag Sverre Seljebotn wrote: [...] I really hope we can find some project-neutral common ground, so that lots of tools (Cython, f2py, numba, C extensions in NumPy and SciPy) can agree on how to unbox callables. A new extension type in NumPy would not fit this bill, I feel. I've created a specification for this; if a number of projects (the ones mentioned above) agree on this or something similar and implement support, we could propose a PEP and do it properly once it has proven itself. http://wiki.cython.org/enhancements/cep1000 [...] ...it would be awesome if passing a Numba-compiled function just worked in this example.

Yes, I think we should go the Python PEP route. However, it will take some time to see that to completion (especially with ctypes already in existence).
Dag, this would be a very good thing for you to champion, however ;-) In the meantime, I think we could do as Robert essentially suggested and just use Capsule objects around an agreed-upon simple C structure:

    int id;       /* some number that can be used as a type-check */
    void *func;
    char *string;

We can then just create some nice functions to go to and from this form in NumPy's ctypeslib, and use this while the Python PEP gets written and adopted. -Travis
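[Editor's note: a rough sketch of the capsule idea, driven from Python via ctypes.pythonapi rather than from C. The b"d->d" capsule name used here as a signature tag is an assumption for illustration, not an agreed convention:]

```python
import ctypes

# PyCapsule_New / PyCapsule_GetPointer via ctypes.pythonapi.
capsule_new = ctypes.pythonapi.PyCapsule_New
capsule_new.restype = ctypes.py_object
capsule_new.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]

capsule_get = ctypes.pythonapi.PyCapsule_GetPointer
capsule_get.restype = ctypes.c_void_p
capsule_get.argtypes = [ctypes.py_object, ctypes.c_char_p]

CB = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

@CB
def f(x):
    return x + 1.0

# Hypothetical convention: encode the signature in the capsule name.
# NAME must stay referenced -- the capsule stores a pointer into it.
NAME = b"d->d"

addr = ctypes.cast(f, ctypes.c_void_p).value
cap = capsule_new(addr, NAME, None)          # no destructor

# Consumer side: asking for the pointer under the expected name doubles
# as a crude signature check (mismatched names raise ValueError).
ptr = capsule_get(cap, NAME)
assert ptr == addr
assert CB(ptr)(1.5) == 2.5
```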
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On 04/11/2012 11:00 PM, Travis Oliphant wrote: [...] Yes, I think we should go the Python PEP route. However, it will take some time to see that to completion (especially with ctypes already in existence). Dag, this would be a very good thing for you to champion, however ;-)

I was NOT proposing a PEP. The spec is created so that it can be implemented *now*, by the tools we control (and still be very efficient). A sci-PEP, if you will; a mutual understanding between Cython, NumPy, and numba (and ideally f2py, which already has something similar, if anyone bothers). When this is implemented in all the tools we care about, we can propose something even nicer as a PEP, but that's far down the road; it'll be another couple of years before I'm on Python 3. [...]
Re: [Numpy-discussion] Getting C-function pointers from Python to C
[...] I was NOT proposing a PEP. The spec is created so that it can be implemented *now*, by the tools we control (and still be very efficient). A sci-PEP, if you will; a mutual understanding between Cython, NumPy, and numba [...]

Perfect :-) We are on the same page.

[Dag:] What is not clear to me is how one gets from the Python callable to the capsule. This varies substantially based on the tool.

Numba would do its work and create the capsule object using its approach. Cython would use a different approach. I would also propose to have in NumPy some basic functions that go back and forth between this representation, ctypes, and any other useful representations that might emerge.

[Dag:] Or do you simply intend to pass a non-callable capsule as an argument in place of the callback?

I had simply intended to allow a non-callable capsule argument to be passed in, instead of another callback, to any SciPy or NumPy function that can take a raw C-function pointer. Thanks, -Travis
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On Tue, Apr 10, 2012 at 1:57 AM, Travis Oliphant tra...@continuum.io wrote: On Apr 9, 2012, at 7:21 PM, Nathaniel Smith wrote: ...isn't this an operation that will be performed once per compiled function? Is the overhead of the easy, robust method (calling ctypes.cast) actually measurable as compared to, you know, running an optimizing compiler? Yes, there can be significant overhead. The compiler is run once and creates the function. This function is then potentially used many, many times. Also, it is entirely conceivable that the build step happens at a separate compilation time, and Numba actually loads a pre-compiled version of the function from disk which it then uses at run-time. I have been playing with a version of this using scipy.integrate, and unfortunately the overhead of ctypes.cast is rather significant --- to the point of making the code path using these function pointers useless, when without the ctypes.cast overhead the speed-up is 3-5x.

Ah, I was assuming that you'd do the cast once outside of the inner loop (at the same time you did type-compatibility checking and so forth).

In general, I think NumPy will need its own simple function-pointer object to use when handing over raw function pointers between Python and C. SciPy can then re-use this object, which also has a useful C-API for things like signature checking. I have seen that ctypes is nice but very slow and without a compelling C-API.

Sounds reasonable to me. Probably nicer than violating ctypes's abstraction boundary, and with no real downsides.

The kind of new C-level cfuncptr object I imagine has attributes:

    void *func_ptr;
    char *signature;   /* something like 'dd->d' to indicate a function
                          that takes two doubles and returns a double */

This looks like it's setting us up for trouble later. We already have a robust mechanism for describing types -- dtypes. We should use that instead of inventing Yet Another baby type system. We'll need to convert between this representation and dtypes anyway if you want to use these pointers for ufunc loops... and if we just use dtypes from the start, we'll avoid having to break the API the first time someone wants to pass a struct or array or something.

[Travis:] methods would be: from_ctypes (classmethod), to_ctypes, and simple inline functions to get the function pointer and the signature.

The other approach would be to define an interface, something like:

    class MyFuncWrapper:
        def func_pointer(self, requested_rettype, requested_argtypes):
            return an_integer

    fp = wrapper.func_pointer(float, (float, float))

This would be trivial to implement for ctypes functions, Cython functions, and numba. For ctypes or Cython you'd probably just check that the requested prototype matched the prototype for the wrapped function and otherwise raise an error. For numba you'd check if you've already compiled the function for the given type signature, and if not then you could compile it on the fly. It'd also let you wrap an entire family of ufunc loop functions at once (maybe np.add.c_func is an object that implements the above interface to return any registered add loop). OTOH maybe there are places where the code that *calls* the C function object should be adapting to its signature, rather than the other way around -- in that case you'd want some way for the C function object to advertise what signature(s) it supports. I'm not sure which way the flexibility goes for the cases you're thinking of. I feel like I may not be putting my finger on what you're asking, though, so hopefully these random thoughts are helpful. -- Nathaniel
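[Editor's note: a sketch of the wrapper interface Nathaniel describes, for the ctypes case. The class name CtypesFuncWrapper and the exact prototype check are illustrative only, not from any project:]

```python
import ctypes

class CtypesFuncWrapper:
    """Hypothetical implementation of the proposed interface for a
    ctypes-backed function: check the requested prototype against the
    wrapped one, then hand back the raw address as an integer."""

    def __init__(self, cfunc, restype, argtypes):
        self._cfunc = cfunc          # keep the ctypes object alive
        self._restype = restype
        self._argtypes = tuple(argtypes)

    def func_pointer(self, requested_rettype, requested_argtypes):
        if (requested_rettype, tuple(requested_argtypes)) != \
                (self._restype, self._argtypes):
            raise TypeError("prototype mismatch")
        return ctypes.cast(self._cfunc, ctypes.c_void_p).value

CB = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double, ctypes.c_double)
add = CB(lambda x, y: x + y)
w = CtypesFuncWrapper(add, ctypes.c_double, (ctypes.c_double, ctypes.c_double))

addr = w.func_pointer(ctypes.c_double, (ctypes.c_double, ctypes.c_double))
assert CB(addr)(1.0, 2.0) == 3.0
```

For numba, as the message notes, func_pointer could instead compile on the fly for an unseen signature rather than raising.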
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On Tue, Apr 10, 2012 at 01:11, Travis Oliphant teoliph...@gmail.com wrote: 1) Create an API for such ctypes function pointers in NumPy and use the ctypes object structure. If ctypes were ever to change its object structure, we would have to adapt this API. Something like this is what is envisioned here:

    typedef struct {
        PyObject_HEAD
        char *b_ptr;
    } _cfuncptr_object;

Why not just use PyCapsules? http://docs.python.org/release/2.7/c-api/capsule.html -- Robert Kern
Re: [Numpy-discussion] Getting C-function pointers from Python to C
Hi Travis, we've been discussing almost the exact same thing in Cython (at a workshop, not on the mailing list, I'm afraid). Our specific example use-case was passing a Cython function to scipy.integrate. On 04/10/2012 02:57 AM, Travis Oliphant wrote: [...] I have been playing with a version of this using scipy.integrate, and unfortunately the overhead of ctypes.cast is rather significant --- to the point of making the code path using these function pointers useless, when without the ctypes.cast overhead the speed-up is 3-5x.

There's an N where the cost of the ctypes.cast is properly amortized, though, right? The ctypes.cast should only be called once at the beginning of scipy.integrate?

[Travis:] In general, I think NumPy will need its own simple function-pointer object to use when handing over raw function pointers between Python and C. SciPy can then re-use this object, which also has a useful C-API for things like signature checking. [...] The kind of new C-level cfuncptr object I imagine has attributes:

    void *func_ptr;
    char *signature;   /* something like 'dd->d' to indicate a function
                          that takes two doubles and returns a double */

[...]

This is more or less the same format we discussed for Cython functions. What we wanted to do is to write Cython code like this:

    cpdef double f(double x, double y):
        ...

and, when passing f to scipy.integrate, let it call the inner C function directly. We even worked with the exact same format string in our discussions :-) Long term, in Cython we could use the type information together with LLVM to generate adapted code wherever Cython calls objects (in call sites). So ideally we would want to agree on an API, so that Cython functions can be passed to scipy.integrate, and so that numba functions can be jumped to directly from Cython code. Comments: - PEP 3118-augmented format strings should work well, and we may want to enforce a canonicalized subset (i.e. whitespace is not allowed, do not use repeat specifiers, ...anything else?) - What you propose above already does two pointer jumps (with possibly associated cache misses and stalls) if you want to validate the signature, which can be eliminated (at least from Cython's perspective). But I'll let this thread go on a little longer, to figure out the "is this needed for NumPy" question, before continuing my bikeshedding on performance issues. Dag
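[Editor's note: for concreteness, one hypothetical way such a canonical one-character-per-type signature string could be derived from ctypes types. The _CODES table and helper below are illustrative, not part of any project:]

```python
import ctypes

# Illustrative mapping from ctypes types to one-character type codes,
# in the spirit of PEP 3118 format characters ('d' = double, etc.).
_CODES = {ctypes.c_double: "d", ctypes.c_float: "f", ctypes.c_int: "i"}

def signature(restype, argtypes):
    """Build a canonical signature string such that
    (double, double) -> double becomes 'dd->d'."""
    return "".join(_CODES[t] for t in argtypes) + "->" + _CODES[restype]

assert signature(ctypes.c_double, (ctypes.c_double, ctypes.c_double)) == "dd->d"
assert signature(ctypes.c_int, (ctypes.c_double,)) == "d->i"
```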
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On 04/10/2012 12:37 PM, Nathaniel Smith wrote: [...] This looks like it's setting us up for trouble later. We already have a robust mechanism for describing types -- dtypes. We should use that instead of inventing Yet Another baby type system. We'll need to convert between this representation and dtypes anyway if you want to use these pointers for ufunc loops... and if we just use dtypes from the start, we'll avoid having to break the API the first time someone wants to pass a struct or array or something.

For some of the things we'd like to do with Cython down the line, something very fast like what Travis describes is exactly what we need; specifically, if you have Cython code like

    cdef double f(func):
        return func(3.4)

that may NOT be called in a loop. But I do agree that this sounds like overkill for NumPy+numba at the moment; certainly for scipy.integrate, where you can amortize over N function samples. But Travis perhaps has a use-case I didn't think of. Dag
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On Tue, Apr 10, 2012 at 1:39 PM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote: [...] For some of the things we'd like to do with Cython down the line, something very fast like what Travis describes is exactly what we need [...] But I do agree that this sounds like overkill for NumPy+numba at the moment; certainly for scipy.integrate, where you can amortize over N function samples. But Travis perhaps has a use-case I didn't think of.

It sounds sort of like you're disagreeing with me but I can't tell about what, so maybe I was unclear :-). All I was saying was that a list-of-dtype-objects was probably a better way to write down a function signature than some ad-hoc string language. In both cases you'd do some type-compatibility checking up front and then use C calling afterwards, and I don't see why type-checking would be faster or slower for one representation than the other. (Certainly one wouldn't have to support all possible dtypes up front; the point is just that they give us more room to grow later.) -- Nathaniel
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On 04/10/2012 03:00 PM, Nathaniel Smith wrote: [...] All I was saying was that a list-of-dtype-objects was probably a better way to write down a function signature than some ad-hoc string language. In both cases you'd do some type-compatibility checking up front and then use C calling afterwards, and I don't see why type-checking would be faster or slower for one representation than the other.

My point was that with Cython you'd get cases where there is no "up front"; you have to check-and-call as essentially one operation. The Cython code above would result in something like this:

    if (strcmp("dd->d", signature) == 0) {
        /* guess on signature and have fast C dispatch for exact match */
    } else {
        /* fall back to calling as Python object */
    }

The strcmp would probably be inlined and unrolled, but you get the idea. With LLVM available, and if Cython started to use it, we could generate more such branches on the fly, making it more attractive.
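[Editor's note: the same check-and-call pattern can be mimicked in Python with ctypes, boxing the signature string next to the raw pointer. A sketch only; in real Cython output the comparison would be the inlined strcmp sketched in the message:]

```python
import ctypes

# A "boxed" function: the signature string travels with the raw address.
CB = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double, ctypes.c_double)
add = CB(lambda x, y: x + y)          # keep alive while the address is used
boxed = ("dd->d", ctypes.cast(add, ctypes.c_void_p).value)

def call_dd_d(boxed, x, y):
    """Check-and-call as one operation, from the perspective of a call
    site compiled for the (double, double) -> double signature."""
    sig, addr = boxed
    if sig == "dd->d":                # the strcmp in the C version
        return CB(addr)(x, y)         # fast path: dispatch through the pointer
    return NotImplemented             # real code: fall back to a Python-level call

assert call_dd_d(boxed, 1.5, 2.5) == 4.0
assert call_dd_d(("ii->i", 0), 1.0, 2.0) is NotImplemented
```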
Dag
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On 04/10/2012 03:10 PM, Dag Sverre Seljebotn wrote: On 04/10/2012 03:00 PM, Nathaniel Smith wrote: On Tue, Apr 10, 2012 at 1:39 PM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote: On 04/10/2012 12:37 PM, Nathaniel Smith wrote: On Tue, Apr 10, 2012 at 1:57 AM, Travis Oliphanttra...@continuum.io wrote: On Apr 9, 2012, at 7:21 PM, Nathaniel Smith wrote: ...isn't this an operation that will be performed once per compiled function? Is the overhead of the easy, robust method (calling ctypes.cast) actually measurable as compared to, you know, running an optimizing compiler? Yes, there can be significant overhead. The compiler is run once and creates the function. This function is then potentially used many, many times.Also, it is entirely conceivable that the build step happens at a separate compilation time, and Numba actually loads a pre-compiled version of the function from disk which it then uses at run-time. I have been playing with a version of this using scipy.integrate and unfortunately the overhead of ctypes.cast is rather significant --- to the point of making the code-path using these function pointers to be useless when without the ctypes.cast overhed the speed up is 3-5x. Ah, I was assuming that you'd do the cast once outside of the inner loop (at the same time you did type compatibility checking and so forth). In general, I think NumPy will need its own simple function-pointer object to use when handing over raw-function pointers between Python and C. SciPy can then re-use this object which also has a useful C-API for things like signature checking.I have seen that ctypes is nice but very slow and without a compelling C-API. Sounds reasonable to me. Probably nicer than violating ctypes's abstraction boundary, and with no real downsides. 
The kind of new C-level cfuncptr object I imagine has attributes: void *func_ptr; char *signature; /* something like 'dd->d' to indicate a function that takes two doubles and returns a double */ This looks like it's setting us up for trouble later. [...] Rereading this, perhaps this is the statement you seek: Yes, doing a simple strcmp is much, much faster than jumping all around in memory to check the equality of two lists of dtypes.
If it is a string less than 8 bytes in length, with the comparison string known at compile time (the Cython case), then the comparison is only a couple of CPU instructions, as you can check 64 bits at a time. Dag
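Dag's "check 64 bits at a time" trick can be mimicked in Python by packing the zero-padded signature into a single 64-bit integer, so the whole comparison is one machine-word equality test. The `sig_to_u64` helper is hypothetical and only illustrates the idea; the real win is of course in compiled C, not in Python.

```python
import struct

def sig_to_u64(sig: bytes) -> int:
    # pad the signature to 8 bytes and reinterpret it as one little-endian
    # 64-bit integer; comparing two such integers is a single-word test
    assert len(sig) <= 8
    return struct.unpack("<Q", sig.ljust(8, b"\0"))[0]

# computed once, at "compile time" in the Cython case
KNOWN = sig_to_u64(b"dd->d")

def matches(candidate: bytes) -> bool:
    return sig_to_u64(candidate) == KNOWN
```

So `matches(b"dd->d")` holds while `matches(b"ff->f")` does not, with the per-call work reduced to one integer comparison.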
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On Tue, Apr 10, 2012 at 2:15 PM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote: [...] Rereading this, perhaps this is the statement you seek: Yes, doing a simple strcmp is much, much faster than jumping all around in memory to check the equality of two lists of dtypes.
If it is a string less than 8 bytes in length, with the comparison string known at compile time (the Cython case), then the comparison is only a couple of CPU instructions, as you can check 64 bits at a time. Right, that's what I wasn't getting until you mentioned strcmp :-). That said, the core numpy dtypes are singletons. For this purpose, the signature could be stored as a C array of PyArray_Descr*, but even if we store it in a Python tuple/list, we'd still end up with a contiguous array of PyArray_Descr*'s. (I'm assuming that we would guarantee that it was always-and-only a real PyTupleObject* here.) So for the function we're talking about, the check would compile down to doing the equivalent of a 3*pointersize-byte strcmp, instead of a 5-byte strcmp. That's admittedly worse, but I think the difference between these two comparisons is unlikely to be measurable, considering that they're followed immediately by a cache miss when we actually jump to the function pointer. -- Nathaniel
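Nathaniel's singleton point can be checked directly: built-in NumPy dtypes are interned, so a signature held as a tuple of dtype objects can be compared with identity tests alone, one pointer comparison per slot. The `matches` helper below is a sketch of that check, not an actual NumPy API.

```python
import numpy as np

# built-in dtypes are singletons: constructing 'float64' twice yields
# the very same object
assert np.dtype('float64') is np.dtype(np.float64)

# hypothetical signature: (double, double) -> double, as dtype objects
sig = (np.dtype('float64'), np.dtype('float64'), np.dtype('float64'))

def matches(candidate):
    # the 3*pointersize-byte "strcmp": one identity test per entry
    return len(candidate) == len(sig) and \
        all(a is b for a, b in zip(candidate, sig))
```

For singleton dtypes this is as cheap as comparing three pointers; the caveat raised later in the thread is that struct dtypes are not singletons, so identity comparison alone would no longer suffice.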
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On 04/10/2012 03:29 PM, Nathaniel Smith wrote: [...] So for the function we're talking about, the check would compile down to doing the equivalent of a 3*pointersize-byte strcmp, instead of a 5-byte strcmp. That's admittedly worse, but I think the difference between these two comparisons is unlikely to be measurable, considering that they're followed immediately by a cache miss when we actually jump to the function pointer. Yes, for singletons you're almost as good off. But if you have a struct argument, say void f(double x, struct {double a, float b} y);
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On 04/10/2012 03:38 PM, Dag Sverre Seljebotn wrote: [...] That's admittedly worse, but I think the difference between these two comparisons is unlikely to be measurable, considering that they're followed immediately by a cache miss when we actually jump to the function pointer. Actually, I think the performance hit is a problem in the Cython case. While
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On Tue, Apr 10, 2012 at 2:38 PM, Dag Sverre Seljebotn d.s.seljeb...@astro.uio.no wrote: [...] Yes, for singletons you're almost as good off. But if you have a struct argument, say void f(double x, struct {double a, float b} y); then PEP 3118 gives you the string dT{dd}, whereas with NumPy dtypes you won't have a singleton? I can agree that that is a minor issue though (you could always *make* NumPy dtypes always be singletons). I think the real argument is that for Cython, it just wouldn't do to rely on NumPy dtypes (or NumPy being installed at all) for something as basic as calling a C-level function; and strings are a simple substitute. And since it is a format defined in PEP 3118, NumPy should already support these kinds of strings internally (i.e. conversion to/from dtype). Good points. PEP 3118 is more thorough than I realized. Is it actually canonical/implemented? The PEP says that all the added type syntax will be added to struct, but that doesn't seem to have happened (except for the '?' character, I guess). -- Nathaniel
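Nathaniel's question about how much of PEP 3118 actually landed in the struct module can be checked directly: the '?' code (C99 _Bool) is supported, while the PEP's nested-struct syntax 'T{...}' is rejected. A quick check, assuming a standard CPython:

```python
import struct

# '?' (C99 _Bool) did make it into the struct module
assert struct.calcsize('?') == 1
assert struct.pack('?', True) == b'\x01'

# ...but the PEP 3118 extensions, e.g. nested structs 'T{...}', did not:
# struct rejects the format with "bad char in struct format"
try:
    struct.calcsize('T{d:a:f:b:}')
except struct.error:
    rejected = True
else:
    rejected = False
assert rejected
```

So NumPy's buffer-protocol code has to parse the extended format strings itself; the stdlib struct module only understands the classic subset plus '?'.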
Re: [Numpy-discussion] Getting C-function pointers from Python to C
Sorry for being slow. There is (I think) a related question I raised on the skimage list: I have a cython function that calls a C callback function in a loop (one call for each pixel in an image). The C function is compiled in a different shared library (a simple C library, not a python module). I would like a python script to get the address of the C function and pass it on to the cython function as the pointer for the callback function. As I understand, Travis's issue starts once the callback address is obtained, but is there a direct method to retrieve the address from the shared library? Nadav. From: numpy-discussion-boun...@scipy.org [numpy-discussion-boun...@scipy.org] On Behalf Of Travis Oliphant [teoliph...@gmail.com] Sent: 10 April 2012 03:11 To: Discussion of Numerical Python Subject: [Numpy-discussion] Getting C-function pointers from Python to C Hi all, Some of you are aware of Numba. Numba allows you to create the equivalent of C-functions dynamically from Python. One purpose of this system is to allow NumPy to take these functions and use them in operations like ufuncs, generalized ufuncs, file-reading, fancy-indexing, and so forth. There are actually many use-cases that one can imagine for such things. One question is how do you pass this function pointer to the C-side. On the Python side, Numba allows you to get the raw integer address of the equivalent C-function pointer that it just created out of the Python code. One can think of this as a 32- or 64-bit integer that you can cast to a C-function pointer. Now, how should this C-function pointer be passed from Python to NumPy? One approach is just to pass it as an integer --- in other words, have an API in C that accepts an integer as the first argument that the internal function interprets as a C-function pointer.
This is essentially what ctypes does when creating a ctypes function pointer out of: func = ctypes.CFUNCTYPE(restype, *argtypes)(integer) Of course the problem with this is that you can easily hand it integers which don't make sense and which will cause a segfault when control is passed to this function. We could also piggy-back on top of ctypes and assume that a ctypes function-pointer object is passed in. This allows some error-checking at least, and also has the benefit that one could use ctypes to access a C-function library where these functions were defined. I'm leaning towards this approach. Now, the issue is how to get the C-function pointer (that npy_intp integer) back and hand it off internally. Unfortunately, ctypes does not make it very easy to get this address (that I can see). There is no ctypes C-API, for example. There are two potential options: 1) Create an API for such ctypes function pointers in NumPy and use the ctypes object structure. If ctypes were to ever change its object structure we would have to adapt this API. Something like this is what is envisioned here:

typedef struct { PyObject_HEAD char *b_ptr; } _cfuncptr_object;

then the function pointer is:

(*((void **)(((_cfuncptr_object *)(obj))->b_ptr)))

which could be wrapped up into a nice little NumPy C-API call like void * Npy_ctypes_funcptr(obj) 2) Use the Python API of ctypes to do the same thing. This has the advantage of not needing to mirror the simple _cfuncptr_object structure in NumPy, but it is *much* slower to get the address. It basically does the equivalent of ctypes.cast(obj, ctypes.c_void_p).value There is working code for this in the ctypes_callback branch of my scipy fork on github. I would like to propose two things: * creating a Npy_ctypes_funcptr(obj) function in the C-API of NumPy, and * implementing it with the simple pointer dereference above (option #1) Thoughts?
-Travis
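Travis's option #2 -- recovering the raw address through ctypes' own Python API and rebuilding a callable from the bare integer -- can be sketched entirely in Python. The callback here is a hypothetical stand-in for a Numba-compiled function; the round trip is the same one `Npy_ctypes_funcptr` would do in C with a direct b_ptr dereference.

```python
import ctypes

# prototype: double f(double)
CALLBACK = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

@CALLBACK
def f(x):
    # toy callback standing in for generated code
    return x * x

# option #2: recover the raw address via ctypes' Python API
# (this is the slow path Travis measured)
addr = ctypes.cast(f, ctypes.c_void_p).value
assert isinstance(addr, int) and addr != 0

# round trip: rebuild a callable from the bare integer and call through it;
# a wrong integer here would segfault, which is exactly Travis's concern
g = CALLBACK(addr)
result = g(3.0)   # 9.0
```

Note that `f` must be kept alive for as long as `addr` is used; ctypes frees the underlying thunk when the CFUNCTYPE object is garbage-collected.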
Re: [Numpy-discussion] Getting C-function pointers from Python to C
That is rather unrelated, you'd better ask this again on the cython-users list (be warned that top-posting is strongly discouraged in that place). Dag -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. Nadav Horesh nad...@visionsense.com wrote: [...]
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On Apr 10, 2012, at 10:25 AM, Nadav Horesh wrote: [...] but is there a direct method to retrieve the address from the shared library? There are several ways to do this, but ctypes makes it fairly straightforward. Example: lib = ctypes.CDLL('libm.dylib'); address_as_integer = ctypes.cast(lib.sin, ctypes.c_void_p).value Basically, what we are talking about is a lighter-weight way to hand this address around instead of using ctypes objects, including its heavy-weight method of creating signatures. During the lengthy PEP 3118 discussions, this question of whether to use NumPy dtypes or ctypes classes was debated in terms of how to represent data-types in the buffer protocol. Guido wisely decided to use the struct-module method of strings, duly extended to cover more cases. I think this is definitely the way to go. I also noticed that the dyncall library (http://dyncall.org/) also uses strings to represent signatures (although it uses a ')' to indicate the boundary between inputs and outputs). -Travis
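Travis mentions that dyncall separates inputs from outputs with a ')' in its signature strings. A toy mapping from such a string to a ctypes prototype might look like this; the `prototype` helper and the tiny character table are hypothetical, and only a few of dyncall's type codes are covered.

```python
import ctypes
import ctypes.util

# minimal, illustrative mapping of dyncall-style codes to ctypes types
CHAR_TO_CTYPE = {'d': ctypes.c_double, 'f': ctypes.c_float, 'i': ctypes.c_int}

def prototype(sig):
    """Build a ctypes prototype from a dyncall-style string like 'dd)d'."""
    args, ret = sig.split(')')
    return ctypes.CFUNCTYPE(CHAR_TO_CTYPE[ret],
                            *[CHAR_TO_CTYPE[c] for c in args])

# apply it to libm's fmod, retrieved by raw address as discussed above
# (find_library may behave differently on non-Linux systems)
libm = ctypes.CDLL(ctypes.util.find_library("m"))
addr = ctypes.cast(libm.fmod, ctypes.c_void_p).value
fmod = prototype('dd)d')(addr)   # double fmod(double, double)
```

The same string that drives dispatch ("dd)d" or "dd->d", depending on the convention chosen) is enough to reconstruct the full calling interface.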
Re: [Numpy-discussion] Getting C-function pointers from Python to C
...isn't this an operation that will be performed once per compiled function? Is the overhead of the easy, robust method (calling ctypes.cast) actually measurable as compared to, you know, running an optimizing compiler?

I mean, I doubt there'd be any real problem with adding this extra API to numpy, but it does seem like there might be higher priority targets :-)

On Apr 10, 2012 1:12 AM, Travis Oliphant teoliph...@gmail.com wrote:

> Hi all,
>
> Some of you are aware of Numba. Numba allows you to create the equivalent of C functions dynamically from Python. [...]
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
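Whether the cast overhead is "measurable" is easy to check directly. A rough sketch (the absolute numbers are machine-dependent; this only times the Python-level extraction itself):

```python
import ctypes
import timeit

SIG = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double)

@SIG
def f(x):
    return x + 1.0

# Cost of one ctypes.cast-based address extraction, amortized over n runs.
n = 100_000
total = timeit.timeit(lambda: ctypes.cast(f, ctypes.c_void_p).value, number=n)
per_call_us = total / n * 1e6
print(f"ctypes.cast: ~{per_call_us:.2f} us per extraction")
```

Paid once per compiled function, this is indeed negligible next to running a compiler; the disagreement in the follow-up is about paths where the extraction happens per call rather than once.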
Re: [Numpy-discussion] Getting C-function pointers from Python to C
On Apr 9, 2012, at 7:21 PM, Nathaniel Smith wrote:

> ...isn't this an operation that will be performed once per compiled function? Is the overhead of the easy, robust method (calling ctypes.cast) actually measurable as compared to, you know, running an optimizing compiler?

Yes, there can be significant overhead. The compiler is run once and creates the function. This function is then potentially used many, many times. Also, it is entirely conceivable that the build step happens at a separate compilation time, and Numba actually loads a pre-compiled version of the function from disk which it then uses at run-time.

I have been playing with a version of this using scipy.integrate, and unfortunately the overhead of ctypes.cast is rather significant --- to the point of making the code path using these function pointers useless, when without the ctypes.cast overhead the speed-up is 3-5x.

In general, I think NumPy will need its own simple function-pointer object to use when handing raw function pointers between Python and C. SciPy can then re-use this object, which also has a useful C-API for things like signature checking. I have seen that ctypes is nice but very slow and without a compelling C-API.

The kind of new C-level cfuncptr object I imagine has attributes:

    void *func_ptr;
    char *signature;  /* something like 'dd->d' to indicate a function
                         that takes two doubles and returns a double */

Methods would be:

    * from_ctypes (classmethod)
    * to_ctypes

and simple inline functions to get the function pointer and the signature.

> I mean, I doubt there'd be any real problem with adding this extra API to numpy, but it does seem like there might be higher priority targets :-)

Not if you envision doing a lot of code-development using Numba ;-)

-Travis

On Apr 10, 2012 1:12 AM, Travis Oliphant teoliph...@gmail.com wrote:

> Hi all,
>
> Some of you are aware of Numba. Numba allows you to create the equivalent of C functions dynamically from Python. [...]
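The function-pointer object proposed above can be sketched in pure Python. This is strictly illustrative: the class name, the 'dd->d' signature convention, and the type-code table are assumptions of the sketch, and the real thing would be a C-level object with inline accessors.

```python
import ctypes

class CFuncPtr:
    """Hypothetical sketch: a raw address plus a signature string."""

    def __init__(self, addr, signature):
        self.addr = addr            # plays the role of void *func_ptr
        self.signature = signature  # e.g. "dd->d"

    @classmethod
    def from_ctypes(cls, cfunc, signature):
        # Extract the raw address from an existing ctypes function pointer.
        return cls(ctypes.cast(cfunc, ctypes.c_void_p).value, signature)

    def to_ctypes(self):
        # Rebuild a ctypes callable from the stored address and signature.
        codes = {"d": ctypes.c_double, "i": ctypes.c_int}
        args, ret = self.signature.split("->")
        functype = ctypes.CFUNCTYPE(codes[ret], *[codes[c] for c in args])
        return functype(self.addr)

# A stand-in "compiled" function: a ctypes callback wrapping Python code.
SIG = ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double, ctypes.c_double)

@SIG
def add(a, b):
    return a + b

fp = CFuncPtr.from_ctypes(add, "dd->d")
print(fp.to_ctypes()(2.0, 3.0))  # 5.0
```

The point of carrying the signature alongside the address is that a consumer (say, an integrator expecting 'dd->d') can reject a mismatched pointer with an exception instead of a segfault.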