Hi Chuck,

Good questions. Responses inline below...

Jason

On Thu, Jun 10, 2010 at 8:26 AM, Charles R Harris <charlesr.har...@gmail.com> wrote:

> On Wed, Jun 9, 2010 at 5:27 PM, Jason McCampbell <jmccampb...@enthought.com> wrote:
>
>> Hi everyone,
>>
>> This is a follow-up to Travis's message on the re-factoring project from
>> May 25th and the subsequent discussion. For background, I am a developer at
>> Enthought working on the NumPy re-factoring project with Travis and Scott.
>> The immediate goal from our perspective is to re-factor the core of NumPy
>> into two architectural layers: a true core that is CPython-independent and
>> an interface layer that binds the core to CPython.
>>
>> A design proposal is now posted on the NumPy developer wiki:
>> http://projects.scipy.org/numpy/wiki/NumPyRefactoring
>>
>> The write-up is a fairly high-level description of what we think the split
>> will look like and how to deal with issues such as memory management. There
>> are also placeholders listed as 'TBD' where more investigation is still
>> needed and will be filled in over time. At the end of the page there is a
>> section on the C API with a link to a function-by-function breakdown of the
>> C API and whether the function belongs in the interface layer, the core, or
>> needs to be split between the two. All functions listed as 'core' will
>> continue to have an interface-level wrapper with the same name to ensure
>> source-compatibility.
>>
>> All of this, particularly the interface/core function designations, is a
>> first analysis and in flux. The goal is to get the information out and
>> elicit discussion and feedback from the community.
>>
>
> A few thoughts came to mind while reading the initial writeup.
>
> 1) How is the GIL handled in the callbacks?
>

How to handle the GIL still requires some thought. The cleanest way, IMHO, would be for the interface layer to release the lock prior to calling into the core, and then for each callback function in the interface to be responsible for re-acquiring it.
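A rough sketch of that release-then-reacquire rule, just to make it concrete. Everything here is hypothetical: the `demo_*` names are invented for illustration and are not part of NumPy or the proposal, and the lock is a plain flag standing in for the GIL (a real CPython interface layer would presumably use something like `PyEval_SaveThread`/`PyEval_RestoreThread` instead).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the interpreter lock (assumption: a simple flag, not the real GIL). */
static int demo_gil_held = 1;
static int demo_reacquire_count = 0;

static void demo_release_gil(void) { assert(demo_gil_held);  demo_gil_held = 0; }
static void demo_acquire_gil(void) { assert(!demo_gil_held); demo_gil_held = 1; }

/* Hypothetical callback type that the core uses to call back into the interface. */
typedef double (*demo_item_cb)(double item, void *ctx);

/* Hypothetical core routine: it never touches the lock itself. */
static double demo_core_map_sum(const double *data, size_t n,
                                demo_item_cb cb, void *ctx)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += cb(data[i], ctx);
    }
    return total;
}

/* Interface-layer callback: re-acquires the lock on entry and
   releases it again before returning control to the core. */
static double demo_iface_square_cb(double item, void *ctx)
{
    (void)ctx;
    demo_acquire_gil();
    demo_reacquire_count++;
    double result = item * item;   /* interpreter-level work would go here */
    demo_release_gil();
    return result;
}

/* Interface-layer entry point: releases the lock around the core call. */
static double demo_iface_map_sum(const double *data, size_t n)
{
    demo_release_gil();
    double total = demo_core_map_sum(data, n, demo_iface_square_cb, NULL);
    demo_acquire_gil();
    return total;
}
```

The counter is only there so the sketch can be checked; it records that the lock is re-acquired once per callback invocation.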
That's straightforward to define as a rule and should work well in general, but I'm worried about potential performance issues if/when a callback is called in a loop. A few optimization points are OK, but too many will just be a source of heisenbugs.

One other option is to just use the existing release/acquire macros in NumPy and redirect them to the interface layer. Any app that isn't CPython would just leave those callback pointers NULL. It's less disruptive but leaves some very CPython-specific behavior in the core.

> 2) What about error handling? That is tricky to get right, especially in C
> and with reference counting.
>

The error reporting functions in the core will likely look a lot like the CPython functions - they seem general enough. The biggest change is that the CPython ones take a PyObject as the error type. 99% of the errors reported in NumPy use one of a half-dozen pre-defined types that are easy to translate. There is at least one case where an object type (complex number) is created dynamically and used as the type, but so far I believe it's only one case.

The reference counting does get a little more complex because a core routine will need to decref the core object on error and the interface layer will need to similarly detect the error and potentially do its own decref. Each layer is still responsible for its own cleanup, but there are now two opportunities to introduce leaks.

> 3) Is there a general policy as to how the reference counting should be
> handled in specific functions? That is, who does the reference
> incrementing/decrementing?
>

Each layer should implement the existing policy for the objects that it manages. Essentially, a function can use its caller's reference but needs to increment the count if it's going to store it. A new instance is returned with a refcnt of 1 and the caller needs to clean it up when it's no longer needed.
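For illustration only, here is a minimal sketch of that ownership policy on a core-side object. The `Demo*` types and `demo_*` functions are made up for this example (they are not the proposed core API), and the live-object counter exists only so that leaks would be visible.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical core object carrying its own reference count. */
typedef struct {
    int refcnt;
    int value;
} DemoCoreArray;

static int demo_live_objects = 0;

/* A new instance is returned with a refcnt of 1; the caller owns it. */
static DemoCoreArray *demo_core_new(int value)
{
    DemoCoreArray *a = malloc(sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    a->refcnt = 1;
    a->value = value;
    demo_live_objects++;
    return a;
}

static void demo_core_incref(DemoCoreArray *a) { a->refcnt++; }

static void demo_core_decref(DemoCoreArray *a)
{
    assert(a->refcnt > 0);
    if (--a->refcnt == 0) {
        demo_live_objects--;
        free(a);
    }
}

/* Only uses the caller's reference: no change to the count. */
static int demo_peek(const DemoCoreArray *a) { return a->value; }

/* Stores the object, so it must take a reference of its own. */
typedef struct {
    DemoCoreArray *stored;
} DemoHolder;

static void demo_holder_set(DemoHolder *h, DemoCoreArray *a)
{
    demo_core_incref(a);              /* take our reference first */
    if (h->stored != NULL) {
        demo_core_decref(h->stored);  /* drop whatever was held before */
    }
    h->stored = a;
}
```

In use, the creator calls `demo_core_decref` once it is done with the object, and the holder drops its own reference separately, so each owner cleans up exactly what it took.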
But that means that if the core returns a new NpyArray instance to the interface layer, the receiving function in the interface must allocate a PyObject wrapper around it and set the wrapper's refcnt to 1 before returning it.

Is that what you were asking?

> 4) Boost has some reference counted pointers, have you looked at them? C++
> is admittedly a very different animal for this sort of application.
>

There is also a need to replace the usage of PyDict and other uses of CPython for basic data structures that aren't present in C. Having access to C++ for this and for reference counting would be nice, but it has the potential to break builds for everyone who uses the C API. I think it's worth discussing for the future, but it is a bigger (and possibly more contentious) change than we are able to take on for this project.

> Chuck
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>