There are three separate patches in this message plus some remarks on "stealing" reference counts at the bottom.
On Tue, 8 Jul 2008, Travis E. Oliphant wrote: > Michael Abbott wrote: > > On Tue, 8 Jul 2008, Travis E. Oliphant wrote: > >> The first part of this patch is good. The second is not needed. > > I don't see that. > Don't forget that PyArray_FromAny consumes the reference even if it > returns with an error. Oh dear. That's not good. Well then, I need to redo my patch. Here's the new patch for ..._arrtype_new: commit 431d99f40ca200201ba59c74a88b0bd972022ff0 Author: Michael Abbott <[EMAIL PROTECTED]> Date: Tue Jul 8 10:10:59 2008 +0100 Another reference leak using PyArray_DescrFromType This change fixes two issues: a spurious ADDREF on a typecode returned from PyArray_DescrFromType and an awkward interaction with PyArray_FromAny. diff --git a/numpy/core/src/scalartypes.inc.src b/numpy/core/src/scalartypes.inc.src index 3feefc0..7d3e562 100644 --- a/numpy/core/src/scalartypes.inc.src +++ b/numpy/core/src/scalartypes.inc.src @@ -1886,7 +1886,6 @@ static PyObject * if (!PyArg_ParseTuple(args, "|O", &obj)) return NULL; typecode = PyArray_DescrFromType([EMAIL PROTECTED]@); - Py_INCREF(typecode); if (obj == NULL) { #if @default@ == 0 char *mem; @@ -1903,8 +1902,12 @@ static PyObject * goto finish; } + Py_XINCREF(typecode); arr = PyArray_FromAny(obj, typecode, 0, 0, FORCECAST, NULL); - if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) return arr; + if ((arr==NULL) || (PyArray_NDIM(arr) > 0)) { + Py_XDECREF(typecode); + return arr; + } robj = PyArray_Return((PyArrayObject *)arr); finish: I don't think we can dispense with the extra INCREF and DECREF. Looking at the uses of PyArray_FromAny I can see the motivation for this design: core/include/numpy/ndarrayobject.h has a lot of calls which take a value returned by PyArray_DescrFromType as argument. This has prompted me to take a trawl through the code to see what else is going on, and I note a couple more issues with patches below. In the patch below the problem being fixed is that the first call to PyArray_FromAny can result in the erasure of dtype *before* Py_INCREF is called. Perhaps you can argue that this only occurs when NULL is returned... diff --git a/numpy/core/blasdot/_dotblas.c b/numpy/core/blasdot/_dotblas.c index e2619b6..0b34ec7 100644 --- a/numpy/core/blasdot/_dotblas.c +++ b/numpy/core/blasdot/_dotblas.c @@ -234,9 +234,9 @@ dotblas_matrixproduct(PyObject *dummy, PyObject *args) } dtype = PyArray_DescrFromType(typenum); - ap1 = (PyArrayObject *)PyArray_FromAny(op1, dtype, 0, 0, ALIGNED, NULL); - if (ap1 == NULL) return NULL; Py_INCREF(dtype); + ap1 = (PyArrayObject *)PyArray_FromAny(op1, dtype, 0, 0, ALIGNED, NULL); + if (ap1 == NULL) { Py_DECREF(dtype); return NULL; } ap2 = (PyArrayObject *)PyArray_FromAny(op2, dtype, 0, 0, ALIGNED, NULL); if (ap2 == NULL) goto fail; The next patch deals with an interestingly subtle memory leak in _string_richcompare where if casting to a common type fails then a reference count will leaked. Actually this one has nothing to do with PyArray_FromAny, but I spotted it in passing. diff --git a/numpy/core/src/arrayobject.c b/numpy/core/src/arrayobject.c index ee4e945..2294b8d 100644 --- a/numpy/core/src/arrayobject.c +++ b/numpy/core/src/arrayobject.c @@ -4715,7 +4715,6 @@ _strings_richcompare(PyArrayObject *self, PyArrayObject *other, int cmp_op, PyObject *new; if (self->descr->type_num == PyArray_STRING && \ other->descr->type_num == PyArray_UNICODE) { - Py_INCREF(other); Py_INCREF(other->descr); new = PyArray_FromAny((PyObject *)self, other->descr, 0, 0, 0, NULL); @@ -4723,16 +4722,17 @@ _strings_richcompare(PyArrayObject *self, PyArrayObject *other, int cmp_op, return NULL; } self = (PyArrayObject *)new; + Py_INCREF(other); } else if (self->descr->type_num == PyArray_UNICODE && \ other->descr->type_num == PyArray_STRING) { - Py_INCREF(self); Py_INCREF(self->descr); new = PyArray_FromAny((PyObject *)other, self->descr, 0, 0, 0, NULL); if (new == NULL) { return NULL; } + Py_INCREF(self); other = (PyArrayObject *)new; } else { I really don't think that this design of reference count handling in PyArray_FromAny (and consequently PyArray_CheckFromAny) is a good idea. Unfortunately these seem to be part of the published API, so presumably it's too late to change this? (Otherwise I might see how the corresponding patch comes out.) Not only is this not a good idea, it's not documented in the API documentation (I'm referring to the "Guide to NumPy" book), although at least the inline comment on the implementation of PyArray_FromAny does mention that the reference is "stolen". I've been trying to find some documentation on stealing references. The Python C API reference (http://docs.python.org/api/refcountDetails.html) says Few functions steal references; the two notable exceptions are PyList_SetItem() and PyTuple_SetItem() An interesting essay on reference counting is at http://lists.blender.org/pipermail/bf-python/2005-September/003092.html In conclusion, I can't find much about the role of stealing in reference count management, but it's such a source of surprise (and frankly doesn't actually work out all that well in numpy) that I don't think it's justified. If PyList_SetItem() and PyTuple_SetItem() could remain the only examples of this it would be good. _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion