On 1/18/19, Steven D'Aprano <st...@pearwood.info> wrote:
> On Thu, Jan 17, 2019 at 07:50:51AM -0600, eryk sun wrote:
>>
>> It's kind of dangerous to pass an object to C without an increment of
>> its reference count.
>
> "Kind of dangerous?" How dangerous?

I take that back. Dangerous is too strong a word. It can be managed if
we're careful to avoid expressions like c_function(id(f())), where the
temporary returned by f() can be deallocated before the C function uses
the address. Using py_object simply avoids that problem.

Bear with me while I make a few more comments about py_object, even
though it's straying off topic.

For a type "O" argument (i.e. py_object is in the function's
`argtypes`), we might be able to borrow the reference from the argument
tuple. As implemented, however, the argument actually keeps its own
reference. For example, we can observe this by calling the from_param
method:

    >>> b = bytearray(b'spam')
    >>> arg = ctypes.py_object.from_param(b)
    >>> print(arg)
    <cparam 'O' at 0x7f32a49699b0>
    >>> print(arg._obj)
    bytearray(b'spam')

This is due to the type "O" setfunc, which needs to keep a reference to
the object when setting the value of a py_object instance. The
reference is stored as the _objects attribute. (For non-simple pointer
and aggregate types, _objects is instead a dict keyed by the index as a
hexadecimal string.)

(The getfunc and setfunc of a simple ctypes object are called to get
and set the value. This also covers cases in which we don't have an
actual py_object instance, such as function call arguments; pointer and
array indexes; and struct and union fields. These functions are defined
in Modules/_ctypes/cfield.c.)

IMO, a downside of py_object is that it's a simple type, so the getfunc
gets called automatically when getting fields or indexes. This is
annoying for py_object since a NULL value raises ValueError. Returning
None in this case isn't possible, in contrast to other simple pointer
types. We can work around this by subclassing py_object.

For example:

    >>> a1 = (ctypes.py_object * 1)()
    >>> a1[0]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: PyObject is NULL

    >>> py_object = type('py_object', (ctypes.py_object,), {})
    >>> a2 = (py_object * 1)()
    >>> a2[0]
    <py_object object at 0x7f10dc7d9158>

Then, like all ctypes pointers, a false boolean value means it's NULL:

    >>> bool(a2[0])
    False
    >>> a2[0] = b'spam'
    >>> bool(a2[0])
    True

py_object doesn't help if a library holds onto the pointer and tries to
use it later on. For example, with Python's C API there are functions
that 'steal' a reference (with the assumption that it's a newly created
object, in which case it's more like 'claiming'), such as
PyTuple_SetItem. In this case, we need to increment the reference count
via Py_IncRef.

py_object can be returned from a callback without leaking a reference,
assuming the library manages the new reference. In contrast, other
types that need memory support have to leak a reference (e.g.
c_wchar_p, i.e. type "Z", needs a capsule object for the wchar_t
buffer). In case of a leak, we get warned with RuntimeWarning('memory
leak in callback function.').

> If I am reading this correctly, I think you are saying that using id()
> in this way is never(?) correct.

Yes, it's incorrect, but I've been guilty of using id() like this, too,
because it's convenient. Perhaps we could provide a function that's
explicitly specified to return the address, if implemented. Maybe call
it sys.getaddress()?

In my first reply, I provided two alternatives that use ctypes to
return the address instead of id(). So there's that as well. The fine
print is that ctypes is optional in the standard library. Platforms and
implementations don't have to support it.
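
To make the idea of getting an address through ctypes rather than id() concrete, here is a minimal sketch. It assumes CPython with ctypes available. The helper names address_via_buffer and address_via_capi are hypothetical, made up for this illustration, and the two approaches are not necessarily the ones from the earlier reply.

```python
import ctypes


def address_via_buffer(obj):
    """Return the pointer value stored in a py_object that wraps obj."""
    # The py_object instance keeps its own reference to obj while it lives.
    holder = ctypes.py_object(obj)
    # Reinterpret the PyObject* slot as a void* and read it as an integer.
    return ctypes.c_void_p.from_buffer(holder).value


# Hypothetical binding for this sketch: PyLong_FromVoidPtr is declared as
# taking a py_object, so ctypes passes the object's pointer and the C API
# returns it boxed as a Python int.
_from_void_ptr = ctypes.PYFUNCTYPE(ctypes.py_object, ctypes.py_object)(
    ("PyLong_FromVoidPtr", ctypes.pythonapi)
)


def address_via_capi(obj):
    """Return the address of obj as computed by the C API."""
    return _from_void_ptr(obj)


b = bytearray(b"spam")
print(hex(address_via_buffer(b)))
print(address_via_buffer(b) == address_via_capi(b))
```

On CPython both helpers should agree with id(b), but that equality is an implementation detail. The point of going through py_object is that the object is referenced for the duration of the call, which id(f()) on a temporary does not guarantee.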