Re: [Python-Dev] Py_CLEAR to avoid crashes
>> Most Py_DECREF calls are probably okay but it's going to be hard >> to find the ones that are not. > > I suppose Py_DECREF is not the only source of trouble. Many calls > to the Python API can end up calling arbitrary user code (via > __getattr__, __getitem__, etc.). Whenever an object does that, it > must be prepared to be accessed from user code. I'm guessing there > are many subtle bugs of this nature lurking. Py_DECREF is perhaps > the most common though. Maybe renaming it to > Py_DECREF_AND_RUN_EVIL_USER_CODE would help. ;-) But that's unrelated to this issue. In those other cases, the refcount won't be zero, so the object is still there. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Py_CLEAR to avoid crashes
On Feb 18, 2008 4:52 PM, Neil Schemenauer <[EMAIL PROTECTED]> wrote: > That sucks. Most Py_DECREF calls are probably okay but it's going > to be hard to find the ones that are not. I can't think of anything > we can do to make this trap harder to fall into. Even using > Py_CLEAR as a blunt tool is not a total solution. You could still > end up with a null pointer dereference if the code is not written > carefully. > Container types (particularly lists) go through great lengths to postpone object deletion. For example, to delete a slice from a list all of the items must be copied to a temporary array, then the list object's pointers are modified, then all the Py_DECREF's are called just before returning. I have always seen this as a robustness versus efficiency issue. It's theoretically possible to set things up so that reference counter decrements are actually postponed until after the C method/slot returns, but it's slower than doing it immediately. I wonder if adding support for postponed decrements (without making it mandatory) would at least make the trap harder to fall into. For example: - maintain a global array of pending decrefs - before calling into any C method/slot, save the index of the current end-of-array (in a local C variable on the stack) - call the C method, which may call Py_DECREF_LATER(x) to append x to the global array - when the C method returns, decref anything newly appended to the array The array would grow and shrink just as a list does (O(1) amortized time to add/remove a pointer). This would simplify a number of places in listobject.c as well as remove the need for Py_TRASHCAN_*. It would be entirely optional, so anyone who is very careful and wants the speed of Py_DECREF can have it. Also, the deferment is very brief, since the decrefs occur right after the C method returns. The downside is having to store and check the global array length on every C method call (basically 3 machine instructions). The machine instructions aren't so bad, but I'm not sure about the effects on the CPU cache. So, like I said, a robustness versus performance trade-off. :-( -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises LLC ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Py_CLEAR to avoid crashes
> A simple way to do this would be to push objects whose > refcounts had reached 0 onto a list instead of finalizing them > immediately, and have PyEval_EvalFrameEx periodically swap > in a new to-delete list and delete the objects on the old one. Some of the memory management threads discussed something similar to this, and pointed to IBM papers on Java. By adding states like "tenatively finalizable", the cost of using multiple processors was reduced. The down side is that objects which could be released (and recycled) immediately won't be -- which slows down a fair number of real-world programs that are used to the CPython refcount model. If the resource not being immediately released is scarce (such as file handles), it gets even worse. -jJ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Py_CLEAR to avoid crashes
I wrote: > Most Py_DECREF calls are probably okay but it's going to be hard > to find the ones that are not. I suppose Py_DECREF is not the only source of trouble. Many calls to the Python API can end up calling arbitrary user code (via __getattr__, __getitem__, etc.). Whenever an object does that, it must be prepared to be accessed from user code. I'm guessing there are many subtle bugs of this nature lurking. Py_DECREF is perhaps the most common though. Maybe renaming it to Py_DECREF_AND_RUN_EVIL_USER_CODE would help. ;-) Neil ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Py_CLEAR to avoid crashes
> That sucks. Most Py_DECREF calls are probably okay but it's going > to be hard to find the ones that are not. Methinks that egrep 'DECREF\([a-zA-Z0-9_]->[a-zA-Z0-9_]+\)' */*.c gives a good overview - you should never DECREF a variable on heap. In the trunk, this command finds 36 candidates. Now, some of them are in tp_dealloc methods, so they shouldn't cause problems - but it can't hurt to replace them with Py_CLEAR, either. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Py_CLEAR to avoid crashes
On Mon, Feb 18, 2008 at 05:48:57PM +0100, Amaury Forgeot d'Arc wrote: > For example, in exception.c, BaseException_init() starts with the instruction: > Py_DECREF(self->args); > this may call __del__ on self->args Ah, I understand now. We are not talking about tp_dealloc methods (the GC takes great pains to avoid this scenario). However, any object that calls Py_DECREF outside of its tp_dealloc method must be prepared for finalizers to access it in arbitrary ways. That sucks. Most Py_DECREF calls are probably okay but it's going to be hard to find the ones that are not. I can't think of anything we can do to make this trap harder to fall into. Even using Py_CLEAR as a blunt tool is not a total solution. You could still end up with a null pointer dereference if the code is not written carefully. Neil ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Py_CLEAR to avoid crashes
On Feb 17, 2008 12:29 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > Jeffrey Yasskin wrote: > > On Feb 16, 2008 3:12 PM, Amaury Forgeot d'Arc <[EMAIL PROTECTED]> wrote: > >> Should we however intensively search and correct all of them? > >> Is there a clever way to prevent these problems globally, for example > >> by delaying finalizers "just a little"? > > > > A simple way to do this would be to push objects whose refcounts had > > reached 0 onto a list instead of finalizing them immediately, and have > > PyEval_EvalFrameEx periodically swap in a new to-delete list and > > delete the objects on the old one. > > -1. This would only soften the problem in a shallow way. It should still > be possible to create cases where the final cleanup of the object comes > "too early", e.g. when some caller of PyEval_EvalFrameEx still carries > a pointer to some object that got deleted, and then still some code can > get hold of the then-deleted object. It will just be more difficult to > trigger such crashes, depending on the period in which objects are > cleaned. Right. Changing introducing these bugs from "easy" to "possible" sounds like an improvement. Making them harder to trigger has both good and bad effects: they're harder to exploit and harder to reproduce. If we delete on every iteration, it only costs a couple memory accesses more, and should eliminate most of the non-determinism. Anyway, I saw an approach like this work well on a server I used to work on, but there were differences. In particular, there was no way to call a function to trigger deletions; you had to return out to the main loop, which made it a little more reliable than it would be in Python. I suspect it would still fix nearly all of the bugs Amaury is finding since Py_DECREF would no longer return control to the interpreter, but I could be wrong. > The only sane way to never touch deleted objects is to have no > references to them on heap anymore, and to not touch stack references > after the DECREF. > > Regards, > Martin > -- Namasté, Jeffrey Yasskin http://jeffrey.yasskin.info/ ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Py_CLEAR to avoid crashes
Hello, Neil Schemenauer wrote: > Nick Coghlan <[EMAIL PROTECTED]> wrote: > > The problem is calls to Py_DECREF(self->attr) where some of the code > > invoked by __del__ manages to find a way back around to reference > > self->attr and gets access to a half-deleted object. > > Don't you mean "__del__ manages to find a way back around to self"? > If so, how can that happen? If such a reference path exists, the > reference count of self should not be zero. I don't understand why > Py_CLEAR is necessary outside of tp_clear functions. Of course we are speaking of different objects. For example, in exception.c, BaseException_init() starts with the instruction: Py_DECREF(self->args); this may call __del__ on self->args, which can execute arbitrary python code - including access to the now-invalid "args" member of the exception. class S: def __del__(self): print e.args e = BaseException(1, S()) e.__init__("hello") # segfault -- Amaury Forgeot d'Arc ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Py_CLEAR to avoid crashes
Nick Coghlan <[EMAIL PROTECTED]> wrote: > The problem is calls to Py_DECREF(self->attr) where some of the code > invoked by __del__ manages to find a way back around to reference > self->attr and gets access to a half-deleted object. Don't you mean "__del__ manages to find a way back around to self"? If so, how can that happen? If such a reference path exists, the reference count of self should not be zero. I don't understand why Py_CLEAR is necessary outside of tp_clear functions. Neil ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Py_CLEAR to avoid crashes
Greg Ewing wrote: > Martin v. Löwis wrote: >> when some caller of PyEval_EvalFrameEx still carries >> a pointer to some object that got deleted, and then still some code can >> get hold of the then-deleted object. > > I seem to have missed the beginning of this discussion. > I don't see what the problem is here. If a caller needs > a pointer to an object, shouldn't it be holding a > counted reference to it? The problem is calls to Py_DECREF(self->attr) where some of the code invoked by __del__ manages to find a way back around to reference self->attr and gets access to a half-deleted object. Amaury fixed a few of these recently by replacing the Py_DECREF calls with Py_CLEAR calls (and added the relevant pathological destructors to the test suite), but was wondering if there was a way to be more systematic about fixing them. About the only idea I have is to grep the source for all calls to Py_DECREF that contain a pointer deference and manually check them to see if they should use Py_CLEAR instead. Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://www.boredomandlaziness.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Py_CLEAR to avoid crashes
Martin v. Löwis wrote: > when some caller of PyEval_EvalFrameEx still carries > a pointer to some object that got deleted, and then still some code can > get hold of the then-deleted object. I seem to have missed the beginning of this discussion. I don't see what the problem is here. If a caller needs a pointer to an object, shouldn't it be holding a counted reference to it? -- Greg ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Py_CLEAR to avoid crashes
Jeffrey Yasskin wrote: > On Feb 16, 2008 3:12 PM, Amaury Forgeot d'Arc <[EMAIL PROTECTED]> wrote: >> Should we however intensively search and correct all of them? >> Is there a clever way to prevent these problems globally, for example >> by delaying finalizers "just a little"? > > A simple way to do this would be to push objects whose refcounts had > reached 0 onto a list instead of finalizing them immediately, and have > PyEval_EvalFrameEx periodically swap in a new to-delete list and > delete the objects on the old one. -1. This would only soften the problem in a shallow way. It should still be possible to create cases where the final cleanup of the object comes "too early", e.g. when some caller of PyEval_EvalFrameEx still carries a pointer to some object that got deleted, and then still some code can get hold of the then-deleted object. It will just be more difficult to trigger such crashes, depending on the period in which objects are cleaned. The only sane way to never touch deleted objects is to have no references to them on heap anymore, and to not touch stack references after the DECREF. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Py_CLEAR to avoid crashes
On Feb 16, 2008 3:12 PM, Amaury Forgeot d'Arc <[EMAIL PROTECTED]> wrote: > Should we however intensively search and correct all of them? > Is there a clever way to prevent these problems globally, for example > by delaying finalizers "just a little"? A simple way to do this would be to push objects whose refcounts had reached 0 onto a list instead of finalizing them immediately, and have PyEval_EvalFrameEx periodically swap in a new to-delete list and delete the objects on the old one. A linked list would cost an extra pointer in PyObject_HEAD, but a growable array would only cost allocations, which would be amortized over the allocations of the objects you're deleting, so that's probably the way to go. A fixed-size queue that just delayed finalization by a constant number of objects would usually work without any allocations, but there would sometimes be single finalizers that recursively freed too many other objects, which would defeat the delay. Jeffrey ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Py_CLEAR to avoid crashes
Brett Cannon wrote: > On Feb 16, 2008 3:12 PM, Amaury Forgeot d'Arc <[EMAIL PROTECTED]> wrote: >> Is there a clever way to prevent these problems globally, for example >> by delaying finalizers "just a little"? > > Beats me. Finalizers can be delayed 'just a little' with a macro called Py_CLEAR ;) (in other words, what Amaury is doing already is the best solution I am aware of) Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://www.boredomandlaziness.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Py_CLEAR to avoid crashes
On Feb 16, 2008 3:12 PM, Amaury Forgeot d'Arc <[EMAIL PROTECTED]> wrote: > Hello, > > I think I found a prolific vein of potential crashes throughout the > python interpreter. > The idea is inspired from the discussion Issue1020188 and is quite > simple: find a Py_DECREF(xxx->yyy), where xxx represents any kind of > Python object, and craft a __del__ method so that the same function is > recursively called with the same object. > > I recently submitted corrections for 3 problems found this way. Here > are two more examples of this method: > > # > class Special: > def __del__(self): > print a.args > > class MyException(BaseException): > def __init__(self): > global a > a = self > BaseException.__init__(self, Special(), 0) > BaseException.__init__(self, "other", 1) > > MyException() # segfault > # > import sys > class Special2(str): > def __del__(self): > f.__init__("@temp", "w") > > f = file(Special2("@temp"), "w") > f.__init__("@temp", "w") # segfault > # > > Issue1020188 (which is still open btw) deals with member reset to > NULL; but any modification of the pointer is potentially concerned. > > And the correction is always the same: use Py_CLEAR, or a similar > construction that detach the value from the structure before > defcref'ing it. > > I think it would help a lot of code if we had a safe standard way to > set struct members from C. A macro like the following: > Py_SETREF(lvalue, newobj) > (whatever its name) would perform the assignment and expand to code > equivalent to: > PyObject* oldobj = lvalue; > Py_INCREF(newobj); > lvalue = newobj; > Py_DECREF(oldobj); > > I am currently searching for all places that could benefit of this, > but I am not sure to find a test case for every potential crash. > Most of the time, it is very unlikely that "normal" python code can > fall in these traps (who calls f.__init__ in a __del__ method?), with > the exception of the one corrected by r60871. > Testing C code like this can be tough. Usually the crasher is used as the test in hopes that any fix is general enough that if it fails that same crasher will rear its ugly head again. > Should we however intensively search and correct all of them? I think it's fine to. > Is there a clever way to prevent these problems globally, for example > by delaying finalizers "just a little"? Beats me. -Brett ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Py_CLEAR to avoid crashes
Hello, I think I found a prolific vein of potential crashes throughout the python interpreter. The idea is inspired from the discussion Issue1020188 and is quite simple: find a Py_DECREF(xxx->yyy), where xxx represents any kind of Python object, and craft a __del__ method so that the same function is recursively called with the same object. I recently submitted corrections for 3 problems found this way. Here are two more examples of this method: # class Special: def __del__(self): print a.args class MyException(BaseException): def __init__(self): global a a = self BaseException.__init__(self, Special(), 0) BaseException.__init__(self, "other", 1) MyException() # segfault # import sys class Special2(str): def __del__(self): f.__init__("@temp", "w") f = file(Special2("@temp"), "w") f.__init__("@temp", "w") # segfault # Issue1020188 (which is still open btw) deals with member reset to NULL; but any modification of the pointer is potentially concerned. And the correction is always the same: use Py_CLEAR, or a similar construction that detach the value from the structure before defcref'ing it. I think it would help a lot of code if we had a safe standard way to set struct members from C. A macro like the following: Py_SETREF(lvalue, newobj) (whatever its name) would perform the assignment and expand to code equivalent to: PyObject* oldobj = lvalue; Py_INCREF(newobj); lvalue = newobj; Py_DECREF(oldobj); I am currently searching for all places that could benefit of this, but I am not sure to find a test case for every potential crash. Most of the time, it is very unlikely that "normal" python code can fall in these traps (who calls f.__init__ in a __del__ method?), with the exception of the one corrected by r60871. Should we however intensively search and correct all of them? Is there a clever way to prevent these problems globally, for example by delaying finalizers "just a little"? -- Amaury Forgeot d'Arc ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com