Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-18 Thread Martin v. Löwis
>> Most Py_DECREF calls are probably okay but it's going to be hard
>> to find the ones that are not.
> 
> I suppose Py_DECREF is not the only source of trouble.  Many calls
> to the Python API can end up calling arbitrary user code (via
> __getattr__, __getitem__, etc.).  Whenever an object does that, it
> must be prepared to be accessed from user code.  I'm guessing there
> are many subtle bugs of this nature lurking.  Py_DECREF is perhaps
> the most common though.  Maybe renaming it to
> Py_DECREF_AND_RUN_EVIL_USER_CODE would help. ;-)

But that's unrelated to this issue. In those other cases, the refcount
won't be zero, so the object is still there.

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-18 Thread Daniel Stutzbach
On Feb 18, 2008 4:52 PM, Neil Schemenauer <[EMAIL PROTECTED]> wrote:

> That sucks.  Most Py_DECREF calls are probably okay but it's going
> to be hard to find the ones that are not.  I can't think of anything
> we can do to make this trap harder to fall into.  Even using
> Py_CLEAR as a blunt tool is not a total solution. You could still
> end up with a null pointer dereference if the code is not written
> carefully.
>

Container types (particularly lists) go through great lengths to postpone
object deletion.  For example, to delete a slice from a list all of the
items must be copied to a temporary array, then the list object's pointers
are modified, then all the Py_DECREF's are called just before returning.

I have always seen this as a robustness versus efficiency issue.  It's
theoretically possible to set things up so that reference counter decrements
are actually postponed until after the C method/slot returns, but it's
slower than doing it immediately.  I wonder if adding support for postponed
decrements (without making it mandatory) would at least make the trap harder
to fall into.

For example:

- maintain a global array of pending decrefs
- before calling into any C method/slot, save the index of the current
end-of-array (in a local C variable on the stack)
- call the C method, which may call Py_DECREF_LATER(x) to append x to the
global array
- when the C method returns, decref anything newly appended to the array

The array would grow and shrink just as a list does (O(1) amortized time to
add/remove a pointer).

This would simplify a number of places in listobject.c as well as remove the
need for Py_TRASHCAN_*.  It would be entirely optional, so anyone who is
very careful and wants the speed of Py_DECREF can have it.  Also, the
deferment is very brief, since the decrefs occur right after the C method
returns.

The downside is having to store and check the global array length on every C
method call (basically 3 machine instructions).  The machine instructions
aren't so bad, but I'm not sure about the effects on the CPU cache.

So, like I said, a robustness versus performance trade-off. :-(

-- 
Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises LLC
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Py_CLEAR to avoid crashes

2008-02-18 Thread Jim Jewett
> A simple way to do this would be to push objects whose
> refcounts had reached 0 onto a list instead of finalizing them
> immediately, and have PyEval_EvalFrameEx periodically swap
> in a new to-delete list and delete the objects on the old one.

Some of the memory management threads discussed something similar to
this, and pointed to IBM papers on Java.  By adding states like
"tenatively finalizable", the cost of using multiple processors was
reduced.

The down side is that objects which could be released (and recycled)
immediately won't be -- which slows down a fair number of real-world
programs that are used to the CPython refcount model.  If the resource
not being immediately released is scarce (such as file handles), it
gets even worse.

-jJ
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-18 Thread Neil Schemenauer
I wrote:
> Most Py_DECREF calls are probably okay but it's going to be hard
> to find the ones that are not.

I suppose Py_DECREF is not the only source of trouble.  Many calls
to the Python API can end up calling arbitrary user code (via
__getattr__, __getitem__, etc.).  Whenever an object does that, it
must be prepared to be accessed from user code.  I'm guessing there
are many subtle bugs of this nature lurking.  Py_DECREF is perhaps
the most common though.  Maybe renaming it to
Py_DECREF_AND_RUN_EVIL_USER_CODE would help. ;-)

  Neil

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-18 Thread Martin v. Löwis
> That sucks.  Most Py_DECREF calls are probably okay but it's going
> to be hard to find the ones that are not.

Methinks that

egrep 'DECREF\([a-zA-Z0-9_]->[a-zA-Z0-9_]+\)' */*.c

gives a good overview - you should never DECREF a variable on heap.
In the trunk, this command finds 36 candidates. Now, some of them
are in tp_dealloc methods, so they shouldn't cause problems - but
it can't hurt to replace them with Py_CLEAR, either.

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-18 Thread Neil Schemenauer
On Mon, Feb 18, 2008 at 05:48:57PM +0100, Amaury Forgeot d'Arc wrote:
> For example, in exception.c, BaseException_init() starts with the instruction:
> Py_DECREF(self->args);
> this may call __del__ on self->args

Ah, I understand now.  We are not talking about tp_dealloc methods
(the GC takes great pains to avoid this scenario).  However, any
object that calls Py_DECREF outside of its tp_dealloc method must be
prepared for finalizers to access it in arbitrary ways.

That sucks.  Most Py_DECREF calls are probably okay but it's going
to be hard to find the ones that are not.  I can't think of anything
we can do to make this trap harder to fall into.  Even using
Py_CLEAR as a blunt tool is not a total solution. You could still
end up with a null pointer dereference if the code is not written
carefully.

  Neil
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-18 Thread Jeffrey Yasskin
On Feb 17, 2008 12:29 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Jeffrey Yasskin wrote:
> > On Feb 16, 2008 3:12 PM, Amaury Forgeot d'Arc <[EMAIL PROTECTED]> wrote:
> >> Should we however intensively search and correct all of them?
> >> Is there a clever way to prevent these problems globally, for example
> >> by delaying finalizers "just a little"?
> >
> > A simple way to do this would be to push objects whose refcounts had
> > reached 0 onto a list instead of finalizing them immediately, and have
> > PyEval_EvalFrameEx periodically swap in a new to-delete list and
> > delete the objects on the old one.
>
> -1. This would only soften the problem in a shallow way. It should still
> be possible to create cases where the final cleanup of the object comes
> "too early", e.g. when some caller of PyEval_EvalFrameEx still carries
> a pointer to some object that got deleted, and then still some code can
> get hold of the then-deleted object. It will just be more difficult to
> trigger such crashes, depending on the period in which objects are
> cleaned.

Right. Changing introducing these bugs from "easy" to "possible"
sounds like an improvement. Making them harder to trigger has both
good and bad effects: they're harder to exploit and harder to
reproduce. If we delete on every iteration, it only costs a couple
memory accesses more, and should eliminate most of the
non-determinism.

Anyway, I saw an approach like this work well on a server I used to
work on, but there were differences. In particular, there was no way
to call a function to trigger deletions; you had to return out to the
main loop, which made it a little more reliable than it would be in
Python. I suspect it would still fix nearly all of the bugs Amaury is
finding since Py_DECREF would no longer return control to the
interpreter, but I could be wrong.

> The only sane way to never touch deleted objects is to have no
> references to them on heap anymore, and to not touch stack references
> after the DECREF.
>
> Regards,
> Martin
>



-- 
Namasté,
Jeffrey Yasskin
http://jeffrey.yasskin.info/
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-18 Thread Amaury Forgeot d'Arc
Hello,
Neil Schemenauer wrote:
> Nick Coghlan <[EMAIL PROTECTED]> wrote:
> > The problem is calls to Py_DECREF(self->attr) where some of the code
> > invoked by __del__ manages to find a way back around to reference
> > self->attr and gets access to a half-deleted object.
>
> Don't you mean "__del__ manages to find a way back around to self"?
> If so, how can that happen?  If such a reference path exists, the
> reference count of self should not be zero.  I don't understand why
> Py_CLEAR is necessary outside of tp_clear functions.

Of course we are speaking of different objects.

For example, in exception.c, BaseException_init() starts with the instruction:
Py_DECREF(self->args);
this may call __del__ on self->args, which can execute arbitrary
python code - including access to the now-invalid "args" member of the
exception.

class S:
  def __del__(self):
 print e.args

e = BaseException(1, S())
e.__init__("hello")   # segfault

-- 
Amaury Forgeot d'Arc
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-18 Thread Neil Schemenauer
Nick Coghlan <[EMAIL PROTECTED]> wrote:
> The problem is calls to Py_DECREF(self->attr) where some of the code 
> invoked by __del__ manages to find a way back around to reference 
> self->attr and gets access to a half-deleted object.

Don't you mean "__del__ manages to find a way back around to self"?
If so, how can that happen?  If such a reference path exists, the
reference count of self should not be zero.  I don't understand why
Py_CLEAR is necessary outside of tp_clear functions.

  Neil

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-18 Thread Nick Coghlan
Greg Ewing wrote:
> Martin v. Löwis wrote:
>> when some caller of PyEval_EvalFrameEx still carries
>> a pointer to some object that got deleted, and then still some code can
>> get hold of the then-deleted object.
> 
> I seem to have missed the beginning of this discussion.
> I don't see what the problem is here. If a caller needs
> a pointer to an object, shouldn't it be holding a
> counted reference to it?

The problem is calls to Py_DECREF(self->attr) where some of the code 
invoked by __del__ manages to find a way back around to reference 
self->attr and gets access to a half-deleted object.

Amaury fixed a few of these recently by replacing the Py_DECREF calls 
with Py_CLEAR calls (and added the relevant pathological destructors to 
the test suite), but was wondering if there was a way to be more 
systematic about fixing them. About the only idea I have is to grep the 
source for all calls to Py_DECREF that contain a pointer deference and 
manually check them to see if they should use Py_CLEAR instead.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-17 Thread Greg Ewing
Martin v. Löwis wrote:
> when some caller of PyEval_EvalFrameEx still carries
> a pointer to some object that got deleted, and then still some code can
> get hold of the then-deleted object.

I seem to have missed the beginning of this discussion.
I don't see what the problem is here. If a caller needs
a pointer to an object, shouldn't it be holding a
counted reference to it?

--
Greg
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-17 Thread Martin v. Löwis
Jeffrey Yasskin wrote:
> On Feb 16, 2008 3:12 PM, Amaury Forgeot d'Arc <[EMAIL PROTECTED]> wrote:
>> Should we however intensively search and correct all of them?
>> Is there a clever way to prevent these problems globally, for example
>> by delaying finalizers "just a little"?
> 
> A simple way to do this would be to push objects whose refcounts had
> reached 0 onto a list instead of finalizing them immediately, and have
> PyEval_EvalFrameEx periodically swap in a new to-delete list and
> delete the objects on the old one.

-1. This would only soften the problem in a shallow way. It should still
be possible to create cases where the final cleanup of the object comes
"too early", e.g. when some caller of PyEval_EvalFrameEx still carries
a pointer to some object that got deleted, and then still some code can
get hold of the then-deleted object. It will just be more difficult to
trigger such crashes, depending on the period in which objects are
cleaned.

The only sane way to never touch deleted objects is to have no 
references to them on heap anymore, and to not touch stack references
after the DECREF.

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-17 Thread Jeffrey Yasskin
On Feb 16, 2008 3:12 PM, Amaury Forgeot d'Arc <[EMAIL PROTECTED]> wrote:
> Should we however intensively search and correct all of them?
> Is there a clever way to prevent these problems globally, for example
> by delaying finalizers "just a little"?

A simple way to do this would be to push objects whose refcounts had
reached 0 onto a list instead of finalizing them immediately, and have
PyEval_EvalFrameEx periodically swap in a new to-delete list and
delete the objects on the old one. A linked list would cost an extra
pointer in PyObject_HEAD, but a growable array would only cost
allocations, which would be amortized over the allocations of the
objects you're deleting, so that's probably the way to go. A
fixed-size queue that just delayed finalization by a constant number
of objects would usually work without any allocations, but there would
sometimes be single finalizers that recursively freed too many other
objects, which would defeat the delay.

Jeffrey
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-16 Thread Nick Coghlan
Brett Cannon wrote:
> On Feb 16, 2008 3:12 PM, Amaury Forgeot d'Arc <[EMAIL PROTECTED]> wrote:
>> Is there a clever way to prevent these problems globally, for example
>> by delaying finalizers "just a little"?
> 
> Beats me.

Finalizers can be delayed 'just a little' with a macro called Py_CLEAR ;)

(in other words, what Amaury is doing already is the best solution I am 
aware of)

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Py_CLEAR to avoid crashes

2008-02-16 Thread Brett Cannon
On Feb 16, 2008 3:12 PM, Amaury Forgeot d'Arc <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I think I found a prolific vein of potential crashes throughout the
> python interpreter.
> The idea is inspired from the discussion Issue1020188 and is quite
> simple: find a Py_DECREF(xxx->yyy), where xxx represents any kind of
> Python object, and craft a __del__ method so that the same function is
> recursively called with the same object.
>
> I recently submitted corrections for 3 problems found this way. Here
> are two more examples of this method:
>
> #
> class Special:
> def __del__(self):
> print a.args
>
> class MyException(BaseException):
> def __init__(self):
> global a
> a = self
> BaseException.__init__(self, Special(), 0)
> BaseException.__init__(self, "other", 1)
>
> MyException() # segfault
> #
> import sys
> class Special2(str):
> def __del__(self):
> f.__init__("@temp", "w")
>
> f = file(Special2("@temp"), "w")
> f.__init__("@temp", "w")  # segfault
> #
>
> Issue1020188 (which is still open btw) deals with member reset to
> NULL; but any modification of the pointer is potentially concerned.
>
> And the correction is always the same: use Py_CLEAR, or a similar
> construction that detach the value from the structure before
> defcref'ing it.
>
> I think it would help a lot of code if we had a safe standard way to
> set struct members from C. A macro like the following:
> Py_SETREF(lvalue, newobj)
> (whatever its name) would perform the assignment and expand to code
> equivalent to:
> PyObject* oldobj = lvalue;
> Py_INCREF(newobj);
> lvalue = newobj;
> Py_DECREF(oldobj);
>
> I am currently searching for all places that could benefit of this,
> but I am not sure to find a test case for every potential crash.
> Most of the time, it is very unlikely that "normal" python code can
> fall in these traps (who calls f.__init__ in a __del__ method?), with
> the exception of the one corrected by r60871.
>

Testing C code like this can be tough. Usually the crasher is used as
the test in hopes that any fix is general enough that if it fails that
same crasher will rear its ugly head again.

> Should we however intensively search and correct all of them?

I think it's fine to.

> Is there a clever way to prevent these problems globally, for example
> by delaying finalizers "just a little"?

Beats me.

-Brett
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Py_CLEAR to avoid crashes

2008-02-16 Thread Amaury Forgeot d'Arc
Hello,

I think I found a prolific vein of potential crashes throughout the
python interpreter.
The idea is inspired from the discussion Issue1020188 and is quite
simple: find a Py_DECREF(xxx->yyy), where xxx represents any kind of
Python object, and craft a __del__ method so that the same function is
recursively called with the same object.

I recently submitted corrections for 3 problems found this way. Here
are two more examples of this method:

#
class Special:
def __del__(self):
print a.args

class MyException(BaseException):
def __init__(self):
global a
a = self
BaseException.__init__(self, Special(), 0)
BaseException.__init__(self, "other", 1)

MyException() # segfault
#
import sys
class Special2(str):
def __del__(self):
f.__init__("@temp", "w")

f = file(Special2("@temp"), "w")
f.__init__("@temp", "w")  # segfault
#

Issue1020188 (which is still open btw) deals with member reset to
NULL; but any modification of the pointer is potentially concerned.

And the correction is always the same: use Py_CLEAR, or a similar
construction that detach the value from the structure before
defcref'ing it.

I think it would help a lot of code if we had a safe standard way to
set struct members from C. A macro like the following:
Py_SETREF(lvalue, newobj)
(whatever its name) would perform the assignment and expand to code
equivalent to:
PyObject* oldobj = lvalue;
Py_INCREF(newobj);
lvalue = newobj;
Py_DECREF(oldobj);

I am currently searching for all places that could benefit of this,
but I am not sure to find a test case for every potential crash.
Most of the time, it is very unlikely that "normal" python code can
fall in these traps (who calls f.__init__ in a __del__ method?), with
the exception of the one corrected by r60871.

Should we however intensively search and correct all of them?
Is there a clever way to prevent these problems globally, for example
by delaying finalizers "just a little"?

-- 
Amaury Forgeot d'Arc
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com