lesha <pybug.20.le...@xoxy.net> added the comment:

I feel like I'm failing to get my thesis across. I'll write it out fully:
== Thesis start ==

Basic fact: it is an error to use threading locks in _any_ way after a
fork. I think we mostly agree on this. The programs we are discussing are
**inherently buggy**. We disagree on the right action when such a bug
happens. I see 3 possibilities:

1) deadlock (the current behavior, if the lock was held in the parent at
   the time of the fork)
2) continue to execute:
   a) as if nothing happened (the current behavior, if the lock was not
      held in the parent)
   b) throw an exception (equivalent to a, see below)
3) crash hard.

I think both 1 and 3 are tolerable, while 2 is **completely unsafe**,
because the resulting behavior of the program is unexpected and
unpredictable (data corruption, deletion, random actions, etc.).

== Thesis end ==

I will now address Gregory's, Richard's, and Vinay's comments in view of
this thesis:

1) Gregory suggests throwing an exception when the locks are used in a
child. He also discusses some cases in which he believes one could safely
continue execution. My responses:

a) Throwing an exception is tantamount to continuing execution. Imagine
that the parent has a tempfile RAII object that erases the file after the
object disappears, or in some exception handler. The destructor / handler
will now get called in the child... and the parent's tempfile is gone.
Good luck tracking that one down.

b) In general, it is not safe to continue execution on release(). If you
release() and reinitialize, the lock could still later be reused by both
parent and child, and there would still be contention leading to data
corruption.

c) Re: deadlocks are unacceptable... A deadlock is better than data
corruption. Whether you prefer a deadlock or a crash depends on whether
your system is set up to dump core. You can always debug a deadlock with
gdb; a crash without a core dump is impossible to diagnose. However, a
crash is harder to ignore, and it lets the process recover.
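To make the "basic fact" above concrete, here is a minimal runnable sketch (POSIX-only; all names in it are mine, not from this thread) of case 1: a thread holds the lock at fork() time, and in the child no thread exists to release it. The child uses acquire() with a timeout purely so the demo terminates; real code would block forever.

```python
import os
import threading
import time

lock = threading.Lock()

def hold_lock():
    # A parent thread takes the lock and keeps it across the fork.
    with lock:
        time.sleep(2)

t = threading.Thread(target=hold_lock)
t.start()
time.sleep(0.2)  # make sure the lock is actually held before we fork

pid = os.fork()
if pid == 0:
    # Child: only the forking thread survives fork(), so the thread that
    # owns the lock does not exist here and nothing will ever release it.
    # A plain lock.acquire() would block forever; the timeout is only
    # here so this demo terminates.
    acquired = lock.acquire(timeout=0.5)
    os._exit(0 if not acquired else 1)
else:
    _, status = os.waitpid(pid, 0)
    t.join()
    print("child would deadlock:", os.WEXITSTATUS(status) == 0)
```

The same copied-but-orphaned lock state is what makes option 2 (continuing, with or without an exception) so treacherous: the child's view of the lock is a lie either way.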
So, in my book, though I'm not 100% certain: hard crash > deadlock >
corruption.

d) However, we can certainly do better than today:
   i) Right now we sometimes deadlock and sometimes continue execution.
      It would be better to deadlock always (or crash always), no matter
      how the child uses the lock.
   ii) We can log before deadlocking (this is hard in general, because it
       is unclear where to log to), but it would immensely speed up
       debugging.
   iii) We can hard-crash with an extra-verbose stack dump (i.e. dump the
        lock details in addition to the stack).

2) Richard explains how my buggy snippets are buggy, and how to fix them.
I respond:

Richard, thanks for explaining how to avoid these bugs! Nonetheless,
people make bugs all the time, especially in areas like this. I made
these bugs. I now know better, mostly, but I wouldn't bet on it. We
should choose the safest way to handle these bugs: deadlocking always, or
crashing always. Reinitializing the locks is going to cost Python users a
lot more in the long run. Deadlocking _sometimes_, as we do now, is
equally bad.

Also, even your code is potentially unsafe: when you execute the
excepthook in the child, you could be running custom exception logic, or
even a custom excepthook. Those could, well-intentionedly but stupidly,
destroy some of the parent's valuable data.

3) Vinay essentially says "using logging after fork is user error". I
respond:

Yes, it is. In any other logging library, this error would only result in
mangled log lines, but no lasting harm. In Python, you sometimes get a
deadlock, and other times mangled lines.

> logging is not doing anything to protect things *outside* of a single
> process

A file is very much outside a single process. If you are logging to a
file, the only correct way is to use a file lock. Thus, I stand by my
assertion that "logging" is buggy. Windows programs generally have no
problems with this. fork() on UNIX gives you both the rope and the
gallows to hang yourself.
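For what "use a file lock" could look like in practice, here is a hedged sketch (POSIX-only; locked_write and the paths are my invention for illustration, not logging's actual implementation) that serializes writes across processes with an advisory fcntl.flock lock, so a parent and a forked child cannot interleave partial lines and cannot deadlock on an orphaned in-process lock:

```python
import fcntl
import os
import tempfile

def locked_write(path, line):
    # O_APPEND makes each write land at the current end of the file;
    # O_CLOEXEC keeps the descriptor from leaking across exec().
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND | os.O_CLOEXEC
    fd = os.open(path, flags, 0o644)
    try:
        # An advisory file lock is safe to take even in a fresh child of
        # fork(): it belongs to this newly opened file description, not
        # to a thread that may no longer exist.
        fcntl.flock(fd, fcntl.LOCK_EX)
        os.write(fd, (line + "\n").encode())
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)

log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
locked_write(log_path, "parent line")
pid = os.fork()
if pid == 0:
    locked_write(log_path, "child line")  # no mangling, no deadlock
    os._exit(0)
os.waitpid(pid, 0)
```

Note that flock is advisory: it only protects against writers that also take the lock, which is exactly the "multiple processes cooperating" model.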
Specifically for logging, I think reasonable options include:

a) [The Right Way (TM)] Using a file lock + CLOEXEC when available; this
   lets multiple processes cooperate safely.
b) It's okay to deadlock and log an explanation of why the deadlock is
   happening.
c) It's okay to crash with a similar explanation.
d) It's pretty okay even to reinitialize the logs, although mangled log
   lines do prevent automated parsing.

I really hope that my compact thesis can help us get closer to a
consensus, instead of arguing about the details of specific bugs.

----------
_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6721>
_______________________________________