Nir Aides <n...@winpdb.org> added the comment:

Well, my brain did not deadlock, but after spinning on the problem for a while 
longer, it now thinks Tomaž Šolc and Steffen are right.

We should try to fix the multiprocessing module so it does not deadlock 
single-threaded programs, and deprecate fork in multi-threaded programs.

Here is the longer version: a summary of what people have said here in various 
forms, plus observations from reading the code and searching the web:


1) The rabbit hole

a) In a multi-threaded program, fork() may be called while another thread is in 
a critical section. That thread will not exist in the child and the critical 
section will remain locked. Any attempt to enter that critical section will 
deadlock the child.

b) POSIX included the pthread_atfork() API in an attempt to deal with the 
problem:
http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_atfork.html

c) But it turns out atfork handlers are limited to calling async-signal-safe 
functions since fork may be called from a signal handler:
http://download.oracle.com/docs/cd/E19963-01/html/821-1601/gen-61908.html#gen-95948

This means atfork handlers may not actually acquire or release locks. See 
opinion by David Butenhof who was involved in the standardization effort of 
POSIX threads:
http://groups.google.com/group/comp.programming.threads/msg/3a43122820983fde

d) One consequence is that we cannot assume third-party libraries are safe to 
use after a fork in a multi-threaded program. It is likely their developers 
consider this scenario broken.

e) It seems the general consensus across the WWW concerning this problem is 
that it has no solution and that a fork should be followed by exec as soon as 
possible.

Some references:
http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html
http://austingroupbugs.net/view.php?id=62
http://sourceware.org/bugzilla/show_bug.cgi?id=4737
http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them
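
The hazard described in 1(a) can be reproduced in a few lines. What follows is 
a minimal sketch, not code from the standard library: the helper name 
child_can_take_lock and the timeout are made up for illustration, and the 
timeout is there only so the child reports the problem instead of hanging 
forever. Recent CPython versions may also print a DeprecationWarning when 
os.fork() is called with other threads running.

```python
import os
import threading


def child_can_take_lock(timeout=1.0):
    """Fork while another thread holds a lock; report what the child sees."""
    lock = threading.Lock()
    holding = threading.Event()
    release = threading.Event()

    def worker():
        # Stand-in for "another thread is in a critical section".
        with lock:
            holding.set()
            release.wait()

    t = threading.Thread(target=worker)
    t.start()
    holding.wait()  # make sure the lock is held before forking

    try:
        pid = os.fork()
        if pid == 0:
            # Child: only the forking thread exists here. The worker is gone,
            # but the lock it held was copied in the locked state.
            acquired = lock.acquire(timeout=timeout)
            os._exit(0 if acquired else 1)
        _, status = os.waitpid(pid, 0)
    finally:
        release.set()
        t.join()
    return os.WEXITSTATUS(status) == 0
```

Without the timeout, the child's acquire() would block forever, which is the 
deadlock described in 1(a).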


2) Python’s killer rabbit

The standard library multiprocessing module does two things that force us into 
the rabbit hole: it creates worker threads, and it forks without exec.

Therefore, any program that uses the multiprocessing module is a 
multi-threaded forking program.

One immediate problem is that a multiprocessing.Pool may fork from its worker 
thread in Pool._handle_workers(). This puts the forked child at risk of 
deadlock with any code that was run by the parent’s main thread (the entire 
program logic).

More problems may be found with a code review.

Other modules to look at are concurrent.futures.process (creates a worker 
thread and uses multiprocessing) and socketserver (ForkingMixIn forks without 
exec).
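
For contrast with forking without exec, here is a rough sketch of the 
fork-then-exec pattern that 1(e) names as the usual advice. The helper name 
run_in_fresh_process is hypothetical. The point is that the child does nothing 
between fork and exec that could touch a lock inherited from the parent.

```python
import os
import sys


def run_in_fresh_process(argv):
    """Fork, exec argv immediately in the child, and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image right away. No Python-level work
        # is done here that might need a lock copied from the parent.
        try:
            os.execv(argv[0], argv)
        except OSError:
            pass
        os._exit(127)  # reached only if exec failed
    _, status = os.waitpid(pid, 0)
    # os.waitstatus_to_exitcode() requires Python 3.9 or later.
    return os.waitstatus_to_exitcode(status)
```

For example, run_in_fresh_process([sys.executable, "-c", "raise SystemExit(7)"]) 
starts a new interpreter with no inherited lock state and returns its exit 
code.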


3) God bless the GIL

a) Python signal handlers run synchronously in the interpreter loop of the main 
thread, so os.fork() will never be called from a POSIX signal handler.

This means Python atfork prepare and parent handlers may run any code. Code 
run in the child is still restricted, though, and may deadlock on any locks 
left acquired by threads that no longer exist, whether in the standard library 
or in lower-level third-party libraries.

b) Turns out the GIL also helps by synchronizing threads.

Any lock that is held only for the duration of a function call made while the 
GIL is held will have been released by the time os.fork() is called. But if a 
thread in the program calls a function that yields the GIL, we are in la la 
land again and need to watch our step.
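
The claim in 3(a) can be checked with a small experiment. This is only a 
sketch: the helper name handler_runs_in_main_thread is made up, SIGUSR1 is an 
arbitrary choice, and the function has to be called from the main thread 
because signal.signal() refuses to install handlers from any other thread.

```python
import os
import signal
import threading
import time


def handler_runs_in_main_thread(wait=2.0):
    """Send SIGUSR1 from a worker thread; report where the handler ran."""
    seen = []

    def handler(signum, frame):
        seen.append(threading.current_thread())

    old = signal.signal(signal.SIGUSR1, handler)
    try:
        # The signal is sent from a worker thread on purpose.
        sender = threading.Thread(
            target=os.kill, args=(os.getpid(), signal.SIGUSR1)
        )
        sender.start()
        sender.join()
        # Give the main thread's interpreter loop a chance to run the handler.
        deadline = time.monotonic() + wait
        while not seen and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        signal.signal(signal.SIGUSR1, old)
    return bool(seen) and seen[0] is threading.main_thread()
```

The Python-level handler runs in the main thread even though the signal was 
sent from a different thread.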


4) Landing gently at the bottom

a) I think we should try to review and sanitize the worker threads of the 
multiprocessing module and other implicit worker threads in the standard 
library.

Once we do (and since os.fork() is never run from a POSIX signal handler) the 
multiprocessing library should be safe to use in programs that do not start 
other threads.

b) Then we should declare the user scenario of mixing the threading and 
multiprocessing modules as broken by design.

c) Finally, optionally provide an atfork API.

The atfork API can be used to refactor existing fork handlers in the standard 
library, provide handlers for modules such as the logging module that will 
reduce the risk of deadlock in existing programs, and can be used by developers 
who insist on mixing threading and forking in their programs.
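
To illustrate how such handlers could protect a single known lock, here is a 
sketch that relies on os.register_at_fork(), which exists in Python 3.7 and 
later. The names cache_lock and fork_and_probe are hypothetical. The handlers 
take the lock before the fork and release it on both sides afterwards, so the 
child never inherits it in a state owned by a thread that no longer exists.

```python
import os
import threading

# Hypothetical module-level lock that other threads may hold at fork time.
cache_lock = threading.Lock()

# Acquire before forking so that no other thread can be inside the critical
# section at the moment of the fork, then release in both processes.
os.register_at_fork(
    before=cache_lock.acquire,
    after_in_parent=cache_lock.release,
    after_in_child=cache_lock.release,
)


def fork_and_probe(timeout=1.0):
    """Fork and report whether the child could take cache_lock."""
    pid = os.fork()
    if pid == 0:
        ok = cache_lock.acquire(timeout=timeout)
        os._exit(0 if ok else 1)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0
```

This covers only locks the program knows about. Locks inside third-party or 
lower-level libraries, as noted in 1(d), remain out of reach.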


5) Sanitizing worker threads in the multiprocessing module

TODO :) 

(will try to post some ideas soon)

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6721>
_______________________________________