Alexey Izbyshev <izbys...@ispras.ru> added the comment:

Victor and Joannah, thanks for working on adding vfork() support to subprocess. 
Regarding speedups in the real world, I can share a personal anecdote. Back
when AOSP was built with make (I think it was AOSP 5), I observed a ~2x
slowdown (or something close to that) when make used fork() instead of
vfork(). This was a slowdown of the *whole* build (not just process creation
time), and it's dramatic given the amount of pure computation (compilation of 
C++/Java code) involved in building AOSP. The underlying reason was that make 
merged all AOSP subproject makefiles into a single gigantic one, so make 
consumed more than 1 GB of RAM, and each shell invocation in recipes resulted 
in copying a large page table.
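
One rough way to see this effect from Python is to time fork() before and after growing the parent's address space. This is only a sketch: the names time_fork and demo are made up for illustration, the 256 MiB ballast size is arbitrary, the 4096-byte page size is an assumption, and the numbers will vary by kernel and machine.

```python
import os
import time

PAGE = 4096  # assumed page size; only used to touch every page of the ballast


def time_fork(n=20):
    """Return the average wall-clock seconds for fork() + child exit + waitpid()."""
    start = time.perf_counter()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            os._exit(0)      # child: leave immediately, no exec
        os.waitpid(pid, 0)   # parent: reap the child
    return (time.perf_counter() - start) / n


def demo(ballast_mib=256):
    """Print fork() cost before and after growing the parent's address space."""
    before = time_fork()
    ballast = bytearray(ballast_mib * 1024 * 1024)
    for i in range(0, len(ballast), PAGE):
        ballast[i] = 1       # write to each page so it is really mapped
    after = time_fork()
    print(f"fork without ballast: {before * 1e6:.0f} us")
    print(f"fork with {ballast_mib} MiB ballast: {after * 1e6:.0f} us")
```

Calling demo() on a POSIX system prints the two averages; the second measurement includes the cost of duplicating the larger page table on every fork().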

That said, I'm not sure that the chosen approach of adding posix_spawn() "fast 
paths" for particular combinations of arguments is an optimal way to expose 
vfork() benefits to most users. For example, close_fds is True by default in 
subprocess, and I don't think there is a reliable way to use posix_spawn() in 
this case. So users would need to use "magic" combinations of arguments to 
subprocess.Popen to benefit from vfork(). Another example is the "cwd" parameter:
there is no standard posix_spawn file action to change the working directory in 
the child, so such a seemingly trivial operation would trigger the "slow" 
fork-exec path.
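
To make the "magic combinations" point concrete, here is a minimal sketch. The helper name posix_spawn_fast_path_possible is hypothetical, and the check is a deliberate simplification that covers only the constraints mentioned above, plus preexec_fn and pass_fds, which I assume are also incompatible. It is not the actual logic in subprocess.

```python
def posix_spawn_fast_path_possible(*, close_fds=True, cwd=None,
                                   preexec_fn=None, pass_fds=()):
    """Hypothetical, simplified eligibility check; NOT the real subprocess logic.

    The keyword defaults mirror the defaults of subprocess.Popen.
    """
    if close_fds:
        # No reliable way to close all inherited descriptors via posix_spawn().
        return False
    if cwd is not None:
        # No standard posix_spawn file action changes the working directory.
        return False
    if preexec_fn is not None or pass_fds:
        # Assumption: arbitrary pre-exec code and pass_fds are unsupported too.
        return False
    return True
```

With default arguments the helper returns False, so under these assumptions a plain Popen call with no extra arguments would stay on the fork-exec path.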

Another approach would be to use vfork-exec instead of fork-exec in
_posixsubprocess. I know that previous discussions were resolved against
vfork(), but I'm still not convinced and suggest reevaluating those
decisions. Some arguments:

1. While POSIX contains scare-wording that forbids doing pretty much anything
in a vfork-spawned child, real systems where posix_spawn() is not a system
call implement it by calling vfork() and then executing all specified actions
in the child. Cases where
vfork() is avoided seem to be a quality-of-implementation issue, not a 
fundamental issue (for example, until recently glibc used fork() in some cases 
because of heap memory allocations, but they could be avoided). In practice, 
there is no problem with calling at least async-signal-safe functions in 
vfork-children, and there is no technical reason why there would be any problem.

2. _posixsubprocess is already very careful and calls only async-signal-safe 
functions in almost all cases (an obvious exception is preexec_fn, which is 
discouraged anyway).

3. Our use case for vfork() is restricted, so some concerns don't apply to us. 
For example, the setuid() problem outlined in Rich Felker's post 
(https://ewontfix.com/7) doesn't seem to apply. The problem with signal 
handlers from the same post should be possible to avoid except for preexec_fn, 
but we'd have to fall back to fork() in this case anyway due to memory
allocation.

4. In the standard example of fork() unsafety in multithreaded processes (state 
of memory-based locks is copied to the child, and there could be nobody to 
unlock them), vfork() is *safer* than fork() because the child still shares
memory with the parent, and all parent threads other than the one that called
vfork() keep running. In
particular, it should be perfectly safe to use any memory allocator that 
protects its state with something like futexes and atomics in a vfork-child 
even if other threads of the parent are using it concurrently.

5. Other language runtimes are already using vfork(). Java has been doing it 
for ages, and Victor referenced Go.
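
The lock hazard from point 4 can be reproduced from Python itself. The sketch below is only an illustration: the function name is made up, it uses os.fork() (Python does not expose vfork()), and it assumes a POSIX system. A helper thread takes a lock, the main thread forks, and the child then tries to take the same lock with a timeout.

```python
import os
import threading


def lock_stuck_in_forked_child(timeout=0.5):
    """Return True if a forked child could NOT acquire a lock held by another thread."""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        with lock:
            held.set()       # tell the main thread the lock is now held
            release.wait()   # keep holding it until told to let go

    t = threading.Thread(target=holder)
    t.start()
    held.wait()

    pid = os.fork()
    if pid == 0:
        # Child: the locked state was copied, but the holder thread was not,
        # so nobody in this process will ever release the lock.
        got = lock.acquire(timeout=timeout)
        os._exit(0 if got else 1)

    _, status = os.waitpid(pid, 0)
    release.set()            # parent: let the holder thread finish
    t.join()
    return os.WEXITSTATUS(status) == 1
```

In a vfork() child the holder thread would still be running in the shared address space and could release the lock, which is the asymmetry point 4 relies on.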

Some possible counter-arguments:

1. We seem to be reimplementing posix_spawn(). In fact, due to (2) above, we 
can fully reuse the existing fork-exec code, with additional tweaks that
don't seem hard at this point.

2. My points above are based on Linux, but I don't know much about other 
Unix-likes, so in theory additional complications could exist. I can't, 
however, imagine any fundamental complication.

In the end, I think that migrating subprocess to vfork-exec would have more
impact for users than adding "fast paths" and would give consistent
performance regardless of subprocess.Popen arguments (with few exceptions).
Please consider
it. Thanks!

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35537>
_______________________________________