On Tue, 3 Mar 2026 11:13:14 GMT, Thomas Stuefe <[email protected]> wrote:
>> When starting child processes from Java, we bootstrap the child process
>> after fork and before exec. As part of that process, up to five pipes are
>> handed to the child: three for stdin/out/err, respectively, and two internal
>> communication pipes (fail and childenv).
>>
>> If, concurrently with our invocation of `ProcessBuilder.start()`,
>> third-party native code forks a child of its own, the natively forked child
>> carries copies of these pipes. It then may keep these pipes open. This
>> results in various forms of communication errors, most likely hangs - either
>> in `ProcessBuilder.start()`, or in customer code.
>>
>> In the customer case that started this investigation,
>> `ProcessBuilder.start()` hung intermittently when using a third-party
>> Eclipse library that happened to perform forks natively.
>>
>> The JVM has no full control over what happens in its process, since we allow
>> native code to run. Therefore, native forks can happen, and we have to work
>> around them.
>>
>> The fix makes sure that the pipes we use in ProcessBuilder are always tagged
>> with CLOEXEC. Since forks are typically followed by execs, this will close
>> any file descriptors that were accidentally inherited.
>>
>> ### FORK/VFORK mode
>>
>> Here, it is sufficient to open all our pipes with O_CLOEXEC.
>>
>> The caveat here is that only Linux offers an API to do that cleanly:
>> `pipe2(2)` ([1]). On macOS and AIX, we don't have `pipe2(2)`, so we need to
>> emulate that behavior using `pipe(2)` and `fcntl(2)` in quick succession.
>> That is still racy, since a short time window remains within which the pipe
>> file descriptors are not yet O_CLOEXEC. But this is the best we can
>> do.
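
For readers following along, here is a minimal sketch of the two paths described above. It is not taken from the PR: the helper name `make_cloexec_pipe` is hypothetical, and the `__linux__` check is only an assumed way of selecting the `pipe2(2)` path.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for pipe2(2) on glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from the PR): create a pipe whose two ends
 * both carry FD_CLOEXEC. Returns 0 on success, -1 with errno set on failure. */
static int make_cloexec_pipe(int fds[2]) {
#ifdef __linux__
    /* Atomic: the descriptors never exist without the flag. */
    return pipe2(fds, O_CLOEXEC);
#else
    /* Emulation: a concurrent fork between pipe() and fcntl() can still
     * inherit the descriptors without FD_CLOEXEC. */
    if (pipe(fds) < 0) {
        return -1;
    }
    for (int i = 0; i < 2; i++) {
        int flags = fcntl(fds[i], F_GETFD);
        if (flags < 0 || fcntl(fds[i], F_SETFD, flags | FD_CLOEXEC) < 0) {
            int saved = errno;
            close(fds[0]);
            close(fds[1]);
            errno = saved;
            return -1;
        }
    }
    return 0;
#endif
}
```

The fallback narrows the window to the few instructions between `pipe(2)` and the two `fcntl(2)` calls, but cannot close it.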
>>
>> ### POSIX_SPAWN mode
>>
>> Creating the pipes with CLOEXEC alone is not sufficient. With
>> `posix_spawn(3)`, we exec twice: first to load the jspawnhelper (inside
>> `posix_spawn(3)`), a second time to load the target binary. Pipes created
>> with O_CLOEXEC would not survive the first exec.
>>
>> Therefore, instead of manually `dup2(2)`'ing our file descriptors after the
>> first exec in jspawnhelper itself, we set up dup2 file actions to let
>> posix_spawn do the dup'ing. According to POSIX, these dup2 file actions will
>> be processed before the kernel closes the inherited CLOEXEC file descriptors.
>>
>> Unfortunately, macOS is again not POSIX-compliant, since the macOS kernel
>> can close CLOEXEC file descriptors before posix_spawn processes them in its
>> dup2 file actions. As a workaround for that bug, we create temporary copies
>> of the pipe file descr...
>
> Thomas Stuefe has updated the pull request incrementally with one additional
> commit since the last revision:
>
> Feedback Volker

src/java.base/unix/native/jspawnhelper/jspawnhelper.c line 186:

> 184: #endif
> 185:
> 186: initChildStuff (CHILDENV_FILENO, FAIL_FILENO, &c);

Suggestion:

initChildStuff(CHILDENV_FILENO, FAIL_FILENO, &c);

drive-by cleanup...

-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/29939#discussion_r2878259139