On Thu, 25 May 2023 15:25:40 GMT, Volker Simonis <[email protected]> wrote:
>> Since JDK13, executing commands in a sub-process defaults to the so called
>> `POSIX_SPAWN` launching mechanism (i.e.
>> `-Djdk.lang.Process.launchMechanism=POSIX_SPAWN`) on Linux. This works by
>> using `posix_spawn(3)` to firstly start a tiny helper program called
>> `jspawnhelper` in a subprocess. In a second step, `jspawnhelper` reads the
>> command data from the parent Java process over a Unix pipe and finally
>> executes (i.e. `execvp(3)`) the requested command.
>>
>> In cases where the parent process terminates abnormally before
>> `jspawnhelper` has read all the expected data from the pipe, `jspawnhelper`
>> will block indefinitely on the reading end of the pipe. This is especially
>> harmful if the parent process had open sockets, because in that case, the
>> forked `jspawnhelper` process will inherit them and keep all the
>> corresponding ports open effectively preventing other processes to bind to
>> them. Notice that this is not an abstract scenario. We've observed this
>> regularly in production with services which couldn't be restarted after a
>> crash after migrating to JDK 17.
>>
>> The fix of the issue is rather trivial. `jspawnhelper` has to close its
>> writing end of the pipe which connects it with the parent Java process
>> *before* starting to read from that pipe such that reads from the pipe can
>> immediately return with EOF if the parent process terminates abnormally.
>>
>> Also did some cleanup:
>> - in `jspawnhelper.c` correctly chek the result of `readFully()`
>> - in `ProcessImpl_md.c` use `restartableWrite()` instead of `write()` to
>> write the command data from the parent process to `jspawnhelper`
>> - in `childproc.{c,h}` remove remaining traces of `MODE_CLONE` because
>> that's long gone.
>
> Volker Simonis has updated the pull request incrementally with one additional
> commit since the last revision:
>
> Fix for failing CloseRace test
> Fixing the last bug :)
>
> If we close the writing end of the pipe to jspawnhelper early in
> `spawnChild()` right after the last write (which I still think is a good
> idea), we have to use `c->childenv` rather than `childenv` when closing the
> pipe descriptors at the end of `Java_java_lang_ProcessImpl_forkAndExec()` in
> order to avoid double closing:
>
> ```c++
> - closeSafely(childenv[0]);
> - closeSafely(childenv[1]);
> + /* We use 'c->childenv' here rather than 'childenv' because
> 'spawnChild()' might have
> + * already closed 'c->childenv[1]' and signaled this by setting
> 'c->childenv[1]' to '-1'.
> + * Otherwise 'c->childenv' and 'childenv' are the same because we just
> copied 'childenv'
> + * to 'c->childenv' (with 'copyPipe()') before calling 'startChild()'. */
> + closeSafely(c->childenv[0]);
> + closeSafely(c->childenv[1]);
> ```
>
> This also fixes `test/jdk/java/lang/ProcessBuilder/CloseRace.java` which
> could failed sporadically the previous version of the change.
I missed that. Christ this stuff is complex :(
Still think it would be cleaner and simpler to set the FD in the parent to
CLOEXEC, before doing posix_spawn, and at the same time set the childStuff
variable to -1 to prevent the child from attempting to re-close it. Reconsider?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/13956#issuecomment-1564700036