I think this is a cool idea.  Do you have any performance numbers?
On Tue, Sep 11, 2018 at 12:52 PM Thomas Stüfe <thomas.stu...@gmail.com> wrote:
>
> Hi all,
>
> I wanted to gauge opinions on the following issue:
>
> Runtime.exec, on Linux, uses vfork(2) by default. It gives us better
> performance compared with fork() and robustness in constrained memory
> situations.
>
> But as we know vfork() can be dangerous if used incorrectly. In the
> child process before exec'ing, we live in the memory of the parent
> process. If we are not very careful we can influence or crash the
> parent process.
>
> According to POSIX pretty much the only thing the child process is
> allowed to do after vfork(2) is to exec(3) immediately; if that fails,
> you must call _exit(2).
>
> http://pubs.opengroup.org/onlinepubs/009604599/functions/vfork.html
>
> However, in the openjdk we do a number of things beyond that:
>
> - stdin,out,err pipe handling business
> - closing all file descriptors
> - we change the working directory
> - we may actually modify errno manually
> - in case exec fails, we communicate the error back to the parent using pipe.
>
> This involves calling a number of libc functions beyond exec(), namely
> read, close, dup2, opendir/readdir, write, chdir... It also needs a
> bit of stack, since we assemble path names.
>
> --
>
> I was curious whether there were any real issues, so I tested (on
> Ubuntu 16.4) and found:
>
> 1) A crash - any crash - in the child process before exec() will kill
> the parent jvm dead. Weirdly enough, we do not even enter our error
> handling, but seem to die instantly with the default "Segmentation
> Fault".
>
> 2) Signals received by the child process before exec() influence the
> parent process. For example:
>  - SIGINT set to the child ends both parent and child, immediately
>  - SIGABRT aborts both child and parent
>  - any error signal sent to the child lead to the behavior described at (1)
>
> 3) A stack overflow in the child before exec() also kills the parent.
> Unsurprising, since guard page hit -> segfault -> see (1).
>
> 4) more amusing, setting errno in the child before exec() changes the
> errno in the parent process. propagates to the parent process.
> But since errno is thread local and the thread in the parent process
> is waiting in vfork() and will, upon return, not look at errno (after
> all, vfork succeeded) this causes no trouble.
>
> There may be more issues, but these were the ones I tested.
>
> In all cases I counter-tested with fork() instead of vfork() and as
> expected  with fork() the parent process stays unaffected as it should
> be.
>
> -------------
>
> Whether you think these issues are worth solving is an open question.
>
> All these cases may happen in the wild (well, apart from
> crash-by-programming-error if one assumes the program to be really bug
> free) albeit with a very small probability. But once these bugs occur,
> they can be very difficult to analyse. So fixing this may be
> worthwhile.
>
> At SAP, we opted for robustness, so we changed the Runtime.exec()
> implementation to deal with vfork() issues. Basically, we employ the
> exec-twice technique:
>
> - in the child, after the vfork(), we immediately exec() into a little
> bootstrap binary ("forkhelper").
> - Now we are safe in the sense that we do not share memory with the
> parent process anymore
> - Then, parent process communicates with the child via pipes and gives
> it all information needed to do the "real" exec: environ, current dir,
> arguments... .
> - Now the child exec's a second time, this time into the real target binary.
>
> The point of this technique is that we minimize the window in the
> child between vfork and the first exec. In fact, we are now fully
> POSIX compliant. This solves the described pathological cases.
>
> It has some other advantages too, e.g. allowing for better error
> handling and tracing in the Runtime.exec() area. Performance-wise it
> depends: we exec twice, so we pay twice. However, since the parent
> continues execution after the first exec, it spends less time waiting
> on the child process, which can make a difference if there are many
> file descriptors open.
>
> ---
>
> Checking opinions here. Do you think we are okay with our current
> implementation or would a change as described above be welcome in the
> OpenJDK too?
>
> Thanks, and Best Regards, Thomas



-- 
- DML

Reply via email to