Hi all,

I am currently busy with other things, and I'd like to wait for Roger's and Martin's opinions on this. I think Florian's idea is elegant, but as I said, this is an area where I want to be super careful.

To recap Florian's idea: Using the already existing fail pipe Martin added for jspawnhelper, change the protocol such that:

- jspawnhelper, just before exec()ing the target binary, will send an "alive and well and about to exec" code up the pipe to the parent. Then exec() the target binary.
- for the parent process that means:
  -> if we get nothing, it means the first exec() already failed, we were not even able to exec jspawnhelper itself
  -> if we get a single error code it means jspawnhelper came up alright but hit an error before exec()ing the target binary
  -> if we get an "alive and well" code and then the pipe breaks this means all is well - jspawnhelper came up and successfully exec()d the target binary
  -> if we get an "alive and well" code followed by an error code it means jspawnhelper came up but we failed to exec() the target binary

The beauty is that this would work for all exec() errors, not just permission problems. The problem is that this incurs additional costs and potential error surface to cover a rare situation probably caused by a botched-up JDK installation.

@Florian: Since we deal with many libc variants, not only glibc, fixing posix_spawn just in glibc may not be sufficient, at least not for a long time. But if you would fix glibc and give it an error-reporting back channel for exec() this would already help a lot. I dug up https://sourceware.org/bugzilla/show_bug.cgi?id=18433 which sounded pretty hopeless. I have long given up trying to report bugs to glibc maintainers :(

Thanks, Thomas

On Tue, May 14, 2019 at 9:00 PM Florian Weimer <fwei...@redhat.com> wrote:

> * Thomas Stüfe:
>
> > Right now I am worried more about things I cannot determine yet. Where
> > before we would wait for the pipe to get broken, now we have a read
> > call on the parent side, a write call on the child side, which both
> > must succeed. Could they fail sporadically, e.g. due to EINTR? I know
> > this sounds very vague but around this API I am super careful.
>
> EINTR should only arrive if there's a signal handler, otherwise the
> signal is either ignored or terminates the process. I don't think
> jspawnhelper installs any. If the write fails, jspawnhelper can just
> exit, and it will look like as if it had never launched (resulting in an
> error). The write-after-exec-error case is more problematic than that.
>
> I'm working on this from the other end—adding functionality to glibc, so
> that we can eliminate jspawnhelper. But that's a more long-term effort,
> of course.
>
> Thanks,
> Florian
>
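
P.S. To make the parent side of the recap concrete, here is a rough sketch of how the four cases could be told apart. It is only an illustration in Python, not the actual JDK code. The wire format (one native 32-bit int per code), the CHILD_IS_ALIVE value and all function and enum names are made up for the example, and it assumes the write end of the fail pipe is close-on-exec so that a successful exec() shows up as EOF.

```python
import os
import struct
from enum import Enum

# Hypothetical wire format: each code is one native-endian 32-bit int.
CODE = struct.Struct("=i")
# Hypothetical marker value; a real protocol would pick its own.
CHILD_IS_ALIVE = 65535


class Outcome(Enum):
    HELPER_EXEC_FAILED = "could not exec jspawnhelper itself"
    HELPER_ERROR = "jspawnhelper failed before exec()ing the target"
    TARGET_EXEC_FAILED = "jspawnhelper came up, exec() of the target failed"
    SUCCESS = "target binary was exec()d"
    PROTOCOL_ERROR = "unexpected data on the fail pipe"


def read_until_eof(fd):
    """Read the fail pipe until every write end is closed (the pipe "breaks")."""
    chunks = []
    while True:
        chunk = os.read(fd, 64)
        if not chunk:  # empty read means EOF
            return b"".join(chunks)
        chunks.append(chunk)


def classify(data):
    """Map the bytes seen on the fail pipe to (Outcome, error code or None)."""
    if not data:
        # Nothing arrived: the first exec() (of jspawnhelper) already failed.
        return Outcome.HELPER_EXEC_FAILED, None
    if len(data) % CODE.size != 0:
        # A partial code was written; treat it as a broken protocol.
        return Outcome.PROTOCOL_ERROR, None
    codes = [value for (value,) in CODE.iter_unpack(data)]
    if codes == [CHILD_IS_ALIVE]:
        # "Alive and well", then EOF: the target was exec()d.
        return Outcome.SUCCESS, None
    if len(codes) == 1:
        # A single error code: jspawnhelper failed before the exec().
        return Outcome.HELPER_ERROR, codes[0]
    if len(codes) == 2 and codes[0] == CHILD_IS_ALIVE:
        # "Alive and well" followed by an error: exec() of the target failed.
        return Outcome.TARGET_EXEC_FAILED, codes[1]
    return Outcome.PROTOCOL_ERROR, None
```

Note that the parent has to read until EOF in every case: after an "alive and well" code, only the closed pipe tells it that no error code is going to follow.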