Thank you David for the detailed analysis!
I'm having an impression that the general direction has been to add the
RESTARTABLE loops where sensible. (E.g. a recent fix for JDK-8197498,
and lots of older similar bugs.)
I understand that the fix only touches a small portion of potentially
problematic syscalls.
The intention was to cover UnixFileSystem_md.c and to close that ancient
bug :)
With kind regards,
Ivan
On 3/14/19 11:43 PM, David Holmes wrote:
Hi Ivan,
On 15/03/2019 11:42 am, Ivan Gerasimov wrote:
Thank you David!
On 3/14/19 4:48 PM, David Holmes wrote:
Hi Ivan,
This is an "ancient" bug that you are fixing. I don't think it is
valid.
On 15/03/2019 3:29 am, Ivan Gerasimov wrote:
Hello!
Not all the man pages agree that chmod, access and statvfs64 can be
interrupted, but at least on some platforms they are allowed to
fail with EINTR: chmod(2) on MacOS, access(2) on Solaris and
statvfs(3) on Linux.
So, it would be more accurate to wrap these up into a RESTARTABLE
loop.
When Java threads are created, or native threads attach to the VM to
become Java threads, they all get a very specific signal-mask to
block most (non synchronous) signals. The signals that we install
handlers for in the VM are also configured with SA_RESTART. So
unless specifically specified as not honouring SA_RESTART we should
not need the RESTARTABLE loop.
But isn't it possible to install a custom signal handler through JNI,
omitting SA_RESTART flag?
It's possible - then you also have to update signal masks. But yes
possible.
So I'm not clear exactly what signals we need to be guarding against
here, or whether this indicates some kind of (historic) mismatch
between the library and VM code?
grep shows that RESTARTABLE macro and its variants are used
throughout hotspot and jdk code.
Yes and on closer examination you will find a lot of inconsistencies.
RESTARTABLE goes back a long way and many of the I/O APIs have
switched locations over the years. Stuff was copied from the HPI layer
into hotspot, then back to the JDK and sometimes things were copied
with RESTARTABLE and sometimes not; and sometimes ports were written
that copied RESTARTABLE and sometimes not; and sometime new APIs were
added that were RESTARTABLE and sometimes not. All in all a bit of a
mess.
For example here are some uses of write in JDK libs:
./share/native/libzip/zlib/gzwrite.c: writ =
write(state->fd, state->x.next, put);
./unix/native/libnio/ch/IOUtil.c: return convertReturnVal(env,
write(fd, &c, 1), JNI_FALSE);
./unix/native/libnio/ch/FileDispatcherImpl.c: return
convertReturnVal(env, write(fd, buf, len), JNI_FALSE);
./unix/native/libnio/fs/UnixCopyFile.c: RESTARTABLE(write((int)dst,
bufp, len), n);
./unix/native/libnio/fs/UnixNativeDispatcher.c:
RESTARTABLE(write((int)fd, bufp, (size_t)nbytes), n)
./unix/native/libjava/ProcessImpl_md.c: write(c->childenv[1], (char
*)&magic, sizeof(magic)); // magic number first
./unix/native/libjava/io_util_md.c: RESTARTABLE(write(fd, buf,
len), result);
A mix of RESTARTABLE and not.
If it were possible to guarantee that no syscalls are ever
interrupted, it would surely be much cleaner to remove all these
wrappers and loops.
There is no guarantee as you note - someone could install their own
handler for SIGUSR1 (not used by the VM) for example and not use
SA_RESTART and cause unexpected EINTR.
But that problem could arise today in many different places not just
the couple you are changing here.
So it comes down to a basic question of signal handling philosophy: do
we expect/require SA_RESTART to always be used, or do we always guard
against EINTR? The Go folk had a similar choice:
https://github.com/golang/go/issues/20400
We're kind of in a messy undecided state. We use SA_RESTART ourselves
but don't document its need for others to use, nor do we enforce its
use even through libjsig (for signal chaining). We use RESTARTABLE in
some places but not in others.
So yeah feel free to make these changes, just realize they are only
one part of a larger problem (if we intend to allow no SA_RESTART).
Cheers,
David
With kind regards,
Ivan
Thanks,
David
-----
Also, java_io_UnixFileSystem_list was tiny-optimized to avoid
unnecessary reallocation.
Would you please help review the fix?
BUGURL: https://bugs.openjdk.java.net/browse/JDK-6307456
WEBREV: http://cr.openjdk.java.net/~igerasim/6307456/00/webrev/
Thanks in advance!
--
With kind regards,
Ivan Gerasimov