On Wed, 12 Oct 2022 17:00:15 GMT, Andrew Haley <a...@openjdk.org> wrote:
>> A bug in GCC causes shared libraries linked with -ffast-math to disable
>> denormal arithmetic. This breaks Java's floating-point semantics.
>>
>> The bug is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522
>>
>> One solution is to save and restore the floating-point control word around
>> System.loadLibrary(). This isn't perfect, because some shared library might
>> load another shared library at runtime, but it's a lot better than what we
>> do now.
>>
>> However, this fix is not complete. `dlopen()` is called from many places in
>> the JDK. I guess the best thing to do is find and wrap them all. I'd like to
>> hear people's opinions.
>
> Andrew Haley has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   8295159: DSO created with -ffast-math breaks Java floating-point arithmetic

For upcalls on non-Windows platforms, we also save MXCSR and restore it after the call, and load a set standard value for the Java code that's about to be executed. (I'm not sure why this isn't done on Windows, tbh)

Relevant code for JNI is here:
- Downcalls: https://github.com/openjdk/jdk/blob/1961e81e02e707cd0c8241aa3af6ddabf7668589/src/hotspot/cpu/x86/macroAssembler_x86.cpp#L5002
- Upcalls: https://github.com/openjdk/jdk/blob/1961e81e02e707cd0c8241aa3af6ddabf7668589/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp#L262

FPCSR is only handled on non _LP64 it looks like.

I agree with Vladimir that this seems like a general problem of foreign code potentially messing with control bits (in theory, foreign code could violate its ABI in other ways as well). It seems that both major C/C++ x64 ABIs ([1], [2], [3]) treat the control bits as non-volatile, so the callee should preserve them. This is in theory a choice of a particular ABI, but I think in general we can assume foreign code does not modify the control bits. Though, we never know for sure of course, and I suppose this is where `RestoreMXCSROnJNICalls` comes in. There's no equivalent flag for FPCSR atm AFAICS, so the answer there seems to be "just don't do it".

Following that logic: from our perspective `dlopen` violates its ABI in certain cases. Preserving the control bits across calls to `dlopen` seems like a pragmatic solution. I'm not sure how important it is to have an opt-in for the current (broken) behavior...

[1]: https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170#fpcsr
[2]: https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170#mxcsr
[3]: https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf

-------------

PR: https://git.openjdk.org/jdk/pull/10661
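
For illustration only, here is a minimal sketch of the "preserve the control bits across `dlopen`" idea, written against the standard `<cfenv>` interface rather than the actual HotSpot code. The names `call_preserving_fenv` and `misbehaving_loader` are hypothetical and do not exist in the JDK. The stand-in loader changes the rounding mode as a portable substitute for a `-ffast-math` DSO setting FTZ/DAZ; whether `fesetenv` restores every control bit on a given platform is an assumption that would need checking there.

```cpp
#include <cfenv>

// Hypothetical helper (not a JDK function): run `loader`, then put the
// floating-point environment back the way it was before the call.
template <typename Loader>
auto call_preserving_fenv(Loader&& loader) -> decltype(loader()) {
    std::fenv_t saved;
    std::fegetenv(&saved);      // snapshot the current FP environment
    auto result = loader();     // e.g. a lambda that wraps dlopen()
    std::fesetenv(&saved);      // undo whatever the callee changed
    return result;
}

// Stand-in for a library initializer that violates the ABI by leaving the
// control bits modified. Assumption: changing the rounding mode is used
// here only because it is portable and easy to observe.
inline int misbehaving_loader() {
    std::fesetround(FE_UPWARD);
    return 42;
}
```

Under the same assumptions, a real call site might look like `call_preserving_fenv([&] { return dlopen(path, RTLD_LAZY); })`, where `path` is whatever the caller was going to load anyway. As the quoted description notes, this only covers libraries loaded through the wrapped call, not ones a library loads on its own later.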