On Wed, 12 Oct 2022 17:00:15 GMT, Andrew Haley <a...@openjdk.org> wrote:

>> A bug in GCC causes shared libraries linked with -ffast-math to disable 
>> denormal arithmetic. This breaks Java's floating-point semantics.
>> 
>> The bug is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522
>> 
>> One solution is to save and restore the floating-point control word around 
>> System.loadLibrary(). This isn't perfect, because some shared library might 
>> load another shared library at runtime, but it's a lot better than what we 
>> do now. 
>> 
>> However, this fix is not complete. `dlopen()` is called from many places in 
>> the JDK. I guess the best thing to do is find and wrap them all. I'd like to 
>> hear people's opinions.
>
> Andrew Haley has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   8295159: DSO created with -ffast-math breaks Java floating-point arithmetic

For upcalls on non-Windows platforms, we also save MXCSR before the call, load 
a standard value for the Java code that's about to be executed, and restore the 
saved value after the call. (I'm not sure why this isn't done on Windows, tbh.)
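
To make that sequence concrete, here's a rough C sketch of the save / load / 
restore pattern. It is not the actual stub code (which is generated assembly). 
`upcall_with_standard_mxcsr`, `java_target_fn` and `halve` are made-up names, 
and I'm assuming the "standard" value is the architectural default 0x1F80:

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr (x86 with SSE only) */

/* Architectural default MXCSR: all exceptions masked, round-to-nearest,
   FTZ and DAZ clear. Assumed here to be the "standard" value. */
#define MXCSR_STANDARD 0x1F80u

/* Hypothetical stand-in for the Java code entered by the upcall. */
typedef double (*java_target_fn)(double);

/* Example target: halving a tiny normal number yields a subnormal,
   which is flushed to zero if FTZ is set in MXCSR. */
double halve(double x) {
    return x * 0.5;
}

double upcall_with_standard_mxcsr(java_target_fn target, double arg) {
    unsigned int caller_mxcsr = _mm_getcsr(); /* save the native caller's MXCSR */
    _mm_setcsr(MXCSR_STANDARD);               /* load the standard value */
    double result = target(arg);              /* run the "Java" code */
    _mm_setcsr(caller_mxcsr);                 /* restore the caller's MXCSR */
    return result;
}
```

With this shape, a native caller that has FTZ enabled still sees correct 
subnormal results from the target, and gets its own MXCSR back afterwards.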

Relevant code for JNI is here:
- Downcalls: 
https://github.com/openjdk/jdk/blob/1961e81e02e707cd0c8241aa3af6ddabf7668589/src/hotspot/cpu/x86/macroAssembler_x86.cpp#L5002
- Upcalls: 
https://github.com/openjdk/jdk/blob/1961e81e02e707cd0c8241aa3af6ddabf7668589/src/hotspot/cpu/x86/stubGenerator_x86_64.cpp#L262

It looks like FPCSR is only handled on non-`_LP64` (i.e. 32-bit) builds.

I agree with Vladimir that this seems like a general problem of foreign code 
potentially messing with control bits (in theory, foreign code could violate 
its ABI in other ways as well). It seems that both major C/C++ x64 ABIs ([1], 
[2], [3]) treat the control bits as non-volatile, so the callee should preserve 
them. This is in theory a choice of a particular ABI, but I think in general we 
can assume foreign code does not modify the control bits. Though, we never know 
for sure of course, and I suppose this is where `RestoreMXCSROnJNICalls` comes 
in. There's no equivalent flag for FPCSR atm AFAICS, so the answer there seems 
to be "just don't do it".

Following that logic: from our perspective `dlopen` violates its ABI in certain 
cases. Preserving the control bits across calls to `dlopen` seems like a 
pragmatic solution. I'm not sure how important it is to have an opt-in for the 
current (broken) behavior...

[1]: 
https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170#fpcsr
[2]: 
https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170#mxcsr
[3]: https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf

-------------

PR: https://git.openjdk.org/jdk/pull/10661
