On Wed, 12 Oct 2022 17:00:15 GMT, Andrew Haley <a...@openjdk.org> wrote:

>> A bug in GCC causes shared libraries linked with -ffast-math to disable 
>> denormal arithmetic. This breaks Java's floating-point semantics.
>> 
>> The bug is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522
>> 
>> One solution is to save and restore the floating-point control word around 
>> System.loadLibrary(). This isn't perfect, because some shared library might 
>> load another shared library at runtime, but it's a lot better than what we 
>> do now. 
>> 
>> However, this fix is not complete. `dlopen()` is called from many places in 
>> the JDK. I guess the best thing to do is find and wrap them all. I'd like to 
>> hear people's opinions.
>
> Andrew Haley has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   8295159: DSO created with -ffast-math breaks Java floating-point arithmetic

FTR I did an exercise in source code archeology and here are my findings.

The origin of the `AlwaysRestoreFPU`-related code (in both x86-32 and 
ARM-specific code) can be traced back to 
[JDK-6487931](https://bugs.openjdk.org/browse/JDK-6487931) and 
[JDK-6550813](https://bugs.openjdk.org/browse/JDK-6550813). Though both issues 
manifested as JVM crashes, the underlying problem was identified as FPU control 
word corruption by native code.

The regression test does trigger the corruption from a JNI call (using either 
`_FPU_SETCW` [1] or `_controlfp` [2]), but it was deliberately limited to 
x86-32. 

Based on that, I conclude that the problem with FP environment corruption by 
native code was known before, but an opt-in solution was chosen. 

(Frankly speaking, I don't know why an opt-in solution was considered 
sufficient. Maybe because it was erroneously believed that the corruption 
could only lead to a crash at runtime?)

Considering we are now aware of the insidious nature of the problem (silent 
result corruption), I'm inclined to propose either turning on 
`AlwaysRestoreFPU` by default (and providing an implementation on platforms 
where it is missing) or, at least, detecting FPU control word corruption on 
native->Java transitions and crashing the JVM with a message that advertises 
`-XX:+AlwaysRestoreFPU` as a solution.
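
To make "silent result corruption" concrete, here is a minimal Java-level 
sketch of a probe for lost subnormal (denormal) arithmetic. It is only an 
illustration: the class and method names are hypothetical, it is not the 
regression test from the PR, and it assumes the corruption takes the form of 
flush-to-zero / denormals-are-zero being enabled, as described for 
`-ffast-math` in the quoted text. A real check inside the JVM would inspect 
the control word directly rather than probe arithmetic.

```java
// Hypothetical probe, not part of the JDK or of this PR.
class DenormalCheck {
    // volatile so the JIT is less likely to constant-fold the computation
    static volatile double smallestNormal = Double.MIN_NORMAL;

    // Returns true if subnormal results are preserved, as IEEE 754 requires.
    static boolean denormalsWork() {
        double half = smallestNormal / 2.0; // subnormal, exactly representable
        // If results are flushed to zero, half is 0.0 and both checks fail.
        return half > 0.0 && half * 2.0 == smallestNormal;
    }

    public static void main(String[] args) {
        System.out.println("subnormals preserved: " + denormalsWork());
    }
}
```

On a JVM whose floating-point environment is intact, `denormalsWork()` should 
return `true`. Calling it after loading a suspect native library is one way to 
notice the problem, although nothing fails loudly unless someone thinks to 
run such a probe.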

[1] https://man7.org/linux/man-pages/man3/__setfpucw.3.html
[2] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/control87-controlfp-control87-2?view=msvc-170

-------------

PR: https://git.openjdk.org/jdk/pull/10661
