On Fri, 3 Mar 2023 21:41:35 GMT, Vladimir Kozlov <[email protected]> wrote:
> Implemented `Float.floatToFloat16` and `Float.float16ToFloat` intrinsics in
> Interpreter and C1 compiler to produce the same results as C2 intrinsics on
> x64, Aarch64 and RISC-V - all platforms where C2 intrinsics for these Java
> methods were implemented originally.
>
> Replaced `SharedRuntime::f2hf()` and `hf2f()` C runtime functions with calls
> to runtime stubs which use the same HW instructions as C2 intrinsics. Only
> for 64-bit x64 because 32-bit x86 stub does not work: result is passed
> through FPU register and NaN values become different from C2 intrinsic. This
> runtime stub is only used to calculate constant values during C2 compilation
> and can be skipped.
>
> I added new tests based on Tobias's `TestAll.java` And copied
> `jdk/lang/Float/Binary16Conversion*.java` tests to run them with `-Xcomp` to
> make sure code is compiled by C1 or C2. I modified
> `Binary16ConversionNaN.java` to compare results from Interpreter, C1 and C2.
>
> Tested tier1-5, Xcomp, stress
src/hotspot/cpu/x86/macroAssembler_x86.hpp line 199:
> 197: void flt_to_flt16(Register dst, XMMRegister src, XMMRegister tmp) {
> 198: // Instruction requires different XMM registers
> 199: vcvtps2ph(tmp, src, 0x04, Assembler::AVX_128bit);
vcvtps2ph can have source and destination as same. Did you mean to say here in
the comment that "Instruction requires XMM register as destination"?
src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 3928:
> 3926: }
> 3927:
> 3928: if (VM_Version::supports_f16c() || VM_Version::supports_avx512vl()) {
We could check for VM_Version::supports_float16() here instead.
src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 3931:
> 3929: // For results consistency both intrinsics should be enabled.
> 3930: if
> (vmIntrinsics::is_intrinsic_available(vmIntrinsics::_float16ToFloat) &&
> 3931:
> vmIntrinsics::is_intrinsic_available(vmIntrinsics::_floatToFloat16)) {
Should this also check for InlineIntrinsics?
src/hotspot/cpu/x86/templateInterpreterGenerator_x86_64.cpp line 346:
> 344: }
> 345: // For AVX CPUs only. f16c support is disabled if UseAVX == 0.
> 346: if (VM_Version::supports_f16c() || VM_Version::supports_avx512vl()) {
We could check for VM_Version::supports_float16() here instead.
-------------
PR: https://git.openjdk.org/jdk/pull/12869