On Tue, 7 Mar 2023 01:04:00 GMT, Sandhya Viswanathan <sviswanat...@openjdk.org> wrote:
>> Implemented `Float.floatToFloat16` and `Float.float16ToFloat` intrinsics in >> Interpreter and C1 compiler to produce the same results as C2 intrinsics on >> x64, Aarch64 and RISC-V - all platforms where C2 intrinsics for these Java >> methods were implemented originally. >> >> Replaced `SharedRuntime::f2hf()` and `hf2f()` C runtime functions with calls >> to runtime stubs which use the same HW instructions as C2 intrinsics. Only >> for 64-bit x64 because 32-bit x86 stub does not work: result is passed >> through FPU register and NaN values become different from C2 intrinsic. This >> runtime stub is only used to calculate constant values during C2 compilation >> and can be skipped. >> >> I added new tests based on Tobias's `TestAll.java` And copied >> `jdk/lang/Float/Binary16Conversion*.java` tests to run them with `-Xcomp` to >> make sure code is compiled by C1 or C2. I modified >> `Binary16ConversionNaN.java` to compare results from Interpreter, C1 and C2. >> >> Tested tier1-5, Xcomp, stress > > src/hotspot/cpu/x86/macroAssembler_x86.hpp line 199: > >> 197: void flt_to_flt16(Register dst, XMMRegister src, XMMRegister tmp) { >> 198: // Instruction requires different XMM registers >> 199: vcvtps2ph(tmp, src, 0x04, Assembler::AVX_128bit); > > vcvtps2ph can have source and destination as same. Did you mean to say here > in the comment that "Instruction requires XMM register as destination"? `flt_to_flt16` is used in `x86.ad` instruction which requires preserving `src` register. I did not want to add an other macroassembler instruction for src->src case. I will add this to this comment. > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 3928: > >> 3926: } >> 3927: >> 3928: if (VM_Version::supports_f16c() || VM_Version::supports_avx512vl()) { > > We could check for VM_Version::supports_float16() here instead. Yes. > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 3931: > >> 3929: // For results consistency both intrinsics should be enabled. >> 3930: if >> (vmIntrinsics::is_intrinsic_available(vmIntrinsics::_float16ToFloat) && >> 3931: >> vmIntrinsics::is_intrinsic_available(vmIntrinsics::_floatToFloat16)) { > > Should this also check for InlineIntrinsics? `vmIntrinsics::disabled_by_jvm_flags()` checks `InlineIntrinsics`. See `vmIntrinsics.cpp` changes. > src/hotspot/cpu/x86/templateInterpreterGenerator_x86_64.cpp line 346: > >> 344: } >> 345: // For AVX CPUs only. f16c support is disabled if UseAVX == 0. >> 346: if (VM_Version::supports_f16c() || VM_Version::supports_avx512vl()) { > > We could check for VM_Version::supports_float16() here instead. Yes. And I need to remove `!InlineIntrinsics` check at line 340. ------------- PR: https://git.openjdk.org/jdk/pull/12869