Re: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter

Sandhya Viswanathan Mon, 06 Mar 2023 17:24:50 -0800

On Fri, 3 Mar 2023 21:41:35 GMT, Vladimir Kozlov <[email protected]> wrote:


> Implemented `Float.floatToFloat16` and `Float.float16ToFloat` intrinsics in 
> Interpreter and C1 compiler to produce the same results as C2 intrinsics on 
> x64, Aarch64 and RISC-V - all platforms where C2 intrinsics for these Java 
> methods were implemented originally.
> 
> Replaced `SharedRuntime::f2hf()` and `hf2f()` C runtime functions with calls 
> to runtime stubs which use the same HW instructions as C2 intrinsics. Only 
> for 64-bit x64 because 32-bit x86 stub does not work: result is passed 
> through FPU register and NaN values become different from C2 intrinsic. This 
> runtime stub is only used to calculate constant values during C2 compilation 
> and can be skipped.
> 
> I added new tests based on Tobias's `TestAll.java` And copied 
> `jdk/lang/Float/Binary16Conversion*.java` tests to run them with `-Xcomp` to 
> make sure code is compiled by C1 or C2. I modified 
> `Binary16ConversionNaN.java` to compare results from Interpreter, C1 and C2.
> 
> Tested tier1-5, Xcomp, stress

src/hotspot/cpu/x86/macroAssembler_x86.hpp line 199:

> 197:   void flt_to_flt16(Register dst, XMMRegister src, XMMRegister tmp) {
> 198:     // Instruction requires different XMM registers
> 199:     vcvtps2ph(tmp, src, 0x04, Assembler::AVX_128bit);

vcvtps2ph can have source and destination as same. Did you mean to say here in 
the comment that "Instruction requires XMM register as destination"?

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 3928:

> 3926:   }
> 3927: 
> 3928:   if (VM_Version::supports_f16c() || VM_Version::supports_avx512vl()) {

We could check for VM_Version::supports_float16() here instead.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 3931:

> 3929:     // For results consistency both intrinsics should be enabled.
> 3930:     if 
> (vmIntrinsics::is_intrinsic_available(vmIntrinsics::_float16ToFloat) &&
> 3931:         
> vmIntrinsics::is_intrinsic_available(vmIntrinsics::_floatToFloat16)) {

Should this also check for InlineIntrinsics?

src/hotspot/cpu/x86/templateInterpreterGenerator_x86_64.cpp line 346:

> 344:   }
> 345:   // For AVX CPUs only. f16c support is disabled if UseAVX == 0.
> 346:   if (VM_Version::supports_f16c() || VM_Version::supports_avx512vl()) {

We could check for VM_Version::supports_float16() here instead.

-------------

PR: https://git.openjdk.org/jdk/pull/12869

Re: RFR: 8302976: C2 intrinsification of Float.floatToFloat16 and Float.float16ToFloat yields different result than the interpreter

Reply via email to