On Fri, 24 Apr 2026 16:28:50 GMT, Jatin Bhateja <[email protected]> wrote:

> The patch optimizes Float16 to integral conversion operations. Currently, it is a
> two-step process in which a Float16 value is first
> converted to a single-precision floating-point value, followed by a conversion
> to an integral value.
> 
> x86 targets supporting the AVX512-FP16 feature (Intel Sapphire Rapids+ and
> upcoming AMD Zen6) provide direct instructions to convert a Float16 value to
> an integral value.
> 
> The following are the performance numbers of the microbenchmark included with the
> patch on Granite Rapids, with and without auto-vectorization.
> 
> <img width="1125" height="636" alt="image"
> src="https://github.com/user-attachments/assets/ca6e6757-1579-475f-8307-9454c7c025c1"
>  />
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin
> 
> ---------
> - [x] I confirm that I make this contribution in accordance with the [OpenJDK 
> Interim AI Policy](https://openjdk.org/legal/ai).
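
For readers following along, the two-step path described in the quoted text can be sketched at the Java level as below. This is only an illustration, not code from the patch: it assumes JDK 20 or later for `Float.float16ToFloat`, and the class and method names (`Float16ToIntSketch`, `toIntViaFloat`, `toLongViaFloat`) are hypothetical. It shows the semantics that a direct conversion would have to preserve, not how C2 emits the instructions.

```java
// Hypothetical sketch of the two-step Float16 -> integral conversion.
// Assumes JDK 20+ (Float.float16ToFloat); names are illustrative only.
public final class Float16ToIntSketch {

    private Float16ToIntSketch() {}

    // Step 1: widen the binary16 bit pattern to a float.
    // Step 2: narrow the float to int using Java's (int) cast semantics
    // (NaN becomes 0, out-of-range values saturate).
    public static int toIntViaFloat(short float16Bits) {
        float widened = Float.float16ToFloat(float16Bits);
        return (int) widened;
    }

    // Same two steps, but with a long result.
    public static long toLongViaFloat(short float16Bits) {
        float widened = Float.float16ToFloat(float16Bits);
        return (long) widened;
    }

    public static void main(String[] args) {
        short h = Float.floatToFloat16(3.75f);
        System.out.println(toIntViaFloat(h));
        System.out.println(toLongViaFloat(h));
    }
}
```

Both helpers share the same widening step and differ only in the target type of the final cast.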

Changes requested by galder (Committer).

src/hotspot/cpu/x86/x86.ad line 14734:

> 14732:   format %{ "convert_hf2l $dst, $src !\t using $xtmp as TEMP" %}
> 14733:   ins_encode %{
> 14734:     __ convertHF2I(T_LONG, $dst$$Register, $src$$Register, $xtmp$$XMMRegister);

Minor comment: isn't it a bit confusing to call `convertHF2I` with a `T_LONG`? 
Maybe `convertHF2I` could be renamed to `convertHF2X` to not commit to the type?

-------------

PR Review: https://git.openjdk.org/jdk/pull/30928#pullrequestreview-4242968982
PR Review Comment: https://git.openjdk.org/jdk/pull/30928#discussion_r3200551189
