Re: [PATCH v5 27/63] target/arm: Implement F1CVTL, F1CVTL2, F2CVTL, F2CVTL2 for AdvSIMD

Peter Maydell Tue, 19 May 2026 05:08:14 -0700

On Sun, 17 May 2026 at 01:31, Richard Henderson
<[email protected]> wrote:
>
> Signed-off-by: Richard Henderson <[email protected]>
> ---
>  target/arm/tcg/helper-fp8-defs.h |  2 ++
>  target/arm/tcg/fp8_helper.c      | 29 +++++++++++++++++++++++++++++
>  target/arm/tcg/translate-a64.c   |  3 +++
>  target/arm/tcg/a64.decode        |  3 +++
>  4 files changed, 37 insertions(+)
>
> diff --git a/target/arm/tcg/helper-fp8-defs.h b/target/arm/tcg/helper-fp8-defs.h
> index 966f83d796..718463422b 100644
> --- a/target/arm/tcg/helper-fp8-defs.h
> +++ b/target/arm/tcg/helper-fp8-defs.h
> @@ -7,3 +7,5 @@ DEF_HELPER_FLAGS_4(advsimd_bfcvtl, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
>  DEF_HELPER_FLAGS_4(sve2_bfcvt, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
>  DEF_HELPER_FLAGS_4(sme2_bfcvt_hb, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
>  DEF_HELPER_FLAGS_4(sme2_bfcvtl_hb, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
> +
> +DEF_HELPER_FLAGS_4(advsimd_fcvtl_hb, TCG_CALL_NO_RWG, void, ptr, ptr, env, i32)
> diff --git a/target/arm/tcg/fp8_helper.c b/target/arm/tcg/fp8_helper.c
> index 49e87b845b..66411bf275 100644
> --- a/target/arm/tcg/fp8_helper.c
> +++ b/target/arm/tcg/fp8_helper.c
> @@ -111,6 +111,14 @@ static bfloat16 fcvt_fp8_to_b16(uint8_t x, fp8_input_fn *f8fmt,
>      return bfloat16_round_pack_canonical(&p, s);
>  }
>
> +static float16 fcvt_fp8_to_f16(uint8_t x, fp8_input_fn *f8fmt,
> +                               int scale, float_status *s)
> +{
> +    FloatParts64 p = f8fmt(x, s);
> +    p = parts64_scalbn(&p, scale, s);
> +    return float16_round_pack_canonical(&p, s);
> +}


This doesn't quite do the right thing for an input SNaN,
which is supposed to set the Invalid Operation bit. I think
this is because the parts_canonicalize code in fpu/ doesn't
handle the "only one NaN" case quite right:

        case float_expmax_e4m3:
            if (p->frac_hi == 0b111) {
                fracN(shl)(p, fmt->frac_shift);
                p->cls = (parts_is_snan_frac(p->frac_hi, status)
                          ? float_class_snan : float_class_qnan);
                return;
            }

This tries to use parts_is_snan_frac() to determine whether
the FP8 NaN is an SNaN or a QNaN. But for Arm, at least, the
E4M3 NaN is always an SNaN, and parts_is_snan_frac() gives
the wrong answer: it looks at the MSB of the fraction, which
is set in the E4M3 NaN encoding, and for Arm
snan_bit_is_one() is false, so the NaN is classified as quiet.
The upshot is that later on when processing this value we
don't recognize that it was an SNaN and don't set Invalid.
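
To illustrate the misclassification outside of QEMU, here is a
minimal standalone sketch. Everything named toy_* is hypothetical
and only models the idea; it is not the softfloat code. It assumes
the E4M3 NaN encoding S.1111.111 and a target where the "SNaN bit
is one" convention is false, as described above for Arm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these names are hypothetical, not QEMU's. */
typedef enum { TOY_OTHER, TOY_QNAN, TOY_SNAN } toy_class;

/*
 * Models the idea behind parts_is_snan_frac(): compare the MSB of
 * the fraction with the target's "SNaN bit is one" convention.
 * Here the fraction is the raw 3-bit E4M3 fraction field.
 */
static bool toy_is_snan_frac(unsigned frac3, bool snan_bit_is_one)
{
    bool msb = (frac3 >> 2) & 1;
    return msb == snan_bit_is_one;
}

/*
 * Classify an E4M3 byte. Only the single NaN encoding (exponent all
 * ones, fraction 0b111) is distinguished; everything else is lumped
 * together as TOY_OTHER. With always_snan set, the NaN is treated as
 * signaling regardless of the fraction MSB.
 */
static toy_class toy_classify_e4m3(uint8_t x, bool snan_bit_is_one,
                                   bool always_snan)
{
    unsigned exp = (x >> 3) & 0xf;
    unsigned frac = x & 0x7;

    if (exp != 0xf || frac != 0x7) {
        return TOY_OTHER;
    }
    if (always_snan) {
        return TOY_SNAN;
    }
    return toy_is_snan_frac(frac, snan_bit_is_one) ? TOY_SNAN : TOY_QNAN;
}

/* In this model, Invalid Operation is raised only for an SNaN input. */
static bool toy_raises_invalid(toy_class c)
{
    return c == TOY_SNAN;
}

void toy_selfcheck(void)
{
    /* MSB-based test with snan_bit_is_one == false: NaN looks quiet. */
    assert(toy_classify_e4m3(0x7f, false, false) == TOY_QNAN);
    assert(!toy_raises_invalid(toy_classify_e4m3(0x7f, false, false)));
    /* Treating the E4M3 NaN as always signaling raises Invalid. */
    assert(toy_raises_invalid(toy_classify_e4m3(0x7f, false, true)));
}
```

With the MSB-based test the only E4M3 NaN encoding can never be
seen as signaling on such a target, so the Invalid flag is never
raised for it in this model.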

Unfortunately the OCP OFP8 specification doesn't say whether
the E4M3 NaN should be signaling or not, which leaves it up
to the implementation to decide. If Arm is the only user of
the E4M3 code at the moment (it looks like we may be) then
we can hard-code "always SNaN":

diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index 89e94f202c..902f42fee5 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -246,8 +246,13 @@ static void partsN(canonicalize)(FloatPartsN *p, float_status *status,
         case float_expmax_e4m3:
             if (p->frac_hi == 0b111) {
                 fracN(shl)(p, fmt->frac_shift);
-                p->cls = (parts_is_snan_frac(p->frac_hi, status)
-                          ? float_class_snan : float_class_qnan);
+                /*
+                 * The OCP OFP8 spec leaves this not nailed down, but
+                 * for Arm the E4M3 NaN is an SNaN.  If other targets
+                 * choose to make it a QNaN this will need to become
+                 * controllable via status.
+                 */
+                p->cls = float_class_snan;
                 return;
             }
             /* otherwise normal */
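
If another target did want a quiet E4M3 NaN, the "controllable via
status" option mentioned in the comment might look roughly like the
sketch below. This is a standalone illustration: the sk_* names and
the e4m3_nan_is_quiet field are hypothetical, and no such field
exists in float_status as far as this patch is concerned.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { SK_OTHER, SK_QNAN, SK_SNAN } sk_class;

/* Hypothetical per-target setting, standing in for float_status. */
typedef struct {
    bool e4m3_nan_is_quiet;   /* false models the Arm behaviour */
} sk_status;

/*
 * Classify the single E4M3 NaN encoding (exponent all ones,
 * fraction 0b111) according to the per-target setting rather
 * than the fraction MSB.
 */
static sk_class sk_canonicalize_e4m3(uint8_t x, const sk_status *st)
{
    unsigned exp = (x >> 3) & 0xf;
    unsigned frac = x & 0x7;

    if (exp == 0xf && frac == 0x7) {
        return st->e4m3_nan_is_quiet ? SK_QNAN : SK_SNAN;
    }
    return SK_OTHER;
}
```

The point of the sketch is only that the choice becomes a property
of the target's status, leaving the Arm default as "always SNaN".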

thanks
-- PMM

Re: [PATCH v5 27/63] target/arm: Implement F1CVTL, F1CVTL2, F2CVTL, F2CVTL2 for AdvSIMD