On Wed, 3 Jun 2026 at 04:29, Richard Henderson
<[email protected]> wrote:
>
> Use softfloat-parts.h so that we can more naturally
> perform the required operations with a single rounding step.
> This happens to also simplify the NaN detection step.
>
> Signed-off-by: Richard Henderson <[email protected]>
> Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
> Message-Id: <[email protected]>
> ---
>  target/arm/tcg/sme_helper.c | 96 ++++++++++++++++---------------------
>  1 file changed, 40 insertions(+), 56 deletions(-)



> @@ -1240,48 +1238,49 @@ static float32 f16_dotadd(float32 sum, uint32_t e1, uint32_t e2,
>      float16 h2c = e2 >> 16;
>      float32 t32;
>
> +    FloatParts64 p1r = float16_unpack_canonical(h1r, s_f16);
> +    FloatParts64 p1c = float16_unpack_canonical(h1c, s_f16);
> +    FloatParts64 p2r = float16_unpack_canonical(h2r, s_f16);
> +    FloatParts64 p2c = float16_unpack_canonical(h2c, s_f16);
> +
> +    int all_mask = (float_cmask(p1r.cls) | float_cmask(p1c.cls) |
> +                    float_cmask(p1r.cls) | float_cmask(p1c.cls));

This looks like a cut-and-paste error: the mask never includes
p2r.cls and p2c.cls, so if those inputs are NaNs we won't
handle them correctly. This should fix it:

--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -1244,7 +1244,7 @@ static float32 f16_dotadd(float32 sum, uint32_t e1, uint32_t e2,
     FloatParts64 p2c = float16_unpack_canonical(h2c, s_f16);

     int all_mask = (float_cmask(p1r.cls) | float_cmask(p1c.cls) |
-                    float_cmask(p1r.cls) | float_cmask(p1c.cls));
+                    float_cmask(p2r.cls) | float_cmask(p2c.cls));

     /* C.f. FPProcessNaNs4 */
     if (unlikely(all_mask & float_cmask_anynan)) {
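
To make the failure mode concrete, here is a minimal standalone sketch.
It does not use the real softfloat headers: the DemoClass enum, the
DEMO_CMASK macros and the demo_* functions are hypothetical stand-ins
that only mimic the "one bit per class, OR them together, test for any
NaN" pattern in the hunk above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not QEMU's definitions. */
typedef enum {
    DEMO_CLASS_ZERO,
    DEMO_CLASS_NORMAL,
    DEMO_CLASS_INF,
    DEMO_CLASS_QNAN,
    DEMO_CLASS_SNAN,
} DemoClass;

/* One bit per class, so OR-ing masks records every class seen. */
#define DEMO_CMASK(cls) (1 << (cls))
#define DEMO_CMASK_ANYNAN \
    (DEMO_CMASK(DEMO_CLASS_QNAN) | DEMO_CMASK(DEMO_CLASS_SNAN))

/* Mask as originally posted: the first operand pair is counted twice
 * and the second pair is never consulted. */
int demo_mask_as_posted(DemoClass p1r, DemoClass p1c,
                        DemoClass p2r, DemoClass p2c)
{
    (void)p2r;
    (void)p2c;
    return DEMO_CMASK(p1r) | DEMO_CMASK(p1c) |
           DEMO_CMASK(p1r) | DEMO_CMASK(p1c);
}

/* Mask with the fix: all four operands contribute. */
int demo_mask_fixed(DemoClass p1r, DemoClass p1c,
                    DemoClass p2r, DemoClass p2c)
{
    return DEMO_CMASK(p1r) | DEMO_CMASK(p1c) |
           DEMO_CMASK(p2r) | DEMO_CMASK(p2c);
}

/* True if any operand recorded in the mask was a NaN. */
bool demo_any_nan(int mask)
{
    return (mask & DEMO_CMASK_ANYNAN) != 0;
}
```

With a NaN only in the second operand pair, demo_any_nan() is false for
the as-posted mask and true for the fixed one, so the NaN-handling path
would be skipped without the fix.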


thanks
-- PMM
