On Wed, 3 Jun 2026 at 04:29, Richard Henderson
<[email protected]> wrote:
>
> Use softfloat-parts.h so that we can more naturally
> perform the required operations with a single rounding step.
> This happens to also simplify the NaN detection step.
>
> Signed-off-by: Richard Henderson <[email protected]>
> Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
> Message-Id: <[email protected]>
> ---
> target/arm/tcg/sme_helper.c | 96 ++++++++++++++++---------------------
> 1 file changed, 40 insertions(+), 56 deletions(-)
> @@ -1240,48 +1238,49 @@ static float32 f16_dotadd(float32 sum, uint32_t e1, uint32_t e2,
> float16 h2c = e2 >> 16;
> float32 t32;
>
> + FloatParts64 p1r = float16_unpack_canonical(h1r, s_f16);
> + FloatParts64 p1c = float16_unpack_canonical(h1c, s_f16);
> + FloatParts64 p2r = float16_unpack_canonical(h2r, s_f16);
> + FloatParts64 p2c = float16_unpack_canonical(h2c, s_f16);
> +
> + int all_mask = (float_cmask(p1r.cls) | float_cmask(p1c.cls) |
> + float_cmask(p1r.cls) | float_cmask(p1c.cls));
This looks like a cut-and-paste error: the second line repeats
p1r.cls and p1c.cls, so p2r.cls and p2c.cls never make it into
all_mask. If only those inputs are NaNs, the float_cmask_anynan
check won't fire and we won't handle them correctly. This should
fix it:
--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -1244,7 +1244,7 @@ static float32 f16_dotadd(float32 sum, uint32_t e1, uint32_t e2,
FloatParts64 p2c = float16_unpack_canonical(h2c, s_f16);
int all_mask = (float_cmask(p1r.cls) | float_cmask(p1c.cls) |
- float_cmask(p1r.cls) | float_cmask(p1c.cls));
+ float_cmask(p2r.cls) | float_cmask(p2c.cls));
/* C.f. FPProcessNaNs4 */
if (unlikely(all_mask & float_cmask_anynan)) {
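To show the failure mode in isolation, here is a minimal standalone
sketch. It is not QEMU code: SketchClass, CMASK(), CMASK_ANYNAN and the
three functions are simplified stand-ins I made up for FloatClass,
float_cmask() and float_cmask_anynan, and the enum ordering is arbitrary.
It only models the mask computation, not the dot product itself.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the softfloat class enum and mask helpers.
 * The real definitions live in QEMU's softfloat code; only the
 * "one bit per class" idea is modelled here.
 */
typedef enum {
    CLS_ZERO,
    CLS_NORMAL,
    CLS_INF,
    CLS_QNAN,
    CLS_SNAN,
} SketchClass;

#define CMASK(cls)    (1u << (cls))
#define CMASK_ANYNAN  (CMASK(CLS_QNAN) | CMASK(CLS_SNAN))

/* Mask as in the posted patch: p1r/p1c are ORed in twice, p2r/p2c never. */
static unsigned mask_as_posted(SketchClass p1r, SketchClass p1c,
                               SketchClass p2r, SketchClass p2c)
{
    (void)p2r;
    (void)p2c;
    return CMASK(p1r) | CMASK(p1c) | CMASK(p1r) | CMASK(p1c);
}

/* Mask with the fixup applied: all four operands contribute. */
static unsigned mask_fixed(SketchClass p1r, SketchClass p1c,
                           SketchClass p2r, SketchClass p2c)
{
    return CMASK(p1r) | CMASK(p1c) | CMASK(p2r) | CMASK(p2c);
}

/* Equivalent of the "all_mask & float_cmask_anynan" test. */
static bool has_nan(unsigned all_mask)
{
    return (all_mask & CMASK_ANYNAN) != 0;
}

/* A NaN that appears only in a p2 operand is missed by the posted mask. */
static void check_nan_detection(void)
{
    unsigned posted = mask_as_posted(CLS_NORMAL, CLS_NORMAL,
                                     CLS_QNAN, CLS_NORMAL);
    unsigned fixed = mask_fixed(CLS_NORMAL, CLS_NORMAL,
                                CLS_QNAN, CLS_NORMAL);

    assert(!has_nan(posted));   /* NaN in p2r goes unnoticed */
    assert(has_nan(fixed));     /* NaN in p2r is detected */
}
```

Calling check_nan_detection() from a main() should pass both
assertions. A NaN in p1r or p1c is caught either way, which is
probably why this is easy to miss in testing.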
thanks
-- PMM