Hi Max,

I've been testing the OCP FP8 implementation with a simple test suite in
tests/fp/ that covers boundary cases for the E4M3, E5M2, E2M1, and BFloat16
formats. During testing, I found some issues in the float_class_inf handling
in partsN(uncanon_sat).

On Wed, Feb 04, 2026 at 01:17:41PM +0800, Max Chou wrote:
> This commit provides the implementation defined behavior flags and the basic
> operation support for the OCP float8 data types(E4M3 & E5M2).
> 
> According to the definition in OFP8 spec, the conversion from a wider
> format infinity depends on the saturation mode defined in the spec.
> 
> Signed-off-by: Max Chou <[email protected]>
> ---
>  fpu/softfloat-parts.c.inc      | 159 +++++++++++++++++++++------
>  fpu/softfloat-specialize.c.inc |  62 +++++++++++
>  fpu/softfloat.c                | 191 +++++++++++++++++++++++++++++++--
>  include/fpu/softfloat-types.h  |  12 +++
>  include/fpu/softfloat.h        |  81 ++++++++++++++
>  5 files changed, 467 insertions(+), 38 deletions(-)
> 
> diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
> index 5e0438fc0b..eee7daae4d 100644
> --- a/fpu/softfloat-parts.c.inc
> +++ b/fpu/softfloat-parts.c.inc

[...]

> -static void partsN(uncanon)(FloatPartsN *p, float_status *s,
> -                            const FloatFmt *fmt)
> +static void partsN(uncanon_sat)(FloatPartsN *p, float_status *s,
> +                                const FloatFmt *fmt, bool saturate)
>  {
>      if (likely(is_anynorm(p->cls))) {
> -        parts_uncanon_normal(p, s, fmt);
> +        parts_uncanon_normal(p, s, fmt, saturate);
>      } else {
>          switch (p->cls) {
>          case float_class_zero:
> @@ -434,13 +510,30 @@ static void partsN(uncanon)(FloatPartsN *p, float_status *s,
>              frac_clear(p);
>              return;
>          case float_class_inf:
> -            g_assert(!fmt->arm_althp);
> -            p->exp = fmt->exp_max;
> -            frac_clear(p);
> +            /*
> +             * Unified infinity handling using format capabilities.
> +             * Formats with no_infinity must convert infinity to something else
> +             */
> +            if (!fmt->no_infinity) {
> +                p->exp = fmt->exp_max;
> +                frac_clear(p);
The saturate flag is not checked here. For IEEE-like formats such as
E5M2 that have infinity encoding, when saturate=true, the result should be
the maximum normal value, not infinity.

Per OCP FP8 specification Section 4.2 "Saturation", when saturation mode is
enabled, infinity should be converted to the maximum finite value even for
formats that support infinity representation.

My case:
  bfloat16_to_float8_e5m2(BF16_INF_POS, true, &status)
  Expected: 0x7b (max normal)
  Actual:   0x7c (infinity)
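
For reference, the expected and actual values above follow directly from the
E5M2 field layout (1 sign bit, 5 exponent bits, 2 fraction bits). The sketch
below is only a standalone model of that layout, not QEMU code: e5m2_pack(),
e5m2_inf(), e5m2_max_normal() and e5m2_from_inf() are hypothetical helpers
introduced here for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* E5M2 layout: 1 sign bit, 5 exponent bits, 2 fraction bits. */
#define E5M2_EXP_BITS  5
#define E5M2_FRAC_BITS 2
#define E5M2_EXP_MAX   ((1u << E5M2_EXP_BITS) - 1)   /* 31 */
#define E5M2_FRAC_MAX  ((1u << E5M2_FRAC_BITS) - 1)  /* 3 */

/* Hypothetical helper: assemble a raw E5M2 byte from its fields. */
static uint8_t e5m2_pack(unsigned sign, unsigned exp, unsigned frac)
{
    assert(sign <= 1 && exp <= E5M2_EXP_MAX && frac <= E5M2_FRAC_MAX);
    return (uint8_t)((sign << 7) | (exp << E5M2_FRAC_BITS) | frac);
}

/* Infinity: all-ones exponent, zero fraction. */
static uint8_t e5m2_inf(unsigned sign)
{
    return e5m2_pack(sign, E5M2_EXP_MAX, 0);
}

/* Largest finite value: exponent one below the maximum, all-ones fraction. */
static uint8_t e5m2_max_normal(unsigned sign)
{
    return e5m2_pack(sign, E5M2_EXP_MAX - 1, E5M2_FRAC_MAX);
}

/* Hypothetical model of the behaviour requested in this review. */
static uint8_t e5m2_from_inf(unsigned sign, int saturate)
{
    return saturate ? e5m2_max_normal(sign) : e5m2_inf(sign);
}
```

With this model, e5m2_from_inf(0, 1) gives 0x7b and e5m2_from_inf(0, 0) gives
0x7c, matching the expected and actual results reported above.
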

Suggested fix:
            if (!fmt->no_infinity && !saturate) {
                p->exp = fmt->exp_max;
                frac_clear(p);
            } else if (!fmt->no_infinity && saturate) {
                /* Saturate infinity to max normal for IEEE-like formats */
                p->exp = fmt->exp_max - 1;
                frac_allones(p);
                frac_shr(p, fmt->frac_shift);
            } else if ...

> +            } else if (fmt_has_nan_encoding(fmt)) {
> +                if (saturate) {
> +                    parts_set_max_normal(p, fmt);
Missing frac_shr() call after parts_set_max_normal().

The parts_set_max_normal() function sets frac_hi to the normalized
fraction value (with implicit bit at position 63). Before packing into
the final format, the fraction must be shifted right by frac_shift to
position it correctly.

Compare with the float_class_qnan/snan case below which correctly calls
frac_shr(p, fmt->frac_shift) before returning.

My case:
  bfloat16_to_float8_e4m3(BF16_INF_POS, true, &status)
  Expected: 0x7e (max normal, exp=15, frac=6)
  Actual:   0x78 (exp=15, frac=0 - incorrect due to missing shift)
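
The effect of the missing shift can be reproduced with a small standalone
model. This is a sketch under assumptions, not the softfloat code: MiniFmt,
mini_frac_shift(), mini_pack() and e4m3_pack_max_normal() are hypothetical
names, and the model assumes the canonical fraction for E4M3 max normal is
1.110b with the implicit bit at bit 63, as described above.

```c
#include <assert.h>
#include <stdint.h>

#define BINARY_POINT 63  /* implicit bit position in the canonical form */

/* Hypothetical format description; field names are illustrative only. */
typedef struct {
    unsigned exp_size;
    unsigned frac_size;
} MiniFmt;

/* Distance from the canonical position down to the packed fraction field. */
static unsigned mini_frac_shift(const MiniFmt *fmt)
{
    assert(fmt->frac_size < BINARY_POINT);
    return BINARY_POINT - fmt->frac_size;
}

/* Pack the exponent and the low frac_size bits of frac (sign omitted). */
static uint8_t mini_pack(const MiniFmt *fmt, unsigned exp, uint64_t frac)
{
    uint64_t frac_mask = (UINT64_C(1) << fmt->frac_size) - 1;

    assert(exp < (1u << fmt->exp_size));
    return (uint8_t)((exp << fmt->frac_size) | (frac & frac_mask));
}

/* E4M3 max normal: exponent field 15, fraction 110b (111b encodes NaN). */
static uint8_t e4m3_pack_max_normal(int apply_shift)
{
    const MiniFmt e4m3 = { .exp_size = 4, .frac_size = 3 };
    uint64_t frac = UINT64_C(0xE) << 60;  /* 1.110b with the 1 at bit 63 */

    if (apply_shift) {
        frac >>= mini_frac_shift(&e4m3);  /* models frac_shr(p, frac_shift) */
    }
    return mini_pack(&e4m3, 15, frac);
}
```

In this model, e4m3_pack_max_normal(1) yields 0x7e, while
e4m3_pack_max_normal(0) yields 0x78 because the low three bits of the
unshifted fraction are all zero.
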

Suggested fix:
                if (saturate) {
                    parts_set_max_normal(p, fmt);
                    frac_shr(p, fmt->frac_shift);

> +                } else {
> +                    uint8_t dnan = s->default_nan_pattern;
> +                    p->cls = float_class_qnan;
> +                    p->sign = dnan >> 7;
> +                    p->exp = fmt->exp_max;
> +                    frac_allones(p);
Same issue - missing frac_shr() call after frac_allones().

> +                }
> +            } else {
> +                parts_set_max_normal(p, fmt);
Same issue - missing frac_shr() call after parts_set_max_normal().

Note: This path is taken for formats without NaN encoding (like E2M1).

> +            }
>              return;
>          case float_class_qnan:
>          case float_class_snan:
> -            g_assert(!fmt->arm_althp);
> +            g_assert(fmt_has_nan_encoding(fmt));
>              p->exp = fmt->exp_max;
>              frac_shr(p, fmt->frac_shift); /* <-- This is correct */
>              return;
The qnan/snan case correctly calls frac_shr(), which is good, but the inf
case above does not.

---

I've prepared a fix patch and a test suite (tests/fp/fp-test-ocp.c) with 97
test cases covering:

- Classification functions for E4M3, E5M2, E2M1, BFloat16
- Format conversions with and without saturation
- Rounding mode handling
- Canonical NaN generation per Zvfofp8min specification
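
As a rough illustration of the kind of classification check listed above, here
is a minimal standalone sketch. It is not taken from fp-test-ocp.c; the helper
names (e4m3_is_nan, e5m2_is_inf, e5m2_is_nan, ocp_selftest) are hypothetical
and operate on raw bytes rather than on the softfloat types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* E4M3 has no infinity encoding; only S.1111.111 is NaN. */
static bool e4m3_is_nan(uint8_t v)
{
    return (v & 0x7f) == 0x7f;
}

/* E5M2 is IEEE-like: all-ones exponent with a zero fraction is infinity. */
static bool e5m2_is_inf(uint8_t v)
{
    return (v & 0x7f) == 0x7c;
}

/* E5M2 NaN: all-ones exponent with a non-zero fraction. */
static bool e5m2_is_nan(uint8_t v)
{
    return (v & 0x7f) > 0x7c;
}

/* Boundary cases around the top of each format's encoding space. */
static void ocp_selftest(void)
{
    assert(!e4m3_is_nan(0x7e));  /* E4M3 max normal */
    assert(e4m3_is_nan(0x7f));   /* E4M3 +NaN */
    assert(e4m3_is_nan(0xff));   /* E4M3 -NaN */

    assert(!e5m2_is_inf(0x7b));  /* E5M2 max normal */
    assert(e5m2_is_inf(0x7c));   /* E5M2 +inf */
    assert(e5m2_is_inf(0xfc));   /* E5M2 -inf */
    assert(!e5m2_is_nan(0x7c));  /* infinity is not NaN */
    assert(e5m2_is_nan(0x7d));   /* smallest E5M2 NaN payload */
}
```
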


git repo:
https://github.com/zevorn/qemu/tree/riscv-zvfofp8min-zvfofp4min-v3

command:
cd $QEMU_SRC_PATH/build && ninja tests/fp/fp-test-ocp
./pyvenv/bin/meson test --suite softfloat-ocp -v


With the fix applied, all saturation tests pass:
  PASS: BF16 +inf -> E4M3 max normal (with saturation), got 0x7e
  PASS: BF16 +inf -> E5M2 max normal (with saturation), got 0x7b
  PASS: F32 +inf -> E4M3 max normal (with saturation), got 0x7e
  PASS: F32 +inf -> E5M2 max normal (with saturation), got 0x7b

---

@Richard: I noticed that the current tests/fp/ directory doesn't have test
coverage for BFloat16, OCP FP8 (E4M3/E5M2), or FP4 (E2M1) formats.

The existing fp-test relies on Berkeley TestFloat, which doesn't support
these newer formats. Would it be useful if I cleaned up and submitted my test
suite (fp-test-ocp.c) as a separate patch to provide basic test coverage
for these OCP floating-point formats? It could help catch similar issues
in future softfloat changes.

Thanks,
Chao
