Hi Max,

I've been testing the OCP FP8 implementation by writing a simple test
suite in tests/fp/ that covers various boundary cases for the E4M3, E5M2,
E2M1, and BFloat16 formats. During testing, I found some issues in the
float_class_inf handling in partsN(uncanon_sat).

On Wed, Feb 04, 2026 at 01:17:41PM +0800, Max Chou wrote:
> This commit provides the implementation defined behavior flags and the basic
> operation support for the OCP float8 data types(E4M3 & E5M2).
>
> According to the definition in OFP8 spec, the conversion from a wider
> format infinity depends on the saturation mode defined in the spec.
>
> Signed-off-by: Max Chou <[email protected]>
> ---
>  fpu/softfloat-parts.c.inc      | 159 +++++++++++++++++++++------
>  fpu/softfloat-specialize.c.inc |  62 +++++++++++
>  fpu/softfloat.c                | 191 +++++++++++++++++++++++++++++++--
>  include/fpu/softfloat-types.h  |  12 +++
>  include/fpu/softfloat.h        |  81 ++++++++++++++
>  5 files changed, 467 insertions(+), 38 deletions(-)
>
> diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
> index 5e0438fc0b..eee7daae4d 100644
> --- a/fpu/softfloat-parts.c.inc
> +++ b/fpu/softfloat-parts.c.inc

[...]

> -static void partsN(uncanon)(FloatPartsN *p, float_status *s,
> -                            const FloatFmt *fmt)
> +static void partsN(uncanon_sat)(FloatPartsN *p, float_status *s,
> +                                const FloatFmt *fmt, bool saturate)
>  {
>      if (likely(is_anynorm(p->cls))) {
> -        parts_uncanon_normal(p, s, fmt);
> +        parts_uncanon_normal(p, s, fmt, saturate);
>      } else {
>          switch (p->cls) {
>          case float_class_zero:
> @@ -434,13 +510,30 @@ static void partsN(uncanon)(FloatPartsN *p, float_status *s,
>              frac_clear(p);
>              return;
>          case float_class_inf:
> -            g_assert(!fmt->arm_althp);
> -            p->exp = fmt->exp_max;
> -            frac_clear(p);
> +            /*
> +             * Unified infinity handling using format capabilities.
> +             * Formats with no_infinity must convert infinity to something else
> +             */
> +            if (!fmt->no_infinity) {
> +                p->exp = fmt->exp_max;
> +                frac_clear(p);

The saturate flag is not checked here. For IEEE-like formats such as E5M2
that have an infinity encoding, when saturate=true the result should be
the maximum normal value, not infinity.
Per OCP FP8 specification Section 4.2 "Saturation", when saturation mode
is enabled, infinity should be converted to the maximum finite value even
for formats that support an infinity representation.

My case:
  bfloat16_to_float8_e5m2(BF16_INF_POS, true, &status)
  Expected: 0x7b (max normal)
  Actual:   0x7c (infinity)

Suggested fix:

  if (!fmt->no_infinity && !saturate) {
      p->exp = fmt->exp_max;
      frac_clear(p);
  } else if (!fmt->no_infinity && saturate) {
      /* Saturate infinity to max normal for IEEE-like formats */
      p->exp = fmt->exp_max - 1;
      frac_allones(p);
      frac_shr(p, fmt->frac_shift);
  } else if ...

> +            } else if (fmt_has_nan_encoding(fmt)) {
> +                if (saturate) {
> +                    parts_set_max_normal(p, fmt);

Missing frac_shr() call after parts_set_max_normal().

The parts_set_max_normal() function sets frac_hi to the normalized
fraction value (with the implicit bit at position 63). Before packing into
the final format, the fraction must be shifted right by frac_shift to
position it correctly.

Compare with the float_class_qnan/snan case below, which correctly calls
frac_shr(p, fmt->frac_shift) before returning.

My case:
  bfloat16_to_float8_e4m3(BF16_INF_POS, true, &status)
  Expected: 0x7e (max normal, exp=15, frac=6)
  Actual:   0x78 (exp=15, frac=0 - incorrect due to missing shift)

Suggested fix:

  if (saturate) {
      parts_set_max_normal(p, fmt);
      frac_shr(p, fmt->frac_shift);

> +                } else {
> +                    uint8_t dnan = s->default_nan_pattern;
> +                    p->cls = float_class_qnan;
> +                    p->sign = dnan >> 7;
> +                    p->exp = fmt->exp_max;
> +                    frac_allones(p);

Same issue - missing frac_shr() call after frac_allones().

> +                }
> +            } else {
> +                parts_set_max_normal(p, fmt);

Same issue - missing frac_shr() call after parts_set_max_normal().

PS: This path is taken for formats without NaN encoding (like E2M1).

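For reference, the expected values quoted above can be reproduced with a
small standalone sketch. This is not QEMU code: Fp8Fmt, fp8_pack(),
fp8_max_normal() and fp8_from_inf() are hypothetical names made up for
illustration, and the non-saturating E4M3 case is simplified to a NaN that
keeps the input sign (the patch takes the sign from default_nan_pattern
instead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 8-bit format description; NOT QEMU's FloatFmt. */
typedef struct {
    unsigned exp_bits;
    unsigned frac_bits;
    bool has_inf;   /* true: IEEE-like (E5M2); false: E4M3 */
} Fp8Fmt;

static const Fp8Fmt fp8_e4m3 = { 4, 3, false };
static const Fp8Fmt fp8_e5m2 = { 5, 2, true };

/* Pack sign/exponent/fraction fields into one byte. */
static uint8_t fp8_pack(const Fp8Fmt *f, bool sign, unsigned exp, unsigned frac)
{
    assert(f->exp_bits + f->frac_bits == 7);
    assert(exp < (1u << f->exp_bits) && frac < (1u << f->frac_bits));
    return (uint8_t)(((unsigned)sign << 7) | (exp << f->frac_bits) | frac);
}

/* Largest finite magnitude of the format. */
static uint8_t fp8_max_normal(const Fp8Fmt *f, bool sign)
{
    unsigned exp_max = (1u << f->exp_bits) - 1;
    unsigned frac_max = (1u << f->frac_bits) - 1;

    if (f->has_inf) {
        /* Top exponent is reserved for inf/NaN, so step one exponent down. */
        return fp8_pack(f, sign, exp_max - 1, frac_max);
    }
    /* E4M3: only exp=all-ones, frac=all-ones is NaN. */
    return fp8_pack(f, sign, exp_max, frac_max - 1);
}

/* Expected encoding when the wider-format input is infinity. */
static uint8_t fp8_from_inf(const Fp8Fmt *f, bool sign, bool saturate)
{
    unsigned exp_max = (1u << f->exp_bits) - 1;
    unsigned frac_max = (1u << f->frac_bits) - 1;

    if (saturate) {
        /* Saturation applies even when the format can encode infinity. */
        return fp8_max_normal(f, sign);
    }
    if (f->has_inf) {
        return fp8_pack(f, sign, exp_max, 0);      /* infinity */
    }
    /* Simplification: NaN keeping the input sign. */
    return fp8_pack(f, sign, exp_max, frac_max);
}
```

With these definitions, fp8_from_inf(&fp8_e5m2, false, true) gives 0x7b
and fp8_from_inf(&fp8_e4m3, false, true) gives 0x7e, matching the
"Expected" values in my two cases.
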
> +            }
>              return;
>          case float_class_qnan:
>          case float_class_snan:
> -            g_assert(!fmt->arm_althp);
> +            g_assert(fmt_has_nan_encoding(fmt));
>              p->exp = fmt->exp_max;
>              frac_shr(p, fmt->frac_shift);  /* <-- This is correct */
>              return;

The qnan/snan case correctly calls frac_shr(), which is good, but the inf
case above does not.

---

I've prepared a fix patch and a test suite (tests/fp/fp-test-ocp.c) with
97 test cases covering:

- Classification functions for E4M3, E5M2, E2M1, BFloat16
- Format conversions with and without saturation
- Rounding mode handling
- Canonical NaN generation per Zvfofp8min specification

git repo: https://github.com/zevorn/qemu/tree/riscv-zvfofp8min-zvfofp4min-v3

command:
  cd $QEMU_SRC_PATH/build && ninja tests/fp/fp-test-ocp
  ./pyvenv/bin/meson test --suite softfloat-ocp -v

With the fix applied, all saturation tests pass:

  PASS: BF16 +inf -> E4M3 max normal (with saturation), got 0x7e
  PASS: BF16 +inf -> E5M2 max normal (with saturation), got 0x7b
  PASS: F32 +inf -> E4M3 max normal (with saturation), got 0x7e
  PASS: F32 +inf -> E5M2 max normal (with saturation), got 0x7b

---

@Richard:

I noticed that the current tests/fp/ directory doesn't have test coverage
for BFloat16, OCP FP8 (E4M3/E5M2), or FP4 (E2M1) formats. The existing
fp-test relies on Berkeley TestFloat, which doesn't support these newer
formats.

Would it be useful if I cleaned up and submitted my test suite
(fp-test-ocp.c) as a separate patch to provide basic test coverage for
these OCP floating-point formats? It could help catch similar issues in
future softfloat changes.

Thanks,
Chao
