On 7/22/20 2:15 AM, frank.ch...@sifive.com wrote:
> -/*
> - * A simplification for VLMAX
> - * = (1 << LMUL) * VLEN / (8 * (1 << SEW))
> - * = (VLEN << LMUL) / (8 << SEW)
> - * = (VLEN << LMUL) >> (SEW + 3)
> - * = VLEN >> (SEW + 3 - LMUL)
> - */
>  static inline uint32_t vext_get_vlmax(RISCVCPU *cpu, target_ulong vtype)
>  {
>      uint8_t sew, lmul;
> -
>      sew = FIELD_EX64(vtype, VTYPE, VSEW);
> -    lmul = FIELD_EX64(vtype, VTYPE, VLMUL);
> -    return cpu->cfg.vlen >> (sew + 3 - lmul);
> +    lmul = (FIELD_EX64(vtype, VTYPE, VFLMUL) << 2)
> +        | FIELD_EX64(vtype, VTYPE, VLMUL);
> +    float flmul = flmul_table[lmul];
> +    return cpu->cfg.vlen * flmul / (1 << (sew + 3));
>  }

I think if you encode lmul differently, the original formulation can
still work.  E.g.

    LMUL = 1   ->  lmul = 0
    LMUL = 2   ->  lmul = 1
    LMUL = 1/2 ->  lmul = -1

so that, for SEW=8 and LMUL=1/2 we get

      cfg.vlen >> (0 + 3 - (-1))
    = cfg.vlen >> (0 + 3 + 1)
    = cfg.vlen >> 4

which neatly avoids the floating-point calculation that I don't like.


r~
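
To make the suggestion concrete, here is a minimal sketch of the signed encoding. It is untested against the tree and rests on one assumption: that the combined 3-bit value, (VFLMUL << 2) | VLMUL, is log2(LMUL) in two's complement, so that 0b111 means LMUL = 1/2. The names lmul_log2 and vlmax_shift are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: sign-extend the combined 3-bit LMUL field so
 * that it holds log2(LMUL) as a signed value.
 * Assumption: 0..3 mean LMUL = 1, 2, 4, 8 and 5..7 mean 1/8, 1/4, 1/2.
 */
static inline int8_t lmul_log2(uint8_t vlmul3)
{
    /* Flip the sign bit, then subtract its weight: 0..3 stay, 4..7 -> -4..-1. */
    return (int8_t)((vlmul3 & 0x7) ^ 0x4) - 0x4;
}

/*
 * Hypothetical stand-in for vext_get_vlmax(), taking plain integers
 * rather than RISCVCPU and vtype so that it compiles on its own.
 * The shift count cannot go negative: sew >= 0 and lmul_log2() <= 3.
 */
static inline uint32_t vlmax_shift(uint32_t vlen, uint8_t sew, uint8_t vlmul3)
{
    return vlen >> (sew + 3 - lmul_log2(vlmul3));
}
```

With VLEN = 128, SEW = 8 (sew = 0) and LMUL = 1/2 (field value 0b111), the shift count is 0 + 3 + 1 = 4 and the result is 8, which agrees with 128 * 0.5 / 8 from the table-based version.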